path: root/read-cache.c
AgeCommit message (Collapse)Author
2007-08-10Optimize "diff --cached" performance.Junio C Hamano
The read_tree() function is called only from the call chain to run "git diff --cached" (this includes the internal call made by git-runstatus to run_diff_index()). The function vacates stage without any funky "merge" magic. The caller then goes and compares stage #1 entries from the tree with stage #0 entries from the original index. When adding the cache entries this way, it used the general purpose add_cache_entry(). This function looks for an existing entry to replace or if there is none to find where to insert the new entry, resolves D/F conflict and all the other things. For the purpose of reading entries into an empty stage, none of that processing is needed. We can instead append everything and then sort the result at the end. This commit changes read_tree() to first make sure that there is no existing cache entries at specified stage, and if that is the case, it runs add_cache_entry() with ADD_CACHE_JUST_APPEND flag (new), and then sort the resulting cache using qsort(). This new flag tells add_cache_entry() to omit all the checks such as "Does this path already exist? Does adding this path remove other existing entries because it turns a directory to a file?" and instead append the given cache entry straight at the end of the active cache. The caller of course is expected to sort the resulting cache at the end before using the result. Signed-off-by: Junio C Hamano <>
2007-07-31add_file_to_index: skip rehashing if the cached stat already matchesJunio C Hamano
An earlier commit 366bfcb6 broke git-add by moving read_cache() call down, because it wanted the directory walking code to grab paths that are already in the index. The change serves its purpose, but introduces a regression because the responsibility of avoiding unnecessary reindexing by matching the cached stat is shifted nowhere. This makes it the job of add_file_to_index() function. Signed-off-by: Junio C Hamano <>
2007-07-01git add: respect core.filemode with unmerged entriesJohannes Schindelin
When a merge left unmerged entries, git add failed to pick up the file mode from the index, when core.filemode == 0. If more than one unmerged entry is there, the order of stage preference is 2, 1, 3. Noticed by Johannes Sixt. Signed-off-by: Johannes Schindelin <> Signed-off-by: Junio C Hamano <>
2007-06-07War on whitespaceJunio C Hamano
This uses "git-apply --whitespace=strip" to fix whitespace errors that have crept in to our source files over time. There are a few files that need to have trailing whitespaces (most notably, test vectors). The results still passes the test, and build result in Documentation/ area is unchanged. Signed-off-by: Junio C Hamano <>
2007-05-22rename dirlink to gitlink.Martin Waitz
Unify naming of plumbing dirlink/gitlink concept: git ls-files -z '*.[ch]' | xargs -0 perl -pi -e 's/dirlink/gitlink/g;' -e 's/DIRLNK/GITLINK/g;' Signed-off-by: Junio C Hamano <>
2007-04-25read_cache_from(): small simplificationLuiz Fernando N. Capitulino
This change 'opens' the code block which maps the index file into memory, making the code clearer and easier to read. Signed-off-by: Luiz Fernando N. Capitulino <> Signed-off-by: Junio C Hamano <>
2007-04-23Make read-cache.c "the_index" free.Junio C Hamano
This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <>
2007-04-23Move index-related variables into a structure.Junio C Hamano
This defines a index_state structure and moves index-related global variables into it. Currently there is one instance of it, the_index, and everybody accesses it, so there is no code change. Signed-off-by: Junio C Hamano <>
2007-04-14Fix gitlink index entry filesystem matchingLinus Torvalds
The code to match up index entries with the filesystem was stupidly broken. We shouldn't compare the filesystem stat() information with S_IFDIRLNK, since that's purely a git-internal value, and not what the filesystem uses (on the filesystem, it's just a regular directory). Also, don't bother to make the stat() time comparisons etc for DIRLNK entries in ce_match_stat_basic(), since we do an exact match for these things, and the hints in the stat data simply doesn't matter. This fixes "git status" with submodules that haven't been checked out in the supermodule. Signed-off-by: Linus Torvalds <> Signed-off-by: Junio C Hamano <>
2007-04-12Teach directory traversal about subprojectsLinus Torvalds
This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <> Signed-off-by: Junio C Hamano <>
2007-04-12Fix thinko in subproject entry sortingLinus Torvalds
This fixes a total thinko in my original series: subprojects do *not* sort like directories, because the index is sorted purely by full pathname, and since a subproject shows up in the index as a normal NUL-terminated string, it never has the issues with sorting with the '/' at the end. So if you have a subproject "proj" and a file "proj.c", the subproject sorts alphabetically before the file in the index (and must thus also sort that way in a tree object, since trees sort as the index). In contrast, it you have two files "proj/file" and "proj.c", the "proj.c" will sort alphabetically before "proj/file" in the index. The index itself, of course, does not actually contain an entry "proj/", but in the *tree* that gets written out, the tree entry "proj" will sort after the file entry "proj.c", which is the only real magic sorting rule. In other words: the magic sorting rule only affects tree entries, and *only* affects tree entries that point to other trees (ie are of the type S_IFDIR). Anyway, that thinko just means that we should remove the special case to make S_ISDIRLNK entries sort like S_ISDIR entries. They don't. They sort like normal files. Signed-off-by: Linus Torvalds <> Signed-off-by: Junio C Hamano <>
2007-04-10Teach core object handling functions about gitlinksLinus Torvalds
This teaches the really fundamental core SHA1 object handling routines about gitlinks. We can compare trees with gitlinks in them (although we can not actually generate patches for them yet - just raw git diffs), and they show up as commits in "git ls-tree". We also know to compare gitlinks as if they were directories (ie the normal "sort as trees" rules apply). [jc: amended a cut&paste error] Signed-off-by: Linus Torvalds <> Signed-off-by: Junio C Hamano <>
2007-04-08Merge branch 'jc/read-tree-df' (early part)Junio C Hamano
* 'jc/read-tree-df' (early part): Fix switching to a branch with D/F when current branch has file D. Fix twoway_merge that passed d/f conflict marker to merged_entry(). Fix read-tree --prefix=dir/. unpack-trees: get rid of *indpos parameter. unpack_trees.c: pass unpack_trees_options structure to keep_entry() as well. add_cache_entry(): removal of file foo does not conflict with foo/bar
2007-04-05Rename add_file_to_index() to add_file_to_cache()Junio C Hamano
This function was not called "add_file_to_cache()" only because an ancient program, update-cache, used that name as an internal function name that does something slightly different. Now that is gone, we can take over the better name. The plan is to name all functions that operate on the default index xxx_cache(). Later patches create a variant of them that take an explicit parameter xxx_index(), and then turn xxx_cache() functions into macros that use "the_index". Signed-off-by: Junio C Hamano <>
2007-04-05Propagate cache error internal to refresh_cache() via parameter.Junio C Hamano
The function refresh_cache() is the only user of cache_errno that switches its behaviour based on what internal function refresh_cache_entry() finds; pass the error status back in a parameter passed down to it, to get rid of the global variable cache_errno. Signed-off-by: Junio C Hamano <>
2007-04-05Fix bogus error message from merge-recursive error pathJunio C Hamano
This error message should not usually trigger, but the function make_cache_entry() called by add_cacheinfo() can return early without calling into refresh_cache_entry() that sets cache_errno. Also the error message had a wrong function name reported, and it did not say anything about which path failed either. Signed-off-by: Junio C Hamano <>
2007-04-04add_cache_entry(): removal of file foo does not conflict with foo/barJunio C Hamano
Similarly, removal of file foo/bar does not conflict with a file foo. Signed-off-by: Junio C Hamano <>
2007-03-07Cast 64 bit off_t to 32 bit size_tShawn O. Pearce
Some systems have sizeof(off_t) == 8 while sizeof(size_t) == 4. This implies that we are able to access and work on files whose maximum length is around 2^63-1 bytes, but we can only malloc or mmap somewhat less than 2^32-1 bytes of memory. On such a system an implicit conversion of off_t to size_t can cause the size_t to wrap, resulting in unexpected and exciting behavior. Right now we are working around all gcc warnings generated by the -Wshorten-64-to-32 option by passing the off_t through xsize_t(). In the future we should make xsize_t on such problematic platforms detect the wrapping and die if such a file is accessed. Signed-off-by: Shawn O. Pearce <> Signed-off-by: Junio C Hamano <>
2007-03-03Add core.symlinks to mark filesystems that do not support symbolic links.Johannes Sixt
Some file systems that can host git repositories and their working copies do not support symbolic links. But then if the repository contains a symbolic link, it is impossible to check out the working copy. This patch enables partial support of symbolic links so that it is possible to check out a working copy on such a file system. A new flag core.symlinks (which is true by default) can be set to false to indicate that the filesystem does not support symbolic links. In this case, symbolic links that exist in the trees are checked out as small plain files, and checking in modifications of these files preserve the symlink property in the database (as long as an entry exists in the index). Of course, this does not magically make symbolic links work on such defective file systems; hence, this solution does not help if the working copy relies on that an entry is a real symbolic link. Signed-off-by: Johannes Sixt <> Signed-off-by: Junio C Hamano <>
2007-02-28index_fd(): pass optional path parameter as hint for blob conversionJunio C Hamano
Signed-off-by: Junio C Hamano <>
2007-02-28index_fd(): use enum object_type instead of type name string.Junio C Hamano
Signed-off-by: Junio C Hamano <>
2007-02-27convert object type handling from a string to a numberNicolas Pitre
We currently have two parallel notation for dealing with object types in the code: a string and a numerical value. One of them is obviously redundent, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it in smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <> Signed-off-by: Junio C Hamano <>
2007-02-17Do not take mode bits from index after type change.Junio C Hamano
When we do not trust executable bit from lstat(2), we copied existing ce_mode bits without checking if the filesystem object is a regular file (which is the only thing we apply the "trust executable bit" business) nor if the blob in the index is a regular file (otherwise, we should do the same as registering a new regular file, which is to default non-executable). Noticed by Johannes Sixt. Signed-off-by: Junio C Hamano <>
2007-01-11write-cache: do not leak the serialized cache-tree data.Linus Torvalds
It is not used after getting written, and just is leaking every time we write the index out. Signed-off-by: Junio C Hamano <>
2007-01-08short i/o: fix calls to write to use xwrite or write_in_fullAndy Whitcroft
We have a number of badly checked write() calls. Often we are expecting write() to write exactly the size we requested or fail, this fails to handle interrupts or short writes. Switch to using the new write_in_full(). Otherwise we at a minimum need to check for EINTR and EAGAIN, where this is appropriate use xwrite(). Note, the changes to config handling are much larger and handled in the next patch in the sequence. Signed-off-by: Andy Whitcroft <> Signed-off-by: Junio C Hamano <>
2006-12-29Cleanup read_cache_from error handling.Shawn O. Pearce
When I converted the mmap() call to xmmap() I failed to cleanup the way this routine handles errors and left some crufty code behind. This is a small cleanup, suggested by Johannes. Signed-off-by: Shawn O. Pearce <> Signed-off-by: Junio C Hamano <>
2006-12-29Replace mmap with xmmap, better handling MAP_FAILED.Shawn O. Pearce
In some cases we did not even bother to check the return value of mmap() and just assume it worked. This is bad, because if we are out of virtual address space the kernel returned MAP_FAILED and we would attempt to dereference that address, segfaulting without any real error output to the user. We are replacing all calls to mmap() with xmmap() and moving all MAP_FAILED checking into that single location. If a mmap call fails we try to release enough least-recently-used pack windows to possibly succeed, then retry the mmap() attempt. If we cannot mmap even after releasing pack memory then we die() as none of our callers have any reasonable recovery strategy for a failed mmap. Signed-off-by: Shawn O. Pearce <> Signed-off-by: Junio C Hamano <>
2006-12-17Fix check_file_directory_conflict().Junio C Hamano
When replacing an existing file A with a directory A that has a file A/B in it in the index, 'update-index --replace --add A/B' did not properly remove the file to make room for the new directory. There was a trivial logic error, most likely a cut & paste one, dating back to quite early days of git. Signed-off-by: Junio C Hamano <>
2006-12-17git-add: remove conflicting entry when adding.Junio C Hamano
When replacing an existing file A with a directory A that has a file A/B in it in the index, 'git add' did not succeed because it forgot to pass the allow-replace flag to add_cache_entry(). It might be safer to leave this as an error and require the user to explicitly remove the existing A first before adding A/B since it is an unusual case, but doing that automatically is much easier to use. Signed-off-by: Junio C Hamano <>
2006-12-17update-index: make D/F conflict error a bit more verbose.Junio C Hamano
When you remove a directory D that has a tracked file D/F out of the way to create a file D and try to "git update-index --add D", it used to say "cannot add" which was not very helpful. This issues an extra error message to explain the situation before the final "fatal" message. Since D/F conflicts are relatively rare event, extra verbosity would not make things too noisy. Signed-off-by: Junio C Hamano <>
2006-11-23trust-executable-bit: fix breakage for symlinksJunio C Hamano
An earlier commit f28b34a broke symlinks when trust-executable-bit is not set because it incorrectly assumed that everything was a regular file. Reported by Juergen Ruehle. Signed-off-by: Junio C Hamano <>
2006-11-18sparse fix: non-ANSI function declarationRene Scharfe
The declaration of discard_cache() in cache.h already has its "void". Signed-off-by: Rene Scharfe <> Signed-off-by: Junio C Hamano <>
2006-09-27Ignore executable bit when adding files if filemode=0.Shawn Pearce
If the user has configured core.filemode=0 then we shouldn't set the execute bit in the index when adding a new file as the user has indicated that the local filesystem can't be trusted. This means that when adding files that should be marked executable in a repository with core.filemode=0 the user must perform a 'git update-index --chmod=+x' on the file before committing the addition. Signed-off-by: Shawn O. Pearce <> Signed-off-by: Junio C Hamano <>
2006-08-28Merge branch 'js/c-merge-recursive'Junio C Hamano
* js/c-merge-recursive: (21 commits) discard_cache(): discard index, even if no file was mmap()ed merge-recur: do not die unnecessarily merge-recur: try to merge older merge bases first merge-recur: if there is no common ancestor, fake empty one merge-recur: do not setenv("GIT_INDEX_FILE") merge-recur: do not call git-write-tree merge-recursive: fix rename handling .gitignore: git-merge-recur is a built file. merge-recur: virtual commits shall never be parsed merge-recur: use the unpack_trees() interface instead of exec()ing read-tree merge-recur: fix thinko in unique_path() Makefile: git-merge-recur depends on xdiff libraries. merge-recur: Explain why sha_eq() and struct stage_data cannot go merge-recur: Cleanup last mixedCase variables... merge-recur: Fix compiler warning with -pedantic merge-recur: Remove dead code merge-recur: Get rid of debug code merge-recur: Convert variable names to lower_case Cumulative update of merge-recursive in C recur vs recursive: help testing without touching too many stuff. ... This is an evil merge that removes TEST script from the toplevel.
2006-08-17Do not use memcmp(sha1_1, sha1_2, 20) with hardcoded length.David Rientjes
Introduces global inline: hashcmp(const unsigned char *sha1, const unsigned char *sha2) Uses memcmp for comparison and returns the result based on the length of the hash name (a future runtime decision). Acked-by: Alex Riesen <> Signed-off-by: David Rientjes <> Signed-off-by: Junio C Hamano <>
2006-08-16Merge branch 'jc/racy'Junio C Hamano
* jc/racy: Remove the "delay writing to avoid runtime penalty of racy-git avoidance" Add check program "git-check-racy" Documentation/technical/racy-git.txt avoid nanosleep(2)
2006-08-16Remove the "delay writing to avoid runtime penalty of racy-git avoidance"Junio C Hamano
The work-around should not be needed. Even if it turns out we would want it later, git will remember the patch for us ;-). Signed-off-by: Junio C Hamano <>
2006-08-16Add check program "git-check-racy"Junio C Hamano
This will help counting the racily clean paths, but it should be useless for daily use. Do not even enable it in the makefile. Signed-off-by: Junio C Hamano <>
2006-08-16remove unnecessary initializationsDavid Rientjes
[jc: I needed to hand merge the changes to the updated codebase, so the result needs to be checked.] Signed-off-by: David Rientjes <> Signed-off-by: Junio C Hamano <>
2006-08-15avoid nanosleep(2)Junio C Hamano
On Solaris nanosleep(2) is not available in libc; you need to link with -lrt to get it. The purpose of the loop is to wait until the next filesystem timestamp granularity, and the code uses subsecond sleep in the hope that it can shorten the delay to 0.5 seconds on average instead of a full second. It is probably not worth depending on an extra library for this. We might want to yank out the whole "racy-git avoidance is costly later at runtime, so let's delay writing the index out" codepath later, but that is a separate issue and needs some testing on large trees to figure it out. After playing with the kernel tree, I have a feeling that the whole thing may not be worth it. Signed-off-by: Junio C Hamano <>
2006-08-15read-cache.c cleanupDavid Rientjes
Removes conditional returns. Signed-off-by: David Rientjes <> Signed-off-by: Junio C Hamano <>
2006-08-13Merge branch 'master' into js/c-merge-recursiveJunio C Hamano
Adjust to hold_lock_file_for_update() change on the master.
2006-08-10discard_cache(): discard index, even if no file was mmap()edJohannes Schindelin
Since add_cacheinfo() can be called without a mapped index file, discard_cache() _has_ to discard the entries, even when cache_mmap == NULL. Signed-off-by: Johannes Schindelin <> Signed-off-by: Junio C Hamano <>
2006-08-09read-cache: tweak racy-git delay logicJunio C Hamano
Instead of looping over the entries and writing out, use a separate loop after all entries have been written out to check how many entries are racily clean. Make sure that the newly created index file gets the right timestamp when we check by flushing the buffered data by ce_write(). Signed-off-by: Junio C Hamano <>
2006-08-07Racy git: avoid having to be always too carefulJunio C Hamano
Immediately after a bulk checkout, most of the paths in the working tree would have the same timestamp as the index file, and this would force ce_match_stat() to take slow path for all of them. When writing an index file out, if many of the paths have very new (read: the same timestamp as the index file being written out) timestamp, we are better off delaying the return from the command, to make sure that later command to touch the working tree files will leave newer timestamps than recorded in the index, thereby avoiding to take the slow path. Signed-off-by: Junio C Hamano <>
2006-07-31Fix double "close()" in ce_compare_dataLinus Torvalds
Doing an "strace" on "git diff" shows that we close() a file descriptor twice (getting EBADFD on the second one) when we end up in ce_compare_data if the index does not match the checked-out stat information. The "index_fd()" function will already have closed the fd for us, so we should not close it again. Signed-off-by: Linus Torvalds <> Signed-off-by: Junio C Hamano <>
2006-07-31Merge branch 'js/read-tree' into js/c-merge-recursiveJunio C Hamano
* js/read-tree: (107 commits) read-tree: move merge functions to the library read-trees: refactor the unpack_trees() part tar-tree: illustrate an obscure feature better git.c: allow alias expansion without a git directory setup_git_directory_gently: do not barf when GIT_DIR is given. Build on Debian GNU/kFreeBSD Call setup_git_directory() much earlier Call setup_git_directory() early Display an error from update-ref if target ref name is invalid. Fix http-fetch t4103: fix binary patch application test. git-apply -R: binary patches are irreversible for now. Teach git-apply about '-R' Makefile: ssh-pull.o depends on ssh-fetch.c log and diff family: honor config even from subdirectories git-reset: detect update-ref error and report it. lost-found: use fsck-objects --full Teach git-http-fetch the --stdin switch Teach git-local-fetch the --stdin switch Make pull() support fetching multiple targets at once ...
2006-07-26Make git-mv a builtinJohannes Schindelin
This also moves add_file_to_index() to read-cache.c. Oh, and while touching builtin-add.c, it also removes a duplicate git_config() call. Signed-off-by: Johannes Schindelin <> Signed-off-by: Junio C Hamano <>
2006-07-26Extract helper bits from c-merge-recursive workJohannes Schindelin
This backports the pieces that are not uncooked from the merge-recursive WIP we have seen earlier, to be used in git-mv rewritten in C. Signed-off-by: Junio C Hamano <>
2006-07-14Status update on merge-recursive in CJohannes Schindelin
This is just an update for people being interested. Alex and me were busy with that project for a few days now. While it has progressed nicely, there are quite a couple TODOs in merge-recursive.c, just search for "TODO". For impatient people: yes, it passes all the tests, and yes, according to the evil test Alex did, it is faster than the Python script. But no, it is not yet finished. Biggest points are: - there are still three external calls - in the end, it should not be necessary to write the index more than once (just before exiting) - a lot of things can be refactored to make the code easier and shorter BTW we cannot just plug in git-merge-tree yet, because git-merge-tree does not handle renames at all. This patch is meant for testing, and as such, - it compile the program to git-merge-recur - it adjusts the scripts and tests to use git-merge-recur instead of git-merge-recursive - it provides "TEST", a script to execute the tests regarding -recursive - it inlines the changes to read-cache.c (read_cache_from(), discard_cache() and refresh_cache_entry()) Brought to you by Alex Riesen and Dscho Signed-off-by: Junio C Hamano <>