path: root/unpack-trees.c
Age  Commit message  Author
2007-08-11  Optimize the three-way merge of git-read-tree  Linus Torvalds
As mentioned, the three-way case *should* be as trivial as the following. It passes all the tests, and I verified that a conflicting merge in the 100,000 file horror-case merged correctly (with the conflict markers) in 0.687 seconds with this, so it works, but I'm lazy and somebody else should double-check it [jc: followed all three-way merge codepaths and verified it removes when it should]. Without this patch, the merge took 8.355 seconds, so this patch really does make a huge difference for merge performance with lots and lots of files, and we're not talking percentages, we're talking orders-of-magnitude differences! Now "unpack_trees()" is just fast enough that we don't need to avoid it (although it's probably still a good idea to eventually convert it to use the traverse_trees() infrastructure some day - just to avoid having extraneous tree traversal functions). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-10  Optimize the two-way merge of git-read-tree too  Linus Torvalds
This trivially optimizes the two-way merge case of git-read-tree too, which affects switching branches.

When you have tons and tons of files in your repository, but there are only small differences in the branches (maybe just a couple of files changed), the biggest cost of the branch switching was actually just the index calculations.

This fixes it (timings for switching between the "testing" and "master" branches in the 100,000 file testing-repo-from-hell, where the branches only differ in one small file).

Before:

    [torvalds@woody bummer]$ time git checkout master

    real    0m9.919s
    user    0m8.461s
    sys     0m0.264s

After:

    [torvalds@woody bummer]$ time git checkout testing

    real    0m0.576s
    user    0m0.348s
    sys     0m0.228s

so it's easily an order of magnitude different.

This concludes the series. I think we could/should do the three-way merge too (to speed up merges), but I'm lazy. Somebody else can do it.

The rule is very simple: you need to remove the old entry if:

 - you want to remove the file entirely
 - you replace it with a "merge conflict" entry (ie a non-stage-0 entry)

and you can avoid removing it if you either

 - keep the old one
 - or resolve it to a new one.

and these rules should all be valid for the three-way case too.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
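The removal rule above can be written down as a small decision function. This is only a minimal sketch, not code from unpack-trees.c: the enum and the function name are hypothetical, introduced here just to label the four outcomes the commit message lists.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for what a merge function decides about one path. */
enum merge_outcome {
    OUTCOME_KEEP_OLD,     /* keep the existing index entry as-is */
    OUTCOME_RESOLVE_NEW,  /* resolve to a new stage-0 entry */
    OUTCOME_DELETE,       /* remove the file entirely */
    OUTCOME_CONFLICT      /* replace with non-stage-0 "merge conflict" entries */
};

/* Returns true when the old index entry has to be removed first. */
static bool must_remove_old_entry(enum merge_outcome outcome)
{
    switch (outcome) {
    case OUTCOME_DELETE:
    case OUTCOME_CONFLICT:
        return true;
    case OUTCOME_KEEP_OLD:
    case OUTCOME_RESOLVE_NEW:
        return false;
    }
    assert(!"unknown merge outcome");
    return true; /* conservative fallback: remove the old entry */
}
```

Only the first two cases force the entry out of the index; the other two leave room for an in-place update.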
2007-08-10  Optimize the common cases of git-read-tree  Linus Torvalds
This optimizes bind_merge() and oneway_merge() to not unnecessarily remove and re-add the old index entries when they can just get replaced by updated ones.

This makes these operations much faster for large trees (where "large" is in the 50,000+ file range), because we don't unnecessarily move index entries around in the index array all the time.

Using the "bummer" tree (a test-tree with 100,000 files) we get:

Before:

    [torvalds@woody bummer]$ time git commit -m"Change one file" 50/500

    real    0m9.470s
    user    0m8.729s
    sys     0m0.476s

After:

    [torvalds@woody bummer]$ time git commit -m"Change one file" 50/500

    real    0m1.173s
    user    0m0.720s
    sys     0m0.452s

so for large trees this is easily very noticeable indeed.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
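To illustrate why replacing an entry in place is cheaper than removing and re-adding it, here is a rough sketch using a sorted array. It assumes a simplified, hypothetical `struct entry` and helper names; git's real index code is more involved. The point is only that removal and insertion each shift the tail of the array with `memmove()`, while an in-place replacement touches a single slot.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an index entry. */
struct entry {
    const char *name;
    unsigned int mode;
};

/* Binary search by name; returns the position, or -(insert_pos) - 1. */
static int find_pos(const struct entry *arr, size_t n, const char *name)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int cmp = strcmp(name, arr[mid].name);

        if (!cmp)
            return (int)mid;
        if (cmp < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    return -(int)lo - 1;
}

/* Slow path, step 1: removing shifts every later entry down by one. */
static size_t remove_at(struct entry *arr, size_t n, size_t pos)
{
    assert(pos < n);
    memmove(arr + pos, arr + pos + 1, (n - pos - 1) * sizeof(*arr));
    return n - 1;
}

/* Slow path, step 2: inserting shifts every later entry up again.
 * The caller must guarantee room for one more element. */
static size_t insert_at(struct entry *arr, size_t n, size_t pos, struct entry e)
{
    assert(pos <= n);
    memmove(arr + pos + 1, arr + pos, (n - pos) * sizeof(*arr));
    arr[pos] = e;
    return n + 1;
}

/* Fast path: overwrite the existing slot; nothing else moves. */
static int replace_in_place(struct entry *arr, size_t n, struct entry e)
{
    int pos = find_pos(arr, n, e.name);

    if (pos < 0)
        return -1; /* no entry with that name */
    arr[pos] = e;
    return 0;
}
```

With 100,000 entries, the remove-then-insert pair moves on the order of the whole array per updated path, which is consistent with the kind of slowdown the timings describe.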
2007-08-10  Move old index entry removal from "unpack_trees()" into the individual functions  Linus Torvalds
This makes no changes to current code, but it allows the individual merge functions to decide what to do about the old entry. They might decide to update it in place, rather than force them to always delete and re-add it. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-10  Start moving unpack-trees to "struct tree_desc"  Linus Torvalds
This doesn't actually change any real code, but it changes the interface to unpack_trees() to take an array of "struct tree_desc" entries, the same way the tree-walk.c functions do.

The reason for this is that we would be much better off if we can do the tree-unpacking using the generic "traverse_trees()" functionality instead of having to use the special "unpack" infrastructure.

This really is a pretty minimal diff, just to change the calling convention. It passes all the tests, and looks sane. There were only two users of "unpack_trees()": builtin-read-tree and merge-recursive, and I tried to keep the changes minimal.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-08-05  unpack-trees.c: assume submodules are clean during check-out  Junio C Hamano
Sven originally raised this issue:

    If you have a submodule checked out and you go back (or forward) to a revision of the supermodule that contains a different revision of the submodule and then switch to another revision, it will complain that the submodule is not uptodate, because git simply didn't update the submodule in the first move.

The current policy is to consider it perfectly normal that a checked-out submodule is out-of-sync wrt the supermodule index. At least until we introduce a superproject repository configuration option that says "in this repository, I do care about this submodule and at any time I move around in the superproject, recursively check out the submodule to match", it is a reasonable policy, as we currently do not recursively check out the submodules at all. The most extreme case of this policy is that the superproject index knows about the submodule but the subdirectory does not even have to be checked out.

The function verify_uptodate(), called during the two-way merge aka branch switching, is about "make sure the filesystem entity that corresponds to this cache entry is up to date, lest we lose the local modifications". As we explicitly allow submodule checkout to drift from the supermodule index entry, the check should say "Ok, for submodules, not matching is the norm" for now.

Later when we have the ability to mark "I care about this submodule to be always in sync with the superproject" (thereby implementing automatic recursive checkout and perhaps diff, among other things), we should check if the submodule in question is marked as such and perform the current test.

Acked-by: Lars Hjemli <hjemli@gmail.com>
Acked-by: Sven Verdoolaege <skimo@kotnet.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-25  cleanup unpack-trees.c: shrink struct tree_entry_list  René Scharfe
Remove the two write-only fields executable and symlink from struct tree_entry_list. Also replace usage of the field directory with S_ISDIR checks on the mode field, and then remove this now obsolete field, too. Noticed by David Kastrup. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
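As a small illustration of deriving "is this a directory?" from the mode field instead of storing a separate flag, consider the sketch below. The struct is a hypothetical stand-in, not the real `struct tree_entry_list`, and it assumes the platform's `S_ISDIR()` agrees with the octal modes used in tree objects (040000 for directories), which holds on common POSIX systems.

```c
#include <assert.h>
#include <sys/stat.h>

/* Hypothetical stand-in for a tree entry; only the mode matters here. */
struct tree_entry_sketch {
    unsigned int mode;
    const char *name;
};

/* A dedicated "directory" field is redundant: the mode already says so. */
static int entry_is_directory(const struct tree_entry_sketch *e)
{
    assert(e);
    return S_ISDIR(e->mode);
}
```

For example, an entry with mode `040000` is reported as a directory, while one with mode `0100644` is not.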
2007-07-19  unpack-trees.c: assume submodules are clean during check-out  Sven Verdoolaege
In particular, when moving back to a commit without a given submodule and then moving back forward to a commit with the given submodule, we shouldn't complain that updating would lose untracked file in the submodule, because git currently does not checkout subprojects during superproject check-out. Signed-off-by: Sven Verdoolaege <skimo@kotnet.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-12  Teach read-tree 2-way merge to ignore intermediate symlinks  Junio C Hamano
Earlier in 16a4c61, we taught "read-tree -m -u" not to be confused when switching from a branch that has a path frotz/filfre to another branch that has a symlink frotz that points at xyzzy/ directory. The fix was incomplete in that it was still confused when coming back (i.e. switching from a branch with frotz -> xyzzy/ to another branch with frotz/filfre).

This fix is rather expensive in that for a path that is created we would need to see if any of the leading components of that path exists as a symbolic link in the filesystem (in which case, we know that path itself does not exist, and the fact we already decided to check it out tells us that in the index we already know that symbolic link is going away as there is no D/F conflict).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-05-21  Merge branch 'maint-1.5.1' into maint  Junio C Hamano
* maint-1.5.1:
  annotate: make it work from subdirectories.
  git-config: Correct asciidoc documentation for --int/--bool
  t1300: Add tests for git-config --bool --get
  unpack-trees.c: verify_uptodate: remove dead code
  Use PATH_MAX instead of TEMPFILE_PATH_LEN
  branch: fix segfault when resolving an invalid HEAD
2007-05-20  unpack-trees.c: verify_uptodate: remove dead code  Sven Verdoolaege
This code was killed by commit fcc387db9bc453dc7e07a262873481af2ee9e5c8. Signed-off-by: Sven Verdoolaege <skimo@kotnet.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-05-12  read-tree -m -u: avoid getting confused by intermediate symlinks.  Junio C Hamano
When switching from a branch with both x86_64/boot/Makefile and i386/boot/Makefile to another branch that has x86_64/boot as a symlink pointing at ../i386/boot, the code incorrectly removed i386/boot/Makefile. This was because we first removed everything under x86_64/boot to make room to create a symbolic link x86_64/boot, then removed x86_64/boot/Makefile which no longer exists but now is pointing at i386/boot/Makefile, thanks to the symlink we just created. This fixes it by using the has_symlink_leading_path() function introduced previously for git-apply in the checkout codepath. Earlier, "git checkout" was broken in t4122 test due to this bug, and the test had an extra "git reset --hard" as a workaround, which is removed because it is not needed anymore. Signed-off-by: Junio C Hamano <junkio@cox.net>
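The idea of checking whether a path is reached through a symbolic link can be sketched as follows. This is not git's has_symlink_leading_path(); the function name and its behavior are assumptions made for illustration. It walks each leading directory component and asks `lstat()` whether that component is a symlink.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for lstat() and S_ISLNK() */
#endif
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: returns 1 if any leading directory component of
 * "path" is a symbolic link on disk, 0 if none is, -1 on allocation
 * failure. The final component itself is not examined.
 */
static int leading_component_is_symlink(const char *path)
{
    size_t len, i;
    char *buf;
    struct stat st;
    int found = 0;

    assert(path);
    len = strlen(path);
    buf = malloc(len + 1);
    if (!buf)
        return -1;
    memcpy(buf, path, len + 1);

    /* Start at 1 so the leading '/' of an absolute path is skipped. */
    for (i = 1; i < len && !found; i++) {
        if (buf[i] != '/')
            continue;
        buf[i] = '\0'; /* temporarily cut the path at this component */
        if (lstat(buf, &st) == 0 && S_ISLNK(st.st_mode))
            found = 1;
        buf[i] = '/';
    }
    free(buf);
    return found;
}
```

In the scenario from the commit message, a check like this on `x86_64/boot/Makefile` would report that `x86_64/boot` is now a symlink, so removing the path would touch `i386/boot/Makefile` instead and should be skipped.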
2007-04-23  delay progress display when checking out files  Nicolas Pitre
Let's start displaying progress only if more than 50% of total number of files remains to be checked out after 2 seconds. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
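The threshold described here can be sketched as a pure predicate. This is a simplified illustration under stated assumptions, not the actual progress code: the function name and parameters are hypothetical, and time is passed in as whole elapsed seconds rather than driven by a timer.

```c
#include <assert.h>

/*
 * Hypothetical predicate: show progress only once at least 2 seconds
 * have passed and more than half of the files are still to be done.
 */
static int should_display_progress(unsigned long done, unsigned long total,
                                   unsigned int elapsed_seconds)
{
    unsigned long remaining;

    assert(done <= total);
    if (elapsed_seconds < 2)
        return 0;
    remaining = total - done;
    /* "more than 50% remains" means remaining > done; no rounding needed. */
    return remaining > total - remaining;
}
```

A quick checkout that finishes most of its work within the first two seconds therefore never prints anything.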
2007-04-23  make progress "title" part of the common progress interface  Nicolas Pitre
If the progress bar ends up in a box, better provide a title for it too. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-23  common progress display support  Nicolas Pitre
Instead of having this code duplicated in multiple places, let's have a common interface for progress display. If someday someone wishes to display a cheezy progress bar instead then only one file will have to be changed.

Note: I left merge-recursive.c out since it has a strange notion of progress as it apparently increases the expected total number as it goes. Someone with more intimate knowledge of what that is supposed to mean might look at converting it to the common progress interface.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-10  Treat D/F conflict entry more carefully in unpack-trees.c::threeway_merge()  Junio C Hamano
This fixes three buglets in threeway_merge() regarding D/F conflict entries.

* After finishing with path D and handling path D/F, some stages have D/F conflict entry which are obviously non-NULL. For the purpose of determining if the path D/F is missing in the ancestor, they should not be taken into account.

* D/F conflict entry is a marker to say "this stage does _not_ have the path", so do not send them to keep_entry().

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-10  t1000: fix case table.  Junio C Hamano
Case #10 is not handled with unpack-trees.c:threeway_merge() internally, unless under the aggressive rule, and it is not a bug. As the test expects, ND (one side did not do anything, other side deleted) case was meant to be handled by the caller's policy (e.g. git-merge-one-file or git-merge-recursive). Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-04  Fix switching to a branch with D/F when current branch has file D.  Junio C Hamano
This loosens the over-eager verify_absent() check that gets upset to find directory D in the current working tree when switching to a branch that has a file there. The check needs to make sure that we do not lose precious working tree files as a result of removing directory D and replacing it with the file from the other branch, which is a tad expensive but this is a less common case. Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-04  Fix twoway_merge that passed d/f conflict marker to merged_entry().  Junio C Hamano
When switching from one tree to another, we should not send a marker that says "this file does not exist in the new tree -- I am a placeholder to tell you that, and not a real blob" down to merged_entry() as the result of the merge.
2007-04-04  unpack-trees: get rid of *indpos parameter.  Junio C Hamano
This variable keeps track of which entry in the original index the traversal is looking at, and belongs to the unpack_trees_options structure along with other traversal status information. Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-04  unpack_trees.c: pass unpack_trees_options structure to keep_entry() as well.  Junio C Hamano
Other decision functions, deleted_entry() and merged_entry() take one as their parameter, and this function should. I'll be introducing a separate index to build the result in, and am planning to pass it as the part of the structure. Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-03-21  Initialize tree descriptors with a helper function rather than by hand.  Linus Torvalds
This removes slightly more lines than it adds, but the real reason for doing this is that future optimizations will require more setup of the tree descriptor, and so we want to do it in one place. Also renamed the "desc.buf" field to "desc.buffer" just to trigger compiler errors for old-style manual initializations, making sure I didn't miss anything. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-20  simplify inclusion of system header files.  Junio C Hamano
This is a mechanical clean-up of the way *.c files include system header files.

(1) sources under compat/, platform sha-1 implementations, and xdelta code are exempt from the following rules;

(2) the first #include must be "git-compat-util.h" or one of our own header files that include it first (e.g. config.h, builtin.h, pkt-line.h);

(3) system headers that are included in "git-compat-util.h" need not be included in individual C source files.

(4) "git-compat-util.h" does not have to include subsystem specific header files (e.g. expat.h).

Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-13  Merge branch 'jc/read-tree-ignore'  Junio C Hamano
* jc/read-tree-ignore:
  read-tree: document --exclude-per-directory
  Loosen "working file will be lost" check in Porcelain-ish
  read-tree: further loosen "working file will be lost" check.
2006-12-06  read-tree: further loosen "working file will be lost" check.  Junio C Hamano
This follows up commit ed93b449 where we removed overcautious "working file will be lost" check. A new option "--exclude-per-directory=.gitignore" can be used to tell the "git-read-tree" command that the user does not mind losing contents in untracked files in the working tree, if they need to be overwritten by a merge (either a two-way "switch branches" merge, or a three-way merge). Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-12-04  unpack-trees: make sure "df_conflict_entry.name" is NUL terminated.  Junio C Hamano
The structure that ends with a flexible array member (or 0 length array with older GCC) "char name[FLEX_ARRAY]" is allocated on the stack and we use it after clearing its entire size with memset. That does not guarantee that "name" is properly NUL terminated as we intended on platforms with more forgiving structure alignment requirements. Reported breakage on m68k by Roman Zippel. Signed-off-by: Junio C Hamano <junkio@cox.net>
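The pitfall can be illustrated with a sketch. Clearing `sizeof(struct ...)` bytes only covers the flexible array member when the compiler happens to add trailing padding, so one way to be safe is to allocate at least one explicit byte for the name. The struct and function names below are hypothetical, and this is one possible approach rather than the fix the commit applied.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a structure ending in a flexible array member. */
struct named_entry {
    unsigned int mode;
    char name[]; /* C99 flexible array member; sizeof() need not cover it */
};

/*
 * Allocate an entry whose name is guaranteed to be the empty string:
 * the "+ 1" reserves name[0] explicitly and calloc() zeroes it, so the
 * result does not depend on structure padding.
 */
static struct named_entry *alloc_empty_entry(void)
{
    return calloc(1, sizeof(struct named_entry) + 1);
}

/* Accessor used only to show that the name can be read safely. */
static const char *entry_name(const struct named_entry *e)
{
    assert(e);
    return e->name;
}
```

On a platform with lenient alignment there may be no padding after `mode`, which is exactly when a plain `memset(&e, 0, sizeof(e))` on a stack object would leave `name[0]` untouched.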
2006-10-28  merge: loosen overcautious "working file will be lost" check.  Junio C Hamano
The three-way merge complained unconditionally when a path that does not exist in the index is involved in a merge when it existed in the working tree. If we are merging an old version that had that path tracked, but the path is not tracked anymore, and if we are merging that old version in, the result will be that the path is not tracked. In that case we should not complain. Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-23  Convert memcpy(a,b,20) to hashcpy(a,b).  Shawn Pearce
This abstracts away the size of the hash values when copying them from memory location to memory location, much as the introduction of hashcmp abstracted away hash value comparison.

A few call sites were using char* rather than unsigned char* so I added the cast rather than open hashcpy to be void*. This is a reasonable tradeoff as most call sites already use unsigned char* and the existing hashcmp is also declared to be unsigned char*.

[jc: Split the patch into the "master" part, to be followed by a patch for merge-recursive.c which is not in "master" yet. Fixed the cast in the latter hunk to combine-diff.c which was wrong in the original. Also converted ones left-over in combine-diff.c, diff-lib.c and upload-pack.c ]

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-17  Do not use memcmp(sha1_1, sha1_2, 20) with hardcoded length.  David Rientjes
Introduces global inline: hashcmp(const unsigned char *sha1, const unsigned char *sha2) Uses memcmp for comparison and returns the result based on the length of the hash name (a future runtime decision). Acked-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
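A minimal sketch of what such wrappers might look like is shown below. The hashcmp() signature is the one quoted in the commit message; the function bodies, the hashcpy() signature, and the `HASH_LEN` constant are assumptions made for illustration, not copied from git's headers.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical constant; the point is that callers no longer spell out 20. */
#define HASH_LEN 20

/* Compare two raw hashes; same result convention as memcmp(). */
static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
{
    return memcmp(sha1, sha2, HASH_LEN);
}

/* Copy one raw hash over another. */
static inline void hashcpy(unsigned char *dst, const unsigned char *src)
{
    assert(dst != src); /* memcpy() requires non-overlapping buffers */
    memcpy(dst, src, HASH_LEN);
}
```

Call sites that previously wrote `memcmp(a, b, 20)` or `memcpy(a, b, 20)` then depend on a single definition of the hash length.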
2006-08-16  remove unnecessary initializations  David Rientjes
[jc: I needed to hand merge the changes to the updated codebase, so the result needs to be checked.] Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-08-15  use appropriate typedefs  David Rientjes
Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-31  read-tree: move merge functions to the library  Johannes Schindelin
This will allow merge-recursive to use the read-tree functionality without exec()ing git-read-tree. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-07-31  read-trees: refactor the unpack_trees() part  Johannes Schindelin
Basically, the options are passed by a struct unpack_trees_options now. That's all. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>