From ea97002fc9f682a804ac05212d069e38fa3e365c Mon Sep 17 00:00:00 2001 From: Jeff King Date: Mon, 20 Jan 2014 21:25:12 -0500 Subject: t/perf: time rev-list with UNINTERESTING commits We time a straight "rev-list --all" and its "--object" counterpart, both going all the way to the root. However, we do not time a partial history walk. This patch adds an extreme case: a walk over a very small slice of history, but with a very large set of UNINTERESTING tips. This is similar to the connectivity check run by git on a small fetch, or the walk done by any pre-receive hooks that want to check incoming commits. This test reveals a performance regression in git v1.8.4.2, caused by fbd4a70 (list-objects: mark more commits as edges in mark_edges_uninteresting, 2013-08-16): Test fbd4a703^ fbd4a703 ------------------------------------------------------------------------------------------ 0001.1: rev-list --all 0.69(0.67+0.02) 0.69(0.68+0.01) +0.0% 0001.2: rev-list --all --objects 3.47(3.44+0.02) 3.48(3.44+0.03) +0.3% 0001.4: rev-list $commit --not --all 0.04(0.04+0.00) 0.04(0.04+0.00) +0.0% 0001.5: rev-list --objects $commit --not --all 0.04(0.03+0.00) 0.27(0.24+0.02) +575.0% Signed-off-by: Jeff King Signed-off-by: Junio C Hamano diff --git a/t/perf/p0001-rev-list.sh b/t/perf/p0001-rev-list.sh index 4f71a63..16359d5 100755 --- a/t/perf/p0001-rev-list.sh +++ b/t/perf/p0001-rev-list.sh @@ -14,4 +14,16 @@ test_perf 'rev-list --all --objects' ' git rev-list --all --objects >/dev/null ' +test_expect_success 'create new unreferenced commit' ' + commit=$(git commit-tree HEAD^{tree} -p HEAD) +' + +test_perf 'rev-list $commit --not --all' ' + git rev-list $commit --not --all >/dev/null +' + +test_perf 'rev-list --objects $commit --not --all' ' + git rev-list --objects $commit --not --all >/dev/null +' + test_done -- cgit v0.10.2-6-g49f6 From 200abe7458357c83f3859ce6dbf89ea5d4d09b3d Mon Sep 17 00:00:00 2001 From: Jeff King Date: Mon, 20 Jan 2014 21:25:40 -0500 Subject: list-objects: only look at cmdline trees with edge_hint When rev-list is given a command-line like: git rev-list --objects $commit --not --all the most accurate answer is the difference between the set of objects reachable from $commit and the set reachable from all of the existing refs. However, we have not historically provided that answer, because it is very expensive to calculate. We would have to open every tree of every commit in the entire history. Instead, we find the accurate set difference of the reachable commits, and then mark the trees at the boundaries as uninteresting. This misses objects which appear in the trees of both the interesting commits and deep within the uninteresting history. Commit fbd4a70 (list-objects: mark more commits as edges in mark_edges_uninteresting, 2013-08-16) noticed that we miss those objects during pack-objects, and added code to examine the trees of all of the "--not" refs given on the command-line. Note that this is still not the complete set difference, because we look only at the tips of the command-line arguments, not all of their reachable commits. But it increases the set of boundary objects we consider, which is especially important for shallow fetches. So we are trading extra CPU time for a larger set of boundary objects, which can improve the resulting pack size for a --thin pack. This tradeoff probably makes sense in the context of pack-objects, where we have set revs->edge_hint to have the traversal feed us the set of boundary objects. For a regular rev-list, though, it is probably not a good tradeoff. It is true that it makes our list slightly closer to a true set difference, but it is a rare case where this is important. And because we do not have revs->edge_hint set, we do nothing useful with the larger set of boundary objects. This patch therefore ties the extra tree examination to the revs->edge_hint flag; it is the presence of that flag that makes the tradeoff worthwhile. Here is output from the p0001-rev-list showing the improvement in performance: Test HEAD^ HEAD ----------------------------------------------------------------------------------------- 0001.1: rev-list --all 0.69(0.65+0.02) 0.69(0.66+0.02) +0.0% 0001.2: rev-list --all --objects 3.22(3.19+0.03) 3.23(3.20+0.03) +0.3% 0001.4: rev-list $commit --not --all 0.04(0.04+0.00) 0.04(0.04+0.00) +0.0% 0001.5: rev-list --objects $commit --not --all 0.27(0.26+0.01) 0.04(0.04+0.00) -85.2% Signed-off-by: Jeff King Signed-off-by: Junio C Hamano diff --git a/list-objects.c b/list-objects.c index 05c8c5c..8b39b50 100644 --- a/list-objects.c +++ b/list-objects.c @@ -163,15 +163,17 @@ void mark_edges_uninteresting(struct rev_info *revs, show_edge_fn show_edge) } mark_edge_parents_uninteresting(commit, revs, show_edge); } - for (i = 0; i < revs->cmdline.nr; i++) { - struct object *obj = revs->cmdline.rev[i].item; - struct commit *commit = (struct commit *)obj; - if (obj->type != OBJ_COMMIT || !(obj->flags & UNINTERESTING)) - continue; - mark_tree_uninteresting(commit->tree); - if (revs->edge_hint && !(obj->flags & SHOWN)) { - obj->flags |= SHOWN; - show_edge(commit); + if (revs->edge_hint) { + for (i = 0; i < revs->cmdline.nr; i++) { + struct object *obj = revs->cmdline.rev[i].item; + struct commit *commit = (struct commit *)obj; + if (obj->type != OBJ_COMMIT || !(obj->flags & UNINTERESTING)) + continue; + mark_tree_uninteresting(commit->tree); + if (!(obj->flags & SHOWN)) { + obj->flags |= SHOWN; + show_edge(commit); + } } } } -- cgit v0.10.2-6-g49f6