summaryrefslogtreecommitdiff
path: root/builtin
diff options
context:
space:
mode:
authorJeff King <peff@peff.net>2016-10-13 16:53:44 (GMT)
committerJunio C Hamano <gitster@pobox.com>2016-10-14 18:31:32 (GMT)
commit5827a03545663f6d6b491a35edb313900608568b (patch)
treee634adc0f8ecf4bc8cc83f7a16c2800615044656 /builtin
parent0202c411edc25940cc381bf317badcdf67670be4 (diff)
downloadgit-5827a03545663f6d6b491a35edb313900608568b.zip
git-5827a03545663f6d6b491a35edb313900608568b.tar.gz
git-5827a03545663f6d6b491a35edb313900608568b.tar.bz2
fetch: use "quick" has_sha1_file for tag following
When we auto-follow tags in a fetch, we look at all of the tags advertised by the remote and fetch ones where we don't already have the tag, but we do have the object it peels to. This involves a lot of calls to has_sha1_file(), some of which we can reasonably expect to fail. Since 45e8a74 (has_sha1_file: re-check pack directory before giving up, 2013-08-30), this may cause many calls to reprepare_packed_git(), which is potentially expensive. This has gone unnoticed for several years because it requires a fairly unique setup to matter: 1. You need to have a lot of packs on the client side to make reprepare_packed_git() expensive (the most expensive part is finding duplicates in an unsorted list, which is currently quadratic). 2. You need a large number of tag refs on the server side that are candidates for auto-following (i.e., that the client doesn't have). Each one triggers a re-read of the pack directory. 3. Under normal circumstances, the client would auto-follow those tags and after one large fetch, (2) would no longer be true. But if those tags point to history which is disconnected from what the client otherwise fetches, then it will never auto-follow, and those candidates will impact it on every fetch. So when all three are true, each fetch pays an extra O(nr_tags * nr_packs^2) cost, mostly in string comparisons on the pack names. This was exacerbated by 47bf4b0 (prepare_packed_git_one: refactor duplicate-pack check, 2014-06-30) which uses a slightly more expensive string check, under the assumption that the duplicate check doesn't happen very often (and it shouldn't; the real problem here is how often we are calling reprepare_packed_git()). This patch teaches fetch to use HAS_SHA1_QUICK to sacrifice accuracy for speed, in cases where we might be racy with a simultaneous repack. This is similar to the fix in 0eeb077 (index-pack: avoid excessive re-reading of pack directory, 2015-06-09). As with that case, it's OK for has_sha1_file() occasionally say "no I don't have it" when we do, because the worst case is not a corruption, but simply that we may fail to auto-follow a tag that points to it. Here are results from the included perf script, which sets up a situation similar to the one described above: Test HEAD^ HEAD ---------------------------------------------------------- 5550.4: fetch 11.21(10.42+0.78) 0.08(0.04+0.02) -99.3% Reported-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 'builtin')
-rw-r--r--builtin/fetch.c11
1 files changed, 7 insertions, 4 deletions
diff --git a/builtin/fetch.c b/builtin/fetch.c
index f896aa1..3e1a266 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -232,9 +232,10 @@ static void find_non_local_tags(struct transport *transport,
* as one to ignore by setting util to NULL.
*/
if (ends_with(ref->name, "^{}")) {
- if (item && !has_object_file(&ref->old_oid) &&
+ if (item &&
+ !has_object_file_with_flags(&ref->old_oid, HAS_SHA1_QUICK) &&
!will_fetch(head, ref->old_oid.hash) &&
- !has_sha1_file(item->util) &&
+ !has_sha1_file_with_flags(item->util, HAS_SHA1_QUICK) &&
!will_fetch(head, item->util))
item->util = NULL;
item = NULL;
@@ -247,7 +248,8 @@ static void find_non_local_tags(struct transport *transport,
* to check if it is a lightweight tag that we want to
* fetch.
*/
- if (item && !has_sha1_file(item->util) &&
+ if (item &&
+ !has_sha1_file_with_flags(item->util, HAS_SHA1_QUICK) &&
!will_fetch(head, item->util))
item->util = NULL;
@@ -267,7 +269,8 @@ static void find_non_local_tags(struct transport *transport,
* We may have a final lightweight tag that needs to be
* checked to see if it needs fetching.
*/
- if (item && !has_sha1_file(item->util) &&
+ if (item &&
+ !has_sha1_file_with_flags(item->util, HAS_SHA1_QUICK) &&
!will_fetch(head, item->util))
item->util = NULL;