author: Jeff King <peff@peff.net> 2017-11-21 23:17:39 (GMT)
committer: Junio C Hamano <gitster@pobox.com> 2017-11-22 01:50:11 (GMT)
commit: 87b5e236a14d4783b063041588c2f77f3cc6ee89
tree: fe370056cc19a5e0e6ae39fab7c2172ad0113369 /sha1_file.c
parent: c291293b2ecec8ca77dfd218fa820dd7a0137a2b
sha1_file: fast-path null sha1 as a missing object

In theory nobody should ever ask the low-level object code for a null
sha1. It's used as a sentinel for "no such object" in lots of places, so
leaking through to this level is a sign that the higher-level code is
not being careful about its error-checking.

In practice, though, quite a few code paths seem to rely on the null
sha1 lookup failing as a way to quietly propagate non-existence (e.g.,
by feeding it to lookup_commit_reference_gently(), which then returns
NULL). When this happens, we do two inefficient things:

  1. We actually search for the null sha1 in packs and in the loose
     object directory.

  2. When we fail to find it, we re-scan the pack directory in case a
     simultaneous repack happened to move it from loose to packed. This
     can be very expensive if you have a large number of packs.

Only the second one actually causes noticeable performance problems, so
we could treat them independently. But for the sake of simplicity (both
of code and of reasoning about it), it makes sense to just declare that
the null sha1 cannot be a real on-disk object, and looking it up will
always return "no such object".

There's no real loss of functionality to do so. Its use as a sentinel
value means that anybody who is unlucky enough to hit the 2^-160th
chance of generating an object with that sha1 is already going to find
the object largely unusable.

In an ideal world, we'd simply fix all of the callers to notice the null
sha1 and avoid passing it to us. But a simple experiment to catch this
with a BUG() shows that there are a large number of code paths that do
so.

So in the meantime, let's fix the performance problem by taking a fast
exit from the object lookup when we see a null sha1.

p5551 shows off the improvement (when a fetched ref is new, the "old"
sha1 is 0{40}, which ends up being passed for fast-forward checks, the
status table abbreviations, etc):

  Test            HEAD^             HEAD
  --------------------------------------------------------
  5551.4: fetch   5.51(5.03+0.48)   0.17(0.10+0.06) -96.9%

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>

Diffstat (limited to 'sha1_file.c')
-rw-r--r-- sha1_file.c | 3
1 files changed, 3 insertions, 0 deletions
diff --git a/sha1_file.c b/sha1_file.c
index bd5f82e..fbb73f5 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2971,6 +2971,9 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
 				    lookup_replace_object(sha1) :
 				    sha1;
+	if (is_null_sha1(real))
+		return -1;
+
 	if (!oi)
 		oi = &blank_oi;
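
The effect of the early return can be illustrated outside of git with a minimal, self-contained sketch. Nothing here is git's actual code: `hash_is_null()`, `search_store()`, `rescan_packs()`, `object_lookup()` and the two counters are hypothetical stand-ins for the null check, the pack/loose search, and the pack directory re-scan described in the commit message. The toy store is assumed to be empty, so every search misses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HASH_RAWSZ 20 /* raw SHA-1 length in bytes */

/* Hypothetical counters standing in for the expensive work. */
static int store_searches;
static int pack_rescans;

/* Return 1 if all HASH_RAWSZ bytes are zero, 0 otherwise. */
static int hash_is_null(const unsigned char *hash)
{
	static const unsigned char null_hash[HASH_RAWSZ];
	return !memcmp(hash, null_hash, HASH_RAWSZ);
}

/* Toy object store: it holds nothing, so every search is a miss (-1). */
static int search_store(const unsigned char *hash)
{
	(void)hash;
	store_searches++;
	return -1;
}

/* Stand-in for re-reading the pack directory after a miss. */
static void rescan_packs(void)
{
	pack_rescans++;
}

/*
 * Look up an object. The null hash is declared "never a real object",
 * so it exits before any search or re-scan is attempted.
 */
static int object_lookup(const unsigned char *hash)
{
	assert(hash != NULL);

	if (hash_is_null(hash))
		return -1; /* fast path: skip both inefficient steps */

	if (!search_store(hash))
		return 0;

	/* Miss: a concurrent repack may have moved it, so look again. */
	rescan_packs();
	return search_store(hash);
}
```

In this sketch a null hash returns -1 without touching either counter, while any other missing hash costs two searches and one re-scan. That re-scan is the cost the commit message identifies as the noticeable one.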