author: Junio C Hamano <> 2007-04-16 07:42:29 (GMT)
committer: Junio C Hamano <> 2007-04-17 07:14:59 (GMT)
commit: b9849a1ab63143c3b70e339491a897ef62a4173b
tree: 10cb647e7941d2553ea5a3464dd36ddbd8abe1b9 /builtin-rev-list.c
parent: e3c6f240fd9c5bdeb33f2d47adc859f37935e2df
Make sure quickfetch is not fooled with a previous, incomplete fetch.
This updates git-rev-list --objects to be a bit more careful when listing a blob object to make sure the blob actually exists, and uses it to make sure the quick-fetch optimization we introduced earlier is not fooled by a previous incomplete fetch.

The quick-fetch optimization works by running this command:

    git rev-list --objects <<commit-list>> --not --all

where <<commit-list>> is a list of commits that we are going to fetch from the other side. If any object needed to complete the <<commit-list>> is missing, the rev-list fails and dies (say, the commit was in our repository, but its tree wasn't -- then it will barf while trying to list the blobs the tree contains because it cannot read that tree).

Usually we do not have the objects (otherwise why would we be fetching?), but in one important special case we do: when the remote repository is used as an alternate object store (i.e. pointed to by .git/objects/info/alternates). We could check .git/objects/info/alternates to see if the remote we are interacting with is one of them (or is used as an alternate, recursively, by one of them), but that check is more cumbersome than it is worth.

The above check, however, did not catch a missing blob, because the object listing code neither read nor checked blob objects, knowing that blobs do not contain any further references to other objects. This commit fixes it with practically unmeasurable overhead.

I've benched this with

    git rev-list --objects --all >/dev/null

in the kernel repository, with three different implementations of the "check-blob".

 - Checking with has_sha1_file() has negligible (unmeasurable) performance penalty.

 - Checking with sha1_object_info() makes it somewhat slower, perhaps by 5%.

 - Checking with read_sha1_file() to cause a full re-validation is prohibitively expensive (about 4 times as much runtime).

In my original patch, I had this as a command line option, but the overhead is small enough that it is not really worth it.

Signed-off-by: Junio C Hamano <>
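As a rough illustration of the check described above, here is a minimal, self-contained sketch in C. It is not git's code: struct store, store_has(), first_missing_blob() and can_skip_fetch() are hypothetical names invented for this example, and short strings stand in for SHA-1s. store_has() plays the role that has_sha1_file() plays in the patch, namely a cheap existence probe that does not read or validate the object's contents.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for git's object types; not git's real definitions. */
enum obj_type { OBJ_COMMIT, OBJ_TREE, OBJ_BLOB };

struct obj {
	enum obj_type type;
	const char *id;		/* stand-in for the object's SHA-1 */
};

/* Toy object store: an array of ids that are present locally. */
struct store {
	const char **ids;
	size_t nr;
};

/* Cheap existence probe; assumed analogue of has_sha1_file(). */
static bool store_has(const struct store *s, const char *id)
{
	for (size_t i = 0; i < s->nr; i++)
		if (!strcmp(s->ids[i], id))
			return true;
	return false;
}

/*
 * Walk the listed objects and return the id of the first blob that is
 * not in the store, or NULL if every listed blob is present.  Only
 * blobs are probed here: in the real walker, commits and trees are
 * read anyway, so a missing one is noticed without an extra check.
 */
static const char *first_missing_blob(const struct store *s,
				      const struct obj *list, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		if (list[i].type == OBJ_BLOB && !store_has(s, list[i].id))
			return list[i].id;
	return NULL;
}

/* Quick-fetch style decision: skip the transfer only if nothing is missing. */
static bool can_skip_fetch(const struct store *s,
			   const struct obj *list, size_t nr)
{
	return first_missing_blob(s, list, nr) == NULL;
}
```

With a store holding one commit, one tree and one blob, can_skip_fetch() reports that the fetch can be skipped for a list of exactly those three objects, and reports that it cannot once the list also names a blob absent from the store. The real patch dies on the missing blob instead of returning a value; the sketch returns the id only so that the outcome is easy to inspect.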
Diffstat (limited to 'builtin-rev-list.c')
1 files changed, 4 insertions, 0 deletions
diff --git a/builtin-rev-list.c b/builtin-rev-list.c
index 09774f9..c0329dc 100644
--- a/builtin-rev-list.c
+++ b/builtin-rev-list.c
@@ -113,6 +113,10 @@ static void show_object(struct object_array_entry *p)
 	 * confuse downstream git-pack-objects very badly.
 	 */
 	const char *ep = strchr(p->name, '\n');
+
+	if (p->item->type == OBJ_BLOB && !has_sha1_file(p->item->sha1))
+		die("missing blob object '%s'", sha1_to_hex(p->item->sha1));
+
 	if (ep) {
 		printf("%s %.*s\n", sha1_to_hex(p->item->sha1),
 		       (int) (ep - p->name),