path: root/diff.c
diff options
authorJeff King <>2010-04-02 00:14:24 (GMT)
committerJunio C Hamano <>2010-04-02 07:11:20 (GMT)
commitb3373982667dc983b8dacf33861d25b20bafb995 (patch)
tree852fb52d64888301061e03ec99f4906385cb3aa9 /diff.c
parentd9bae1a178f0f8b198ea611e874975214ad6f990 (diff)
diff: avoid useless filespec population
builtin_diff calls fill_mmfile fairly early, which in turn calls diff_populate_filespec, which actually retrieves the file's blob contents into a buffer. Long ago, this was sensible as we would need to look at the blobs eventually. These days, however, we may not ever want those blobs if we end up using a textconv cache, and for large binary files (exactly the sort for which you might have a textconv cache), just retrieving the objects can be costly. This patch just pushes the fill_mmfile call a bit later, so we can avoid populating the filespec in some cases. There is one thing to note that looks like a bug but isn't. We push the fill_mmfile down into the first branch of a conditional. It seems like we would need it on the other branch, too, but we don't; fill_textconv does it for us (in fact, before this, we were just writing over the results of the fill_mmfile on that branch). Here's a timing sample on a commit with 45 changed jpgs and avis. The result is fully textconv cached, but we still wasted a lot of time just pulling the blobs from storage. The total size of the blobs (source and dest) is about 180M. [before] $ time git show >/dev/null real 0m0.352s user 0m0.148s sys 0m0.200s [after] $ time git show >/dev/null real 0m0.009s user 0m0.004s sys 0m0.004s And that's on a warm cache. On a cold cache, the "after" case is not much worse, but the "before" case has to do an extra 180M of I/O. Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
Diffstat (limited to 'diff.c')
1 files changed, 4 insertions, 5 deletions
diff --git a/diff.c b/diff.c
index 72d8503..4cb6d9a 100644
--- a/diff.c
+++ b/diff.c
@@ -1680,12 +1680,11 @@ static void builtin_diff(const char *name_a,
- if (fill_mmfile(&mf1, one) < 0 || fill_mmfile(&mf2, two) < 0)
- die("unable to read files to diff");
if (!DIFF_OPT_TST(o, TEXT) &&
- ( (diff_filespec_is_binary(one) && !textconv_one) ||
- (diff_filespec_is_binary(two) && !textconv_two) )) {
+ ( (!textconv_one && diff_filespec_is_binary(one)) ||
+ (!textconv_two && diff_filespec_is_binary(two)) )) {
+ if (fill_mmfile(&mf1, one) < 0 || fill_mmfile(&mf2, two) < 0)
+ die("unable to read files to diff");
/* Quite common confusing case */
if (mf1.size == mf2.size &&
!memcmp(mf1.ptr, mf2.ptr, mf1.size))