path: root/Documentation/gitattributes.txt
diff options
authorJeff King <>2010-04-02 00:12:15 (GMT)
committerJunio C Hamano <>2010-04-02 07:05:31 (GMT)
commitd9bae1a178f0f8b198ea611e874975214ad6f990 (patch)
tree33918127aca49cf9c33f9d83371e4725641f5333 /Documentation/gitattributes.txt
parent840383b2c2bd7179604f5c2595bf95e22a4e0c84 (diff)
diff: cache textconv output
Running a textconv filter can take a long time. It's particularly bad for a large file which needs to be spooled to disk, but even for small files, the fork+exec overhead can add up for something like "git log -p". This patch uses the notes-cache mechanism to keep a fast cache of textconv output. Caches are stored in refs/notes/textconv/$x, where $x is the userdiff driver defined in gitattributes. Caching is enabled only if diff.$x.cachetextconv is true. In my test repo, on a commit with 45 jpg and avi files changed and a textconv to show their exif tags: [before] $ time git show >/dev/null real 0m13.724s user 0m12.057s sys 0m1.624s [after, first run] $ git config diff.mfo.cachetextconv true $ time git show >/dev/null real 0m14.252s user 0m12.197s sys 0m1.800s [after, subsequent runs] $ time git show >/dev/null real 0m0.352s user 0m0.148s sys 0m0.200s So for a slight (3.8%) cost on the first run, we achieve an almost 40x speed up on subsequent runs. Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
Diffstat (limited to 'Documentation/gitattributes.txt')
1 files changed, 20 insertions, 0 deletions
diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index d892e64..a8500d1 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -414,6 +414,26 @@ because it quickly conveys the changes you have made), you
should generate it separately and send it as a comment _in
addition to_ the usual binary diff that you might send.
+Because text conversion can be slow, especially when doing a
+large number of them with `git log -p`, git provides a mechanism
+to cache the output and use it in future diffs. To enable
+caching, set the "cachetextconv" variable in your diff driver's
+config. For example:
+[diff "jpg"]
+ textconv = exif
+ cachetextconv = true
+This will cache the result of running "exif" on each blob
+indefinitely. If you change the textconv config variable for a
+diff driver, git will automatically invalidate the cache entries
+and re-run the textconv filter. If you want to invalidate the
+cache manually (e.g., because your version of "exif" was updated
+and now produces better output), you can remove the cache
+manually with `git update-ref -d refs/notes/textconv/jpg` (where
+"jpg" is the name of the diff driver, as in the example above).
Performing a three-way merge