author Junio C Hamano <> 2014-01-10 18:32:25 (GMT)
committer Junio C Hamano <> 2014-01-10 18:32:25 (GMT)
commit 8a334727fc5b2ea0b173ca0af4cfb2209a88139a
tree 6ba05e0d0f1e677844c72d5d6c9aee79c07d9d57 /Documentation
parent 061614b309453e98cba5dcc8f922d7bef791451d
parent 615b8f1a8d41e6c27f308e74eacb5ef9e99a26af
Merge branch 'rt/bfg-ad-in-filter-branch-doc'
* rt/bfg-ad-in-filter-branch-doc: docs: add filter-branch notes on The BFG
@@ -393,7 +393,7 @@ git filter-branch --index-filter \
Checklist for Shrinking a Repository
-git-filter-branch is often used to get rid of a subset of files,
+git-filter-branch can be used to get rid of a subset of files,
usually with some combination of `--index-filter` and
`--subdirectory-filter`. People expect the resulting repository to
be smaller than the original, but you need a few more steps to
@@ -429,6 +429,37 @@ warned.
(or if your git-gc is not new enough to support arguments to
`--prune`, use `git repack -ad; git prune` instead).
+git-filter-branch allows you to make complex shell-scripted rewrites
+of your Git history, but you probably don't need this flexibility if
+you're simply _removing unwanted data_ like large files or passwords.
+For those operations you may want to consider
+link:[The BFG Repo-Cleaner],
+a JVM-based alternative to git-filter-branch, typically at least
+10-50x faster for those use-cases, and with quite different
+characteristics:
+* Any particular version of a file is cleaned exactly _once_. The BFG,
+ unlike git-filter-branch, does not give you the opportunity to
+ handle a file differently based on where or when it was committed
+ within your history. This constraint gives the core performance
+ benefit of The BFG, and is well-suited to the task of cleansing bad
+ data - you don't care _where_ the bad data is, you just want it
+ _gone_.
+* By default The BFG takes full advantage of multi-core machines,
+ cleansing commit file-trees in parallel. git-filter-branch cleans
 commits sequentially (i.e. in a single-threaded manner), though it
 _is_ possible to write filters that include their own parallelism,
+ in the scripts executed against each commit.
+* The link:[command options]
 are much more restrictive than those of git-filter-branch, and dedicated
 just to the task of removing unwanted data, e.g.:
+ `--strip-blobs-bigger-than 1M`.
Part of the linkgit:git[1] suite
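
The workflow this documentation change refers to (rewrite history with `--index-filter`, then take the extra steps needed for the repository to actually shrink) can be sketched roughly as follows. This is a minimal illustration and not part of the commit: the throwaway repository, the file names `keep.txt` and `passwords.txt`, and the demo identity are all hypothetical, and `FILTER_BRANCH_SQUELCH_WARNING` is set on the assumption that a recent Git is in use.

```shell
#!/bin/sh
# Sketch only: everything happens in a throwaway repository under a temp dir.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # assumption: recent Git, which otherwise prints a warning and pauses

work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo"

echo "hello" > keep.txt
echo "s3cret" > passwords.txt              # hypothetical unwanted file
git add keep.txt passwords.txt
git commit -q -m "initial commit"

# Rewrite history, dropping passwords.txt from the index of every commit.
git filter-branch -f --index-filter \
    'git rm --cached --ignore-unmatch passwords.txt' HEAD >/dev/null 2>&1

# Remove the backup refs and expire reflogs so the old objects become
# unreachable, then prune them.
git for-each-ref --format="%(refname)" refs/original/ |
    xargs -n 1 git update-ref -d
git reflog expire --expire=now --all
git gc -q --prune=now

# Count how many times passwords.txt still appears anywhere in history.
remaining=$(git log --all --format= --name-only -- passwords.txt | wc -l | tr -d ' ')
echo "remaining=$remaining"
```

For the simple "remove unwanted data" case, the added text suggests The BFG instead, with options such as `--strip-blobs-bigger-than 1M`; the cleanup steps after the rewrite are the part that applies regardless of which tool performs it.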