path: root/Documentation/git-gc.txt
author Matt McCutchen <matt@mattmccutchen.net> 2016-11-15 19:08:51 (GMT)
committer Junio C Hamano <gitster@pobox.com> 2016-11-16 21:42:17 (GMT)
commit f1350d0c125a1e6a73e3b6461fa90c77843c5f74 (patch)
tree 41e3d58e1632d5d3d417bb7b79075a3015dcd4fb /Documentation/git-gc.txt
parent 0b65a8dbdb38962e700ee16776a3042beb489060 (diff)
git-gc.txt: expand discussion of races with other processes
In general, "git gc" may delete objects that another concurrent process
is using but hasn't created a reference to. Git has some mitigations,
but they fall short of a complete solution. Document this in the
git-gc(1) man page and add a reference from the documentation of the
gc.pruneExpire config variable.

Based on a write-up by Jeff King:

http://marc.info/?l=git&m=147922960131779&w=2

Signed-off-by: Matt McCutchen <matt@mattmccutchen.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 'Documentation/git-gc.txt')
-rw-r--r-- Documentation/git-gc.txt | 34
1 files changed, 26 insertions, 8 deletions
diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
index fa15104..4e6ef8f 100644
--- a/Documentation/git-gc.txt
+++ b/Documentation/git-gc.txt
@@ -63,11 +63,10 @@ automatic consolidation of packs.
--prune=<date>::
Prune loose objects older than date (default is 2 weeks ago,
overridable by the config variable `gc.pruneExpire`).
- --prune=all prunes loose objects regardless of their age (do
- not use --prune=all unless you know exactly what you are doing.
- Unless the repository is quiescent, you will lose newly created
- objects that haven't been anchored with the refs and end up
- corrupting your repository). --prune is on by default.
+ --prune=all prunes loose objects regardless of their age and
+ increases the risk of corruption if another process is writing to
+ the repository concurrently; see "NOTES" below. --prune is on by
+ default.
--no-prune::
Do not prune any loose objects.
@@ -138,17 +137,36 @@ default is "2 weeks ago".
Notes
-----
-'git gc' tries very hard to be safe about the garbage it collects. In
+'git gc' tries very hard not to delete objects that are referenced
+anywhere in your repository. In
particular, it will keep not only objects referenced by your current set
of branches and tags, but also objects referenced by the index,
remote-tracking branches, refs saved by 'git filter-branch' in
refs/original/, or reflogs (which may reference commits in branches
that were later amended or rewound).
-
-If you are expecting some objects to be collected and they aren't, check
+If you are expecting some objects to be deleted and they aren't, check
all of those locations and decide whether it makes sense in your case to
remove those references.
+On the other hand, when 'git gc' runs concurrently with another process,
+there is a risk of it deleting an object that the other process is using
+but hasn't created a reference to. This may just cause the other process
+to fail or may corrupt the repository if the other process later adds a
+reference to the deleted object. Git has two features that significantly
+mitigate this problem:
+
+. Any object with modification time newer than the `--prune` date is kept,
+ along with everything reachable from it.
+
+. Most operations that add an object to the database update the
+ modification time of the object if it is already present so that #1
+ applies.
+
+However, these features fall short of a complete solution, so users who
+run commands concurrently have to live with some risk of corruption (which
+seems to be low in practice) unless they turn off automatic garbage
+collection with 'git config gc.auto 0'.
+
HOOKS
-----
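
The two mitigations described in the added Notes text can be illustrated with a toy model. The sketch below is not Git's implementation: `LooseObject`, `reachable`, `objects_to_keep`, and `freshen` are hypothetical names invented here, objects are plain dictionary entries, and timestamps are bare numbers. It only shows why an object with a recent modification time protects everything reachable from it, and why updating the modification time of an already-present object brings it under that protection.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class LooseObject:
    """Hypothetical stand-in for an object in the database."""
    mtime: float                                      # modification time
    links: List[str] = field(default_factory=list)    # ids this object points to


def reachable(store: Dict[str, LooseObject], roots: Iterable[str]) -> Set[str]:
    """Return every id in `store` reachable from `roots` by following links."""
    seen: Set[str] = set()
    stack = [r for r in roots if r in store]
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        stack.extend(l for l in store[oid].links if l in store)
    return seen


def objects_to_keep(store: Dict[str, LooseObject],
                    ref_tips: Iterable[str],
                    prune_cutoff: float) -> Set[str]:
    """Model which objects survive a prune with the given cutoff time."""
    # Objects referenced by refs, the index, reflogs, etc. are always kept.
    keep = reachable(store, ref_tips)
    # Mitigation 1: objects newer than the cutoff are kept,
    # along with everything reachable from them.
    recent = [oid for oid, obj in store.items() if obj.mtime > prune_cutoff]
    keep |= reachable(store, recent)
    return keep


def freshen(store: Dict[str, LooseObject], oid: str, now: float) -> None:
    """Mitigation 2: re-adding an existing object updates its mtime."""
    if oid in store:
        store[oid].mtime = now


if __name__ == "__main__":
    store = {
        "commit_ref": LooseObject(mtime=50.0),
        "tree_new": LooseObject(mtime=900.0, links=["blob_old"]),
        "blob_old": LooseObject(mtime=100.0),
        "blob_orphan": LooseObject(mtime=100.0),
    }
    print(sorted(objects_to_keep(store, ["commit_ref"], prune_cutoff=500.0)))
    freshen(store, "blob_orphan", now=1000.0)
    print(sorted(objects_to_keep(store, ["commit_ref"], prune_cutoff=500.0)))
```

In this model `blob_orphan` is old and unreferenced, so it is dropped on the first pass and kept only after `freshen` runs. The model has no notion of a second process acting between the keep decision and the deletion, which is the window the patch text says remains open.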