author Rafael Silva <rafaeloliveira.cs@gmail.com> 2021-04-21 19:32:12 (GMT)
committer Junio C Hamano <gitster@pobox.com> 2021-04-28 04:36:13 (GMT)
commit a643157d5ac8dddcbf9bfd4fbbd1af914fbb1378
tree 4e472c97217e778daeb3b6da72db3b9d8613de79 /t/perf/p5600-partial-clone.sh
parent c1fa951d7ea9e943f001ac7c7502995273db5776
repack: avoid loosening promisor objects in partial clones
When `git repack -A -d` is run in a partial clone, `pack-objects` is invoked twice: once to repack all promisor objects, and once to repack all non-promisor objects. The latter `pack-objects` invocation is run with --exclude-promisor-objects and --unpack-unreachable, which loosens all objects unused during this invocation. Unfortunately, this includes promisor objects.

Because the -d argument to `git repack` subsequently deletes all loose objects also in packs, these just-loosened promisor objects will be immediately deleted. However, this extra disk churn is unnecessary in the first place. For example, in a newly-cloned partial repo that filters all blob objects (e.g. `--filter=blob:none`), `repack` ends up unpacking all trees and commits into the filesystem because every object, in this particular case, is a promisor object. Depending on the repo size, this increases the disk usage considerably: in my copy of linux.git, the object directory peaked at 26GB of additional disk usage.

In order to avoid this extra disk churn, pass the names of the promisor packfiles as --keep-pack arguments to the second invocation of `pack-objects`. This informs `pack-objects` that the promisor objects are already in a safe packfile and, therefore, do not need to be loosened.

For testing, we need to validate whether any object was loosened. However, the "evidence" (loosened objects) is deleted during the process, which prevents us from inspecting the object directory. Instead, let's teach `pack-objects` to count loosened objects and emit the count via trace2, thus allowing the debug events to be inspected after the process has finished. This new event is used in the added regression test.
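
The trace2-based check described above could be scripted roughly as follows. This is only a sketch: the event key `loosen_unused_packed_objects/loosened` is an assumption (the message above does not name it), and `loosened_count`, the `client` directory, and the `trace` file are hypothetical names used for illustration.

```shell
#!/bin/sh
# Print the loosened-object count recorded in a GIT_TRACE2_PERF log.
# Assumption: the count is logged as
#   loosen_unused_packed_objects/loosened:<n>
# loosened_count is a hypothetical helper, not part of git's test suite.
loosened_count () {
	# Keep only the digits following the key; use the last match if
	# pack-objects logged the event more than once.
	sed -n 's|.*loosen_unused_packed_objects/loosened:\([0-9][0-9]*\).*|\1|p' "$1" |
	tail -n 1
}

# Hypothetical usage in a partial clone located at ./client:
#   GIT_TRACE2_PERF="$PWD/trace" git -C client repack -A -d
#   test "$(loosened_count trace)" = 0
```

A count of 0 after `git repack -A -d` would indicate that no promisor object was loosened.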
Lastly, add a new perf test to evaluate the performance impact made by these changes (tested on git.git):

    Test          HEAD^                  HEAD
    ----------------------------------------------------------
    5600.3: gc    134.38(41.93+90.95)    7.80(6.72+1.35) -94.2%

For a bigger repository, such as linux.git, the improvement is even bigger:

    Test          HEAD^                       HEAD
    -------------------------------------------------------------------
    5600.3: gc    6833.00(918.07+3162.74)     268.79(227.02+39.18) -96.1%

These improvements are particularly big because every object in the newly-cloned partial repository is a promisor object.

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
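
As a quick sanity check, the percentages in the tables above can be recomputed from the elapsed times (the first number in each cell). The snippet below is a minimal sketch; `pct_change` is a hypothetical helper name and is not part of t/perf.

```shell
#!/bin/sh
# Relative change from an old to a new elapsed time, in percent,
# rounded to one decimal place.
# pct_change is a hypothetical helper, not part of git's perf suite.
pct_change () {
	awk -v old="$1" -v new="$2" \
		'BEGIN { printf "%+.1f%%\n", (new - old) / old * 100 }'
}

pct_change 134.38 7.80      # git.git:   prints -94.2%
pct_change 6833.00 268.79   # linux.git: prints -96.1%
```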
Diffstat (limited to 't/perf/p5600-partial-clone.sh')
-rwxr-xr-x t/perf/p5600-partial-clone.sh | 4
1 files changed, 4 insertions, 0 deletions
diff --git a/t/perf/p5600-partial-clone.sh b/t/perf/p5600-partial-clone.sh
index ca785a3..a965f2c 100755
--- a/t/perf/p5600-partial-clone.sh
+++ b/t/perf/p5600-partial-clone.sh
@@ -35,4 +35,8 @@ test_perf 'count non-promisor commits' '
 	git -C bare.git rev-list --all --count --exclude-promisor-objects
 '
 
+test_perf 'gc' '
+	git -C bare.git gc
+'
+
 test_done