summaryrefslogtreecommitdiff
path: root/midx.c
diff options
context:
space:
mode:
authorDerrick Stolee <derrickstolee@github.com>2022-07-19 15:26:06 (GMT)
committerJunio C Hamano <gitster@pobox.com>2022-07-19 15:38:17 (GMT)
commit068fa54c0034be0b953d798628b06815f9cfaff0 (patch)
tree0c150de0ca27ce1bba559829e887f942e221e17e /midx.c
parent90b2bb710da47f12967a6f6a32640c197ca82685 (diff)
downloadgit-068fa54c0034be0b953d798628b06815f9cfaff0.zip
git-068fa54c0034be0b953d798628b06815f9cfaff0.tar.gz
git-068fa54c0034be0b953d798628b06815f9cfaff0.tar.bz2
midx: reduce memory pressure while writing bitmaps
We noticed that some 'git multi-pack-index write --bitmap' processes were running with very high memory. It turns out that a lot of this memory is required to store a list of every object in the written multi-pack-index, with a second copy that has additional information used for the bitmap writing logic. Using 'valgrind --tool=massif' before this change, the following chart shows how memory load increased and was maintained throughout the process: GB 4.102^ :: | @ @::@@::@@::::::::@::::::@@:#:::::::::::::@@:: : | :::::@@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: : | :::: :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: : | :::: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: : | : :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: : | : :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: : | :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: : | :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: : | :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: : | :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: : | :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: : | :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: : | :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: : | @ :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: : | @ :: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: : | @::: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: : | @::: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: : | @::: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: : | @::: :: : :: @@:@: @ ::@ ::: ::::@: ::: @@:#:::::: :: : :@ :: : 0 +---------------------------------------------------------------> It turns out that the 'struct write_midx_context' data is persisting through the life of the process, including the 'entries' array. This array is used last inside find_commits_for_midx_bitmap() within write_midx_bitmap(). If we free (and nullify) the array at that point, we can free a decent chunk of memory before the bitmap logic adds more to the memory footprint. Here is the massif memory load chart after this change: GB 3.111^ # | # :::::::::::@::::::::::::::@ | # ::::::::::::::::::::::::: : :: : @:: ::::: :: ::@ | @# :::::::::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@ | @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@ | @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@ | @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@ | @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@ | @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@ | @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@ | @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@ | @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@ | @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@ | @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@ | @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@ | @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@ | :::@#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@ | :: @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@ | :: @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@ | :: @#::: ::: :::::: :::: :: : :::::::: : :: : @:: ::::: :: ::@ 0 +---------------------------------------------------------------> The previous change introduced a refactoring of write_midx_bitmap() to make it more clear how much of the 'struct write_midx_context' instance is needed at different parts of the process. In addition, the following defensive programming measures were put in place: 1. Using FREE_AND_NULL() we will at least get a segfault from reading a NULL pointer instead of a use-after-free. 2. 'entries_nr' is also set to zero to make any loop that would iterate over the entries be trivial. 3. Add significant comments in write_midx_internal() to add warnings for future authors who might accidentally add references to this cleared memory. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 'midx.c')
-rw-r--r--midx.c13
1 files changed, 13 insertions, 0 deletions
diff --git a/midx.c b/midx.c
index e2dd808..772ab7d 100644
--- a/midx.c
+++ b/midx.c
@@ -1451,6 +1451,15 @@ static int write_midx_internal(const char *object_dir,
commits = find_commits_for_midx_bitmap(&commits_nr, refs_snapshot, &ctx);
+ /*
+ * The previous steps translated the information from
+ * 'entries' into information suitable for constructing
+ * bitmaps. We no longer need that array, so clear it to
+ * reduce memory pressure.
+ */
+ FREE_AND_NULL(ctx.entries);
+ ctx.entries_nr = 0;
+
if (write_midx_bitmap(midx_name.buf, midx_hash, &pdata,
commits, commits_nr, ctx.pack_order,
refs_snapshot, flags) < 0) {
@@ -1459,6 +1468,10 @@ static int write_midx_internal(const char *object_dir,
goto cleanup;
}
}
+ /*
+ * NOTE: Do not use ctx.entries beyond this point, since it might
+ * have been freed in the previous if block.
+ */
if (ctx.m)
close_object_store(the_repository->objects);