author | Taylor Blau <me@ttaylorr.com> | 2023-06-07 10:16:17 (GMT) |
---|---|---|
committer | Junio C Hamano <gitster@pobox.com> | 2023-06-12 20:54:38 (GMT) |
commit | 73320e49add4b56aba9bf5236a216498fa8ccc22 (patch) | |
tree | 5972c11b69f182de81f8f5b2b572d893aa854c21 /builtin/repack.c | |
parent | fe86abd7511a9a6862d5706c6fa1d9b57a63ba09 (diff) | |
builtin/repack.c: only collect fully-formed packs
To partition the set of packs based on which ones are "kept" (either
they have a .keep file, or were otherwise marked via the `--keep-pack`
option) and "non-kept" ones (anything else), `git repack` uses its
`collect_pack_filenames()` function.
Ordinarily, we would rely on a convenience function such as
`get_all_packs()` to enumerate and partition the set of packs. But
`collect_pack_filenames()` uses `readdir()` directly to read the
contents of the "$GIT_DIR/objects/pack" directory, and adds each entry
ending in ".pack" to the appropriate list (either kept, or non-kept as
above).
This is subtly racy, since `collect_pack_filenames()` may see a pack
that is not fully staged (i.e., it is missing its ".idx" file).
Ordinarily, this doesn't cause a problem. But it can cause issues when
generating a cruft pack.
This is because `git repack` feeds (among other things) the list of
existing kept packs down to `git pack-objects --cruft` to indicate that
any kept packs will not be removed from the repository (so that the
cruft pack machinery can avoid packing objects that appear in those
packs as cruft).
But `read_cruft_objects()` lists packfiles by calling `get_all_packs()`.
So if a ".pack" file exists (necessary to get that pack to appear to
`collect_pack_filenames()`), but doesn't have a corresponding ".idx"
file (necessary to get that pack to appear via `get_all_packs()`), we'll
complain with:
fatal: could not find pack '.tmp-5841-pack-a6b0150558609c323c496ced21de6f4b66589260.pack'
Fix the above by teaching `collect_pack_filenames()` to only collect
packs with their corresponding `*.idx` files in place, indicating that
those packs have been fully staged.
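The idea can be illustrated outside of Git with a minimal, standalone sketch. It is not the code from this patch: `ends_with()` is a hypothetical stand-in for Git's `strip_suffix()`, and `collect_staged_packs()`, `MAX_PACKS`, and `MAX_NAME` are names invented for illustration. The sketch keys the directory scan on ".idx" rather than ".pack", so a pack whose index has not yet been moved into place is never reported.

```c
#include <assert.h>
#include <dirent.h>
#include <string.h>

#define MAX_PACKS 64
#define MAX_NAME 256

/*
 * Hypothetical stand-in for git's strip_suffix(): if `str` ends with
 * `suffix`, store the length of the remaining prefix in *len and
 * return 1; otherwise return 0 and leave *len untouched.
 */
int ends_with(const char *str, const char *suffix, size_t *len)
{
	size_t n = strlen(str), m = strlen(suffix);

	if (n < m || strcmp(str + n - m, suffix))
		return 0;
	*len = n - m;
	return 1;
}

/*
 * Collect the base names ("pack-$HASH") of every pack in `packdir`
 * that has an ".idx" file. Returns the number of names stored in
 * `out`, or -1 if the directory cannot be opened.
 */
int collect_staged_packs(const char *packdir, char out[][MAX_NAME], int max)
{
	DIR *dir;
	struct dirent *e;
	int nr = 0;

	assert(packdir != NULL);
	if (!(dir = opendir(packdir)))
		return -1;

	while (nr < max && (e = readdir(dir)) != NULL) {
		size_t len;

		/* A lone ".pack" may still be mid-staging, so key on ".idx". */
		if (!ends_with(e->d_name, ".idx", &len) || len >= MAX_NAME)
			continue;
		memcpy(out[nr], e->d_name, len);
		out[nr][len] = '\0';
		nr++;
	}
	closedir(dir);
	return nr;
}
```

Calling `collect_staged_packs()` on a pack directory that holds a ".pack" file without its ".idx" skips that entry, which is the behavior the fix aims for.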
There are a couple of things worth noting:
- Since each entry in the `extra_keep` list (which contains the
`--keep-pack` names) has a `*.pack` suffix, we'll have to swap the
suffix from ".pack" to ".idx", and compare that instead.
- Since we use the `fname_kept_list` to figure out which packs to
delete (with `git repack -d`), we would have previously deleted a
`*.pack` with no index (since the existence of a ".pack" file is
necessary and sufficient to include that pack in the list of
existing non-kept packs).
Now we will leave it alone (since that pack won't appear in the
list). This is far more correct behavior, since we don't want
to race with a pack being staged. Deleting a partially staged pack
is unlikely, however, since the window of time between staging a
pack and moving its .idx file into place is minuscule.
Note that this window does *not* include the time it takes to
receive and index the pack, since the incoming data goes into
"$GIT_DIR/objects/tmp_pack_XXXXXX", which does not end in ".pack"
and is thus ignored by collect_pack_filenames().
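The suffix swap described in the first note can be sketched on its own. This is an illustrative approximation, not the patch itself: `idx_to_pack_name()` and `is_extra_kept()` are hypothetical helpers, a fixed-size buffer stands in for Git's `strbuf`, and plain `strcmp()` is assumed in place of `fspathcmp()` (which may compare case-insensitively depending on configuration).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write "<base>.pack" into `buf` given "<base>.idx". Returns 0 on
 * success, or -1 if `idx_name` does not end in ".idx" or the result
 * would not fit in `bufsz` bytes.
 */
int idx_to_pack_name(const char *idx_name, char *buf, size_t bufsz)
{
	const size_t suffix = 4; /* strlen(".idx") */
	size_t n = strlen(idx_name);

	if (n < suffix || strcmp(idx_name + n - suffix, ".idx"))
		return -1;
	/* sizeof(".pack") counts the terminating NUL as well. */
	if (n - suffix + sizeof(".pack") > bufsz)
		return -1;
	snprintf(buf, bufsz, "%.*s.pack", (int)(n - suffix), idx_name);
	return 0;
}

/*
 * Return 1 if the pack that owns `idx_name` appears in `keep`, a list
 * of `nr` names that each carry a ".pack" suffix (as the --keep-pack
 * names do); return 0 otherwise.
 */
int is_extra_kept(const char *idx_name, const char *const *keep, size_t nr)
{
	char pack_name[256];
	size_t i;

	assert(keep != NULL || nr == 0);
	if (idx_to_pack_name(idx_name, pack_name, sizeof(pack_name)) < 0)
		return 0;
	for (i = 0; i < nr; i++)
		if (!strcmp(pack_name, keep[i]))
			return 1;
	return 0;
}
```

With a keep list containing "pack-1234.pack", the directory entry "pack-1234.idx" is treated as kept, while "pack-5678.idx" is not.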
In the future, this function should probably be rewritten as a callback
to `for_each_file_in_pack_dir()`, but this is the simplest change we
could do in the short-term.
Reported-by: Michael Haggerty <mhagger@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 'builtin/repack.c')
-rw-r--r-- | builtin/repack.c | 14 |
1 files changed, 10 insertions, 4 deletions
diff --git a/builtin/repack.c b/builtin/repack.c
index 0541c3c..1e21a21 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -95,8 +95,8 @@ static int repack_config(const char *var, const char *value, void *cb)
 }
 
 /*
- * Adds all packs hex strings to either fname_nonkept_list or
- * fname_kept_list based on whether each pack has a corresponding
+ * Adds all packs hex strings (pack-$HASH) to either fname_nonkept_list
+ * or fname_kept_list based on whether each pack has a corresponding
  * .keep file or not. Packs without a .keep file are not to be kept
  * if we are going to pack everything into one file.
  */
@@ -107,6 +107,7 @@ static void collect_pack_filenames(struct string_list *fname_nonkept_list,
 	DIR *dir;
 	struct dirent *e;
 	char *fname;
+	struct strbuf buf = STRBUF_INIT;
 
 	if (!(dir = opendir(packdir)))
 		return;
@@ -115,11 +116,15 @@ static void collect_pack_filenames(struct string_list *fname_nonkept_list,
 		size_t len;
 		int i;
 
-		if (!strip_suffix(e->d_name, ".pack", &len))
+		if (!strip_suffix(e->d_name, ".idx", &len))
 			continue;
 
+		strbuf_reset(&buf);
+		strbuf_add(&buf, e->d_name, len);
+		strbuf_addstr(&buf, ".pack");
+
 		for (i = 0; i < extra_keep->nr; i++)
-			if (!fspathcmp(e->d_name, extra_keep->items[i].string))
+			if (!fspathcmp(buf.buf, extra_keep->items[i].string))
 				break;
 
 		fname = xmemdupz(e->d_name, len);
@@ -136,6 +141,7 @@ static void collect_pack_filenames(struct string_list *fname_nonkept_list,
 		}
 	}
 	closedir(dir);
+	strbuf_release(&buf);
 
 	string_list_sort(fname_kept_list);
 }