summaryrefslogtreecommitdiff
path: root/Documentation/technical/multi-pack-index.txt
AgeCommit message (Collapse)Author
2022-01-27midx.c: make changing the preferred pack safeTaylor Blau
The previous patch demonstrates a bug where a MIDX's auxiliary object order can become out of sync with a MIDX bitmap. This is because of two confounding factors: - First, the object order is stored in a file which is named according to the multi-pack index's checksum, and the MIDX does not store the object order. This means that the object order can change without altering the checksum. - But the .rev file is moved into place with finalize_object_file(), which link(2)'s the file into place instead of renaming it. For us, that means that a modified .rev file will not be moved into place if MIDX's checksum was unchanged. This fix is to force the MIDX's checksum to change when the preferred pack changes but the set of packs contained in the MIDX does not. In other words, when the object order changes, the MIDX's checksum needs to change with it (regardless of whether the MIDX is tracking the same or different packs). This prevents a race whereby changing the object order (but not the packs themselves) enables a reader to see the new .rev file with the old MIDX, or similarly seeing the new bitmap with the old object order. But why can't we just stop hardlinking the .rev into place instead adding additional data to the MIDX? Suppose that's what we did. Then when we go to generate the new bitmap, we'll load the old MIDX bitmap, along with the MIDX that it references. That's fine, since the new MIDX isn't moved into place until after the new bitmap is generated. But the new object order *has* been moved into place. So we'll read the old bitmaps in the new order when generating the new bitmap file, meaning that without this secondary change, bitmap generation itself would become a victim of the race described here. This can all be prevented by forcing the MIDX's checksum to change when the object order does. By embedding the entire object order into the MIDX, we do just that. That is, the MIDX's checksum will change in response to any perturbation of the underlying object order. In t5326, this will cause the MIDX's checksum to update (even without changing the set of packs in the MIDX), preventing the stale read problem. Note that this makes it safe to continue to link(2) the MIDX .rev file into place, since it is now impossible to have a .rev file that is out-of-sync with the MIDX whose checksum it references. (But we will do away with MIDX .rev files later in this series anyway, so this is somewhat of a moot point). In theory, it is possible to store a "fingerprint" of the full object order here, so long as that fingerprint changes at least as often as the full object order does. Some possibilities here include storing the identity of the preferred pack, along with the mtimes of the non-preferred packs in a consistent order. But storing a limited part of the information makes it difficult to reason about whether or not there are gaps between the two that would cause us to get bitten by this bug again. Signed-off-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-10Merge branch 'jt/midx-doc-fix'Junio C Hamano
Docfix. * jt/midx-doc-fix: Doc: no midx and partial clone relation
2021-12-10Merge branch 'tl/midx-docfix'Junio C Hamano
Doc mark-up fix. * tl/midx-docfix: midx: fix a formatting issue in "multi-pack-index.txt"
2021-11-22Doc: no midx and partial clone relationJonathan Tan
The multi-pack index treats promisor packfiles (that is, packfiles that have an accompanying .promisor file) the same as other packfiles. Remove a section in the documentation that seems to indicate otherwise. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-18midx: fix a formatting issue in "multi-pack-index.txt"Teng Long
There is a formatting issue in "multi-pack-index.html", corresponding to the nesting bulleted list of a wrong usage in "multi-pack-index.txt" and this commit fix the problem. In ASCIIDOC, it doesn't treat an indented character as the beginning of a sub-list. If we want to write a nested bulleted list, we could just use ASTERISK without any DASH like: " * Level 1 list item ** Level 2 list item *** Level 3 list item ** Level 2 list item * Level 1 list item ** Level 2 list item * Level 1 list item " The DASH can be used for bulleted list too, But the DASH is suggested only to be used as the marker for the first level because the DASH doesn’t work well or a best practice for nested lists, like (dash is as level 2 below): " * Level 1 list item - Level 2 list item * Level 1 list item " ASTERISK is recommanded to use because it works intuitively and clearly ("marker length = nesting level") in nested lists, but the DASH can't. However, when you want to write a non-nested bulleted lists, DASH works too, like: " - Level 1 list item - Level 1 list item - Level 1 list item " Reviewed-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Teng Long <dyroneteng@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-06Merge branch 'ew/midx-doc-update'Junio C Hamano
Doc tweak. * ew/midx-doc-update: doc/technical: update note about core.multiPackIndex
2021-09-24doc/technical: update note about core.multiPackIndexEric Wong
MIDX files are used by default since commit 18e449f86b74 (midx: enable core.multiPackIndex by default, 2020-09-25) Helped-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-24Documentation: describe MIDX-based bitmapsTaylor Blau
Update the technical documentation to describe the multi-pack bitmap format. This patch merely introduces the new format, and describes its high-level ideas. Git does not yet know how to read nor write these multi-pack variants, and so the subsequent patches will: - Introduce code to interpret multi-pack bitmaps, according to this document. - Then, introduce code to write multi-pack bitmaps from the 'git multi-pack-index write' sub-command. Finally, the implementation will gain tests in subsequent patches (as opposed to inline with the patch teaching Git how to write multi-pack bitmaps) to avoid a cyclic dependency. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-01midx: allow marking a pack as preferredTaylor Blau
When multiple packs in the multi-pack index contain the same object, the MIDX machinery must make a choice about which pack it associates with that object. Prior to this patch, the lowest-ordered[1] pack was always selected. Pack selection for duplicate objects is relatively unimportant today, but it will become important for multi-pack bitmaps. This is because we can only invoke the pack-reuse mechanism when all of the bits for reused objects come from the reuse pack (in order to ensure that all reused deltas can find their base objects in the same pack). To encourage the pack selection process to prefer one pack over another (the pack to be preferred is the one a caller would like to later use as a reuse pack), introduce the concept of a "preferred pack". When provided, the MIDX code will always prefer an object found in a preferred pack over any other. No format changes are required to store the preferred pack, since it will be able to be inferred with a corresponding MIDX bitmap, by looking up the pack associated with the object in the first bit position (this ordering is described in detail in a subsequent commit). [1]: the ordering is specified by MIDX internals; for our purposes we can consider the "lowest ordered" pack to be "the one with the most-recent mtime. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-17Merge branch 'jb/midx-doc-update'Junio C Hamano
Doc update. * jb/midx-doc-update: docs: multi-pack-index: remove note about future 'verify' work
2020-12-14docs: multi-pack-index: remove note about future 'verify' workJohannes Berg
This was implemented in the 'git multi-pack-index' command and merged in 468b3221 (Merge branch 'ds/multi-pack-verify', 2018-10-10). And there's no 'git midx' command. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-08Merge branch 'jb/doc-multi-pack-idx-fix'Junio C Hamano
Typofix. * jb/doc-multi-pack-idx-fix: multi-pack-index: correct configuration in documentation
2020-01-04multi-pack-index: correct configuration in documentationJohannes Berg
It's core.multiPackIndex, not pack.multiIndex. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-30doc: replace public-inbox links with lore.kernel.orgJeff King
Since we're now recommending lore.kernel.org (and because the public-inbox.org domain might eventually go away), let's update our internal references to use it, too. That future-proofs our references, and sets the example we want people to follow. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-12multi-pack-index: add design documentDerrick Stolee
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>