path: root/Documentation/technical
diff options
authorJunio C Hamano <>2014-07-16 18:25:40 (GMT)
committerJunio C Hamano <>2014-07-16 18:25:40 (GMT)
commit788cef81d40070d5755490441abad1a27bc120b7 (patch)
treeebc44e5c83def9855aac4559e87f201882d8f955 /Documentation/technical
parente0a064a10749a8039f28e9bd3de0ffb5cf99e678 (diff)
parent3e52f70b1505956bf91cb7b107e526d1c67da0a6 (diff)
Merge branch 'nd/split-index'
An experiment to use two files (the base file and incremental changes relative to it) to represent the index to reduce I/O cost of rewriting a large index when only small part of the working tree changes. * nd/split-index: (32 commits) t1700: new tests for split-index mode t2104: make sure split index mode is off for the version test read-cache: force split index mode with GIT_TEST_SPLIT_INDEX read-tree: note about dropping split-index mode or index version read-tree: force split-index mode off on --index-output rev-parse: add --shared-index-path to get shared index path update-index --split-index: do not split if $GIT_DIR is read only update-index: new options to enable/disable split index mode split-index: strip pathname of on-disk replaced entries split-index: do not invalidate cache-tree at read time split-index: the reading part split-index: the writing part read-cache: mark updated entries for split index read-cache: save deleted entries in split index read-cache: mark new entries for split index read-cache: split-index mode read-cache: save index SHA-1 after reading entry.c: update cache_changed if refresh_cache is set in checkout_entry() cache-tree: mark istate->cache_changed on prime_cache_tree() cache-tree: mark istate->cache_changed on cache tree update ...
Diffstat (limited to 'Documentation/technical')
1 files changed, 35 insertions, 0 deletions
diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
index f352a9b..fe6f316 100644
--- a/Documentation/technical/index-format.txt
+++ b/Documentation/technical/index-format.txt
@@ -129,6 +129,9 @@ Git index format
(Version 4) In version 4, the padding after the pathname does not
+ Interpretation of index entries in split index mode is completely
+ different. See below for details.
== Extensions
=== Cached tree
@@ -198,3 +201,35 @@ Git index format
- At most three 160-bit object names of the entry in stages from 1 to 3
(nothing is written for a missing stage).
+=== Split index
+ In split index mode, the majority of index entries could be stored
+ in a separate file. This extension records the changes to be made on
+ top of that to produce the final index.
+ The signature for this extension is { 'l', 'i, 'n', 'k' }.
+ The extension consists of:
+ - 160-bit SHA-1 of the shared index file. The shared index file path
+ is $GIT_DIR/sharedindex.<SHA-1>. If all 160 bits are zero, the
+ index does not require a shared index file.
+ - An ewah-encoded delete bitmap, each bit represents an entry in the
+ shared index. If a bit is set, its corresponding entry in the
+ shared index will be removed from the final index. Note, because
+ a delete operation changes index entry positions, but we do need
+ original positions in replace phase, it's best to just mark
+ entries for removal, then do a mass deletion after replacement.
+ - An ewah-encoded replace bitmap, each bit represents an entry in
+ the shared index. If a bit is set, its corresponding entry in the
+ shared index will be replaced with an entry in this index
+ file. All replaced entries are stored in sorted order in this
+ index. The first "1" bit in the replace bitmap corresponds to the
+ first index entry, the second "1" bit to the second entry and so
+ on. Replaced entries may have empty path names to save space.
+ The remaining index entries after replaced ones will be added to the
+ final index. These added entries are also sorted by entry namme then
+ stage.