summaryrefslogtreecommitdiff
path: root/Documentation/technical/bitmap-format.txt
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/technical/bitmap-format.txt')
-rw-r--r--Documentation/technical/bitmap-format.txt33
1 files changed, 33 insertions, 0 deletions
diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt
index 7a86bd7..f8c18a0 100644
--- a/Documentation/technical/bitmap-format.txt
+++ b/Documentation/technical/bitmap-format.txt
@@ -21,6 +21,12 @@ GIT bitmap v1 format
requirement for the bitmap index format, also present in JGit,
that greatly reduces the complexity of the implementation.
+ - BITMAP_OPT_HASH_CACHE (0x4)
+ If present, the end of the bitmap file contains
+ `N` 32-bit name-hash values, one per object in the
+ pack. The format and meaning of the name-hash is
+ described below.
+
4-byte entry count (network byte order)
The total count of entries (bitmapped commits) in this bitmap index.
@@ -129,3 +135,30 @@ The bitstream represented by the above chunk is then:
The next word after `L_M` (if any) must again be a RLW, for the next
chunk. For efficient appending to the bitstream, the EWAH stores a
pointer to the last RLW in the stream.
+
+
+== Appendix B: Optional Bitmap Sections
+
+These sections may or may not be present in the `.bitmap` file; their
+presence is indicated by the header flags section described above.
+
+Name-hash cache
+---------------
+
+If the BITMAP_OPT_HASH_CACHE flag is set, the end of the bitmap contains
+a cache of 32-bit values, one per object in the pack. The value at
+position `i` is the hash of the pathname at which the `i`th object
+(counting in index order) in the pack can be found. This can be fed
+into the delta heuristics to compare objects with similar pathnames.
+
+The hash algorithm used is:
+
+ hash = 0;
+ while ((c = *name++))
+ if (!isspace(c))
+ hash = (hash >> 2) + (c << 24);
+
+Note that this hashing scheme is tied to the BITMAP_OPT_HASH_CACHE flag.
+If implementations want to choose a different hashing scheme, they are
+free to do so, but MUST allocate a new header flag (because comparing
+hashes made under two different schemes would be pointless).