summaryrefslogtreecommitdiff
path: root/Documentation/technical/index-format.txt
AgeCommit message (Collapse)Author
2018-10-11ieot: add Index Entry Offset Table (IEOT) extensionBen Peart
This patch enables addressing the CPU cost of loading the index by adding additional data to the index that will allow us to efficiently multi- thread the loading and conversion of cache entries. It accomplishes this by adding an (optional) index extension that is a table of offsets to blocks of cache entries in the index file. To make this work for V4 indexes, when writing the cache entries, it periodically "resets" the prefix-compression by encoding the current entry as if the path name for the previous entry is completely different and saves the offset of that entry in the IEOT. Basically, with V4 indexes, it generates offsets into blocks of prefix-compressed entries. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-11eoie: add End of Index Entry (EOIE) extensionBen Peart
The End of Index Entry (EOIE) is used to locate the end of the variable length index entries and the beginning of the extensions. Code can take advantage of this to quickly locate the index extensions without having to parse through all of the index entries. The EOIE extension is always written out to the index file including to the shared index when using the split index feature. Because it is always written out, the SHA checksums in t/t1700-split-index.sh were updated to reflect its inclusion. It is written as an optional extension to ensure compatibility with other git implementations that do not yet support it. It is always written out to ensure it is available as often as possible to speed up index operations. Because it must be able to be loaded before the variable length cache entries and other index extensions, this extension must be written last. The signature for this extension is { 'E', 'O', 'I', 'E' }. The extension consists of: - 32-bit offset to the end of the index entries - 160-bit SHA-1 over the extension types and their sizes (but not their contents). E.g. if we have "TREE" extension that is N-bytes long, "REUC" extension that is M-bytes long, followed by "EOIE", then the hash would be: SHA-1("TREE" + <binary representation of N> + "REUC" + <binary representation of M>) Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-01fsmonitor: add documentation for the fsmonitor extension.Ben Peart
This includes the core.fsmonitor setting, the fsmonitor integration hook, and the fsmonitor index extension. Also add documentation for the new fsmonitor options to ls-files and update-index. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-29Merge branch 'jc/em-dash-in-doc'Junio C Hamano
AsciiDoc markup fixes. * jc/em-dash-in-doc: Documentation: AsciiDoc spells em-dash as double-dashes, not triple
2015-10-22Documentation: AsciiDoc spells em-dash as double-dashes, not tripleJunio C Hamano
Again, we do not usually process release notes with AsciiDoc, but it is better to be consistent. This incidentally reveals breakages left by an ancient 5e00439f (Documentation: build html for all files in technical and howto, 2012-10-23). The index-format documentation was originally written to be read as straight text without formatting and when the commit forced everything in Documentation/ to go through AsciiDoc, it did not do any adjustment--hence the double-dashes will be seen in the resulting text that is rendered as preformatted fixed-width without converted into em-dashes. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-17Merge branch 'ta/docfix-index-format-tech'Junio C Hamano
* ta/docfix-index-format-tech: typofix for index-format.txt
2015-07-28typofix for index-format.txtThomas Ackermann
Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12untracked cache: guard and disable on system changesNguyễn Thái Ngọc Duy
If the user enables untracked cache, then - move worktree to an unsupported filesystem - or simply upgrade OS - or move the whole (portable) disk from one machine to another - or access a shared fs from another machine there's no guarantee that untracked cache can still function properly. Record the worktree location and OS footprint in the cache. If it changes, err on the safe side and disable the cache. The user can 'update-index --untracked-cache' again to make sure all conditions are met. This adds a new requirement that setup_git_directory* must be called before read_cache() because we need worktree location by then, or the cache is dropped. This change does not cover all bases, you can fool it if you try hard. The point is to stop accidents. Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: brian m. carlson <sandals@crustytoothpaste.net> Helped-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12untracked cache: save to an index extensionNguyễn Thái Ngọc Duy
Helped-by: Stefan Beller <sbeller@google.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-22Merge branch 'nd/split-index'Junio C Hamano
A typofix to the documentation of a feature already in the release. * nd/split-index: index-format.txt: add a missing closing quote
2014-12-11index-format.txt: add a missing closing quoteNguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-11-04Documentation: typofixesThomas Ackermann
In addition to fixing trivial and obvious typos, be careful about the following points: - Spell ASCII, URL and CRC in ALL CAPS; - Spell Linux as Capitalized; - Do not omit periods in "i.e." and "e.g.". Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13read-cache: split-index modeNguyễn Thái Ngọc Duy
This split-index mode is designed to keep write cost proportional to the number of changes the user has made, not the size of the work tree. (Read cost is another matter, to be dealt separately.) This mode stores index info in a pair of $GIT_DIR/index and $GIT_DIR/sharedindex.<SHA-1>. sharedindex is large and unchanged over time while "index" is smaller and updated often. Format details are in index-format.txt, although not everything is implemented in this patch. Shared indexes are not automatically removed, because it's unclear if the shared index is needed by any (even temporary) indexes by just looking at it. After a while you'll collect stale shared indexes. The good news is one shared index is useable for long, until $GIT_DIR/index becomes too big and sluggish that the new shared index must be created. The safest way to clean shared indexes is to turn off split index mode, so shared files are all garbage, delete them all, then turn on split index mode again. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-22typofix: documentationOndřej Bílka
Signed-off-by: Ondřej Bílka <neleai@seznam.cz> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-19Merge branch 'nd/doc-index-format'Junio C Hamano
Update the index format documentation to mention the v4 format. * nd/doc-index-format: update-index: list supported idx versions and their features read-cache.c: use INDEX_FORMAT_{LB,UB} in verify_hdr() index-format.txt: mention of v4 is missing in some places
2013-02-22index-format.txt: mention of v4 is missing in some placesNguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-01Documentation: the name of the system is 'Git', not 'git'Thomas Ackermann
Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-01Documentation: avoid poor-man's small caps GITThomas Ackermann
In the earlier days, we used to spell the name of the system as GIT, to simulate as if it were typeset with capital G and IT in small caps. Later we stopped doing so at around 1.6.5 days. Let's stop doing so throughout the documentation. The name to refer to the whole system (and the concept it embodies) is "Git"; the command end-users type is "git". And document this in the coding guideline. Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-21Merge branch 'nd/index-format-doc'Junio C Hamano
* nd/index-format-doc: index-format.txt: clarify what is "invalid"
2012-12-13index-format.txt: clarify what is "invalid"Nguyễn Thái Ngọc Duy
A cache-tree entry with a negative entry count is considered invalid by the current Git; it records that we do not know the object name of a tree that would result by writing the directory covered by the cache-tree as a tree object. Clarify that any entry with a negative entry count is invalid, but the implementations must write -1 there. This way, we can later decide to allow writers to use negative values other than -1 to encode optional information on such invalidated entries without harming interoperability; we do not know what will be encoded and how, so we keep these other negative values as reserved for now. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-16Documentation/technical: convert plain text files to asciidocThomas Ackermann
These were not originally meant for asciidoc, but they are already so close. Mark them up in asciidoc. Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-27index-v4: document the entry formatJunio C Hamano
Document the format so that others can learn from and build on top of the series. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-01Documentation: clarify the invalidated tree entry formatCarlos Martín Nieto
When the entry_count is -1, the tree is invalidated and therefore has not associated hash (or object name). Explicitly state that the next entry starts after the newline. Signed-off-by: Carlos Martín Nieto <cmn@elego.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-23doc: technical details about the index file formatJunio C Hamano
* Clarify "string of unsigned bytes"; * Blob has two variants (regular file vs symlink), not (blob vs symlink); * Clarify permission mode bits; * Clarify ce_namelen() "too long to fit in the length field" case; * Clarify "." etc are forbidden as path components; * Match the description with the internal wording "cache-tree"; * All types of extension begin with signature and length as explained in the first part. Don't repeat the "length" part in the description of each extension (can be mistaken as if there is a separate 32-bit size field inside the extension), but state what the signature for each extension is. * Don't say "Extension tag", as we have said "Extension signature" in the first part---be consistent; * Clarify the invalidation of cache-tree entries; * Correct description on subtree_nr field in the cache-tree; * Clarify the order of entries in cache-tree; Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-27doc: technical details about the index file formatNguyễn Thái Ngọc Duy
This bases on the original work by Robin Rosenberg. Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>