path: root/Documentation
diff options
Diffstat (limited to 'Documentation')
23 files changed, 992 insertions, 37 deletions
diff --git a/Documentation/Makefile b/Documentation/Makefile
index 6232143..fa9e5c0 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -78,6 +78,7 @@ TECH_DOCS += technical/pack-heuristics
TECH_DOCS += technical/pack-protocol
TECH_DOCS += technical/protocol-capabilities
TECH_DOCS += technical/protocol-common
+TECH_DOCS += technical/protocol-v2
TECH_DOCS += technical/racy-git
TECH_DOCS += technical/send-pack-pipeline
TECH_DOCS += technical/shallow
diff --git a/Documentation/RelNotes/2.18.0.txt b/Documentation/RelNotes/2.18.0.txt
index 5f16516..31c3f6d 100644
--- a/Documentation/RelNotes/2.18.0.txt
+++ b/Documentation/RelNotes/2.18.0.txt
@@ -21,6 +21,29 @@ UI, Workflows & Features
* When built with more recent cURL, GIT_SSL_VERSION can now specify
"tlsv1.3" as its value.
+ * "git gui" learned that "~/.ssh/" and
+ "~/.ssh/" are also possible SSH key files.
+ (merge 2e2f0288ef bb/git-gui-ssh-key-files later to maint).
+ * "git gui" performs commit upon CTRL/CMD+ENTER but the
+ CTRL/CMD+KP_ENTER (i.e. enter key on the numpad) did not have the
+ same key binding. It now does.
+ (merge 28a1d94a06 bp/git-gui-bind-kp-enter later to maint).
+ * "git gui" has been taught to work with old versions of tk (like
+ 8.5.7) that do not support "ttk::style theme use" as a way to query
+ the current theme.
+ (merge 4891961105 cb/git-gui-ttk-style later to maint).
+ * "git rebase" has learned to honor "--signoff" option when using
+ backends other than "am" (but not "--preserve-merges").
+ * "git branch --list" during an interrupted "rebase -i" now lets
+ users distinguish the case where a detached HEAD is being rebased
+ and a normal branch is being rebased.
+ * "git mergetools" learned talking to guiffy.
Performance, Internal Implementation, Development Support etc.
@@ -76,6 +99,25 @@ Performance, Internal Implementation, Development Support etc.
* Small test-helper programs have been consolidated into a single
+ * API clean-up around ref-filter code.
+ * Shell completion (in contrib) that gives list of paths have been
+ optimized somewhat.
+ * The index file is updated to record the fsmonitor section after a
+ full scan was made, to avoid wasting the effort that has already
+ spent.
+ * Performance measuring framework in t/perf learned to help bisecting
+ performance regressions.
+ * Some multi-word source filenames are being renamed to separate
+ words with dashes instead of underscores.
+ * An reusable "memory pool" implementation has been extracted from
+ fast-import.c, which in turn has become the first user of the
+ mem-pool API.
Also contains various documentation updates and code clean-ups.
@@ -115,8 +157,44 @@ Fixes since v2.17
(merge a0d51e8d0e eb/cred-helper-ignore-sigpipe later to maint).
+ * "git rebase --keep-empty" still removed an empty commit if the
+ other side contained an empty commit (due to the "does an
+ equivalent patch exist already?" check), which has been corrected.
+ (merge 3d946165e1 pw/rebase-keep-empty-fixes later to maint).
+ * Some codepaths, including the refs API, get and keep relative
+ paths, that go out of sync when the process does chdir(2). The
+ chdir-notify API is introduced to let these codepaths adjust these
+ cached paths to the new current directory.
+ (merge fb9c2d2703 jk/relative-directory-fix later to maint).
+ * "cd sub/dir && git commit ../path" ought to record the changes to
+ the file "sub/path", but this regressed long time ago.
+ (merge 86238e07ef bw/commit-partial-from-subdirectory-fix later to maint).
+ * Recent introduction of "--log-destination" option to "git daemon"
+ did not work well when the daemon was run under "--inetd" mode.
+ (merge e67d906d73 lw/daemon-log-destination later to maint).
+ * Small fix to the autoconf build procedure.
+ (merge 249482daf0 es/fread-reads-dir-autoconf-fix later to maint).
+ * Fix an unexploitable (because the oversized contents are not under
+ attacker's control) buffer overflow.
+ (merge d8579accfa bp/fsmonitor-bufsize-fix later to maint).
* Other minor doc, test and build updates and code cleanups.
(merge 248f66ed8e nd/trace-with-env later to maint).
(merge 14ced5562c ys/bisect-object-id-missing-conversion-fix later to maint).
(merge 5988eb631a ab/doc-hash-brokenness later to maint).
(merge a4d4e32a70 pk/test-avoid-pipe-hiding-exit-status later to maint).
+ (merge 05e293c1ac jk/flockfile-stdio later to maint).
+ (merge e9184b0789 jk/t5561-missing-curl later to maint).
+ (merge b1801b85a3 nd/worktree-move later to maint).
+ (merge bbd374dd20 ak/bisect-doc-typofix later to maint).
+ (merge 4855f06fb3 mn/send-email-credential-doc later to maint).
+ (merge 8523b1e355 en/doc-typoes later to maint).
+ (merge 43b44ccfe7 js/t5404-path-fix later to maint).
+ (merge decf711fc1 ps/test-chmtime-get later to maint).
+ (merge 22d11a6e8e es/worktree-docs later to maint).
+ (merge 92a5dbbc22 tg/use-git-contacts later to maint).
diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index a1d0fec..945f8ed 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -260,8 +260,8 @@ that starts with `-----BEGIN PGP SIGNED MESSAGE-----`. That is
not a text/plain, it's something else.
Send your patch with "To:" set to the mailing list, with "cc:" listing
-people who are involved in the area you are touching (the output from
-`git blame $path` and `git shortlog --no-merges $path` would help to
+people who are involved in the area you are touching (the `git
+contacts` command in `contrib/contacts/` can help to
identify them), to solicit comments and reviews.
:1: footnote:[The current maintainer:]
diff --git a/Documentation/config.txt b/Documentation/config.txt
index 2659153..8213c12 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -530,6 +530,12 @@ core.autocrlf::
This variable can be set to 'input',
in which case no output conversion is performed.
+ A comma and/or whitespace separated list of encodings that Git
+ performs UTF-8 round trip checks on if they are used in an
+ `working-tree-encoding` attribute (see linkgit:gitattributes[5]).
+ The default value is `SHIFT-JIS`.
If false, symbolic links are checked out as small plain files that
contain the link text. linkgit:git-update-index[1] and
@@ -898,6 +904,10 @@ core.notesRef::
This setting defaults to "refs/notes/commits", and it can be overridden by
the `GIT_NOTES_REF` environment variable. See linkgit:git-notes[1].
+ Enable git commit graph feature. Allows reading from the
+ commit-graph file.
Enable "sparse checkout" feature. See section "Sparse checkout" in
linkgit:git-read-tree[1] for more information.
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index e3a44f0..f466600 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -568,7 +568,7 @@ the normal order.
Patterns have the same syntax and semantics as patterns used for
-fnmantch(3) without the FNM_PATHNAME flag, except a pathname also
+fnmatch(3) without the FNM_PATHNAME flag, except a pathname also
matches a pattern if removing any number of the final pathname
components matches the pattern. For example, the pattern "`foo*bar`"
matches "`fooasdfbar`" and "`foo/bar/baz/asdf`" but not "`foobarx`".
@@ -592,7 +592,7 @@ endif::git-format-patch[]
Treat all files as text.
- Ignore carrige-return at the end of line when doing a comparison.
+ Ignore carriage-return at the end of line when doing a comparison.
Ignore changes in whitespace at EOL.
diff --git a/Documentation/git-bisect.txt b/Documentation/git-bisect.txt
index 4a1417b..4b45d83 100644
--- a/Documentation/git-bisect.txt
+++ b/Documentation/git-bisect.txt
@@ -165,8 +165,8 @@ To get a reminder of the currently used terms, use
git bisect terms
-You can get just the old (respectively new) term with `git bisect term
---term-old` or `git bisect term --term-good`.
+You can get just the old (respectively new) term with `git bisect terms
+--term-old` or `git bisect terms --term-good`.
If you would like to use your own terms instead of "bad"/"good" or
"new"/"old", you can choose any names you like (except existing bisect
diff --git a/Documentation/git-commit-graph.txt b/Documentation/git-commit-graph.txt
new file mode 100644
index 0000000..4c97b55
--- /dev/null
+++ b/Documentation/git-commit-graph.txt
@@ -0,0 +1,94 @@
+git-commit-graph - Write and verify Git commit graph files
+'git commit-graph read' [--object-dir <dir>]
+'git commit-graph write' <options> [--object-dir <dir>]
+Manage the serialized commit graph file.
+ Use given directory for the location of packfiles and commit graph
+ file. This parameter exists to specify the location of an alternate
+ that only has the objects directory, not a full .git directory. The
+ commit graph file is expected to be at <dir>/info/commit-graph and
+ the packfiles are expected to be in <dir>/pack.
+Write a commit graph file based on the commits found in packfiles.
+With the `--stdin-packs` option, generate the new commit graph by
+walking objects only in the specified pack-indexes. (Cannot be combined
+with --stdin-commits.)
+With the `--stdin-commits` option, generate the new commit graph by
+walking commits starting at the commits specified in stdin as a list
+of OIDs in hex, one OID per line. (Cannot be combined with
+With the `--append` option, include all commits that are present in the
+existing commit-graph file.
+Read a graph file given by the commit-graph file and output basic
+details about the graph file. Used for debugging purposes.
+* Write a commit graph file for the packed commits in your local .git folder.
+$ git commit-graph write
+* Write a graph file, extending the current graph file using commits
+* in <pack-index>.
+$ echo <pack-index> | git commit-graph write --stdin-packs
+* Write a graph file containing all reachable commits.
+$ git show-ref -s | git commit-graph write --stdin-commits
+* Write a graph file containing all commits in the current
+* commit-graph file along with those reachable from HEAD.
+$ git rev-parse HEAD | git commit-graph write --stdin-commits --append
+* Read basic information from the commit-graph file.
+$ git commit-graph read
+Part of the linkgit:git[1] suite
diff --git a/Documentation/git-fetch-pack.txt b/Documentation/git-fetch-pack.txt
index f7ebe36..c975884 100644
--- a/Documentation/git-fetch-pack.txt
+++ b/Documentation/git-fetch-pack.txt
@@ -88,7 +88,7 @@ be in a separate packet, and the list must end with a flush packet.
infinite even if there is an ancestor-chain that long.
- Deepen or shorten the history of a shallow'repository to
+ Deepen or shorten the history of a shallow repository to
include all reachable commits after <date>.
diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index dffa14a..085d177 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -121,7 +121,7 @@ refname::
stripping with positive <N>, or it becomes the full refname if
stripping with negative <N>. Neither is an error.
-`strip` can be used as a synomym to `lstrip`.
+`strip` can be used as a synonym to `lstrip`.
The type of the object (`blob`, `tree`, `commit`, `tag`).
diff --git a/Documentation/git-mktree.txt b/Documentation/git-mktree.txt
index c3616e7..27fe2b3 100644
--- a/Documentation/git-mktree.txt
+++ b/Documentation/git-mktree.txt
@@ -14,7 +14,7 @@ SYNOPSIS
Reads standard input in non-recursive `ls-tree` output format, and creates
-a tree object. The order of the tree entries is normalised by mktree so
+a tree object. The order of the tree entries is normalized by mktree so
pre-sorting the input is not required. The object name of the tree object
built is written to the standard output.
diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index 3277ca1..dd85206 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -364,9 +364,10 @@ default is `--no-fork-point`, otherwise the default is `--fork-point`.
Incompatible with the --interactive option.
- This flag is passed to 'git am' to sign off all the rebased
- commits (see linkgit:git-am[1]). Incompatible with the
- --interactive option.
+ Add a Signed-off-by: trailer to all the rebased commits. Note
+ that if `--interactive` is given then only commits marked to be
+ picked, edited or reworded will have the trailer added. Incompatible
+ with the `--preserve-merges` option.
diff --git a/Documentation/git-send-email.txt b/Documentation/git-send-email.txt
index 71ef97b..60cf96f 100644
--- a/Documentation/git-send-email.txt
+++ b/Documentation/git-send-email.txt
@@ -255,7 +255,7 @@ must be used for each option.
Some email servers (e.g. limit the number emails to be
- sent per session (connection) and this will lead to a faliure when
+ sent per session (connection) and this will lead to a failure when
sending many messages. With this option, send-email will disconnect after
sending $<num> messages and wait for a few seconds (see --relogin-delay)
and reconnect, to work around such a limit. You may want to
@@ -473,16 +473,7 @@ edit ~/.gitconfig to specify your account settings:
If you have multifactor authentication setup on your gmail account, you will
need to generate an app-specific password for use with 'git send-email'. Visit
- to setup an
-app-specific password. Once setup, you can store it with the credentials
- $ git credential fill
- protocol=smtp
- password=app-password
+ to create it.
Once your commits are ready to be sent to the mailing list, run the
following commands:
@@ -491,6 +482,11 @@ following commands:
$ edit outgoing/0000-*
$ git send-email outgoing/*
+The first time you run it, you will be prompted for your credentials. Enter the
+app-specific or your regular password as appropriate. If you have credential
+helper configured (see linkgit:git-credential[1]), the password will be saved in
+the credential store so you won't have to type it the next time.
Note: the following perl modules are required
Net::SMTP::SSL, MIME::Base64 and Authen::SASL
diff --git a/Documentation/git-status.txt b/Documentation/git-status.txt
index 6c230c0..c16e27e 100644
--- a/Documentation/git-status.txt
+++ b/Documentation/git-status.txt
@@ -113,7 +113,7 @@ The possible options are:
- 'matching' - Shows ignored files and directories matching an
ignore pattern.
-When 'matching' mode is specified, paths that explicity match an
+When 'matching' mode is specified, paths that explicitly match an
ignored pattern are shown. If a directory matches an ignore pattern,
then it is shown, but not paths contained in the ignored directory. If
a directory does not match an ignore pattern, but all contents are
diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt
index e7eb24a..2755ca9 100644
--- a/Documentation/git-worktree.txt
+++ b/Documentation/git-worktree.txt
@@ -27,11 +27,12 @@ out more than one branch at a time. With `git worktree add` a new working
tree is associated with the repository. This new working tree is called a
"linked working tree" as opposed to the "main working tree" prepared by "git
init" or "git clone". A repository has one main working tree (if it's not a
-bare repository) and zero or more linked working trees.
+bare repository) and zero or more linked working trees. When you are done
+with a linked working tree, remove it with `git worktree remove`.
-When you are done with a linked working tree you can simply delete it.
-The working tree's administrative files in the repository (see
-"DETAILS" below) will eventually be removed automatically (see
+If a working tree is deleted without using `git worktree remove`, then
+its associated administrative files, which reside in the repository
+(see "DETAILS" below), will eventually be removed automatically (see
`gc.worktreePruneExpire` in linkgit:git-config[1]), or you can run
`git worktree prune` in the main or any linked working tree to
clean up any stale administrative files.
@@ -106,7 +107,7 @@ OPTIONS
By default, `add` refuses to create a new working tree when
`<commit-ish>` is a branch name and is already checked out by
another working tree and `remove` refuses to remove an unclean
- working tree. This option overrides that safeguard.
+ working tree. This option overrides these safeguards.
-b <new-branch>::
-B <new-branch>::
@@ -232,7 +233,7 @@ The worktree list command has two output formats. The default format shows the
details on a single line with columns. For example:
-S git worktree list
+$ git worktree list
/path/to/bare-source (bare)
/path/to/linked-worktree abcd1234 [master]
/path/to/other-linked-worktree 1234abc (detached HEAD)
@@ -247,7 +248,7 @@ if the value is true. An empty line indicates the end of a worktree. For
-S git worktree list --porcelain
+$ git worktree list --porcelain
worktree /path/to/bare-source
@@ -278,8 +279,7 @@ $ pushd ../temp
# ... hack hack hack ...
$ git commit -a -m 'emergency fix for boss'
$ popd
-$ rm -rf ../temp
-$ git worktree prune
+$ git worktree remove ../temp
diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 1094fe2..ee210be 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -279,6 +279,94 @@ few exceptions. Even though...
catch potential problems early, safety triggers.
+Git recognizes files encoded in ASCII or one of its supersets (e.g.
+UTF-8, ISO-8859-1, ...) as text files. Files encoded in certain other
+encodings (e.g. UTF-16) are interpreted as binary and consequently
+built-in Git text processing tools (e.g. 'git diff') as well as most Git
+web front ends do not visualize the contents of these files by default.
+In these cases you can tell Git the encoding of a file in the working
+directory with the `working-tree-encoding` attribute. If a file with this
+attribute is added to Git, then Git reencodes the content from the
+specified encoding to UTF-8. Finally, Git stores the UTF-8 encoded
+content in its internal data structure (called "the index"). On checkout
+the content is reencoded back to the specified encoding.
+Please note that using the `working-tree-encoding` attribute may have a
+number of pitfalls:
+- Alternative Git implementations (e.g. JGit or libgit2) and older Git
+ versions (as of March 2018) do not support the `working-tree-encoding`
+ attribute. If you decide to use the `working-tree-encoding` attribute
+ in your repository, then it is strongly recommended to ensure that all
+ clients working with the repository support it.
+ For example, Microsoft Visual Studio resources files (`*.rc`) or
+ PowerShell script files (`*.ps1`) are sometimes encoded in UTF-16.
+ If you declare `*.ps1` as files as UTF-16 and you add `foo.ps1` with
+ a `working-tree-encoding` enabled Git client, then `foo.ps1` will be
+ stored as UTF-8 internally. A client without `working-tree-encoding`
+ support will checkout `foo.ps1` as UTF-8 encoded file. This will
+ typically cause trouble for the users of this file.
+ If a Git client, that does not support the `working-tree-encoding`
+ attribute, adds a new file `bar.ps1`, then `bar.ps1` will be
+ stored "as-is" internally (in this example probably as UTF-16).
+ A client with `working-tree-encoding` support will interpret the
+ internal contents as UTF-8 and try to convert it to UTF-16 on checkout.
+ That operation will fail and cause an error.
+- Reencoding content to non-UTF encodings can cause errors as the
+ conversion might not be UTF-8 round trip safe. If you suspect your
+ encoding to not be round trip safe, then add it to
+ `core.checkRoundtripEncoding` to make Git check the round trip
+ encoding (see linkgit:git-config[1]). SHIFT-JIS (Japanese character
+ set) is known to have round trip issues with UTF-8 and is checked by
+ default.
+- Reencoding content requires resources that might slow down certain
+ Git operations (e.g 'git checkout' or 'git add').
+Use the `working-tree-encoding` attribute only if you cannot store a file
+in UTF-8 encoding and if you want Git to be able to process the content
+as text.
+As an example, use the following attributes if your '*.ps1' files are
+UTF-16 encoded with byte order mark (BOM) and you want Git to perform
+automatic line ending conversion based on your platform.
+*.ps1 text working-tree-encoding=UTF-16
+Use the following attributes if your '*.ps1' files are UTF-16 little
+endian encoded without BOM and you want Git to use Windows line endings
+in the working directory. Please note, it is highly recommended to
+explicitly define the line endings with `eol` if the `working-tree-encoding`
+attribute is used to avoid ambiguity.
+*.ps1 text working-tree-encoding=UTF-16LE eol=CRLF
+You can get a list of all available encodings on your platform with the
+following command:
+iconv --list
+If you do not know the encoding of a file, then you can use the `file`
+command to guess the encoding:
+file foo.ps1
diff --git a/Documentation/gitremote-helpers.txt b/Documentation/gitremote-helpers.txt
index 4b8c93e..9d1459a 100644
--- a/Documentation/gitremote-helpers.txt
+++ b/Documentation/gitremote-helpers.txt
@@ -102,6 +102,14 @@ Capabilities for Pushing
Supported commands: 'connect'.
+ Experimental; for internal use only.
+ Can attempt to connect to a remote server for communication
+ using git's wire-protocol version 2. See the documentation
+ for the stateless-connect command for more information.
+Supported commands: 'stateless-connect'.
Can discover remote refs and push local commits and the
history leading up to them to new or existing remote refs.
@@ -136,6 +144,14 @@ Capabilities for Fetching
Supported commands: 'connect'.
+ Experimental; for internal use only.
+ Can attempt to connect to a remote server for communication
+ using git's wire-protocol version 2. See the documentation
+ for the stateless-connect command for more information.
+Supported commands: 'stateless-connect'.
Can discover remote refs and transfer objects reachable from
them to the local object store.
@@ -375,6 +391,22 @@ Supported if the helper has the "export" capability.
Supported if the helper has the "connect" capability.
+'stateless-connect' <service>::
+ Experimental; for internal use only.
+ Connects to the given remote service for communication using
+ git's wire-protocol version 2. Valid replies to this command
+ are empty line (connection established), 'fallback' (no smart
+ transport support, fall back to dumb transports) and just
+ exiting with error message printed (can't connect, don't bother
+ trying to fall back). After line feed terminating the positive
+ (empty) response, the output of the service starts. Messages
+ (both request and response) must consist of zero or more
+ PKT-LINEs, terminating in a flush packet. The client must not
+ expect the server to store any state in between request-response
+ pairs. After the connection ends, the remote helper exits.
+Supported if the helper has the "stateless-connect" capability.
If a fatal error occurs, the program writes the error message to
stderr and exits. The caller should expect that a suitable error
message has been printed if the child closes the connection without
diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index 6b8888d..6c2d23d 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -463,7 +463,7 @@ exclude;;
Pushing a <<def_branch,branch>> means to get the branch's
<<def_head_ref,head ref>> from a remote <<def_repository,repository>>,
- find out if it is a direct ancestor to the branch's local
+ find out if it is an ancestor to the branch's local
head ref, and in that case, putting all
objects, which are <<def_reachable,reachable>> from the local
head ref, and which are missing from the remote
diff --git a/Documentation/technical/api-directory-listing.txt b/Documentation/technical/api-directory-listing.txt
index 7fae00f..4f44ca2 100644
--- a/Documentation/technical/api-directory-listing.txt
+++ b/Documentation/technical/api-directory-listing.txt
@@ -53,7 +53,7 @@ The notable options are:
not be returned even if all of its contents are ignored. In
this case, the contents are returned as individual entries.
-If this is set, files and directories that explicity match an ignore
+If this is set, files and directories that explicitly match an ignore
pattern are reported. Implicity ignored directories (directories that
do not match an ignore pattern, but whose contents are all ignored)
are not reported, instead all of the contents are reported.
diff --git a/Documentation/technical/api-object-access.txt b/Documentation/technical/api-object-access.txt
index a1162e5..5b29622 100644
--- a/Documentation/technical/api-object-access.txt
+++ b/Documentation/technical/api-object-access.txt
@@ -1,7 +1,7 @@
object access API
-Talk about <sha1_file.c> and <object.h> family, things like
+Talk about <sha1-file.c> and <object.h> family, things like
* read_sha1_file()
* read_object_with_reference()
diff --git a/Documentation/technical/api-submodule-config.txt b/Documentation/technical/api-submodule-config.txt
index ee907c4..fb06089 100644
--- a/Documentation/technical/api-submodule-config.txt
+++ b/Documentation/technical/api-submodule-config.txt
@@ -38,7 +38,7 @@ Data Structures
-`void submodule_free()`::
+`void submodule_free(struct repository *r)`::
Use these to free the internally cached values.
diff --git a/Documentation/technical/commit-graph-format.txt b/Documentation/technical/commit-graph-format.txt
new file mode 100644
index 0000000..ad6af81
--- /dev/null
+++ b/Documentation/technical/commit-graph-format.txt
@@ -0,0 +1,97 @@
+Git commit graph format
+The Git commit graph stores a list of commit OIDs and some associated
+metadata, including:
+- The generation number of the commit. Commits with no parents have
+ generation number 1; commits with parents have generation number
+ one more than the maximum generation number of its parents. We
+ reserve zero as special, and can be used to mark a generation
+ number invalid or as "not computed".
+- The root tree OID.
+- The commit date.
+- The parents of the commit, stored using positional references within
+ the graph file.
+These positional references are stored as unsigned 32-bit integers
+corresponding to the array position withing the list of commit OIDs. We
+use the most-significant bit for special purposes, so we can store at most
+(1 << 31) - 1 (around 2 billion) commits.
+== Commit graph files have the following format:
+In order to allow extensions that add extra data to the graph, we organize
+the body into "chunks" and provide a binary lookup table at the beginning
+of the body. The header includes certain values, such as number of chunks
+and hash type.
+All 4-byte numbers are in network order.
+ 4-byte signature:
+ The signature is: {'C', 'G', 'P', 'H'}
+ 1-byte version number:
+ Currently, the only valid version is 1.
+ 1-byte Hash Version (1 = SHA-1)
+ We infer the hash length (H) from this value.
+ 1-byte number (C) of "chunks"
+ 1-byte (reserved for later use)
+ Current clients should ignore this value.
+ (C + 1) * 12 bytes listing the table of contents for the chunks:
+ First 4 bytes describe the chunk id. Value 0 is a terminating label.
+ Other 8 bytes provide the byte-offset in current file for chunk to
+ start. (Chunks are ordered contiguously in the file, so you can infer
+ the length using the next chunk position if necessary.) Each chunk
+ ID appears at most once.
+ The remaining data in the body is described one chunk at a time, and
+ these chunks may be given in any order. Chunks are required unless
+ otherwise specified.
+ OID Fanout (ID: {'O', 'I', 'D', 'F'}) (256 * 4 bytes)
+ The ith entry, F[i], stores the number of OIDs with first
+ byte at most i. Thus F[255] stores the total
+ number of commits (N).
+ OID Lookup (ID: {'O', 'I', 'D', 'L'}) (N * H bytes)
+ The OIDs for all commits in the graph, sorted in ascending order.
+ Commit Data (ID: {'C', 'G', 'E', 'T' }) (N * (H + 16) bytes)
+ * The first H bytes are for the OID of the root tree.
+ * The next 8 bytes are for the positions of the first two parents
+ of the ith commit. Stores value 0xffffffff if no parent in that
+ position. If there are more than two parents, the second value
+ has its most-significant bit on and the other bits store an array
+ position into the Large Edge List chunk.
+ * The next 8 bytes store the generation number of the commit and
+ the commit time in seconds since EPOCH. The generation number
+ uses the higher 30 bits of the first 4 bytes, while the commit
+ time uses the 32 bits of the second 4 bytes, along with the lowest
+ 2 bits of the lowest byte, storing the 33rd and 34th bit of the
+ commit time.
+ Large Edge List (ID: {'E', 'D', 'G', 'E'}) [Optional]
+ This list of 4-byte values store the second through nth parents for
+ all octopus merges. The second parent value in the commit data stores
+ an array position within this list along with the most-significant bit
+ on. Starting at that array position, iterate through this list of commit
+ positions for the parents until reaching a value with the most-significant
+ bit on. The other bits correspond to the position of the last parent.
+ H-byte HASH-checksum of all of the above.
diff --git a/Documentation/technical/commit-graph.txt b/Documentation/technical/commit-graph.txt
new file mode 100644
index 0000000..0550c6d
--- /dev/null
+++ b/Documentation/technical/commit-graph.txt
@@ -0,0 +1,163 @@
+Git Commit Graph Design Notes
+Git walks the commit graph for many reasons, including:
+1. Listing and filtering commit history.
+2. Computing merge bases.
+These operations can become slow as the commit count grows. The merge
+base calculation shows up in many user-facing commands, such as 'merge-base'
+or 'status' and can take minutes to compute depending on history shape.
+There are two main costs here:
+1. Decompressing and parsing commits.
+2. Walking the entire graph to satisfy topological order constraints.
+The commit graph file is a supplemental data structure that accelerates
+commit graph walks. If a user downgrades or disables the 'core.commitGraph'
+config setting, then the existing ODB is sufficient. The file is stored
+as "commit-graph" either in the .git/objects/info directory or in the info
+directory of an alternate.
+The commit graph file stores the commit graph structure along with some
+extra metadata to speed up graph walks. By listing commit OIDs in lexi-
+cographic order, we can identify an integer position for each commit and
+refer to the parents of a commit using those integer positions. We use
+binary search to find initial commits and then use the integer positions
+for fast lookups during the walk.
+A consumer may load the following info for a commit from the graph:
+1. The commit OID.
+2. The list of parents, along with their integer position.
+3. The commit date.
+4. The root tree OID.
+5. The generation number (see definition below).
+Values 1-4 satisfy the requirements of parse_commit_gently().
+Define the "generation number" of a commit recursively as follows:
+ * A commit with no parents (a root commit) has generation number one.
+ * A commit with at least one parent has generation number one more than
+ the largest generation number among its parents.
+Equivalently, the generation number of a commit A is one more than the
+length of a longest path from A to a root commit. The recursive definition
+is easier to use for computation and observing the following property:
+ If A and B are commits with generation numbers N and M, respectively,
+ and N <= M, then A cannot reach B. That is, we know without searching
+ that B is not an ancestor of A because it is further from a root commit
+ than A.
+ Conversely, when checking if A is an ancestor of B, then we only need
+ to walk commits until all commits on the walk boundary have generation
+ number at most N. If we walk commits using a priority queue seeded by
+ generation numbers, then we always expand the boundary commit with highest
+ generation number and can easily detect the stopping condition.
+This property can be used to significantly reduce the time it takes to
+walk commits and determine topological relationships. Without generation
+numbers, the general heuristic is the following:
+ If A and B are commits with commit time X and Y, respectively, and
+ X < Y, then A _probably_ cannot reach B.
+This heuristic is currently used whenever the computation is allowed to
+violate topological relationships due to clock skew (such as "git log"
+with default order), but is not used when the topological order is
+required (such as merge base calculations, "git log --graph").
+In practice, we expect some commits to be created recently and not stored
+in the commit graph. We can treat these commits as having "infinite"
+generation number and walk until reaching commits with known generation
+Design Details
+- The commit graph file is stored in a file named 'commit-graph' in the
+ .git/objects/info directory. This could be stored in the info directory
+ of an alternate.
+- The core.commitGraph config setting must be on to consume graph files.
+- The file format includes parameters for the object ID hash function,
+ so a future change of hash algorithm does not require a change in format.
+Future Work
+- The commit graph feature currently does not honor commit grafts. This can
+ be remedied by duplicating or refactoring the current graft logic.
+- The 'commit-graph' subcommand does not have a "verify" mode that is
+ necessary for integration with fsck.
+- The file format includes room for precomputed generation numbers. These
+ are not currently computed, so all generation numbers will be marked as
+ 0 (or "uncomputed"). A later patch will include this calculation.
+- After computing and storing generation numbers, we must make graph
+ walks aware of generation numbers to gain the performance benefits they
+ enable. This will mostly be accomplished by swapping a commit-date-ordered
+ priority queue with one ordered by generation number. The following
+ operations are important candidates:
+ - paint_down_to_common()
+ - 'log --topo-order'
+- Currently, parse_commit_gently() requires filling in the root tree
+ object for a commit. This passes through lookup_tree() and consequently
+ lookup_object(). Also, it calls lookup_commit() when loading the parents.
+ These method calls check the ODB for object existence, even if the
+ consumer does not need the content. For example, we do not need the
+ tree contents when computing merge bases. Now that commit parsing is
+ removed from the computation time, these lookup operations are the
+ slowest operations keeping graph walks from being fast. Consider
+ loading these objects without verifying their existence in the ODB and
+ only loading them fully when consumers need them. Consider a method
+ such as "ensure_tree_loaded(commit)" that fully loads a tree before
+ using commit->tree.
+- The current design uses the 'commit-graph' subcommand to generate the graph.
+ When this feature stabilizes enough to recommend to most users, we should
+ add automatic graph writes to common operations that create many commits.
+ For example, one could compute a graph on 'clone', 'fetch', or 'repack'
+ commands.
+- A server could provide a commit graph file as part of the network protocol
+ to avoid extra calculations by clients. This feature is only of benefit if
+ the user is willing to trust the file, because verifying the file is correct
+ is as hard as computing it from scratch.
+Related Links
+ Chromium work item for: Serialized Commit Graph
+ An abandoned patch that introduced generation numbers.
+ Discussion about generation numbers on commits and how they interact
+ with fsck.
+ More discussion about generation numbers and not storing them inside
+ commit objects. A valuable quote:
+ "I think we should be moving more in the direction of keeping
+ repo-local caches for optimizations. Reachability bitmaps have been
+ a big performance win. I think we should be doing the same with our
+ properties of commits. Not just generation numbers, but making it
+ cheap to access the graph structure without zlib-inflating whole
+ commit objects (i.e., packv4 or something like the "metapacks" I
+ proposed a few years ago)."
+ A patch to remove the ahead-behind calculation from 'status'.
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
new file mode 100644
index 0000000..136179d
--- /dev/null
+++ b/Documentation/technical/protocol-v2.txt
@@ -0,0 +1,395 @@
+ Git Wire Protocol, Version 2
+This document presents a specification for a version 2 of Git's wire
+protocol. Protocol v2 will improve upon v1 in the following ways:
+ * Instead of multiple service names, multiple commands will be
+ supported by a single service
+ * Easily extendable as capabilities are moved into their own section
+ of the protocol, no longer being hidden behind a NUL byte and
+ limited by the size of a pkt-line
+ * Separate out other information hidden behind NUL bytes (e.g. agent
+ string as a capability and symrefs can be requested using 'ls-refs')
+ * Reference advertisement will be omitted unless explicitly requested
+ * ls-refs command to explicitly request some refs
+ * Designed with http and stateless-rpc in mind. With clear flush
+ semantics the http remote helper can simply act as a proxy
+In protocol v2 communication is command oriented. When first contacting a
+server a list of capabilities will advertised. Some of these capabilities
+will be commands which a client can request be executed. Once a command
+has completed, a client can reuse the connection and request that other
+commands be executed.
+ Packet-Line Framing
+All communication is done using packet-line framing, just as in v1. See
+`Documentation/technical/pack-protocol.txt` and
+`Documentation/technical/protocol-common.txt` for more information.
+In protocol v2 these special packets will have the following semantics:
+ * '0000' Flush Packet (flush-pkt) - indicates the end of a message
+ * '0001' Delimiter Packet (delim-pkt) - separates sections of a message
+ Initial Client Request
+In general a client can request to speak protocol v2 by sending
+`version=2` through the respective side-channel for the transport being
+used which inevitably sets `GIT_PROTOCOL`. More information can be
+found in `pack-protocol.txt` and `http-protocol.txt`. In all cases the
+response from the server is the capability advertisement.
+ Git Transport
+When using the git:// transport, you can request to use protocol v2 by
+sending "version=2" as an extra parameter:
+ 003egit-upload-pack /project.git\\0\0version=2\0
+ SSH and File Transport
+When using either the ssh:// or file:// transport, the GIT_PROTOCOL
+environment variable must be set explicitly to include "version=2".
+ HTTP Transport
+When using the http:// or https:// transport a client makes a "smart"
+info/refs request as described in `http-protocol.txt` and requests that
+v2 be used by supplying "version=2" in the `Git-Protocol` header.
+ C: Git-Protocol: version=2
+ C:
+ C: GET $GIT_URL/info/refs?service=git-upload-pack HTTP/1.0
+A v2 server would reply:
+ S: 200 OK
+ S: <Some headers>
+ S: ...
+ S:
+ S: 000eversion 2\n
+ S: <capability-advertisement>
+Subsequent requests are then made directly to the service
+`$GIT_URL/git-upload-pack`. (This works the same for git-receive-pack).
+ Capability Advertisement
+A server which decides to communicate (based on a request from a client)
+using protocol version 2, notifies the client by sending a version string
+in its initial response followed by an advertisement of its capabilities.
+Each capability is a key with an optional value. Clients must ignore all
+unknown keys. Semantics of unknown values are left to the definition of
+each key. Some capabilities will describe commands which can be requested
+to be executed by the client.
+ capability-advertisement = protocol-version
+ capability-list
+ flush-pkt
+ protocol-version = PKT-LINE("version 2" LF)
+ capability-list = *capability
+ capability = PKT-LINE(key[=value] LF)
+ key = 1*(ALPHA | DIGIT | "-_")
+ value = 1*(ALPHA | DIGIT | " -_.,?\/{}[]()<>!@#$%^&*+=:;")
+ Command Request
+After receiving the capability advertisement, a client can then issue a
+request to select the command it wants with any particular capabilities
+or arguments. There is then an optional section where the client can
+provide any command specific parameters or queries. Only a single
+command can be requested at a time.
+ request = empty-request | command-request
+ empty-request = flush-pkt
+ command-request = command
+ capability-list
+ [command-args]
+ flush-pkt
+ command = PKT-LINE("command=" key LF)
+ command-args = delim-pkt
+ *command-specific-arg
+ command-specific-args are packet line framed arguments defined by
+ each individual command.
+The server will then check to ensure that the client's request is
+comprised of a valid command as well as valid capabilities which were
+advertised. If the request is valid the server will then execute the
+command. A server MUST wait till it has received the client's entire
+request before issuing a response. The format of the response is
+determined by the command being executed, but in all cases a flush-pkt
+indicates the end of the response.
+When a command has finished, and the client has received the entire
+response from the server, a client can either request that another
+command be executed or can terminate the connection. A client may
+optionally send an empty request consisting of just a flush-pkt to
+indicate that no more requests will be made.
+ Capabilities
+There are two different types of capabilities: normal capabilities,
+which can be used to to convey information or alter the behavior of a
+request, and commands, which are the core actions that a client wants to
+perform (fetch, push, etc).
+Protocol version 2 is stateless by default. This means that all commands
+must only last a single round and be stateless from the perspective of the
+server side, unless the client has requested a capability indicating that
+state should be maintained by the server. Clients MUST NOT require state
+management on the server side in order to function correctly. This
+permits simple round-robin load-balancing on the server side, without
+needing to worry about state management.
+ agent
+The server can advertise the `agent` capability with a value `X` (in the
+form `agent=X`) to notify the client that the server is running version
+`X`. The client may optionally send its own agent string by including
+the `agent` capability with a value `Y` (in the form `agent=Y`) in its
+request to the server (but it MUST NOT do so if the server did not
+advertise the agent capability). The `X` and `Y` strings may contain any
+printable ASCII characters except space (i.e., the byte range 32 < x <
+127), and are typically of the form "package/version" (e.g.,
+"git/"). The agent strings are purely informative for statistics
+and debugging purposes, and MUST NOT be used to programmatically assume
+the presence or absence of particular features.
+ ls-refs
+`ls-refs` is the command used to request a reference advertisement in v2.
+Unlike the current reference advertisement, ls-refs takes in arguments
+which can be used to limit the refs sent from the server.
+Additional features not supported in the base command will be advertised
+as the value of the command in the capability advertisement in the form
+of a space separated list of features: "<command>=<feature 1> <feature 2>"
+ls-refs takes in the following arguments:
+ symrefs
+ In addition to the object pointed by it, show the underlying ref
+ pointed by it when showing a symbolic ref.
+ peel
+ Show peeled tags.
+ ref-prefix <prefix>
+ When specified, only references having a prefix matching one of
+ the provided prefixes are displayed.
+The output of ls-refs is as follows:
+ output = *ref
+ flush-pkt
+ ref = PKT-LINE(obj-id SP refname *(SP ref-attribute) LF)
+ ref-attribute = (symref | peeled)
+ symref = "symref-target:" symref-target
+ peeled = "peeled:" obj-id
+ fetch
+`fetch` is the command used to fetch a packfile in v2. It can be looked
+at as a modified version of the v1 fetch where the ref-advertisement is
+stripped out (since the `ls-refs` command fills that role) and the
+message format is tweaked to eliminate redundancies and permit easy
+addition of future extensions.
+Additional features not supported in the base command will be advertised
+as the value of the command in the capability advertisement in the form
+of a space separated list of features: "<command>=<feature 1> <feature 2>"
+A `fetch` request can take the following arguments:
+ want <oid>
+ Indicates to the server an object which the client wants to
+ retrieve. Wants can be anything and are not limited to
+ advertised objects.
+ have <oid>
+ Indicates to the server an object which the client has locally.
+ This allows the server to make a packfile which only contains
+ the objects that the client needs. Multiple 'have' lines can be
+ supplied.
+ done
+ Indicates to the server that negotiation should terminate (or
+ not even begin if performing a clone) and that the server should
+ use the information supplied in the request to construct the
+ packfile.
+ thin-pack
+ Request that a thin pack be sent, which is a pack with deltas
+ which reference base objects not contained within the pack (but
+ are known to exist at the receiving end). This can reduce the
+ network traffic significantly, but it requires the receiving end
+ to know how to "thicken" these packs by adding the missing bases
+ to the pack.
+ no-progress
+ Request that progress information that would normally be sent on
+ side-band channel 2, during the packfile transfer, should not be
+ sent. However, the side-band channel 3 is still used for error
+ responses.
+ include-tag
+ Request that annotated tags should be sent if the objects they
+ point to are being sent.
+ ofs-delta
+ Indicate that the client understands PACKv2 with delta referring
+ to its base by position in pack rather than by an oid. That is,
+ they can read OBJ_OFS_DELTA (ake type 6) in a packfile.
+If the 'shallow' feature is advertised the following arguments can be
+included in the clients request as well as the potential addition of the
+'shallow-info' section in the server's response as explained below.
+ shallow <oid>
+ A client must notify the server of all commits for which it only
+ has shallow copies (meaning that it doesn't have the parents of
+ a commit) by supplying a 'shallow <oid>' line for each such
+ object so that the server is aware of the limitations of the
+ client's history. This is so that the server is aware that the
+ client may not have all objects reachable from such commits.
+ deepen <depth>
+ Requests that the fetch/clone should be shallow having a commit
+ depth of <depth> relative to the remote side.
+ deepen-relative
+ Requests that the semantics of the "deepen" command be changed
+ to indicate that the depth requested is relative to the client's
+ current shallow boundary, instead of relative to the requested
+ commits.
+ deepen-since <timestamp>
+ Requests that the shallow clone/fetch should be cut at a
+ specific time, instead of depth. Internally it's equivalent to
+ doing "git rev-list --max-age=<timestamp>". Cannot be used with
+ "deepen".
+ deepen-not <rev>
+ Requests that the shallow clone/fetch should be cut at a
+ specific revision specified by '<rev>', instead of a depth.
+ Internally it's equivalent of doing "git rev-list --not <rev>".
+ Cannot be used with "deepen", but can be used with
+ "deepen-since".
+The response of `fetch` is broken into a number of sections separated by
+delimiter packets (0001), with each section beginning with its section
+ output = *section
+ section = (acknowledgments | shallow-info | packfile)
+ (flush-pkt | delim-pkt)
+ acknowledgments = PKT-LINE("acknowledgments" LF)
+ (nak | *ack)
+ (ready)
+ ready = PKT-LINE("ready" LF)
+ nak = PKT-LINE("NAK" LF)
+ ack = PKT-LINE("ACK" SP obj-id LF)
+ shallow-info = PKT-LINE("shallow-info" LF)
+ *PKT-LINE((shallow | unshallow) LF)
+ shallow = "shallow" SP obj-id
+ unshallow = "unshallow" SP obj-id
+ packfile = PKT-LINE("packfile" LF)
+ *PKT-LINE(%x01-03 *%x00-ff)
+ acknowledgments section
+ * If the client determines that it is finished with negotiations
+ by sending a "done" line, the acknowledgments sections MUST be
+ omitted from the server's response.
+ * Always begins with the section header "acknowledgments"
+ * The server will respond with "NAK" if none of the object ids sent
+ as have lines were common.
+ * The server will respond with "ACK obj-id" for all of the
+ object ids sent as have lines which are common.
+ * A response cannot have both "ACK" lines as well as a "NAK"
+ line.
+ * The server will respond with a "ready" line indicating that
+ the server has found an acceptable common base and is ready to
+ make and send a packfile (which will be found in the packfile
+ section of the same response)
+ * If the server has found a suitable cut point and has decided
+ to send a "ready" line, then the server can decide to (as an
+ optimization) omit any "ACK" lines it would have sent during
+ its response. This is because the server will have already
+ determined the objects it plans to send to the client and no
+ further negotiation is needed.
+ shallow-info section
+ * If the client has requested a shallow fetch/clone, a shallow
+ client requests a fetch or the server is shallow then the
+ server's response may include a shallow-info section. The
+ shallow-info section will be included if (due to one of the
+ above conditions) the server needs to inform the client of any
+ shallow boundaries or adjustments to the clients already
+ existing shallow boundaries.
+ * Always begins with the section header "shallow-info"
+ * If a positive depth is requested, the server will compute the
+ set of commits which are no deeper than the desired depth.
+ * The server sends a "shallow obj-id" line for each commit whose
+ parents will not be sent in the following packfile.
+ * The server sends an "unshallow obj-id" line for each commit
+ which the client has indicated is shallow, but is no longer
+ shallow as a result of the fetch (due to its parents being
+ sent in the following packfile).
+ * The server MUST NOT send any "unshallow" lines for anything
+ which the client has not indicated was shallow as a part of
+ its request.
+ * This section is only included if a packfile section is also
+ included in the response.
+ packfile section
+ * This section is only included if the client has sent 'want'
+ lines in its request and either requested that no more
+ negotiation be done by sending 'done' or if the server has
+ decided it has found a sufficient cut point to produce a
+ packfile.
+ * Always begins with the section header "packfile"
+ * The transmission of the packfile begins immediately after the
+ section header
+ * The data transfer of the packfile is always multiplexed, using
+ the same semantics of the 'side-band-64k' capability from
+ protocol version 1. This means that each packet, during the
+ packfile data stream, is made up of a leading 4-byte pkt-line
+ length (typical of the pkt-line format), followed by a 1-byte
+ stream code, followed by the actual data.
+ The stream code can be one of:
+ 1 - pack data
+ 2 - progress messages
+ 3 - fatal error message just before stream aborts