path: root/Documentation/technical
diff options
Diffstat (limited to 'Documentation/technical')
9 files changed, 223 insertions, 18 deletions
diff --git a/Documentation/technical/api-argv-array.txt b/Documentation/technical/api-argv-array.txt
index a6b7d83..1a79781 100644
--- a/Documentation/technical/api-argv-array.txt
+++ b/Documentation/technical/api-argv-array.txt
@@ -53,11 +53,3 @@ Functions
Free all memory associated with the array and return it to the
initial, empty state.
- Detach the argv array from the `struct argv_array`, transferring
- ownership of the allocated array and strings.
- Free the memory allocated by a `struct argv_array` that was later
- detached and is now no longer needed.
diff --git a/Documentation/technical/api-builtin.txt b/Documentation/technical/api-builtin.txt
index e3d6e7a..22a39b9 100644
--- a/Documentation/technical/api-builtin.txt
+++ b/Documentation/technical/api-builtin.txt
@@ -22,11 +22,14 @@ Git:
where options is the bitwise-or of:
- Make sure there is a Git directory to work on, and if there is a
- work tree, chdir to the top of it if the command was invoked
- in a subdirectory. If there is no work tree, no chdir() is
- done.
+ If there is not a Git directory to work on, abort. If there
+ is a work tree, chdir to the top of it if the command was
+ invoked in a subdirectory. If there is no work tree, no
+ chdir() is done.
+ If there is a Git directory, chdir as per RUN_SETUP, otherwise,
+ don't chdir anywhere.
diff --git a/Documentation/technical/api-hashmap.txt b/Documentation/technical/api-hashmap.txt
index b977ae8..ad7a5bd 100644
--- a/Documentation/technical/api-hashmap.txt
+++ b/Documentation/technical/api-hashmap.txt
@@ -8,11 +8,19 @@ Data Structures
`struct hashmap`::
- The hash table structure.
+ The hash table structure. Members can be used as follows, but should
+ not be modified directly:
-The `size` member keeps track of the total number of entries. The `cmpfn`
-member is a function used to compare two entries for equality. The `table` and
-`tablesize` members store the hash table and its size, respectively.
+The `size` member keeps track of the total number of entries (0 means the
+hashmap is empty).
+`tablesize` is the allocated size of the hash table. A non-0 value indicates
+that the hashmap is initialized. It may also be useful for statistical purposes
+(i.e. `size / tablesize` is the current load factor).
+`cmpfn` stores the comparison function specified in `hashmap_init()`. In
+advanced scenarios, it may be useful to change this, e.g. to switch between
+case-sensitive and case-insensitive lookup.
`struct hashmap_entry`::
@@ -58,6 +66,15 @@ Functions
`strihash` and `memihash` are case insensitive versions.
+`unsigned int sha1hash(const unsigned char *sha1)`::
+ Converts a cryptographic hash (e.g. SHA-1) into an int-sized hash code
+ for use in hash tables. Cryptographic hashes are supposed to have
+ uniform distribution, so in contrast to `memhash()`, this just copies
+ the first `sizeof(int)` bytes without shuffling any bits. Note that
+ the results will be different on big-endian and little-endian
+ platforms, so they should not be stored or transferred over the net.
`void hashmap_init(struct hashmap *map, hashmap_cmp_fn equals_function, size_t initial_size)`::
Initializes a hashmap structure.
@@ -101,6 +118,20 @@ hashmap_entry) that has at least been initialized with the proper hash code
If an entry with matching hash code is found, `key` and `keydata` are passed
to `hashmap_cmp_fn` to decide whether the entry matches the key.
+`void *hashmap_get_from_hash(const struct hashmap *map, unsigned int hash, const void *keydata)`::
+ Returns the hashmap entry for the specified hash code and key data,
+ or NULL if not found.
+`map` is the hashmap structure.
+`hash` is the hash code of the entry to look up.
+If an entry with matching hash code is found, `keydata` is passed to
+`hashmap_cmp_fn` to decide whether the entry matches the key. The
+`entry_or_key` parameter points to a bogus hashmap_entry structure that
+should not be used in the comparison.
`void *hashmap_get_next(const struct hashmap *map, const void *entry)`::
Returns the next equal hashmap entry, or NULL if not found. This can be
@@ -162,6 +193,21 @@ more entries.
`hashmap_iter_first` is a combination of both (i.e. initializes the iterator
and returns the first entry, if any).
+`const char *strintern(const char *string)`::
+`const void *memintern(const void *data, size_t len)`::
+ Returns the unique, interned version of the specified string or data,
+ similar to the `String.intern` API in Java and .NET, respectively.
+ Interned strings remain valid for the entire lifetime of the process.
+Can be used as `[x]strdup()` or `xmemdupz` replacement, except that interned
+strings / data must not be modified or freed.
+Interned strings are best used for short strings with high probability of
+Uses a hashmap to store the pool of interned strings.
Usage example
diff --git a/Documentation/technical/api-run-command.txt b/Documentation/technical/api-run-command.txt
index 5d7d7f2..69510ae 100644
--- a/Documentation/technical/api-run-command.txt
+++ b/Documentation/technical/api-run-command.txt
@@ -109,6 +109,13 @@ terminated), of which .argv[0] is the program name to run (usually
without a path). If the command to run is a git command, set argv[0] to
the command name without the 'git-' prefix and set .git_cmd = 1.
+Note that the ownership of the memory pointed to by .argv stays with the
+caller, but it should survive until `finish_command` completes. If the
+.argv member is NULL, `start_command` will point it at the .args
+`argv_array` (so you may use one or the other, but you must use exactly
+one). The memory in .args will be cleaned up automatically during
+`finish_command` (or during `start_command` when it is unsuccessful).
The members .in, .out, .err are used to redirect stdin, stdout,
stderr as follows:
diff --git a/Documentation/technical/api-strbuf.txt b/Documentation/technical/api-strbuf.txt
index 1d00e4d..f9c06a7 100644
--- a/Documentation/technical/api-strbuf.txt
+++ b/Documentation/technical/api-strbuf.txt
@@ -121,10 +121,28 @@ Functions
* Related to the contents of the buffer
+ Strip whitespace from the beginning and end of a string.
+ Equivalent to performing `strbuf_rtrim()` followed by `strbuf_ltrim()`.
Strip whitespace from the end of a string.
+ Strip whitespace from the beginning of a string.
+ Replace the contents of the strbuf with a reencoded form. Returns -1
+ on error, 0 on success.
+ Lowercase each character in the buffer using `tolower`.
Compare two buffers. Returns an integer less than, equal to, or greater
diff --git a/Documentation/technical/api-string-list.txt b/Documentation/technical/api-string-list.txt
index 20be348..d51a657 100644
--- a/Documentation/technical/api-string-list.txt
+++ b/Documentation/technical/api-string-list.txt
@@ -68,6 +68,11 @@ Functions
* General ones (works with sorted and unsorted lists as well)
+ Initialize the members of the string_list, set `strdup_strings`
+ member according to the value of the second parameter.
Apply a function to each item in a list, retaining only the
@@ -200,3 +205,5 @@ Represents the list itself.
You should not tamper with it.
. Setting the `strdup_strings` member to 1 will strdup() the strings
before adding them, see above.
+. The `compare_strings_fn` member is used to specify a custom compare
+ function, otherwise `strcmp()` is used as the default function.
diff --git a/Documentation/technical/api-trace.txt b/Documentation/technical/api-trace.txt
new file mode 100644
index 0000000..097a651
--- /dev/null
+++ b/Documentation/technical/api-trace.txt
@@ -0,0 +1,97 @@
+trace API
+The trace API can be used to print debug messages to stderr or a file. Trace
+code is inactive unless explicitly enabled by setting `GIT_TRACE*` environment
+The trace implementation automatically adds `timestamp file:line ... \n` to
+all trace messages. E.g.:
+23:59:59.123456 git.c:312 trace: built-in: git 'foo'
+00:00:00.000001 builtin/foo.c:99 foo: some message
+Data Structures
+`struct trace_key`::
+ Defines a trace key (or category). The default (for API functions that
+ don't take a key) is `GIT_TRACE`.
+E.g. to define a trace key controlled by environment variable `GIT_TRACE_FOO`:
+static struct trace_key trace_foo = TRACE_KEY_INIT(FOO);
+static void trace_print_foo(const char *message)
+ trace_print_key(&trace_foo, message);
+Note: don't use `const` as the trace implementation stores internal state in
+the `trace_key` structure.
+`int trace_want(struct trace_key *key)`::
+ Checks whether the trace key is enabled. Used to prevent expensive
+ string formatting before calling one of the printing APIs.
+`void trace_disable(struct trace_key *key)`::
+ Disables tracing for the specified key, even if the environment
+ variable was set.
+`void trace_printf(const char *format, ...)`::
+`void trace_printf_key(struct trace_key *key, const char *format, ...)`::
+ Prints a formatted message, similar to printf.
+`void trace_argv_printf(const char **argv, const char *format, ...)``::
+ Prints a formatted message, followed by a quoted list of arguments.
+`void trace_strbuf(struct trace_key *key, const struct strbuf *data)`::
+ Prints the strbuf, without additional formatting (i.e. doesn't
+ choke on `%` or even `\0`).
+`uint64_t getnanotime(void)`::
+ Returns nanoseconds since the epoch (01/01/1970), typically used
+ for performance measurements.
+Currently there are high precision timer implementations for Linux (using
+`clock_gettime(CLOCK_MONOTONIC)`) and Windows (`QueryPerformanceCounter`).
+Other platforms use `gettimeofday` as time source.
+`void trace_performance(uint64_t nanos, const char *format, ...)`::
+`void trace_performance_since(uint64_t start, const char *format, ...)`::
+ Prints the elapsed time (in nanoseconds), or elapsed time since
+ `start`, followed by a formatted message. Enabled via environment
+ variable `GIT_TRACE_PERFORMANCE`. Used for manual profiling, e.g.:
+uint64_t start = getnanotime();
+/* code section to measure */
+trace_performance_since(start, "foobar");
+uint64_t t = 0;
+for (;;) {
+ /* ignore */
+ t -= getnanotime();
+ /* code section to measure */
+ t += getnanotime();
+ /* ignore */
+trace_performance(t, "frotz");
diff --git a/Documentation/technical/http-protocol.txt b/Documentation/technical/http-protocol.txt
index 58c7e87..229f845 100644
--- a/Documentation/technical/http-protocol.txt
+++ b/Documentation/technical/http-protocol.txt
@@ -374,7 +374,7 @@ C: Send one `$GIT_URL/git-upload-pack` request:
C: 0000
The stream is organized into "commands", with each command
-appearing by itself in a pkt-line. Within a command line
+appearing by itself in a pkt-line. Within a command line,
the text leading up to the first space is the command name,
and the remainder of the line to the first LF is the value.
Command lines are terminated with an LF as the last byte of
diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
index f352a9b..fe6f316 100644
--- a/Documentation/technical/index-format.txt
+++ b/Documentation/technical/index-format.txt
@@ -129,6 +129,9 @@ Git index format
(Version 4) In version 4, the padding after the pathname does not
+ Interpretation of index entries in split index mode is completely
+ different. See below for details.
== Extensions
=== Cached tree
@@ -198,3 +201,35 @@ Git index format
- At most three 160-bit object names of the entry in stages from 1 to 3
(nothing is written for a missing stage).
+=== Split index
+ In split index mode, the majority of index entries could be stored
+ in a separate file. This extension records the changes to be made on
+ top of that to produce the final index.
+ The signature for this extension is { 'l', 'i, 'n', 'k' }.
+ The extension consists of:
+ - 160-bit SHA-1 of the shared index file. The shared index file path
+ is $GIT_DIR/sharedindex.<SHA-1>. If all 160 bits are zero, the
+ index does not require a shared index file.
+ - An ewah-encoded delete bitmap, each bit represents an entry in the
+ shared index. If a bit is set, its corresponding entry in the
+ shared index will be removed from the final index. Note, because
+ a delete operation changes index entry positions, but we do need
+ original positions in replace phase, it's best to just mark
+ entries for removal, then do a mass deletion after replacement.
+ - An ewah-encoded replace bitmap, each bit represents an entry in
+ the shared index. If a bit is set, its corresponding entry in the
+ shared index will be replaced with an entry in this index
+ file. All replaced entries are stored in sorted order in this
+ index. The first "1" bit in the replace bitmap corresponds to the
+ first index entry, the second "1" bit to the second entry and so
+ on. Replaced entries may have empty path names to save space.
+ The remaining index entries after replaced ones will be added to the
+ final index. These added entries are also sorted by entry namme then
+ stage.