summaryrefslogtreecommitdiff
path: root/Documentation/git-cat-file.txt
AgeCommit message (Collapse)Author
2022-02-18cat-file: add --batch-command modeJohn Cai
Add a new flag --batch-command that accepts commands and arguments from stdin, similar to git-update-ref --stdin. At GitLab, we use a pair of long running cat-file processes when accessing object content. One for iterating over object metadata with --batch-check, and the other to grab object contents with --batch. However, if we had --batch-command, we wouldn't need to keep both processes around, and instead just have one --batch-command process where we can flip between getting object info, and getting object contents. Since we have a pair of cat-file processes per repository, this means we can get rid of roughly half of long lived git cat-file processes. Given there are many repositories being accessed at any given time, this can lead to huge savings. git cat-file --batch-command will enter an interactive command mode whereby the user can enter in commands and their arguments that get queued in memory: <command1> [arg1] [arg2] LF <command2> [arg1] [arg2] LF When --buffer mode is used, commands will be queued in memory until a flush command is issued that execute them: flush LF The reason for a flush command is that when a consumer process (A) talks to a git cat-file process (B) and interactively writes to and reads from it in --buffer mode, (A) needs to be able to control when the buffer is flushed to stdout. Currently, from (A)'s perspective, the only way is to either 1. kill (B)'s process 2. send an invalid object to stdin. 1. is not ideal from a performance perspective as it will require spawning a new cat-file process each time, and 2. is hacky and not a good long term solution. With this mechanism of queueing up commands and letting (A) issue a flush command, process (A) can control when the buffer is flushed and can guarantee it will receive all of the output when in --buffer mode. --batch-command also will not allow (B) to flush to stdout until a flush is received. This patch adds the basic structure for adding command which can be extended in the future to add more commands. It also adds the following two commands (on top of the flush command): contents <object> LF info <object> LF The contents command takes an <object> argument and prints out the object contents. The info command takes an <object> argument and prints out the object metadata. These can be used in the following way with --buffer: info <object> LF contents <object> LF contents <object> LF info <object> LF flush LF info <object> LF flush LF When used without --buffer: info <object> LF contents <object> LF contents <object> LF info <object> LF info <object> LF Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-12cat-file: don't whitespace-pad "(...)" in SYNOPSIS and usage outputÆvar Arnfjörð Bjarmason
Fix up whitespace issues around "(... | ...)" in the SYNOPSIS and usage. These were introduced in ab/cat-file series. See e145efa6059 (Merge branch 'ab/cat-file' into next, 2022-01-05). In particular 57d6a1cf96, 5a40417876 and 97fe7250753 in that series. We'll now correctly emit this usage output: $ git cat-file -h usage: git cat-file <type> <object> or: git cat-file (-e | -p) <object> or: git cat-file (-t | -s) [--allow-unknown-type] <object> [...] Before this the last line of that would be inconsistent with the preceding "(-e | -p)": or: git cat-file ( -t | -s ) [--allow-unknown-type] <object> Reported-by: Jiang Xin <worldhello.net@gmail.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-30cat-file docs: fix SYNOPSIS and "-h" outputÆvar Arnfjörð Bjarmason
There were various inaccuracies in the previous SYNOPSIS output, e.g. "--path" is not something that can optionally go with any options except --textconv or --filters, as the output implied. The opening line of the DESCRIPTION section is also "In its first form[...]", which refers to "git cat-file <type> <object>", but the SYNOPSIS section wasn't showing that as the first form! That part of the documentation made sense in d83a42f34a6 (Documentation: minor grammatical fixes in git-cat-file.txt, 2009-03-22) when it was introduced, but since then various options that were added have made that intro make no sense in the context it was in. Now the two will match again. The usage output here is not properly aligned on "master" currently, but will be with my in-flight 4631cfc20bd (parse-options: properly align continued usage output, 2021-09-21), so let's indent things correctly in the C code in anticipation of that. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08cat-file: disable refs/replace with --batch-all-objectsJeff King
When we're enumerating all objects in the object database, it doesn't make sense to respect refs/replace. The point of this option is to enumerate all of the objects in the database at a low level. By definition we'd already show the replacement object's contents (under its real oid), and showing those contents under another oid is almost certainly working against what the user is trying to do. Note that you could make the same argument for something like: git show-index <foo.idx | awk '{print $2}' | git cat-file --batch but there we can't know in cat-file exactly what the user intended, because we don't know the source of the input. They could be trying to do low-level debugging, or they could be doing something more high-level (e.g., imagine a porcelain built around cat-file for its object accesses). So in those cases, we'll have to rely on the user specifying "git --no-replace-objects" to tell us what to do. One _could_ make an argument that "cat-file --batch" is sufficiently low-level plumbing that it should not respect replace-objects at all (and the caller should do any replacement if they want it). But we have been doing so for some time. The history is a little tangled: - looking back as far as v1.6.6, we would not respect replace refs for --batch-check, but would for --batch (because the former used sha1_object_info(), and the replace mechanism only affected actual object reads) - this discrepancy was made even weirder by 98e2092b50 (cat-file: teach --batch to stream blob objects, 2013-07-10), where we always output the header using the --batch-check code, and then printed the object separately. This could lead to "cat-file --batch" dying (when it notices the size or type changed for a non-blob object) or even producing bogus output (in streaming mode, we didn't notice that we wrote the wrong number of bytes). - that persisted until 1f7117ef7a (sha1_file: perform object replacement in sha1_object_info_extended(), 2013-12-11), which then respected replace refs for both forms. So it has worked reliably this way for over 7 years, and we should make sure it continues to do so. That could also be an argument that --batch-all-objects should not change behavior (which this patch is doing), but I really consider the current behavior to be an unintended bug. It's a side effect of how the code is implemented (feeding the oids back into oid_object_info() rather than looking at what we found while reading the loose and packed object storage). The implementation is straight-forward: we just disable the global read_replace_refs flag when we're in --batch-all-objects mode. It would perhaps be a little cleaner to change the flag we pass to oid_object_info_extended(), but that's not enough. We also read objects via read_object_file() and stream_blob_to_fd(). The former could switch to its _extended() form, but the streaming code has no mechanism for disabling replace refs. Setting the global flag works, and as a bonus, it's impossible to have any "oops, we're sometimes replacing the object and sometimes not" bugs in the output (like the ones caused by 98e2092b50 above). The tests here cover the regular-input and --batch-all-objects cases, for both --batch-check and --batch. There is a test in t6050 that covers the regular-input case with --batch already, but this new one goes much further in actually verifying the output (plus covering --batch-check explicitly). This is perhaps a little overkill and the tests would be simpler just covering --batch-check, but I wanted to make sure we're checking that --batch output is consistent between the header and the content. The global-flag technique used here makes that easy to get right, but this is future-proofing us against regressions. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08cat-file: mention --unordered along with --batch-all-objectsJeff King
The note on ordering for --batch-all-objects was written when that was the only possible ordering. These days we have --unordered, too, so let's point to it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-04git-cat-file.txt: remove references to "sha1"Denton Liu
As part of the hash-transition, git can operate on more than just SHA-1 repositories. Replace "sha1"-specific documentation with hash-agnostic terminology. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-04git-cat-file.txt: monospace args, placeholders and filenamesDenton Liu
In modern documentation, args, placeholders and filenames are monospaced. Apply monospace formatting to these objects. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-01cat-file: add missing [=<format>] to usage/synopsisChristian Couder
When displaying cat-file usage, the fact that a <format> can be specified is only visible when lookling at the --batch and --batch-check options which are shown like this: --batch[=<format>] show info and content of objects fed from the standard input --batch-check[=<format>] show info about objects fed from the standard input It seems more coherent and improves discovery to also show it on the usage line. In the documentation the DESCRIPTION tells us that "The output format can be overridden using the optional <format> argument", but we can't see the <format> argument in the SYNOPSIS above the description which is confusing. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-07Merge branch 'dt/cat-file-batch-ambiguous'Junio C Hamano
"git cat-file --batch" reported a dangling symbolic link by mistake, when it wanted to report that a given name is ambiguous. * dt/cat-file-batch-ambiguous: t1512: test ambiguous cat-file --batch and --batch-output Do not print 'dangling' for cat-file in case of ambiguity
2019-01-18Do not print 'dangling' for cat-file in case of ambiguityDavid Turner
The return values -1 and -2 from get_oid could mean two different things, depending on whether they were from an enum returned by get_tree_entry_follow_symlinks, or from a different code path. This caused 'dangling' to be printed from a git cat-file in the case of an ambiguous (-2) result. Unify the results of get_oid* and get_tree_entry_follow_symlinks to be one common type, with unambiguous values. Signed-off-by: David Turner <novalis@novalis.org> Reported-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-10Use "whitespace" consistentlyPhillip Wood
Most of the messages and documentation use 'whitespace' rather than 'white space' or 'white spaces' convert to latter two to the former for consistency. Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13cat-file: support "unordered" output for --batch-all-objectsJeff King
If you're going to access the contents of every object in a packfile, it's generally much more efficient to do so in pack order, rather than in hash order. That increases the locality of access within the packfile, which in turn is friendlier to the delta base cache, since the packfile puts related deltas next to each other. By contrast, hash order is effectively random, since the sha1 has no discernible relationship to the content. This patch introduces an "--unordered" option to cat-file which iterates over packs in pack-order under the hood. You can see the results when dumping all of the file content: $ time ./git cat-file --batch-all-objects --buffer --batch | wc -c 6883195596 real 0m44.491s user 0m42.902s sys 0m5.230s $ time ./git cat-file --unordered \ --batch-all-objects --buffer --batch | wc -c 6883195596 real 0m6.075s user 0m4.774s sys 0m3.548s Same output, different order, way faster. The same speed-up applies even if you end up accessing the object content in a different process, like: git cat-file --batch-all-objects --buffer --batch-check | grep blob | git cat-file --batch='%(objectname) %(rest)' | wc -c Adding "--unordered" to the first command drops the runtime in git.git from 24s to 3.5s. Side note: there are actually further speedups available for doing it all in-process now. Since we are outputting the object content during the actual pack iteration, we know where to find the object and could skip the extra lookup done by oid_object_info(). This patch stops short of that optimization since the underlying API isn't ready for us to make those sorts of direct requests. So if --unordered is so much better, why not make it the default? Two reasons: 1. We've promised in the documentation that --batch-all-objects outputs in hash order. Since cat-file is plumbing, people may be relying on that default, and we can't change it. 2. It's actually _slower_ for some cases. We have to compute the pack revindex to walk in pack order. And our de-duplication step uses an oidset, rather than a sort-and-dedup, which can end up being more expensive. If we're just accessing the type and size of each object, for example, like: git cat-file --batch-all-objects --buffer --batch-check my best-of-five warm cache timings go from 900ms to 1100ms using --unordered. Though it's possible in a cold-cache or under memory pressure that we could do better, since we'd have better locality within the packfile. And one final question: why is it "--unordered" and not "--pack-order"? The answer is again two-fold: 1. "pack order" isn't a well-defined thing across the whole set of objects. We're hitting loose objects, as well as objects in multiple packs, and the only ordering we're promising is _within_ a single pack. The rest is apparently random. 2. The point here is optimization. So we don't want to promise any particular ordering, but only to say that we will choose an ordering which is likely to be efficient for accessing the object content. That leaves the door open for further changes in the future without having to add another compatibility option. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-10cat-file doc: document that -e will return some outputÆvar Arnfjörð Bjarmason
The -e option added in 7950571ad7 ("A few more options for git-cat-file", 2005-12-03) has always errored out with message on stderr saying that the provided object is malformed, like this: $ git cat-file -e malformed; echo $? fatal: Not a valid object name malformed 128 A reader of this documentation may be misled into thinking that if ! git cat-file -e "$object" [...] as opposed to: if ! git cat-file -e "$object" 2>/dev/null [...] is sufficient to implement a truly silent test that checks whether some arbitrary $object string was both valid, and pointed to an object that exists. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14doc: fix minor typos (extra/duplicated words)Evan Zacks
Following are several fixes for duplicated words ("of of") and one case where an extra article ("a") slipped in. Signed-off-by: Evan Zacks <zackse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-11cat-file: support --textconv/--filters in batch modeJohannes Schindelin
With this patch, --batch can be combined with --textconv or --filters. For this to work, the input needs to have the form <object name><single white space><path> so that the filters can be chosen appropriately. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-11cat-file --textconv/--filters: allow specifying the path separatelyJohannes Schindelin
There are circumstances when it is relatively easy to figure out the object name for a given path, but not the name of the containing tree. For example, when looking at a diff generated by Git, the object names are recorded, but not the revision. As a matter of fact, the revisions from which the diff was generated may not even exist locally. In such a case, the user would have to generate a fake revision just to be able to use --textconv or --filters. Let's simplify this dramatically, because we do not really need that revision at all: all we care about is that we know the path. In the scenario described above, we do know the path, and we just want to specify it separately from the object name. Example usage: git cat-file --textconv --path=main.c 0f1937fd Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-11cat-file: introduce the --filters optionJohannes Schindelin
The --filters option applies the convert_to_working_tree() filter for the path when showing the contents of a regular file blob object; the contents are written out as-is for other types of objects. This feature comes in handy when a 3rd-party tool wants to work with the contents of files from past revisions as if they had been checked out, but without detouring via temporary files. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-24cat-file: fix a grammo in the man pageJohannes Schindelin
"... has be ..." -> "... has to be ..." Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-28doc: typeset long command-line options as literalMatthieu Moy
Similarly to the previous commit, use backquotes instead of forward-quotes, for long options. This was obtained with: perl -pi -e "s/'(--[a-z][a-z=<>-]*)'/\`\$1\`/g" *.txt and manual tweak to remove false positive in ascii-art (o'--o'--o' to describe rewritten history). Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-28doc: typeset short command-line options as literalMatthieu Moy
It was common in our documentation to surround short option names with forward quotes, which renders as italic in HTML. Instead, use backquotes which renders as monospace. This is one more step toward conformance to Documentation/CodingGuidelines. This was obtained with: perl -pi -e "s/'(-[a-z])'/\`\$1\`/g" *.txt Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-16usage: do not insist that standard input must come from a fileJunio C Hamano
The synopsys text and the usage string of subcommands that read list of things from the standard input are often shown like this: git gostak [--distim] < <list-of-doshes> This is problematic in a number of ways: * The way to use these commands is more often to feed them the output from another command, not feed them from a file. * Manual pages outside Git, commands that operate on the data read from the standard input, e.g "sort", "grep", "sed", etc., are not described with such a "< redirection-from-file" in their synopsys text. Our doing so introduces inconsistency. * We do not insist on where the output should go, by saying git gostak [--distim] < <list-of-doshes> > <output> * As it is our convention to enclose placeholders inside <braket>, the redirection operator followed by a placeholder filename becomes very hard to read, both in the documentation and in the help text. Let's clean them all up, after making sure that the documentation clearly describes the modes that take information from the standard input and what kind of things are expected on the input. [jc: stole example for fmt-merge-msg from Jonathan] Helped-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-26cat-file: sort and de-dup output of --batch-all-objectsJeff King
The sorting we could probably live without, but printing duplicates is just a hassle for the user, who must then de-dup themselves (or risk a wrong answer if they are doing something like counting objects with a particular property). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-22cat-file: add --batch-all-objects optionJeff King
It can sometimes be useful to examine all objects in the repository. Normally this is done with "git rev-list --all --objects", but: 1. That shows only reachable objects. You may want to look at all available objects. 2. It's slow. We actually open each object to walk the graph. If your operation is OK with seeing unreachable objects, it's an order of magnitude faster to just enumerate the loose directories and pack indices. You can do this yourself using "ls" and "git show-index", but it's non-obvious. This patch adds an option to "cat-file --batch-check" to operate on all available objects (rather than reading names from stdin). This is based on a proposal by Charles Bailey to provide a separate "git list-all-objects" command. That is more orthogonal, as it splits enumerating the objects from getting information about them. However, in practice you will either: a. Feed the list of objects directly into cat-file anyway, so you can find out information about them. Keeping it in a single process is more efficient. b. Ask the listing process to start telling you more information about the objects, in which case you will reinvent cat-file's batch-check formatter. Adding a cat-file option is simple and efficient. And if you really do want just the object names, you can always do: git cat-file --batch-check='%(objectname)' --batch-all-objects Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-22cat-file: add --buffer optionJeff King
We use a direct write() to output the results of --batch and --batch-check. This is good for processes feeding the input and reading the output interactively, but it introduces measurable overhead if you do not want this feature. For example, on linux.git: $ git rev-list --objects --all | cut -d' ' -f1 >objects $ time git cat-file --batch-check='%(objectsize)' \ <objects >/dev/null real 0m5.440s user 0m5.060s sys 0m0.384s This patch adds an option to use regular stdio buffering: $ time git cat-file --batch-check='%(objectsize)' \ --buffer <objects >/dev/null real 0m4.975s user 0m4.888s sys 0m0.092s Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-01Merge branch 'dt/cat-file-follow-symlinks'Junio C Hamano
"git cat-file --batch(-check)" learned the "--follow-symlinks" option that follows an in-tree symbolic link when asked about an object via extended SHA-1 syntax, e.g. HEAD:RelNotes that points at Documentation/RelNotes/2.5.0.txt. With the new option, the command behaves as if HEAD:Documentation/RelNotes/2.5.0.txt was given as input instead. * dt/cat-file-follow-symlinks: cat-file: add --follow-symlinks to --batch sha1_name: get_sha1_with_context learns to follow symlinks tree-walk: learn get_tree_entry_follow_symlinks
2015-05-20cat-file: add --follow-symlinks to --batchDavid Turner
This wires the in-repo-symlink following code through to the cat-file builtin. In the event of an out-of-repo link, cat-file will print the link in a new format. Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-06cat-file: teach cat-file a '--allow-unknown-type' optionKarthik Nayak
'git cat-file' throws an error while trying to print the type or size of a broken/corrupt object. This is because these objects are usually of unknown types. Teach git cat-file a '--allow-unknown-type' option where it prints the type or size of a broken/corrupt object without throwing an error. Modify '-t' and '-s' options to call sha1_object_info_extended() directly to support the '--allow-unknown-type' option. Add documentation for 'cat-file --allow-unknown-type'. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> cat-file: add documentation for '--allow-unknown-type' option. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-26cat-file: provide %(deltabase) batch formatJeff King
It can be useful for debugging or analysis to see which objects are stored as delta bases on top of others. This information is available by running `git verify-pack`, but that is extremely expensive (and is harder than necessary to parse). Instead, let's make it available as a cat-file query format, which makes it fast and simple to get the bases for a subset of the objects. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-17Merge branch 'rh/ishes-doc'Junio C Hamano
We liberally use "committish" and "commit-ish" (and "treeish" and "tree-ish"); as these are non-words, let's unify these terms to their dashed form. More importantly, clarify the documentation on object peeling using these terms. * rh/ishes-doc: glossary: fix and clarify the definition of 'ref' revisions.txt: fix and clarify <rev>^{<type>} glossary: more precise definition of tree-ish (a.k.a. treeish) use 'commit-ish' instead of 'committish' use 'tree-ish' instead of 'treeish' glossary: define commit-ish (a.k.a. committish) glossary: mention 'treeish' as an alternative to 'tree-ish'
2013-09-04use 'tree-ish' instead of 'treeish'Richard Hansen
Replace 'treeish' in documentation and comments with 'tree-ish' to match gitglossary(7). The only remaining instances of 'treeish' are: * variable, function, and macro names * "(also treeish)" in the definition of tree-ish in gitglossary(7) Signed-off-by: Richard Hansen <rhansen@bbn.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-05cat-file: only split on whitespace when %(rest) is usedJeff King
Commit c334b87b (cat-file: split --batch input lines on whitespace, 2013-07-11) taught `cat-file --batch-check` to split input lines on the first whitespace, and stash everything after the first token into the %(rest) output format element. It claimed: Object names cannot contain spaces, so any input with spaces would have resulted in a "missing" line. But that is not correct. Refs, object sha1s, and various peeling suffixes cannot contain spaces, but some object names can. In particular: 1. Tree paths like "[<tree>]:path with whitespace" 2. Reflog specifications like "@{2 days ago}" 3. Commit searches like "rev^{/grep me}" or ":/grep me" To remain backwards compatible, we cannot split on whitespace by default, hence we will ship 1.8.4 with the commit reverted. Resurrect its attempt but in a weaker form; only do the splitting when "%(rest)" is used in the output format. Since that element did not exist at all before c334b87, old scripts cannot be affected. The existence of object names with spaces does mean that you cannot reliably do: echo ":path with space and other data" | git cat-file --batch-check="%(objectname) %(rest)" as it would split the path and feed only ":path" to get_sha1. But that command is nonsensical. If you wanted to see "and other data" in "%(rest)", git cannot possibly know where the filename ends and the "rest" begins. It might be more robust to have something like "-z" to separate the input elements. But this patch is still a reasonable step before having that. It makes the easy cases easy; people who do not care about %(rest) do not have to consider it, and the %(rest) code handles the spaces and newlines of "rev-list --objects" correctly. Hard cases remain hard but possible (if you might get whitespace in your input, you do not get to use %(rest) and must split and join the output yourself using more flexible tools). And most importantly, it does not preclude us from having different splitting rules later if a "-z" (or similar) option is added. So we can make the hard cases easier later, if we choose to. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-02Revert "cat-file: split --batch input lines on whitespace"Junio C Hamano
This reverts commit c334b87b30c1464a1ab563fe1fb8de5eaf0e5bac; the update assumed that people only used the command to read from "rev-list --objects" output, whose lines begin with a 40-hex object name followed by a whitespace, but it turns out that scripts feed random extended SHA-1 expressions (e.g. "HEAD:$pathname") in which a whitespace has to be kept.
2013-07-12cat-file: split --batch input lines on whitespaceJeff King
If we get an input line to --batch or --batch-check that looks like "HEAD foo bar", we will currently feed the whole thing to get_sha1(). This means that to use --batch-check with `rev-list --objects`, one must pre-process the input, like: git rev-list --objects HEAD | cut -d' ' -f1 | git cat-file --batch-check Besides being more typing and slightly less efficient to invoke `cut`, the result loses information: we no longer know which path each object was found at. This patch teaches cat-file to split input lines at the first whitespace. Everything to the left of the whitespace is considered an object name, and everything to the right is made available as the %(reset) atom. So you can now do: git rev-list --objects HEAD | git cat-file --batch-check='%(objectsize) %(rest)' to collect object sizes at particular paths. Even if %(rest) is not used, we always do the whitespace split (which means you can simply eliminate the `cut` command from the first example above). This whitespace split is backwards compatible for any reasonable input. Object names cannot contain spaces, so any input with spaces would have resulted in a "missing" line. The only input hurt is if somebody really expected input of the form "HEAD is a fine-looking ref!" to fail; it will now parse HEAD, and make "is a fine-looking ref!" available as %(rest). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-12cat-file: add %(objectsize:disk) format atomJeff King
This atom is just like %(objectsize), except that it shows the on-disk size of the object rather than the object's true size. In other words, it makes the "disk_size" query of sha1_object_info_extended available via the command-line. This can be used for rough attribution of disk usage to particular refs, though see the caveats in the documentation. This patch does not include any tests, as the exact numbers returned are volatile and subject to zlib and packing decisions. We cannot even reliably guarantee that the on-disk size is smaller than the object content (though in general this should be the case for non-trivial objects). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-12cat-file: add --batch-check=<format>Jeff King
The `cat-file --batch-check` command can be used to quickly get information about a large number of objects. However, it provides a fixed set of information. This patch adds an optional <format> option to --batch-check to allow a caller to specify which items they are interested in, and in which order to output them. This is not very exciting for now, since we provide the same limited set that you could already get. However, it opens the door to adding new format items in the future without breaking backwards compatibility (or forcing callers to pay the cost to calculate uninteresting items). Since the --batch option shares code with --batch-check, it receives the same feature, though it is less likely to be of interest there. The format atom names are chosen to match their counterparts in for-each-ref. Though we do not (yet) share any code with for-each-ref's formatter, this keeps the interface as consistent as possible, and may help later on if the implementations are unified. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-15The name of the hash function is "SHA-1", not "SHA1"Thomas Ackermann
Use "SHA-1" instead of "SHA1" whenever we talk about the hash function. When used as a programming symbol, we keep "SHA1". Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-11doc: drop author/documentation sections from most pagesJeff King
The point of these sections is generally to: 1. Give credit where it is due. 2. Give the reader an idea of where to ask questions or file bug reports. But they don't do a good job of either case. For (1), they are out of date and incomplete. A much more accurate answer can be gotten through shortlog or blame. For (2), the correct contact point is generally git@vger, and even if you wanted to cc the contact point, the out-of-date and incomplete fields mean you're likely sending to somebody useless. So let's drop the fields entirely from all manpages except git(1) itself. We already point people to the mailing list for bug reports there, and we can update the Authors section to give credit to the major contributors and point to shortlog and blame for more information. Each page has a "This is part of git" footer, so people can follow that to the main git manpage.
2010-10-14Documentation: gitrevisions is in section 7Jonathan Nieder
Fix references to gitrevisions(1) in the manual pages and HTML documentation. In practice, this will not matter much unless someone tries to use a hard copy of the git reference manual. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-07-05Documentation: link to gitrevisions rather than git-rev-parseMichael J Gruber
Currently, whenever we need documentation for revisions and ranges, we link to the git-rev-parse man page, i.e. a plumbing man page, which has this along with the documentation of all rev-parse modes. Link to the new gitrevisions man page instead in all cases except - when the actual git-rev-parse command is referred to or - in very technical context (git-send-pack). Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-24git-cat-file.txt: Document --textconvMichael J Gruber
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-25fix cat-file usage message and documentationJeff King
cat-file with an object on the command line requires an option to tell it what to output (type, size, pretty-print, etc). However, the square brackets in the usage imply that those options are not required. This patch switches them to parentheses to indicate "required but grouped-OR" (curly braces might also work, but this follows the convention used already by "git stash"). While we're at it, let's change the <sha1> specifier in the usage to <object>. That's what the documentation uses, and it does actually use the regular object lookup. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-23Documentation: minor grammatical fixes in git-cat-file.txtDavid J. Mellor
Signed-off-by: David J. Mellor <dmellor@whistlingcat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-06Documentation: typos / spelling fixesMike Ralphson
Signed-off-by: Mike Ralphson <mike@abacus.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-02Documentation: be consistent about "git-" versus "git "Jonathan Nieder
Since the git-* commands are not installed in $(bindir), using "git-command <parameters>" in examples in the documentation is not a good idea. On the other hand, it is nice to be able to refer to each command using one hyphenated word. (There is no escaping it, anyway: man page names cannot have spaces in them.) This patch retains the dash in naming an operation, command, program, process, or action. Complete command lines that can be entered at a shell (i.e., without options omitted) are made to use the dashless form. The changes consist only of replacing some spaces with hyphens and vice versa. After a "s/ /-/g", the unpatched and patched versions are identical. Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-10Documentation/git-cat-file.txt: add missing line breakLea Wiemann
Without [verse], the line break between the two synopsis lines does not make it into the man page. Signed-off-by: Lea Wiemann <LeWiemann@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-06documentation: move git(7) to git(1)Christian Couder
As the "git" man page describes the "git" command at the end-user level, it seems better to move it to man section 1. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-06git-cat-file: Add --batch optionAdam Roben
--batch is similar to --batch-check, except that the contents of each object is also printed. The output's form is: <sha1> SP <type> SP <size> LF <contents> LF Signed-off-by: Adam Roben <aroben@apple.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-05-06git-cat-file: Add --batch-check optionAdam Roben
This new option allows multiple objects to be specified on stdin. For each object specified, a line of the following form is printed: <sha1> SP <type> SP <size> LF If the object does not exist in the repository, a line of the following form is printed: <object> SP missing LF Signed-off-by: Adam Roben <aroben@apple.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-01-07Documentation: rename gitlink macro to linkgitDan McGee
Between AsciiDoc 8.2.2 and 8.2.3, the following change was made to the stock Asciidoc configuration: @@ -149,7 +153,10 @@ # Inline macros. # Backslash prefix required for escape processing. # (?s) re flag for line spanning. -(?su)[\\]?(?P<name>\w(\w|-)*?):(?P<target>\S*?)(\[(?P<attrlist>.*?)\])= + +# Explicit so they can be nested. +(?su)[\\]?(?P<name>(http|https|ftp|file|mailto|callto|image|link)):(?P<target>\S*?)(\[(?P<attrlist>.*?)\])= + # Anchor: [[[id]]]. Bibliographic anchor. (?su)[\\]?\[\[\[(?P<attrlist>[\w][\w-]*?)\]\]\]=anchor3 # Anchor: [[id,xreflabel]] This default regex now matches explicit values, and unfortunately in this case gitlink was being matched by just 'link', causing the wrong inline macro template to be applied. By renaming the macro, we can avoid being matched by the wrong regex. Signed-off-by: Dan McGee <dpmcgee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-07War on whitespaceJunio C Hamano
This uses "git-apply --whitespace=strip" to fix whitespace errors that have crept in to our source files over time. There are a few files that need to have trailing whitespaces (most notably, test vectors). The results still passes the test, and build result in Documentation/ area is unchanged. Signed-off-by: Junio C Hamano <gitster@pobox.com>