summaryrefslogtreecommitdiff
path: root/pretty.c
AgeCommit message (Collapse)Author
2022-12-09pretty: fix out-of-bounds read when left-flushing with stealingPatrick Steinhardt
With the `%>>(<N>)` pretty formatter, you can ask git-log(1) et al to steal spaces. To do so we need to look ahead of the next token to see whether there are spaces there. This loop takes into account ANSI sequences that end with an `m`, and if it finds any it will skip them until it finds the first space. While doing so it does not take into account the buffer's limits though and easily does an out-of-bounds read. Add a test that hits this behaviour. While we don't have an easy way to verify this, the test causes the following failure when run with `SANITIZE=address`: ==37941==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000000baf at pc 0x55ba6f88e0d0 bp 0x7ffc84c50d20 sp 0x7ffc84c50d10 READ of size 1 at 0x603000000baf thread T0 #0 0x55ba6f88e0cf in format_and_pad_commit pretty.c:1712 #1 0x55ba6f88e7b4 in format_commit_item pretty.c:1801 #2 0x55ba6f9b1ae4 in strbuf_expand strbuf.c:429 #3 0x55ba6f88f020 in repo_format_commit_message pretty.c:1869 #4 0x55ba6f890ccf in pretty_print_commit pretty.c:2161 #5 0x55ba6f7884c8 in show_log log-tree.c:781 #6 0x55ba6f78b6ba in log_tree_commit log-tree.c:1117 #7 0x55ba6f40fed5 in cmd_log_walk_no_free builtin/log.c:508 #8 0x55ba6f41035b in cmd_log_walk builtin/log.c:549 #9 0x55ba6f4131a2 in cmd_log builtin/log.c:883 #10 0x55ba6f2ea993 in run_builtin git.c:466 #11 0x55ba6f2eb397 in handle_builtin git.c:721 #12 0x55ba6f2ebb07 in run_argv git.c:788 #13 0x55ba6f2ec8a7 in cmd_main git.c:923 #14 0x55ba6f581682 in main common-main.c:57 #15 0x7f2d08c3c28f (/usr/lib/libc.so.6+0x2328f) #16 0x7f2d08c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349) #17 0x55ba6f2e60e4 in _start ../sysdeps/x86_64/start.S:115 0x603000000baf is located 1 bytes to the left of 24-byte region [0x603000000bb0,0x603000000bc8) allocated by thread T0 here: #0 0x7f2d08ebe7ea in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:85 #1 0x55ba6fa5b494 in xrealloc wrapper.c:136 #2 0x55ba6f9aefdc in strbuf_grow strbuf.c:99 #3 0x55ba6f9b0a06 in strbuf_add strbuf.c:298 #4 0x55ba6f9b1a25 in strbuf_expand strbuf.c:418 #5 0x55ba6f88f020 in repo_format_commit_message pretty.c:1869 #6 0x55ba6f890ccf in pretty_print_commit pretty.c:2161 #7 0x55ba6f7884c8 in show_log log-tree.c:781 #8 0x55ba6f78b6ba in log_tree_commit log-tree.c:1117 #9 0x55ba6f40fed5 in cmd_log_walk_no_free builtin/log.c:508 #10 0x55ba6f41035b in cmd_log_walk builtin/log.c:549 #11 0x55ba6f4131a2 in cmd_log builtin/log.c:883 #12 0x55ba6f2ea993 in run_builtin git.c:466 #13 0x55ba6f2eb397 in handle_builtin git.c:721 #14 0x55ba6f2ebb07 in run_argv git.c:788 #15 0x55ba6f2ec8a7 in cmd_main git.c:923 #16 0x55ba6f581682 in main common-main.c:57 #17 0x7f2d08c3c28f (/usr/lib/libc.so.6+0x2328f) #18 0x7f2d08c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349) #19 0x55ba6f2e60e4 in _start ../sysdeps/x86_64/start.S:115 SUMMARY: AddressSanitizer: heap-buffer-overflow pretty.c:1712 in format_and_pad_commit Shadow bytes around the buggy address: 0x0c067fff8120: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd 0x0c067fff8130: fd fd fa fa fd fd fd fd fa fa fd fd fd fa fa fa 0x0c067fff8140: fd fd fd fa fa fa fd fd fd fa fa fa fd fd fd fa 0x0c067fff8150: fa fa fd fd fd fd fa fa 00 00 00 fa fa fa fd fd 0x0c067fff8160: fd fa fa fa fd fd fd fa fa fa fd fd fd fa fa fa =>0x0c067fff8170: fd fd fd fa fa[fa]00 00 00 fa fa fa 00 00 00 fa 0x0c067fff8180: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c067fff8190: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c067fff81a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c067fff81b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c067fff81c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb Luckily enough, this would only cause us to copy the out-of-bounds data into the formatted commit in case we really had an ANSI sequence preceding our buffer. So this bug likely has no security consequences. Fix it regardless by not traversing past the buffer's start. Reported-by: Patrick Steinhardt <ps@pks.im> Reported-by: Eric Sesterhenn <eric.sesterhenn@x41-dsec.de> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09pretty: fix out-of-bounds write caused by integer overflowPatrick Steinhardt
When using a padding specifier in the pretty format passed to git-log(1) we need to calculate the string length in several places. These string lengths are stored in `int`s though, which means that these can easily overflow when the input lengths exceeds 2GB. This can ultimately lead to an out-of-bounds write when these are used in a call to memcpy(3P): ==8340==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7f1ec62f97fe at pc 0x7f2127e5f427 bp 0x7ffd3bd63de0 sp 0x7ffd3bd63588 WRITE of size 1 at 0x7f1ec62f97fe thread T0 #0 0x7f2127e5f426 in __interceptor_memcpy /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827 #1 0x5628e96aa605 in format_and_pad_commit pretty.c:1762 #2 0x5628e96aa7f4 in format_commit_item pretty.c:1801 #3 0x5628e97cdb24 in strbuf_expand strbuf.c:429 #4 0x5628e96ab060 in repo_format_commit_message pretty.c:1869 #5 0x5628e96acd0f in pretty_print_commit pretty.c:2161 #6 0x5628e95a44c8 in show_log log-tree.c:781 #7 0x5628e95a76ba in log_tree_commit log-tree.c:1117 #8 0x5628e922bed5 in cmd_log_walk_no_free builtin/log.c:508 #9 0x5628e922c35b in cmd_log_walk builtin/log.c:549 #10 0x5628e922f1a2 in cmd_log builtin/log.c:883 #11 0x5628e9106993 in run_builtin git.c:466 #12 0x5628e9107397 in handle_builtin git.c:721 #13 0x5628e9107b07 in run_argv git.c:788 #14 0x5628e91088a7 in cmd_main git.c:923 #15 0x5628e939d682 in main common-main.c:57 #16 0x7f2127c3c28f (/usr/lib/libc.so.6+0x2328f) #17 0x7f2127c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349) #18 0x5628e91020e4 in _start ../sysdeps/x86_64/start.S:115 0x7f1ec62f97fe is located 2 bytes to the left of 4831838265-byte region [0x7f1ec62f9800,0x7f1fe62f9839) allocated by thread T0 here: #0 0x7f2127ebe7ea in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:85 #1 0x5628e98774d4 in xrealloc wrapper.c:136 #2 0x5628e97cb01c in strbuf_grow strbuf.c:99 #3 0x5628e97ccd42 in strbuf_addchars strbuf.c:327 #4 0x5628e96aa55c in format_and_pad_commit pretty.c:1761 #5 0x5628e96aa7f4 in format_commit_item pretty.c:1801 #6 0x5628e97cdb24 in strbuf_expand strbuf.c:429 #7 0x5628e96ab060 in repo_format_commit_message pretty.c:1869 #8 0x5628e96acd0f in pretty_print_commit pretty.c:2161 #9 0x5628e95a44c8 in show_log log-tree.c:781 #10 0x5628e95a76ba in log_tree_commit log-tree.c:1117 #11 0x5628e922bed5 in cmd_log_walk_no_free builtin/log.c:508 #12 0x5628e922c35b in cmd_log_walk builtin/log.c:549 #13 0x5628e922f1a2 in cmd_log builtin/log.c:883 #14 0x5628e9106993 in run_builtin git.c:466 #15 0x5628e9107397 in handle_builtin git.c:721 #16 0x5628e9107b07 in run_argv git.c:788 #17 0x5628e91088a7 in cmd_main git.c:923 #18 0x5628e939d682 in main common-main.c:57 #19 0x7f2127c3c28f (/usr/lib/libc.so.6+0x2328f) #20 0x7f2127c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349) #21 0x5628e91020e4 in _start ../sysdeps/x86_64/start.S:115 SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827 in __interceptor_memcpy Shadow bytes around the buggy address: 0x0fe458c572a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0fe458c572b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0fe458c572c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0fe458c572d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0fe458c572e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa =>0x0fe458c572f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa[fa] 0x0fe458c57300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0fe458c57310: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0fe458c57320: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0fe458c57330: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0fe458c57340: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb ==8340==ABORTING The pretty format can also be used in `git archive` operations via the `export-subst` attribute. So this is what in our opinion makes this a critical issue in the context of Git forges which allow to download an archive of user supplied Git repositories. Fix this vulnerability by using `size_t` instead of `int` to track the string lengths. Add tests which detect this vulnerability when Git is compiled with the address sanitizer. Reported-by: Joern Schneeweisz <jschneeweisz@gitlab.com> Original-patch-by: Joern Schneeweisz <jschneeweisz@gitlab.com> Modified-by: Taylor Blau <me@ttalorr.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-14Merge branch 'ab/unused-annotation'Junio C Hamano
Undoes 'jk/unused-annotation' topic and redoes it to work around Coccinelle rules misfiring false positives in unrelated codepaths. * ab/unused-annotation: git-compat-util.h: use "deprecated" for UNUSED variables git-compat-util.h: use "UNUSED", not "UNUSED(var)"
2022-09-14Merge branch 'jk/unused-annotation'Junio C Hamano
Annotate function parameters that are not used (but cannot be removed for structural reasons), to prepare us to later compile with -Wunused warning turned on. * jk/unused-annotation: is_path_owned_by_current_uid(): mark "report" parameter as unused run-command: mark unused async callback parameters mark unused read_tree_recursive() callback parameters hashmap: mark unused callback parameters config: mark unused callback parameters streaming: mark unused virtual method parameters transport: mark bundle transport_options as unused refs: mark unused virtual method parameters refs: mark unused reflog callback parameters refs: mark unused each_ref_fn parameters git-compat-util: add UNUSED macro
2022-09-01git-compat-util.h: use "UNUSED", not "UNUSED(var)"Ævar Arnfjörð Bjarmason
As reported in [1] the "UNUSED(var)" macro introduced in 2174b8c75de (Merge branch 'jk/unused-annotation' into next, 2022-08-24) breaks coccinelle's parsing of our sources in files where it occurs. Let's instead partially go with the approach suggested in [2] of making this not take an argument. As noted in [1] "coccinelle" will ignore such tokens in argument lists that it doesn't know about, and it's less of a surprise to syntax highlighters. This undoes the "help us notice when a parameter marked as unused is actually use" part of 9b240347543 (git-compat-util: add UNUSED macro, 2022-08-19), a subsequent commit will further tweak the macro to implement a replacement for that functionality. 1. https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/ 2. https://lore.kernel.org/git/220819.868rnk54ju.gmgdl@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-29format-patch: allow forcing the use of in-body From: headerJunio C Hamano
Users may be authoring and committing their commits under the same e-mail address they use to send their patches from, in which case they shouldn't need to use the in-body From: line in their outgoing e-mails. At the receiving end, "git am" will use the address on the "From:" header of the incoming e-mail and all should be well. Some mailing lists, however, mangle the From: address from what the original sender had; in such a situation, the user may want to add the in-body "From:" header even for their own patches. "git format-patch --[no-]force-in-body-from" was invented for such users. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-29pretty: separate out the logic to decide the use of in-body fromJunio C Hamano
When pretty-printing the log message for a given commit in the e-mail format (e.g. "git format-patch"), we add an in-body "From:" header when the author identity of the commit is different from the identity of the person whose identity appears in the header of the e-mail (the latter is passed with them "--from" option). Split out the logic into a helper function, as we would want to extend the condition further. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19config: mark unused callback parametersJeff King
The callback passed to git_config() must conform to a particular interface. But most callbacks don't actually look at the extra "void *data" parameter. Let's mark the unused parameters to make -Wunused-parameter happy. Note there's one unusual case here in get_remote_default() where we actually ignore the "value" parameter. That's because it's only checking whether the option is found at all, and not parsing its value. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-11gpg-interface: add function for converting trust level to stringJaydeep Das
Add new helper function `gpg_trust_level_to_str()` which will convert a given member of `enum signature_trust_level` to its corresponding string (in lowercase). For example, `TRUST_ULTIMATE` will yield the string "ultimate". This will abstract out some code in `pretty.c` relating to gpg signature trust levels. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: Jaydeep Das <jaydeepjd.8914@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-15Merge branch 'es/pretty-describe-more'Junio C Hamano
Extend "git log --format=%(describe)" placeholder to allow passing selected command-line options to the underlying "git describe" command. * es/pretty-describe-more: pretty: add abbrev option to %(describe) pretty: add tag option to %(describe) pretty.c: rework describe options parsing for better extensibility
2021-11-01Merge branch 'hm/paint-hits-in-log-grep'Junio C Hamano
"git log --grep=string --author=name" learns to highlight hits just like "git grep string" does. * hm/paint-hits-in-log-grep: grep/pcre2: fix an edge case concerning ascii patterns and UTF-8 data pretty: colorize pattern matches in commit messages grep: refactor next_match() and match_one_pattern() for external use
2021-11-01pretty: add abbrev option to %(describe)Eli Schwartz
The %(describe) placeholder by default, like `git describe`, uses a seven-character abbreviated commit object name. This may not be sufficient to fully describe all commits in a given repository, resulting in a placeholder replacement changing its length because the repository grew in size. This could cause the output of git-archive to change. Add the --abbrev option to `git describe` to the placeholder interface in order to provide tools to the user for fine-tuning project defaults and ensure reproducible archives. One alternative would be to just always specify --abbrev=40 but this may be a bit too biased... Signed-off-by: Eli Schwartz <eschwartz@archlinux.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-01pretty: add tag option to %(describe)Eli Schwartz
The %(describe) placeholder by default, like `git describe`, only supports annotated tags. However, some people do use lightweight tags for releases, and would like to describe those anyway. The command line tool has an option to support this. Teach the placeholder to support this as well. Signed-off-by: Eli Schwartz <eschwartz@archlinux.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-01pretty.c: rework describe options parsing for better extensibilityEli Schwartz
It contains option arguments only, not options. We would like to add option support here too, but to do that we need to distinguish between different types of options. Lay out the groundwork for distinguishing between bools, strings, etc. and move the central logic (validating values and pushing new arguments to *args) into the successful match, because that will be fairly conditional on what type of argument is being parsed. Signed-off-by: Eli Schwartz <eschwartz@archlinux.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-29Merge branch 'jk/log-warn-on-bogus-encoding'Junio C Hamano
Squelch over-eager warning message added during this cycle. * jk/log-warn-on-bogus-encoding: log: document --encoding behavior on iconv() failure Revert "logmsg_reencode(): warn when iconv() fails"
2021-10-29Revert "logmsg_reencode(): warn when iconv() fails"Junio C Hamano
This reverts commit fd680bc5 (logmsg_reencode(): warn when iconv() fails, 2021-08-27). Throwing a warning for each and every commit that gets reencoded, without allowing a way to squelch, would make it unpleasant for folks who have to deal with an ancient part of the history in an old project that used wrong encoding in the commits.
2021-10-25Merge branch 'fs/ssh-signing'Junio C Hamano
Use ssh public crypto for object and push-cert signing. * fs/ssh-signing: ssh signing: test that gpg fails for unknown keys ssh signing: tests for logs, tags & push certs ssh signing: duplicate t7510 tests for commits ssh signing: verify signatures using ssh-keygen ssh signing: provide a textual signing_key_id ssh signing: retrieve a default key from ssh-agent ssh signing: add ssh key format and signing code ssh signing: add test prereqs ssh signing: preliminary refactoring and clean-up
2021-10-08pretty: colorize pattern matches in commit messagesHamza Mahfooz
The "git log" command limits its output to the commits that contain strings matched by a pattern when the "--grep=<pattern>" option is used, but unlike output from "git grep -e <pattern>", the matches are not highlighted, making them harder to spot. Teach the pretty-printer code to highlight matches from the "--grep=<pattern>", "--author=<pattern>" and "--committer=<pattern>" options (to view the last one, you may have to ask for --pretty=fuller). Also, it must be noted that we are effectively greping the content twice (because it would be a hassle to rework the existing matching code to do a /g match and then pass it all down to the coloring code), however it only slows down "git log --author=^H" on this repository by around 1-2% (compared to v2.33.0), so it should be a small enough slow down to justify the addition of the feature. Signed-off-by: Hamza Mahfooz <someguy@effective-light.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10ssh signing: preliminary refactoring and clean-upFabian Stelzer
Openssh v8.2p1 added some new options to ssh-keygen for signature creation and verification. These allow us to use ssh keys for git signatures easily. In our corporate environment we use PIV x509 Certs on Yubikeys for email signing/encryption and ssh keys which I think is quite common (at least for the email part). This way we can establish the correct trust for the SSH Keys without setting up a separate GPG Infrastructure (which is still quite painful for users) or implementing x509 signing support for git (which lacks good forwarding mechanisms). Using ssh agent forwarding makes this feature easily usable in todays development environments where code is often checked out in remote VMs / containers. In such a setup the keyring & revocationKeyring can be centrally generated from the x509 CA information and distributed to the users. To be able to implement new signing formats this commit: - makes the sigc structure more generic by renaming "gpg_output" to "output" - introduces function pointers in the gpg_format structure to call format specific signing and verification functions - moves format detection from verify_signed_buffer into the check_signature api function and calls the format specific verify - renames and wraps sign_buffer to handle format specific signing logic as well Signed-off-by: Fabian Stelzer <fs@gigacodes.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-27logmsg_reencode(): warn when iconv() failsJeff King
If the user asks for a pretty-printed commit to be converted (either explicitly with --encoding=foo, or implicitly because the commit is non-utf8 and we want to convert it), we pass it through iconv(). If that fails, we fall back to showing the input verbatim, but don't tell the user that the output may be bogus. Let's add a warning to do so, along with a mention in the documentation for --encoding. Two things to note about the implementation: - we could produce the warning closer to the call to iconv() in reencode_string_len(), which would let us relay the value of errno. But this is not actually very helpful. reencode_string_len() does not know we are operating on a commit, and indeed does not know that the caller won't produce an error of its own. And the errno values from iconv() are seldom helpful (iconv_open() only ever produces EINVAL; perhaps EILSEQ from iconv() might be illuminating, but it can also return EINVAL for incomplete sequences). - if the reason for the failure is that the output charset is not supported, then the user will see this warning for every commit we try to display. That might be ugly and overwhelming, but on the other hand it is making it clear that every one of them has not been converted (and the likely outcome anyway is to re-try the command with a supported output encoding). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-29log: avoid loading decorations for userformats that don't need itJeff King
If no --decorate option is given, we default to auto-decoration. And when that kicks in, cmd_log_init_finish() will unconditionally load the decoration refs. However, if we are using a user-format that does not include "%d" or "%D", we won't show the decorations at all, so we don't need to load them. We can detect this case and auto-disable them by adding a new field to our userformat_want helper. We can do this even when the user explicitly asked for --decorate, because it can't affect the output at all. This patch consistently reduces the time to run "git log -1 --format=%H" on my git.git clone (with ~2k refs) from 34ms to 7ms. On a much more extreme real-world repository (with ~220k refs), it goes from 2.5s to 4ms. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27pretty: provide human date formatZheNing Hu
Add the placeholders %ah and %ch to format author date and committer date, like --date=human does, which provides more humanity date output. Signed-off-by: ZheNing Hu <adlternative@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-22Merge branch 'rs/pretty-describe'Junio C Hamano
"git log --format='...'" learned "%(describe)" placeholder. * rs/pretty-describe: archive: expand only a single %(describe) per archive pretty: document multiple %(describe) being inconsistent t4205: assert %(describe) test coverage pretty: add merge and exclude options to %(describe) pretty: add %(describe)
2021-03-14use CALLOC_ARRAYRené Scharfe
Add and apply a semantic patch for converting code that open-codes CALLOC_ARRAY to use it instead. It shortens the code and infers the element size automatically. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-11archive: expand only a single %(describe) per archiveRené Scharfe
Every %(describe) placeholder in $Format:...$ strings in files with the attribute export-subst is expanded by calling git describe. This can potentially result in a lot of such calls per archive. That's OK for local repositories under control of the user of git archive, but could be a problem for hosted repositories. Expand only a single %(describe) placeholder per archive for now to avoid denial-of-service attacks. We can make this limit configurable later if needed, but let's start out simple. Reported-by: Jeff King <peff@peff.net> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-01Merge branch 'hv/trailer-formatting'Junio C Hamano
The logic to handle "trailer" related placeholders in the "--format=" mechanisms in the "log" family and "for-each-ref" family is getting unified. * hv/trailer-formatting: ref-filter: use pretty.c logic for trailers pretty.c: capture invalid trailer argument pretty.c: refactor trailer logic to `format_set_trailers_options()` t6300: use function to test trailer options
2021-02-17pretty: add merge and exclude options to %(describe)René Scharfe
Allow restricting the tags used by the placeholder %(describe) with the options match and exclude. E.g. the following command describes the current commit using official version tags, without those for release candidates: $ git log -1 --format='%(describe:match=v[0-9]*,exclude=*rc*)' Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17pretty: add %(describe)René Scharfe
Add a format placeholder for describe output. Implement it by actually calling git describe, which is simple and guarantees correctness. It's intended to be used with $Format:...$ in files with the attribute export-subst and git archive. It can also be used with git log etc., even though that's going to be slow due to the fork for each commit. Suggested-by: Eli Schwartz <eschwartz@archlinux.org> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16pretty.c: capture invalid trailer argumentHariom Verma
As we would like to use this trailers logic in the ref-filter, it's nice to get an invalid trailer argument. This will allow us to print precise error message while using `format_set_trailers_options()` in ref-filter. For capturing the invalid argument, we changed the working of `format_set_trailers_options()` a little bit. Original logic does "break" and fell through in mainly 2 cases - 1. unknown/invalid argument 2. end of the arg string But now instead of "break", we capture invalid argument and return non-zero. And non-zero is handled by the caller. (We prepared the caller to handle non-zero in the previous commit). Capturing invalid arguments this way will also affects the working of current logic. As at the end of the arg string it will return non-zero. So in order to make things correct, introduced an additional conditional statement i.e if encounter ")", do 'break'. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Heba Waly <heba.waly@gmail.com> Signed-off-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-16pretty.c: refactor trailer logic to `format_set_trailers_options()`Hariom Verma
Refactored trailers formatting logic inside pretty.c to a new function `format_set_trailers_options()`. This new function returns the non-zero in case of unusual. The caller handles the non-zero by "goto trailers_out". This change will allow us to reuse the same logic in other places. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Heba Waly <heba.waly@gmail.com> Signed-off-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-28pretty: lazy-load commit data when expanding user-formatJeff King
When we expand a user-format, we try to avoid work that isn't necessary for the output. For instance, we don't bother parsing the commit header until we know we need the author, subject, etc. But we do always load the commit object's contents from disk, even if the format doesn't require it (e.g., just "%H"). Traditionally this didn't matter much, because we'd have loaded it as part of the traversal anyway, and we'd typically have those bytes attached to the commit struct (or these days, cached in a commit-slab). But when we have a commit-graph, we might easily get to the point of pretty-printing a commit without ever having looked at the actual object contents. We should push off that load (and reencoding) until we're certain that it's needed. I think the results of p4205 show the advantage pretty clearly (we serve parent and tree oids out of the commit struct itself, so they benefit as well): # using git.git as the test repo Test HEAD^ HEAD ---------------------------------------------------------------------- 4205.1: log with %H 0.40(0.39+0.01) 0.03(0.02+0.01) -92.5% 4205.2: log with %h 0.45(0.44+0.01) 0.09(0.09+0.00) -80.0% 4205.3: log with %T 0.40(0.39+0.00) 0.04(0.04+0.00) -90.0% 4205.4: log with %t 0.46(0.46+0.00) 0.09(0.08+0.01) -80.4% 4205.5: log with %P 0.39(0.39+0.00) 0.03(0.03+0.00) -92.3% 4205.6: log with %p 0.46(0.46+0.00) 0.10(0.09+0.00) -78.3% 4205.7: log with %h-%h-%h 0.52(0.51+0.01) 0.15(0.14+0.00) -71.2% 4205.8: log with %an-%ae-%s 0.42(0.41+0.00) 0.42(0.41+0.01) +0.0% # using linux.git as the test repo Test HEAD^ HEAD ---------------------------------------------------------------------- 4205.1: log with %H 7.12(6.97+0.14) 0.76(0.65+0.11) -89.3% 4205.2: log with %h 7.35(7.19+0.16) 1.30(1.19+0.11) -82.3% 4205.3: log with %T 7.58(7.42+0.15) 1.02(0.94+0.08) -86.5% 4205.4: log with %t 8.05(7.89+0.15) 1.55(1.41+0.13) -80.7% 4205.5: log with %P 7.12(7.01+0.10) 0.76(0.69+0.07) -89.3% 4205.6: log with %p 7.38(7.27+0.10) 1.32(1.20+0.12) -82.1% 4205.7: log with %h-%h-%h 7.81(7.67+0.13) 1.79(1.67+0.12) -77.1% 4205.8: log with %an-%ae-%s 7.90(7.74+0.15) 7.81(7.66+0.15) -1.1% I added the final test to show where we don't improve (the 1% there is just lucky noise), but also as a regression test to make sure we're not doing anything stupid like loading the commit multiple times when there are several placeholders that need it. Reported-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-12shortlog: remove unused(?) "repo-abbrev" featureÆvar Arnfjörð Bjarmason
Remove support for the magical "repo-abbrev" comment in .mailmap files. This was added to .mailmap parsing in [1], as a generalized feature of the git-shortlog Perl script added earlier in [2]. There was no documentation or tests for this feature, and I don't think it's used in practice anymore. What it did was to allow you to specify a single string to be search-replaced with "/.../" in the .mailmap file. E.g. for linux.git's current .mailmap: git archive --remote=git@gitlab.com:linux-kernel/linux.git \ HEAD -- .mailmap | grep -a repo-abbrev # repo-abbrev: /pub/scm/linux/kernel/git/ Then when running e.g.: git shortlog --merges --author=Linus -1 v5.10-rc7..v5.10 | grep Merge We'd emit (the [...] is mine): Merge tag [...]git://git.kernel.org/.../tip/tip But will now emit: Merge tag [...]git.kernel.org/pub/scm/linux/kernel/git/tip/tip I think at this point this is just a historical artifact we can get rid of. It was initially meant for Linus's own use when we integrated the Perl script[2], but since then it seems he's stopped using it. Digging through Linus's release announcements on the LKML[3] the last release I can find that made use of this output is Linux 2.6.25-rc6 back in March 2008[4]. Later on Linus started using --no-merges[5], and nowadays seems to prefer some custom not-quite-shortlog format of merges from lieutenants[6]. You will still see it on linux.git if you run "git shortlog" manually yourself with --merges, with this removed you can still get the same output with: git log --pretty=fuller v5.10-rc7..v5.10 | sed 's!/pub/scm/linux/kernel/git/!/.../!g' | git shortlog Arguably we should do the same for the search-replacing of "[PATCH]" at the beginning with "". That seems to be another relic of a bygone era when linux.git patches would have their E-Mail subject lines applied as-is by "git am" or whatever. But we documented that feature in "git-shortlog(1)", and it seems more widely applicable than something purely kernel-specific. 1. 7595e2ee6ef (git-shortlog: make common repository prefix configurable with .mailmap, 2006-11-25) 2. fa375c7f1b6 (Add git-shortlog perl script, 2005-06-04) 3. https://lore.kernel.org/lkml/ 4. https://lore.kernel.org/lkml/alpine.LFD.1.00.0803161651350.3020@woody.linux-foundation.org/ 5. https://lore.kernel.org/lkml/BANLkTinrbh7Xi27an3uY7pDWrNKhJRYmEA@mail.gmail.com/ 6. https://lore.kernel.org/lkml/CAHk-=wg1+kf1AVzXA-RQX0zjM6t9J2Kay9xyuNqcFHWV-y5ZYw@mail.gmail.com/ Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-09pretty format %(trailers): add a "key_value_separator"Ævar Arnfjörð Bjarmason
Add a "key_value_separator" option to the "%(trailers)" pretty format, to go along with the existing "separator" argument. In combination these two options make it trivial to produce machine-readable (e.g. \0 and \0\0-delimited) format output. As elaborated on in a previous commit which added "keyonly" it was needlessly tedious to extract structured data from "%(trailers)" before the addition of this "key_value_separator" option. As seen by the test being added here extracting this data now becomes trivial. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-09pretty format %(trailers): add a "keyonly"Ævar Arnfjörð Bjarmason
Add support for a "keyonly". This allows for easier parsing out of the key and value. Before if you didn't want to make assumptions about how the key was formatted. You'd need to parse it out as e.g.: --pretty=format:'%H%x00%(trailers:separator=%x00%x00)' \ '%x00%(trailers:separator=%x00%x00,valueonly)' And then proceed to deduce keys by looking at those two and subtracting the value plus the hardcoded ": " separator from the non-valueonly %(trailers) line. Now it's possible to simply do: --pretty=format:'%H%x00%(trailers:separator=%x00%x00,keyonly)' \ '%x00%(trailers:separator=%x00%x00,valueonly)' Which at least reduces it to a state machine where you get N keys and correlate them with N values. Even better would be to have a way to change the ": " delimiter to something easily machine-readable (a key might contain ": " too). A follow-up change will add support for that. I don't really have a use-case for just "keyonly" myself. I suppose it would be useful in some cases as "key=*" matches case-insensitively, so a plain "keyonly" will give you the variants of the keys you matched. I'm mainly adding it to fix the inconsistency with "valueonly". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-28pretty: refactor `format_sanitized_subject()`Hariom Verma
The function 'format_sanitized_subject()' is responsible for sanitized subject line in pretty.c e.g. the subject line the-sanitized-subject-line It would be a nice enhancement to `subject` atom to have the same feature. So in the later commits, we plan to add this feature to ref-filter. Refactor `format_sanitized_subject()`, so it can be reused in ref-filter.c for adding new modifier `sanitize` to "subject" atom. Currently, the loop inside `format_sanitized_subject()` runs until `\n` is found. But now, we stored the first occurrence of `\n` in a variable `eol` and passed it in `format_sanitized_subject()`. And the loop runs upto `eol`. Mentored-by: Christian Couder <chriscool@tuxfamily.org> Mentored-by: Heba Waly <heba.waly@gmail.com> Signed-off-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-08format-patch: teach --no-encode-email-headersEmma Brooks
When commit subjects or authors have non-ASCII characters, git format-patch Q-encodes them so they can be safely sent over email. However, if the patch transfer method is something other than email (web review tools, sneakernet), this only serves to make the patch metadata harder to read without first applying it (unless you can decode RFC 2047 in your head). git am as well as some email software supports non-Q-encoded mail as described in RFC 6531. Add --[no-]encode-email-headers and format.encodeEmailHeaders to let the user control this behavior. Signed-off-by: Emma Brooks <me@pluvano.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-17Merge branch 'rs/strbuf-insertstr'Junio C Hamano
Code clean-up. * rs/strbuf-insertstr: mailinfo: don't insert header prefix for handle_content_type() strbuf: add and use strbuf_insertstr()
2020-02-10strbuf: add and use strbuf_insertstr()René Scharfe
Add a function for inserting a C string into a strbuf. Use it throughout the source to get rid of magic string length constants and explicit strlen() calls. Like strbuf_addstr(), implement it as an inline function to avoid the implicit strlen() calls to cause runtime overhead. Helped-by: Taylor Blau <me@ttaylorr.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-30Merge branch 'hi/gpg-mintrustlevel'Junio C Hamano
gpg.minTrustLevel configuration variable has been introduced to tell various signature verification codepaths the required minimum trust level. * hi/gpg-mintrustlevel: gpg-interface: add minTrustLevel as a configuration option
2020-01-15gpg-interface: add minTrustLevel as a configuration optionHans Jerry Illikainen
Previously, signature verification for merge and pull operations checked if the key had a trust-level of either TRUST_NEVER or TRUST_UNDEFINED in verify_merge_signature(). If that was the case, the process die()d. The other code paths that did signature verification relied entirely on the return code from check_commit_signature(). And signatures made with a good key, irregardless of its trust level, was considered valid by check_commit_signature(). This difference in behavior might induce users to erroneously assume that the trust level of a key in their keyring is always considered by Git, even for operations where it is not (e.g. during a verify-commit or verify-tag). The way it worked was by gpg-interface.c storing the result from the key/signature status *and* the lowest-two trust levels in the `result` member of the signature_check structure (the last of these status lines that were encountered got written to `result`). These are documented in GPG under the subsection `General status codes` and `Key related`, respectively [1]. The GPG documentation says the following on the TRUST_ status codes [1]: """ These are several similar status codes: - TRUST_UNDEFINED <error_token> - TRUST_NEVER <error_token> - TRUST_MARGINAL [0 [<validation_model>]] - TRUST_FULLY [0 [<validation_model>]] - TRUST_ULTIMATE [0 [<validation_model>]] For good signatures one of these status lines are emitted to indicate the validity of the key used to create the signature. The error token values are currently only emitted by gpgsm. """ My interpretation is that the trust level is conceptionally different from the validity of the key and/or signature. That seems to also have been the assumption of the old code in check_signature() where a result of 'G' (as in GOODSIG) and 'U' (as in TRUST_NEVER or TRUST_UNDEFINED) were both considered a success. The two cases where a result of 'U' had special meaning were in verify_merge_signature() (where this caused git to die()) and in format_commit_one() (where it affected the output of the %G? format specifier). I think it makes sense to refactor the processing of TRUST_ status lines such that users can configure a minimum trust level that is enforced globally, rather than have individual parts of git (e.g. merge) do it themselves (except for a grace period with backward compatibility). I also think it makes sense to not store the trust level in the same struct member as the key/signature status. While the presence of a TRUST_ status code does imply that the signature is good (see the first paragraph in the included snippet above), as far as I can tell, the order of the status lines from GPG isn't well-defined; thus it would seem plausible that the trust level could be overwritten with the key/signature status if they were stored in the same member of the signature_check structure. This patch introduces a new configuration option: gpg.minTrustLevel. It consolidates trust-level verification to gpg-interface.c and adds a new `trust_level` member to the signature_check structure. Backward-compatibility is maintained by introducing a special case in verify_merge_signature() such that if no user-configurable gpg.minTrustLevel is set, then the old behavior of rejecting TRUST_UNDEFINED and TRUST_NEVER is enforced. If, on the other hand, gpg.minTrustLevel is set, then that value overrides the old behavior. Similarly, the %G? format specifier will continue show 'U' for signatures made with a key that has a trust level of TRUST_UNDEFINED or TRUST_NEVER, even though the 'U' character no longer exist in the `result` member of the signature_check structure. A new format specifier, %GT, is also introduced for users that want to show all possible trust levels for a signature. Another approach would have been to simply drop the trust-level requirement in verify_merge_signature(). This would also have made the behavior consistent with other parts of git that perform signature verification. However, requiring a minimum trust level for signing keys does seem to have a real-world use-case. For example, the build system used by the Qubes OS project currently parses the raw output from verify-tag in order to assert a minimum trust level for keys used to sign git tags [2]. [1] https://git.gnupg.org/cgi-bin/gitweb.cgi?p=gnupg.git;a=blob;f=doc/doc/DETAILS;h=bd00006e933ac56719b1edd2478ecd79273eae72;hb=refs/heads/master [2] https://github.com/QubesOS/qubes-builder/blob/9674c1991deef45b1a1b1c71fddfab14ba50dccf/scripts/verify-git-tag#L43 Signed-off-by: Hans Jerry Illikainen <hji@dyntopia.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-10Merge branch 'dl/pretty-reference'Junio C Hamano
"git log" family learned "--pretty=reference" that gives the name of a commit in the format that is often used to refer to it in log messages. * dl/pretty-reference: SubmittingPatches: use `--pretty=reference` pretty: implement 'reference' format pretty: add struct cmt_fmt_map::default_date_mode_type pretty: provide short date format t4205: cover `git log --reflog -z` blindspot pretty.c: inline initalize format_context revision: make get_revision_mark() return const pointer completion: complete `tformat:` pretty format SubmittingPatches: remove dq from commit reference pretty-formats.txt: use generic terms for hash SubmittingPatches: use generic terms for hash
2019-11-20pretty: implement 'reference' formatDenton Liu
The standard format for referencing other commits within some projects (such as git.git) is the reference format. This is described in Documentation/SubmittingPatches as If you want to reference a previous commit in the history of a stable branch, use the format "abbreviated hash (subject, date)", like this: .... Commit f86a374 (pack-bitmap.c: fix a memleak, 2015-03-30) noticed that ... .... Since this format is so commonly used, standardize it as a pretty format. The tests that are implemented essentially show that the format-string does not change in response to various log options. This is useful because, for future developers, it shows that we've considered the limitations of the "canned format-string" approach and we are fine with them. Based-on-a-patch-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-20pretty: add struct cmt_fmt_map::default_date_mode_typeDenton Liu
In a future commit, we plan on having a pretty format which will use a default date format unless otherwise overidden. Add support for this by adding a `default_date_mode_type` member in `struct cmt_fmt_map`. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-20pretty: provide short date formatRené Scharfe
Add the placeholders %as and %cs to format author date and committer date, respectively, without the time part, like --date=short does, i.e. like YYYY-MM-DD. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-20pretty.c: inline initalize format_contextDenton Liu
Instead of memsetting and then initializing the fields in the struct, move the initialization of `format_context` to its assignment. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-30pretty: add "%aL" etc. to show local-part of email addressesPrarit Bhargava
In many projects the number of contributors is low enough that users know each other and the full email address doesn't need to be displayed. Displaying only the author's username saves a lot of columns on the screen. Existing 'e/E' (as in "%ae" and "%aE") placeholders would show the author's address as "prarit@redhat.com", which would waste columns to show the same domain-part for all contributors when used in a project internal to redhat. Introduce 'l/L' placeholders that strip '@' and domain part from the e-mail address. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-09log-tree: call load_ref_decorations() in get_name_decoration()René Scharfe
Load a default set of ref name decorations at the first lookup. This frees direct and indirect callers from doing so. They can still do it if they want to use a filter or are interested in full decorations instead of the default short ones -- the first load_ref_decorations() call wins. This means that the load in builtin/log.c::cmd_log_init_finish() is respected even if --simplify-by-decoration is given, as the previously dominating earlier load in handle_revision_opt() is gone. So a filter given with --decorate-refs-exclude is used for simplification in that case, as expected. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-18use COPY_ARRAY for copying arraysRené Scharfe
Convert calls of memcpy(3) to use COPY_ARRAY, which shortens and simplifies the code a bit. Patch generated by Coccinelle and contrib/coccinelle/array.cocci. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-20pretty: drop unused strbuf from parse_padding_placeholder()Jeff King
Unlike other parts of the --pretty user-format expansion, this function is not actually writing to the output, but instead just storing the padding values into a context struct. We don't need to be passed a strbuf at all. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-20pretty: drop unused "type" parameter in needs_rfc2047_encoding()Jeff King
The "should we encode" check was split off from add_rfc2047() into its own function in 41dd00bad3 (format-patch: fix rfc2047 address encoding with respect to rfc822 specials, 2012-10-18). But only the "add" half needs to know the rfc2047_type, since it only affects _how_ we encode, not whether we do. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>