path: root/builtin/rev-list.c
AgeCommit message (Collapse)Author
2014-02-27Merge branch 'jk/pack-bitmap'Junio C Hamano
Borrow the bitmap index into packfiles from JGit to speed up enumeration of objects involved in a commit range without having to fully traverse the history. * jk/pack-bitmap: (26 commits) ewah: unconditionally ntohll ewah data ewah: support platforms that require aligned reads read-cache: use get_be32 instead of hand-rolled ntoh_l block-sha1: factor out get_be and put_be wrappers do not discard revindex when re-preparing packfiles pack-bitmap: implement optional name_hash cache t/perf: add tests for pack bitmaps t: add basic bitmap functionality tests count-objects: recognize .bitmap in garbage-checking repack: consider bitmaps when performing repacks repack: handle optional files created by pack-objects repack: turn exts array into array-of-struct repack: stop using magic number for ARRAY_SIZE(exts) pack-objects: implement bitmap writing rev-list: add bitmap mode to speed up object lists pack-objects: use bitmaps when packing objects pack-objects: split add_object_entry pack-bitmap: add support for bitmap indexes documentation: add documentation for the bitmap format ewah: compressed bitmap implementation ...
2013-12-30rev-list: add bitmap mode to speed up object listsVicent Marti
The bitmap reachability index used to speed up the counting objects phase during `pack-objects` can also be used to optimize a normal rev-list if the only thing required are the SHA1s of the objects during the list (i.e., not the path names at which trees and blobs were found). Calling `git rev-list --objects --use-bitmap-index [committish]` will perform an object iteration based on a bitmap result instead of actually walking the object graph. These are some example timings for `torvalds/linux` (warm cache, best-of-five): $ time git rev-list --objects master > /dev/null real 0m34.191s user 0m33.904s sys 0m0.268s $ time git rev-list --objects --use-bitmap-index master > /dev/null real 0m1.041s user 0m0.976s sys 0m0.064s Likewise, using `git rev-list --count --use-bitmap-index` will speed up the counting operation by building the resulting bitmap and performing a fast popcount (number of bits set on the bitmap) on the result. Here are some sample timings of different ways to count commits in `torvalds/linux`: $ time git rev-list master | wc -l 399882 real 0m6.524s user 0m6.060s sys 0m3.284s $ time git rev-list --count master 399882 real 0m4.318s user 0m4.236s sys 0m0.076s $ time git rev-list --use-bitmap-index --count master 399882 real 0m0.217s user 0m0.176s sys 0m0.040s This also respects negative refs, so you can use it to count a slice of history: $ time git rev-list --count v3.0..master 144843 real 0m1.971s user 0m1.932s sys 0m0.036s $ time git rev-list --use-bitmap-index --count v3.0..master real 0m0.280s user 0m0.220s sys 0m0.056s Though note that the closer the endpoints, the less it helps. In the traversal case, we have fewer commits to cross, so we take less time. But the bitmap time is dominated by generating the pack revindex, which is constant with respect to the refs given. Note that you cannot yet get a fast --left-right count of a symmetric difference (e.g., "--count --left-right master...topic"). The slow part of that walk actually happens during the merge-base determination when we parse "master...topic". Even though a count does not actually need to know the real merge base (it only needs to take the symmetric difference of the bitmaps), the revision code would require some refactoring to handle this case. Additionally, a `--test-bitmap` flag has been added that will perform the same rev-list manually (i.e. using a normal revwalk) and using bitmaps, and verify that the results are the same. This can be used to exercise the bitmap code, and also to verify that the contents of the .bitmap file are sane. Signed-off-by: Vicent Marti <> Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
2013-10-16C: have space around && and || operatorsJunio C Hamano
Correct all hits from git grep -e '\(&&\|||\)[^ ]' -e '[^ ]\(&&\|||\)' -- '*.c' i.e. && or || operators that are followed by anything but a SP, or that follow something other than a SP or a HT, so that these operators have a SP around it when necessary. We usually refrain from making this kind of a tree-wide change in order to avoid unnecessary conflicts with other "real work" patches, but in this case, the end result does not have a potentially cumbersome tree-wide impact, while this is a tree-wide cleanup. Fixes to compat/regex/regcomp.c and xdiff/xemit.c are to replace a HT immediately after && with a SP. This is based on Felipe's patch to bultin/symbolic-ref.c; I did all the finding out what other files in the whole tree need to be fixed and did the fix and also the log message while reviewing that single liner, so any screw-ups in this version are mine. Signed-off-by: Felipe Contreras <> Signed-off-by: Junio C Hamano <>
2013-08-28list-objects: reduce one argument in mark_edges_uninterestingNguyễn Thái Ngọc Duy
mark_edges_uninteresting() is always called with this form mark_edges_uninteresting(revs->commits, revs, ...); Remove the first argument and let mark_edges_uninteresting figure that out by itself. It helps answer the question "are this commit list and revs related in any way?" when looking at mark_edges_uninteresting implementation. Signed-off-by: Nguyễn Thái Ngọc Duy <> Signed-off-by: Junio C Hamano <>
2013-06-26pretty: --format output should honor logOutputEncodingAlexey Shumkin
One can set an alias $ git config [--global] alias.lg "log --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cd) %C(bold blue)<%an>%Creset' --abbrev-commit --date=local" to see the log as a pretty tree (like *gitk* but in a terminal). However, log messages written in an encoding i18n.commitEncoding which differs from terminal encoding are shown corrupted even when i18n.logOutputEncoding and terminal encoding are the same (e.g. log messages committed on a Cygwin box with Windows-1251 encoding seen on a Linux box with a UTF-8 encoding and vice versa). To simplify an example we can say the following two commands are expected to give the same output to a terminal: $ git log --oneline --no-color $ git log --pretty=format:'%h %s' However, the former pays attention to i18n.logOutputEncoding configuration, while the latter does not when it formats "%s". The same corruption is true for $ git diff --submodule=log and $ git rev-list --pretty=format:%s HEAD and $ git reset --hard This patch makes pretty --format honor logOutputEncoding when it formats log message. Signed-off-by: Alexey Shumkin <> Signed-off-by: Junio C Hamano <>
2012-10-29Move print_commit_list to libgit.aNguyễn Thái Ngọc Duy
This is used by bisect.c, part of libgit.a while it stays in builtin/rev-list.c. Move it to commit.c so that we won't get undefined reference if a program that uses libgit.a happens to pull it in. Signed-off-by: Nguyễn Thái Ngọc Duy <> Signed-off-by: Jeff King <>
2012-10-29Move estimate_bisect_steps to libgit.aNguyễn Thái Ngọc Duy
This function is used by bisect.c, part of libgit.a while estimate_bisect_steps stays in builtin/rev-list.c. Move it to bisect.a so we won't have undefine reference if a standalone program that uses libgit.a happens to pull it in. Signed-off-by: Nguyễn Thái Ngọc Duy <> Signed-off-by: Jeff King <>
2012-05-11Merge branch 'jk/maint-reflog-walk-count-vs-time'Junio C Hamano
Gives a better DWIM behaviour for --pretty=format:%gd, "stash list", and "log -g", depending on how the starting point ("master" vs "master@{0}" vs "master@{now}") and date formatting options (e.g. "--date=iso") are given on the command line. By Jeff King (4) and Junio C Hamano (1) * jk/maint-reflog-walk-count-vs-time: reflog-walk: tell explicit --date=default from not having --date at all reflog-walk: always make HEAD@{0} show indexed selectors reflog-walk: clean up "flag" field of commit_reflog struct log: respect date_mode_explicit with --format:%gd t1411: add more selector index/date tests
2012-05-04log: respect date_mode_explicit with --format:%gdJeff King
When we show a reflog selector (e.g., via "git log -g"), we perform some DWIM magic: while we normally show the entry's index (e.g., HEAD@{1}), if the user has given us a date with "--date", then we show a date-based select (e.g., HEAD@{yesterday}). However, we don't want to trigger this magic if the alternate date format we got was from the "" configuration; that is not sufficiently strong context for us to invoke this particular magic. To fix this, commit f4ea32f (improve reflog date/number heuristic, 2009-09-24) introduced a "date_mode_explicit" flag in rev_info. This flag is set only when we see a "--date" option on the command line, and we a vanilla date to the reflog code if the date was not explicit. Later, commit 8f8f547 (Introduce new pretty formats %g[sdD] for reflog information, 2009-10-19) added another way to show selectors, and it did not respect the date_mode_explicit flag from f4ea32f. This patch propagates the date_mode_explicit flag to the pretty-print code, which can then use it to pass the appropriate date field to the reflog code. This brings the behavior of "%gd" in line with the other formats, and means that its output is independent of any user configuration. Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
2012-02-28rev-list: fix --verify-objects --quiet becoming --objectsNguyễn Thái Ngọc Duy
When --quiet is specified, finish_object() is called instead of show_object(). The latter is in charge of --verify-objects and will be skipped if --quiet is specified. Move the code up to finish_object(). Also pass the quiet flag along and make it always call show_* functions to avoid similar problems in future. Signed-off-by: Nguyễn Thái Ngọc Duy <> Signed-off-by: Junio C Hamano <>
2012-02-28rev-list: remove BISECT_SHOW_TRIED flagNguyễn Thái Ngọc Duy
Since c99f069 (bisect--helper: remove "--next-vars" option as it is now useless - 2009-04-21), this flag has always been off. Remove the flag and all related code. Signed-off-by: Nguyễn Thái Ngọc Duy <> Signed-off-by: Junio C Hamano <>
2012-02-13git rev-list: fix invalid typecastClemens Buchacher
git rev-list passes rev_list_info, not rev_list objects. Without this fix, rev-list enables or disables the --verify-objects option depending on a read from an undefined memory location. Signed-off-by: Clemens Buchacher <> Signed-off-by: Junio C Hamano <>
2011-09-01rev-list --verify-objectJunio C Hamano
Often we want to verify everything reachable from a given set of commits are present in our repository and connected without a gap to the tips of our refs. We used to do this for this purpose: $ rev-list --objects $commits_to_be_tested --not --all Even though this is good enough for catching missing commits and trees, we show the object name but do not verify their existence, let alone their well-formedness, for the blob objects at the leaf level. Add a new "--verify-object" option so that we can catch missing and broken blobs as well. Signed-off-by: Junio C Hamano <>
2011-09-01list-objects: pass callback data to show_objects()Junio C Hamano
The traverse_commit_list() API takes two callback functions, one to show commit objects, and the other to show other kinds of objects. Even though the former has a callback data parameter, so that the callback does not have to rely on global state, the latter does not. Give the show_objects() callback the same callback data parameter. Signed-off-by: Junio C Hamano <>
2011-08-22revision.c: add show_object_with_name() helper functionJunio C Hamano
There are two copies of traverse_commit_list callback that show the object name followed by pathname the object was found, to produce output similar to "rev-list --objects". Unify them. Signed-off-by: Junio C Hamano <>
2011-08-22rev-list: fix finish_object() callJunio C Hamano
The callback to traverse_commit_list() are to take linked name_path and a string for the last path component. If the callee used its parameters, it would have seen duplicated leading paths. In this particular case, the callee does not use this argument but that is not a reason to leave the call broken. Signed-off-by: Junio C Hamano <>
2011-05-31Merge branch 'jk/format-patch-am'Junio C Hamano
* jk/format-patch-am: format-patch: preserve subject newlines with -k clean up calling conventions for pretty.c functions pretty: add pp_commit_easy function for simple callers mailinfo: always clean up rfc822 header folding t: test subject handling in format-patch / am pipeline Conflicts: builtin/branch.c builtin/log.c commit.h
2011-05-26clean up calling conventions for pretty.c functionsJeff King
We have a pretty_print_context representing the parameters for a pretty-print session, but we did not use it uniformly. As a result, functions kept growing more and more arguments. Let's clean this up in a few ways: 1. All pretty-print pp_* functions now take a context. This lets us reduce the number of arguments to these functions, since we were just passing around the context values separately. 2. The context argument now has a cmit_fmt field, which was passed around separately. That's one less argument per function. 3. The context argument always comes first, which makes calling a little more uniform. This drops lines from some callers, and adds lines in a few places (because we need an extra line to set the context's fmt field). Overall, we don't save many lines, but the lines that are there are a lot simpler and more readable. Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
2011-04-26rev-list --count: separate count for --cherry-markMichael J Gruber
When --count is used with --cherry-mark, omit the patch equivalent commits from the count for left and right commits and print the count of equivalent commits separately. Signed-off-by: Michael J Gruber <> Signed-off-by: Junio C Hamano <>
2011-03-23revision.c: introduce --min-parents and --max-parents optionsMichael J Gruber
Introduce --min-parents and --max-parents options which limit the revisions to those commits which have at least (or at most) that many commits, where negative arguments for --max-parents= denote infinity (i.e. no upper limit). In particular: --max-parents=1 is the same as --no-merges; --min-parents=2 is the same as --merges; --max-parents=0 shows only roots; and --min-parents=3 shows only octopus merges Using --min-parents=n and --max-parents=m with n>m gives you what you ask for (i.e. nothing) for obvious reasons, just like when you give --merges (show only merge commits) and --no-merges (show only non-merge commits) at the same time. Also, introduce --no-min-parents and --no-max-parents to do the obvious thing for convenience. We compute the number of parents only when we limit by that, so there is no performance impact when there are no limiters. Signed-off-by: Michael J Gruber <> Signed-off-by: Junio C Hamano <>
2011-03-09rev-list/log: factor out revision mark generationMichael J Gruber
Currently, we have identical code for generating revision marks ('<', '>', '-') in 5 places. Factor out the code to a single function get_revision_mark() for easier maintenance and extensibility. Note that the check for !!revs in graph.c (which gets removed effectively by this patch) is superfluous. Signed-off-by: Michael J Gruber <> Signed-off-by: Junio C Hamano <>
2010-11-17Merge branch 'jk/maint-rev-list-nul'Junio C Hamano
* jk/maint-rev-list-nul: rev-list: handle %x00 NUL in user format
2010-10-14rev-list: handle %x00 NUL in user formatJeff King
The code paths for showing commits in "git log" and "git rev-list --graph" correctly handle embedded NULs by looking only at the resulting strbuf's length, and never treating it as a C string. The code path for regular rev-list, however, used printf("%s"), which resulted in truncated output. This patch uses fwrite instead, like the --graph code path. Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
2010-10-08Use angles for placeholders consistentlyŠtěpán Němec
Signed-off-by: Štěpán Němec <> Acked-by: Jonathan Nieder <> Signed-off-by: Junio C Hamano <>
2010-06-12rev-list: introduce --count optionThomas Rast
Add a --count option that, instead of actually listing the commits, merely counts them. This is mostly geared towards script use, and to this end it acts specially when used with --left-right: it outputs the left and right counts separately. Previously, scripts would have to run a shell loop or small inline script over to achieve the same. (Without --left-right, a simple |wc -l does the job.) Signed-off-by: Thomas Rast <> Signed-off-by: Junio C Hamano <>
2010-04-06Merge branch 'ef/maint-empty-commit-log'Junio C Hamano
* ef/maint-empty-commit-log: rev-list: fix --pretty=oneline with empty message
2010-04-03Merge branch 'mg/use-default-abbrev-length-in-rev-list'Junio C Hamano
* mg/use-default-abbrev-length-in-rev-list: rev-list: use default abbrev length when abbrev-commit is in effect
2010-03-10Merge branch 'lt/deepen-builtin-source'Junio C Hamano
* lt/deepen-builtin-source: Move 'builtin-*' into a 'builtin/' subdirectory Conflicts: Makefile
2010-02-22Move 'builtin-*' into a 'builtin/' subdirectoryLinus Torvalds
This shrinks the top-level directory a bit, and makes it much more pleasant to use auto-completion on the thing. Instead of [torvalds@nehalem git]$ em buil<tab> Display all 180 possibilities? (y or n) [torvalds@nehalem git]$ em builtin-sh builtin-shortlog.c builtin-show-branch.c builtin-show-ref.c builtin-shortlog.o builtin-show-branch.o builtin-show-ref.o [torvalds@nehalem git]$ em builtin-shor<tab> builtin-shortlog.c builtin-shortlog.o [torvalds@nehalem git]$ em builtin-shortlog.c you get [torvalds@nehalem git]$ em buil<tab> [type] builtin/ builtin.h [torvalds@nehalem git]$ em builtin [auto-completes to] [torvalds@nehalem git]$ em builtin/sh<tab> [type] shortlog.c shortlog.o show-branch.c show-branch.o show-ref.c show-ref.o [torvalds@nehalem git]$ em builtin/sho [auto-completes to] [torvalds@nehalem git]$ em builtin/shor<tab> [type] shortlog.c shortlog.o [torvalds@nehalem git]$ em builtin/shortlog.c which doesn't seem all that different, but not having that annoying break in "Display all 180 possibilities?" is quite a relief. NOTE! If you do this in a clean tree (no object files etc), or using an editor that has auto-completion rules that ignores '*.o' files, you won't see that annoying 'Display all 180 possibilities?' message - it will just show the choices instead. I think bash has some cut-off around 100 choices or something. So the reason I see this is that I'm using an odd editory, and thus don't have the rules to cut down on auto-completion. But you can simulate that by using 'ls' instead, or something similar. Signed-off-by: Linus Torvalds <> Signed-off-by: Junio C Hamano <>