summaryrefslogtreecommitdiff
path: root/rev-list.c
AgeCommit message (Collapse)Author
2005-09-17[PATCH] Fix "git-rev-list" revision range parsingLinus Torvalds
There were two bugs in there: - if the range didn't end up working, we restored the '.' character in the wrong place. - an empty end-of-range should be interpreted as HEAD. See rev-parse.c for the reference implementation of this. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-16[PATCH] Avoid building object ref lists when not neededLinus Torvalds
The object parsing code builds a generic "this object references that object" because doing a full connectivity check for fsck requires it. However, nothing else really needs it, and it's quite expensive for git-rev-list that can have tons of objects in flight. So, exactly like the commit buffer save thing, add a global flag to disable it, and use it in git-rev-list. Before: $ /usr/bin/time git-rev-list --objects v2.6.12..HEAD | wc -l 12.28user 0.29system 0:12.57elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+26718minor)pagefaults 0swaps 59124 After this change: $ /usr/bin/time git-rev-list --objects v2.6.12..HEAD | wc -l 10.33user 0.18system 0:10.54elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+18509minor)pagefaults 0swaps 59124 and note how the number of pages touched by git-rev-list for this particular object list has shrunk from 26,718 (104 MB) to 18,509 (72 MB). Calculating the total object difference between two git revisions is still clearly the most expensive git operation (both in memory and CPU time), but it's now less than 40% of what it used to be. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-16[PATCH] Improve git-rev-list memory usage furtherLinus Torvalds
This avoids keeping tree entries around, and free's them as it traverses the list. This avoids building up a huge memory footprint just for these small but very common allocations. Before: $ /usr/bin/time git-rev-list --objects v2.6.12..HEAD | wc -l 11.65user 0.38system 0:12.65elapsed 95%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+42934minor)pagefaults 0swaps 59124 After: $ /usr/bin/time git-rev-list --objects v2.6.12..HEAD | wc -l 12.28user 0.29system 0:12.57elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+26718minor)pagefaults 0swaps 59124 Note how the minor fault numbers - which ends up being how many pages we needed to map - go down from 42934 (167 MB) to 26718 (104 MB). That is: Before: 42934 minor pagefaults After: 26718 minor pagefaults This is all in _addition_ to the previous fixes. It used to be ~48,000 pagefaults. That's still a honking big memory footprint, but it's about half of what it was just a day or two ago (and this is the object list for a pretty big update - almost 60,000 objects. Smaller updates need less memory). Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-15[PATCH] Re-organize "git-rev-list --objects" logicLinus Torvalds
The logic to calculate the full object list used to be very inter-twined with the logic that looked up the commits. For no good reason - it's actually a lot simpler to just do that logic as a separate pass. This improves performance a bit, and uses slightly less memory in my tests, but more importantly it makes the code simpler to work with and follow what it does. The performance win is less than I had hoped for, but I get: Before: [torvalds@g5 linux]$ /usr/bin/time git-rev-list --objects v2.6.12..HEAD | wc -l 13.64user 0.42system 0:14.13elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+47947minor)pagefaults 0swaps 58945 After: [torvalds@g5 linux]$ /usr/bin/time git-rev-list --objects v2.6.12..HEAD | wc -l 11.80user 0.36system 0:12.16elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+42684minor)pagefaults 0swaps 58945 ie it improved by 2 seconds, and took a 5000+ fewer pages (hey, that's 20MB out of 174MB to go). And got the same number of objects (in theory, the more expensive one might find some more shared objects to avoid. In practice it obviously doesn't). I know how to make it use _lots_ less memory, which will probably speed it up. But that's for another time, and I'd prefer to see this go in first. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-15[PATCH] Avoid wasting memory in git-rev-listLinus Torvalds
As pointed out on the list, git-rev-list can use a lot of memory. One low-hanging fruit is to free the commit buffer for commits that we parse. By default, parse_commit() will save away the buffer, since a lot of cases do want it, and re-reading it continually would be unnecessary. However, in many cases the buffer isn't actually necessary and saving it just wastes memory. We could just free the buffer ourselves, but especially in git-rev-list, we actually end up using the helper functions that automatically add parent commits to the commit lists, so we don't actually control the commit parsing directly. Instead, just make this behaviour of "parse_commit()" a global flag. Maybe this is a bit tasteless, but it's very simple, and it makes a noticable difference in memory usage. Before the change: [torvalds@g5 linux]$ /usr/bin/time git-rev-list v2.6.12..HEAD > /dev/null 0.26user 0.02system 0:00.28elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+3714minor)pagefaults 0swaps after the change: [torvalds@g5 linux]$ /usr/bin/time git-rev-list v2.6.12..HEAD > /dev/null 0.26user 0.00system 0:00.27elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+2433minor)pagefaults 0swaps note how the minor faults have decreased from 3714 pages to 2433 pages. That's all due to the fewer anonymous pages allocated to hold the comment buffers and their metadata. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-08-24[PATCH] Fix "prefix" mixup in git-rev-listPavel Roskin
Recent changes in git have broken cg-log. git-rev-list no longer prints "commit" in front of commit hashes. It turn out a local "prefix" variable in main() shadows a file-scoped "prefix" variable. The patch removed the local "prefix" variable since its value is never used (in the intended way, that is). The call to setup_git_directory() is kept since it has useful side effects. The file-scoped "prefix" variable is renamed to "commit_prefix" just in case someone reintroduces "prefix" to hold the return value of setup_git_directory(). Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-08-23Make "git-rev-list" work within subdirectoriesLinus Torvalds
This trivial patch makes "git-rev-list" able to handle not being in the top-level directory. This magically also makes "git-whatchanged" do the right thing. Trivial scripting fix to make sure that "git log" also works. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-08-19[PATCH] git-rev-list: avoid crash on broken repositorySergey Vlasov
When following tags, check for parse_object() success and error out properly instead of segfaulting. Signed-off-by: Sergey Vlasov <vsu@altlinux.ru> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-08-10Introduce --pretty=oneline format.Junio C Hamano
This introduces --pretty=oneline to git-rev-tree and git-rev-list commands to show only the first line of the commit message, without frills. Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-08-10[PATCH] add *--no-merges* flag to suppress display of merge commitsJohannes Schindelin
As requested by Junio (who suggested --single-parents-only, but this could forget a no-parent root). Also, adds a few missing options to the usage string. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-08-05Teach rev-list since..til notation.Junio C Hamano
The King Penguin says: Now, for extra bonus points, maybe you should make "git-rev-list" also understand the "rev..rev" format (which you can't do with just the get_sha1() interface, since it expands into more). The faithful servant makes it so. Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-07-30[PATCH] Support for NO_OPENSSLPetr Baudis
Support for completely OpenSSL-less builds. FSF considers distributing GPL binaries with OpenSSL linked in as a legal problem so this is trouble e.g. for Debian, or some people might not want to install OpenSSL anyway. If you make NO_OPENSSL=1 you get completely OpenSSL-less build, disabling --merge-order and using Mozilla's SHA1 implementation. Ported from Cogito. Signed-off-by: Petr Baudis <pasky@ucw.cz> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-07-30[PATCH] Fix interesting git-rev-list corner caseLinus Torvalds
This corner-case was triggered by a kernel commit that was not in date order, due to a misconfigured time zone that made the commit appear three hours older than it was. That caused git-rev-list to traverse the commit tree in a non-obvious order, and made it parse several of the _parents_ of the misplaced commit before it actually parsed the commit itself. That's fine, but it meant that the grandparents of the commit didn't get marked uninteresting, because they had been reached through an "interesting" branch. The reason was that "mark_parents_uninteresting()" (which is supposed to mark all existing parents as being uninteresting - duh) didn't actually traverse more than one level down the parent chain. NORMALLY this is fine, since with the date-based traversal order, grandparents won't ever even have been looked at before their parents (so traversing the chain down isn't needed, because the next time around when we pick out the parent we'll mark _its_ parents uninteresting), but since we'd gotten out of order, we'd already seen the parent and thus never got around to mark the grandparents. Anyway, the fix is simple. Just traverse parent chains recursively. Normally the chain won't even exist (since the parent hasn't been parsed yet), so this is not actually going to trigger except in this strange corner-case. Add a comment to the simple one-liner, since this was a bit subtle, and I had to really think things through to understand how it could happen. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-07-27Typofix: usage strings fix.Junio C Hamano
The *_usage strings should not start with "usage: ", since the usage() function gives its own. Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-07-23Be more aggressive about marking trees uninterestingLinus Torvalds
We'll mark all the trees at the edges (as deep as we had to go to realize that we have all the commits needed) as uninteresting. Otherwise we'll occasionally list a lot of objects that were actually available at the edge in a commit that we just never ended up parsing because we could determine early that we had all relevant commits. NOTE! The object listing is still just a _heuristic_. It's guaranteed to list a superset of the actual new objects, but there might be the occasional old object in the list, just because the commit that referenced it was much further back in the history. For example, let's say that a recent commit is a revert of part of the tree to much older state: since we didn't walk _that_ far back in the commit history tree to list the commits necessary, git-rev-tree will never have marked the old objects uninteresting, and we'll end up listing them as "new". That's ok.
2005-07-11[PATCH] Dereference tag repeatedly until we get a non-tag.Junio C Hamano
When we allow a tag object in place of a commit object, we only dereferenced the given tag once, which causes a tag that points at a tag that points at a commit to be rejected. Instead, dereference tag repeatedly until we get a non-tag. This patch makes change to two functions: - commit.c::lookup_commit_reference() is used by merge-base, rev-tree and rev-parse to convert user supplied SHA1 to that of a commit. - rev-list uses its own get_commit_reference() to do the same. Dereferencing tags this way helps both of these uses. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-10git-rev-list: allow missing objects when the parent is marked UNINTERESTINGLinus Torvalds
We still want the "top-most" uninteresting object to exist, so that we know that we have reached it.
2005-07-07[PATCH] Ensure list insertion method does not depend on position of ↵Jon Seymour
--merge-order argument This change ensures that git-rev-list --merge-order produces the same result irrespective of what position the --merge-order argument appears in the argument list. Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-06git-rev-list: remove the DUPCHECK logic, use SEEN insteadLinus Torvalds
That's what we should have done in the first place, since it not only avoids another unnecessary flag, it also protects the commits from showing up as duplicates later when they show up as parents of another commit (in the pop_most_recent_commit() path). This will hopefully also fix --topo-sort.
2005-07-06Make sure we generate the whole commit list before trying to sort it ↵Linus Torvalds
topologically This was my cherry-pickng merge bug. But topo-order still shows strange behaviour with multiple heads, so keep gitk using --merge-order for now.
2005-07-06[PATCH] Tidy up - slight simplification of rev-list.cJon Seymour
This patch implements a small tidy up of rev-list.c to reduce (but not eliminate) the amount of ugliness associated with the merge_order flag. Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-06Add "--topo-order" flag to use new topological sortLinus Torvalds
2005-07-06Remove insane overlapping bit ranges from epoch.cLinus Torvalds
..and move the DUPCHECK to rev-list.c since both the merge-order and the upcoming topo-sort get confused by dups.
2005-07-06Clean up commit insertion in git-rev-listLinus Torvalds
Jon wants the commits in a different order for merge-order.
2005-07-06Make "insert_by_date()" match "commit_list_insert()"Linus Torvalds
Same argument order, same return type. This allows us to use a function pointer to choose one over the other.
2005-07-05Remove unnecessary usage of strncmp() in git-rev-list arg parsing.Linus Torvalds
Not only is it unnecessary, it incorrectly allows extraneous characters at the end of the argument. Junio noticed the --merge-order thing, and Jon points out that if we fix that one, we should fix --show-breaks too.
2005-07-04git-rev-list: make sure the output is sorted by recencyLinus Torvalds
We didn't sort the refs by date, so if you had multiple refs, the end result would not be properly sorted.
2005-07-04Make rev-list flush the stdio buffers after each rev.Linus Torvalds
We'd rather get the revisions in a slow but timely manner than have to wait for them.
2005-07-03"git rev-list --unpacked" shows only unpacked commitsLinus Torvalds
More infrastructure to do efficient incremental packs.
2005-07-03Add "--all" flag to rev-parse that shows all refsLinus Torvalds
And make git-rev-list just silently ignore non-commit refs if we're not asking for all objects.
2005-07-03Fix sparse warnings.Linus Torvalds
Mainly making a lot of local functions and variables be marked "static", but there was a "zero as NULL" warning in there too.
2005-06-29Teach git-rev-list about non-commit objectsLinus Torvalds
Now you can give git-rev-list tags, trees and blobs, and it will do the proper reachability for them all. Knock wood. Of course, you need the "--objects" flag to do anything but plain commits.
2005-06-29Prepare git-rev-list for tracking tag objects tooLinus Torvalds
We want to be able to just say "give a difference between these objects", rather than limiting it to commits only. This isn't there yet, but it sets things up to be a bit easier.
2005-06-27Add "--pretty=full" format that also shows committer.Linus Torvalds
Also move the common implementation of parsing the --pretty argument format into commit.c rather than having duplicates in diff-tree.c and rev-list.c.
2005-06-26Ooh. Make git-rev-list --object associate a name with objects.Linus Torvalds
The name isn't unique, it's just the first name that object is reached through, so it's really nothing more than a hint.
2005-06-25git-rev-list: add option to list all objects (not just commits)Linus Torvalds
When you do git-rev-list --objects $(git-rev-parse HEAD^..HEAD) it now lists not only the "commit difference" between the parent of HEAD and HEAD itself (which is normally just the parent, but in the case of a merge will be all the newly merged commits), but also all the new tree and blob objects that weren't in the original. NOTE! It doesn't walk all the way to the root, so it doesn't do a full object search in the full old history. Instead, it will only look as far back in the history as it needs to resolve the commits. Thus, if the commit reverts a blob (or tree) back to a state much further back in history, we may end up listing some blobs (or trees) as "new" even though they exist further back. Regardless, the list of objects will be a superset (usually exact) list of objects needed to go from the beginning commit to ending commit. As a particularly obvious special case, git-rev-list --objects HEAD will end up listing every single object that is reachable from the HEAD commit. Side note: the objects are sorted by "recency", with commits first.
2005-06-20[PATCH] Fix for --merge-order, --max-age interaction issueJon Seymour
This patch fixes a problem reported by Paul Mackerras regarding the interaction of the --merge-order and --max-age switches of git-rev-list. This patch applies to the current Linus HEAD. A cleaner fix for the same problem in my current HEAD will follow later. With this change, --merge-order produces the same result as no --merge-order on the linux-2.6 git repository, to wit: $> git-rev-list --max-age=1116330140 bcfff0b471a60df350338bcd727fc9b8a6aa54b2 | wc -l 655 $> git-rev-list --merge-order --max-age=1116330140 bcfff0b471a60df350338bcd727fc9b8a6aa54b2 | wc -l 655 Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-20[PATCH] Prevent git-rev-list without --merge-order producing duplicates in ↵Jon Seymour
output If b is reachable from a, then: git-rev-list a b argument would print one of the commits twice. This patch fixes that problem. A previous problem fixed it for the --merge-order switch. Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-19Avoid warning about function without return.Linus Torvalds
Strangely, this warning only shows up when not compiling with "-O2", which is why I didn't see it originally.
2005-06-18git-rev-list: add "--bisect" flag to find the "halfway" pointLinus Torvalds
This is useful for doing binary searching for problems. You start with a known good and known bad point, and you then test the "halfway" point in between: git-rev-list --bisect bad ^good and you test that. If that one tests good, you now still have a known bad case, but two known good points, and you can bisect again: git-rev-list --bisect bad ^good1 ^good2 and test that point. If that point is bad, you now use that as your known-bad starting point: git-rev-list --bisect newbad ^good1 ^good2 and basically at every iteration you shrink your list of commits by half: you're binary searching for the point where the troubles started, even though there isn't a nice linear ordering.
2005-06-08[PATCH] Tidy up some rev-list-related stuffPetr Baudis
This patch tidies up the git-rev-list documentation and epoch.c, which are in severe clash with the unwritten coding style now, and quite unreadable. It also fixes up compile failures with older compilers due to variable declarations after code. The patch mostly wraps lines before or on the 80th column, removes plenty of superfluous empty lines and changes comments from // to /* */. Signed-off-by: Petr Baudis <pasky@ucw.cz> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-06[PATCH] Modify git-rev-list to linearise the commit history in merge order.jon@blackcubes.dyndns.org
This patch linearises the GIT commit history graph into merge order which is defined by invariants specified in Documentation/git-rev-list.txt. The linearisation produced by this patch is superior in an objective sense to that produced by the existing git-rev-list implementation in that the linearisation produced is guaranteed to have the minimum number of discontinuities, where a discontinuity is defined as an adjacent pair of commits in the output list which are not related in a direct child-parent relationship. With this patch a graph like this: a4 --- | \ \ | b4 | |/ | | a3 | | | | | a2 | | | | c3 | | | | | c2 | b3 | | | /| | b2 | | | c1 | | / | b1 a1 | | | a0 | | / root Sorts like this: = a4 | c3 | c2 | c1 ^ b4 | b3 | b2 | b1 ^ a3 | a2 | a1 | a0 = root Instead of this: = a4 | c3 ^ b4 | a3 ^ c2 ^ b3 ^ a2 ^ b2 ^ c1 ^ a1 ^ b1 ^ a0 = root A test script, t/t6000-rev-list.sh, includes a test which demonstrates that the linearisation produced by --merge-order has less discontinuities than the linearisation produced by git-rev-list without the --merge-order flag specified. To see this, do the following: cd t ./t6000-rev-list.sh cd trash cat actual-default-order cat actual-merge-order The existing behaviour of git-rev-list is preserved, by default. To obtain the modified behaviour, specify --merge-order or --merge-order --show-breaks on the command line. This version of the patch has been tested on the git repository and also on the linux-2.6 repository and has reasonable performance on both - ~50-100% slower than the original algorithm. This version of the patch has incorporated a functional equivalent of the Linus' output limiting algorithm into the merge-order algorithm itself. This operates per the notes associated with Linus' commit 337cb3fb8da45f10fe9a0c3cf571600f55ead2ce. This version has incorporated Linus' feedback regarding proposed changes to rev-list.c. (see: [PATCH] Factor out filtering in rev-list.c) This version has improved the way sort_first_epoch marks commits as uninteresting. For more details about this change, refer to Documentation/git-rev-list.txt and http://blackcubes.dyndns.org/epoch/. Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-05pretty_print_commit: add different formatsLinus Torvalds
You can ask to print out "raw" format (full headers, full body), "medium" format (author and date, full body) or "short" format (author only, condensed body). Use "git-rev-list --pretty=short HEAD | less -S" for an example.
2005-06-04git-rev-list: allow arbitrary head selections, use git-rev-tree syntaxLinus Torvalds
This makes git-rev-list use the same command line syntax to mark the commits as git-rev-tree does, and instead of just allowing a start and end commit, it allows an arbitrary list of "interesting" and "uninteresting" commits. For example, imagine that you had three branches (a, b and c) that you are interested in, but you don't want to see stuff that already exists in another persons three releases (x, y and z). You can do git-rev-list a b c ^x ^y ^z (order doesn't matter, btw - feel free to put the uninteresting ones first or otherwise swithc them around), and it will show all the commits that are reachable from a/b/c but not reachable from x/y/z. The old syntax "git-rev-list start end" would not be written as "git-rev-list start ^end", or "git-rev-list ^end start". There's no limit to the number of heads you can specify (unlike git-rev-tree, which can handle a maximum of 16 heads).
2005-06-02git-rev-list: split out commit limiting from main() too.Linus Torvalds
Ok, now I'm happier.
2005-06-02git-rev-list: factor out the commit printing from "main()"Linus Torvalds
Functions that do many things are bad. We should basically just parse the arguments in main(). We're not quite there yet, but it's a step in the right direction.
2005-06-01git-rev-list: add "--pretty" command line optionLinus Torvalds
That pretty-prints the resulting commit messages, so git-rev-list --pretty HEAD v2.6.12-rc5 | less -S basically ends up being a log of the changes between -rc5 and current head. It uses the pretty-printing helper function I just extracted from diff-tree.c.
2005-05-31git-rev-list: add "--parents" command line flagLinus Torvalds
It makes rev-list show the list of parents, the same way git-rev-tree does (but without the expense).
2005-05-31git-rev-list: use proper lazy reachability analysisLinus Torvalds
This mean sthat you can give a beginning/end pair to git-rev-list, and it will show all entries that are reachable from the beginning but not the end. For example git-rev-list v2.6.12-rc5 v2.6.12-rc4 shows all commits that are in -rc5 but are not in -rc4.
2005-05-26git-rev-list: add "end" commit and "--header" flagLinus Torvalds
The "end" commit is just faking it right now, it's sorting things purely by date, so this is _not_ a reachability analysis. Some day. The "--header" flag causes the commit message to be printed out, with a NUL character separator after it for parseability. This allows you to do things like use "grep -z" to grep for certain authors etc.