path: root/diff.c
AgeCommit message (Collapse)Author
2007-09-09git-diff: don't squelch the new SHA1 in submodule diffsSven Verdoolaege
The code to squelch empty diffs introduced by commit fb13227e089f22dc31a3b1624559153821056848 would inadvertently populate filespec "two" of a submodule change using the uninitialized (null) SHA1, thereby replacing the submodule SHA1 by 0{40} in the output. This change teaches diffcore_skip_stat_unmatch to handle submodule changes correctly. Signed-off-by: Sven Verdoolaege <> Signed-off-by: Junio C Hamano <>
2007-09-01git-diff: resurrect the traditional empty "diff --git" behaviourJunio C Hamano
The warning message to suggest "Consider running git-status" from "git-diff" that we experimented with during the 1.5.3 cycle turns out to be a bad idea. It robbed cache-dirty information from people who valued it, while still asking users to run "update-index --refresh". It was hoped that the new behaviour would at least have some educational value, but not showing the cache-dirty paths like before meant that the user would not even know easily which paths were cache-dirty, and it made the need to refresh the index look like even more unnecessary chore. This commit reinstates the traditional behaviour, but with a twist. By default, the empty "diff --git" output is totally squelched out from "git diff" output. At the end of the command, it automatically runs "update-index --refresh" as needed, without even bothering the user. In other words, people who do not care about the cache-dirtyness do not even have to see the warning. The traditional behaviour to see the stat-dirty output and to bypassing the overhead of content comparison can be specified by setting the configuration variable diff.autorefreshindex to false. Signed-off-by: Junio C Hamano <>
2007-08-19Take binary diffs into account for "git rebase"Linus Torvalds
We used to not generate a patch ID for binary diffs, but that means that some commits may be skipped as being identical to already-applied diffs when doing a rebase. So just delete the code that skips the binary diff. At the very least, we'd want the filenames to be part of the patch ID, but we might also want to generate some hash for the binary diff itself too. This fixes an issue noticed by Torgil Svensson. Tested-by: Torgil Svensson <> Signed-off-by: Linus Torvalds <> Signed-off-by: Junio C Hamano <>
2007-08-15diff: squelch empty diffs even moreRené Scharfe
When we compare two non-tracked files, or explicitly specify --no-index, the suggestion to run git-status is not helpful. The patch adds a new diff_options bitfield member, no_index, that is used instead of the special value of -2 of the rev_info field max_count to indicate that the index is not to be used. This makes it possible to pass that flag down to diffcore_skip_stat_unmatch(), which only has one diff_options parameter. This could even become a cleanup if we removed all assignments of max_count to a value of -2 (viz. replacement of a magic value with a self-documenting field name) but I didn't dare to do that so late in the rc game.. The no_index bit, if set, then tells diffcore_skip_stat_unmatch() to not account for any skipped stat-mismatches, which avoids the suggestion to run git-status. Signed-off-by: Rene Scharfe <> Signed-off-by: Junio C Hamano <>
2007-08-14git-diff: squelch "empty" diffsJunio C Hamano
After starting to edit a working tree file but later when your edit ends up identical to the original (this can also happen when you ran a wholesale regexp replace with something like "perl -i" that does not actually modify many of the paths), "git diff" between the index and the working tree outputs many "empty" diffs that show "diff --git" headers and nothing else, because these paths are stat-dirty. While it was a way to warn the user that the earlier action of the user made the index ineffective as an optimization mechanism, it was felt too loud for the purpose of warning even to experienced users, and also resulted in confusing people new to git. This replaces the "empty" diffs with a single warning message at the end. Having many such paths hurts performance, and you can run "git-update-index --refresh" to update the lstat(2) information recorded in the index in such a case. "git-status" does so as a side effect, and that is more familiar to the end-user, so we recommend it to them. The change affects only "git diff" that outputs patch text, because that is where the annoyance of too many "empty" diff is most strongly felt, and because the warning message can be safely ignored by downstream tools without getting mistaken as part of the patch. For the low-level "git diff-files" and "git diff-index", the traditional behaviour is retained. Signed-off-by: Junio C Hamano <>
2007-07-26git_mkstemp(): be careful not to overflow the path buffer.Junio C Hamano
If user's TMPDIR is insanely long, return negative after setting errno to ENAMETOOLONG, pretending that the underlying mkstemp() choked on a temporary file path that is too long. Signed-off-by: Junio C Hamano <>
2007-07-08diff.c: make built-in hunk header pattern a separate tableJunio C Hamano
This would hopefully make it easier to maintain. Initially we would have "java" and "tex" defined, as they are the only ones we already have. Signed-off-by: Junio C Hamano <>
2007-07-07diff: honor binariness specified in attributesJunio C Hamano
The code shuffling mistakenly lost binariness specified with the attribute mecahnism and made it always guess from the data. Noticed by Johannes, with two test cases to t4020. Signed-off-by: Junio C Hamano <>
2007-07-07Fix configuration syntax to specify customized hunk header patterns.Junio C Hamano
This updates the hunk header customization syntax. The special case 'funcname' attribute is gone. You assign the name of the type of contents to path's "diff" attribute as a string value in .gitattributes like this: *.java diff=java *.perl diff=perl *.doc diff=doc If you supply "diff.<name>.funcname" variable via the configuration mechanism (e.g. in $HOME/.gitconfig), the value is used as the regexp set to find the line to use for the hunk header (the variable is called "funcname" because such a line typically is the one that has the name of the function in programming language source text). If there is no such configuration, built-in default is used, if any. Currently there are two default patterns: default and java. Signed-off-by: Junio C Hamano <>
2007-07-06Per-path attribute based hunk header selection.Junio C Hamano
This makes"diff -p" hunk headers customizable via gitattributes mechanism. It is based on Johannes's earlier patch that allowed to define a single regexp to be used for everything. The mechanism to arrive at the regexp that is used to define hunk header is the same as other use of gitattributes. You assign an attribute, funcname (because "diff -p" typically uses the name of the function the patch is about as the hunk header), a simple string value. This can be one of the names of built-in pattern (currently, "java" is defined) or a custom pattern name, to be looked up from the configuration file. (in .gitattributes) *.java funcname=java *.perl funcname=perl (in .git/config) [funcname] java = ... # ugly and complicated regexp to override the built-in one. perl = ... # another ugly and complicated regexp to define a new one. Signed-off-by: Junio C Hamano <>
2007-07-06Future-proof source for changes in xdemitconf_tJohannes Schindelin
The instances of xdemitconf_t were initialized member by member. Instead, initialize them to all zero, so we do not have to update those places each time we introduce a new member. [jc: minimally fixed by getting rid of a new global] Signed-off-by: Johannes Schindelin <> Signed-off-by: Junio C Hamano <>
2007-07-06Introduce diff_filespec_is_binary()Junio C Hamano
This replaces an explicit initialization of filespec->is_binary field used for rename/break followed by direct access to that field with a wrapper function that lazily iniaitlizes and accesses the field. We would add more attribute accesses for the use of diff routines, and it would be better to make this abstraction earlier. Signed-off-by: Junio C Hamano <>
2007-07-04Merge branch 'maint'Junio C Hamano
* maint: Document -<n> for git-format-patch glossary: add 'reflog' diff --no-index: fix --name-status with added files Don't smash stack when $GIT_ALTERNATE_OBJECT_DIRECTORIES is too long
2007-07-04Add diff-option --ext-diffJohannes Schindelin
To prevent funky games with external diff engines, git-log and friends prevent external diff engines from being called. That makes sense in the context of git-format-patch or git-rebase. However, for "git log -p" it is not so nice to get the message that binary files cannot be compared, while "git diff" has no problems with them, if you provided an external diff driver. With this patch, "git log --ext-diff -p" will do what you expect, and the option "--no-ext-diff" can be used to override that setting. Signed-off-by: Johannes Schindelin <> Signed-off-by: Junio C Hamano <>
2007-07-03diff --no-index: fix --name-status with added filesJohannes Schindelin
Without this patch, an added file would be reported as /dev/null. Noticed by David Kastrup. Signed-off-by: Johannes Schindelin <> Signed-off-by: Junio C Hamano <>
2007-07-02Merge branch 'jc/diffcore'Junio C Hamano
* jc/diffcore: diffcore-delta.c: Ignore CR in CRLF for text files diffcore-delta.c: update the comment on the algorithm. diffcore_filespec: add is_binary diffcore_count_changes: pass diffcore_filespec
2007-07-01diffcore_filespec: add is_binaryJunio C Hamano
diffcore-break and diffcore-rename would want to behave slightly differently depending on the binary-ness of the data, so add one bit to the filespec, as the structure is now passed down to diffcore_count_changes() function. Signed-off-by: Junio C Hamano <>
2007-06-25diff: round down similarity indexRené Scharfe
Rounding down the printed (dis)similarity index allows us to use "100%" as a special value that indicates complete rewrites and fully equal file contents, respectively. Signed-off-by: Rene Scharfe <> Signed-off-by: Junio C Hamano <>
2007-06-23Finally implement "git log --follow"Linus Torvalds
Ok, I've really held off doing this too damn long, because I'm lazy, and I was always hoping that somebody else would do it. But no, people keep asking for it, but nobody actually did anything, so I decided I might as well bite the bullet, and instead of telling people they could add a "--follow" flag to "git log" to do what they want to do, I decided that it looks like I just have to do it for them.. The code wasn't actually that complicated, in that the diffstat for this patch literally says "70 insertions(+), 1 deletions(-)", but I will have to admit that in order to get to this fairly simple patch, you did have to know and understand the internal git diff generation machinery pretty well, and had to really be able to follow how commit generation interacts with generating patches and generating the log. So I suspect that while I was right that it wasn't that hard, I might have been expecting too much of random people - this patch does seem to be firmly in the core "Linus or Junio" territory. To make a long story short: I'm sorry for it taking so long until I just did it. I'm not going to guarantee that this works for everybody, but you really can just look at the patch, and after the appropriate appreciative noises ("Ooh, aah") over how clever I am, you can then just notice that the code itself isn't really that complicated. All the real new code is in the new "try_to_follow_renames()" function. It really isn't rocket science: we notice that the pathname we were looking at went away, so we start a full tree diff and try to see if we can instead make that pathname be a rename or a copy from some other previous pathname. And if we can, we just continue, except we show *that* particular diff, and ever after we use the _previous_ pathname. One thing to look out for: the "rename detection" is considered to be a singular event in the _linear_ "git log" output! That's what people want to do, but I just wanted to point out that this patch is *not* carrying around a "commit,pathname" kind of pair and it's *not* going to be able to notice the file coming from multiple *different* files in earlier history. IOW, if you use "git log --follow", then you get the stupid CVS/SVN kind of "files have single identities" kind of semantics, and git log will just pick the identity based on the normal move/copy heuristics _as_if_ the history could be linearized. Put another way: I think the model is broken, but given the broken model, I think this patch does just about as well as you can do. If you have merges with the same "file" having different filenames over the two branches, git will just end up picking _one_ of the pathnames at the point where the newer one goes away. It never looks at multiple pathnames in parallel. And if you understood all that, you probably didn't need it explained, and if you didn't understand the above blathering, it doesn't really mtter to you. What matters to you is that you can now do git log -p --follow builtin-rev-list.c and it will find the point where the old "rev-list.c" got renamed to "builtin-rev-list.c" and show it as such. Signed-off-by: Linus Torvalds <> Signed-off-by: Junio C Hamano <>
2007-06-16Move buffer_is_binary() to xdiff-interface.hJohannes Schindelin
We already have two instances where we want to determine if a buffer contains binary data as opposed to text. [jc: cherry-picked 6bfce93e from 'master'] Signed-off-by: Johannes Schindelin <> Signed-off-by: Junio C Hamano <>
2007-06-12Teach diff to imply --find-copies-harder upon -C -CJohannes Schindelin
Earlier, a second "-C" on the command line had no effect. But "--find-copies-harder" is so long to type, let's make doubled -C enable that option. It is in line with how "git blame" handles such doubled options to mean "work harder". Signed-off-by: Johannes Schindelin <> Signed-off-by: Junio C Hamano <>
2007-06-08Even more missing staticJunio C Hamano
Signed-off-by: Junio C Hamano <>
2007-06-07War on whitespaceJunio C Hamano
This uses "git-apply --whitespace=strip" to fix whitespace errors that have crept in to our source files over time. There are a few files that need to have trailing whitespaces (most notably, test vectors). The results still passes the test, and build result in Documentation/ area is unchanged. Signed-off-by: Junio C Hamano <>
2007-06-05Move buffer_is_binary() to xdiff-interface.hJohannes Schindelin
We already have two instances where we want to determine if a buffer contains binary data as opposed to text. Signed-off-by: Johannes Schindelin <> Signed-off-by: Junio C Hamano <>
2007-05-27Merge branch 'maint'Junio C Hamano
* maint: Fix git-svn to handle svn not reporting the md5sum of a file, and test. Fix mishandling of $Id$ expanded in the repository copy in convert.c More echo "$user_message" fixes. Add tests for the last two fixes. git-commit: use printf '%s\n' instead of echo on user-supplied strings git-am: use printf instead of echo on user-supplied strings Documentation: Add definition of "evil merge" to GIT Glossary Replace the last 'dircache's by 'index' Documentation: Clean up links in GIT Glossary
2007-05-26Merge branch 'maint-1.5.1' into maintJunio C Hamano
* maint-1.5.1: Fix git-svn to handle svn not reporting the md5sum of a file, and test. More echo "$user_message" fixes. Add tests for the last two fixes. git-commit: use printf '%s\n' instead of echo on user-supplied strings git-am: use printf instead of echo on user-supplied strings Documentation: Add definition of "evil merge" to GIT Glossary Replace the last 'dircache's by 'index' Documentation: Clean up links in GIT Glossary
2007-05-26Replace the last 'dircache's by 'index'Jakub Narebski
Signed-off-by: Jakub Narebski <> Signed-off-by: Junio C Hamano <>
2007-05-22rename dirlink to gitlink.Martin Waitz
Unify naming of plumbing dirlink/gitlink concept: git ls-files -z '*.[ch]' | xargs -0 perl -pi -e 's/dirlink/gitlink/g;' -e 's/DIRLNK/GITLINK/g;' Signed-off-by: Junio C Hamano <>
2007-05-21Merge branch 'maint-1.5.1' into maintJunio C Hamano
* maint-1.5.1: annotate: make it work from subdirectories. git-config: Correct asciidoc documentation for --int/--bool t1300: Add tests for git-config --bool --get unpack-trees.c: verify_uptodate: remove dead code Use PATH_MAX instead of TEMPFILE_PATH_LEN branch: fix segfault when resolving an invalid HEAD
2007-05-20Use PATH_MAX instead of TEMPFILE_PATH_LENFernando J. Pereda
Signed-off-by: Fernando J. Pereda <> Signed-off-by: Junio C Hamano <>
2007-05-16Ensure return value from xread() is always stored into an ssize_tJohan Herland
This patch fixes all calls to xread() where the return value is not stored into an ssize_t. The patch should not have any effect whatsoever, other than putting better/more appropriate type names on variables. Signed-off-by: Johan Herland <> Signed-off-by: Junio C Hamano <>
2007-05-07diff.c: do not use a separate "size cache".Junio C Hamano
diff_filespec has a slot to record the size of the data already, so make use of it instead of a separate size cache. Signed-off-by: Junio C Hamano <>
2007-05-07diff: release blobs after generating textual diff.Junio C Hamano
This reduces the memory pressure when dealing with many paths. An unscientific test of running "diff-tree --stat --summary -M" between v2.6.19 and v2.6.20-rc1 in the linux kernel repository indicates that the number of minor faults are reduced by 2/3. Signed-off-by: Junio C Hamano <>
2007-05-04Merge branch 'maint'Junio C Hamano
* maint: gitweb: use decode_utf8 directly posix compatibility for t4200 Document 'opendiff' value in config.txt and git-mergetool.txt Allow PERL_PATH="/usr/bin/env perl" Make xstrndup common diff.c: fix "size cache" handling. http-fetch: Disable use of curl multi support for libcurl < 7.16.
2007-05-04diff.c: fix "size cache" handling.Junio C Hamano
We broke the size-cache handling when we changed the function signature of sha1_object_info() in 21666f1a. We obviously wanted to cache the size we obtained when sha1_object_info() succeeded, not when it failed. Signed-off-by: Junio C Hamano <>
2007-04-23Support 'diff=pgm' attributeJunio C Hamano
This enhances the attributes mechanism so that external programs meant for existing GIT_EXTERNAL_DIFF interface can be specifed per path. To configure such a custom diff driver, first define a custom diff driver in the configuration: [diff "my-c-diff"] command = <<your command string comes here>> Then mark the paths that you want to use this custom driver using the attribute mechanism. *.c diff=my-c-diff The intent of this separation is that the attribute mechanism is used for specifying the type of the contents, while the configuration mechanism is used to define what needs to be done to that type of the contents, which would be specific to both platform and personal taste. Signed-off-by: Junio C Hamano <>
2007-04-22Merge branch 'jc/attr'Junio C Hamano
* 'jc/attr': (28 commits) lockfile: record the primary process. convert.c: restructure the attribute checking part. Fix bogus linked-list management for user defined merge drivers. Simplify calling of CR/LF conversion routines Document gitattributes(5) Update 'crlf' attribute semantics. Documentation: support manual section (5) - file formats. Simplify code to find recursive merge driver. Counto-fix in merge-recursive Fix funny types used in attribute value representation Allow low-level driver to specify different behaviour during internal merge. Custom low-level merge driver: change the configuration scheme. Allow the default low-level merge driver to be configured. Custom low-level merge driver support. Add a demonstration/test of customized merge. Allow specifying specialized merge-backend per path. merge-recursive: separate out xdl_merge() interface. Allow more than true/false to attributes. Document git-check-attr Change attribute negation marker from '!' to '-'. ...
2007-04-21Simplify calling of CR/LF conversion routinesAlex Riesen
Signed-off-by: Alex Riesen <> Signed-off-by: Junio C Hamano <>
2007-04-18Fix funny types used in attribute value representationJunio C Hamano
It was bothering me a lot that I abused small integer values casted to (void *) to represent non string values in gitattributes. This corrects it by making the type of attribute values (const char *), and using the address of a few statically allocated character buffer to denote true/false. Unset attributes are represented as having NULLs as their values. Added in-header documentation to explain how git_checkattr() routine should be called. Signed-off-by: Junio C Hamano <>
2007-04-17Allow more than true/false to attributes.Junio C Hamano
This allows you to define three values (and possibly more) to each attribute: true, false, and unset. Typically the handlers that notice and act on attribute values treat "unset" attribute to mean "do your default thing" (e.g. crlf that is unset would trigger "guess from contents"), so being able to override a setting to an unset state is actually useful. - If you want to set the attribute value to true, have an entry in .gitattributes file that mentions the attribute name; e.g. *.o binary - If you want to set the attribute value explicitly to false, use '-'; e.g. *.a -diff - If you want to make the attribute value _unset_, perhaps to override an earlier entry, use '!'; e.g. *.a -diff c.i.a !diff This also allows string values to attributes, with the natural syntax: attrname=attrvalue but you cannot use it, as nobody takes notice and acts on it yet. Signed-off-by: Junio C Hamano <>
2007-04-15Fix 'diff' attribute semantics.Junio C Hamano
This is in the same spirit as the previous one. Earlier 'diff' meant 'do the built-in binary heuristics and disable patch text generation based on it' while '!diff' meant 'do not guess, do not generate patch text'. There was no way to say 'do generate patch text even when the heuristics says it has NUL in it'. Signed-off-by: Junio C Hamano <>
2007-04-15Expose subprojects as special files to "git diff" machineryLinus Torvalds
The same way we generate diffs on symlinks as the the diff of text of the symlink, we can generate subproject diffs (when not recursing into them!) as the diff of the text that describes the subproject. Of course, since what descibes a subproject is just the SHA1, that's what we'll use. Add some pretty-printing to make it a bit more obvious what is going on, and we're done. So with this, we can get both raw diffs and "textual" diffs of subproject changes: - git diff --raw: :160000 160000 2de597b5ad348b7db04bd10cdd38cd81cbc93ab5 0000000... M sub-A - git diff: diff --git a/sub-A b/sub-A index 2de597b..e8f11a4 160000 --- a/sub-A +++ b/sub-A @@ -1 +1 @@ -Subproject commit 2de597b5ad348b7db04bd10cdd38cd81cbc93ab5 +Subproject commit e8f11a45c5c6b9e2fec6d136d3fb5aff75393d42 NOTE! We'll also want to have the ability to recurse into the subproject and actually diff it recursively, but that will involve a new command line option (I'd suggest "--subproject" and "-S", but the latter is in use by pickaxe), and some very different code. But regardless of ay future recursive behaviour, we need the non-recursive version too (and it should be the default, at least in the absense of config options, so that large superprojects don't default to something extremely expensive). Signed-off-by: Linus Torvalds <> Signed-off-by: Junio C Hamano <>
2007-04-14Teach 'diff' about 'diff' attribute.Junio C Hamano
This makes paths that explicitly unset 'diff' attribute not to produce "textual" diffs from 'git-diff' family. Signed-off-by: Junio C Hamano <>
2007-04-05Show binary file size change in diff --statAndy Parkins
Previously, a binary file in the diffstat would show as: some-binary-file.bin | Bin The space after the "Bin" was never used. This patch changes binary lines in the diffstat to be: some-binary-file.bin | Bin 12345 -> 123456 bytes The very nice "->" notation was suggested by Johannes Schindelin, and shows the before and after sizes more clearly than "+" and "-" would. If a size is 0 it's not shown (although it would probably be better to treat no-file differently from zero-byte-file). The user can see what changed in the binary file, and how big the new file is. This is in keeping with the information in the rest of the diffstat. The diffstat_t members "added" and "deleted" were unused when the file was binary, so this patch loads them with the file sizes in builtin_diffstat(). These figures are then read in show_stats() when the file is marked binary. Signed-off-by: Andy Parkins <> Signed-off-by: Junio C Hamano <>
2007-03-14diff --quietJunio C Hamano
This adds the command line option 'quiet' to tell 'git diff-*' that we are not interested in the actual diff contents but only want to know if there is any change. This option automatically turns --exit-code on, and turns off output formatting, as it does not make much sense to show the first hit we happened to have found. The --quiet option is silently turned off (but --exit-code is still in effect, so is silent output) if postprocessing filters such as pickaxe and diff-filter are used. For all practical purposes I do not think of a reason to want to use these filters and not viewing the diff output. The backends have not been taught about the option with this patch. That is a topic for later rounds. Signed-off-by: Junio C Hamano <>
2007-03-14Remove unused diffcore_std_no_resolveJunio C Hamano
This was only used by diff-tree-helper program, whose purpose was to translate a raw diff to a patch. Signed-off-by: Junio C Hamano <>
2007-03-14Allow git-diff exit with codes similar to diff(1)Alex Riesen
This introduces a new command-line option: --exit-code. The diff programs will return 1 for differences, return 0 for equality, and something else for errors. Signed-off-by: Alex Riesen <> Signed-off-by: Junio C Hamano <>
2007-03-11Merge branch 'js/diff-ni'Junio C Hamano
* js/diff-ni: Get rid of the dependency to GNU diff in the tests diff --no-index: support /dev/null as filename diff-ni: fix the diff with standard input diff: support reading a file from stdin via "-"
2007-03-07Cast 64 bit off_t to 32 bit size_tShawn O. Pearce
Some systems have sizeof(off_t) == 8 while sizeof(size_t) == 4. This implies that we are able to access and work on files whose maximum length is around 2^63-1 bytes, but we can only malloc or mmap somewhat less than 2^32-1 bytes of memory. On such a system an implicit conversion of off_t to size_t can cause the size_t to wrap, resulting in unexpected and exciting behavior. Right now we are working around all gcc warnings generated by the -Wshorten-64-to-32 option by passing the off_t through xsize_t(). In the future we should make xsize_t on such problematic platforms detect the wrapping and die if such a file is accessed. Signed-off-by: Shawn O. Pearce <> Signed-off-by: Junio C Hamano <>
2007-03-04diff-ni: fix the diff with standard inputJunio C Hamano
The earlier commit to read from stdin was full of problems, and this corrects them. - The mode bits should have been set to satisify S_ISREG(); we forgot to the S_IFREG bits and hardcoded 0644; - We did not give escape hatch to name a path whose name is really "-". Allow users to say "./-" for that; - Use of xread() was not prepared to see short read (e.g. reading from tty) nor handing read errors. Signed-off-by: Junio C Hamano <>