2010-01-21 Merge branch 'jc/conflict-marker-size' (Junio C Hamano)
* jc/conflict-marker-size:
  rerere: honor conflict-marker-size attribute
  rerere: prepare for customizable conflict marker length
  conflict-marker-size: new attribute
  rerere: use ll_merge() instead of using xdl_merge()
  merge-tree: use ll_merge() not xdl_merge()
  xdl_merge(): allow passing down marker_size in xmparam_t
  xdl_merge(): introduce xmparam_t for merge specific parameters
  git_attr(): fix function signature

Conflicts:
  builtin-merge-file.c
  ll-merge.c
  xdiff/xdiff.h
  xdiff/xmerge.c
2010-01-17 xdl_merge(): allow passing down marker_size in xmparam_t (Junio C Hamano)
This allows the callers of xdl_merge() to pass marker_size (defaults to 7) in xmparam_t argument, to use conflict markers of non-default length. Signed-off-by: Junio C Hamano <>
2010-01-17 xdl_merge(): introduce xmparam_t for merge specific parameters (Junio C Hamano)
So far we have only needed to be able to pass an option that is generic to xdiff family of functions to this function. Extend the interface so that we can give it merge specific parameters. Signed-off-by: Junio C Hamano <>
2009-11-30 git-merge-file --ours, --theirs (Junio C Hamano)
Sometimes people want their conflicting merges autoresolved by favouring upstream changes. The standard answer they are given is to run "git diff --name-only | xargs git checkout MERGE_HEAD --" in such a case. This is to accept automerge results for the paths that are fully resolved automatically, while taking their version of the file in full for paths that have conflicts.

This is problematic on two counts.

One is that this is not exactly what these people want. It discards all changes they did on their branch for any paths that conflicted. They usually want to salvage as much automerge result as possible in a conflicted file, and want to take the upstream change only in the conflicted part.

This patch teaches two new modes of operation to the lowest-level merge machinery, xdl_merge(). Instead of leaving the conflicted lines from both sides enclosed in <<<, ===, and >>> markers, the conflicts are resolved favouring our side or their side of changes.

A larger problem is that this tends to encourage a bad workflow by allowing people to record such a mixed up half-merged result as a full commit without auditing. This commit does not tackle this issue at all. In git, we usually give long enough rope to users with strange wishes as long as the risky features are not enabled by default, and this is such a risky feature.

Signed-off-by: Avery Pennarun <>
Signed-off-by: Junio C Hamano <>
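The two resolution modes described above can be sketched roughly as follows. This is a minimal illustration in Python, not the xdl_merge() code: the segment representation, the `render_merge` name, and the `favor` parameter are assumptions made for this example, and `marker_size` only mirrors the default marker length of 7 mentioned in the newer xdl_merge() entries.

```python
def render_merge(segments, favor=None, marker_size=7):
    """Render merge segments into output lines.

    segments is a hypothetical representation: each item is either
    ("common", lines) or ("conflict", our_lines, their_lines).
    favor is None (emit conflict markers), "ours", or "theirs".
    """
    out = []
    for seg in segments:
        if seg[0] == "common":
            out.extend(seg[1])
        elif favor == "ours":
            # Resolve the conflicted part in favour of our side only.
            out.extend(seg[1])
        elif favor == "theirs":
            # Resolve the conflicted part in favour of their side only.
            out.extend(seg[2])
        else:
            # Default: leave both sides enclosed in conflict markers.
            out.append("<" * marker_size)
            out.extend(seg[1])
            out.append("=" * marker_size)
            out.extend(seg[2])
            out.append(">" * marker_size)
    return out
```

Note that only the conflicted part is taken from the favoured side; cleanly merged parts of the same file are kept, which is the difference from checking out one whole version of the path.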
2009-09-01 Merge branch 'tf/diff-whitespace-incomplete-line' (Junio C Hamano)
* tf/diff-whitespace-incomplete-line:
  xutils: Fix xdl_recmatch() on incomplete lines
  xutils: Fix hashing an incomplete line with whitespaces at the end
2009-08-23 xutils: Fix xdl_recmatch() on incomplete lines (Junio C Hamano)
Thell Fowler noticed that various "ignore whitespace" options to git diff do not work well on an incomplete line.

The loop control of the function responsible for these bugs was extremely difficult to follow. This patch restructures the loops for three variants of "ignore whitespace" logic.

The basic idea of the re-written logic is:

 - A loop runs while the characters from both strings we are looking at match. We declare unmatch immediately when we find something that does not match and return false from the function. We break out of the loop if we ran out of either side of the string. The way we skip spaces inside this loop varies depending on the style of ignoring whitespaces.

 - After the above loop breaks, we know that the parts of the strings we inspected so far match, ignoring the whitespaces. The lines can match only if the remainder consists of nothing but whitespaces. This part of the logic is shared across all three styles.

The new code is more obvious and should be much easier to follow.

Tested-by: Thell Fowler <>
Signed-off-by: Junio C Hamano <>
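The two-phase structure described above (a style-specific matching loop, then a shared "remainder must be blank" check) can be sketched like this. It is an illustrative Python model under stated assumptions, not the C code of xdl_recmatch(): the function name, the `mode` strings, and the whitespace set are all hypothetical, and in the "eol" style the loop simply stops at the first difference and leaves the verdict to the shared tail check.

```python
WS = " \t\r\n"  # assumption: characters treated as whitespace in this sketch


def lines_match(a, b, mode):
    """Return True if lines a and b match under the given style.

    mode is "all" (ignore all whitespace), "change" (ignore changes in
    the amount of whitespace) or "eol" (ignore whitespace at line end).
    """
    i = j = 0
    if mode == "all":
        while True:
            # Skip any whitespace on either side before comparing.
            while i < len(a) and a[i] in WS:
                i += 1
            while j < len(b) and b[j] in WS:
                j += 1
            if i == len(a) or j == len(b):
                break
            if a[i] != b[j]:
                return False
            i += 1
            j += 1
    elif mode == "change":
        while i < len(a) and j < len(b):
            if a[i] in WS and b[j] in WS:
                # A run of whitespace matches any other run of whitespace.
                while i < len(a) and a[i] in WS:
                    i += 1
                while j < len(b) and b[j] in WS:
                    j += 1
                continue
            if a[i] != b[j]:
                return False
            i += 1
            j += 1
    elif mode == "eol":
        while i < len(a) and j < len(b) and a[i] == b[j]:
            i += 1
            j += 1
    else:
        raise ValueError("unknown mode: " + mode)
    # Shared tail check: whatever is left on either side must be blank.
    # This works the same whether or not the line ends with a newline.
    return all(c in WS for c in a[i:]) and all(c in WS for c in b[j:])
```

Because the tail check never assumes a trailing LF, an incomplete last line such as "foo  " compares equal to "foo" in all three styles.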
2009-08-23 xutils: Fix hashing an incomplete line with whitespaces at the end (Junio C Hamano)
Upon seeing a whitespace, xdl_hash_record_with_whitespace() first skipped the run of whitespaces (excluding LF) that begins there, ensuring that the pointer points at the last whitespace character in the run, and assumed that the next character must be LF at the end of the line. This does not work when hashing an incomplete line, which lacks the LF at the end. Introduce "at_eol" variable that is true when either we are at the end of line (looking at LF) or at the end of an incomplete line, and use that instead throughout the code. Noticed by Thell Fowler. Signed-off-by: Junio C Hamano <>
2009-07-23 refactor: use bitsizeof() instead of 8 * sizeof() (Pierre Habouzit)
Signed-off-by: Pierre Habouzit <> Signed-off-by: Junio C Hamano <>
2009-06-02 Merge branch 'cb/maint-1.6.0-xdl-merge-fix' into maint (Junio C Hamano)
* cb/maint-1.6.0-xdl-merge-fix:
  Change xdl_merge to generate output even for null merges
  t6023: merge-file fails to output anything for a degenerate merge

Conflicts:
  xdiff/xmerge.c
2009-05-25 Change xdl_merge to generate output even for null merges (Charles Bailey)
xdl_merge used to have a check to ensure that there was at least some change in one or other side being merged but this suppressed output for the degenerate case when base, local and remote contents were all identical. Removing this check enables correct output in the degenerate case and xdl_free_script handles freeing NULL scripts so there is no need to have the check for these calls. Signed-off-by: Charles Bailey <> Signed-off-by: Junio C Hamano <>
2009-04-23 Fix typos / spelling in comments (Mike Ralphson)
Signed-off-by: Mike Ralphson <> Signed-off-by: Junio C Hamano <>
2009-03-16 Fix various dead stores found by the clang static analyzer (Benjamin Kramer)
http-push.c::finish_request():
  request is initialized by the for loop
index-pack.c::free_base_data():
  b is initialized by the for loop
merge-recursive.c::process_renames():
  move compare to narrower scope, and remove unused assignments to it
  remove unused variable renames2
xdiff/xdiffi.c::xdl_recs_cmp():
  remove unused variable ec
xdiff/xemit.c::xdl_emit_diff():
  xche is always overwritten

Signed-off-by: Benjamin Kramer <>
Signed-off-by: Junio C Hamano <>
2009-02-06 Merge branch 'kc/maint-diff-bwi-fix' into maint (Junio C Hamano)
* kc/maint-diff-bwi-fix:
  Fix combined use of whitespace ignore options to diff
  test more combinations of ignore-whitespace options to diff
2009-01-24 Merge branch 'js/patience-diff' (Junio C Hamano)
* js/patience-diff:
  bash completions: Add the --patience option
  Introduce the diff option '--patience'
  Implement the patience diff algorithm

Conflicts:
  contrib/completion/git-completion.bash
2009-01-22 Merge branch 'kc/maint-diff-bwi-fix' (Junio C Hamano)
* kc/maint-diff-bwi-fix: Fix combined use of whitespace ignore options to diff
2009-01-20 Fix combined use of whitespace ignore options to diff (Keith Cascio)
The code used to misbehave when options to ignore certain whitespaces (-w, -b and --ignore-space-at-eol) were combined.

Signed-off-by: Keith Cascio <>
Signed-off-by: Junio C Hamano <>
2009-01-07 Implement the patience diff algorithm (Johannes Schindelin)
The patience diff algorithm produces slightly more intuitive output than the classic Myers algorithm, as it does not try to minimize the number of +/- lines first, but tries to preserve the lines that are unique. To this end, it first determines lines that are unique in both files, then the maximal sequence which preserves the order (relative to both files) is extracted. Starting from this initial set of common lines, the rest of the lines is handled recursively, with Myers' algorithm as a fallback when the patience algorithm fails (due to no common unique lines). This patch includes memory leak fixes by Pierre Habouzit. Signed-off-by: Johannes Schindelin <> Signed-off-by: Junio C Hamano <>
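The first two steps described above (find lines unique in both files, then extract the longest sequence that keeps their order in both) can be sketched as below. This is a minimal Python illustration, not the xdiff implementation: the function name is hypothetical, the recursion into the gaps between anchors and the Myers fallback are left out, and the longest increasing subsequence is found with the usual patience-sorting approach using `bisect`.

```python
from bisect import bisect_left
from collections import Counter


def patience_anchors(a, b):
    """Return (index_in_a, index_in_b) pairs for the anchor lines.

    Anchors are lines that occur exactly once in a and exactly once in b,
    restricted to the longest subset whose order agrees in both files.
    """
    count_a, count_b = Counter(a), Counter(b)
    pos_b = {line: j for j, line in enumerate(b) if count_b[line] == 1}
    # Candidate pairs, listed in the order they appear in a.
    pairs = [(i, pos_b[line]) for i, line in enumerate(a)
             if count_a[line] == 1 and line in pos_b]

    # Longest increasing subsequence over the b-indices.
    tails = []     # tails[k]: index into pairs ending the best run of length k+1
    tail_js = []   # b-index of each entry in tails, kept sorted for bisect
    prev = [None] * len(pairs)
    for idx, (_, j) in enumerate(pairs):
        k = bisect_left(tail_js, j)
        prev[idx] = tails[k - 1] if k > 0 else None
        if k == len(tails):
            tails.append(idx)
            tail_js.append(j)
        else:
            tails[k] = idx
            tail_js[k] = j

    # Walk the back-pointers from the end of the longest run.
    anchors = []
    idx = tails[-1] if tails else None
    while idx is not None:
        anchors.append(pairs[idx])
        idx = prev[idx]
    anchors.reverse()
    return anchors
```

An empty result corresponds to the "no common unique lines" case in which the commit message says the algorithm falls back to Myers.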
2008-12-29 diff: add option to show context between close hunks (René Scharfe)
Merge two hunks if there is only the specified number of otherwise unshown context between them. For --inter-hunk-context=1, the resulting patch has the same number of lines but shows uninterrupted context instead of a context header line in between. Patches generated with this option are easier to read but are also more likely to conflict if the file to be patched contains other changes. This patch keeps the default for this option at 0. It is intended to just make the feature available in order to see its advantages and downsides. Signed-off-by: Rene Scharfe <> Signed-off-by: Junio C Hamano <>
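The merging rule can be sketched as a simple grouping pass. This is an illustrative Python model with hypothetical names, assuming that two changes end up in the same hunk when the unchanged lines between them, minus the context shown on both sides, do not exceed the inter-hunk setting.

```python
def group_hunks(changes, context=3, inter_hunk_context=0):
    """Group changes into hunks.

    changes is a sorted list of (start, end) half-open ranges of changed
    line numbers. Two neighbouring changes share a hunk when the gap
    between them is at most 2 * context + inter_hunk_context lines.
    """
    max_common = 2 * context + inter_hunk_context
    hunks = []
    for start, end in changes:
        if hunks and start - hunks[-1][-1][1] <= max_common:
            hunks[-1].append((start, end))   # close enough: extend the hunk
        else:
            hunks.append([(start, end)])     # too far apart: start a new hunk
    return hunks
```

With three lines of context, changes at lines 10 and 18 are seven unchanged lines apart, so one line would stay unshown; `group_hunks([(10, 11), (18, 19)], inter_hunk_context=1)` puts them in one hunk, while the default of 0 keeps them separate.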
2008-12-03 xdiff: give up scanning similar lines early (Davide Libenzi)
In a corner case of large files whose lines do not match uniquely, the loop to eliminate a line that matches multiple locations adjacent to a run of lines that do not uniquely match wasted too many cycles. Fix this by giving up early after scanning 100 lines in both directions.

Signed-off-by: Junio C Hamano <>
2008-11-13 Merge branch 'dl/xdiff' (Junio C Hamano)
* dl/xdiff: xdiff: give up scanning similar lines early
2008-10-25 Allow alternate "low-level" emit function from xdl_diff (Brian Downing)
For some users (e.g. git blame), getting textual patch output is just extra work, as they can get all the information they need from the low-level diff structures. Allow for an alternate low-level emit function to be defined to allow bypassing the textual patch generation; set xemitconf_t's emit_func member to enable this.

The (void (*)()) type is pretty ugly, but the alternative would be to include most of the private xdiff headers in xdiff.h to get the types required for the "proper" function prototype. Also, a (void *) won't work, as ANSI C doesn't allow a function pointer to be cast to an object pointer.

Signed-off-by: Brian Downing <>
Signed-off-by: Junio C Hamano <>
2008-08-31 xmerge.c: "diff3 -m" style clips merge reduction level to EAGER or less (Junio C Hamano)
When showing a conflicting merge result, and "diff3 -m" style is asked for, this patch makes sure that the merge reduction level does not exceed XDL_MERGE_EAGER. This is because "diff3 -m" style output would not make sense for anything more aggressive than XDL_MERGE_EAGER, because of the way the merge reduction works. "git merge-file" no longer has to force MERGE_EAGER when "--diff3" is asked for because of this change.

Suppose a common ancestor (shared preimage) is modified to postimage #1 and #2 (each letter represents one line):

                     #####
    postimage#1: 1234ABCDE789
                     |    /
                     |   /
    preimage:    123456789
                     |   \
    postimage#2: 1234AXYE789
                     ####

XDL_MERGE_MINIMAL and XDL_MERGE_EAGER would:

 (1) find the s/56/ABCDE/ done on one side and s/56/AXYE/ done on the other side,

 (2) notice that they touch an overlapping area, and

 (3) mark it as a conflict, "ABCDE vs AXYE".

The difference between the two algorithms is that EAGER drops the hunk altogether if the postimages match (i.e. both sides modified the same way), while MINIMAL keeps it. There is no other operation performed to the hunk. As the result, lines marked with "#" in the above picture will be in the RCS merge style output like this (letters <, = and > represent conflict marker lines):

    output: 1234<ABCDE=AXYE>789 ; with MINIMAL/EAGER

The part from the preimage that corresponds to these conflicting changes is "56", which is what "diff3 -m" style output adds to it:

    output: 1234<ABCDE|56=AXYE>789 ; in "diff3 -m" style

Now, XDL_MERGE_ZEALOUS looks at the differences between the changes two postimages made in order to reduce the number of lines in the conflicting regions. It notices that both sides start their new contents with "A", and excludes it from the output (it also excludes "E" for the same reason). The conflict that used to be "ABCDE vs AXYE" is now "BCD vs XY":

    output: 1234A<BCD=XY>E789 ; with ZEALOUS

There could even be matching parts between two postimages in the middle. Instead of one side rewriting the shared "56" to "ABCDE" and the other side to "AXYE", imagine the case where the postimages are "ABCDE" and "AXCYE", in which case instead of having one conflicted hunk "BCD vs XY", you would have two conflicting hunks "B vs X" and "D vs Y".

In either case, once you reduce "ABCDE vs AXYE" to "BCD vs XY" (or "ABCDE vs AXCYE" to "B vs X" and "D vs Y"), there is no part from the preimage that corresponds to the conflicting change made in both postimages anymore. In other words, a conflict reduced by the ZEALOUS algorithm cannot be expressed in "diff3 -m" style. Representing the last illustration like this is misleading to say the least:

    output: 1234A<BCD|56=XY>E789 ; broken "diff3 -m" style

because the preimage was not ...4A56E... to begin with. "A" and "E" are common only between the postimages.

Even worse, once a single conflicting hunk is split into multiple ones (recall the example of breaking "ABCDE vs AXCYE" to "B vs X" and "D vs Y"), there is no sane way to distribute the preimage text across split conflicting hunks.

Signed-off-by: Junio C Hamano <>
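The "ABCDE vs AXYE" reduction above can be reproduced with a small sketch. It is a deliberately simplified Python model, not XDL_MERGE_ZEALOUS itself: it only strips a common prefix and suffix from the two postimages, so it does not split a hunk on matches in the middle (the "AXCYE" case), and the function names are hypothetical. Each character stands for one line, as in the message.

```python
def trim_common(ours, theirs):
    """Split two conflicting postimages into (prefix, ours, theirs, suffix)."""
    limit = min(len(ours), len(theirs))
    p = 0
    while p < limit and ours[p] == theirs[p]:
        p += 1                      # length of the common leading part
    s = 0
    while s < limit - p and ours[len(ours) - 1 - s] == theirs[len(theirs) - 1 - s]:
        s += 1                      # length of the common trailing part
    return (ours[:p],
            ours[p:len(ours) - s],
            theirs[p:len(theirs) - s],
            ours[len(ours) - s:])


def render(before, ours, theirs, after, zealous):
    """Render one conflict using <, = and > as one-letter marker lines."""
    if zealous:
        prefix, ours, theirs, suffix = trim_common(ours, theirs)
    else:
        prefix, suffix = "", ""
    return before + prefix + "<" + ours + "=" + theirs + ">" + suffix + after
```

After trimming, the conflicted parts "BCD" and "XY" no longer line up with the preimage text "56", which is the reason the message gives for not combining this reduction with "diff3 -m" style output.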
2008-08-31 xmerge.c: minimum readability fixups (Junio C Hamano)
This replaces hardcoded magic constants with symbolic ones for readability, and swaps one if/else block to better match the order in which 0/1/2 variables are handled in the nearby codepath.

Signed-off-by: Junio C Hamano <>
2008-08-31 xdiff-merge: optionally show conflicts in "diff3 -m" style (Junio C Hamano)
When showing conflicting merges, we traditionally followed RCS's merge output format. The output shows:

    <<<<<<<
    postimage from one side;
    =======
    postimage of the other side; and
    >>>>>>>

Some people find it easier to be able to understand what is going on when they can view the common ancestor's version, which is used by "diff3 -m", which shows:

    <<<<<<<
    postimage from one side;
    |||||||
    shared preimage;
    =======
    postimage of the other side; and
    >>>>>>>

This is an initial step to bring that as an optional feature to git. Only "git merge-file" has been converted, with "--diff3" option.

Signed-off-by: Junio C Hamano <>
2008-08-31 xdl_fill_merge_buffer(): separate out a too deeply nested function (Junio C Hamano)
This simply moves code around to make a separate function that prepares a single conflicted hunk with markers into the buffer. Signed-off-by: Junio C Hamano <>
2008-02-18 xdl_merge(): introduce XDL_MERGE_ZEALOUS_ALNUM (Johannes Schindelin)
When a merge conflicts, there are often common lines that are not really common, such as empty lines or lines containing a single curly bracket.

With XDL_MERGE_ZEALOUS_ALNUM, we use the following heuristics: when a hunk does not contain any letters or digits, it is treated as conflicting.

In other words, a conflict which used to look like this:

    <<<<<<<
    a = 1;
    =======
    output();
    >>>>>>>
    }
    }
    }
    <<<<<<<
    output();
    =======
    b = 1;
    >>>>>>>

will look like this with ZEALOUS_ALNUM:

    <<<<<<<
    a = 1;
    }
    }
    }
    output();
    =======
    output();
    }
    }
    }
    b = 1;
    >>>>>>>

To demonstrate this, git-merge-file has been switched from XDL_MERGE_ZEALOUS to XDL_MERGE_ZEALOUS_ALNUM.

Signed-off-by: Johannes Schindelin <>
Signed-off-by: Junio C Hamano <>
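The heuristic can be sketched as a pass over alternating common and conflicting segments. This is an illustrative Python model with a hypothetical segment representation and function names, not the xmerge.c code: a common run sitting between two conflicts is folded into them when none of its lines contains a letter or digit.

```python
def has_alnum(lines):
    """True if any line contains at least one letter or digit."""
    return any(ch.isalnum() for line in lines for ch in line)


def coalesce_conflicts(segments):
    """Fold alnum-free common runs into the conflicts around them.

    Each segment is ("common", lines) or ("conflict", ours, theirs).
    """
    out = []
    for seg in segments:
        if (seg[0] == "conflict" and len(out) >= 2
                and out[-1][0] == "common" and out[-2][0] == "conflict"
                and not has_alnum(out[-1][1])):
            common = out.pop()[1]
            _, ours, theirs = out.pop()
            # The shared lines are repeated on both sides of the new conflict.
            out.append(("conflict",
                        ours + common + seg[1],
                        theirs + common + seg[2]))
        else:
            out.append(seg)
    return out
```

Feeding it the example from the message (two conflicts separated by three "}" lines) yields the single combined conflict shown above, while a common line such as "return x;" keeps the two conflicts apart.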
2008-02-18 xdl_merge(): make XDL_MERGE_ZEALOUS output simpler (Johannes Schindelin)
When a merge conflicts, there are often fewer than three common lines between two conflicting regions.

Since a conflict takes up as many lines as are conflicting, plus three lines for the conflict markers, the output will be shorter (and thus, simpler) in this case, if the common lines will be merged into the conflicting regions.

This patch merges up to three common lines into the conflicts.

For example, what looked like this before this patch:

    <<<<<<<
    if (a == 1)
    =======
    if (a != 0)
    >>>>>>>
    {
        int i;
    <<<<<<<
        a = 0;
    =======
        a = !a;
    >>>>>>>

will now look like this:

    <<<<<<<
    if (a == 1)
    {
        int i;
        a = 0;
    =======
    if (a != 0)
    {
        int i;
        a = !a;
    >>>>>>>

Suggested by Linus (based on ideas by "Voltage Spike" -- if that name is real, it is mighty cool).

Signed-off-by: Johannes Schindelin <>
Signed-off-by: Junio C Hamano <>
2007-11-16 Remove unreachable statements (Guido Ostkamp)
Solaris Workshop Compiler found a few unreachable statements. Signed-off-by: Guido Ostkamp <> Signed-off-by: Junio C Hamano <>
2007-07-06 Per-path attribute based hunk header selection. (Junio C Hamano)
This makes "diff -p" hunk headers customizable via gitattributes mechanism. It is based on Johannes's earlier patch that allowed defining a single regexp to be used for everything.

The mechanism to arrive at the regexp that is used to define hunk header is the same as other use of gitattributes. You assign an attribute, funcname (because "diff -p" typically uses the name of the function the patch is about as the hunk header), a simple string value. This can be one of the names of built-in patterns (currently, "java" is defined) or a custom pattern name, to be looked up from the configuration file.

    (in .gitattributes)
    *.java   funcname=java
    *.perl   funcname=perl

    (in .git/config)
    [funcname]
      java = ... # ugly and complicated regexp to override the built-in one.
      perl = ... # another ugly and complicated regexp to define a new one.

Signed-off-by: Junio C Hamano <>
2007-06-08 Missing statics. (Pierre Habouzit)
Signed-off-by: Pierre Habouzit <> Signed-off-by: Junio C Hamano <>
2007-06-07 War on whitespace (Junio C Hamano)
This uses "git-apply --whitespace=strip" to fix whitespace errors that have crept in to our source files over time. There are a few files that need to have trailing whitespaces (most notably, test vectors). The result still passes the tests, and the build result in the Documentation/ area is unchanged.

Signed-off-by: Junio C Hamano <>
2007-03-20 xdiff/xutils.c(xdl_hash_record): factor out whitespace handling (Johannes Schindelin)
Since in at least one use case, xdl_hash_record() takes over 15% of the CPU time, it makes sense to even micro-optimize it. For many cases, no whitespace special handling is needed, and in these cases we should not even bother to check for whitespace in _every_ iteration of the loop. Signed-off-by: Johannes Schindelin <> Signed-off-by: Junio C Hamano <>
2007-02-14 teach diff machinery about --ignore-space-at-eol (Johannes Schindelin)
`git diff --ignore-space-at-eol` will ignore whitespace at the line ends. Signed-off-by: Johannes Schindelin <> Signed-off-by: Junio C Hamano <>
2006-12-31 Fix yet another subtle xdl_merge() bug (Johannes Schindelin)
In very obscure cases, a merge can hit an unexpected code path (where the original code went as far as saying that this was a bug). This failing merge was noticed by Alexandre Julliard.

The problem is that the original file contains something like this:

    -- snip --
    two non-empty lines before
    two empty lines


    after two empty lines
    -- snap --

and this snippet is reduced to _one_ empty line in _both_ new files. However, it is ambiguous as to which hunk takes the empty line: the first or the second one?

Indeed in Alexandre's example files, the xdiff algorithm attributes the empty line to the first hunk in one case, and to the second hunk in the other case. (Trimming down the example files _changes_ that behaviour!)

Thus, the call to xdl_merge_cmp_lines() has no chance to realize that the change is actually identical in both new files. Therefore, xdl_refine_conflicts() finds an empty diff script, which was not expected there, because (the original author of xdl_merge() thought) xdl_merge_cmp_lines() would catch that case earlier.

Signed-off-by: Johannes Schindelin <>
Signed-off-by: Junio C Hamano <>
2006-12-28 xdl_merge(): fix a segmentation fault when refining conflicts (Johannes Schindelin)
The function xdl_refine_conflicts() tries to break down huge conflicts by doing a diff on the conflicting regions. However, this does not make sense when one side is empty. Worse, when one side is not only empty, but after EOF, the code accessed unmapped memory. Noticed by Luben Tuikov, Shawn Pearce and Alexandre Julliard, the latter providing a test case. Signed-off-by: Johannes Schindelin <> Signed-off-by: Junio C Hamano <>
2006-12-13 Merge branch 'master' into js/merge (Junio C Hamano)
* master: (42 commits)
2006-12-05 xdl_merge(): fix and simplify conflict handling (Johannes Schindelin)
Suppose you have changes in new1 to the original lines 10-20, and changes in new2 to the original lines 15-25, then the changes to 10-25 conflict. But it is possible that the next changes in new1 still overlap with this change to new2. So, in the next iteration we have to look at the same change to new2 again. The old code tried to be a bit too clever. The new code is shorter and more to the point: do not fiddle with the ranges at all. Also, xdl_append_merge() tries harder to combine conflicts. This is necessary, because with the above simplification, some conflicts would not be recognized as conflicts otherwise: In the above scenario, it is possible that there is no other change to new1. Absent the combine logic, the change in new2 would be recorded _again_, but as a non-conflict. Signed-off-by: Johannes Schindelin <>
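The scenario above (10-20 on one side, 15-25 on the other, conflicting over 10-25, with the longer-lived change re-examined against the next change from the other side) can be sketched as an interval walk. This is a simplified Python illustration with hypothetical names, working on inclusive ranges of original line numbers only; it is not the xdl_merge() code and ignores the changed content.

```python
def find_conflicts(changes1, changes2):
    """Return merged conflict ranges between two sorted lists of changes.

    Each change is an inclusive (first, last) range of original lines.
    """
    conflicts = []
    i = j = 0
    while i < len(changes1) and j < len(changes2):
        a, b = changes1[i], changes2[j]
        if a[1] < b[0]:
            i += 1                  # a lies entirely before b: no overlap
        elif b[1] < a[0]:
            j += 1                  # b lies entirely before a: no overlap
        else:
            span = (min(a[0], b[0]), max(a[1], b[1]))
            if conflicts and conflicts[-1][1] >= span[0]:
                # Combine with the previous conflict instead of recording
                # the same change a second time.
                conflicts[-1] = (conflicts[-1][0],
                                 max(conflicts[-1][1], span[1]))
            else:
                conflicts.append(span)
            # Advance only the side that ends first; the other change may
            # still overlap the next change from this side.
            if a[1] < b[1]:
                i += 1
            else:
                j += 1
    return conflicts
```

For example, `find_conflicts([(10, 20), (24, 30)], [(15, 25)])` looks at the 15-25 change twice and reports one combined conflict, (10, 30).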
2006-12-05 diff -b: ignore whitespace at end of line (Johannes Schindelin)
This is _not_ the same as "treat eol as whitespace", since that would mean that multiple empty lines would be treated as equal to e.g. a space. Signed-off-by: Johannes Schindelin <> Signed-off-by: Junio C Hamano <>
2006-12-03 xdl_merge(): fix thinko (Johannes Schindelin)
If one side's block (of changed lines) ends later than the other side's block, the former should be tested against the next block of the other side, not vice versa. Signed-off-by: Johannes Schindelin <> Signed-off-by: Junio C Hamano <>
2006-12-03 xdl_merge(): fix an off-by-one bug (Johannes Schindelin)
The line range is i1 .. (i1 + chg1 - 1), not i1 .. (i1 + chg1). Signed-off-by: Johannes Schindelin <> Signed-off-by: Junio C Hamano <>
2006-12-03 xmerge: make return value from xdl_merge() more usable. (Junio C Hamano)
The callers would want to know if the resulting merge is clean; do not discard that information after calling xdl_do_merge().

Signed-off-by: Junio C Hamano <>
2006-12-03 xdiff: add xdl_merge() (Johannes Schindelin)
This new function implements the functionality of RCS merge, but in-memory. It returns < 0 on error, otherwise the number of conflicts.

Finding the conflicting lines can be a very expensive task. You can control the eagerness of this algorithm:

 - a level value of 0 means that all overlapping changes are treated as conflicts,

 - a value of 1 means that if the overlapping changes are identical, it is not treated as a conflict.

 - If you set level to 2, overlapping changes will be analyzed, so that almost identical changes will not result in huge conflicts. Rather, only the conflicting lines will be shown inside conflict markers.

With each increasing level, the algorithm gets slower, but more accurate. Note that the code for level 2 depends on the simple definition of mmfile_t specific to git, and therefore it will be harder to port that to LibXDiff.

Signed-off-by: Johannes Schindelin <>
Signed-off-by: Junio C Hamano <>
2006-11-24 Increase length of function name buffer (Andy Parkins)
In xemit.c:xdl_emit_diff() a buffer for showing the function name as commentary is allocated; this buffer was 40 characters. This is a bit small; particularly for C++ function names where there is often an identical prefix (like void LongNamespace::LongClassName) on multiple functions, which makes the context the same everywhere. In other words the context is useless.

This patch increases that buffer to 80 characters - which may still not be enough, but is better.

Signed-off-by: Andy Parkins <>
Signed-off-by: Junio C Hamano <>
2006-10-25 xdiff: Match GNU diff behaviour when deciding hunk comment worthiness of lines (Petr Baudis)
This removes the '#' and '(' tests and adds a '$' test instead although I have no idea what it is actually good for - but hey, if that's what GNU diff does... Pasky only went and did as Junio sayeth. Signed-off-by: Petr Baudis <> Signed-off-by: Junio C Hamano <>
2006-10-23 xdiff/xemit.c (xdl_find_func): Elide trailing white space in a context header. (Jim Meyering)
This removes trailing blanks from git-generated diff headers the same way a similar patch did that for GNU diff: That is, it removes trailing blanks on the hunk header line that shows the function name. Signed-off-by: Jim Meyering <> Signed-off-by: Junio C Hamano <>
2006-10-17 Merge branch 'maint' (Junio C Hamano)
* maint: Fix hash function in xdiff library
2006-10-17 Fix hash function in xdiff library [v1.4.2.4] (Linus Torvalds)
Jim Meyering noticed that the xdiff library took an insanely long time when comparing files with many identical lines. This was because the hash function used in the library is broken on 64-bit architectures and caused too many collisions.

Acked-by: Davide Libenzi <>
Signed-off-by: Junio C Hamano <>
2006-10-12 diff: fix 2 whitespace issues (Johannes Schindelin)
When whitespace or whitespace change was ignored, the function xdl_recmatch() returned memcmp() style differences, which is wrong, since it should return 0 on non-match. Also, there were three horrible off-by-one bugs, even leading to wrong hashes in the whitespace special handling. The issue was noticed by Ray Lehtiniemi. For good measure, this commit adds a test. Signed-off-by: Johannes Schindelin <> Signed-off-by: Junio C Hamano <>
2006-07-13 Merge branch 'lt/merge-tree' (Junio C Hamano)
* lt/merge-tree:
  Improved three-way blob merging code
  Prepare "git-merge-tree" for future work
  xdiff: generate "anti-diffs" aka what is common to two files