path: root/contrib/diff-highlight
Age | Commit message | Author
2012-02-13 | diff-highlight: document some non-optimal cases | Jeff King

The diff-highlight script works on heuristics, so it can be wrong. Let's document some of the wrong-ness in case somebody feels like working on it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-13 | diff-highlight: match multi-line hunks | Jeff King

Currently we only bother highlighting single-line hunks. The rationale was that the purpose of highlighting is to point out small changes between two similar lines that are otherwise hard to see. However, that meant we missed similar cases where two lines were changed together, like:

  -foo(buf);
  -bar(buf);
  +foo(obj->buf);
  +bar(obj->buf);

Each of those changes is simple, and would benefit from highlighting (the "obj->" parts in this case).

This patch considers whole hunks at a time. For now, we consider only the case where the hunk has the same number of removed and added lines, and assume that the lines from each segment correspond one-to-one. While this is just a heuristic, in practice it seems to generate sensible results (especially because we now omit highlighting on completely-changed lines, so when our heuristic is wrong, we tend to avoid highlighting at all).

Based on an original idea and implementation by Michał Kiedrowicz.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
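The one-to-one pairing described in this commit can be sketched roughly as follows. This is an illustrative Python sketch, not the Perl code from the patch; the function name `pair_lines` is hypothetical, and the input is assumed to be the removed and added lines of a single hunk, already separated.

```python
from typing import List, Optional, Tuple


def pair_lines(removed: List[str], added: List[str]) -> Optional[List[Tuple[str, str]]]:
    """Pair removed and added lines one-to-one (hypothetical helper).

    Returns None when the hunk does not qualify, i.e. when it is empty or
    the number of removed and added lines differs. A caller would then
    leave the hunk unhighlighted.
    """
    if not removed or len(removed) != len(added):
        return None
    # Assume the i-th removed line corresponds to the i-th added line.
    return list(zip(removed, added))


if __name__ == "__main__":
    removed = ["-foo(buf);", "-bar(buf);"]
    added = ["+foo(obj->buf);", "+bar(obj->buf);"]
    for old, new in pair_lines(removed, added) or []:
        print(old, "<->", new)
```

Returning `None` for unequal counts mirrors the commit's "for now" restriction: when the correspondence is unclear, the sketch declines to guess.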
2012-02-13 | diff-highlight: refactor to prepare for multi-line hunks | Jeff King

The current code structure assumes that we will only look at a pair of lines at any given time, and that the end result should always be to output that pair. However, we want to eventually handle multi-line hunks, which will involve collating pairs of removed/added lines.

Let's refactor the code to return highlighted pairs instead of printing them.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
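To see why returning pairs matters, consider what a caller has to do with several highlighted pairs: the output must still list every removed line before any added line, which is impossible if each pair is printed as soon as it is computed. The Python sketch below illustrates that collation step under stated assumptions; `collate`, `render_hunk`, and the `highlight` callback are hypothetical names, not functions from the script.

```python
from typing import Callable, Iterable, List, Tuple

Pair = Tuple[str, str]


def collate(pairs: Iterable[Pair]) -> List[str]:
    """Put (removed, added) pairs back in diff order.

    All removed lines come first, followed by all added lines.
    """
    pairs = list(pairs)
    return [old for old, _ in pairs] + [new for _, new in pairs]


def render_hunk(removed: List[str], added: List[str],
                highlight: Callable[[str, str], Pair]) -> List[str]:
    """Highlight each pair via the supplied callback, then collate.

    `highlight` is assumed to return the (possibly modified) pair rather
    than print it, which is what makes the collation possible.
    """
    pairs = [highlight(old, new) for old, new in zip(removed, added)]
    return collate(pairs)


if __name__ == "__main__":
    # Stand-in highlighter that returns the lines unchanged.
    identity = lambda old, new: (old, new)
    for line in render_hunk(["-a", "-b"], ["+a2", "+b2"], identity):
        print(line)
```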
2012-02-13 | diff-highlight: don't highlight whole lines | Jeff King

If you have a change like:

  -foo
  +bar

we end up highlighting the entirety of both lines (since the whole thing is changed). But the point of diff highlighting is to pinpoint the specific change in a pair of lines that are mostly identical. In this case, the highlighting is just noise, since there is nothing to pinpoint, and we are better off doing nothing.

The implementation looks for "interesting" pairs by checking to see whether they actually have a matching prefix or suffix that does not simply consist of colorization and whitespace. However, the implementation makes it easy to plug in other heuristics, too, like:

  1. Depending on the source material, the set of "boring" characters could be tweaked to include language-specific stuff (like braces or semicolons for C).

  2. Instead of saying "an interesting line has at least one character of prefix or suffix", we could require that less than N percent of the line be highlighted.

The simple "ignore whitespace, and highlight if there are any matched characters" implemented by this patch seems to give good results on git.git. I'll leave experimentation with other heuristics to somebody who has a dataset that does not look good with the current code.

Based on an original idea and implementation by Michał Kiedrowicz.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
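The "interesting pair" check can be sketched as below. This is a simplified Python illustration, not the script's Perl implementation: it assumes uncolored input (so only whitespace counts as "boring"), and it uses `[` and `]` as hypothetical stand-ins for the terminal color codes. All function names are made up for the example.

```python
# Hypothetical markers standing in for the terminal highlight codes.
HL_START = "["
HL_END = "]"


def common_prefix_len(a: str, b: str) -> int:
    """Length of the longest common prefix of a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def common_suffix_len(a: str, b: str, prefix_len: int) -> int:
    """Length of the longest common suffix that does not overlap the prefix."""
    limit = min(len(a), len(b)) - prefix_len
    n = 0
    while n < limit and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


def is_interesting(prefix: str, suffix: str) -> bool:
    """A pair is interesting if the shared prefix or suffix has a non-whitespace character."""
    return bool(prefix.strip()) or bool(suffix.strip())


def mark(sign: str, text: str, p: int, s: int) -> str:
    """Wrap the changed middle of text in markers; leave it alone if the middle is empty."""
    end = len(text) - s
    middle = text[p:end]
    if not middle:
        return sign + text
    return sign + text[:p] + HL_START + middle + HL_END + text[end:]


def highlight_pair(old: str, new: str):
    """old and new are diff lines including their leading '-' or '+'."""
    a, b = old[1:], new[1:]
    p = common_prefix_len(a, b)
    s = common_suffix_len(a, b, p)
    if not is_interesting(a[:p], a[len(a) - s:]):
        return old, new  # nothing to pinpoint, so do nothing
    return mark(old[0], a, p, s), mark(new[0], b, p, s)


if __name__ == "__main__":
    print(highlight_pair("-foo", "+bar"))
    print(highlight_pair("-foo(buf);", "+foo(obj->buf);"))
```

The alternative heuristics listed in the commit message would slot into `is_interesting`, for example by stripping braces and semicolons as well as whitespace, or by comparing the highlighted length against the line length.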
2012-02-13 | diff-highlight: make perl strict and warnings fatal | Jeff King

These perl features can catch bugs, and we shouldn't be violating any of the strict rules or creating any warnings, so let's turn them on.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-18 | contrib: add diff highlight script | Jeff King

This is a simple and stupid script for highlighting differing parts of lines in a unified diff. See the README for a discussion of the limitations.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>