path: root/t/t4018
Age  Commit message  Author
2021-04-21  Merge branch 'ab/userdiff-tests'  Junio C Hamano
A bit of code clean-up and a lot of test clean-up around userdiff area.

* ab/userdiff-tests:
  blame tests: simplify userdiff driver test
  blame tests: don't rely on t/t4018/ directory
  userdiff: remove support for "broken" tests
  userdiff tests: list builtin drivers via test-tool
  userdiff tests: explicitly test "default" pattern
  userdiff: add and use for_each_userdiff_driver()
  userdiff style: normalize pascal regex declaration
  userdiff style: declare patterns with consistent style
  userdiff style: re-order drivers in alphabetical order
2021-04-08  userdiff: add support for Scheme  Atharva Raykar
Add a diff driver for Scheme-like languages which recognizes top level and local `define` forms, whether they are function definitions, bindings, syntax definitions or user-defined `define-xyzzy` forms. It also supports R6RS `library` forms and `module` forms, along with the class and struct declarations used in Racket (PLT Scheme). Alternate "def" syntax such as that in Gerbil Scheme is also supported, like defstruct, defsyntax and so on.

The rationale for picking `define` forms for the hunk headers is that they are usually the only significant forms for defining the structure of the program, and it is a common pattern for schemers to have local function definitions to hide their visibility, so it is not only the top level `define`s that are of interest. Schemers also extend the language with macros to provide their own define forms (for example, something like a `define-test-suite`), which are also captured in the hunk header.

Since it is common practice to extend syntax with variants of a form like `module+`, `class*` etc, those are supported as well.

The word regex is a best-effort attempt to conform to R7RS[1] valid identifiers, symbols and numbers.

[1] https://small.r7rs.org/attachment/r7rs.pdf (section 2.1)

Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
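The forms listed in the Scheme entry above can be illustrated with a small sketch. It uses Python's `re` module rather than the POSIX regex syntax git uses, and the pattern is a simplified approximation written for this illustration, not the text found in userdiff.c; the names `SCHEME_HEADER` and `is_scheme_header` are hypothetical.

```python
import re

# Simplified approximation of the forms named in the commit message.
# This is an illustration only, not the pattern from userdiff.c.
SCHEME_HEADER = re.compile(
    r"^[ \t]*\("
    r"((define|def(struct|syntax)?)[-*/ \t]"    # define, define-xyzzy, defstruct, ...
    r"|(library|module|struct|class)[*+ \t])"   # library, module+, class*, ...
)


def is_scheme_header(line):
    """Return True if this sketch would pick the line as a hunk header."""
    return SCHEME_HEADER.search(line) is not None


if __name__ == "__main__":
    for sample in ["(define (square x)", "  (define-test-suite math", "(display x)"]:
        print(repr(sample), is_scheme_header(sample))
```

Requiring a delimiter such as `-`, `*` or a blank after the keyword is what keeps an ordinary call like `(defined? x)` from being mistaken for a definition.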
2021-04-08  userdiff: remove support for "broken" tests  Ævar Arnfjörð Bjarmason
There have been no "broken" tests since 75c3b6b2e8 (userdiff: improve Fortran xfuncname regex, 2020-08-12). Let's remove the test support for them.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-02  Merge branch 've/userdiff-bash'  Junio C Hamano
The userdiff pattern learned to identify the function definition in POSIX shells and bash.

* ve/userdiff-bash:
  userdiff: support Bash
2020-10-27  Merge branch 'sd/userdiff-css-update'  Junio C Hamano
Userdiff for CSS update.

* sd/userdiff-css-update:
  userdiff: expand detected chunk headers for css
2020-10-27  Merge branch 'kb/userdiff-rust-macro-rules'  Junio C Hamano
Userdiff for Rust update.

* kb/userdiff-rust-macro-rules:
  userdiff: recognize 'macro_rules!' as starting a Rust function block
2020-10-22  userdiff: support Bash  Victor Engmark
Support POSIX, bashism and mixed function declarations, all four compound command types, trailing comments and mixed whitespace.

Even though Bash allows locale-dependent characters in function names <https://unix.stackexchange.com/a/245336/3645>, only detect function names with characters allowed by POSIX.1-2017 <https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_235> for simplicity. This should cover the vast majority of use cases, and produces system-agnostic results.

Since a word pattern has to be specified, but there is no easy way to know the default word pattern, use the default `IFS` characters as a starting point. A later patch can improve this.

Signed-off-by: Victor Engmark <victor@engmark.name>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
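The declaration styles named in the Bash entry above (POSIX `name()`, the `function name` bashism, the mixed `function name()` form, and a compound command opening with `{`, `(`, `((` or `[[`) can be sketched as follows. This is a minimal sketch in Python's `re` syntax, assembled from the description alone; `NAME`, `PARENS`, `BASH_FUNC` and `is_bash_function` are hypothetical names, and the expression is not copied from userdiff.c.

```python
import re

NAME = r"[a-zA-Z_][a-zA-Z0-9_]*"   # POSIX-portable function name (assumption of this sketch)
PARENS = r"[ \t]*\([ \t]*\)"       # "()" with optional blanks

BASH_FUNC = re.compile(
    r"^[ \t]*"
    r"(?:" + NAME + PARENS                                        # POSIX: name()
    + r"|function[ \t]+" + NAME + r"(?:" + PARENS + r"|[ \t]+)"   # bashism, "()" optional
    + r")"
    + r"[ \t]*(?:\{|\(\(?|\[\[)"                                  # {, (, (( or [[
)


def is_bash_function(line):
    """Return True if the line starts a function definition in this sketch."""
    return BASH_FUNC.search(line) is not None


if __name__ == "__main__":
    for sample in ["foo() {", "function foo {", "function foo() (", "echo foo"]:
        print(repr(sample), is_bash_function(sample))
```

Restricting `NAME` to ASCII letters, digits and underscores mirrors the commit's choice to ignore locale-dependent names, so a definition such as `föö() {` is deliberately not detected.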
2020-10-08  userdiff: expand detected chunk headers for css  Sohom Datta
The regex used for the CSS builtin diff driver in git is only able to show chunk headers for lines that start with a number, a letter or an underscore.

However, the regex fails to detect classes (which start with a .), ids (which start with a #), :root and attribute-value based selectors (for example [class*="col-"]), as well as @-based block-level statements like @page, @keyframes and @media, since all of them start with a special character.

Allow the selectors and block level statements to begin with these special characters.

Signed-off-by: Sohom Datta <sohom.datta@learner.manipal.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
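A small sketch of the behaviour described in the CSS entry above: a line is a header candidate when it starts in column 1 with a letter, digit or underscore, optionally preceded by one of the special characters `:`, `[`, `@`, `.` or `#`, and it is rejected when it ends with a colon or semicolon (a property). The code uses Python's `re` module and hypothetical names (`PROPERTY`, `SELECTOR`, `is_css_header`); it approximates the description and is not the driver's actual pattern text.

```python
import re

# Lines ending in ':' or ';' are treated as properties and rejected.
PROPERTY = re.compile(r"[:;]\s*$")
# Optional leading special character, then a letter, digit or underscore.
SELECTOR = re.compile(r"^[:\[@.#]?[_a-z0-9].*$", re.IGNORECASE)


def is_css_header(line):
    """Return True if this sketch would use the line as a hunk header."""
    if PROPERTY.search(line):
        return False
    return SELECTOR.search(line) is not None


if __name__ == "__main__":
    for sample in [".btn {", "@media screen {", "  color: red;"]:
        print(repr(sample), is_css_header(sample))
```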
2020-10-07  userdiff: recognize 'macro_rules!' as starting a Rust function block  Konrad Borowski
Signed-off-by: Konrad Borowski <konrad@borowski.pw>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-07  userdiff: PHP: catch "abstract" and "final" functions  Javier Spagnoletti
PHP permits functions to be defined like

    final public function foo() { }
    abstract protected function bar() { }

but our hunk header pattern does not recognize these decorations. Add "final" and "abstract" to the list of function modifiers.

Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Javier Spagnoletti <phansys@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
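The effect of extending the modifier list in the PHP entry above can be shown with a short sketch. It is written in Python's `re` syntax; the helper `make_php_func_pattern` is hypothetical, and the "before" modifier list (`public`, `protected`, `private`, `static`) is an assumption made for the illustration rather than something stated in the commit message.

```python
import re


def make_php_func_pattern(modifiers):
    """Build a pattern: optional modifiers, then the 'function' keyword."""
    alternatives = "|".join(modifiers)
    return re.compile(r"^[\t ]*(((" + alternatives + r")[\t ]+)*function.*)$")


# Assumed modifier list before the change (not stated in the commit message).
BEFORE = make_php_func_pattern(["public", "protected", "private", "static"])
# The same list with "abstract" and "final" added, as the commit describes.
AFTER = make_php_func_pattern(
    ["public", "protected", "private", "static", "abstract", "final"]
)

if __name__ == "__main__":
    line = "final public function foo() { }"
    print(BEFORE.search(line) is not None, AFTER.search(line) is not None)
```

Because the modifier group must match every word before `function`, a single unknown word such as `final` is enough to make the shorter list miss the whole line.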
2020-08-13  userdiff: improve Fortran xfuncname regex  Philippe Blain
The third part of the Fortran xfuncname regex wants to match the beginning of a subroutine or function, so it allows for all characters except `'`, `"` or whitespace before the keyword 'function' or 'subroutine'. This is meant to match the 'recursive', 'elemental' or 'pure' keywords, as well as function return types, and to prevent matches inside strings.

However, the negated set does not contain the `!` comment character, so a line with an end-of-line comment containing the keyword 'function' or 'subroutine' followed by another word is mistakenly chosen as a hunk header.

Improve the regex by adding `!` to the negated set.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
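The difference made by adding `!` to the negated set in the Fortran entry above can be reproduced with a short sketch. It is written in Python's `re` syntax as an approximation of the description; `make_pattern`, `OLD` and `NEW` are hypothetical names and the expression is not copied from userdiff.c.

```python
import re


def make_pattern(excluded):
    """Words made of anything except `excluded` and blanks may precede the keyword."""
    return re.compile(
        r"^[ \t]*([^" + excluded + r" \t]+[ \t]+)*"
        r"(subroutine|function)[ \t]+[a-z].*$",
        re.IGNORECASE,
    )


OLD = make_pattern("'\"")    # quotes excluded only
NEW = make_pattern("!'\"")   # comment character excluded as well

if __name__ == "__main__":
    line = "real :: x ! function of time"
    print(OLD.search(line) is not None, NEW.search(line) is not None)
```

With the old set, the `!` is consumed as just another word before the keyword, so the comment text is read as a function declaration; with the new set the match cannot step over the comment character.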
2020-08-13  userdiff: add tests for Fortran xfuncname regex  Philippe Blain
The Fortran userdiff patterns, introduced in 909a5494f8 (userdiff.c: add builtin fortran regex patterns, 2010-09-10), predate the test infrastructure for xfuncname patterns, introduced in bfa7d01413 (t4018: an infrastructure to test hunk headers, 2014-03-21).

Add tests for the Fortran xfuncname patterns. The test 't/t4018/fortran-comment-keyword' documents a shortcoming of the regex that is fixed in a subsequent commit.

While at it, add descriptive comments for the different parts of the regex.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-03  userdiff: support Markdown  Ash Holland
It's typical to find Markdown documentation alongside source code, and having better context for documentation changes is useful; see also commit 69f9c87d4 (userdiff: add support for Fountain documents, 2015-07-21).

The pattern is based on the CommonMark specification 0.29, section 4.2 <https://spec.commonmark.org/> but doesn't match empty headings, as seeing them in a hunk header is unlikely to be useful.

Only ATX headings are supported, as detecting setext headings would require printing the line before a pattern matches, or matching a multiline pattern. The word-diff pattern is the same as the pattern for HTML, because many Markdown parsers accept inline HTML.

Signed-off-by: Ash Holland <ash@sorrel.sh>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
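As a rough illustration of the Markdown entry above, the sketch below accepts ATX headings only: up to three spaces of indentation, one to six `#` characters, a blank, and then some content. How strictly "empty heading" is interpreted here (at least one non-blank character must follow the marker) is an assumption of this sketch, and `ATX_HEADING` and `is_markdown_header` are hypothetical names.

```python
import re

# Up to 3 spaces, 1-6 '#', at least one blank, then a non-blank character.
ATX_HEADING = re.compile(r"^ {0,3}#{1,6}[ \t]+[^ \t]")


def is_markdown_header(line):
    """Return True for a non-empty ATX heading in this sketch."""
    return ATX_HEADING.search(line) is not None


if __name__ == "__main__":
    for sample in ["## Install", "#", "Title", "#hashtag"]:
        print(repr(sample), is_markdown_header(sample))
```

A setext heading such as `Title` followed by `=====` cannot be recognized this way, because the heading text carries no marker of its own and a single-line pattern never sees the underline.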
2019-12-05  Merge branch 'jh/userdiff-python-async'  Junio C Hamano
The userdiff machinery has been taught that "async def" is another way to begin a "function" in Python.

* jh/userdiff-python-async:
  userdiff: support Python async functions
2019-11-20  userdiff: support Python async functions  Josh Holland
Python's async functions (declared with "async def" rather than "def") were not being displayed in hunk headers. This commit teaches git about the async function syntax, and adds tests for the Python userdiff regex.

Signed-off-by: Josh Holland <anowlcalledjosh@gmail.com>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
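The change described in the Python entry above amounts to allowing an optional `async` keyword in front of `def`. A minimal sketch, assuming a pattern of the general shape "optional indentation, then `class` or `def`"; `PY_HEADER` and `is_python_header` are hypothetical names and the expression is an approximation, not a quotation from userdiff.c.

```python
import re

# 'class', 'def', or 'async def', each followed by a blank.
PY_HEADER = re.compile(r"^[ \t]*((class|(async[ \t]+)?def)[ \t].*)$")


def is_python_header(line):
    """Return True if the line opens a class or (async) function in this sketch."""
    return PY_HEADER.search(line) is not None


if __name__ == "__main__":
    for sample in ["async def fetch(url):", "    def run(self):", "await fetch(url)"]:
        print(repr(sample), is_python_header(sample))
```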
2019-11-10  userdiff: add Elixir to supported userdiff languages  Łukasz Niemier
Adds support for xfuncname in the Elixir[1] language, a Ruby-like language that runs on the Erlang[2] Virtual Machine (BEAM).

[1]: https://elixir-lang.org
[2]: https://www.erlang.org

Signed-off-by: Łukasz Niemier <lukasz@niemier.pl>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-21  userdiff: fix some corner cases in dts regex  Stephen Boyd
While reviewing some dts diffs recently I noticed that the hunk header logic was failing to find the containing node. This is because the regex doesn't consider properties that may span multiple lines, i.e.

    property = <something>,
               <something_else>;

and it got hung up on comments inside nodes that look like the root node because they start with '/*'. Add tests for these cases and update the regex to find them. Maybe detecting the root node is too complicated but forcing it to be a slash with any amount of whitespace up to an open bracket seemed OK. I tried to detect that a comment is in-between the two parts but I wasn't happy so I just dropped it.

Cc: Rob Herring <robh+dt@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Reviewed-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-21  userdiff: add a builtin pattern for dts files  Stephen Boyd
The Linux kernel receives many patches to the devicetree files each release. The hunk header for those patches typically shows nothing, making it difficult to figure out what node is being modified without applying the patch or opening the file and seeking to the context. Let's add a builtin 'dts' pattern to git so that users can get better diff output on dts files when they use the diff=dts driver.

The regex has been constructed based on the spec at devicetree.org[1] and with some help from Johannes Sixt.

[1] https://github.com/devicetree-org/devicetree-specification/releases/latest

Cc: Rob Herring <robh+dt@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-21  Merge branch 'ml/userdiff-rust'  Junio C Hamano
The pattern that "git diff/grep" use to extract funcname and word boundaries for Rust has been added.

* ml/userdiff-rust:
  userdiff: two simplifications of patterns for rust
  userdiff: add built-in pattern for rust
2019-05-19  userdiff: add Octave  Boxuan Li
The Octave pattern is almost the same as the Matlab one, except that '%%%' and '##' can also be used to begin code sections, in addition to '%%' that is understood by both. The Octave pattern is merged into the Matlab pattern. Test cases for the hunk header patterns of Matlab and Octave are added under t/t4018.

Signed-off-by: Boxuan Li <liboxuan@connect.hku.hk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
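The section markers mentioned in the Octave entry above can be sketched as a single alternation: `%%` (shared with Matlab), plus `%%%` and `##`. The snippet covers only the code-section part of the driver, uses Python's `re` syntax, and requires a blank after the marker, which is an assumption of this sketch; `SECTION` and `is_section_header` are hypothetical names.

```python
import re

# '%%', '%%%' or '##' in column 1, followed by a blank and the section title.
SECTION = re.compile(r"^(%%%?|##)[ \t].*$")


def is_section_header(line):
    """Return True if the line begins a code section in this sketch."""
    return SECTION.search(line) is not None


if __name__ == "__main__":
    for sample in ["%% Load data", "%%% Plot", "## Octave section", "% comment"]:
        print(repr(sample), is_section_header(sample))
```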
2019-05-17  userdiff: add built-in pattern for rust  Marc-André Lureau
This adds xfuncname and word_regex patterns for Rust, a quite popular programming language. It also includes test cases for the xfuncname regex (t4018) and updated documentation.

The word_regex pattern finds identifiers, integers, floats and operators, according to the Rust Reference Book.

Cc: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-06  userdiff: support new keywords in PHP hunk header  Kana Natsuno
Recent versions of PHP support interface, trait, abstract class and final class. This patch fixes the PHP hunk header regexp to support all of these keywords.

Signed-off-by: Kana Natsuno <dev@whileimautomaton.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-06  t4018: add missing test cases for PHP  Kana Natsuno
A later patch changes the built-in PHP pattern. These test cases demonstrate aspects of the pattern that we do not want to change.

Signed-off-by: Kana Natsuno <dev@whileimautomaton.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-01  userdiff: add built-in pattern for golang  Alban Gruin
This adds xfuncname and word_regex patterns for golang, a quite popular programming language. It also includes test cases for the xfuncname regex (t4018) and updated documentation.

The xfuncname regex finds functions, structs and interfaces. Although the Go language prohibits the opening brace from being on its own line, the regex does not make it mandatory, to be able to match `func` statements like this:

    func foo(bar int,
            baz int) {
    }

This is covered by the test case t4018/golang-long-func.

The word_regex pattern finds identifiers, integers, floats, complex numbers and operators, according to the Go specification.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
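The point about the optional opening brace in the golang entry above can be illustrated with a sketch. The two expressions below (one for `func`, one for `type ... struct|interface`) are approximations in Python's `re` syntax derived from the description; `GO_FUNC`, `GO_TYPE` and `is_go_header` are hypothetical names and nothing here is copied from userdiff.c.

```python
import re

# A 'func' line; the trailing '{' is optional so that a signature
# wrapped over several lines still matches on its first line.
GO_FUNC = re.compile(r"^[ \t]*(func[ \t]*.*(\{[ \t]*)?)")
# A 'type' line that declares a struct or an interface.
GO_TYPE = re.compile(r"^[ \t]*(type[ \t].*(struct|interface)[ \t]*(\{[ \t]*)?)")


def is_go_header(line):
    """Return True if the line is a func, struct or interface header in this sketch."""
    return GO_FUNC.search(line) is not None or GO_TYPE.search(line) is not None


if __name__ == "__main__":
    for sample in ["func foo(bar int,", "type Point struct {", "var x = 1"]:
        print(repr(sample), is_go_header(sample))
```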
2016-06-03  userdiff: add built-in pattern for CSS  William Duclot
CSS is widely used, motivating it being included as a built-in pattern.

It must be noted that the word_regex for CSS (i.e. the regex defining what is a word in the language) does not consider '.' and '#' characters (in CSS selectors) to be part of the word. This behavior is documented by the test t/t4018/css-rule.

The logic behind this behavior is the following: identifiers in CSS selectors are identifiers in a HTML/XML document. Therefore, the '.'/'#' characters are not part of the identifier, but an indicator of the nature of the identifier in HTML/XML (class or id). Diffing ".class1" and ".class2" must show that the class name is changed, but we still are selecting a class.

Logic behind the "pattern" regex is:
    1. reject lines ending with a colon/semicolon (properties)
    2. if a line begins with a name in column 1, pick the whole line

Credits to Johannes Sixt (j6t@kdbg.org) for the pattern regex and most of the tests.

Signed-off-by: William Duclot <william.duclot@ensimag.grenoble-inp.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Reviewed-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-23  userdiff: add support for Fountain documents  Zoë Blade
Add support for Fountain, a plain text screenplay format. Git facilitates not just programming specifically, but creative writing in general, so it makes sense to also support other plain text documents besides source code.

In the structure of a screenplay specifically, scenes are roughly analogous to functions, in the sense that it makes your job easier if you can see which ones were changed in a given range of patches.

More information about the Fountain format can be found on its official website, at http://fountain.io .

Signed-off-by: Zoë Blade <zoe@bytenoise.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-21  userdiff: have 'cpp' hunk header pattern catch more C++ anchor points  Johannes Sixt
The hunk header pattern 'cpp' is intended for C and C++ source code, but it is actually not particularly useful for the latter, and even misses some use-cases for the former.

The parts of the pattern have the following flaws:

- The first part matches an identifier followed immediately by a colon and arbitrary text and is intended to reject goto labels and C++ access specifiers (public, private, protected). But this pattern also rejects C++ constructs, which look like this:

    MyClass::MyClass()
    MyClass::~MyClass()
    MyClass::Item MyClass::Find(...

- The second part matches an identifier followed by a list of qualified names (i.e. identifiers separated by the C++ scope operator '::') separated by space or '*' followed by an opening parenthesis (with space between the tokens). It matches function declarations like

    struct item* get_head(...
    int Outer::Inner::Func(...

  Since the pattern requires at least two identifiers, GNU-style function definitions are ignored:

    void
    func(...

  Moreover, since the pattern does not allow punctuation other than '*', the following C++ constructs are not recognized:

  . template definitions:
      template<class T> int func(T arg)

  . functions returning references:
      const string& get_message()

  . functions returning templated types:
      vector<int> foo()

  . operator definitions:
      Value operator+(Value l, Value r)

- The third part of the pattern finally matches compound definitions. But it forgets about unions and namespaces, and also skips single-line definitions

    struct random_iterator_tag {};

  because no semicolon can occur on the line.

Change the first pattern to require a colon at the end of the line (except for trailing space and comments), so that it does not reject constructor or destructor definitions.

Notice that all interesting anchor points begin with an identifier or keyword. But since there is a large variety of syntactical constructs after the first "word", the simplest is to require only this word and accept everything else. Therefore, this boils down to a line that begins with a letter or underscore (optionally preceded by the C++ scope operator '::' to accept functions returning a type anchored at the global namespace). Replace the second and third part by a single pattern that picks such a line.

This has the following desirable consequence:

- All constructs mentioned above are recognized.

and the following likely desirable consequences:

- Definitions of global variables and typedefs are recognized:

    int num_entries = 0;
    extern const char* help_text;
    typedef basic_string<wchar_t> wstring;

- Commonly used macro-ized boilerplate code is recognized:

    BEGIN_MESSAGE_MAP(CCanvas,CWnd)
    Q_DECLARE_METATYPE(MyStruct)
    PATTERNS("tex",...)

  (The last one is from this very patch.)

but also the following possibly undesirable consequence:

- When a label is not on a line by itself (except for a comment) it is no longer rejected, but can appear as a hunk header if it occurs at the beginning of a line:

    next:;

IMO, the benefits of the change outweigh the (possible) regressions by a large margin.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
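The two rules that the cpp entry above arrives at can be sketched as a negative check followed by a positive one: first reject a label or access specifier that stands alone on its line (apart from trailing blanks or a comment), then accept any line that begins in column 1 with a letter or underscore, optionally preceded by `::`. The sketch uses Python's `re` syntax; `LABEL`, `ANCHOR` and `is_cpp_header` are hypothetical names, and the expressions approximate the description rather than reproduce the text of userdiff.c.

```python
import re

# Rejected: identifier + ':' at end of line, allowing trailing space or a comment.
LABEL = re.compile(r"^[ \t]*[A-Za-z_][A-Za-z_0-9]*:\s*($|/[/*])")
# Accepted: a line starting with a letter or underscore, optionally after '::'.
ANCHOR = re.compile(r"^((::\s*)?[A-Za-z_].*)$")


def is_cpp_header(line):
    """Return True if this sketch would pick the line as a hunk header."""
    if LABEL.search(line):
        return False
    return ANCHOR.search(line) is not None


if __name__ == "__main__":
    for sample in ["MyClass::MyClass()", "public:", "next:;", "    int x;"]:
        print(repr(sample), is_cpp_header(sample))
```

The `next:;` case shows the trade-off the commit message accepts: the label is no longer alone on its line, so the negative rule does not fire and the line is picked up by the positive one.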
2014-03-21  t4018: test cases showing that the cpp pattern misses many anchor points  Johannes Sixt
Most of the tests show C++ code, but there is also a union definition and a GNU style function definition that are not recognized.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-21  t4018: test cases for the built-in cpp pattern  Johannes Sixt
A later patch changes the built-in cpp pattern. These test cases demonstrate aspects of the pattern that we do not want to change.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-21  t4018: convert custom pattern test to the new infrastructure  Johannes Sixt
For the test case "matches to end of line", extend the pattern by a few wildcards so that the pattern captures the "RIGHT" token, which is needed for verification, without mentioning it in the pattern.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-21  t4018: convert java pattern test to the new infrastructure  Johannes Sixt
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-21  t4018: convert perl pattern tests to the new infrastructure  Johannes Sixt
There is one subtlety: The old test case 'perl pattern gets full line of POD header' does not have its own new test case, but the feature is tested nevertheless by placing the RIGHT tag at the end of the expected hunk header in t4018/perl-skip-sub-in-pod.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-21  t4018: an infrastructure to test hunk headers  Johannes Sixt
Add an infrastructure that simplifies adding new tests of the hunk header regular expressions.

To add new tests, a file with the syntax to test can be dropped in the directory t4018. The README file explains what a test file must contain; the README itself tests the default behavior.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
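The verification idea behind these test files, as mentioned in the entries above, is that the expected hunk header line carries the token "RIGHT". The sketch below simulates that check outside git under stated assumptions: the hunk header is taken to be the nearest line above the hunk's first context line that matches the pattern, three context lines are assumed, and the `ChangeMe` marker in the sample as well as the names `hunk_header` and `header_is_right` are hypothetical. It is a simplified model, not the test script itself.

```python
import re


def hunk_header(lines, changed_index, pattern, context=3):
    """Return the nearest line above the hunk that matches pattern, or None."""
    regex = re.compile(pattern)
    start = max(changed_index - context, 0)   # first line shown in the hunk
    for line in reversed(lines[:start]):
        if regex.search(line):
            return line
    return None


def header_is_right(lines, changed_index, pattern):
    """Simulated check: the chosen header must contain the token RIGHT."""
    header = hunk_header(lines, changed_index, pattern)
    return header is not None and "RIGHT" in header


SAMPLE = [
    "int wrong(void)",
    "{",
    "}",
    "int RIGHT(void)",
    "{",
    "    int a;",
    "    int b;",
    "    int c;",
    "    return ChangeMe;",   # hypothetical marker for the modified line
    "}",
]

if __name__ == "__main__":
    print(header_is_right(SAMPLE, 8, r"^[A-Za-z_].*"))
```

Putting the token in the test input rather than in the pattern keeps each test file self-describing: the pattern under test stays untouched, and a wrong header is detected simply because it lacks the token.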