summaryrefslogtreecommitdiff
path: root/grep.c
AgeCommit message (Collapse)Author
2009-03-07grep: remove grep_opt argument from match_expr_eval()René Scharfe
The only use of the struct grep_opt argument of match_expr_eval() is to pass the option word_regexp to match_one_pattern(). By adding a pattern flag for it we can reduce the number of function arguments of these two functions, as a cleanup and preparation for adding more in the next patch. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-07grep: micro-optimize hit collection for AND nodesRené Scharfe
In addition to returning if an expression matches a line, match_expr_eval() updates the expression's hit flag if the parameter collect_hits is set. It never sets collect_hits for children of AND nodes, though, so their hit flag will never be updated. Because of that we can return early if the first child didn't match, no matter if collect_hits is set or not. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-18Add is_regex_special()René Scharfe
Add is_regex_special(), a character class macro for chars that have a special meaning in regular expressions. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-18Change NUL char handling of isspecial()René Scharfe
Replace isspecial() by the new macro is_glob_special(), which is more, well, specialized. The former included the NUL char in its character class, while the letter only included characters that are special to file name globbing. The new name contains underscores because they enhance readability considerably now that it's made up of three words. Renaming the function is necessary to document its changed scope. The call sites of isspecial() are updated to check explicitly for NUL. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-10grep: don't call regexec() for fixed stringsRené Scharfe
Add the new flag "fixed" to struct grep_pat and set it if the pattern is doesn't contain any regex control characters in addition to if the flag -F/--fixed-strings was specified. This gives a nice speed up on msysgit, where regexec() seems to be extra slow. Before (best of five runs): $ time git grep grep v1.6.1 >/dev/null real 0m0.552s user 0m0.000s sys 0m0.000s $ time git grep -F grep v1.6.1 >/dev/null real 0m0.170s user 0m0.000s sys 0m0.015s With the patch: $ time git grep grep v1.6.1 >/dev/null real 0m0.173s user 0m0.000s sys 0m0.000s The difference is much smaller on Linux, but still measurable. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-10grep -w: forward to next possible position after rejected matchRené Scharfe
grep -w accepts matches between non-word characters, only. If a match from regexec() doesn't meet this criteria, grep continues its search after the first character of that match. We can be a bit smarter here and skip all positions that follow a word character first, as they can't match our criteria. This way we can consume characters quite cheaply and don't need to special-case the handling of the beginning of a line. Here's a contrived example command on msysgit (best of five runs): $ time git grep -w ...... v1.6.1 >/dev/null real 0m1.611s user 0m0.000s sys 0m0.015s With the patch it's quite a bit faster: $ time git grep -w ...... v1.6.1 >/dev/null real 0m1.179s user 0m0.000s sys 0m0.015s More common search patterns will gain a lot less, but it's a nice clean up anyway. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-05remove trailing LF in die() messagesAlexander Potashev
LF at the end of format strings given to die() is redundant because die already adds one on its own. Signed-off-by: Alexander Potashev <aspotashev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-11Merge branch 'maint'Junio C Hamano
* maint: Fix non-literal format in printf-style calls git-submodule: Avoid printing a spurious message. git ls-remote: make usage string match manpage Makefile: help people who run 'make check' by mistake
2008-11-11Fix non-literal format in printf-style callsDaniel Lowe
These were found using gcc 4.3.2-1ubuntu11 with the warning: warning: format not a string literal and no format arguments Incorporated suggestions from Brandon Casey <casey@nrlssc.navy.mil>. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-10-01git grep: Add "-z/--null" option as in GNU's grep.Raphael Zimmerer
Here's a trivial patch that adds "-z" and "--null" options to "git grep". It was discussed on the mailing-list that git's "-z" convention should be used instead of GNU grep's "-Z". So things like 'git grep -l -z "$FOO" | xargs -0 sed -i "s/$FOO/$BOO/"' do work now. Signed-off-by: Raphael Zimmerer <killekulla@rdrz.de> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-09-05log --author/--committer: really match only with name partJunio C Hamano
When we tried to find commits done by AUTHOR, the first implementation tried to pattern match a line with "^author .*AUTHOR", which later was enhanced to strip leading caret and look for "^author AUTHOR" when the search pattern was anchored at the left end (i.e. --author="^AUTHOR"). This had a few problems: * When looking for fixed strings (e.g. "git log -F --author=x --grep=y"), the regexp internally used "^author .*x" would never match anything; * To match at the end (e.g. "git log --author='google.com>$'"), the generated regexp has to also match the trailing timestamp part the commit header lines have. Also, in order to determine if the '$' at the end means "match at the end of the line" or just a literal dollar sign (probably backslash-quoted), we would need to parse the regexp ourselves. An earlier alternative tried to make sure that a line matches "^author " (to limit by field name) and the user supplied pattern at the same time. While it solved the -F problem by introducing a special override for matching the "^author ", it did not solve the trailing timestamp nor tail match problem. It also would have matched every commit if --author=author was asked for, not because the author's email part had this string, but because every commit header line that talks about the author begins with that field name, regardleses of who wrote it. Instead of piling more hacks on top of hacks, this rethinks the grep machinery that is used to look for strings in the commit header, and makes sure that (1) field name matches literally at the beginning of the line, followed by a SP, and (2) the user supplied pattern is matched against the remainder of the line, excluding the trailing timestamp data. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-05Move buffer_is_binary() to xdiff-interface.hJohannes Schindelin
We already have two instances where we want to determine if a buffer contains binary data as opposed to text. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2006-12-20simplify inclusion of system header files.Junio C Hamano
This is a mechanical clean-up of the way *.c files include system header files. (1) sources under compat/, platform sha-1 implementations, and xdelta code are exempt from the following rules; (2) the first #include must be "git-compat-util.h" or one of our own header file that includes it first (e.g. config.h, builtin.h, pkt-line.h); (3) system headers that are included in "git-compat-util.h" need not be included in individual C source files. (4) "git-compat-util.h" does not have to include subsystem specific header files (e.g. expat.h). Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-28grep --all-matchJunio C Hamano
This lets you say: git grep --all-match -e A -e B -e C to find lines that match A or B or C but limit the matches from the files that have all of A, B and C. This is different from git grep -e A --and -e B --and -e C in that the latter looks for a single line that has all of these at the same time. Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-27grep: fix --fixed-strings combined with expression.Junio C Hamano
"git grep --fixed-strings -e GIT --and -e VERSION .gitignore" misbehaved because we did not notice this needs to grab lines that have the given two fixed strings at the same time. Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-27grep: free expressions and patterns when done.Junio C Hamano
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-20Update grep internal for grepping only in head/bodyJunio C Hamano
This further updates the built-in grep engine so that we can say something like "this pattern should match only in head". This can be used to simplify grepping in the log messages. Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-09-20builtin-grep: make pieces of it available as library.Junio C Hamano
This makes three functions and associated option structures from builtin-grep available from other parts of the system. * options to drive built-in grep engine is stored in struct grep_opt; * pattern strings and extended grep expressions are added to struct grep_opt with append_grep_pattern(); * when finished calling append_grep_pattern(), call compile_grep_patterns() to prepare for execution; * call grep_buffer() to find matches in the in-core buffer. This also adds an internal option "status_only" to grep_opt, which suppresses any output from grep_buffer(). Callers of the function as library can use it to check if there is a match without producing any output. Signed-off-by: Junio C Hamano <junkio@cox.net>