path: root/builtin/grep.c
Age Commit message Author
2012-06-18verify_filename(): ask the caller to chose the kind of diagnosisMatthieu Moy
verify_filename() can be called in two different contexts. Either we just tried to interpret a string as an object name, and it fails, so we try looking for a working tree file (i.e. we finished looking at revs that come earlier on the command line, and the next argument must be a pathname), or we _know_ that we are looking for a pathname, and shouldn't even try interpreting the string as an object name.

For example, with this change, we get:

    $ git log COPYING HEAD:inexistant
    fatal: HEAD:inexistant: no such path in the working tree.
    Use '-- <path>...' to specify paths that do not exist locally.
    $ git log HEAD:inexistant
    fatal: Path 'inexistant' does not exist in 'HEAD'

Signed-off-by: Matthieu Moy <>
Signed-off-by: Junio C Hamano <>
2012-06-01Merge branch 'rs/maint-grep-F' into maintJunio C Hamano
"git grep -e '$pattern'", unlike the case where the patterns are read from a file, did not treat individual lines in the given pattern argument as separate regular expressions as it should. By René Scharfe * rs/maint-grep-F: grep: stop leaking line strings with -f grep: support newline separated pattern list grep: factor out do_append_grep_pat() grep: factor out create_grep_pat()
2012-05-25Merge branch 'rs/maint-grep-F'Junio C Hamano
"git grep -e '$pattern'", unlike the case where the patterns are read from a file, did not treat individual lines in the given pattern argument as separate regular expressions as it should.
2012-05-21grep: stop leaking line strings with -fRené Scharfe
When reading patterns from a file, we pass the lines as allocated string buffers to append_grep_pat() and never free them. That's not a problem because they are needed until the program ends anyway. However, now that the function duplicates the pattern string, we can reuse the strbuf after calling that function. This simplifies the code a bit and plugs a minor memory leak. Signed-off-by: Rene Scharfe <> Signed-off-by: Junio C Hamano <>
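The series merged above makes "git grep -e" treat each line of a pattern argument as a separate pattern. A minimal sketch of that kind of splitting follows; `for_each_pattern_line()`, `pattern_fn` and `sum_len()` are hypothetical names for illustration, not the functions in grep.c, and the sketch assumes a trailing newline yields one final empty pattern.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: receives one pattern line (not NUL-terminated). */
typedef void (*pattern_fn)(const char *pat, size_t len, void *data);

/*
 * Split "arg" at newlines and hand each piece to "fn".
 * Returns the number of pieces handed over. Assumption of this sketch:
 * an empty piece (e.g. after a trailing newline) is passed on as-is.
 */
static size_t for_each_pattern_line(const char *arg, pattern_fn fn, void *data)
{
	size_t count = 0;
	const char *p = arg;

	for (;;) {
		const char *nl = strchr(p, '\n');
		size_t len = nl ? (size_t)(nl - p) : strlen(p);

		fn(p, len, data);
		count++;
		if (!nl)
			break;
		p = nl + 1;	/* continue after the newline */
	}
	return count;
}

/* Example callback: adds up the lengths of all pattern lines. */
static void sum_len(const char *pat, size_t len, void *data)
{
	(void)pat;
	*(size_t *)data += len;
}
```

A caller would pass the -e argument and a callback that appends each piece to its pattern list; because the pieces are not NUL-terminated, such a callback has to copy them, which is consistent with the duplication mentioned in the message above.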
2012-03-02Merge branch 'rs/no-no-no-parseopt'Junio C Hamano
* rs/no-no-no-parseopt:
  parse-options: remove PARSE_OPT_NEGHELP
  parse-options: allow positivation of options starting, with no-
  test-parse-options: convert to OPT_BOOL()

Conflicts:
    builtin/grep.c
2012-02-28parse-options: remove PARSE_OPT_NEGHELPRené Scharfe
PARSE_OPT_NEGHELP is confusing because short options defined with that flag do the opposite of what the helptext says. It is also not needed anymore now that options starting with no- can be negated by removing that prefix. Convert its only two users to OPT_NEGBIT() and OPT_BOOL() and then remove support for PARSE_OPT_NEGHELP. Signed-off-by: Rene Scharfe <> Acked-by: Jeff King <> Signed-off-by: Junio C Hamano <>
2012-02-14Merge branch 'jk/grep-binary-attribute'Junio C Hamano
* jk/grep-binary-attribute:
  grep: pre-load userdiff drivers when threaded
  grep: load file data after checking binary-ness
  grep: respect diff attributes for binary-ness
  grep: cache userdiff_driver in grep_source
  grep: drop grep_buffer's "name" parameter
  convert git-grep to use grep_source interface
  grep: refactor the concept of "grep source" into an object
  grep: move sha1-reading mutex into low-level code
  grep: make locking flag global
2012-02-07drop odd return value semantics from userdiff_configJeff King
When the userdiff_config function was introduced in be58e70 (diff: unify external diff and funcname parsing code, 2008-10-05), it used a return value convention unlike any other config callback. Like other callbacks, it used "-1" to signal error. But it returned "1" to indicate that it found something, and "0" otherwise; other callbacks simply returned "0" to indicate that no error occurred. This distinction was necessary at the time, because the userdiff namespace overlapped slightly with the color configuration namespace. So "" could mean "the 'foo' slot of diff coloring" or "the 'foo' component of the "color" userdiff driver". Because the color-parsing code would die on an unknown color slot, we needed the userdiff code to indicate that it had matched the variable, letting us bypass the color-parsing code entirely. Later, in 8b8e862 (ignore unknown color configuration, 2009-12-12), the color-parsing code learned to silently ignore unknown slots. This means we no longer need to protect userdiff-matched variables from reaching the color-parsing code. We can therefore change the userdiff_config calling convention to a more normal one. This drops some code from each caller, which is nice. But more importantly, it reduces the cognitive load for readers who may wonder why userdiff_config is unlike every other config callback. There's no need to add a new test confirming that this works; t4020 already contains a test that sets diff.color.external. Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
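The "normal" convention described above (return 0 when no error occurred, -1 on error) lets a caller chain config callbacks without special-casing any of them. The following is only a sketch under that assumption; `driver_config()`, `color_config()` and `chained_config()` are hypothetical stand-ins, not the real git functions.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical driver callback: 0 means "no error", whether or not the
 * key was one of ours; -1 signals an error (here: a missing value).
 */
static int driver_config(const char *key, const char *value)
{
	if (strncmp(key, "diff.", 5))
		return 0;	/* not ours, and not an error */
	if (!value)
		return -1;	/* ours, but malformed */
	return 0;
}

/* Hypothetical color callback: unknown slots are silently ignored. */
static int color_config(const char *key, const char *value)
{
	(void)key;
	(void)value;
	return 0;
}

/* The caller only has to propagate errors; no "found something" case. */
static int chained_config(const char *key, const char *value)
{
	if (driver_config(key, value) < 0)
		return -1;
	return color_config(key, value);
}
```

With the older convention, the caller would additionally have had to treat a return value of 1 as "handled, stop here", which is the extra code the commit removes.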
2012-02-02grep: pre-load userdiff drivers when threadedJeff King
The low-level grep_source code will automatically load the userdiff driver to see whether a file is binary. However, when we are threaded, it will load the drivers in a non-deterministic order, handling each one as its assigned thread happens to be scheduled.

Meanwhile, the attribute lookup code (which underlies the userdiff driver lookup) is optimized to handle paths in sequential order (because they tend to share the same gitattributes files). Multi-threading the lookups destroys the locality and makes this optimization less effective.

We can fix this by pre-loading the userdiff driver in the main thread, before we hand off the file to a worker thread. My best-of-five for "git grep foo" on the linux-2.6 repository went from:

    real    0m0.391s
    user    0m1.708s
    sys     0m0.584s

to:

    real    0m0.360s
    user    0m1.576s
    sys     0m0.572s

Not a huge speedup, but it's quite easy to do. The only trick is that we shouldn't perform this optimization if "-a" was used, in which case we won't bother checking whether the files are binary at all.

Signed-off-by: Jeff King <>
Signed-off-by: Junio C Hamano <>
2012-02-02convert git-grep to use grep_source interfaceJeff King
The grep_source interface (as opposed to grep_buffer) will eventually give us a richer interface for telling the low-level grep code about our buffers. Eventually this will lead to things like better binary-file handling. For now, it lets us drop a lot of now-redundant code. The conversion is mostly straightforward. One thing to note is that the memory ownership rules for "struct grep_source" are different from those of the "struct work_item" found here (the former will copy things like the filename, rather than taking ownership). Therefore you will also see some slight tweaking of when filename buffers are released. Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
2012-02-02grep: move sha1-reading mutex into low-level codeJeff King
The multi-threaded git-grep code needs to serialize access to the thread-unsafe read_sha1_file call. It does this with a mutex that is local to builtin/grep.c. Let's instead push this down into grep.c, where it can be used by both builtin/grep.c and grep.c. This will let us safely teach the low-level grep.c code tricks that involve reading from the object db. Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
2012-02-02grep: make locking flag globalJeff King
The low-level grep code traditionally didn't care about threading, as it doesn't do any threading itself and didn't call out to other non-thread-safe code. That changed with 0579f91 (grep: enable threading with -p and -W using lazy attribute lookup, 2011-12-12), which pushed the lookup of funcname attributes (which is not thread-safe) into the low-level grep code. As a result, the low-level code learned about a new global "grep_attr_mutex" to serialize access to the attribute code. A multi-threaded caller (e.g., builtin/grep.c) is expected to initialize the mutex and set "use_threads" in the grep_opt structure. The low-level code only uses the lock if use_threads is set. However, putting the use_threads flag into the grep_opt struct is not the most logical place. Whether threading is in use is not something that matters for each call to grep_buffer, but is instead global to the whole program (i.e., if any thread is doing multi-threaded grep, every other thread, even if it thinks it is doing its own single-threaded grep, would need to use the locking). In practice, this distinction isn't a problem for us, because the only user of multi-threaded grep is "git-grep", which does nothing except call grep. This patch turns the opt->use_threads flag into a global flag. More important than the nit-picking semantic argument above is that this means that the locking functions don't need to actually have access to a grep_opt to know whether to lock. Which in turn can make adding new locks simpler, as we don't need to pass around a grep_opt. Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
2012-01-23grep: fix -l/-L interaction with decoration linesAlbert Yale
In threaded mode, git-grep emits file breaks (enabled with context, -W and --break) into the accumulation buffers even if they are not required. The output collection thread then uses skip_first_line to skip the first such line in the output, which would otherwise be at the very top. This is wrong when the user also specified -l/-L/-c, in which case every line is relevant. While arguably giving these options together doesn't make any sense, git-grep has always quietly accepted it. So do not skip anything in these cases. Signed-off-by: Albert Yale <> Signed-off-by: Thomas Rast <> Signed-off-by: Junio C Hamano <>
2011-12-16grep: disable threading in non-worktree caseThomas Rast
Measurements by various people have shown that grepping in parallel is not beneficial when the object store is involved. For example, with a simple regex:

    Threads     | --cached case            | worktree case
    ----------------------------------------------------------------
    8 (default) | 2.88u 0.21s 0:02.94real  | 0.19u 0.32s 0:00.16real
    4           | 2.89u 0.29s 0:02.99real  | 0.16u 0.34s 0:00.17real
    2           | 2.83u 0.36s 0:02.87real  | 0.18u 0.32s 0:00.26real
    NO_PTHREADS | 2.16u 0.08s 0:02.25real  | 0.12u 0.17s 0:00.31real

This happens because all the threads contend on read_sha1_mutex almost all of the time. A more complex regex allows the threads to do more work in parallel, but as Jeff King found out, the "super boost" (much higher clock when only one core is active) feature of recent CPUs still causes the unthreaded case to win by a large margin.

So until the pack machinery allows unthreaded access, we disable grep's threading in all but the worktree case.

Helped-by: René Scharfe <>
Helped-by: Jeff King <>
Signed-off-by: Thomas Rast <>
Signed-off-by: Junio C Hamano <>
2011-12-16grep: enable threading with -p and -W using lazy attribute lookupThomas Rast
Lazily load the userdiff attributes in match_funcname(). Use a separate mutex around this loading to protect the (not thread-safe) attributes machinery. This lets us re-enable threading with -p and -W while reducing the overhead caused by looking up attributes. Signed-off-by: Thomas Rast <> Signed-off-by: Junio C Hamano <>
2011-12-05Merge branch 'nd/misc-cleanups'Junio C Hamano
* nd/misc-cleanups:
  unpack_object_header_buffer(): clear the size field upon error
  tree_entry_interesting: make use of local pointer "item"
  tree_entry_interesting(): give meaningful names to return values
  read_directory_recursive: reduce one indentation level
  get_tree_entry(): do not call find_tree_entry() on an empty tree
  tree-walk.c: do not leak internal structure in tree_entry_len()
2011-10-27tree_entry_interesting(): give meaningful names to return valuesNguyễn Thái Ngọc Duy
It is basic code hygiene to avoid unnamed magic constants. Besides, this helps extend the set of values later on for "interesting, but cannot decide if the entry truly matches yet" (i.e. prefix matches). Signed-off-by: Nguyễn Thái Ngọc Duy <> Signed-off-by: Junio C Hamano <>
2011-10-27tree-walk.c: do not leak internal structure in tree_entry_len()Nguyễn Thái Ngọc Duy
tree_entry_len() does not simply take two random arguments and return a tree length. The two pointers must point to a tree item structure, or struct name_entry. Passing random pointers will return an incorrect value. Force callers to pass struct name_entry instead of two pointers (in the hope that they don't manually construct struct name_entry themselves). Signed-off-by: Nguyễn Thái Ngọc Duy <> Signed-off-by: Junio C Hamano <>
2011-10-26builtin/grep: simplify lock_and_read_sha1_file()Junio C Hamano
As read_sha1_lock/unlock have been made aware of use_threads, this caller can be made a lot simpler. Signed-off-by: Junio C Hamano <>
2011-10-26builtin/grep: make lock/unlock into static inline functionsJunio C Hamano
Signed-off-by: Junio C Hamano <>
2011-10-26git grep: be careful to use mutexes only when they are initializedJohannes Schindelin
Rather nasty things happen when a mutex is not initialized but locked nevertheless. Now, when we're not running in a threaded manner, the mutex is not initialized, which is correct. But then we went and used the mutex anyway, which -- at least on Windows -- leads to a hard crash (ordinarily it would be called a segmentation fault, but in Windows speak it is an access violation). This problem was identified by our faithful tests when run in the msysGit environment. To avoid having to wrap the line due to the 80 column limit, we use the name "WHEN_THREADED" instead of "IF_USE_THREADS" because it is one character shorter. Which is all we need in this case. Signed-off-by: Johannes Schindelin <> Signed-off-by: Junio C Hamano <>
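The guard described above can be sketched with small static inline wrappers that touch the mutex only once threading has been set up. This is an illustration under assumptions, not the code in builtin/grep.c: `use_threads_demo`, `demo_mutex`, `start_threaded_mode()` and `bump_counter()` are hypothetical names.

```c
#include <pthread.h>

/* Hypothetical flag: nonzero only after the mutex has been initialized. */
static int use_threads_demo;
static pthread_mutex_t demo_mutex;
static int counter;

/* Called once by a caller that is about to start worker threads. */
static void start_threaded_mode(void)
{
	pthread_mutex_init(&demo_mutex, NULL);
	use_threads_demo = 1;
}

/* Lock only when the mutex is known to be initialized. */
static inline void demo_lock(void)
{
	if (use_threads_demo)
		pthread_mutex_lock(&demo_mutex);
}

static inline void demo_unlock(void)
{
	if (use_threads_demo)
		pthread_mutex_unlock(&demo_mutex);
}

/* A critical section that is safe in both modes. */
static int bump_counter(void)
{
	int v;

	demo_lock();
	v = ++counter;
	demo_unlock();
	return v;
}
```

In the unthreaded case the wrappers reduce to a flag check, so the uninitialized mutex is never passed to the pthread functions.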
2011-10-16Merge branch 'jc/grep-untracked-exclude'Junio C Hamano
* jc/grep-untracked-exclude: grep: fix the error message that mentions --exclude
2011-10-16Merge branch 'jc/maint-grep-untracked-exclude' into jc/grep-untracked-excludeJunio C Hamano
* jc/maint-grep-untracked-exclude:
  grep: fix the error message that mentions --exclude

Conflicts:
    builtin/grep.c
2011-10-14Merge branch 'jc/grep-untracked-exclude'Junio C Hamano
* jc/grep-untracked-exclude: grep: teach --untracked and --exclude-standard options
2011-10-14Merge branch 'bw/grep-no-index-no-exclude'Junio C Hamano
* bw/grep-no-index-no-exclude:
  grep --no-index: don't use git standard exclusions
  grep: do not use --index in the short usage output
2011-10-05Merge branch 'nm/grep-object-sha1-lock'Junio C Hamano
* nm/grep-object-sha1-lock:
  grep: Fix race condition in delta_base_cache

Conflicts:
    builtin/grep.c
2011-10-05Merge branch 'jc/maint-grep-untracked-exclude' into jc/grep-untracked-excludeJunio C Hamano
* jc/maint-grep-untracked-exclude:
  grep: teach --untracked and --exclude-standard options
  grep --no-index: don't use git standard exclusions
  grep: do not use --index in the short usage output

Conflicts:
    Documentation/git-grep.txt
    builtin/grep.c
2011-08-29Merge branch 'jk/color-and-pager'Junio C Hamano
* jk/color-and-pager:
  want_color: automatically fallback to color.ui
  diff: don't load color config in plumbing
  config: refactor get_colorbool function
  color: delay auto-color decision until point of use
  git_config_colorbool: refactor stdout_is_tty handling
  diff: refactor COLOR_DIFF from a flag into an int
  setup_pager: set GIT_PAGER_IN_USE
  t7006: use test_config helpers
  test-lib: add helper functions for config
  t7006: modernize calls to unset

Conflicts:
    builtin/commit.c
    parse-options.c
2011-08-19want_color: automatically fallback to color.uiJeff King
All of the "do we want color" flags default to -1 to indicate that we don't have any color configured. This value is handled in one of two ways: 1. In porcelain, we check early on whether the value is still -1 after reading the config, and set it to the value of color.ui (which defaults to 0). 2. In plumbing, it stays untouched as -1, and want_color defaults it to off. This works fine, but means that every porcelain has to check and reassign its color flag. Now that want_color gives us a place to put this check in a single spot, we can do that, simplifying the calling code. Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
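The fallback described above can be sketched as a single helper that resolves the "not configured" value in one place. This is a simplified illustration: `color_ui_default` and `want_color_sketch()` are hypothetical names, and the "auto" mode (deciding based on whether output goes to a terminal) is deliberately left out.

```c
/* Hypothetical stand-in for the value of color.ui, which defaults to 0. */
static int color_ui_default = 0;

/*
 * Resolve a per-command color flag:
 *   -1 -> not configured, fall back to color.ui
 *    0 -> off
 *   >0 -> on
 * Assumption of this sketch: no "auto" handling.
 */
static int want_color_sketch(int flag)
{
	if (flag < 0)
		flag = color_ui_default;
	return flag > 0;
}
```

Because the fallback lives inside the helper, callers can pass their flag through unchanged instead of checking for -1 and reassigning it after reading the config.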
2011-08-18git_config_colorbool: refactor stdout_is_tty handlingJeff King
Usually this function figures out for itself whether stdout is a tty. However, it has an extra parameter just to allow git-config to override the auto-detection for its --get-colorbool option. Instead of an extra parameter, let's just use a global variable. This makes calling easier in the common case, and will make refactoring the colorbool code much simpler. Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
2011-08-01grep: long context optionsRené Scharfe
Take long option names for -A (--after-context), -B (--before-context) and -C (--context) from GNU grep and add a similar long option name for -W (--function-context). Signed-off-by: Rene Scharfe <> Signed-off-by: Junio C Hamano <>
2011-08-01grep: add option to show whole function as contextRené Scharfe
Add a new option, -W, to show the whole surrounding function of a match. It uses the same regular expressions as -p and diff to find the beginning of sections. Currently it will not display comments in front of a function, but those that are following one. Despite this shortcoming it is already useful, e.g. to simply see a more complete applicable context or to extract whole functions. Signed-off-by: Rene Scharfe <> Signed-off-by: Junio C Hamano <>
2011-06-06grep: add --headingRené Scharfe
With --heading, the filename is printed once before matches from that file instead of at the start of each line, giving more screen space to the actual search results. This option is taken from ack ( And now git grep can dress up like it:

    $ git config alias.ack "grep --break --heading --line-number"
    $ git ack -e --heading
    Documentation/git-grep.txt
    154:--heading::

    t/
    785:test_expect_success 'grep --heading' '
    786:    git grep --heading -e char -e lo_w hello.c hello_world >actual &&
    808:    git grep --break --heading -n --color \

Signed-off-by: Rene Scharfe <>
Signed-off-by: Junio C Hamano <>
2011-06-06grep: add --breakRené Scharfe
With --break, an empty line is printed between matches from different files, increasing readability. This option is taken from ack ( Signed-off-by: Rene Scharfe <> Signed-off-by: Junio C Hamano <>
2011-06-06grep: fix coloring of hunk marks between filesRené Scharfe
Commit 431d6e7b (grep: enable threading for context line printing) split the printing of the "--\n" mark between results from different files out into two places: show_line() in grep.c for the non-threaded case and work_done() in builtin/grep.c for the threaded case. Commit 55f638bd (grep: Colorize filename, line number, and separator) updated the former, but not the latter, so the separators between files are not colored if threads are used. This patch merges the two. In the threaded case, hunk marks are now printed by show_line() for every file, including the first one, and the very first mark is simply skipped in work_done(). This ensures that the output is properly colored and works just as well. Signed-off-by: Rene Scharfe <> Signed-off-by: Junio C Hamano <>
2011-05-30Merge branch 'mk/grep-pcre'Junio C Hamano
* mk/grep-pcre:
  git-grep: Fix problems with recently added tests
  git-grep: Update tests (mainly for -P)
  Makefile: Pass USE_LIBPCRE down in GIT-BUILD-OPTIONS
  git-grep: update tests now regexp type is "last one wins"
  git-grep: do not die upon -F/-P when grep.extendedRegexp is set.
  git-grep: Bail out when -P is used with -F or -E
  grep: Add basic tests
  configure: Check for libpcre
  git-grep: Learn PCRE
  grep: Extract compile_regexp_failed() from compile_regexp()
  grep: Fix a typo in a comment
  grep: Put calls to fixmatch() and regmatch() into patmatch()
  contrib/completion: --line-number to git grep
  Documentation: Add --line-number to git-grep synopsis
2011-05-23Merge branch 'jc/magic-pathspec'Junio C Hamano
* jc/magic-pathspec:
  setup.c: Fix some "symbol not declared" sparse warnings
  t3703: Skip tests using directory name ":" on Windows
  revision.c: leave a note for "a lone :" enhancement
  t3703, t4208: add test cases for magic pathspec
  rev/path disambiguation: further restrict "misspelled index entry" diag
  fix overslow :/no-such-string-ever-existed diagnostics
  fix overstrict :<path> diagnosis
  grep: use get_pathspec() correctly
  pathspec: drop "lone : means no pathspec" from get_pathspec()
  Revert "magic pathspec: add ":(icase)path" to match case insensitively"
  magic pathspec: add ":(icase)path" to match case insensitively
  magic pathspec: futureproof shorthand form
  magic pathspec: add tentative ":/path/from/top/level" pathspec support
2011-05-10grep: use get_pathspec() correctlyJunio C Hamano
When there is no remaining string in argv, get_pathspec(prefix, argv) will return a two-element array that has prefix as the first element, so there is no need to re-roll that logic in the code that uses get_pathspec(). Signed-off-by: Junio C Hamano <>
2011-05-10git-grep: do not die upon -F/-P when grep.extendedRegexp is set.Junio C Hamano
The previous one made "git grep -P" fail when grep.extendedRegexp is enabled. That is a non-starter. The option on the command line should just make the command ignore the configured default. The handling of "-F" in the existing code has the same problem. Instead of declaring -G/-F/-E/-P incompatible with each other, just let the last one win. That way, you can have "[alias] gr = grep -P" and use PCRE for everyday work, e.g. "git gr ':i?foo'", and append -G to the aliased command line to override it, e.g. "git gr -G '[Ff][Oo][Oo]'". Signed-off-by: Junio C Hamano <>
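The "last one wins" rule can be sketched as a single pass over the arguments that starts from the configured default and lets each -G/-E/-F/-P override whatever came before. The names below (`enum pattern_type`, `pick_pattern_type()`) are hypothetical, and the real command parses its options through parse-options rather than by hand.

```c
#include <stddef.h>

/* Hypothetical pattern-type codes; illustrative only. */
enum pattern_type { PAT_BASIC, PAT_EXTENDED, PAT_FIXED, PAT_PCRE };

/*
 * Start from the configured default (grep.extendedRegexp) and let each
 * single-letter option seen on the command line replace the current choice.
 */
static enum pattern_type pick_pattern_type(int config_extended,
					   int argc, const char **argv)
{
	enum pattern_type type = config_extended ? PAT_EXTENDED : PAT_BASIC;
	int i;

	for (i = 0; i < argc; i++) {
		const char *arg = argv[i];

		/* Only consider arguments of the exact form "-X". */
		if (arg[0] != '-' || arg[1] == '\0' || arg[2] != '\0')
			continue;
		switch (arg[1]) {
		case 'G': type = PAT_BASIC; break;
		case 'E': type = PAT_EXTENDED; break;
		case 'F': type = PAT_FIXED; break;
		case 'P': type = PAT_PCRE; break;
		default: break;	/* some other option */
		}
	}
	return type;
}
```

With this shape, an alias that bakes in -P can still be overridden by appending -G, and the configured default never conflicts with an explicit option.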
2011-05-09git-grep: Bail out when -P is used with -F or -EMichał Kiedrowicz
This patch makes git-grep die() when -P is used on command line together with -E/--extended-regexp or -F/--fixed-strings. This also makes it bail out when grep.extendedRegexp is enabled. But `git grep -G -P pattern` and `git grep -E -G -P pattern` still work because -G and -E set opts.regflags during parse_options() and there is no way to detect `-G` or `-E -G`. Signed-off-by: Michał Kiedrowicz <> Signed-off-by: Junio C Hamano <>
2011-05-09git-grep: Learn PCREMichał Kiedrowicz
This patch teaches git-grep the --perl-regexp/-P options (naming borrowed from GNU grep) in order to allow specifying PCRE regexes on the command line. PCRE regexes have a number of features that make them handier to use than POSIX regexes, such as consistent escaping rules, extended character classes, and ungreedy matching. git isn't built with PCRE support automatically. USE_LIBPCRE must be set at build time (like `make USE_LIBPCRE=YesPlease`). Signed-off-by: Michał Kiedrowicz <> Signed-off-by: Junio C Hamano <>
2011-05-06Merge branch 'nd/struct-pathspec'Junio C Hamano
* nd/struct-pathspec:
  pathspec: rename per-item field has_wildcard to use_wildcard
  Improve tree_entry_interesting() handling code
  Convert read_tree{,_recursive} to support struct pathspec
  Reimplement read_tree_recursive() using tree_entry_interesting()
2011-04-03sparse: Fix errors and silence warningsStephen Boyd
 * load_file() returns a void pointer but is using 0 for the return value
 * builtin/receive-pack.c forgot to include builtin.h
 * packet_trace_prefix can be marked static
 * ll_merge takes a pointer for its last argument, not an int
 * crc32 expects a pointer as the second argument but Z_NULL is defined to be 0 (see 38f4d13 sparse fix: Using plain integer as NULL pointer, 2006-11-18 for more info)

Signed-off-by: Stephen Boyd <>
Signed-off-by: Junio C Hamano <>
2011-04-02Merge branch 'jr/grep-en-config'Junio C Hamano
* jr/grep-en-config: grep: allow -E and -n to be turned on by default via configuration
2011-04-02Merge branch 'ab/i18n-st'Junio C Hamano
* ab/i18n-st: (69 commits)
  i18n: git-shortlog basic messages
  i18n: git-revert split up "could not revert/apply" message
  i18n: git-revert literal "me" messages
  i18n: git-revert "Your local changes" message
  i18n: git-revert basic messages
  i18n: git-notes GIT_NOTES_REWRITE_MODE error message
  i18n: git-notes basic commands
  i18n: git-gc "Auto packing the repository" message
  i18n: git-gc basic messages
  i18n: git-describe basic messages
  i18n: git-clean clean.requireForce messages
  i18n: git-clean basic messages
  i18n: git-bundle basic messages
  i18n: git-archive basic messages
  i18n: git-status "renamed: " message
  i18n: git-status "Initial commit" message
  i18n: git-status "Changes to be committed" message
  i18n: git-status shortstatus messages
  i18n: git-status "nothing to commit" messages
  i18n: git-status basic messages
  ...

Conflicts:
    builtin/branch.c
    builtin/checkout.c
    builtin/clone.c
    builtin/commit.c
    builtin/grep.c
    builtin/merge.c
    builtin/push.c
    builtin/revert.c
    t/
    t/
2011-03-30grep: allow -E and -n to be turned on by default via configurationJoe Ratterman
Add two configuration variables grep.extendedRegexp and grep.lineNumbers to allow the user to skip typing -E and -n on the command line, respectively. Scripts that are meant to be used by random users and/or in random repositories now have to use the -G and/or --no-line-number options as appropriate to override the settings in the repository or the user's ~/.gitconfig. Just because the script didn't say "git grep -n" no longer guarantees that the output from the command will not have line numbers. Signed-off-by: Joe Ratterman <> Signed-off-by: Junio C Hamano <>
2011-03-28Merge branch 'maint'Junio C Hamano
* maint:
  git tag documentation grammar fixes and readability updates
  grep: Add the option '--line-number'
2011-03-28grep: Add the option '--line-number'Joe Ratterman
This is a synonym for the existing '-n' option, matching GNU grep. Signed-off-by: Joe Ratterman <> Signed-off-by: Junio C Hamano <>
2011-03-25Improve tree_entry_interesting() handling codeNguyễn Thái Ngọc Duy
t_e_i() can return -1 or 2 to early shortcut a search. Current code may use up to two variables to handle it. One for saving return value from t_e_i temporarily, one for saving return code 2. The second variable is not needed. If we make sure the first variable does not change until the next t_e_i() call, then we can do something like this:

    int ret = 0;

    while (...) {
        if (ret != 2) {
            ret = t_e_i();
            if (ret < 0) /* no longer interesting */
                break;
            if (ret == 0) /* skip this round */
                continue;
        }
        /* ret > 0, interesting */
    }

Signed-off-by: Nguyễn Thái Ngọc Duy <>
Signed-off-by: Junio C Hamano <>
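The single-variable pattern above can be exercised in isolation with a scripted stand-in for t_e_i(). Everything named here (`t_e_i_stub()`, `count_interesting()`, the `script` array) is hypothetical; the real tree_entry_interesting() works on tree entries and pathspecs, which this sketch omits. It assumes that 2 means "this and all remaining entries are interesting" and -1 means "nothing further can match".

```c
#include <stddef.h>

/* Hypothetical stand-in for t_e_i(): returns the i-th scripted value. */
static int t_e_i_stub(const int *script, size_t i)
{
	return script[i];
}

/* Count the entries treated as interesting, keeping state in "ret" only. */
static int count_interesting(const int *script, size_t n)
{
	int ret = 0;
	int hits = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (ret != 2) {
			ret = t_e_i_stub(script, i);
			if (ret < 0)	/* no longer interesting */
				break;
			if (ret == 0)	/* skip this round */
				continue;
		}
		/* ret > 0, interesting */
		hits++;
	}
	return hits;
}
```

Once the stub has returned 2, it is no longer consulted, so later scripted values are never read; that is the shortcut the second variable used to track.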
2011-03-20grep: read patterns from stdin with -f -René Scharfe
Support the well-known convention of reading standard input instead of a named file if "-" (dash) is specified. GNU grep does the same. Signed-off-by: Rene Scharfe <> Signed-off-by: Junio C Hamano <>
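The dash convention can be sketched as a small pair of helpers that map "-" to standard input and leave every other name to fopen(). `open_pattern_file()` and `close_pattern_file()` are hypothetical names for illustration and are not taken from builtin/grep.c.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return a stream to read patterns from: stdin for "-", otherwise the
 * named file opened for reading (NULL if it cannot be opened).
 */
static FILE *open_pattern_file(const char *path)
{
	if (!strcmp(path, "-"))
		return stdin;
	return fopen(path, "r");
}

/* Close the stream unless it is stdin, which the caller does not own. */
static void close_pattern_file(FILE *fp)
{
	if (fp && fp != stdin)
		fclose(fp);
}
```

Pairing the open helper with a close helper keeps callers from accidentally closing standard input when the patterns came from "-".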