path: root/builtin/check-ignore.c
Age  Commit message  Author
2020-02-18  check-ignore: fix documentation and implementation to match  Elijah Newren
check-ignore has two different modes, and neither of these modes has an implementation that matches the documentation. These modes differ in whether they just print paths or whether they also print the final pattern matched by the path. The fix is different for both modes, so I'll discuss both separately.

=== First (default) mode ===

The first mode is documented as:

    For each pathname given via the command-line or from a file via --stdin, check whether the file is excluded by .gitignore (or other input files to the exclude mechanism) and output the path if it is excluded.

However, it fails to do this because it does not account for negated patterns. Commands other than check-ignore verify exclusion rules via calling

    ... -> treat_one_path() -> is_excluded() -> last_matching_pattern()

while check-ignore has a call path of the form:

    ... -> check_ignore() -> last_matching_pattern()

The fact that the latter does not include the call to is_excluded() means that it is susceptible to messing up negated patterns (since that is the only significant thing is_excluded() adds over last_matching_pattern()). Unfortunately, we can't make it just call is_excluded(), because the same codepath is used by the verbose mode which needs to know the matched pattern in question. This brings us to...

=== Second (verbose) mode ===

The second mode, known as verbose mode, references the first in the documentation and says:

    Also output details about the matching pattern (if any) for each given pathname. For precedence rules within and between exclude sources, see gitignore(5).

The "Also" means it will print paths that match the exclude rules as noted for the first mode, and also print which pattern matches. Unless more information is printed than just pathname and pattern (which is not done), this definition is somewhat ill-defined and perhaps even self-contradictory for negated patterns: A path which matches a negated exclude pattern is NOT excluded and thus shouldn't be printed by the former logic, while it certainly does match one of the explicit patterns and thus should be printed by the latter logic.

=== Resolution ===

Since the second mode exists to find out which pattern matches given paths, and showing the user a pattern that begins with a '!' is sufficient for them to figure out whether the path is excluded, the existing behavior is desirable -- we just need to update the documentation to match the implementation (i.e. it is about printing which pattern is matched by paths, not about showing which paths are excluded).

For the first or default mode, users just want to know whether a path is excluded. As such, the existing documentation is desirable; change the implementation to match the documented behavior.

Finally, also adjust a few tests in t0008 that were caught up by this discrepancy in how negated paths were handled.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
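The difference between the two call paths described in this commit can be sketched with a small Python model. This is an illustrative sketch only, not git's C implementation: it assumes patterns are plain globs matched with `fnmatch`, it ignores directory and slash semantics, and the verbose record is simplified to `<pattern><TAB><path>`. The function names mirror those in the commit message; `check_ignore` here is a hypothetical stand-in for the builtin.

```python
from fnmatch import fnmatch


def last_matching_pattern(path, patterns):
    """Return the last pattern that matches path (later rules win), or None."""
    for pat in reversed(patterns):
        glob = pat[1:] if pat.startswith("!") else pat
        if fnmatch(path, glob):
            return pat
    return None


def is_excluded(path, patterns):
    """A path is excluded only if its last matching pattern is not negated."""
    pat = last_matching_pattern(path, patterns)
    return pat is not None and not pat.startswith("!")


def check_ignore(path, patterns, verbose=False):
    """Default mode reports excluded paths; verbose mode reports any match."""
    if verbose:
        pat = last_matching_pattern(path, patterns)
        return None if pat is None else f"{pat}\t{path}"
    return path if is_excluded(path, patterns) else None


if __name__ == "__main__":
    rules = ["*.log", "!keep.log"]
    for name in ("debug.log", "keep.log", "README"):
        print(name, check_ignore(name, rules), check_ignore(name, rules, verbose=True))
```

In this model a path such as `keep.log` produces no output in default mode (it is not excluded) but still produces a record in verbose mode, where the leading `!` tells the reader the match is a negation.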
2019-09-05  treewide: rename 'exclude' methods to 'pattern'  Derrick Stolee
The first consumer of pattern-matching filenames was the .gitignore feature. In that context, storing a list of patterns as a 'struct exclude_list' makes sense. However, the sparse-checkout feature then adopted these structures and methods, but with the opposite meaning: these patterns match the files that should be included!

It would be clearer to rename this entire library as a "pattern matching" library, and the callers apply exclusion/inclusion logic accordingly based on their needs.

This commit renames several methods defined in dir.h to make more sense with the renamed 'struct exclude_list' to 'struct pattern_list' and 'struct exclude' to 'struct path_pattern':

 * last_exclude_matching() -> last_matching_pattern()
 * parse_exclude() -> parse_path_pattern()

In addition, the word 'exclude' was replaced with 'pattern' in the methods below:

 * add_exclude_list()
 * add_excludes_from_file_to_list()
 * add_excludes_from_file()
 * add_excludes_from_blob_to_list()
 * add_exclude()
 * clear_exclude_list()

A few methods with the word "exclude" remain. These will be handled separately. In particular, the method "is_excluded()" is concretely about the .gitignore file relative to a specific directory. This is the important boundary between library and consumer: is_excluded() cares about .gitignore, but is_excluded() calls last_matching_pattern() to make that decision.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
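The library/consumer boundary described above can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not git's code: patterns are plain `fnmatch` globs, and `in_sparse_checkout` is a hypothetical consumer name invented for this example. The point is that one neutral matcher serves two callers that attach opposite meanings to a positive match.

```python
from fnmatch import fnmatch


def last_matching_pattern(path, patterns):
    """Neutral library routine: report which pattern matched last, if any."""
    for pat in reversed(patterns):
        glob = pat[1:] if pat.startswith("!") else pat
        if fnmatch(path, glob):
            return pat
    return None


def is_excluded(path, gitignore_patterns):
    """.gitignore consumer: a positive match means the path is ignored."""
    pat = last_matching_pattern(path, gitignore_patterns)
    return pat is not None and not pat.startswith("!")


def in_sparse_checkout(path, sparse_patterns):
    """Sparse-checkout consumer (hypothetical name): a positive match means
    the path should be included in the working tree."""
    pat = last_matching_pattern(path, sparse_patterns)
    return pat is not None and not pat.startswith("!")
```

The two consumers have identical bodies in this sketch; only the interpretation of the result differs, which is why a name like "exclude" in the shared routine was misleading.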
2019-09-05  treewide: rename 'EXCL_FLAG_' to 'PATTERN_FLAG_'  Derrick Stolee
The first consumer of pattern-matching filenames was the .gitignore feature. In that context, storing a list of patterns as a 'struct exclude_list' makes sense. However, the sparse-checkout feature then adopted these structures and methods, but with the opposite meaning: these patterns match the files that should be included! It would be clearer to rename this entire library as a "pattern matching" library, and the callers apply exclusion/inclusion logic accordingly based on their needs. This commit replaces 'EXCL_FLAG_' with 'PATTERN_FLAG_' in the names of the flags used on 'struct path_pattern'. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-05  treewide: rename 'struct exclude_list' to 'struct pattern_list'  Derrick Stolee
The first consumer of pattern-matching filenames was the .gitignore feature. In that context, storing a list of patterns as a 'struct exclude_list' makes sense. However, the sparse-checkout feature then adopted these structures and methods, but with the opposite meaning: these patterns match the files that should be included! It would be clearer to rename this entire library as a "pattern matching" library, and the callers apply exclusion/inclusion logic accordingly based on their needs. This commit renames 'struct exclude_list' to 'struct pattern_list' and renames several variables called 'el' to 'pl'. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-05  treewide: rename 'struct exclude' to 'struct path_pattern'  Derrick Stolee
The first consumer of pattern-matching filenames was the .gitignore feature. In that context, storing a list of patterns as a list of 'struct exclude' items makes sense. However, the sparse-checkout feature then adopted these structures and methods, but with the opposite meaning: these patterns match the files that should be included! It would be clearer to rename this entire library as a "pattern matching" library, and the callers apply exclusion/inclusion logic accordingly based on their needs. This commit renames 'struct exclude' to 'struct path_pattern' and renames several variable names to match. 'struct pattern' was already taken by attr.c, and this more completely describes that the patterns are specific to file paths. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-24  cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch  Nguyễn Thái Ngọc Duy
By default, index compat macros are off from now on, because they could hide the_index dependency. Only those in builtin can use it. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-12  check-ignore: fix mix of directories and other file types  René Scharfe
In check_ignore(), the first pathspec item determines the dtype for any subsequent ones. That means that a pathspec matching a regular file can prevent following pathspecs from matching directories, which makes no sense. Fix that by determining the dtype for each pathspec separately, by passing the value DT_UNKNOWN to last_exclude_matching() each time. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
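The bug shape described above can be modelled in a few lines of Python. This is a hedged sketch, not the actual check_ignore() code: `pattern_matches`, `check_shared_dtype` and `check_per_item_dtype` are hypothetical names, the `kinds` mapping stands in for asking the filesystem, and the only rule modelled is that a pattern ending in `/` matches directories only.

```python
# Illustrative constants; the numeric values follow <dirent.h> on Linux.
DT_UNKNOWN, DT_DIR, DT_REG = 0, 4, 8


def pattern_matches(pattern, name, dtype):
    """A pattern with a trailing slash matches directories only."""
    if pattern.endswith("/"):
        return dtype == DT_DIR and name == pattern[:-1]
    return name == pattern


def check_shared_dtype(names, kinds, pattern):
    """Buggy shape: dtype is resolved for the first item and then reused."""
    dtype = DT_UNKNOWN
    matched = []
    for name in names:
        if dtype == DT_UNKNOWN:
            dtype = kinds[name]  # resolved once, sticks for later items
        if pattern_matches(pattern, name, dtype):
            matched.append(name)
    return matched


def check_per_item_dtype(names, kinds, pattern):
    """Fixed shape: every item starts from DT_UNKNOWN and is resolved alone."""
    matched = []
    for name in names:
        dtype = kinds[name]  # resolved separately for each item
        if pattern_matches(pattern, name, dtype):
            matched.append(name)
    return matched
```

With a regular file listed before a directory, the shared variant carries the file's type over to the directory and misses the match, while the per-item variant finds it.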
2017-06-24  Merge branch 'bw/config-h'  Junio C Hamano
Fix configuration codepath to pay proper attention to commondir that is used in multi-worktree situation, and isolate config API into its own header file.

* bw/config-h:
  config: don't implicitly use gitdir or commondir
  config: respect commondir
  setup: teach discover_git_directory to respect the commondir
  config: don't include config.h by default
  config: remove git_config_iter
  config: create config.h
2017-06-15  config: don't include config.h by default  Brandon Williams
Stop including config.h by default in cache.h. Instead only include config.h in those files which require use of the config system. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-30  Merge branch 'bw/pathspec-sans-the-index'  Junio C Hamano
Simplify parse_pathspec() codepath and stop it from looking at the default in-core index.

* bw/pathspec-sans-the-index:
  pathspec: convert find_pathspecs_matching_against_index to take an index
  pathspec: remove PATHSPEC_STRIP_SUBMODULE_SLASH_CHEAP
  ls-files: prevent prune_cache from overeagerly pruning submodules
  pathspec: remove PATHSPEC_STRIP_SUBMODULE_SLASH_EXPENSIVE flag
  submodule: add die_in_unpopulated_submodule function
  pathspec: provide a more descriptive die message
2017-05-12  pathspec: convert find_pathspecs_matching_against_index to take an index  Brandon Williams
Convert find_pathspecs_matching_against_index to take an index parameter. In addition mark pathspec.c with NO_THE_INDEX_COMPATIBILITY_MACROS now that it doesn't use any cache macros or reference 'the_index'. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-12  pathspec: remove PATHSPEC_STRIP_SUBMODULE_SLASH_EXPENSIVE flag  Brandon Williams
Since (ae8d08242 pathspec: pass directory indicator to match_pathspec_item()) the path matching logic has been able to cope with submodules without needing to strip off a trailing slash if a path refers to a submodule. Since stripping the trailing slash is no longer necessary, remove the PATHSPEC_STRIP_SUBMODULE_SLASH_EXPENSIVE flag. In addition, factor out the logic which dies if a path descends into a submodule so that it can still be used as a check after a pathspec struct has been initialized. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-06  dir: convert is_excluded to take an index  Brandon Williams
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-01  give "nbuf" strbuf a more meaningful name  Jeff King
It's a common pattern in our code to read paths from stdin, separated either by newlines or NULs, and unquote as necessary. In each of these five cases we use "nbuf" to temporarily store the unquoted value. Let's give it the more meaningful name "unquoted", which makes it easier to understand the purpose of the variable. While we're at it, let's also static-initialize all of our strbufs. It's not wrong to call strbuf_init, but it increases the cognitive load on the reader, who might wonder "do we sometimes avoid initializing them? why?". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-15  check-ignore: there are only two possible line terminations  Junio C Hamano
The program by default reads LF terminated lines, with an option to use NUL terminated records. Instead of pretending that there can be other useful values for line_termination, use a boolean variable, nul_term_line, to tell if NUL terminated records are used, and switch between strbuf_getline_{lf,nul} based on it. Signed-off-by: Junio C Hamano <gitster@pobox.com>
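The idea of replacing an arbitrary terminator value with a boolean can be sketched as follows. This is a minimal Python model, not git's strbuf code: `read_records` is a hypothetical helper, and it works on an in-memory byte string rather than a stream. Only the flag name `nul_term_line` is taken from the commit message.

```python
def read_records(data, nul_term_line):
    """Split input into records terminated by NUL or by LF, nothing else."""
    terminator = b"\0" if nul_term_line else b"\n"
    records = data.split(terminator)
    # A trailing terminator leaves one empty piece at the end; drop it.
    if records and records[-1] == b"":
        records.pop()
    return records


if __name__ == "__main__":
    print(read_records(b"a.txt\nb.txt\n", nul_term_line=False))
    print(read_records(b"odd\nname\0plain\0", nul_term_line=True))
```

With NUL-terminated records, a path that itself contains a line feed survives intact, which is the reason the option exists.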
2015-10-16  usage: do not insist that standard input must come from a file  Junio C Hamano
The synopsis text and the usage string of subcommands that read a list of things from the standard input are often shown like this:

    git gostak [--distim] < <list-of-doshes>

This is problematic in a number of ways:

 * The way to use these commands is more often to feed them the output from another command, not feed them from a file.

 * In manual pages outside Git, commands that operate on the data read from the standard input, e.g. "sort", "grep", "sed", etc., are not described with such a "< redirection-from-file" in their synopsis text. Our doing so introduces inconsistency.

 * We do not insist on where the output should go, by saying

       git gostak [--distim] < <list-of-doshes> > <output>

 * As it is our convention to enclose placeholders inside <bracket>, the redirection operator followed by a placeholder filename becomes very hard to read, both in the documentation and in the help text.

Let's clean them all up, after making sure that the documentation clearly describes the modes that take information from the standard input and what kind of things are expected on the input.

[jc: stole example for fmt-merge-msg from Jonathan]

Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-01-14  standardize usage info string format  Alex Henrie
This patch puts the usage info strings that were not already in docopt-like format into docopt-like format, which will be a little easier for end users and a lot easier for translators. Changes include:

- Placing angle brackets around fill-in-the-blank parameters
- Putting dashes in multiword parameter names
- Adding spaces to [-f|--foobar] to make [-f | --foobar]
- Replacing <foobar>* with [<foobar>...]

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-20  Merge branch 'dw/check-ignore-sans-index'  Junio C Hamano
"git check-ignore" follows the same rule as "git add" and "git status" in that the ignore/exclude mechanism does not take effect on paths that are already tracked. With "--no-index" option, it can be used to diagnose which paths that should have been ignored have been mistakenly added to the index.

* dw/check-ignore-sans-index:
  check-ignore: Add option to ignore index contents
2013-09-12  check-ignore: Add option to ignore index contents  Dave Williams
check-ignore currently shows how .gitignore rules would treat untracked paths. Tracked paths do not generate useful output. This prevents debugging of why a path became tracked unexpectedly unless that path is first removed from the index with `git rm --cached <path>`. The option --no-index tells the command to bypass the check for the path being in the index and hence allows tracked paths to be checked too. Whilst this behaviour deviates from the characteristics of `git add` and `git status` its use case is unlikely to cause any user confusion. Test scripts are augmented to check this option against the standard ignores to ensure correct behaviour. Signed-off-by: Dave Williams <dave@opensourcesolutions.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-09  Merge branch 'jl/submodule-mv'  Junio C Hamano
"git mv A B" when moving a submodule A does "the right thing", including relocating its working tree and adjusting the paths in the .gitmodules file.

* jl/submodule-mv: (53 commits)
  rm: delete .gitmodules entry of submodules removed from the work tree
  mv: update the path entry in .gitmodules for moved submodules
  submodule.c: add .gitmodules staging helper functions
  mv: move submodules using a gitfile
  mv: move submodules together with their work trees
  rm: do not set a variable twice without intermediate reading.
  t6131 - skip tests if on case-insensitive file system
  parse_pathspec: accept :(icase)path syntax
  pathspec: support :(glob) syntax
  pathspec: make --literal-pathspecs disable pathspec magic
  pathspec: support :(literal) syntax for noglob pathspec
  kill limit_pathspec_to_literal() as it's only used by parse_pathspec()
  parse_pathspec: preserve prefix length via PATHSPEC_PREFIX_ORIGIN
  parse_pathspec: make sure the prefix part is wildcard-free
  rename field "raw" to "_raw" in struct pathspec
  tree-diff: remove the use of pathspec's raw[] in follow-rename codepath
  remove match_pathspec() in favor of match_pathspec_depth()
  remove init_pathspec() in favor of parse_pathspec()
  remove diff_tree_{setup,release}_paths
  convert common_prefix() to use struct pathspec
  ...
2013-09-04  Merge branch 'sb/parseopt-boolean-removal'  Junio C Hamano
Convert most uses of OPT_BOOLEAN/OPTION_BOOLEAN that can use OPT_BOOL/OPTION_BOOLEAN which have much saner semantics, and turn remaining ones into OPT_SET_INT, OPT_COUNTUP, etc. as necessary.

* sb/parseopt-boolean-removal:
  revert: use the OPT_CMDMODE for parsing, reducing code
  checkout-index: fix negations of even numbers of -n
  config parsing options: allow one flag multiple times
  hash-object: replace stdin parsing OPT_BOOLEAN by OPT_COUNTUP
  branch, commit, name-rev: ease up boolean conditions
  checkout: remove superfluous local variable
  log, format-patch: parsing uses OPT__QUIET
  Replace deprecated OPT_BOOLEAN by OPT_BOOL
  Remove deprecated OPTION_BOOLEAN for parsing arguments
2013-09-04  Merge branch 'jc/check-x-z'  Junio C Hamano
"git check-ignore -z" applied the NUL termination to both its input (with --stdin) and its output, but "git check-attr -z" ignored the option on the output side. This is potentially a backward incompatible fix. Let's see if anybody screams before deciding if we want to do anything to help existing users (there may be none).

* jc/check-x-z:
  check-attr -z: a single -z should apply to both input and output
  check-ignore -z: a single -z should apply to both input and output
  check-attr: the name of the character is NUL, not NULL
  check-ignore: the name of the character is NUL, not NULL
2013-08-05  Replace deprecated OPT_BOOLEAN by OPT_BOOL  Stefan Beller
This task emerged from b04ba2bb (parse-options: deprecate OPT_BOOLEAN, 2011-09-27). All occurrences of the respective variables have been reviewed and none of them relied on the counting up mechanism, but all of them were using the variable as a true boolean. This patch does not change semantics of any command intentionally. Signed-off-by: Stefan Beller <stefanbeller@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
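The "counting up mechanism" mentioned above, as opposed to a true boolean, can be illustrated with a small Python model. This is a sketch of the two semantics only, not git's parse-options code: `parse_countup` and `parse_bool` are hypothetical helpers, and negated forms such as `--no-<flag>` are left out.

```python
def parse_countup(args, flag):
    """Counting semantics: every occurrence of the flag increments the value."""
    value = 0
    for arg in args:
        if arg == flag:
            value += 1
    return value


def parse_bool(args, flag):
    """Boolean semantics: the value is 1 if the flag appears at all, else 0."""
    value = 0
    for arg in args:
        if arg == flag:
            value = 1
    return value


if __name__ == "__main__":
    argv = ["-q", "path", "-q"]
    print(parse_countup(argv, "-q"), parse_bool(argv, "-q"))
```

Code that only tests the variable for truth behaves the same under both models, which matches the commit's claim that no caller relied on counting; code that compared against 1 would differ once a flag is repeated.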
2013-07-15  remove match_pathspec() in favor of match_pathspec_depth()  Nguyễn Thái Ngọc Duy
match_pathspec_depth was created to replace match_pathspec (see 61cf282 (pathspec: add match_pathspec_depth() - 2010-12-15)). It took more than two years, but the replacement finally happens :-) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15  check-ignore: convert to use parse_pathspec  Nguyễn Thái Ngọc Duy
check-ignore (at least the test suite) seems to rely on the pattern order. PATHSPEC_KEEP_ORDER is introduced to explicitly express this. The lack of PATHSPEC_MAXDEPTH_VALID is sufficient because it's the only flag that reorders pathspecs, but it's less obvious that way. Cc: Adam Spiers <git@adamspiers.org> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-12  check-ignore -z: a single -z should apply to both input and output  Junio C Hamano
Unless a command has separate --nul-terminated-{input,output} options, the --nul-terminated-records (-z) option should apply to both input and output for consistency. The caller knows that its input paths may need to be protected for LF, and the program shows these problematic paths to its output. The code already did the right thing. Only the help text needs fixing. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-12  check-ignore: the name of the character is NUL, not NULL  Junio C Hamano
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-29  Merge branch 'as/check-ignore'  Junio C Hamano
Enhance "check-ignore" (1.8.2 update) to work more like "check-attr" over bidi-pipes.

* as/check-ignore:
  t0008: use named pipe (FIFO) to test check-ignore streaming
  Documentation: add caveats about I/O buffering for check-{attr,ignore}
  check-ignore: allow incremental streaming of queries via --stdin
  check-ignore: move setup into cmd_check_ignore()
  check-ignore: add -n / --non-matching option
  t0008: remove duplicated test fixture data
2013-04-15  dir.c: git-status --ignored: don't scan the work tree twice  Karsten Blees
'git-status --ignored' still scans the work tree twice to collect untracked and ignored files, respectively. fill_directory / read_directory already supports collecting untracked and ignored files in a single directory scan. However, the DIR_COLLECT_IGNORED flag to enable this has some git-add specific side-effects (e.g. it doesn't recurse into ignored directories, so listing ignored files with --untracked=all doesn't work). The DIR_SHOW_IGNORED flag doesn't list untracked files and returns ignored files in dir_struct.entries[] (instead of dir_struct.ignored[] as DIR_COLLECT_IGNORED). DIR_SHOW_IGNORED is used all throughout git. We don't want to break the existing API, so let's introduce a new flag DIR_SHOW_IGNORED_TOO that lists untracked as well as ignored files similar to DIR_COLLECT_FILES, but will recurse into sub-directories based on the other flags as DIR_SHOW_IGNORED does. In dir.c::read_directory_recursive, add ignored files to either dir_struct.entries[] or dir_struct.ignored[] based on the flags. Also move the DIR_COLLECT_IGNORED case here so that filling result lists is in a common place. In wt-status.c::wt_status_collect_untracked, use the new flag and read results from dir_struct.ignored[]. Remove the extra fill_directory call. builtin/check-ignore.c doesn't call fill_directory, so setting the git-add specific DIR_COLLECT_IGNORED flag has no effect here. Remove for clarity. Update API documentation to reflect the changes. Performance: with this patch, 'git-status --ignored' is typically as fast as 'git-status'. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
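The single-scan idea can be sketched in Python as one pass that fills two result lists. This is a simplified model, not dir.c: `collect_single_pass` is a hypothetical function, the work tree is a flat list of paths, and recursion into sub-directories (which the real flags control) is not modelled. The two returned lists loosely correspond to dir_struct.entries[] and dir_struct.ignored[].

```python
def collect_single_pass(paths, tracked, is_ignored):
    """Classify every untracked path in one pass over the work tree listing."""
    untracked = []  # plays the role of dir_struct.entries[]
    ignored = []    # plays the role of dir_struct.ignored[]
    for path in paths:
        if path in tracked:
            continue  # tracked paths are neither untracked nor ignored
        if is_ignored(path):
            ignored.append(path)
        else:
            untracked.append(path)
    return untracked, ignored


if __name__ == "__main__":
    listing = ["main.c", "main.o", "notes.txt", "core"]
    print(collect_single_pass(listing, {"main.c"}, lambda p: p.endswith(".o")))
```

Doing the classification in the same loop avoids walking the directory listing a second time, which is the performance gain the commit message reports.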
2013-04-15  dir.c: replace is_path_excluded with now equivalent is_excluded API  Karsten Blees
Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-11  check-ignore: allow incremental streaming of queries via --stdin  Adam Spiers
Some callers, such as the git-annex web assistant, find it useful to invoke git check-ignore as a persistent background process, which can then have queries fed to its STDIN at any point, and the corresponding response consumed from its STDOUT. For this we need to invoke check_ignore() once per line of standard input, and flush standard output after each result. The above use case suggests that empty STDIN is actually a reasonable scenario (e.g. when the caller doesn't know in advance whether any queries need to be fed to the background process until after it's already started), so we make the minor behavioural change that "no pathspec given." is no longer emitted when STDIN is empty. Even though check_ignore() could now be changed to operate on a single pathspec, we keep it operating on an array of pathspecs since that is a more convenient way of consuming the existing pathspec API. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
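The per-line processing described above can be sketched as a small loop that handles one query at a time and flushes after each result. This is an illustrative Python model rather than the builtin's C code: `serve_queries` is a hypothetical name, and the `is_ignored` callback stands in for the real exclude machinery.

```python
import io


def serve_queries(stdin, stdout, is_ignored):
    """Answer one query per input line, flushing so the caller sees it at once."""
    for line in stdin:
        path = line.rstrip("\n")
        if is_ignored(path):
            stdout.write(path + "\n")
        # Flush after every query so a caller on the other end of a pipe
        # is not left waiting on buffered output.
        stdout.flush()


if __name__ == "__main__":
    queries = io.StringIO("a.log\nb.txt\nc.log\n")
    answers = io.StringIO()
    serve_queries(queries, answers, lambda p: p.endswith(".log"))
    print(answers.getvalue(), end="")
```

An empty input stream simply produces no output in this model, mirroring the behavioural change the commit describes for empty STDIN.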
2013-04-11  check-ignore: move setup into cmd_check_ignore()  Adam Spiers
Initialisation of the dir_struct and path_exclude_check structs was previously done within check_ignore(). This was acceptable since check_ignore() was only called once per check-ignore invocation; however the next commit will convert it into an inner loop which is called once per line of STDIN when --stdin is given. Therefore moving the initialisation code out into cmd_check_ignore() ensures that initialisation is still only performed once per check-ignore invocation, and consequently that the output is identical whether pathspecs are provided as CLI arguments or via STDIN. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-11  check-ignore: add -n / --non-matching option  Adam Spiers
If `-n` or `--non-matching` are specified, non-matching pathnames will also be output, in which case all fields in each output record except for <pathname> will be empty. This can be useful when running check-ignore as a background process, so that files can be incrementally streamed to STDIN, and for each of these files, STDOUT will indicate whether that file matched a pattern or not. (Without this option, it would be impossible to tell whether the absence of output for a given file meant that it didn't match any pattern, or that the result simply hadn't been flushed to STDOUT yet.) Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
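The effect of this option on the output records can be sketched as below. This is a hedged Python model: `format_record` is a hypothetical helper, and the record layout `<source>:<linenum>:<pattern><TAB><pathname>` is assumed from the verbose output format documented for git check-ignore. The commit message itself only states that all fields except the pathname are empty for a non-matching path.

```python
def format_record(path, match, non_matching=False):
    """Build one output record.

    match is a (source, linenum, pattern) tuple, or None when no pattern
    matched the path.
    """
    if match is not None:
        source, linenum, pattern = match
        return f"{source}:{linenum}:{pattern}\t{path}"
    if non_matching:
        # Every field except the pathname is left empty.
        return f"::\t{path}"
    return None  # without the option, a non-matching path prints nothing


if __name__ == "__main__":
    print(format_record("a.o", (".gitignore", 3, "*.o")))
    print(format_record("a.c", None, non_matching=True))
```

Because every query yields exactly one record when the option is set, a caller streaming paths to a background process can read one response per request instead of guessing whether silence means "no match" or "not flushed yet".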
2013-02-19  name-hash: allow hashing an empty string  Junio C Hamano
Usually we do not pass an empty string to the function hash_name() because we almost always ask for hash values for a path that is a candidate to be added to the index. However, check-ignore (and most likely check-attr, but I didn't check) apparently has a callchain to ask the hash value for an empty path when it was given a "." from the top-level directory to ask "Is the path . excluded by default?" Make sure that hash_name() does not overrun the end of the given pathname even when it is empty. Remove a sweep-the-issue-under-the-rug conditional in check-ignore that avoided passing an empty string to the callchain while at it. It is a valid question to ask for check-ignore if the top-level is set to be ignored by default, even though the answer is most likely no, if only because there is currently no way to specify such an entry in the .gitignore file. But it is an unusual thing to ask and it is not worth optimizing for it by special casing at the top level of the call chain. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-06  add git-check-ignore sub-command  Adam Spiers
This works in a similar manner to git-check-attr. Thanks to Jeff King and Junio C Hamano for the idea: http://thread.gmane.org/gmane.comp.version-control.git/108671/focus=108815 Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>