summaryrefslogtreecommitdiff
path: root/builtin/sparse-checkout.c
AgeCommit message (Collapse)Author
2020-03-05Merge branch 'ds/sparse-add'Junio C Hamano
"git sparse-checkout" learned a new "add" subcommand. * ds/sparse-add: sparse-checkout: allow one-character directories in cone mode sparse-checkout: work with Windows paths sparse-checkout: create 'add' subcommand sparse-checkout: extract pattern update from 'set' subcommand sparse-checkout: extract add_patterns_from_input()
2020-02-17Merge branch 'rs/strbuf-insertstr'Junio C Hamano
Code clean-up. * rs/strbuf-insertstr: mailinfo: don't insert header prefix for handle_content_type() strbuf: add and use strbuf_insertstr()
2020-02-11sparse-checkout: work with Windows pathsDerrick Stolee
When using Windows, a user may run 'git sparse-checkout set A\B\C' to add the Unix-style path A/B/C to their sparse-checkout patterns. Normalizing the input path converts the backslashes to slashes before we add the string 'A/B/C' to the recursive hashset. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-11sparse-checkout: create 'add' subcommandDerrick Stolee
When using the sparse-checkout feature, a user may want to incrementally grow their sparse-checkout pattern set. Allow adding patterns using a new 'add' subcommand. This is not much different from the 'set' subcommand, because we still want to allow the '--stdin' option and interpret inputs as directories when in cone mode and patterns otherwise. When in cone mode, we are growing the cone. This may actually reduce the set of patterns when adding directory A when A/B is already a directory in the cone. Test the different cases: siblings, parents, ancestors. When not in cone mode, we can only assume the patterns should be appended to the sparse-checkout file. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-11sparse-checkout: extract pattern update from 'set' subcommandDerrick Stolee
In anticipation of adding "add" and "remove" subcommands to the sparse-checkout builtin, extract a modify_pattern_list() method from the sparse_checkout_set() method. This command will read input from the command-line or stdin to construct a set of patterns, then modify the existing sparse-checkout patterns after a successful update of the working directory. Currently, the only way to modify the patterns is to replace all of the patterns. This will be extended in a later update. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-11sparse-checkout: extract add_patterns_from_input()Derrick Stolee
In anticipation of extending the sparse-checkout builtin with "add" and "remove" subcommands, extract the code that fills a pattern list based on the input values. The input changes depending on the presence of "--stdin" or the value of core.sparseCheckoutCone. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-10strbuf: add and use strbuf_insertstr()René Scharfe
Add a function for inserting a C string into a strbuf. Use it throughout the source to get rid of magic string length constants and explicit strlen() calls. Like strbuf_addstr(), implement it as an inline function to avoid the implicit strlen() calls to cause runtime overhead. Helped-by: Taylor Blau <me@ttaylorr.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-31sparse-checkout: escape all glob characters on writeDerrick Stolee
The sparse-checkout patterns allow special globs according to fnmatch(3). When writing cone-mode patterns for paths containing these characters, they must be escaped. Use is_glob_special() to check which characters must be escaped this way, and add a path to the tests that contains all glob characters at once. Note that ']' is not special, since the initial bracket '[' is escaped. Reported-by: Jeff King <peff@peff.net> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-31sparse-checkout: use C-style quotes in 'list' subcommandDerrick Stolee
When in cone mode, the 'git sparse-checkout list' subcommand lists the directories included in the sparse cone. When these directories contain odd characters, such as a backslash, then we need to use C-style quotes similar to 'git ls-tree'. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-31sparse-checkout: unquote C-style strings over --stdinDerrick Stolee
If a user somehow creates a directory with an asterisk (*) or backslash (\), then the "git sparse-checkout set" command will struggle to provide the correct pattern in the sparse-checkout file. When not in cone mode, the provided pattern is written directly into the sparse-checkout file. However, in cone mode we expect a list of paths to directories and then we convert those into patterns. Even more specifically, the goal is to always allow the following from the root of a repo: git ls-tree --name-only -d HEAD | git sparse-checkout set --stdin The ls-tree command provides directory names with an unescaped asterisk. It also quotes the directories that contain an escaped backslash. We must remove these quotes, then keep the escaped backslashes. Use unquote_c_style() when parsing lines from stdin. Command-line arguments will be parsed as-is, assuming the user can do the correct level of escaping from their environment to match the exact directory names. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-31sparse-checkout: write escaped patterns in cone modeDerrick Stolee
If a user somehow creates a directory with an asterisk (*) or backslash (\), then the "git sparse-checkout set" command will struggle to provide the correct pattern in the sparse-checkout file. When not in cone mode, the provided pattern is written directly into the sparse-checkout file. However, in cone mode we expect a list of paths to directories and then we convert those into patterns. However, there is some care needed for the timing of these escapes. The in-memory pattern list is used to update the working directory before writing the patterns to disk. Thus, we need the command to have the unescaped names in the hashsets for the cone comparisons, then escape the patterns later. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-24sparse-checkout: create leading directoriesDerrick Stolee
The 'git init' command creates the ".git/info" directory and fills it with some default files. However, 'git worktree add' does not create the info directory for that worktree. This causes a problem when running "git sparse-checkout init" inside a worktree. While care was taken to allow the sparse-checkout config to be specific to a worktree, this initialization was untested. Safely create the leading directories for the sparse-checkout file. This is the safest thing to do even without worktrees, as a user could delete their ".git/info" directory and expect Git to recover safely. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-30sparse-checkout: list directories in cone modeDerrick Stolee
When core.sparseCheckoutCone is enabled, the 'git sparse-checkout set' command takes a list of directories as input, then creates an ordered list of sparse-checkout patterns such that those directories are recursively included and all sibling entries along the parent directories are also included. Listing the patterns is less user-friendly than the directories themselves. In cone mode, and as long as the patterns match the expected cone-mode pattern types, change the output of 'git sparse-checkout list' to only show the directories that created the patterns. With this change, the following piped commands would not change the working directory: git sparse-checkout list | git sparse-checkout set --stdin The only time this would not work is if core.sparseCheckoutCone is true, but the sparse-checkout file contains patterns that do not match the expected pattern types for cone mode. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-13sparse-checkout: respect core.ignoreCase in cone modeDerrick Stolee
When a user uses the sparse-checkout feature in cone mode, they add patterns using "git sparse-checkout set <dir1> <dir2> ..." or by using "--stdin" to provide the directories line-by-line over stdin. This behaviour naturally looks a lot like the way a user would type "git add <dir1> <dir2> ..." If core.ignoreCase is enabled, then "git add" will match the input using a case-insensitive match. Do the same for the sparse-checkout feature. Perform case-insensitive checks while updating the skip-worktree bits during unpack_trees(). This is done by changing the hash algorithm and hashmap comparison methods to optionally use case- insensitive methods. When this is enabled, there is a small performance cost in the hashing algorithm. To tease out the worst possible case, the following was run on a repo with a deep directory structure: git ls-tree -d -r --name-only HEAD | git sparse-checkout set --stdin The 'set' command was timed with core.ignoreCase disabled or enabled. For the repo with a deep history, the numbers were core.ignoreCase=false: 62s core.ignoreCase=true: 74s (+19.3%) For reproducibility, the equivalent test on the Linux kernel repository had these numbers: core.ignoreCase=false: 3.1s core.ignoreCase=true: 3.6s (+16%) Now, this is not an entirely fair comparison, as most users will define their sparse cone using more shallow directories, and the performance improvement from eb42feca97 ("unpack-trees: hash less in cone mode" 2019-11-21) can remove most of the hash cost. For a more realistic test, drop the "-r" from the ls-tree command to store only the first-level directories. In that case, the Linux kernel repository takes 0.2-0.25s in each case, and the deep repository takes one second, plus or minus 0.05s, in each case. Thus, we _can_ demonstrate a cost to this change, but it is unlikely to matter to any reasonable sparse-checkout cone. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-22sparse-checkout: check for dirty statusDerrick Stolee
The index-merge performed by 'git sparse-checkout' will erase any staged changes, which can lead to data loss. Prevent these attempts by requiring a clean 'git status' output. Helped-by: Szeder Gábor <szeder.dev@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-22sparse-checkout: update working directory in-process for 'init'Derrick Stolee
The 'git sparse-checkout init' subcommand previously wrote directly to the sparse-checkout file and then updated the working directory. This may fail if there are modified files not included in the initial pattern set. However, that left a populated sparse-checkout file. Use the in-process working directory update to guarantee that the init subcommand only changes the sparse-checkout file if the working directory update succeeds. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-22sparse-checkout: write using lockfileDerrick Stolee
If two 'git sparse-checkout set' subcommands are launched at the same time, the behavior can be unexpected as they compete to write the sparse-checkout file and update the working directory. Take a lockfile around the writes to the sparse-checkout file. In addition, acquire this lock around the working directory update to avoid two commands updating the working directory in different ways. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-22sparse-checkout: use in-process update for disable subcommandDerrick Stolee
The 'git sparse-checkout disable' subcommand returns a user to a full working directory. The old process for doing this required updating the sparse-checkout file with the "/*" pattern and then updating the working directory with core.sparseCheckout enabled. Finally, the sparse-checkout file could be removed and the config setting disabled. However, it is valuable to keep a user's sparse-checkout file intact so they can re-enable the sparse-checkout they previously used with 'git sparse-checkout init'. This is now possible with the in-process mechanism for updating the working directory. Reported-by: Szeder Gábor <szeder.dev@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-22sparse-checkout: update working directory in-processDerrick Stolee
The sparse-checkout builtin used 'git read-tree -mu HEAD' to update the skip-worktree bits in the index and to update the working directory. This extra process is overly complex, and prone to failure. It also requires that we write our changes to the sparse-checkout file before trying to update the index. Remove this extra process call by creating a direct call to unpack_trees() in the same way 'git read-tree -mu HEAD' does. In addition, provide an in-memory list of patterns so we can avoid reading from the sparse-checkout file. This allows us to test a proposed change to the file before writing to it. An earlier version of this patch included a bug when the 'set' command failed due to the "Sparse checkout leaves no entry on working directory" error. It would not rollback the index.lock file, so the replay of the old sparse-checkout specification would fail. A test in t1091 now covers that scenario. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-22sparse-checkout: sanitize for nested foldersDerrick Stolee
If a user provides folders A/ and A/B/ for inclusion in a cone-mode sparse-checkout file, the parsing logic will notice that A/ appears both as a "parent" type pattern and as a "recursive" type pattern. This is unexpected and hence will complain via a warning and revert to the old logic for checking sparse-checkout patterns. Prevent this from happening accidentally by sanitizing the folders for this type of inclusion in the 'git sparse-checkout' builtin. This happens in two ways: 1. Do not include any parent patterns that also appear as recursive patterns. 2. Do not include any recursive patterns deeper than other recursive patterns. In order to minimize duplicate code for scanning parents, create hashmap_contains_parent() method. It takes a strbuf buffer to avoid reallocating a buffer when calling in a tight loop. Helped-by: Eric Wong <e@80x24.org> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-22sparse-checkout: init and set in cone modeDerrick Stolee
To make the cone pattern set easy to use, update the behavior of 'git sparse-checkout (init|set)'. Add '--cone' flag to 'git sparse-checkout init' to set the config option 'core.sparseCheckoutCone=true'. When running 'git sparse-checkout set' in cone mode, a user only needs to supply a list of recursive folder matches. Git will automatically add the necessary parent matches for the leading directories. When testing 'git sparse-checkout set' in cone mode, check the error stream to ensure we do not see any errors. Specifically, we want to avoid the warning that the patterns do not match the cone-mode patterns. Helped-by: Eric Wong <e@80x24.org> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-22sparse-checkout: create 'disable' subcommandDerrick Stolee
The instructions for disabling a sparse-checkout to a full working directory are complicated and non-intuitive. Add a subcommand, 'git sparse-checkout disable', to perform those steps for the user. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-22sparse-checkout: add '--stdin' option to set subcommandDerrick Stolee
The 'git sparse-checkout set' subcommand takes a list of patterns and places them in the sparse-checkout file. Then, it updates the working directory to match those patterns. For a large list of patterns, the command-line call can get very cumbersome. Add a '--stdin' option to instead read patterns over standard in. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-22sparse-checkout: 'set' subcommandDerrick Stolee
The 'git sparse-checkout set' subcommand takes a list of patterns as arguments and writes them to the sparse-checkout file. Then, it updates the working directory using 'git read-tree -mu HEAD'. The 'set' subcommand will replace the entire contents of the sparse-checkout file. The write_patterns_and_update() method is extracted from cmd_sparse_checkout() to make it easier to implement 'add' and/or 'remove' subcommands in the future. If the core.sparseCheckout config setting is disabled, then enable the config setting in the worktree config. If we set the config this way and the sparse-checkout fails, then re-disable the config setting. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-22clone: add --sparse modeDerrick Stolee
When someone wants to clone a large repository, but plans to work using a sparse-checkout file, they either need to do a full checkout first and then reduce the patterns they included, or clone with --no-checkout, set up their patterns, and then run a checkout manually. This requires knowing a lot about the repo shape and how sparse-checkout works. Add a new '--sparse' option to 'git clone' that initializes the sparse-checkout file to include the following patterns: /* !/*/ These patterns include every file in the root directory, but no directories. This allows a repo to include files like a README or a bootstrapping script to grow enlistments from that point. During the 'git sparse-checkout init' call, we must first look to see if HEAD is valid, since 'git clone' does not have a valid HEAD at the point where it initializes the sparse-checkout. The following checkout within the clone command will create the HEAD ref and update the working directory correctly. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-22sparse-checkout: create 'init' subcommandDerrick Stolee
Getting started with a sparse-checkout file can be daunting. Help users start their sparse enlistment using 'git sparse-checkout init'. This will set 'core.sparseCheckout=true' in their config, write an initial set of patterns to the sparse-checkout file, and update their working directory. Make sure to use the `extensions.worktreeConfig` setting and write the sparse checkout config to the worktree-specific config file. This avoids confusing interactions with other worktrees. The use of running another process for 'git read-tree' is sub- optimal. This will be removed in a later change. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-22sparse-checkout: create builtin with 'list' subcommandDerrick Stolee
The sparse-checkout feature is mostly hidden to users, as its only documentation is supplementary information in the docs for 'git read-tree'. In addition, users need to know how to edit the .git/info/sparse-checkout file with the right patterns, then run the appropriate 'git read-tree -mu HEAD' command. Keeping the working directory in sync with the sparse-checkout file requires care. Begin an effort to make the sparse-checkout feature a porcelain feature by creating a new 'git sparse-checkout' builtin. This builtin will be the preferred mechanism for manipulating the sparse-checkout file and syncing the working directory. The documentation provided is adapted from the "git read-tree" documentation with a few edits for clarity in the new context. Extra sections are added to hint toward a future change to a more restricted pattern set. Helped-by: Elijah Newren <newren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>