diff options
Diffstat (limited to 'Documentation/git-sparse-checkout.txt')
-rw-r--r-- | Documentation/git-sparse-checkout.txt | 367 |
1 files changed, 276 insertions, 91 deletions
diff --git a/Documentation/git-sparse-checkout.txt b/Documentation/git-sparse-checkout.txt index b81dbe0..3776705 100644 --- a/Documentation/git-sparse-checkout.txt +++ b/Documentation/git-sparse-checkout.txt @@ -3,9 +3,7 @@ git-sparse-checkout(1) NAME ---- -git-sparse-checkout - Initialize and modify the sparse-checkout -configuration, which reduces the checkout to a set of paths -given by a list of patterns. +git-sparse-checkout - Reduce your working tree to a subset of tracked files SYNOPSIS @@ -17,8 +15,20 @@ SYNOPSIS DESCRIPTION ----------- -Initialize and modify the sparse-checkout configuration, which reduces -the checkout to a set of paths given by a list of patterns. +This command is used to create sparse checkouts, which change the +working tree from having all tracked files present to only having a +subset of those files. It can also switch which subset of files are +present, or undo and go back to having all tracked files present in +the working copy. + +The subset of files is chosen by providing a list of directories in +cone mode (the default), or by providing a list of patterns in +non-cone mode. + +When in a sparse-checkout, other Git commands behave a bit differently. +For example, switching branches will not update paths outside the +sparse-checkout directories/patterns, and `git commit -a` will not record +paths outside the sparse-checkout directories/patterns as deleted. THIS COMMAND IS EXPERIMENTAL. ITS BEHAVIOR, AND THE BEHAVIOR OF OTHER COMMANDS IN THE PRESENCE OF SPARSE-CHECKOUTS, WILL LIKELY CHANGE IN @@ -28,30 +38,42 @@ THE FUTURE. COMMANDS -------- 'list':: - Describe the patterns in the sparse-checkout file. + Describe the directories or patterns in the sparse-checkout file. 'set':: - Enable the necessary config settings - (extensions.worktreeConfig, core.sparseCheckout, - core.sparseCheckoutCone) if they are not already enabled, and - write a set of patterns to the sparse-checkout file from the - list of arguments following the 'set' subcommand. Update the - working directory to match the new patterns. + Enable the necessary sparse-checkout config settings + (`core.sparseCheckout`, `core.sparseCheckoutCone`, and + `index.sparse`) if they are not already set to the desired values, + populate the sparse-checkout file from the list of arguments + following the 'set' subcommand, and update the working directory to + match. ++ +To ensure that adjusting the sparse-checkout settings within a worktree +does not alter the sparse-checkout settings in other worktrees, the 'set' +subcommand will upgrade your repository config to use worktree-specific +config if not already present. The sparsity defined by the arguments to +the 'set' subcommand are stored in the worktree-specific sparse-checkout +file. See linkgit:git-worktree[1] and the documentation of +`extensions.worktreeConfig` in linkgit:git-config[1] for more details. + -When the `--stdin` option is provided, the patterns are read from -standard in as a newline-delimited list instead of from the arguments. +When the `--stdin` option is provided, the directories or patterns are +read from standard in as a newline-delimited list instead of from the +arguments. + -When `--cone` is passed or `core.sparseCheckoutCone` is enabled, the -input list is considered a list of directories instead of -sparse-checkout patterns. This allows for better performance with a -limited set of patterns (see 'CONE PATTERN SET' below). Note that the -set command will write patterns to the sparse-checkout file to include -all files contained in those directories (recursively) as well as -files that are siblings of ancestor directories. The input format -matches the output of `git ls-tree --name-only`. This includes -interpreting pathnames that begin with a double quote (") as C-style -quoted strings. This may become the default in the future; --no-cone -can be passed to request non-cone mode. +By default, the input list is considered a list of directories, matching +the output of `git ls-tree -d --name-only`. This includes interpreting +pathnames that begin with a double quote (") as C-style quoted strings. +Note that all files under the specified directories (at any depth) will +be included in the sparse checkout, as well as files that are siblings +of either the given directory or any of its ancestors (see 'CONE PATTERN +SET' below for more details). In the past, this was not the default, +and `--cone` needed to be specified or `core.sparseCheckoutCone` needed +to be enabled. ++ +When `--no-cone` is passed, the input list is considered a list of +patterns. This mode has a number of drawbacks, including not working +with some options like `--sparse-index`. As explained in the +"Non-cone Problems" section below, we do not recommend using it. + Use the `--[no-]sparse-index` option to use a sparse index (the default is to not use it). A sparse index reduces the size of the @@ -69,11 +91,10 @@ understand the sparse directory entries index extension and may fail to interact with your repository until it is disabled. 'add':: - Update the sparse-checkout file to include additional patterns. - By default, these patterns are read from the command-line arguments, - but they can be read from stdin using the `--stdin` option. When - `core.sparseCheckoutCone` is enabled, the given patterns are interpreted - as directory names as in the 'set' subcommand. + Update the sparse-checkout file to include additional directories + (in cone mode) or patterns (in non-cone mode). By default, these + directories or patterns are read from the command-line arguments, + but they can be read from stdin using the `--stdin` option. 'reapply':: Reapply the sparsity pattern rules to paths in the working tree. @@ -114,16 +135,54 @@ paths to pass to a subsequent 'set' or 'add' command. However, the disable command, so the easy restore of calling a plain `init` decreased in utility. -SPARSE CHECKOUT ---------------- - -"Sparse checkout" allows populating the working directory sparsely. -It uses the skip-worktree bit (see linkgit:git-update-index[1]) to tell -Git whether a file in the working directory is worth looking at. If -the skip-worktree bit is set, then the file is ignored in the working -directory. Git will avoid populating the contents of those files, which -makes a sparse checkout helpful when working in a repository with many -files, but only a few are important to the current user. +EXAMPLES +-------- +`git sparse-checkout set MY/DIR1 SUB/DIR2`:: + + Change to a sparse checkout with all files (at any depth) under + MY/DIR1/ and SUB/DIR2/ present in the working copy (plus all + files immediately under MY/ and SUB/ and the toplevel + directory). If already in a sparse checkout, change which files + are present in the working copy to this new selection. Note + that this command will also delete all ignored files in any + directory that no longer has either tracked or + non-ignored-untracked files present. + +`git sparse-checkout disable`:: + + Repopulate the working directory with all files, disabling sparse + checkouts. + +`git sparse-checkout add SOME/DIR/ECTORY`:: + + Add all files under SOME/DIR/ECTORY/ (at any depth) to the + sparse checkout, as well as all files immediately under + SOME/DIR/ and immediately under SOME/. Must already be in a + sparse checkout before using this command. + +`git sparse-checkout reapply`:: + + It is possible for commands to update the working tree in a + way that does not respect the selected sparsity directories. + This can come from tools external to Git writing files, or + even affect Git commands because of either special cases (such + as hitting conflicts when merging/rebasing), or because some + commands didn't fully support sparse checkouts (e.g. the old + `recursive` merge backend had only limited support). This + command reapplies the existing sparse directory specifications + to make the working directory match. + +INTERNALS -- SPARSE CHECKOUT +---------------------------- + +"Sparse checkout" allows populating the working directory sparsely. It +uses the skip-worktree bit (see linkgit:git-update-index[1]) to tell Git +whether a file in the working directory is worth looking at. If the +skip-worktree bit is set, and the file is not present in the working tree, +then its absence is ignored. Git will avoid populating the contents of +those files, which makes a sparse checkout helpful when working in a +repository with many files, but only a few are important to the current +user. The `$GIT_DIR/info/sparse-checkout` file is used to define the skip-worktree reference bitmap. When Git updates the working @@ -131,64 +190,196 @@ directory, it updates the skip-worktree bits in the index based on this file. The files matching the patterns in the file will appear in the working directory, and the rest will not. -To enable the sparse-checkout feature, run `git sparse-checkout set` to -set the patterns you want to use. +INTERNALS -- NON-CONE PROBLEMS +------------------------------ + +The `$GIT_DIR/info/sparse-checkout` file populated by the `set` and +`add` subcommands is defined to be a bunch of patterns (one per line) +using the same syntax as `.gitignore` files. In cone mode, these +patterns are restricted to matching directories (and users only ever +need supply or see directory names), while in non-cone mode any +gitignore-style pattern is permitted. Using the full gitignore-style +patterns in non-cone mode has a number of shortcomings: + + * Fundamentally, it makes various worktree-updating processes (pull, + merge, rebase, switch, reset, checkout, etc.) require O(N*M) pattern + matches, where N is the number of patterns and M is the number of + paths in the index. This scales poorly. + + * Avoiding the scaling issue has to be done via limiting the number + of patterns via specifying leading directory name or glob. + + * Passing globs on the command line is error-prone as users may + forget to quote the glob, causing the shell to expand it into all + matching files and pass them all individually along to + sparse-checkout set/add. While this could also be a problem with + e.g. "git grep -- *.c", mistakes with grep/log/status appear in + the immediate output. With sparse-checkout, the mistake gets + recorded at the time the sparse-checkout command is run and might + not be problematic until the user later switches branches or rebases + or merges, thus putting a delay between the user's error and when + they have a chance to catch/notice it. + + * Related to the previous item, sparse-checkout has an 'add' + subcommand but no 'remove' subcommand. Even if a 'remove' + subcommand were added, undoing an accidental unquoted glob runs + the risk of "removing too much", as it may remove entries that had + been included before the accidental add. + + * Non-cone mode uses gitignore-style patterns to select what to + *include* (with the exception of negated patterns), while + .gitignore files use gitignore-style patterns to select what to + *exclude* (with the exception of negated patterns). The + documentation on gitignore-style patterns usually does not talk in + terms of matching or non-matching, but on what the user wants to + "exclude". This can cause confusion for users trying to learn how + to specify sparse-checkout patterns to get their desired behavior. + + * Every other git subcommand that wants to provide "special path + pattern matching" of some sort uses pathspecs, but non-cone mode + for sparse-checkout uses gitignore patterns, which feels + inconsistent. + + * It has edge cases where the "right" behavior is unclear. Two examples: + + First, two users are in a subdirectory, and the first runs + git sparse-checkout set '/toplevel-dir/*.c' + while the second runs + git sparse-checkout set relative-dir + Should those arguments be transliterated into + current/subdirectory/toplevel-dir/*.c + and + current/subdirectory/relative-dir + before inserting into the sparse-checkout file? The user who typed + the first command is probably aware that arguments to set/add are + supposed to be patterns in non-cone mode, and probably would not be + happy with such a transliteration. However, many gitignore-style + patterns are just paths, which might be what the user who typed the + second command was thinking, and they'd be upset if their argument + wasn't transliterated. + + Second, what should bash-completion complete on for set/add commands + for non-cone users? If it suggests paths, is it exacerbating the + problem above? Also, if it suggests paths, what if the user has a + file or directory that begins with either a '!' or '#' or has a '*', + '\', '?', '[', or ']' in its name? And if it suggests paths, will + it complete "/pro" to "/proc" (in the root filesytem) rather than to + "/progress.txt" in the current directory? (Note that users are + likely to want to start paths with a leading '/' in non-cone mode, + for the same reason that .gitignore files often have one.) + Completing on files or directories might give nasty surprises in + all these cases. + + * The excessive flexibility made other extensions essentially + impractical. `--sparse-index` is likely impossible in non-cone + mode; even if it is somehow feasible, it would have been far more + work to implement and may have been too slow in practice. Some + ideas for adding coupling between partial clones and sparse + checkouts are only practical with a more restricted set of paths + as well. + +For all these reasons, non-cone mode is deprecated. Please switch to +using cone mode. + + +INTERNALS -- CONE MODE HANDLING +------------------------------- + +The "cone mode", which is the default, lets you specify only what +directories to include. For any directory specified, all paths below +that directory will be included, and any paths immediately under +leading directories (including the toplevel directory) will also be +included. Thus, if you specified the directory + Documentation/technical/ +then your sparse checkout would contain: + + * all files in the toplevel-directory + * all files immediately under Documentation/ + * all files at any depth under Documentation/technical/ + +Also, in cone mode, even if no directories are specified, then the +files in the toplevel directory will be included. -To repopulate the working directory with all files, use the -`git sparse-checkout disable` command. +When changing the sparse-checkout patterns in cone mode, Git will inspect each +tracked directory that is not within the sparse-checkout cone to see if it +contains any untracked files. If all of those files are ignored due to the +`.gitignore` patterns, then the directory will be deleted. If any of the +untracked files within that directory is not ignored, then no deletions will +occur within that directory and a warning message will appear. If these files +are important, then reset your sparse-checkout definition so they are included, +use `git add` and `git commit` to store them, then remove any remaining files +manually to ensure Git can behave optimally. +See also the "Internals -- Cone Pattern Set" section to learn how the +directories are transformed under the hood into a subset of the +Full Pattern Set of sparse-checkout. -FULL PATTERN SET ----------------- -By default, the sparse-checkout file uses the same syntax as `.gitignore` -files. +INTERNALS -- FULL PATTERN SET +----------------------------- + +The full pattern set allows for arbitrary pattern matches and complicated +inclusion/exclusion rules. These can result in O(N*M) pattern matches when +updating the index, where N is the number of patterns and M is the number +of paths in the index. To combat this performance issue, a more restricted +pattern set is allowed when `core.sparseCheckoutCone` is enabled. + +The sparse-checkout file uses the same syntax as `.gitignore` files; +see linkgit:gitignore[5] for details. Here, though, the patterns are +usually being used to select which files to include rather than which +files to exclude. (However, it can get a bit confusing since +gitignore-style patterns have negations defined by patterns which +begin with a '!', so you can also select files to _not_ include.) + +For example, to select everything, and then to remove the file +`unwanted` (so that every file will appear in your working tree except +the file named `unwanted`): -While `$GIT_DIR/info/sparse-checkout` is usually used to specify what -files are included, you can also specify what files are _not_ included, -using negative patterns. For example, to remove the file `unwanted`: + git sparse-checkout set --no-cone '/*' '!unwanted' + +These patterns are just placed into the +`$GIT_DIR/info/sparse-checkout` as-is, so the contents of that file +at this point would be ---------------- /* !unwanted ---------------- +See also the "Sparse Checkout" section of linkgit:git-read-tree[1] to +learn more about the gitignore-style patterns used in sparse +checkouts. -CONE PATTERN SET ----------------- -The full pattern set allows for arbitrary pattern matches and complicated -inclusion/exclusion rules. These can result in O(N*M) pattern matches when -updating the index, where N is the number of patterns and M is the number -of paths in the index. To combat this performance issue, a more restricted -pattern set is allowed when `core.sparseCheckoutCone` is enabled. +INTERNALS -- CONE PATTERN SET +----------------------------- -The accepted patterns in the cone pattern set are: +In cone mode, only directories are accepted, but they are translated into +the same gitignore-style patterns used in the full pattern set. We refer +to the particular patterns used in those mode as being of one of two types: 1. *Recursive:* All paths inside a directory are included. 2. *Parent:* All files immediately inside a directory are included. -In addition to the above two patterns, we also expect that all files in the -root directory are included. If a recursive pattern is added, then all -leading directories are added as parent patterns. - -By default, when running `git sparse-checkout init`, the root directory is -added as a parent pattern. At this point, the sparse-checkout file contains -the following patterns: +Since cone mode always includes files at the toplevel, when running +`git sparse-checkout set` with no directories specified, the toplevel +directory is added as a parent pattern. At this point, the +sparse-checkout file contains the following patterns: ---------------- /* !/*/ ---------------- -This says "include everything in root, but nothing two levels below root." +This says "include everything immediately under the toplevel +directory, but nothing at any level below that." -When in cone mode, the `git sparse-checkout set` subcommand takes a list of -directories instead of a list of sparse-checkout patterns. In this mode, -the command `git sparse-checkout set A/B/C` sets the directory `A/B/C` as -a recursive pattern, the directories `A` and `A/B` are added as parent -patterns. The resulting sparse-checkout file is now +When in cone mode, the `git sparse-checkout set` subcommand takes a +list of directories. The command `git sparse-checkout set A/B/C` sets +the directory `A/B/C` as a recursive pattern, the directories `A` and +`A/B` are added as parent patterns. The resulting sparse-checkout file +is now ---------------- /* @@ -203,14 +394,18 @@ patterns. The resulting sparse-checkout file is now Here, order matters, so the negative patterns are overridden by the positive patterns that appear lower in the file. -If `core.sparseCheckoutCone=true`, then Git will parse the sparse-checkout file -expecting patterns of these types. Git will warn if the patterns do not match. -If the patterns do match the expected format, then Git will use faster hash- -based algorithms to compute inclusion in the sparse-checkout. +Unless `core.sparseCheckoutCone` is explicitly set to `false`, Git will +parse the sparse-checkout file expecting patterns of these types. Git will +warn if the patterns do not match. If the patterns do match the expected +format, then Git will use faster hash-based algorithms to compute inclusion +in the sparse-checkout. If they do not match, git will behave as though +`core.sparseCheckoutCone` was false, regardless of its setting. -In the cone mode case, the `git sparse-checkout list` subcommand will list the -directories that define the recursive patterns. For the example sparse-checkout -file above, the output is as follows: +In the cone mode case, despite the fact that full patterns are written +to the $GIT_DIR/info/sparse-checkout file, the `git sparse-checkout +list` subcommand will list the directories that define the recursive +patterns. For the example sparse-checkout file above, the output is as +follows: -------------------------- $ git sparse-checkout list @@ -222,19 +417,9 @@ case-insensitive check. This corrects for case mismatched filenames in the 'git sparse-checkout set' command to reflect the expected cone in the working directory. -When changing the sparse-checkout patterns in cone mode, Git will inspect each -tracked directory that is not within the sparse-checkout cone to see if it -contains any untracked files. If all of those files are ignored due to the -`.gitignore` patterns, then the directory will be deleted. If any of the -untracked files within that directory is not ignored, then no deletions will -occur within that directory and a warning message will appear. If these files -are important, then reset your sparse-checkout definition so they are included, -use `git add` and `git commit` to store them, then remove any remaining files -manually to ensure Git can behave optimally. - -SUBMODULES ----------- +INTERNALS -- SUBMODULES +----------------------- If your repository contains one or more submodules, then submodules are populated based on interactions with the `git submodule` command. |