summaryrefslogtreecommitdiff
path: root/unpack-trees.c
AgeCommit message (Collapse)Author
2018-08-02Merge branch 'jm/cache-entry-from-mem-pool'Junio C Hamano
For a large tree, the index needs to hold many cache entries allocated on heap. These cache entries are now allocated out of a dedicated memory pool to amortize malloc(3) overhead. * jm/cache-entry-from-mem-pool: block alloc: add validations around cache_entry lifecyle block alloc: allocate cache entries from mem_pool mem-pool: fill out functionality mem-pool: add life cycle management functions mem-pool: only search head block for available space block alloc: add lifecycle APIs for cache_entry structs read-cache: teach make_cache_entry to take object_id read-cache: teach refresh_cache_entry to take istate
2018-07-24Merge branch 'mk/merge-in-sparse-checkout'Junio C Hamano
"git reset --merge" (hence "git merge ---abort") and "git reset --hard" had trouble working correctly in a sparsely checked out working tree after a conflict, which has been corrected. * mk/merge-in-sparse-checkout: unpack-trees: do not fail reset because of unmerged skipped entry
2018-07-18Merge branch 'sb/object-store-grafts'Junio C Hamano
The conversion to pass "the_repository" and then "a_repository" throughout the object access API continues. * sb/object-store-grafts: commit: allow lookup_commit_graft to handle arbitrary repositories commit: allow prepare_commit_graft to handle arbitrary repositories shallow: migrate shallow information into the object parser path.c: migrate global git_path_* to take a repository argument cache: convert get_graft_file to handle arbitrary repositories commit: convert read_graft_file to handle arbitrary repositories commit: convert register_commit_graft to handle arbitrary repositories commit: convert commit_graft_pos() to handle arbitrary repositories shallow: add repository argument to is_repository_shallow shallow: add repository argument to check_shallow_file_for_update shallow: add repository argument to register_shallow shallow: add repository argument to set_alternate_shallow_file commit: add repository argument to lookup_commit_graft commit: add repository argument to prepare_commit_graft commit: add repository argument to read_graft_file commit: add repository argument to register_commit_graft commit: add repository argument to commit_graft_pos object: move grafts to object parser object-store: move object access functions to object-store.h
2018-07-11unpack-trees: do not fail reset because of unmerged skipped entryMax Kirillov
After modify/delete merge conflict happens in a file skipped by sparse checkout, "git reset --merge", which implements the "--abort" actions, and "git reset --hard" fail with message "Entry * not uptodate. Cannot update sparse checkout." As explained in [1], the up-to-date checker mistakenly treats conflicted entry which does not exist in HEAD as still skipped by sparse checkout. Use the fix suggested in [1]. Also, add test case which verifies the issue is fixed. [1] https://public-inbox.org/git/20180616051444.GA29754@duynguyen.home/ Signed-off-by: Duy Nguyen <pclouds@gmail.com> Signed-off-by: Max Kirillov <max@max630.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-03block alloc: allocate cache entries from mem_poolJameson Miller
When reading large indexes from disk, a portion of the time is dominated in malloc() calls. This can be mitigated by allocating a large block of memory and manage it ourselves via memory pools. This change moves the cache entry allocation to be on top of memory pools. Design: The index_state struct will gain a notion of an associated memory_pool from which cache_entries will be allocated from. When reading in the index from disk, we have information on the number of entries and their size, which can guide us in deciding how large our initial memory allocation should be. When an index is discarded, the associated memory_pool will be discarded as well - so the lifetime of a cache_entry is tied to the lifetime of the index_state that it was allocated for. In the case of a Split Index, the following rules are followed. 1st, some terminology is defined: Terminology: - 'the_index': represents the logical view of the index - 'split_index': represents the "base" cache entries. Read from the split index file. 'the_index' can reference a single split_index, as well as cache_entries from the split_index. `the_index` will be discarded before the `split_index` is. This means that when we are allocating cache_entries in the presence of a split index, we need to allocate the entries from the `split_index`'s memory pool. This allows us to follow the pattern that `the_index` can reference cache_entries from the `split_index`, and that the cache_entries will not be freed while they are still being referenced. Managing transient cache_entry structs: Cache entries are usually allocated for an index, but this is not always the case. Cache entries are sometimes allocated because this is the type that the existing checkout_entry function works with. Because of this, the existing code needs to handle cache entries associated with an index / memory pool, and those that only exist transiently. Several strategies were contemplated around how to handle this: Chosen approach: An extra field was added to the cache_entry type to track whether the cache_entry was allocated from a memory pool or not. This is currently an int field, as there are no more available bits in the existing ce_flags bit field. If / when more bits are needed, this new field can be turned into a proper bit field. Alternatives: 1) Do not include any information about how the cache_entry was allocated. Calling code would be responsible for tracking whether the cache_entry needed to be freed or not. Pro: No extra memory overhead to track this state Con: Extra complexity in callers to handle this correctly. The extra complexity and burden to not regress this behavior in the future was more than we wanted. 2) cache_entry would gain knowledge about which mem_pool allocated it Pro: Could (potentially) do extra logic to know when a mem_pool no longer had references to any cache_entry Con: cache_entry would grow heavier by a pointer, instead of int We didn't see a tangible benefit to this approach 3) Do not add any extra information to a cache_entry, but when freeing a cache entry, check if the memory exists in a region managed by existing mem_pools. Pro: No extra memory overhead to track state Con: Extra computation is performed when freeing cache entries We decided tracking and iterating over known memory pool regions was less desirable than adding an extra field to track this stae. Signed-off-by: Jameson Miller <jamill@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-03block alloc: add lifecycle APIs for cache_entry structsJameson Miller
It has been observed that the time spent loading an index with a large number of entries is partly dominated by malloc() calls. This change is in preparation for using memory pools to reduce the number of malloc() calls made to allocate cahce entries when loading an index. Add an API to allocate and discard cache entries, abstracting the details of managing the memory backing the cache entries. This commit does actually change how memory is managed - this will be done in a later commit in the series. This change makes the distinction between cache entries that are associated with an index and cache entries that are not associated with an index. A main use of cache entries is with an index, and we can optimize the memory management around this. We still have other cases where a cache entry is not persisted with an index, and so we need to handle the "transient" use case as well. To keep the congnitive overhead of managing the cache entries, there will only be a single discard function. This means there must be enough information kept with the cache entry so that we know how to discard them. A summary of the main functions in the API is: make_cache_entry: create cache entry for use in an index. Uses specified parameters to populate cache_entry fields. make_empty_cache_entry: Create an empty cache entry for use in an index. Returns cache entry with empty fields. make_transient_cache_entry: create cache entry that is not used in an index. Uses specified parameters to populate cache_entry fields. make_empty_transient_cache_entry: create cache entry that is not used in an index. Returns cache entry with empty fields. discard_cache_entry: A single function that knows how to discard a cache entry regardless of how it was allocated. Signed-off-by: Jameson Miller <jamill@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-30Merge branch 'ma/unpack-trees-free-msgs'Junio C Hamano
Leak plugging. * ma/unpack-trees-free-msgs: unpack_trees_options: free messages when done argv-array: return the pushed string from argv_push*() merge-recursive: provide pair of `unpack_trees_{start,finish}()` merge: setup `opts` later in `checkout_fast_forward()`
2018-05-30Merge branch 'bc/object-id'Junio C Hamano
Conversion from uchar[20] to struct object_id continues. * bc/object-id: (42 commits) merge-one-file: compute empty blob object ID add--interactive: compute the empty tree value Update shell scripts to compute empty tree object ID sha1_file: only expose empty object constants through git_hash_algo dir: use the_hash_algo for empty blob object ID sequencer: use the_hash_algo for empty tree object ID cache-tree: use is_empty_tree_oid sha1_file: convert cached object code to struct object_id builtin/reset: convert use of EMPTY_TREE_SHA1_BIN builtin/receive-pack: convert one use of EMPTY_TREE_SHA1_HEX wt-status: convert two uses of EMPTY_TREE_SHA1_HEX submodule: convert several uses of EMPTY_TREE_SHA1_HEX sequencer: convert one use of EMPTY_TREE_SHA1_HEX merge: convert empty tree constant to the_hash_algo builtin/merge: switch tree functions to use object_id builtin/am: convert uses of EMPTY_TREE_SHA1_BIN to the_hash_algo sha1-file: add functions for hex empty tree and blob OIDs builtin/receive-pack: avoid hard-coded constants for push certs diff: specify abbreviation size in terms of the_hash_algo upload-pack: replace use of several hard-coded constants ...
2018-05-30Merge branch 'js/use-bug-macro'Junio C Hamano
Developer support update, by using BUG() macro instead of die() to mark codepaths that should not happen more clearly. * js/use-bug-macro: BUG_exit_code: fix sparse "symbol not declared" warning Convert remaining die*(BUG) messages Replace all die("BUG: ...") calls by BUG() ones run-command: use BUG() to report bugs, not die() test-tool: help verifying BUG() code paths
2018-05-23Merge branch 'en/unpack-trees-split-index-fix'Junio C Hamano
The split-index feature had a long-standing and dormant bug in certain use of the in-core merge machinery, which has been fixed. * en/unpack-trees-split-index-fix: unpack_trees: fix breakage when o->src_index != o->dst_index
2018-05-23Merge branch 'en/rename-directory-detection-reboot'Junio C Hamano
Rename detection logic in "diff" family that is used in "merge" has learned to guess when all of x/a, x/b and x/c have moved to z/a, z/b and z/c, it is likely that x/d added in the meantime would also want to move to z/d by taking the hint that the entire directory 'x' moved to 'z'. A bug causing dirty files involved in a rename to be overwritten during merge has also been fixed as part of this work. Incidentally, this also avoids updating a file in the working tree after a (non-trivial) merge whose result matches what our side originally had. * en/rename-directory-detection-reboot: (36 commits) merge-recursive: fix check for skipability of working tree updates merge-recursive: make "Auto-merging" comment show for other merges merge-recursive: fix remainder of was_dirty() to use original index merge-recursive: fix was_tracked() to quit lying with some renamed paths t6046: testcases checking whether updates can be skipped in a merge merge-recursive: avoid triggering add_cacheinfo error with dirty mod merge-recursive: move more is_dirty handling to merge_content merge-recursive: improve add_cacheinfo error handling merge-recursive: avoid spurious rename/rename conflict from dir renames directory rename detection: new testcases showcasing a pair of bugs merge-recursive: fix remaining directory rename + dirty overwrite cases merge-recursive: fix overwriting dirty files involved in renames merge-recursive: avoid clobbering untracked files with directory renames merge-recursive: apply necessary modifications for directory renames merge-recursive: when comparing files, don't include trees merge-recursive: check for file level conflicts then get new name merge-recursive: add computation of collisions due to dir rename & merging merge-recursive: check for directory level conflicts merge-recursive: add get_directory_renames() merge-recursive: make a helper function for cleanup for handle_renames ...
2018-05-22unpack_trees_options: free messages when doneMartin Ågren
The strings allocated in `setup_unpack_trees_porcelain()` are never freed. Provide a function `clear_unpack_trees_porcelain()` to do so and call it where we use `setup_unpack_trees_porcelain()`. The only non-trivial user is `unpack_trees_start()`, where we should place the new call in `unpack_trees_finish()`. We keep the string pointers in an array, mixing pointers to static memory and memory that we allocate on the heap. We also keep several copies of the individual pointers. So we need to make sure that we do not free what we must not free and that we do not double-free. Let a separate argv_array take ownership of all the strings we create so that we can easily free them. Zero the whole array of string pointers to make sure that we do not leave any dangling pointers. Note that we only take responsibility for the memory allocated in `setup_unpack_trees_porcelain()` and not any other members of the `struct unpack_trees_options`. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-16object-store: move object access functions to object-store.hStefan Beller
This should make these functions easier to find and cache.h less overwhelming to read. In particular, this moves: - read_object_file - oid_object_info - write_object_file As a result, most of the codebase needs to #include object-store.h. In this patch the #include is only added to files that would fail to compile otherwise. It would be better to #include wherever identifiers from the header are used. That can happen later when we have better tooling for it. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-08merge-recursive: fix overwriting dirty files involved in renamesElijah Newren
This fixes an issue that existed before my directory rename detection patches that affects both normal renames and renames implied by directory rename detection. Additional codepaths that only affect overwriting of dirty files that are involved in directory rename detection will be added in a subsequent commit. Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-08Merge branch 'sb/submodule-move-nested'Junio C Hamano
Moving a submodule that itself has submodule in it with "git mv" forgot to make necessary adjustment to the nested sub-submodules; now the codepath learned to recurse into the submodules. * sb/submodule-move-nested: submodule: fixup nested submodules after moving the submodule submodule-config: remove submodule_from_cache submodule-config: add repository argument to submodule_from_{name, path} submodule-config: allow submodule_free to handle arbitrary repositories grep: remove "repo" arg from non-supporting funcs submodule.h: drop declaration of connect_work_tree_and_git_dir
2018-05-06Replace all die("BUG: ...") calls by BUG() onesJohannes Schindelin
In d8193743e08 (usage.c: add BUG() function, 2017-05-12), a new macro was introduced to use for reporting bugs instead of die(). It was then subsequently used to convert one single caller in 588a538ae55 (setup_git_env: convert die("BUG") to BUG(), 2017-05-12). The cover letter of the patch series containing this patch (cf 20170513032414.mfrwabt4hovujde2@sigill.intra.peff.net) is not terribly clear why only one call site was converted, or what the plan is for other, similar calls to die() to report bugs. Let's just convert all remaining ones in one fell swoop. This trick was performed by this invocation: sed -i 's/die("BUG: /BUG("/g' $(git grep -l 'die("BUG' \*.c) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-02Update struct index_state to use struct object_idbrian m. carlson
Adjust struct index_state to use struct object_id instead of unsigned char [20]. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-02unpack_trees: fix breakage when o->src_index != o->dst_indexElijah Newren
Currently, all callers of unpack_trees() set o->src_index == o->dst_index. The code in unpack_trees() does not correctly handle them being different. There are two separate issues: First, there is the possibility of memory corruption. Since unpack_trees() creates a temporary index in o->result and then discards o->dst_index and overwrites it with o->result, in the special case that o->src_index == o->dst_index, it is safe to just reuse o->src_index's split_index for o->result. However, when src and dst are different, reusing o->src_index's split_index for o->result will cause the split_index to be shared. If either index then has entries replaced or removed, it will result in the other index referring to free()'d memory. Second, we can drop the index extensions. Previously, we were moving index extensions from o->dst_index to o->result. Since o->src_index is the one that will have the necessary extensions (o->dst_index is likely to be a new index temporary index created to store the results), we should be moving the index extensions from there. Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-11Revert "Merge branch 'en/rename-directory-detection'"Junio C Hamano
This reverts commit e4bb62fa1eeee689744b413e29a50b4d1dae6886, reversing changes made to 468165c1d8a442994a825f3684528361727cd8c0. The topic appears to inflict severe regression in renaming merges, even though the promise of it was that it would improve them. We do not yet know which exact change in the topic was wrong, but in the meantime, let's play it safe and revert it out of 'master' before real Git-using projects are harmed. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-09Merge branch 'en/rename-directory-detection'Junio C Hamano
Rename detection logic in "diff" family that is used in "merge" has learned to guess when all of x/a, x/b and x/c have moved to z/a, z/b and z/c, it is likely that x/d added in the meantime would also want to move to z/d by taking the hint that the entire directory 'x' moved to 'z'. A bug causing dirty files involved in a rename to be overwritten during merge has also been fixed as part of this work. * en/rename-directory-detection: (29 commits) merge-recursive: ensure we write updates for directory-renamed file merge-recursive: avoid spurious rename/rename conflict from dir renames directory rename detection: new testcases showcasing a pair of bugs merge-recursive: fix remaining directory rename + dirty overwrite cases merge-recursive: fix overwriting dirty files involved in renames merge-recursive: avoid clobbering untracked files with directory renames merge-recursive: apply necessary modifications for directory renames merge-recursive: when comparing files, don't include trees merge-recursive: check for file level conflicts then get new name merge-recursive: add computation of collisions due to dir rename & merging merge-recursive: check for directory level conflicts merge-recursive: add get_directory_renames() merge-recursive: make a helper function for cleanup for handle_renames merge-recursive: split out code for determining diff_filepairs merge-recursive: make !o->detect_rename codepath more obvious merge-recursive: fix leaks of allocated renames and diff_filepairs merge-recursive: introduce new functions to handle rename logic merge-recursive: move the get_renames() function directory rename detection: tests for handling overwriting dirty files directory rename detection: tests for handling overwriting untracked files ...
2018-03-29Merge branch 'jh/partial-clone'Junio C Hamano
Hotfix. * jh/partial-clone: upload-pack: disable object filtering when disabled by config unpack-trees: release oid_array after use in check_updates()
2018-03-29submodule-config: allow submodule_free to handle arbitrary repositoriesStefan Beller
At some point we may want to rename the function so that it describes what it actually does as 'submodule_free' doesn't quite describe that this clears a repository's submodule cache. But that's beyond the scope of this series. While at it remove the extern key word from its declaration. Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-25unpack-trees: release oid_array after use in check_updates()René Scharfe
Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-06Merge branch 'bw/c-plus-plus'Junio C Hamano
Avoid using identifiers that clash with C++ keywords. Even though it is not a goal to compile Git with C++ compilers, changes like this help use of code analysis tools that targets C++ on our codebase. * bw/c-plus-plus: (37 commits) replace: rename 'new' variables trailer: rename 'template' variables tempfile: rename 'template' variables wrapper: rename 'template' variables environment: rename 'namespace' variables diff: rename 'template' variables environment: rename 'template' variables init-db: rename 'template' variables unpack-trees: rename 'new' variables trailer: rename 'new' variables submodule: rename 'new' variables split-index: rename 'new' variables remote: rename 'new' variables ref-filter: rename 'new' variables read-cache: rename 'new' variables line-log: rename 'new' variables imap-send: rename 'new' variables http: rename 'new' variables entry: rename 'new' variables diffcore-delta: rename 'new' variables ...
2018-02-27merge-recursive: fix overwriting dirty files involved in renamesElijah Newren
This fixes an issue that existed before my directory rename detection patches that affects both normal renames and renames implied by directory rename detection. Additional codepaths that only affect overwriting of dirty files that are involved in directory rename detection will be added in a subsequent commit. Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-27Merge branch 'nd/fix-untracked-cache-invalidation'Junio C Hamano
Some bugs around "untracked cache" feature have been fixed. * nd/fix-untracked-cache-invalidation: dir.c: ignore paths containing .git when invalidating untracked cache dir.c: stop ignoring opendir() error in open_cached_dir() dir.c: fix missing dir invalidation in untracked code dir.c: avoid stat() in valid_cached_dir() status: add a failing test showing a core.untrackedCache bug
2018-02-22unpack-trees: rename 'new' variablesBrandon Williams
Rename C++ keyword in order to bring the codebase closer to being able to be compiled with a C++ compiler. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-13Merge branch 'jh/partial-clone'Junio C Hamano
The machinery to clone & fetch, which in turn involves packing and unpacking objects, have been told how to omit certain objects using the filtering mechanism introduced by the jh/object-filtering topic, and also mark the resulting pack as a promisor pack to tolerate missing objects, taking advantage of the mechanism introduced by the jh/fsck-promisors topic. * jh/partial-clone: t5616: test bulk prefetch after partial fetch fetch: inherit filter-spec from partial clone t5616: end-to-end tests for partial clone fetch-pack: restore save_commit_buffer after use unpack-trees: batch fetching of missing blobs clone: partial clone partial-clone: define partial clone settings in config fetch: support filters fetch: refactor calculation of remote list fetch-pack: test support excluding large blobs fetch-pack: add --no-filter fetch-pack, index-pack, transport: partial clone upload-pack: add object filtering for partial clone
2018-02-07dir.c: ignore paths containing .git when invalidating untracked cacheNguyễn Thái Ngọc Duy
read_directory() code ignores all paths named ".git" even if it's not a valid git repository. See treat_path() for details. Since ".git" is basically invisible to read_directory(), when we are asked to invalidate a path that contains ".git", we can safely ignore it because the slow path would not consider it anyway. This helps when fsmonitor is used and we have a real ".git" repo at worktree top. Occasionally .git/index will be updated and if the fsmonitor hook does not filter it, untracked cache is asked to invalidate the path ".git/index". Without this patch, we invalidate the root directory unncessarily, which: - makes read_directory() fall back to slow path for root directory (slower) - makes the index dirty (because UNTR extension is updated). Depending on the index size, writing it down could also be slow. A note about the new "safe_path" knob. Since this new check could be relatively expensive, avoid it when we know it's not needed. If the path comes from the index, it can't contain ".git". If it does contain, we may be screwed up at many more levels, not just this one. Noticed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-05unpack-trees: oneway_merge to update submodulesStefan Beller
When there is a one way merge, each submodule needs to be one way merged as well, if we're asked to recurse into submodules. In case of a submodule, check if it is up-to-date, otherwise set the flag CE_UPDATE, which will trigger an update of it in the phase updating the tree later. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-08unpack-trees: batch fetching of missing blobsJonathan Tan
When running checkout, first prefetch all blobs that are to be updated but are missing. This means that only one pack is downloaded during such operations, instead of one per missing blob. This operates only on the blob level - if a repository has a missing tree, they are still fetched one at a time. This does not use the delayed checkout mechanism introduced in commit 2841e8f ("convert: add "status=delayed" to filter process protocol", 2017-06-30) due to significant conceptual differences - in particular, for partial clones, we already know what needs to be fetched based on the contents of the local repo alone, whereas for status=delayed, it is the filter process that tells us what needs to be checked in the end. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-21Merge branch 'bp/fsmonitor'Junio C Hamano
We learned to talk to watchman to speed up "git status" and other operations that need to see which paths have been modified. * bp/fsmonitor: fsmonitor: preserve utf8 filenames in fsmonitor-watchman log fsmonitor: read entirety of watchman output fsmonitor: MINGW support for watchman integration fsmonitor: add a performance test fsmonitor: add a sample integration script for Watchman fsmonitor: add test cases for fsmonitor extension split-index: disable the fsmonitor extension when running the split index test fsmonitor: add a test tool to dump the index extension update-index: add fsmonitor support to update-index ls-files: Add support in ls-files to display the fsmonitor valid bit fsmonitor: add documentation for the fsmonitor extension. fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files. update-index: add a new --force-write-index option preload-index: add override to enable testing preload-index bswap: add 64 bit endianness helper get_be64
2017-10-16refs: convert resolve_gitlink_ref to struct object_idbrian m. carlson
Convert the declaration and definition of resolve_gitlink_ref to use struct object_id and apply the following semantic patch: @@ expression E1, E2, E3; @@ - resolve_gitlink_ref(E1, E2, E3.hash) + resolve_gitlink_ref(E1, E2, &E3) @@ expression E1, E2, E3; @@ - resolve_gitlink_ref(E1, E2, E3->hash) + resolve_gitlink_ref(E1, E2, E3) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16Convert remaining callers of resolve_gitlink_ref to object_idbrian m. carlson
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-01fsmonitor: teach git to optionally utilize a file system monitor to speed up ↵Ben Peart
detecting new or changed files. When the index is read from disk, the fsmonitor index extension is used to flag the last known potentially dirty index entries. The registered core.fsmonitor command is called with the time the index was last updated and returns the list of files changed since that time. This list is used to flag any additional dirty cache entries and untracked cache directories. We can then use this valid state to speed up preload_index(), ie_match_stat(), and refresh_cache_ent() as they do not need to lstat() files to detect potential changes for those entries marked CE_FSMONITOR_VALID. In addition, if the untracked cache is turned on valid_cached_dir() can skip checking directories for new or changed files as fsmonitor will invalidate the cache only for those directories that have been identified as having potential changes. To keep the CE_FSMONITOR_VALID state accurate during git operations; when git updates a cache entry to match the current state on disk, it will now set the CE_FSMONITOR_VALID bit. Inversely, anytime git changes a cache entry, the CE_FSMONITOR_VALID bit is cleared and the corresponding untracked cache directory is marked invalid. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-10Merge branch 'ls/convert-filter-progress'Junio C Hamano
The codepath to call external process filter for smudge/clean operation learned to show the progress meter. * ls/convert-filter-progress: convert: display progress for filtered objects that have been delayed
2017-09-10Merge branch 'ma/up-to-date'Junio C Hamano
Message and doc updates. * ma/up-to-date: treewide: correct several "up-to-date" to "up to date" Documentation/user-manual: update outdated example output
2017-08-27Merge branch 'bw/submodule-config-cleanup'Junio C Hamano
Code clean-up to avoid mixing values read from the .gitmodules file and values read from the .git/config file. * bw/submodule-config-cleanup: submodule: remove gitmodules_config unpack-trees: improve loading of .gitmodules submodule-config: lazy-load a repository's .gitmodules file submodule-config: move submodule-config functions to submodule-config.c submodule-config: remove support for overlaying repository config diff: stop allowing diff to have submodules configured in .git/config submodule: remove submodule_config callback routine unpack-trees: don't respect submodule.update submodule: don't rely on overlayed config when setting diffopts fetch: don't overlay config with submodule-config submodule--helper: don't overlay config in update-clone submodule--helper: don't overlay config in remote_submodule_branch add, reset: ensure submodules can be added or reset submodule: don't use submodule_from_name t7411: check configuration parsing errors
2017-08-24convert: display progress for filtered objects that have been delayedLars Schneider
In 2841e8f ("convert: add "status=delayed" to filter process protocol", 2017-06-30) we taught the filter process protocol to delayed responses. These responses are processed after the "Checking out files" phase. If the processing takes noticeable time, then the user might think Git is stuck. Display the progress of the delayed responses to let the user know that Git is still processing objects. This works very well for objects that can be filtered quickly. If filtering of an individual object takes noticeable time, then the user might still think that Git is stuck. However, in that case the user would at least know what Git is doing. It would be technical more correct to display "Checking out files whose content filtering has been delayed". For brevity we only print "Filtering content". The finish_delayed_checkout() call was moved below the stop_progress() call in unpack-trees.c to ensure that the "Checking out files" progress is properly stopped before the "Filtering content" progress starts in finish_delayed_checkout(). Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Suggested-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-24Merge branch 'jc/simplify-progress'Junio C Hamano
The API to start showing progress meter after a short delay has been simplified. * jc/simplify-progress: progress: simplify "delayed" progress API
2017-08-24Merge branch 'rs/object-id'Junio C Hamano
Conversion from uchar[20] to struct object_id continues. * rs/object-id: tree-walk: convert fill_tree_descriptor() to object_id
2017-08-23treewide: correct several "up-to-date" to "up to date"Martin Ågren
Follow the Oxford style, which says to use "up-to-date" before the noun, but "up to date" after it. Don't change plumbing (specifically send-pack.c, but transport.c (git push) also has the same string). This was produced by grepping for "up-to-date" and "up to date". It turned out we only had to edit in one direction, removing the hyphens. Fix a typo in Documentation/git-diff-index.txt while we're there. Reported-by: Jeffrey Manian <jeffrey.manian@gmail.com> Reported-by: STEVEN WHITE <stevencharleswhitevoices@gmail.com> Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-22Merge branch 'bw/grep-recurse-submodules'Junio C Hamano
"git grep --recurse-submodules" has been reworked to give a more consistent output across submodule boundary (and do its thing without having to fork a separate process). * bw/grep-recurse-submodules: grep: recurse in-process using 'struct repository' submodule: merge repo_read_gitmodules and gitmodules_config submodule: check for unmerged .gitmodules outside of config parsing submodule: check for unstaged .gitmodules outside of config parsing submodule: remove fetch.recursesubmodules from submodule-config parsing submodule: remove submodule.fetchjobs from submodule-config parsing config: add config_from_gitmodules cache.h: add GITMODULES_FILE macro repository: have the_repository use the_index repo_read_index: don't discard the index
2017-08-19progress: simplify "delayed" progress APIJunio C Hamano
We used to expose the full power of the delayed progress API to the callers, so that they can specify, not just the message to show and expected total amount of work that is used to compute the percentage of work performed so far, the percent-threshold parameter P and the delay-seconds parameter N. The progress meter starts to show at N seconds into the operation only if we have not yet completed P per-cent of the total work. Most callers used either (0%, 2s) or (50%, 1s) as (P, N), but there are oddballs that chose more random-looking values like 95%. For a smoother workload, (50%, 1s) would allow us to start showing the progress meter earlier than (0%, 2s), while keeping the chance of not showing progress meter for long running operation the same as the latter. For a task that would take 2s or more to complete, it is likely that less than half of it would complete within the first second, if the workload is smooth. But for a spiky workload whose earlier part is easier, such a setting is likely to fail to show the progress meter entirely and (0%, 2s) is more appropriate. But that is merely a theory. Realistically, it is of dubious value to ask each codepath to carefully consider smoothness of their workload and specify their own setting by passing two extra parameters. Let's simplify the API by dropping both parameters and have everybody use (0%, 2s). Oh, by the way, the percent-threshold parameter and the structure member were consistently misspelled, which also is now fixed ;-) Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-14tree-walk: convert fill_tree_descriptor() to object_idRené Scharfe
All callers of fill_tree_descriptor() have been converted to object_id already, so convert that function as well. As a nice side-effect we get rid of NULL checks in tree-diff.c, as fill_tree_descriptor() already does them for us. Helped-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Rene Scharfe <l.s.r@web.de> Reviewed-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-11Merge branch 'ls/filter-process-delayed'Junio C Hamano
The filter-process interface learned to allow a process with long latency give a "delayed" response. * ls/filter-process-delayed: convert: add "status=delayed" to filter process protocol convert: refactor capabilities negotiation convert: move multiple file filter error handling to separate function convert: put the flags field before the flag itself for consistent style t0021: write "OUT <size>" only on success t0021: make debug log file name configurable t0021: keep filter log files on comparison
2017-08-03unpack-trees: improve loading of .gitmodulesBrandon Williams
When recursing submodules 'check_updates()' needs to have strict control over the submodule-config subsystem to ensure that the gitmodules file has been read before checking cache entries which are marked for removal as well ensuring the proper gitmodules file is read before updating cache entries. Because of this let's not rely on callers of 'check_updates()' to read the gitmodules file before calling 'check_updates()' and handle the reading explicitly. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-03unpack-trees: don't respect submodule.updateBrandon Williams
The 'submodule.update' config was historically used and respected by the 'submodule update' command because update handled a variety of different ways it updated a submodule. As we begin teaching other commands about submodules it makes more sense for the different settings of 'submodule.update' to be handled by the individual commands themselves (checkout, rebase, merge, etc) so it shouldn't be respected by the native checkout command. Also remove the overlaying of the repository's config (via using 'submodule_config()') from the commands which use the unpack-trees logic (checkout, read-tree, reset). Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-02cache.h: add GITMODULES_FILE macroBrandon Williams
Add a macro to be used when specifying the '.gitmodules' file and convert any existing hard coded '.gitmodules' file strings to use the new macro. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30convert: add "status=delayed" to filter process protocolLars Schneider
Some `clean` / `smudge` filters may require a significant amount of time to process a single blob (e.g. the Git LFS smudge filter might perform network requests). During this process the Git checkout operation is blocked and Git needs to wait until the filter is done to continue with the checkout. Teach the filter process protocol, introduced in edcc8581 ("convert: add filter.<driver>.process option", 2016-10-16), to accept the status "delayed" as response to a filter request. Upon this response Git continues with the checkout operation. After the checkout operation Git calls "finish_delayed_checkout" which queries the filter for remaining blobs. If the filter is still working on the completion, then the filter is expected to block. If the filter has completed all remaining blobs then an empty response is expected. Git has a multiple code paths that checkout a blob. Support delayed checkouts only in `clone` (in unpack-trees.c) and `checkout` operations for now. The optimization is most effective in these code paths as all files of the tree are processed. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>