path: root/fsmonitor.c
Age | Commit message | Author
2020-01-13 | fsmonitor: handle version 2 of the hooks that will use opaque token | Kevin Willford
Some file monitors like watchman will use something other than a timestamp to keep better track of what changes happen in between calls to query the fsmonitor. The clockid in watchman is a string. Now that the index is storing an opaque token for the last update, the code needs to be updated to pass that opaque token to a version 2 of the fsmonitor hook. Because there are repos that already have version 1 of the hook and we want them to continue to work when git is updated, we need to handle both version 1 and version 2 of the hook. In order to do that, a config value, core.fsmonitorHookVersion, is being added to force what version of the hook should be used. When this is not set it will default to -1 and then the code will attempt to call version 2 of the hook first. If that fails it will fall back to trying version 1. Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
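The version selection described above can be sketched roughly as follows. This is a minimal illustration, not the code from fsmonitor.c: `hook_runner`, `query_with_fallback`, and the stub `hook_supporting_only_v1` are hypothetical names, and the callback stands in for actually spawning the hook.

```c
#include <stddef.h>

/* Hypothetical stand-in for running the hook; returns 0 on success. */
typedef int (*hook_runner)(int version, const char *token);

/*
 * configured_version mirrors the documented values of
 * core.fsmonitorHookVersion: 1 or 2 force that version, -1 means unset.
 * Returns the hook version that succeeded, or 0 if none did.
 */
static int query_with_fallback(int configured_version, const char *token,
                               hook_runner run)
{
    if (configured_version == 1 || configured_version == 2)
        return run(configured_version, token) == 0 ? configured_version : 0;

    /* Unset: try version 2 first, then fall back to version 1. */
    if (run(2, token) == 0)
        return 2;
    if (run(1, token) == 0)
        return 1;
    return 0;
}

/* Illustrative stub: a repository that only has a version 1 hook. */
static int hook_supporting_only_v1(int version, const char *token)
{
    (void)token;
    return version == 1 ? 0 : -1;
}
```

With the stub above, an unset configuration ends up on version 1, while forcing version 2 reports failure instead of silently downgrading.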
2020-01-13 | fsmonitor: change last update timestamp on the index_state to opaque token | Kevin Willford
Some file system monitors might not use or take a timestamp for processing and in the case of watchman could have race conditions with using a timestamp. Watchman uses something called a clockid that is used for race free queries to it. The clockid for watchman is simply a string. Change the fsmonitor_last_update from being a uint64_t to a char pointer so that any arbitrary data can be stored in it and passed back to the fsmonitor. Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
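As a rough sketch of what moving from a uint64_t to a char pointer implies for ownership, the token now has to be copied and freed rather than assigned. The struct and function names below are hypothetical and are not the fields of git's index_state.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for the monitor state; not git's index_state. */
struct monitor_state {
    char *last_update; /* opaque token owned by this struct, or NULL */
};

/*
 * Replace the stored token with a private copy of `token`.
 * Passing NULL clears the token. Returns 0 on success, -1 on
 * allocation failure (in which case the old token is kept).
 */
static int set_last_update(struct monitor_state *state, const char *token)
{
    char *copy = NULL;

    if (token) {
        size_t size = strlen(token) + 1;
        copy = malloc(size);
        if (!copy)
            return -1;
        memcpy(copy, token, size);
    }
    free(state->last_update);
    state->last_update = copy;
    return 0;
}
```

Because the token is just a string, a decimal timestamp and a watchman-style clock string can both be stored without the caller interpreting either.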
2019-11-21 | unpack-trees: skip stat on fsmonitor-valid files | Utsav Shah
The index might be aware that a file hasn't been modified via fsmonitor, but unpack-trees did not pay attention to it and checked via ie_match_stat which can be inefficient on certain filesystems. This significantly slows down commands that run oneway_merge, like checkout and reset --hard. This patch makes oneway_merge check whether a file is considered unchanged through fsmonitor and skips ie_match_stat on it. unpack-trees also now correctly copies over fsmonitor validity state from the source index. Finally, for correctness, we force a refresh of fsmonitor state in tweak_fsmonitor. After this change, commands like stash (that use reset --hard internally) go from 8s or more to ~2s on a 250k file repository on a mac. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Kevin Willford <Kevin.Willford@microsoft.com> Signed-off-by: Utsav Shah <utsav@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-13 | fsmonitor: do not compare bitmap size with size of split index | Junio C Hamano
3444ec2e ("fsmonitor: don't fill bitmap with entries to be removed", 2019-10-11) added a handful of sanity checks that make sure that a bit position in fsmonitor bitmap does not go beyond the end of the index. As each bit in the bitmap corresponds to a path in the index, this is the right check most of the time. Except for the case when we are in the split-index mode and looking at a delta index that is to be overlaid on the base index but before the base index has actually been merged in, namely in read_ and write_fsmonitor_extension(). In these codepaths, the entries in the split/delta index are typically a small subset of the entire set of paths (otherwise why would we be using split-index?), so the bitmap used by the fsmonitor is almost always larger than the number of entries in the partial index, and the incorrect comparison would trigger the BUG(). Reported-by: Utsav Shah <ukshah2@illinois.edu> Helped-by: Kevin Willford <Kevin.Willford@microsoft.com> Helped-by: William Baker <William.Baker@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-12 | fsmonitor: don't fill bitmap with entries to be removed | William Baker
While doing some testing with fsmonitor enabled I found that git commands would segfault after staging and unstaging an untracked file. Looking at the crash it appeared that fsmonitor_ewah_callback was attempting to adjust bits beyond the bounds of the index cache. Digging into how this could happen it became clear that the fsmonitor extension must have been written with more bits than there were entries in the index. The root cause ended up being that fill_fsmonitor_bitmap was populating fsmonitor_dirty with bits for all entries in the index, even those that had been marked for removal. To solve this problem fill_fsmonitor_bitmap has been updated to skip entries with the CE_REMOVE flag set. With this change the bits written for the fsmonitor extension will be consistent with the index entries written by do_write_index. Additionally, BUG checks have been added to detect if the number of bits in fsmonitor_dirty should ever exceed the number of entries in the index again. Another option that was considered was moving the call to fill_fsmonitor_bitmap closer to where the index is written (and where the fsmonitor extension itself is written). However, that did not work as the fsmonitor_dirty bitmap must be filled before the index is split during writing. Signed-off-by: William Baker <William.Baker@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
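The idea of skipping removed entries so that bit positions line up with the entries actually written can be sketched as below. This is a simplified illustration that uses a plain byte array instead of git's ewah bitmap; the flag values and the function name are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flag values; git's CE_REMOVE and CE_FSMONITOR_VALID differ. */
#define SKETCH_REMOVE 0x1u
#define SKETCH_VALID  0x2u

/*
 * Set one bit per dirty (not VALID) entry, numbering positions as if
 * entries marked for removal did not exist. `bitmap` must hold at least
 * (count + 7) / 8 bytes. Returns how many bit positions are in use,
 * i.e. the number of entries that would be written.
 */
static size_t fill_dirty_bitmap(const unsigned *flags, size_t count,
                                unsigned char *bitmap)
{
    size_t skipped = 0;

    memset(bitmap, 0, (count + 7) / 8);
    for (size_t i = 0; i < count; i++) {
        if (flags[i] & SKETCH_REMOVE) {
            skipped++; /* later entries shift down by one position */
            continue;
        }
        if (!(flags[i] & SKETCH_VALID)) {
            size_t pos = i - skipped;
            bitmap[pos / 8] |= (unsigned char)(1u << (pos % 8));
        }
    }
    return count - skipped;
}
```

Without the `skipped` adjustment, the highest bit could land past the number of entries that survive the write, which is the out-of-bounds situation the commit describes.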
2019-07-09 | Merge branch 'cb/fsmonitor-intfix' | Junio C Hamano
Variable type fix. * cb/fsmonitor-intfix: fsmonitor: avoid signed integer overflow / infinite loop
2019-06-18 | fsmonitor: avoid signed integer overflow / infinite loop | Carlo Marcelo Arenas Belón
883e248b8a ("fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.", 2017-09-22) uses an int in a loop that would wrap if index_state->cache_nr (unsigned) is bigger than INT_MAX Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-05-08 | fsmonitor: force a refresh after the index was discarded | Johannes Schindelin
With this change, the `index_state` struct becomes the new home for the flag that says whether the fsmonitor hook has been run, i.e. it is now per-index. It also gets re-set when the index is discarded, fixing the bug demonstrated by the "test_expect_failure" test added in the preceding commit. In that case fsmonitor-enabled Git would miss updates under certain circumstances, see that preceding commit for details. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-30 | Merge branch 'jk/snprintf-truncation' | Junio C Hamano
Avoid unchecked snprintf() to make future code auditing easier.
* jk/snprintf-truncation:
  fmt_with_err: add a comment that truncation is OK
  shorten_unambiguous_ref: use xsnprintf
  fsmonitor: use internal argv_array of struct child_process
  log_write_email_headers: use strbufs
  http: use strbufs instead of fixed buffers
2018-05-21 | fsmonitor: use internal argv_array of struct child_process | René Scharfe
Avoid magic array sizes and indexes by constructing the fsmonitor command line using the embedded argv_array of the child_process. The resulting code is shorter and easier to extend. Getting rid of the snprintf() calls is a bonus -- even though the buffers were big enough here to avoid truncation -- as it makes auditing the remaining callers easier. Inspired-by: Jeff King <peff@peff.net> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-25 | Merge branch 'bp/fsmonitor-prime-index' | Junio C Hamano
The index file is updated to record the fsmonitor section after a full scan was made, to avoid wasting the effort that has already been spent. * bp/fsmonitor-prime-index: fsmonitor: force index write after full scan
2018-04-25 | Merge branch 'bp/fsmonitor-bufsize-fix' | Junio C Hamano
Fix an unexploitable (because the oversized contents are not under attacker's control) buffer overflow. * bp/fsmonitor-bufsize-fix: fsmonitor: fix incorrect buffer size when printing version number
2018-04-11 | fsmonitor: force index write after full scan | Ben Peart
fsmonitor currently only flags the index as dirty if the extension is being added or removed. This is a performance optimization that recognizes you can stat() a lot of files in less time than it takes to write out an updated index. This patch makes a small enhancement and flags the index dirty if we end up having to stat() all files and scan the entire working directory. The assumption is that this must be expensive or you would not have turned on the feature. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-11 | fsmonitor: fix incorrect buffer size when printing version number | Ben Peart
This is a trivial bug fix for passing the incorrect size to snprintf() when outputting the version. It should be passing the size of the destination buffer rather than the size of the value being printed. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
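The class of mistake described here can be illustrated with a small sketch. The commit text does not show the affected code, so the 64-bit value, the buffer, and both function names below are assumptions chosen for the example.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Mistaken pattern: the size argument is the size of the value being
 * printed (8 bytes for a uint64_t), not the size of the destination.
 * Output is cut to 7 characters plus the terminating NUL.
 */
static int format_with_value_size(char *dst, uint64_t value)
{
    return snprintf(dst, sizeof(value), "%" PRIu64, value);
}

/* Intended pattern: pass the capacity of the destination buffer. */
static int format_with_buffer_size(char *dst, size_t dst_size, uint64_t value)
{
    return snprintf(dst, dst_size, "%" PRIu64, value);
}
```

In both cases snprintf() returns the length the full output would have had, so comparing the return value against the size that was passed is one way to detect truncation.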
2018-02-07 | dir.c: ignore paths containing .git when invalidating untracked cache | Nguyễn Thái Ngọc Duy
read_directory() code ignores all paths named ".git" even if it's not a valid git repository. See treat_path() for details. Since ".git" is basically invisible to read_directory(), when we are asked to invalidate a path that contains ".git", we can safely ignore it because the slow path would not consider it anyway. This helps when fsmonitor is used and we have a real ".git" repo at worktree top. Occasionally .git/index will be updated and if the fsmonitor hook does not filter it, untracked cache is asked to invalidate the path ".git/index". Without this patch, we invalidate the root directory unnecessarily, which:
- makes read_directory() fall back to slow path for root directory (slower)
- makes the index dirty (because UNTR extension is updated). Depending on the index size, writing it down could also be slow.
A note about the new "safe_path" knob. Since this new check could be relatively expensive, avoid it when we know it's not needed. If the path comes from the index, it can't contain ".git". If it does, we may be screwed up at many more levels, not just this one. Noticed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
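A check for "does this path contain a .git component" might look like the sketch below. It is a simplified, case-sensitive illustration with a hypothetical function name; the check git actually performs may differ, for example in how it treats case or platform-specific aliases.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if any '/'-separated component of `path` is exactly ".git",
 * 0 otherwise. Names that merely start with or contain ".git"
 * (".gitignore", "repo.git") do not count.
 */
static int has_dot_git_component(const char *path)
{
    const char *p = path;

    while (*p) {
        const char *slash = strchr(p, '/');
        size_t len = slash ? (size_t)(slash - p) : strlen(p);

        if (len == 4 && !memcmp(p, ".git", 4))
            return 1;
        if (!slash)
            break;
        p = slash + 1;
    }
    return 0;
}
```

A caller invalidating untracked-cache paths could return early when this reports 1, and skip the check entirely for paths it already trusts, which is the role the commit gives to the "safe_path" knob.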
2017-11-10 | fsmonitor: store fsmonitor bitmap before splitting index | Alex Vandiver
ba1b9cac ("fsmonitor: delay updating state until after split index is merged", 2017-10-27) resolved the problem of the fsmonitor data being applied to the non-base index when reading; however, a similar problem exists when writing the index. Specifically, writing of the fsmonitor extension happens only after the work to split the index has been applied -- as such, the information in the index is only for the non-"base" index, and thus the extension information contains only partial data. When saving, compute the ewah bitmap before the index is split, and store it in the fsmonitor_dirty field, mirroring the behavior that occurred during reading. fsmonitor_dirty is kept from being leaked by being freed when the extension data is written -- which always happens precisely once, no matter the split index configuration. Signed-off-by: Alex Vandiver <alexmv@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-01 | fsmonitor: delay updating state until after split index is merged | Alex Vandiver
If the fsmonitor extension is used in conjunction with the split index extension, the set of entries in the index when it is first loaded is only a subset of the real index. This leads to only the non-"base" index being marked as CE_FSMONITOR_VALID. Delay the expansion of the ewah bitmap until after tweak_split_index has been called to merge in the base index as well. The new fsmonitor_dirty is kept from being leaked by dint of being cleaned up in post_read_index_from, which is guaranteed to be called after do_read_index in read_index_from. Signed-off-by: Alex Vandiver <alexmv@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-30 | fsmonitor: set the PWD to the top of the working tree | Alex Vandiver
The fsmonitor command inherits the PWD of its caller, which may be anywhere in the working copy. This makes it difficult for the fsmonitor command to operate on the whole repository. Specifically, for the watchman integration, this causes each subdirectory to get its own watch entry. Set the CWD to the top of the working directory, for consistency. Signed-off-by: Alex Vandiver <alexmv@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-01 | fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files. | Ben Peart
When the index is read from disk, the fsmonitor index extension is used to flag the last known potentially dirty index entries. The registered core.fsmonitor command is called with the time the index was last updated and returns the list of files changed since that time. This list is used to flag any additional dirty cache entries and untracked cache directories. We can then use this valid state to speed up preload_index(), ie_match_stat(), and refresh_cache_ent() as they do not need to lstat() files to detect potential changes for those entries marked CE_FSMONITOR_VALID. In addition, if the untracked cache is turned on, valid_cached_dir() can skip checking directories for new or changed files as fsmonitor will invalidate the cache only for those directories that have been identified as having potential changes. To keep the CE_FSMONITOR_VALID state accurate during git operations, when git updates a cache entry to match the current state on disk, it will now set the CE_FSMONITOR_VALID bit. Conversely, anytime git changes a cache entry, the CE_FSMONITOR_VALID bit is cleared and the corresponding untracked cache directory is marked invalid. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
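The flow described above (the monitor reports changed paths, matching entries lose their valid bit, and only entries without the bit need an lstat()) can be sketched as follows. The struct, flag value, and function names are hypothetical, and the matching is exact-path only; how git matches reported paths, including directories, is not covered here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flag value standing in for CE_FSMONITOR_VALID. */
#define ENTRY_FSMONITOR_VALID 0x1u

/* Hypothetical, heavily reduced cache entry. */
struct sketch_entry {
    const char *path;
    unsigned flags;
};

/* Clear the valid bit on every entry whose path the monitor reported. */
static void mark_changed(struct sketch_entry *entries, size_t count,
                         const char *const *changed, size_t changed_count)
{
    for (size_t i = 0; i < count; i++)
        for (size_t j = 0; j < changed_count; j++)
            if (!strcmp(entries[i].path, changed[j]))
                entries[i].flags &= ~ENTRY_FSMONITOR_VALID;
}

/* Count the entries that would still need an lstat() during refresh. */
static size_t count_needing_stat(const struct sketch_entry *entries,
                                 size_t count)
{
    size_t needed = 0;

    for (size_t i = 0; i < count; i++)
        if (!(entries[i].flags & ENTRY_FSMONITOR_VALID))
            needed++;
    return needed;
}
```

The saving comes from the second function: in a large working tree where the monitor reports only a few paths, nearly every entry keeps its valid bit and is never stat'ed.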