summaryrefslogtreecommitdiff
path: root/builtin
AgeCommit message (Collapse)Author
2019-10-11Merge branch 'bc/object-id-part17'Junio C Hamano
Preparation for SHA-256 upgrade continues. * bc/object-id-part17: (26 commits) midx: switch to using the_hash_algo builtin/show-index: replace sha1_to_hex rerere: replace sha1_to_hex builtin/receive-pack: replace sha1_to_hex builtin/index-pack: replace sha1_to_hex packfile: replace sha1_to_hex wt-status: convert struct wt_status to object_id cache: remove null_sha1 builtin/worktree: switch null_sha1 to null_oid builtin/repack: write object IDs of the proper length pack-write: use hash_to_hex when writing checksums sequencer: convert to use the_hash_algo bisect: switch to using the_hash_algo sha1-lookup: switch hard-coded constants to the_hash_algo config: use the_hash_algo in abbrev comparison combine-diff: replace GIT_SHA1_HEXSZ with the_hash_algo bundle: switch to use the_hash_algo connected: switch GIT_SHA1_HEXSZ to the_hash_algo show-index: switch hard-coded constants to the_hash_algo blame: remove needless comparison with GIT_SHA1_HEXSZ ...
2019-10-11Merge branch 'en/clean-nested-with-ignored'Junio C Hamano
"git clean" fixes. * en/clean-nested-with-ignored: dir: special case check for the possibility that pathspec is NULL clean: fix theoretical path corruption clean: rewrap overly long line clean: avoid removing untracked files in a nested git repository clean: disambiguate the definition of -d git-clean.txt: do not claim we will delete files with -n/--dry-run dir: add commentary explaining match_pathspec_item's return value dir: if our pathspec might match files under a dir, recurse into it dir: make the DO_MATCH_SUBMODULE code reusable for a non-submodule case dir: also check directories for matching pathspecs dir: fix off-by-one error in match_pathspec_item dir: fix typo in comment t7300: add testcases showing failure to clean specified pathspecs
2019-10-09Merge branch 'sg/name-rev-cutoff-underflow-fix'Junio C Hamano
Integer arithmetic fix. * sg/name-rev-cutoff-underflow-fix: name-rev: avoid cutoff timestamp underflow
2019-10-07Merge branch 'bw/submodule-helper-usage-fix'Junio C Hamano
Typofix. * bw/submodule-helper-usage-fix: builtin/submodule--helper: fix usage string for 'update-clone'
2019-10-07Merge branch 'gs/commit-graph-progress'Junio C Hamano
* gs/commit-graph-progress: commit-graph: add --[no-]progress to write and verify
2019-10-07Merge branch 'ms/fetch-follow-tag-optim'Junio C Hamano
The code used in following tags in "git fetch" has been optimized. * ms/fetch-follow-tag-optim: fetch: use oidset to keep the want OIDs for faster lookup
2019-10-07Merge branch 'jk/commit-graph-cleanup'Junio C Hamano
A pair of small fixups to "git commit-graph" have been applied. * jk/commit-graph-cleanup: commit-graph: turn off save_commit_buffer commit-graph: don't show progress percentages while expanding reachable commits
2019-10-07Merge branch 'jk/partial-clone-sparse-blob'Junio C Hamano
The name of the blob object that stores the filter specification for sparse cloning/fetching was interpreted in a wrong place in the code, causing Git to abort. * jk/partial-clone-sparse-blob: list-objects-filter: use empty string instead of NULL for sparse "base" list-objects-filter: give a more specific error sparse parsing error list-objects-filter: delay parsing of sparse oid t5616: test cloning/fetching with sparse:oid=<oid> filter
2019-10-07Merge branch 'tg/stash-refresh-index'Junio C Hamano
"git stash" learned to write refreshed index back to disk. * tg/stash-refresh-index: stash: make sure to write refreshed cache merge: use refresh_and_write_cache factor out refresh_and_write_cache function
2019-09-30Merge branch 'ds/commit-graph-on-fetch'Junio C Hamano
A configuration variable tells "git fetch" to write the commit graph after finishing. * ds/commit-graph-on-fetch: fetch: add fetch.writeCommitGraph config setting
2019-09-30Merge branch 'bw/rebase-autostash-keep-current-branch'Junio C Hamano
"git rebase --autostash <upstream> <branch>", when <branch> is different from the current branch, incorrectly moved the tip of the current branch, which has been corrected. * bw/rebase-autostash-keep-current-branch: builtin/rebase.c: Remove pointless message builtin/rebase.c: make sure the active branch isn't moved when autostashing
2019-09-30Merge branch 'ds/include-exclude'Junio C Hamano
The internal code originally invented for ".gitignore" processing got reshuffled and renamed to make it less tied to "excluding" and stress more that it is about "matching", as it has been reused for things like sparse checkout specification that want to check if a path is "included". * ds/include-exclude: unpack-trees: rename 'is_excluded_from_list()' treewide: rename 'exclude' methods to 'pattern' treewide: rename 'EXCL_FLAG_' to 'PATTERN_FLAG_' treewide: rename 'struct exclude_list' to 'struct pattern_list' treewide: rename 'struct exclude' to 'struct path_pattern'
2019-09-30Merge branch 'dl/rebase-i-keep-base'Junio C Hamano
"git rebase --keep-base <upstream>" tries to find the original base of the topic being rebased and rebase on top of that same base, which is useful when running the "git rebase -i" (and its limited variant "git rebase -x"). The command also has learned to fast-forward in more cases where it can instead of replaying to recreate identical commits. * dl/rebase-i-keep-base: rebase: teach rebase --keep-base rebase tests: test linear branch topology rebase: fast-forward --fork-point in more cases rebase: fast-forward --onto in more cases rebase: refactor can_fast_forward into goto tower t3432: test for --no-ff's interaction with fast-forward t3432: distinguish "noop-same" v.s. "work-same" in "same head" tests t3432: test rebase fast-forward behavior t3431: add rebase --fork-point tests
2019-09-30Merge branch 'jk/misc-uninitialized-fixes'Junio C Hamano
Various fixes to codepaths gcc 9 had trouble following dataflow. * jk/misc-uninitialized-fixes: pack-objects: drop packlist index_pos optimization test-read-cache: drop namelen variable diff-delta: set size out-parameter to 0 for NULL delta bulk-checkin: zero-initialize hashfile_checkpoint pack-objects: use object_id in packlist_alloc() git-am: handle missing "author" when parsing commit
2019-09-30Merge branch 'rs/get-tagged-oid'Junio C Hamano
Code cleanup. * rs/get-tagged-oid: use get_tagged_oid() tag: factor out get_tagged_oid()
2019-09-30Merge branch 'tg/push-all-in-mirror-forbidden'Junio C Hamano
Fix an earlier regression to "git push --all" which should have been forbidden when the target remote repository is set to be a mirror. * tg/push-all-in-mirror-forbidden: push: disallow --all and refspecs when remote.<name>.mirror is set
2019-09-30Merge branch 'nd/switch-and-restore'Junio C Hamano
Resurrect a performance hack. * nd/switch-and-restore: checkout: add simple check for 'git checkout -b'
2019-09-30Merge branch 'rs/strbuf-detach'Junio C Hamano
Straighten out the use of strbuf_detach() API function. * rs/strbuf-detach: grep: use return value of strbuf_detach() log-tree: always use return value of strbuf_detach()
2019-09-28builtin/submodule--helper: fix usage string for 'update-clone'Bert Wesarg
Signed-off-by: Bert Wesarg <bert.wesarg@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-28name-rev: avoid cutoff timestamp underflowSZEDER Gábor
When 'git name-rev' is invoked with commit-ish parameters, it tries to save some work, and doesn't visit commits older than the committer date of the oldest given commit minus a one day worth of slop. Since our 'timestamp_t' is an unsigned type, this leads to a timestamp underflow when the committer date of the oldest given commit is within a day of the UNIX epoch. As a result the cutoff timestamp ends up far-far in the future, and 'git name-rev' doesn't visit any commits, and names each given commit as 'undefined'. Check whether subtracting the slop from the oldest committer date would lead to an underflow, and use no cutoff in that case. We don't have a TIME_MIN constant, dddbad728c (timestamp_t: a new data type for timestamps, 2017-04-26) didn't add one, so do it now. Note that the type of the cutoff timestamp variable used to be signed before 5589e87fd8 (name-rev: change a "long" variable to timestamp_t, 2017-05-20). The behavior was still the same even back then, but the underflow didn't happen when substracting the slop from the oldest committer date, but when comparing the signed cutoff timestamp with unsigned committer dates in name_rev(). IOW, this underflow bug is as old as 'git name-rev' itself. Helped-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-20stash: make sure to write refreshed cacheThomas Gummerer
When converting stash into C, calls to 'git update-index --refresh' were replaced with the 'refresh_cache()' function. That is fine as long as the index is only needed in-core, and not re-read from disk. However in many cases we do actually need the refreshed index to be written to disk, for example 'merge_recursive_generic()' discards the in-core index before re-reading it from disk, and in the case of 'apply --quiet', the 'refresh_cache()' we currently have is pointless without writing the index to disk. Always write the index after refreshing it to ensure there are no regressions in this compared to the scripted stash. In the future we can consider avoiding the write where possible after making sure none of the subsequent calls actually need the refreshed cache, and it is not expected to be refreshed after stash exits or it is written somewhere else already. Reported-by: Jeff King <peff@peff.net> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-20merge: use refresh_and_write_cacheThomas Gummerer
Use the 'refresh_and_write_cache()' convenience function introduced in the last commit, instead of refreshing and writing the index manually in merge.c Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-20factor out refresh_and_write_cache functionThomas Gummerer
Getting the lock for the index, refreshing it and then writing it is a pattern that happens more than once throughout the codebase, and isn't trivial to get right. Factor out the refresh_and_write_cache function from builtin/am.c to read-cache.c, so it can be re-used in other places in a subsequent commit. Note that we return different error codes for failing to refresh the cache, and failing to write the index. The current caller only cares about failing to write the index. However for other callers we're going to convert in subsequent patches we will need this distinction. Helped-by: Martin Ågren <martin.agren@gmail.com> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-18commit-graph: add --[no-]progress to write and verifyGarima Singh
Add --[no-]progress to git commit-graph write and verify. The progress feature was introduced in 7b0f229 ("commit-graph write: add progress output", 2018-09-17) but the ability to opt-out was overlooked. Signed-off-by: Garima Singh <garima.singh@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-18Merge branch 'md/list-objects-filter-combo'Junio C Hamano
The list-objects-filter API (used to create a sparse/lazy clone) learned to take a combined filter specification. * md/list-objects-filter-combo: list-objects-filter-options: make parser void list-objects-filter-options: clean up use of ALLOC_GROW list-objects-filter-options: allow mult. --filter strbuf: give URL-encoding API a char predicate fn list-objects-filter-options: make filter_spec a string_list list-objects-filter-options: move error check up list-objects-filter: implement composite filters list-objects-filter-options: always supply *errbuf list-objects-filter: put omits set in filter struct list-objects-filter: encapsulate filter components
2019-09-18Merge branch 'cc/multi-promisor'Junio C Hamano
Teach the lazy clone machinery that there can be more than one promisor remote and consult them in order when downloading missing objects on demand. * cc/multi-promisor: Move core_partial_clone_filter_default to promisor-remote.c Move repository_format_partial_clone to promisor-remote.c Remove fetch-object.{c,h} in favor of promisor-remote.{c,h} remote: add promisor and partial clone config to the doc partial-clone: add multiple remotes in the doc t0410: test fetching from many promisor remotes builtin/fetch: remove unique promisor remote limitation promisor-remote: parse remote.*.partialclonefilter Use promisor_remote_get_direct() and has_promisor_remote() promisor-remote: use repository_format_partial_clone promisor-remote: add promisor_remote_reinit() promisor-remote: implement promisor_remote_get_direct() Add initial support for many promisor remotes fetch-object: make functions return an error code t0410: remove pipes after git commands
2019-09-18Merge branch 'js/pre-merge-commit-hook'Junio C Hamano
A new "pre-merge-commit" hook has been introduced. * js/pre-merge-commit-hook: merge: --no-verify to bypass pre-merge-commit hook git-merge: honor pre-merge-commit hook merge: do no-verify like commit t7503: verify proper hook execution
2019-09-18Merge branch 'jk/drop-release-pack-memory'Junio C Hamano
xmalloc() used to have a mechanism to ditch memory and address space resources as the last resort upon seeing an allocation failure from the underlying malloc(), which made the code complex and thread-unsafe with dubious benefit, as major memory resource users already do limit their uses with various other mechanisms. It has been simplified away. * jk/drop-release-pack-memory: packfile: drop release_pack_memory()
2019-09-18Merge branch 'js/rebase-r-strategy'Junio C Hamano
"git rebase --rebase-merges" learned to drive different merge strategies and pass strategy specific options to them. * js/rebase-r-strategy: t3427: accelerate this test by using fast-export and fast-import rebase -r: do not (re-)generate root commits with `--root` *and* `--onto` t3418: test `rebase -r` with merge strategies t/lib-rebase: prepare for testing `git rebase --rebase-merges` rebase -r: support merge strategies other than `recursive` t3427: fix another incorrect assumption t3427: accommodate for the `rebase --merge` backend having been replaced t3427: fix erroneous assumption t3427: condense the unnecessarily repetitive test cases into three t3427: move the `filter-branch` invocation into the `setup` case t3427: simplify the `setup` test case significantly t3427: add a clarifying comment rebase: fold git-rebase--common into the -p backend sequencer: the `am` and `rebase--interactive` scripts are gone .gitignore: there is no longer a built-in `git-rebase--interactive` t3400: stop referring to the scripted rebase Drop unused git-rebase--am.sh
2019-09-17clean: fix theoretical path corruptionElijah Newren
cmd_clean() had the following code structure: struct strbuf abs_path = STRBUF_INIT; for_each_string_list_item(item, &del_list) { strbuf_addstr(&abs_path, prefix); strbuf_addstr(&abs_path, item->string); PROCESS(&abs_path); strbuf_reset(&abs_path); } where I've elided a bunch of unnecessary details and PROCESS(&abs_path) represents a big chunk of code rather than an actual function call. One piece of PROCESS was: if (lstat(abs_path.buf, &st)) continue; which would cause the strbuf_reset() to be missed -- meaning that the next path to be handled would have two paths concatenated. This path used to use die_errno() instead of continue prior to commit 396049e5fb62 ("git-clean: refactor git-clean into two phases", 2013-06-25), but my understanding of how correct_untracked_entries() works is that it will prevent both dir/ and dir/file from being in the list to clean so this should be dead code and the die_errno() should be safe. But I hesitate to remove it since I am not certain. However, we can fix both this bug and possible similar future bugs by simply moving the strbuf_reset(&abs_path) to the beginning of the loop. It'll result in N calls to strbuf_reset() instead of N-1, but that's a small price to pay to avoid sneaky bugs like this. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-17clean: rewrap overly long lineElijah Newren
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-17clean: avoid removing untracked files in a nested git repositoryElijah Newren
Users expect files in a nested git repository to be left alone unless sufficiently forced (with two -f's). Unfortunately, in certain circumstances, git would delete both tracked (and possibly dirty) files and untracked files within a nested repository. To explain how this happens, let's contrast a couple cases. First, take the following example setup (which assumes we are already within a git repo): git init nested cd nested >tracked git add tracked git commit -m init >untracked cd .. In this setup, everything works as expected; running 'git clean -fd' will result in fill_directory() returning the following paths: nested/ nested/tracked nested/untracked and then correct_untracked_entries() would notice this can be compressed to nested/ and then since "nested/" is a directory, we would call remove_dirs("nested/", ...), which would check is_nonbare_repository_dir() and then decide to skip it. However, if someone also creates an ignored file: >nested/ignored then running 'git clean -fd' would result in fill_directory() returning the same paths: nested/ nested/tracked nested/untracked but correct_untracked_entries() will notice that we had ignored entries under nested/ and thus simplify this list to nested/tracked nested/untracked Since these are not directories, we do not call remove_dirs() which was the only place that had the is_nonbare_repository_dir() safety check -- resulting in us deleting both the untracked file and the tracked (and possibly dirty) file. One possible fix for this issue would be walking the parent directories of each path and checking if they represent nonbare repositories, but that would be wasteful. Even if we added caching of some sort, it's still a waste because we should have been able to check that "nested/" represented a nonbare repository before even descending into it in the first place. Add a DIR_SKIP_NESTED_GIT flag to dir_struct.flags and use it to prevent fill_directory() and friends from descending into nested git repos. With this change, we also modify two regression tests added in commit 91479b9c72f1 ("t7300: add tests to document behavior of clean and nested git", 2015-06-15). That commit, nor its series, nor the six previous iterations of that series on the mailing list discussed why those tests coded the expectation they did. In fact, it appears their purpose was simply to test _existing_ behavior to make sure that the performance changes didn't change the behavior. However, these two tests directly contradicted the manpage's claims that two -f's were required to delete files/directories under a nested git repository. While one could argue that the user gave an explicit path which matched files/directories that were within a nested repository, there's a slippery slope that becomes very difficult for users to understand once you go down that route (e.g. what if they specified "git clean -f -d '*.c'"?) It would also be hard to explain what the exact behavior was; avoid such problems by making it really simple. Also, clean up some grammar errors describing this functionality in the git-clean manpage. Finally, there are still a couple bugs with -ffd not cleaning out enough (e.g. missing the nested .git) and with -ffdX possibly cleaning out the wrong files (paying attention to outer .gitignore instead of inner). This patch does not address these cases at all (and does not change the behavior relative to those flags), it only fixes the handling when given a single -f. See https://public-inbox.org/git/20190905212043.GC32087@szeder.dev/ for more discussion of the -ffd[X?] bugs. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-17clean: disambiguate the definition of -dElijah Newren
The -d flag pre-dated git-clean's ability to have paths specified. As such, the default for git-clean was to only remove untracked files in the current directory, and -d existed to allow it to recurse into subdirectories. The interaction of paths and the -d option appears to not have been carefully considered, as evidenced by numerous bugs and a dearth of tests covering such pairings in the testsuite. The definition turns out to be important, so let's look at some of the various ways one could interpret the -d option: A) Without -d, only look in subdirectories which contain tracked files under them; with -d, also look in subdirectories which are untracked for files to clean. B) Without specified paths from the user for us to delete, we need to have some kind of default, so...without -d, only look in subdirectories which contain tracked files under them; with -d, also look in subdirectories which are untracked for files to clean. The important distinction here is that choice B says that the presence or absence of '-d' is irrelevant if paths are specified. The logic behind option B is that if a user explicitly asked us to clean a specified pathspec, then we should clean anything that matches that pathspec. Some examples may clarify. Should git clean -f untracked_dir/file remove untracked_dir/file or not? It seems crazy not to, but a strict reading of option A says it shouldn't be removed. How about git clean -f untracked_dir/file1 tracked_dir/file2 or git clean -f untracked_dir_1/file1 untracked_dir_2/file2 ? Should it remove either or both of these files? Should it require multiple runs to remove both the files listed? (If this sounds like a crazy question to even ask, see the commit message of "t7300: Add some testcases showing failure to clean specified pathspecs" added earlier in this patch series.) What if -ffd were used instead of -f -- should that allow these to be removed? Should it take multiple invocations with -ffd? What if a glob (such as '*tracked*') were used instead of spelling out the directory names? What if the filenames involved globs, such as git clean -f '*.o' or git clean -f '*/*.o' ? The current documentation actually suggests a definition that is slightly different than choice A, and the implementation prior to this series provided something radically different than either choices A or B. (The implementation, though, was clearly just buggy). There may be other choices as well. However, for almost any given choice of definition for -d that I can think of, some of the examples above will appear buggy to the user. The only case that doesn't have negative surprises is choice B: treat a user-specified path as a request to clean all untracked files which match that path specification, including recursing into any untracked directories. Change the documentation and basic implementation to use this definition. There were two regression tests that indirectly depended on the current implementation, but neither was about subdirectory handling. These two tests were introduced in commit 5b7570cfb41c ("git-clean: add tests for relative path", 2008-03-07) which was solely created to add coverage for the changes in commit fb328947c8e ("git-clean: correct printing relative path", 2008-03-07). Both tests specified a directory that happened to have an untracked subdirectory, but both were only checking that the resulting printout of a file that was removed was shown with a relative path. Update these tests appropriately. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-16fetch: use oidset to keep the want OIDs for faster lookupMasaya Suzuki
During git-fetch, the client checks if the advertised tags' OIDs are already in the fetch request's want OID set. This check is done in a linear scan. For a repository that has a lot of refs, repeating this scan takes 15+ minutes. In order to speed this up, create a oid_set for other refs' OIDs. Signed-off-by: Masaya Suzuki <masayasuzuki@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-16list-objects-filter: delay parsing of sparse oidJeff King
The list-objects-filter code has two steps to its initialization: 1. parse_list_objects_filter() makes sure the spec is a filter we know about and is syntactically correct. This step is done by "rev-list" or "upload-pack" that is going to apply a filter, but also by "git clone" or "git fetch" before they send the spec across the wire. 2. list_objects_filter__init() runs the type-specific initialization (using function pointers established in step 1). This happens at the start of traverse_commit_list_filtered(), when we're about to actually use the filter. It's a good idea to parse as much as we can in step 1, in order to catch problems early (e.g., a blob size limit that isn't a number). But one thing we _shouldn't_ do is resolve any oids at that step (e.g., for sparse-file contents specified by oid). In the case of a fetch, the oid has to be resolved on the remote side. The current code does resolve the oid during the parse phase, but ignores any error (which we must do, because we might just be sending the spec across the wire). This leads to two bugs: - if we're not in a repository (e.g., because it's git-clone parsing the spec), then we trigger a BUG() trying to resolve the name - if we did hit the error case, we still have to notice that later and bail. The code path in rev-list handles this, but the one in upload-pack does not, leading to a segfault. We can fix both by moving the oid resolution into the sparse-oid init function. At that point we know we have a repository (because we're about to traverse), and handling the error there fixes the segfault. As a bonus, we can drop the NULL sparse_oid_value check in rev-list, since this is now handled in the sparse-oid-filter init function. Signed-off-by: Jeff King <peff@peff.net> Acked-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-09Merge branch 'cb/fetch-set-upstream'Junio C Hamano
"git fetch" learned "--set-upstream" option to help those who first clone from their private fork they intend to push to, add the true upstream via "git remote add" and then "git fetch" from it. * cb/fetch-set-upstream: pull, fetch: add --set-upstream option
2019-09-09Merge branch 'en/checkout-mismerge-fix'Junio C Hamano
Fix a mismerge that happened in 2.22 timeframe. * en/checkout-mismerge-fix: checkout: remove duplicate code
2019-09-09Merge branch 'ds/feature-macros'Junio C Hamano
A mechanism to affect the default setting for a (related) group of configuration variables is introduced. * ds/feature-macros: repo-settings: create feature.experimental setting repo-settings: create feature.manyFiles setting repo-settings: parse core.untrackedCache commit-graph: turn on commit-graph by default t6501: use 'git gc' in quiet mode repo-settings: consolidate some config settings
2019-09-09commit-graph: turn off save_commit_bufferJeff King
The commit-graph tool may read a lot of commits, but it only cares about parsing their metadata (parents, trees, etc) and doesn't ever show the messages to the user. And so it should not need save_commit_buffer, which is meant for holding onto the object data of parsed commits so that we can show them later. In fact, it's quite harmful to do so. According to massif, the max heap of "git commit-graph write --reachable" in linux.git before/after this patch (removing the commit graph file in between) goes from ~1.1GB to ~270MB. Which isn't surprising, since the difference is about the sum of the uncompressed sizes of all commits in the repository, and this was equivalent to leaking them. This obviously helps if you're under memory pressure, but even without it, things go faster. My before/after times for that command (without massif) went from 12.521s to 11.874s, a speedup of ~5%. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-07builtin/rebase.c: Remove pointless messageBen Wijen
When doing 'git rebase --autostash <upstream> <master>' with a dirty worktree a 'HEAD is now at ...' message is emitted, which is pointless as it refers to the old active branch which isn't actually moved. This commit removes the 'HEAD is now at...' message. Signed-off-by: Ben Wijen <ben@wijen.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-07builtin/rebase.c: make sure the active branch isn't moved when autostashingBen Wijen
Consider the following scenario: git checkout not-the-master work work work git rebase --autostash upstream master Here 'rebase --autostash <upstream> <branch>' incorrectly moves the active branch (not-the-master) to master (before the rebase). The expected behavior: (58794775:/git-rebase.sh:526) AUTOSTASH=$(git stash create autostash) git reset --hard git checkout master git rebase upstream git stash apply $AUTOSTASH The actual behavior: (6defce2b:/builtin/rebase.c:1062) AUTOSTASH=$(git stash create autostash) git reset --hard master git checkout master git rebase upstream git stash apply $AUTOSTASH This commit reinstates the 'legacy script' behavior as introduced with 58794775: rebase: implement --[no-]autostash and rebase.autostash Signed-off-by: Ben Wijen <ben@wijen.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-06pack-objects: drop packlist index_pos optimizationJeff King
Once upon a time, the code to add an object to our packing list in pack-objects all lived in a single function. It computed the position within the hash table once, then used it to check if the object was already present, and if not, to add it. Later, in 2834bc27c1 (pack-objects: refactor the packing list, 2013-10-24), this was split into two functions: packlist_find() and packlist_alloc(). We ended up with an "index_pos" variable that gets passed through several functions to make it from one to the other. The resulting code is rather confusing to follow. The "index_pos" variable is sometimes undefined, if we don't yet have a hash table. This works out in practice because in that case packlist_alloc() won't use it at all, since it will have to create/grow the hash table. But it's hard to verify that, and it does cause gcc 9.2.1's -Wmaybe-uninitialized to complain when compiled with "-flto -O3" (rightfully, since we do pass the uninitialized value as a function parameter, even if nobody ends up using it). All of this is to save computing the hash index again when we're inserting into the hash table, which I found doesn't make a measurable difference in the program runtime (which is not surprising, since we're doing all kinds of other heavyweight things for each object). Let's just drop this index_pos variable entirely, simplifying the code (and pleasing the compiler). We might be better still refactoring this custom hash table to use one of our existing implementations (an oidmap, or a kh_oid_map). I stopped short of that here, but this would be the likely first step towards that anyway. Reported-by: Stephan Beyer <s-beyer@gmx.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-06pack-objects: use object_id in packlist_alloc()Jeff King
The only caller of packlist_alloc() already has a "struct object_id", and we immediately copy the hash they pass us into our own object_id. Let's avoid the unnecessary round-trip to a raw sha1 pointer. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-06git-am: handle missing "author" when parsing commitJeff King
We try to parse the "author" line out of a commit buffer. We handle the case that split_ident_line() doesn't work, but we don't do any error checking that we found an "author" line in the first place! This would cause us to segfault on such a corrupt object. Let's put in an explicit NULL check (we can just die(), which is what a bogus split would do, too). As a bonus, this silences a warning when compiling with gcc 9.2.1 using "-flto -O3", which claims that ident_len may be uninitialized (it would only be if we had a NULL here). Reported-by: Stephan Beyer <s-beyer@gmx.net> Helped-by: René Scharfe <l.s.r@web.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-05use get_tagged_oid()René Scharfe
Avoid derefencing ->tagged without checking for NULL by using the convenience wrapper for getting the ID of the tagged object. It die()s when encountering a broken tag instead of segfaulting. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-05treewide: rename 'exclude' methods to 'pattern'Derrick Stolee
The first consumer of pattern-matching filenames was the .gitignore feature. In that context, storing a list of patterns as a 'struct exclude_list' makes sense. However, the sparse-checkout feature then adopted these structures and methods, but with the opposite meaning: these patterns match the files that should be included! It would be clearer to rename this entire library as a "pattern matching" library, and the callers apply exclusion/inclusion logic accordingly based on their needs. This commit renames several methods defined in dir.h to make more sense with the renamed 'struct exclude_list' to 'struct pattern_list' and 'struct exclude' to 'struct path_pattern': * last_exclude_matching() -> last_matching_pattern() * parse_exclude() -> parse_path_pattern() In addition, the word 'exclude' was replaced with 'pattern' in the methods below: * add_exclude_list() * add_excludes_from_file_to_list() * add_excludes_from_file() * add_excludes_from_blob_to_list() * add_exclude() * clear_exclude_list() A few methods with the word "exclude" remain. These will be handled seperately. In particular, the method "is_excluded()" is concretely about the .gitignore file relative to a specific directory. This is the important boundary between library and consumer: is_excluded() cares about .gitignore, but is_excluded() calls last_matching_pattern() to make that decision. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-05treewide: rename 'EXCL_FLAG_' to 'PATTERN_FLAG_'Derrick Stolee
The first consumer of pattern-matching filenames was the .gitignore feature. In that context, storing a list of patterns as a 'struct exclude_list' makes sense. However, the sparse-checkout feature then adopted these structures and methods, but with the opposite meaning: these patterns match the files that should be included! It would be clearer to rename this entire library as a "pattern matching" library, and the callers apply exclusion/inclusion logic accordingly based on their needs. This commit replaces 'EXCL_FLAG_' to 'PATTERN_FLAG_' in the names of the flags used on 'struct path_pattern'. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-05treewide: rename 'struct exclude_list' to 'struct pattern_list'Derrick Stolee
The first consumer of pattern-matching filenames was the .gitignore feature. In that context, storing a list of patterns as a 'struct exclude_list' makes sense. However, the sparse-checkout feature then adopted these structures and methods, but with the opposite meaning: these patterns match the files that should be included! It would be clearer to rename this entire library as a "pattern matching" library, and the callers apply exclusion/inclusion logic accordingly based on their needs. This commit renames 'struct exclude_list' to 'struct pattern_list' and renames several variables called 'el' to 'pl'. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-05treewide: rename 'struct exclude' to 'struct path_pattern'Derrick Stolee
The first consumer of pattern-matching filenames was the .gitignore feature. In that context, storing a list of patterns as a list of 'struct exclude' items makes sense. However, the sparse-checkout feature then adopted these structures and methods, but with the opposite meaning: these patterns match the files that should be included! It would be clearer to rename this entire library as a "pattern matching" library, and the callers apply exclusion/inclusion logic accordingly based on their needs. This commit renames 'struct exclude' to 'struct path_pattern' and renames several variable names to match. 'struct pattern' was already taken by attr.c, and this more completely describes that the patterns are specific to file paths. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-03fetch: add fetch.writeCommitGraph config settingDerrick Stolee
The commit-graph feature is now on by default, and is being written during 'git gc' by default. Typically, Git only writes a commit-graph when a 'git gc --auto' command passes the gc.auto setting to actualy do work. This means that a commit-graph will typically fall behind the commits that are being used every day. To stay updated with the latest commits, add a step to 'git fetch' to write a commit-graph after fetching new objects. The fetch.writeCommitGraph config setting enables writing a split commit-graph, so on average the cost of writing this file is very small. Occasionally, the commit-graph chain will collapse to a single level, and this could be slow for very large repos. For additional use, adjust the default to be true when feature.experimental is enabled. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>