summaryrefslogtreecommitdiff
path: root/builtin/repack.c
AgeCommit message (Collapse)Author
2019-11-10Merge branch 'wb/midx-progress'Junio C Hamano
The code to generate multi-pack index learned to show (or not to show) progress indicators. * wb/midx-progress: multi-pack-index: add [--[no-]progress] option. midx: honor the MIDX_PROGRESS flag in midx_repack midx: honor the MIDX_PROGRESS flag in verify_midx_file midx: add progress to expire_midx_packs midx: add progress to write_midx_file midx: add MIDX_PROGRESS flag
2019-10-23midx: add MIDX_PROGRESS flagWilliam Baker
Add the MIDX_PROGRESS flag and update the write|verify|expire|repack functions in midx.h to accept a flags parameter. The MIDX_PROGRESS flag indicates whether the caller of the function would like progress information to be displayed. This patch only changes the method prototypes and does not change the functionality. The functionality change will be handled by a later patch. Signed-off-by: William Baker <William.Baker@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-16fetch-pack: write fetched refs to .promisorJonathan Tan
The specification of promisor packfiles (in partial-clone.txt) states that the .promisor files that accompany packfiles do not matter (just like .keep files), so whenever a packfile is fetched from the promisor remote, Git has been writing empty .promisor files. But these files could contain more useful information. So instead of writing empty files, write the refs fetched to these files. This makes it easier to debug issues with partial clones, as we can identify what refs (and their associated hashes) were fetched at the time the packfile was downloaded, and if necessary, compare those hashes against what the promisor remote reports now. This is implemented by teaching fetch-pack to write its own non-empty .promisor file whenever it knows the name of the pack's lockfile. This covers the case wherein the user runs "git fetch" with an internal protocol or HTTP protocol v2 (fetch_refs_via_pack() in transport.c sets lock_pack) and with HTTP protocol v0/v1 (fetch_git() in remote-curl.c passes "--lock-pack" to "fetch-pack"). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Acked-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-11Merge branch 'bc/object-id-part17'Junio C Hamano
Preparation for SHA-256 upgrade continues. * bc/object-id-part17: (26 commits) midx: switch to using the_hash_algo builtin/show-index: replace sha1_to_hex rerere: replace sha1_to_hex builtin/receive-pack: replace sha1_to_hex builtin/index-pack: replace sha1_to_hex packfile: replace sha1_to_hex wt-status: convert struct wt_status to object_id cache: remove null_sha1 builtin/worktree: switch null_sha1 to null_oid builtin/repack: write object IDs of the proper length pack-write: use hash_to_hex when writing checksums sequencer: convert to use the_hash_algo bisect: switch to using the_hash_algo sha1-lookup: switch hard-coded constants to the_hash_algo config: use the_hash_algo in abbrev comparison combine-diff: replace GIT_SHA1_HEXSZ with the_hash_algo bundle: switch to use the_hash_algo connected: switch GIT_SHA1_HEXSZ to the_hash_algo show-index: switch hard-coded constants to the_hash_algo blame: remove needless comparison with GIT_SHA1_HEXSZ ...
2019-09-18Merge branch 'cc/multi-promisor'Junio C Hamano
Teach the lazy clone machinery that there can be more than one promisor remote and consult them in order when downloading missing objects on demand. * cc/multi-promisor: Move core_partial_clone_filter_default to promisor-remote.c Move repository_format_partial_clone to promisor-remote.c Remove fetch-object.{c,h} in favor of promisor-remote.{c,h} remote: add promisor and partial clone config to the doc partial-clone: add multiple remotes in the doc t0410: test fetching from many promisor remotes builtin/fetch: remove unique promisor remote limitation promisor-remote: parse remote.*.partialclonefilter Use promisor_remote_get_direct() and has_promisor_remote() promisor-remote: use repository_format_partial_clone promisor-remote: add promisor_remote_reinit() promisor-remote: implement promisor_remote_get_direct() Add initial support for many promisor remotes fetch-object: make functions return an error code t0410: remove pipes after git commands
2019-08-19builtin/repack: write object IDs of the proper lengthbrian m. carlson
Use the_hash_algo when calling xwrite with a hex object ID so that the proper amount of data is written. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-31repack: simplify handling of auto-bitmaps and .keep filesJeff King
Commit 7328482253 (repack: disable bitmaps-by-default if .keep files exist, 2019-06-29) taught repack to prefer disabling bitmaps to duplicating objects (unless bitmaps were asked for explicitly). But there's an easier way to do this: if we keep passing the --honor-pack-keep flag to pack-objects when auto-enabling bitmaps, then pack-objects already makes the same decision (it will disable bitmaps rather than duplicate). Better still, pack-objects can actually decide to do so based not just on the presence of a .keep file, but on whether that .keep file actually impacts the new pack we're making (so if we're racing with a push or fetch, for example, their temporary .keep file will not block us from generating bitmaps if they haven't yet updated their refs). And because repack uses the --write-bitmap-index-quiet flag, we don't have to worry about pack-objects generating confusing warnings when it does see a .keep file. We can confirm this by tweaking the .keep test to check repack's stderr. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-31repack: silence warnings when auto-enabled bitmaps cannot be builtJeff King
Depending on various config options, a full repack may not be able to build a reachability bitmap index (e.g., if pack.packSizeLimit forces us to write multiple packs). In these cases pack-objects may write a warning to stderr. Since 36eba0323d (repack: enable bitmaps by default on bare repos, 2019-03-14), we may generate these warnings even when the user did not explicitly ask for bitmaps. This has two downsides: - it can be confusing, if they don't know what bitmaps are - a daemonized auto-gc will write this to its log file, and the presence of the warning may suppress further auto-gc (until gc.logExpiry has elapsed) Let's have repack communicate to pack-objects that the choice to turn on bitmaps was not made explicitly by the user, which in turn allows pack-objects to suppress these warnings. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-19Merge branch 'ew/repack-with-bitmaps-by-default'Junio C Hamano
Generation of pack bitmaps are now disabled when .keep files exist, as these are mutually exclusive features. * ew/repack-with-bitmaps-by-default: repack: disable bitmaps-by-default if .keep files exist
2019-07-19Merge branch 'ds/midx-expire-repack'Junio C Hamano
"git multi-pack-index" learned expire and repack subcommands. * ds/midx-expire-repack: t5319: use 'test-tool path-utils' instead of 'ls -l' t5319-multi-pack-index.sh: test batch size zero midx: add test that 'expire' respects .keep files multi-pack-index: test expire while adding packs midx: implement midx_repack() multi-pack-index: prepare 'repack' subcommand multi-pack-index: implement 'expire' subcommand midx: refactor permutation logic and pack sorting midx: simplify computation of pack name lengths multi-pack-index: prepare for 'expire' subcommand Docs: rearrange subcommands for multi-pack-index repack: refactor pack deletion for future use
2019-07-09Merge branch 'ds/close-object-store'Junio C Hamano
The commit-graph file is now part of the "files that the runtime may keep open file descriptors on, all of which would need to be closed when done with the object store", and the file descriptor to an existing commit-graph file now is closed before "gc" finalizes a new instance to replace it. * ds/close-object-store: packfile: rename close_all_packs to close_object_store packfile: close commit-graph in close_all_packs commit-graph: use raw_object_store when closing
2019-07-01repack: disable bitmaps-by-default if .keep files existEric Wong
Bitmaps aren't useful with multiple packs, and users with .keep files ended up with redundant packs when bitmaps got enabled by default in bare repos. So detect when .keep files exist and stop enabling bitmaps by default in that case. Wasteful (but otherwise harmless) race conditions with .keep files documented by Jeff King still apply and there's a chance we'd still end up with redundant data on the FS: https://public-inbox.org/git/20190623224244.GB1100@sigill.intra.peff.net/ v2: avoid subshell in test case, be multi-index aware Fixes: 36eba0323d3288a8 ("repack: enable bitmaps by default on bare repos") Signed-off-by: Eric Wong <e@80x24.org> Helped-by: Jeff King <peff@peff.net> Reported-by: Janos Farkas <chexum@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-25Use promisor_remote_get_direct() and has_promisor_remote()Christian Couder
Instead of using the repository_format_partial_clone global and fetch_objects() directly, let's use has_promisor_remote() and promisor_remote_get_direct(). This way all the configured promisor remotes will be taken into account, not only the one specified by extensions.partialClone. Also when cloning or fetching using a partial clone filter, remote.origin.promisor will be set to "true" instead of setting extensions.partialClone to "origin". This makes it possible to use many promisor remote just by fetching from them. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-12packfile: rename close_all_packs to close_object_storeDerrick Stolee
The close_all_packs() method is now responsible for more than just pack-files. It also closes the commit-graph and the multi-pack-index. Rename the function to be more descriptive of its larger role. The name also fits because the input parameter is a raw_object_store. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-11repack: refactor pack deletion for future useDerrick Stolee
The repack builtin deletes redundant pack-files and their associated .idx, .promisor, .bitmap, and .keep files. We will want to re-use this logic in the future for other types of repack, so pull the logic into 'unlink_pack_path()' in packfile.c. The 'ignore_keep' parameter is enabled for the use in repack, but will be important for a future caller. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-18repack: enable bitmaps by default on bare reposEric Wong
A typical use case for bare repos is for serving clones and fetches to clients. Enable bitmaps by default on bare repos to make it easier for admins to host git repos in a performant way. Signed-off-by: Eric Wong <e@80x24.org> Helped-by: Jeff King <peff@peff.net> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-18Merge branch 'js/gc-repack-close-before-remove'Junio C Hamano
"git gc" and "git repack" did not close the open packfiles that they found unneeded before removing them, which didn't work on a platform incapable of removing an open file. This has been corrected. * js/gc-repack-close-before-remove: gc/repack: release packs when needed
2019-01-11gc/repack: release packs when neededJohannes Schindelin
On Windows, files cannot be removed nor renamed if there are still handles held by a process. To remedy that, we introduced the close_all_packs() function. Earlier, we made sure that the packs are released just before `git gc` is spawned, in case that gc wants to remove no-longer needed packs. But this developer forgot that gc itself also needs to let go of packs, e.g. when consolidating all packs via the --aggressive option. Likewise, `git repack -d` wants to delete obsolete packs and therefore needs to close all pack handles, too. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-04Merge branch 'nd/i18n'Junio C Hamano
More _("i18n") markings. * nd/i18n: fsck: mark strings for translation fsck: reduce word legos to help i18n parse-options.c: mark more strings for translation parse-options.c: turn some die() to BUG() parse-options: replace opterror() with optname() repack: mark more strings for translation remote.c: mark messages for translation remote.c: turn some error() or die() to BUG() reflog: mark strings for translation read-cache.c: add missing colon separators read-cache.c: mark more strings for translation read-cache.c: turn die("internal error") to BUG() attr.c: mark more string for translation archive.c: mark more strings for translation alias.c: mark split_cmdline_strerror() strings for translation git.c: mark more strings for translation
2018-11-13Merge branch 'ds/test-multi-pack-index'Junio C Hamano
Tests for the recently introduced multi-pack index machinery. * ds/test-multi-pack-index: packfile: close multi-pack-index in close_all_packs multi-pack-index: define GIT_TEST_MULTI_PACK_INDEX midx: close multi-pack-index on repack midx: fix broken free() in close_midx()
2018-11-12repack: mark more strings for translationNguyễn Thái Ngọc Duy
Two strings are slightly updated to be consistent with the rest: die() starts with lowercase. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-06Merge branch 'js/shallow-and-fetch-prune'Junio C Hamano
"git repack" in a shallow clone did not correctly update the shallow points in the repository, leading to a repository that does not pass fsck. * js/shallow-and-fetch-prune: repack -ad: prune the list of shallow commits shallow: offer to prune only non-existing entries repack: point out a bug handling stale shallow info
2018-10-25repack -ad: prune the list of shallow commitsJohannes Schindelin
`git repack` can drop unreachable commits without further warning, making the corresponding entries in `.git/shallow` invalid, which causes serious problems when deepening the branches. One scenario where unreachable commits are dropped by `git repack` is when a `git fetch --prune` (or even a `git fetch` when a ref was force-pushed in the meantime) can make a commit unreachable that was reachable before. Therefore it is not safe to assume that a `git repack -adlf` will keep unreachable commits alone (under the assumption that they had not been packed in the first place, which is an assumption at least some of Git's code seems to make). This is particularly important to keep in mind when looking at the `.git/shallow` file: if any commits listed in that file become unreachable, it is not a problem, but if they go missing, it *is* a problem. One symptom of this problem is that a deepening fetch may now fail with fatal: error in object: unshallow <commit-hash> To avoid this problem, let's prune the shallow list in `git repack` when the `-d` option is passed, unless `-A` is passed, too (which would force the now-unreachable objects to be turned into loose objects instead of being deleted). Additionally, we also need to take `--keep-reachable` and `--unpack-unreachable=<date>` into account. Note: an alternative solution discussed during the review of this patch was to teach `git fetch` to simply ignore entries in .git/shallow if the corresponding commits do not exist locally. A quick test, however, revealed that the .git/shallow file is written during a shallow *clone*, in which case the commits do not exist, either, but the "shallow" line *does* need to be sent. Therefore, this approach would be a lot more finicky than the approach presented by the this patch. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-22multi-pack-index: define GIT_TEST_MULTI_PACK_INDEXDerrick Stolee
The multi-pack-index feature is tested in isolation by t5319-multi-pack-index.sh, but there are many more interesting scenarios in the test suite surrounding pack-file data shapes and interactions. Since the multi-pack-index is an optional data structure, it does not make sense to include it by default in those tests. Instead, add a new GIT_TEST_MULTI_PACK_INDEX environment variable that enables core.multiPackIndex and writes a multi-pack-index after each 'git repack' command. This adds extra test coverage when needed. There are a few spots in the test suite that need to react to this change: * t5319-multi-pack-index.sh: there is a test that checks that 'git repack' deletes the multi-pack-index. Disable the environment variable to ensure this still happens. * t5310-pack-bitmaps.sh: One test moves a pack-file from the object directory to an alternate. This breaks the multi-pack-index, so delete the multi-pack-index at this point, if it exists. * t9300-fast-import.sh: One test verifies the number of files in the .git/objects/pack directory is exactly 8. Exclude the multi-pack-index from this count so it is still 8 in all cases. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-22midx: close multi-pack-index on repackDerrick Stolee
When repacking, we may remove pack-files. This invalidates the multi-pack-index (if it exists). Previously, we removed the multi-pack-index file before removing any pack-file. In some cases, the repack command may load the multi-pack-index into memory. This may lead to later in-memory references to the non-existent pack- files. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-15builtin/repack: replace hard-coded constantsbrian m. carlson
Note that while the error messages here are not translated, the end user should never see them. We invoke git pack-objects shortly before both invocations, so we can be fairly certain that the data we're receiving is in fact valid. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-17Merge branch 'cc/delta-islands'Junio C Hamano
Lift code from GitHub to restrict delta computation so that an object that exists in one fork is not made into a delta against another object that does not appear in the same forked repository. * cc/delta-islands: pack-objects: move 'layer' into 'struct packing_data' pack-objects: move tree_depth into 'struct packing_data' t5320: tests for delta islands repack: add delta-islands support pack-objects: add delta-islands support pack-objects: refactor code into compute_layer_order() Add delta-islands.{c,h}
2018-08-20Sync 'ds/multi-pack-index' to v2.19.0-rc0Junio C Hamano
* ds/multi-pack-index: (23 commits) midx: clear midx on repack packfile: skip loading index if in multi-pack-index midx: prevent duplicate packfile loads midx: use midx in approximate_object_count midx: use existing midx when writing new one midx: use midx in abbreviation calculations midx: read objects from multi-pack-index config: create core.multiPackIndex setting midx: write object offsets midx: write object id fanout chunk midx: write object ids in a chunk midx: sort and deduplicate objects from packfiles midx: read pack names into array multi-pack-index: write pack names in chunk multi-pack-index: read packfile list packfile: generalize pack directory list t5319: expand test data multi-pack-index: load into memory midx: write header information to lockfile multi-pack-index: add 'write' verb ...
2018-08-20Merge branch 'jt/repack-promisor-packs'Junio C Hamano
After a partial clone, repeated fetches from promisor remote would have accumulated many packfiles marked with .promisor bit without getting them coalesced into fewer packfiles, hurting performance. "git repack" now learned to repack them. * jt/repack-promisor-packs: repack: repack promisor objects if -a or -A is set repack: refactor setup of pack-objects cmd
2018-08-16repack: add delta-islands supportJeff King
Implement simple support for --delta-islands option and repack.useDeltaIslands config variable in git repack. This allows users to setup delta islands in their config and get the benefit of less disk usage while cloning and fetching is still quite fast and not much more CPU intensive. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-09repack: repack promisor objects if -a or -A is setJonathan Tan
Currently, repack does not touch promisor packfiles at all, potentially causing the performance of repositories that have many such packfiles to drop. Therefore, repack all promisor objects if invoked with -a or -A. This is done by an additional invocation of pack-objects on all promisor objects individually given, which takes care of deduplication and allows the resulting packfiles to respect flags such as --max-pack-size. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-09repack: refactor setup of pack-objects cmdJonathan Tan
A subsequent patch will teach repack to run pack-objects with some same and some different arguments if repacking of promisor objects is required. Refactor the setup of the pack-objects cmd so that setting up the arguments common to both is done in a function. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20midx: clear midx on repackDerrick Stolee
If a 'git repack' command replaces existing packfiles, then we must clear the existing multi-pack-index before moving the packfiles it references. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-16repack: add --keep-pack optionNguyễn Thái Ngọc Duy
We allow to keep existing packs by having companion .keep files. This is helpful when a pack is permanently kept. In the next patch, git-gc just wants to keep a pack temporarily, for one pack-objects run. git-gc can use --keep-pack for this use case. A note about why the pack_keep field cannot be reused and pack_keep_in_core has to be added. This is about the case when --keep-pack is specified together with either --keep-unreachable or --unpack-unreachable, but --honor-pack-keep is NOT specified. In this case, we want to exclude objects from the packs specified on command line, not from ones with .keep files. If only one bit flag is used, we have to clear pack_keep on pack files with the .keep file. But we can't make any assumption about unreachable objects in .keep packs. If "pack_keep" field is false for .keep packs, we could potentially pull lots of unreachable objects into the new pack, or unpack them loose. The safer approach is ignore all packs with either .keep file or --keep-pack. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-08gc: do not repack promisor packfilesJonathan Tan
Teach gc to stop traversal at promisor objects, and to leave promisor packfiles alone. This has the effect of only repacking non-promisor packfiles, and preserves the distinction between promisor packfiles and non-promisor packfiles. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-24Merge branch 'bw/config-h'Junio C Hamano
Fix configuration codepath to pay proper attention to commondir that is used in multi-worktree situation, and isolate config API into its own header file. * bw/config-h: config: don't implicitly use gitdir or commondir config: respect commondir setup: teach discover_git_directory to respect the commondir config: don't include config.h by default config: remove git_config_iter config: create config.h
2017-06-15config: don't include config.h by defaultBrandon Williams
Stop including config.h by default in cache.h. Instead only include config.h in those files which require use of the config system. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-26repack: accept --threads=<n> and pass it down to pack-objectsJunio C Hamano
We already do so for --window=<n> and --depth=<n>; this will help when the user wants to force --threads=1 for reproducible testing without getting affected by racing multiple threads. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-29repack: die on incremental + write-bitmap-indexDavid Turner
The bitmap index only works for single packs, so requesting an incremental repack with bitmap indexes makes no sense. Signed-off-by: David Turner <dturner@twosigma.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-13Merge branch 'va/i18n-even-more'Junio C Hamano
More markings of messages for i18n, with updates to various tests to pass GETTEXT_POISON tests. One patch from the original submission dropped due to conflicts with jk/upload-pack-hook, which is still in flux. * va/i18n-even-more: (38 commits) t5541: become resilient to GETTEXT_POISON i18n: branch: mark comment when editing branch description for translation i18n: unmark die messages for translation i18n: submodule: escape shell variables inside eval_gettext i18n: submodule: join strings marked for translation i18n: init-db: join message pieces i18n: remote: allow translations to reorder message i18n: remote: mark URL fallback text for translation i18n: standardise messages i18n: sequencer: add period to error message i18n: merge: change command option help to lowercase i18n: merge: mark messages for translation i18n: notes: mark options for translation i18n: notes: mark strings for translation i18n: transport-helper.c: change N_() call to _() i18n: bisect: mark strings for translation t5523: use test_i18ngrep for negation t4153: fix negated test_i18ngrep call t9003: become resilient to GETTEXT_POISON tests: unpack-trees: update to use test_i18n* functions ...
2016-06-17i18n: standardise messagesVasco Almeida
Standardise messages in order to save translators some work. Nuances fixed in this commit: "failed to read %s" "read of %s failed" "detach the HEAD at named commit" "detach HEAD at named commit" "removing '%s' failed" "failed to remove '%s'" "index file corrupt" "corrupt index file" "failed to read %s" "read of %s failed" "detach the HEAD at named commit" "detach HEAD at named commit" Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-14repack: extend --keep-unreachable to loose objectsJeff King
If you use "repack -adk" currently, we will pack all objects that are already packed into the new pack, and then drop the old packs. However, loose unreachable objects will be left as-is. In theory these are meant to expire eventually with "git prune". But if you are using "repack -k", you probably want to keep things forever and therefore do not run "git prune" at all. Meaning those loose objects may build up over time and end up fooling any object-count heuristics (such as the one done by "gc --auto", though since git-gc does not support "repack -k", this really applies to whatever custom scripts people might have driving "repack -k"). With this patch, we instead stuff any loose unreachable objects into the pack along with the already-packed unreachable objects. This may seem wasteful, but it is really no more so than using "repack -k" in the first place. We are at a slight disadvantage, in that we have no useful ordering for the result, or names to hand to the delta code. However, this is again no worse than what "repack -k" is already doing for the packed objects. The packing of these objects doesn't matter much because they should not be accessed frequently (unless they actually _do_ become referenced, but then they would get moved to a different part of the packfile during the next repack). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-14repack: add --keep-unreachable optionJeff King
The usual way to do a full repack (and what is done by git-gc) is to run "repack -Ad --unpack-unreachable=<when>", which will loosen any unreachable objects newer than "<when>", and drop any older ones. This is a safer alternative to "repack -ad", because "<when>" becomes a grace period during which we will not drop any new objects that are about to be referenced. However, it isn't perfectly safe. It's always possible that a process is about to reference an old object. Even if that process were to take care to update the timestamp on the object, there is no atomicity with a simultaneously running "repack" process. So while unlikely, there is a small race wherein we may drop an object that is in the process of being referenced. If you do automated repacking on a large number of active repositories, you may hit it eventually, and the result is a corrupted repository. It would be nice to fix that race in the long run, but it's complicated. In the meantime, there is a much simpler strategy for automated repository maintenance: do not drop objects at all. We already have a "--keep-unreachable" option in pack-objects; we just need to plumb it through from git-repack. Note that this _isn't_ plumbed through from git-gc, so at this point it's strictly a tool for people doing their own advanced repository maintenance strategy. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-15strbuf: introduce strbuf_getline_{lf,nul}()Junio C Hamano
The strbuf_getline() interface allows a byte other than LF or NUL as the line terminator, but this is only because I wrote these codepaths anticipating that there might be a value other than NUL and LF that could be useful when I introduced line_termination long time ago. No useful caller that uses other value has emerged. By now, it is clear that the interface is overly broad without a good reason. Many codepaths have hardcoded preference to read either LF terminated or NUL terminated records from their input, and then call strbuf_getline() with LF or NUL as the third parameter. This step introduces two thin wrappers around strbuf_getline(), namely, strbuf_getline_lf() and strbuf_getline_nul(), and mechanically rewrites these call sites to call either one of them. The changes contained in this patch are: * introduction of these two functions in strbuf.[ch] * mechanical conversion of all callers to strbuf_getline() with either '\n' or '\0' as the third parameter to instead call the respective thin wrapper. After this step, output from "git grep 'strbuf_getline('" would become a lot smaller. An interim goal of this series is to make this an empty set, so that we can have strbuf_getline_crlf() take over the shorter name strbuf_getline(). Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-26Merge branch 'jk/repository-extension'Junio C Hamano
Prepare for Git on-disk repository representation to undergo backward incompatible changes by introducing a new repository format version "1", with an extension mechanism. * jk/repository-extension: introduce "preciousObjects" repository extension introduce "extensions" form of core.repositoryformatversion
2015-08-10prefer mkpathdup to mkpath in assignmentsJeff King
As with the previous commit to git_path, assigning the result of mkpath is suspicious, since it is not clear whether we will still depend on the value after it may have been overwritten by subsequent calls. This patch converts low-hanging fruit to use mkpathdup instead of mkpath (with the downside that we must remember to free the result). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-25introduce "preciousObjects" repository extensionJeff King
If this extension is used in a repository, then no operations should run which may drop objects from the object storage. This can be useful if you are sharing that storage with other repositories whose refs you cannot see. For instance, if you do: $ git clone -s parent child $ git -C parent config extensions.preciousObjects true $ git -C parent config core.repositoryformatversion 1 you now have additional safety when running git in the parent repository. Prunes and repacks will bail with an error, and `git gc` will skip those operations (it will continue to pack refs and do other non-object operations). Older versions of git, when run in the repository, will fail on every operation. Note that we do not set the preciousObjects extension by default when doing a "clone -s", as doing so breaks backwards compatibility. It is a decision the user should make explicitly. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-11Merge branch 'nd/multiple-work-trees'Junio C Hamano
A replacement for contrib/workdir/git-new-workdir that does not rely on symbolic links and make sharing of objects and refs safer by making the borrowee and borrowers aware of each other. * nd/multiple-work-trees: (41 commits) prune --worktrees: fix expire vs worktree existence condition t1501: fix test with split index t2026: fix broken &&-chain t2026 needs procondition SANITY git-checkout.txt: a note about multiple checkout support for submodules checkout: add --ignore-other-wortrees checkout: pass whole struct to parse_branchname_arg instead of individual flags git-common-dir: make "modules/" per-working-directory directory checkout: do not fail if target is an empty directory t2025: add a test to make sure grafts is working from a linked checkout checkout: don't require a work tree when checking out into a new one git_path(): keep "info/sparse-checkout" per work-tree count-objects: report unused files in $GIT_DIR/worktrees/... gc: support prune --worktrees gc: factor out gc.pruneexpire parsing code gc: style change -- no SP before closing parenthesis checkout: clean up half-prepared directories in --to mode checkout: reject if the branch is already checked out elsewhere prune: strategies for linked checkouts checkout: support checking out into a new working directory ...
2015-03-25Merge branch 'jk/prune-with-corrupt-refs'Junio C Hamano
"git prune" used to largely ignore broken refs when deciding which objects are still being used, which could spread an existing small damage and make it a larger one. * jk/prune-with-corrupt-refs: refs.c: drop curate_packed_refs repack: turn on "ref paranoia" when doing a destructive repack prune: turn on ref_paranoia flag refs: introduce a "ref paranoia" flag t5312: test object deletion code paths in a corrupted repository
2015-03-20repack: turn on "ref paranoia" when doing a destructive repackJeff King
If we are repacking with "-ad", we will drop any unreachable objects. Likewise, using "-Ad --unpack-unreachable=<time>" will drop any old, unreachable objects. In these cases, we want to make sure the reachability we compute with "--all" is complete. We can do this by passing GIT_REF_PARANOIA=1 in the environment to pack-objects. Note that "-Ad" is safe already, because it only loosens unreachable objects. It is up to "git prune" to avoid deleting them. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>