summaryrefslogtreecommitdiff
path: root/cache.h
AgeCommit message (Collapse)Author
2017-01-10Merge branch 'jc/git-open-cloexec'Junio C Hamano
The codeflow of setting NOATIME and CLOEXEC on file descriptors Git opens has been simplified. We may want to drop the tip one, but we'll see. * jc/git-open-cloexec: sha1_file: stop opening files with O_NOATIME git_open_cloexec(): use fcntl(2) w/ FD_CLOEXEC fallback git_open(): untangle possible NOATIME and CLOEXEC interactions
2016-11-16compression: unify pack.compression configuration parsingJunio C Hamano
There are three codepaths that use a variable whose name is pack_compression_level to affect how objects and deltas sent to a packfile is compressed. Unlike zlib_compression_level that controls the loose object compression, however, this variable was static to each of these codepaths. Two of them read the pack.compression configuration variable, using core.compression as the default, and one of them also allowed overriding it from the command line. The other codepath in bulk-checkin did not pay any attention to the configuration. Unify the configuration parsing to git_default_config(), where we implement the parsing of core.loosecompression and core.compression and make the former override the latter, by moving code to parse pack.compression and also allow core.compression to give default to this variable. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-11-03sha1_file: stop opening files with O_NOATIMEJunio C Hamano
When we open object files, we try to do so with O_NOATIME. This dates back to 144bde78e9 (Use O_NOATIME when opening the sha1 files., 2005-04-23), which is an optimization to avoid creating a bunch of dirty inodes when we're accessing many objects. But a few things have changed since then: 1. In June 2005, git learned about packfiles, which means we would do a lot fewer atime updates (rather than one per object access, we'd generally get one per packfile). 2. In late 2006, Linux learned about "relatime", which is generally the default on modern installs. So performance around atimes updates is a non-issue there these days. All the world isn't Linux, but as it turns out, Linux is the only platform to implement O_NOATIME in the first place. So it's very unlikely that this code is helping anybody these days. Helped-by: Jeff King <peff@peff.net> [jc: took idea and log message from peff] Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-31Merge branch 'ls/git-open-cloexec'Junio C Hamano
Git generally does not explicitly close file descriptors that were open in the parent process when spawning a child process, but most of the time the child does not want to access them. As Windows does not allow removing or renaming a file that has a file descriptor open, a slow-to-exit child can even break the parent process by holding onto them. Use O_CLOEXEC flag to open files in various codepaths. * ls/git-open-cloexec: read-cache: make sure file handles are not inherited by child processes sha1_file: open window into packfiles with O_CLOEXEC sha1_file: rename git_open_noatime() to git_open()
2016-10-28git_open(): untangle possible NOATIME and CLOEXEC interactionsJunio C Hamano
The way we structured the fallback/retry mechanism for opening with O_NOATIME and O_CLOEXEC meant that if we failed due to lack of support to open the file with O_NOATIME option (i.e. EINVAL), we would still try to drop O_CLOEXEC first and retry, and then drop O_NOATIME. A platform on which O_NOATIME is defined in the header without support from the kernel wouldn't have a chance to open with O_CLOEXEC option due to this code structure. Arguably, O_CLOEXEC is more important than O_NOATIME, as the latter is mostly about performance, while the former can affect correctness. Instead use O_CLOEXEC to open the file, and then use fcntl(2) to set O_NOATIME on the resulting file descriptor. open(2) itself does not cause atime to be updated according to Linus [*1*]. The helper to do the former can be usable in the codepath in ce_compare_data() that was recently added to open a file descriptor with O_CLOEXEC; use it while we are at it. *1* <CA+55aFw83E+zOd+z5h-CA-3NhrLjVr-anL6pubrSWttYx3zu8g@mail.gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-27Merge branch 'jk/no-looking-at-dotgit-outside-repo'Junio C Hamano
Update "git diff --no-index" codepath not to try to peek into .git/ directory that happens to be under the current directory, when we know we are operating outside any repository. * jk/no-looking-at-dotgit-outside-repo: diff: handle sha1 abbreviations outside of repository diff_aligned_abbrev: use "struct oid" diff_unique_abbrev: rename to diff_aligned_abbrev find_unique_abbrev: use 4-buffer ring test-*-cache-tree: setup git dir read info/{attributes,exclude} only when in repository
2016-10-27Merge branch 'jk/abbrev-auto'Junio C Hamano
Updates the way approximate count of total objects is computed while attempting to come up with a unique abbreviated object name, which in turn needs to estimate how many hexdigits are necessary to ensure uniqueness. * jk/abbrev-auto: find_unique_abbrev: move logic out of get_short_sha1()
2016-10-27Merge branch 'lt/abbrev-auto'Junio C Hamano
Allow the default abbreviation length, which has historically been 7, to scale as the repository grows. The logic suggests to use 12 hexdigits for the Linux kernel, and 9 to 10 for Git itself. * lt/abbrev-auto: abbrev: auto size the default abbreviation abbrev: prepare for new world order abbrev: add FALLBACK_DEFAULT_ABBREV to prepare for auto sizing
2016-10-26find_unique_abbrev: use 4-buffer ringJeff King
Some code paths want to format multiple abbreviated sha1s in the same output line. Because we use a single static buffer for our return value, they have to either break their output into several calls or allocate their own arrays and use find_unique_abbrev_r(). Intead, let's mimic sha1_to_hex() and use a ring of several buffers, so that the return value stays valid through multiple calls. This shortens some of the callers, and makes it harder to for them to make a silly mistake. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-26Merge branch 'jk/fetch-quick-tag-following'Junio C Hamano
When fetching from a remote that has many tags that are irrelevant to branches we are following, we used to waste way too many cycles when checking if the object pointed at by a tag (that we are not going to fetch!) exists in our repository too carefully. * jk/fetch-quick-tag-following: fetch: use "quick" has_sha1_file for tag following
2016-10-26Merge branch 'bw/ls-files-recurse-submodules'Junio C Hamano
"git ls-files" learned "--recurse-submodules" option that can be used to get a listing of tracked files across submodules (i.e. this only works with "--cached" option, not for listing untracked or ignored files). This would be a useful tool to sit on the upstream side of a pipe that is read with xargs to work on all working tree files from the top-level superproject. * bw/ls-files-recurse-submodules: ls-files: add pathspec matching for submodules ls-files: pass through safe options for --recurse-submodules ls-files: optionally recurse into submodules git: make super-prefix option
2016-10-25sha1_file: rename git_open_noatime() to git_open()Lars Schneider
This function is meant to be used when reading from files in the object store, and the original objective was to avoid smudging atime of loose object files too often, hence its name. Because we'll be extending its role in the next commit to also arrange the file descriptors they return auto-closed in the child processes, rename it to lose "noatime" part that is too specific. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-17Merge branch 'jk/alt-odb-cleanup'Junio C Hamano
Codepaths involved in interacting alternate object store have been cleaned up. * jk/alt-odb-cleanup: alternates: use fspathcmp to detect duplicates sha1_file: always allow relative paths to alternates count-objects: report alternates via verbose mode fill_sha1_file: write into a strbuf alternates: store scratch buffer as strbuf fill_sha1_file: write "boring" characters alternates: use a separate scratch space alternates: encapsulate alt->base munging alternates: provide helper for allocating alternate alternates: provide helper for adding to alternates list link_alt_odb_entry: refactor string handling link_alt_odb_entry: handle normalize_path errors t5613: clarify "too deep" recursion tests t5613: do not chdir in main process t5613: whitespace/style cleanups t5613: use test_must_fail t5613: drop test_valid_repo function t5613: drop reachable_via function
2016-10-17Merge branch 'jk/quarantine-received-objects'Junio C Hamano
In order for the receiving end of "git push" to inspect the received history and decide to reject the push, the objects sent from the sending end need to be made available to the hook and the mechanism for the connectivity check, and this was done traditionally by storing the objects in the receiving repository and letting "git gc" to expire it. Instead, store the newly received objects in a temporary area, and make them available by reusing the alternate object store mechanism to them only while we decide if we accept the check, and once we decide, either migrate them to the repository or purge them immediately. * jk/quarantine-received-objects: tmp-objdir: do not migrate files starting with '.' tmp-objdir: put quarantine information in the environment receive-pack: quarantine objects until pre-receive accepts tmp-objdir: introduce API for temporary object directories check_connected: accept an env argument
2016-10-14fetch: use "quick" has_sha1_file for tag followingJeff King
When we auto-follow tags in a fetch, we look at all of the tags advertised by the remote and fetch ones where we don't already have the tag, but we do have the object it peels to. This involves a lot of calls to has_sha1_file(), some of which we can reasonably expect to fail. Since 45e8a74 (has_sha1_file: re-check pack directory before giving up, 2013-08-30), this may cause many calls to reprepare_packed_git(), which is potentially expensive. This has gone unnoticed for several years because it requires a fairly unique setup to matter: 1. You need to have a lot of packs on the client side to make reprepare_packed_git() expensive (the most expensive part is finding duplicates in an unsorted list, which is currently quadratic). 2. You need a large number of tag refs on the server side that are candidates for auto-following (i.e., that the client doesn't have). Each one triggers a re-read of the pack directory. 3. Under normal circumstances, the client would auto-follow those tags and after one large fetch, (2) would no longer be true. But if those tags point to history which is disconnected from what the client otherwise fetches, then it will never auto-follow, and those candidates will impact it on every fetch. So when all three are true, each fetch pays an extra O(nr_tags * nr_packs^2) cost, mostly in string comparisons on the pack names. This was exacerbated by 47bf4b0 (prepare_packed_git_one: refactor duplicate-pack check, 2014-06-30) which uses a slightly more expensive string check, under the assumption that the duplicate check doesn't happen very often (and it shouldn't; the real problem here is how often we are calling reprepare_packed_git()). This patch teaches fetch to use HAS_SHA1_QUICK to sacrifice accuracy for speed, in cases where we might be racy with a simultaneous repack. This is similar to the fix in 0eeb077 (index-pack: avoid excessive re-reading of pack directory, 2015-06-09). As with that case, it's OK for has_sha1_file() occasionally say "no I don't have it" when we do, because the worst case is not a corruption, but simply that we may fail to auto-follow a tag that points to it. Here are results from the included perf script, which sets up a situation similar to the one described above: Test HEAD^ HEAD ---------------------------------------------------------- 5550.4: fetch 11.21(10.42+0.78) 0.08(0.04+0.02) -99.3% Reported-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10Merge branch 'jk/pack-objects-optim-mru'Junio C Hamano
"git pack-objects" in a repository with many packfiles used to spend a lot of time looking for/at objects in them; the accesses to the packfiles are now optimized by checking the most-recently-used packfile first. * jk/pack-objects-optim-mru: pack-objects: use mru list when iterating over packs pack-objects: break delta cycles before delta-search phase sha1_file: make packed_object_info public provide an initializer for "struct object_info"
2016-10-10tmp-objdir: put quarantine information in the environmentJeff King
The presence of the GIT_QUARANTINE_PATH variable lets any called programs know that they're operating in a temporary object directory (and where that directory is). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10alternates: store scratch buffer as strbufJeff King
We pre-size the scratch buffer to hold a loose object filename of the form "xx/yyyy...", which leads to allocation code that is hard to verify. We have to use some magic numbers during the initial allocation, and then writers must blindly assume that the buffer is big enough. Using a strbuf makes it more clear that we cannot overflow. Unfortunately, we do still need some magic numbers to grow our strbuf before calling fill_sha1_path(), but the strbuf growth is much closer to the point of use. This makes it easier to see that it's correct, and opens the possibility of pushing it even further down if fill_sha1_path() learns to work on strbufs. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10alternates: use a separate scratch spaceJeff King
The alternate_object_database struct uses a single buffer both for storing the path to the alternate, and as a scratch buffer for forming object names. This is efficient (since otherwise we'd end up storing the path twice), but it makes life hard for callers who just want to know the path to the alternate. They have to remember to stop reading after "alt->name - alt->base" bytes, and to subtract one for the trailing '/'. It would be much simpler if they could simply access a NUL-terminated path string. We could encapsulate this in a function which puts a NUL in the scratch buffer and returns the string, but that opens up questions about the lifetime of the result. The first time another caller uses the alternate, the scratch buffer may get other data tacked onto it. Let's instead just store the root path separately from the scratch buffer. There aren't enough alternates being stored for the duplicated data to matter for performance, and this keeps things simple and safe for the callers. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10alternates: provide helper for allocating alternateJeff King
Allocating a struct alternate_object_database is tricky, as we must over-allocate the buffer to provide scratch space, and then put in particular '/' and NUL markers. Let's encapsulate this in a function so that the complexity doesn't leak into callers (and so that we can modify it later). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10alternates: provide helper for adding to alternates listJeff King
The submodule code wants to temporarily add an alternate object store to our in-memory alt_odb list, but does it manually. Let's provide a helper so it can reuse the code in link_alt_odb_entry(). While we're adding our new add_to_alternates_memory(), let's document add_to_alternates_file(), as the two are related. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10git: make super-prefix optionBrandon Williams
Add a super-prefix environment variable 'GIT_INTERNAL_SUPER_PREFIX' which can be used to specify a path from above a repository down to its root. When such a super-prefix is specified, the paths reported by Git are prefixed with it to make them relative to that directory "above". The paths given by the user on the command line (e.g. "git subcmd --output-file=path/to/a/file" and pathspecs) are taken relative to the directory "above" to match. The immediate use of this option is by commands which have a --recurse-submodule option in order to give context to submodules about how they were invoked. This option is currently only allowed for builtins which support a super-prefix. Signed-off-by: Brandon Williams <bmwill@google.com> Reviewed-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-06Merge branch 'jk/ambiguous-short-object-names'Junio C Hamano
When given an abbreviated object name that is not (or more realistically, "no longer") unique, we gave a fatal error "ambiguous argument". This error is now accompanied by hints that lists the objects that begins with the given prefix. During the course of development of this new feature, numerous minor bugs were uncovered and corrected, the most notable one of which is that we gave "short SHA1 xxxx is ambiguous." twice without good reason. * jk/ambiguous-short-object-names: get_short_sha1: make default disambiguation configurable get_short_sha1: list ambiguous objects on error for_each_abbrev: drop duplicate objects sha1_array: let callbacks interrupt iteration get_short_sha1: mark ambiguity error for translation get_short_sha1: NUL-terminate hex prefix get_short_sha1: refactor init of disambiguation code get_short_sha1: parse tags when looking for treeish get_sha1: propagate flags to child functions get_sha1: avoid repeating ourselves via ONLY_TO_DIE get_sha1: detect buggy calls with multiple disambiguators
2016-10-04find_unique_abbrev: move logic out of get_short_sha1()Jeff King
The get_short_sha1() is only about reading short sha1s; we do call it in a loop to check "is this long enough" for each object, but otherwise it should not need to know about things like our default_abbrev setting. So instead of asking it to set default_automatic_abbrev as a side-effect, let's just have find_unique_abbrev() pick the right place to start its loop. This requires a separate approximate_object_count() function, but that naturally belongs with the rest of sha1_file.c. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-03Merge branch 'nd/init-core-worktree-in-multi-worktree-world'Junio C Hamano
"git init" tried to record core.worktree in the repository's 'config' file when GIT_WORK_TREE environment variable was set and it was different from where GIT_DIR appears as ".git" at its top, but the logic was faulty when .git is a "gitdir:" file that points at the real place, causing trouble in working trees that are managed by "git worktree". This has been corrected. * nd/init-core-worktree-in-multi-worktree-world: init: kill git_link variable init: do not set unnecessary core.worktree init: kill set_git_dir_init() init: call set_git_dir_init() from within init_db() init: correct re-initialization from a linked worktree
2016-10-03abbrev: auto size the default abbreviationLinus Torvalds
In fairly early days we somehow decided to abbreviate object names down to 7-hexdigits, but as projects grow, it is becoming more and more likely to see such a short object names made in earlier days and recorded in the log messages no longer unique. Currently the Linux kernel project needs 11 to 12 hexdigits, while Git itself needs 10 hexdigits to uniquely identify the objects they have, while many smaller projects may still be fine with the original 7-hexdigit default. One-size does not fit all projects. Introduce a mechanism, where we estimate the number of objects in the repository upon the first request to abbreviate an object name with the default setting and come up with a sane default for the repository. Based on the expectation that we would see collision in a repository with 2^(2N) objects when using object names shortened to first N bits, use sufficient number of hexdigits to cover the number of objects in the repository. Each hexdigit (4-bits) we add to the shortened name allows us to have four times (2-bits) as many objects in the repository. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-03abbrev: add FALLBACK_DEFAULT_ABBREV to prepare for auto sizingJunio C Hamano
We'll be introducing a new way to decide the default abbreviation length by initialising DEFAULT_ABBREV to -1 to signal the first call to "find unique abbreviation" codepath to compute a reasonable value based on the number of objects we have to avoid collisions. We have long relied on DEFAULT_ABBREV being a positive concrete value that is used as the abbreviation length when no extra configuration or command line option has overridden it. Some codepaths wants to use such a positive concrete default value even before making their first request to actually trigger the computation for the auto sized default. Introduce FALLBACK_DEFAULT_ABBREV and use it to the code that attempts to align the report from "git fetch". For now, this macro is also used to initialize the default_abbrev variable, but the auto-sizing code will use -1 and then use the value of FALLBACK_DEFAULT_ABBREV as the starting point of auto-sizing. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-27get_short_sha1: make default disambiguation configurableJeff King
When we find ambiguous short sha1s, we may get a disambiguation rule from our caller's context. But if we don't, we fall back to treating all sha1s the same, even though most projects will tend to refer only to commits by their short sha1s. This patch introduces a configuration option that lets the user pick a different fallback (e.g., only commits). It's possible that we may want to make this the default, but it's a good idea to start as a config option for two reasons: 1. It lets people experiment with this and see if it's a good idea (i.e., the "tend to" above is an assumption; we don't really know if this will break some obscure cases). 2. Even if we do flip the default, it gives people an escape hatch if it causes problems (you can sometimes override it by asking for "1234^{tree}", but not all combinations are possible). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-26Merge branch 'rs/checkout-init-macro'Junio C Hamano
Code cleanup. * rs/checkout-init-macro: introduce CHECKOUT_INIT
2016-09-26Merge branch 'tg/add-chmod+x-fix'Junio C Hamano
"git add --chmod=+x <pathspec>" added recently only toggled the executable bit for paths that are either new or modified. This has been corrected to flip the executable bit for all paths that match the given pathspec. * tg/add-chmod+x-fix: t3700-add: do not check working tree file mode without POSIXPERM t3700-add: create subdirectory gently add: modify already added files when --chmod is given read-cache: introduce chmod_index_entry update-index: add test for chmod flags
2016-09-26get_sha1: detect buggy calls with multiple disambiguatorsJeff King
The get_sha1() family of functions takes a flags field, but some of the flags are mutually exclusive. In particular, we can only handle one disambiguating function, and the flags quietly override each other. Let's instead detect these as programming bugs. Technically some of the flags are supersets of the others, so treating COMMITTISH|TREEISH as just COMMITTISH is not wrong, but it's a good sign the caller is confused. And certainly asking for BLOB|TREE does not work. We can do the check easily with some bit-twiddling, and as a bonus, the bit-mask of disambiguators will come in handy in a future patch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-25init: call set_git_dir_init() from within init_db()Nguyễn Thái Ngọc Duy
The next commit requires that set_git_dir_init() must be called before init_db(). Let's make sure nobody can do otherwise. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-22introduce CHECKOUT_INITRené Scharfe
Add a static initializer for struct checkout and use it throughout the code base. It's shorter, avoids a memset(3) call and makes sure the base_dir member is initialized to a valid (empty) string. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-21Merge branch 'jk/setup-sequence-update'Junio C Hamano
There were numerous corner cases in which the configuration files are read and used or not read at all depending on the directory a Git command was run, leading to inconsistent behaviour. The code to set-up repository access at the beginning of a Git process has been updated to fix them. * jk/setup-sequence-update: t1007: factor out repeated setup init: reset cached config when entering new repo init: expand comments explaining config trickery config: only read .git/config from configured repos test-config: setup git directory t1302: use "git -C" pager: handle early config pager: use callbacks instead of configset pager: make pager_program a file-local static pager: stop loading git_default_config() pager: remove obsolete comment diff: always try to set up the repository diff: handle --no-index prefixes consistently diff: skip implicit no-index check when given --no-index patch-id: use RUN_SETUP_GENTLY hash-object: always try to set up the git repository
2016-09-19Merge branch 'bc/object-id'Junio C Hamano
The "unsigned char sha1[20]" to "struct object_id" conversion continues. Notable changes in this round includes that ce->sha1, i.e. the object name recorded in the cache_entry, turns into an object_id. It had merge conflicts with a few topics in flight (Christian's "apply.c split", Dscho's "cat-file --filters" and Jeff Hostetler's "status --porcelain-v2"). Extra sets of eyes double-checking for mismerges are highly appreciated. * bc/object-id: builtin/reset: convert to use struct object_id builtin/commit-tree: convert to struct object_id builtin/am: convert to struct object_id refs: add an update_ref_oid function. sha1_name: convert get_sha1_mb to struct object_id builtin/update-index: convert file to struct object_id notes: convert init_notes to use struct object_id builtin/rm: convert to use struct object_id builtin/blame: convert file to use struct object_id Convert read_mmblob to take struct object_id. notes-merge: convert struct notes_merge_pair to struct object_id builtin/checkout: convert some static functions to struct object_id streaming: make stream_blob_to_fd take struct object_id builtin: convert textconv_object to use struct object_id builtin/cat-file: convert some static functions to struct object_id builtin/cat-file: convert struct expand_data to use struct object_id builtin/log: convert some static functions to use struct object_id builtin/blame: convert struct origin to use struct object_id builtin/apply: convert static functions to struct object_id cache: convert struct cache_entry to use struct object_id
2016-09-15add: modify already added files when --chmod is givenThomas Gummerer
When the chmod option was added to git add, it was hooked up to the diff machinery, meaning that it only works when the version in the index differs from the version on disk. As the option was supposed to mirror the chmod option in update-index, which always changes the mode in the index, regardless of the status of the file, make sure the option behaves the same way in git add. Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-15read-cache: introduce chmod_index_entryThomas Gummerer
As there are chmod options for both add and update-index, introduce a new chmod_index_entry function to do the work. Use it in update-index, while it will be used in add in the next patch. Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-13init: reset cached config when entering new repoJeff King
After we copy the templates into place, we re-read the config in case we copied in a default config file. But since git_config() is backed by a cache these days, it's possible that the call will not actually touch the filesystem at all; we need to tell it that something has changed behind the scenes. Note that we also need to reset the shared_repository config. At first glance, it seems like this should probably just be folded into git_config_clear(). But unfortunately that is not quite right. The shared repository value may come from config, _or_ it may have been set manually. So only the caller who knows whether or not they set it is the one who can clear it (and indeed, if you _do_ put it into git_config_clear(), then many tests fail, as we have to clear the config cache any time we set a new config variable). There are three tests here. The first two actually pass already, though it's largely luck: they just don't happen to actually read any config before we enter the new repo. But the third one does fail without this patch; we look at core.sharedrepository while creating the directory, but need to make sure the value from the template config overrides it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-13config: only read .git/config from configured reposJeff King
When git_config() runs, it looks in the system, user-wide, and repo-level config files. It gets the latter by calling git_pathdup(), which in turn calls get_git_dir(). If we haven't set up the git repository yet, this may simply return ".git", and we will look at ".git/config". This seems like it would be helpful (presumably we haven't set up the repository yet, so it tries to find it), but it turns out to be a bad idea for a few reasons: - it's not sufficient, and therefore hides bugs in a confusing way. Config will be respected if commands are run from the top-level of the working tree, but not from a subdirectory. - it's not always true that we haven't set up the repository _yet_; we may not want to do it at all. For instance, if you run "git init /some/path" from inside another repository, it should not load config from the existing repository. - there might be a path ".git/config", but it is not the actual repository we would find via setup_git_directory(). This may happen, e.g., if you are storing a git repository inside another git repository, but have munged one of the files in such a way that the inner repository is not valid (e.g., by removing HEAD). We have at least two bugs of the second type in git-init, introduced by ae5f677 (lazily load core.sharedrepository, 2016-03-11). It causes init to use git_configset(), which loads all of the config, including values from the current repo (if any). This shows up in two ways: 1. If we happen to be in an existing repository directory, we'll read and respect core.sharedrepository from it, even though it should have no bearing on the new repository. A new test in t1301 covers this. 2. Similarly, if we're in an existing repo that sets core.logallrefupdates, that will cause init to fail to set it in a newly created repository (because it thinks that the user's templates already did so). A new test in t0001 covers this. We also need to adjust an existing test in t1302, which gives another example of why this patch is an improvement. That test creates an embedded repository with a bogus core.repositoryformatversion of "99". It wants to make sure that we actually stop at the bogus repo rather than continuing upward to find the outer repo. So it checks that "git config core.repositoryformatversion" returns 99. But that only works because we blindly read ".git/config", even though we _know_ we're in a repository whose vintage we do not understand. After this patch, we avoid reading config from the unknown vintage repository at all, which is a safer choice. But we need to tweak the test, since core.repositoryformatversion will not return 99; it will claim that it could not find the variable at all. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-13pager: make pager_program a file-local staticJeff King
This variable is only ever used by the routines in pager.c, and other parts of the code should always use those routines (like git_pager()) to make decisions about which pager to use. Let's reduce its scope to prevent accidents. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-12Merge branch 'rs/hex2chr'Junio C Hamano
* rs/hex2chr: introduce hex2chr() for converting two hexadecimal digits to a character
2016-09-12Merge branch 'jk/diff-submodule-diff-inline'Junio C Hamano
The "git diff --submodule={short,log}" mechanism has been enhanced to allow "--submodule=diff" to show the patch between the submodule commits bound to the superproject. * jk/diff-submodule-diff-inline: diff: teach diff to display submodule difference with an inline diff submodule: refactor show_submodule_summary with helper function submodule: convert show_submodule_summary to use struct object_id * allow do_submodule_path to work even if submodule isn't checked out diff: prepare for additional submodule formats graph: add support for --line-prefix on all graph-aware output diff.c: remove output_prefix_length field cache: add empty_tree_oid object and helper function
2016-09-09Merge branch 'sb/submodule-clone-rr'Junio C Hamano
"git clone --resurse-submodules --reference $path $URL" is a way to reduce network transfer cost by borrowing objects in an existing $path repository when cloning the superproject from $URL; it learned to also peek into $path for presense of corresponding repositories of submodules and borrow objects from there when able. * sb/submodule-clone-rr: clone: recursive and reference option triggers submodule alternates clone: implement optional references clone: clarify option_reference as required clone: factor out checking for an alternate path submodule--helper update-clone: allow multiple references submodule--helper module-clone: allow multiple references t7408: merge short tests, factor out testing method t7408: modernize style
2016-09-09Merge branch 'jk/reflog-date' into maintJunio C Hamano
The reflog output format is documented better, and a new format --date=unix to report the seconds-since-epoch (without timezone) has been added. * jk/reflog-date: date: clarify --date=raw description date: add "unix" format date: document and test "raw-local" mode doc/pretty-formats: explain shortening of %gd doc/pretty-formats: describe index/time formats for %gd doc/rev-list-options: explain "-g" output formats doc/rev-list-options: clarify "commit@{Nth}" for "-g" option
2016-09-09Merge branch 'jc/renormalize-merge-kill-safer-crlf' into maintJunio C Hamano
"git merge" with renormalization did not work well with merge-recursive, due to "safer crlf" conversion kicking in when it shouldn't. * jc/renormalize-merge-kill-safer-crlf: merge: avoid "safer crlf" during recording of merge results convert: unify the "auto" handling of CRLF
2016-09-07sha1_name: convert get_sha1_mb to struct object_idbrian m. carlson
All of the callers of this function use struct object_id, so rename it to get_oid_mb and make it take struct object_id instead of unsigned char *. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07cache: convert struct cache_entry to use struct object_idbrian m. carlson
Convert struct cache_entry to use struct object_id by applying the following semantic patch and the object_id transforms from contrib, plus the actual change to the struct: @@ struct cache_entry E1; @@ - E1.sha1 + E1.oid.hash @@ struct cache_entry *E1; @@ - E1->sha1 + E1->oid.hash Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07introduce hex2chr() for converting two hexadecimal digits to a characterRené Scharfe
Add and use a helper function that decodes the char value of two hexadecimal digits. It returns a negative number on error, avoids running over the end of the given string and doesn't shift negative values. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-01allow do_submodule_path to work even if submodule isn't checked outJacob Keller
Currently, do_submodule_path will attempt locating the .git directory by using read_gitfile on <path>/.git. If this fails it just assumes the <path>/.git is actually a git directory. This is good because it allows for handling submodules which were cloned in a regular manner first before being added to the superproject. Unfortunately this fails if the <path> is not actually checked out any longer, such as by removing the directory. Fix this by checking if the directory we found is actually a gitdir. In the case it is not, attempt to lookup the submodule configuration and find the name of where it is stored in the .git/modules/ directory of the superproject. If we can't locate the submodule configuration, this might occur because for example a submodule gitlink was added but the corresponding .gitmodules file was not properly updated. A die() here would not be pleasant to the users of submodule diff formats, so instead, modify do_submodule_path() to return an error code: - git_pathdup_submodule() returns NULL when we fail to find a path. - strbuf_git_path_submodule() propagates the error code to the caller. Modify the callers of these functions to check the error code and fail properly. This ensures we don't attempt to use a bad path that doesn't match the corresponding submodule. Because this change fixes add_submodule_odb() to work even if the submodule is not checked out, update the wording of the submodule log diff format to correctly display that the submodule is "not initialized" instead of "not checked out" Add tests to ensure this change works as expected. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-01cache: add empty_tree_oid object and helper functionJacob Keller
Similar to is_null_oid(), and is_empty_blob_sha1() add an empty_tree_oid along with helper function is_empty_tree_oid(). For completeness, also add an "is_empty_tree_sha1()", "is_empty_blob_sha1()", "is_empty_tree_oid()" and "is_empty_blob_oid()" helpers. To ensure we only get one singleton, implement EMPTY_BLOB_SHA1_BIN as simply getting the hash of empty_blob_oid structure. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>