summaryrefslogtreecommitdiff
path: root/sha1_file.c
AgeCommit message (Collapse)Author
2015-10-20Merge branch 'jk/war-on-sprintf'Junio C Hamano
Many allocations that is manually counted (correctly) that are followed by strcpy/sprintf have been replaced with a less error prone constructs such as xstrfmt. Macintosh-specific breakage was noticed and corrected in this reroll. * jk/war-on-sprintf: (70 commits) name-rev: use strip_suffix to avoid magic numbers use strbuf_complete to conditionally append slash fsck: use for_each_loose_file_in_objdir Makefile: drop D_INO_IN_DIRENT build knob fsck: drop inode-sorting code convert strncpy to memcpy notes: document length of fanout path with a constant color: add color_set helper for copying raw colors prefer memcpy to strcpy help: clean up kfmclient munging receive-pack: simplify keep_arg computation avoid sprintf and strcpy with flex arrays use alloc_ref rather than hand-allocating "struct ref" color: add overflow checks for parsing colors drop strcpy in favor of raw sha1_to_hex use sha1_to_hex_r() instead of strcpy daemon: use cld->env_array when re-spawning stat_tracking_info: convert to argv_array http-push: use an argv_array for setup_revisions fetch-pack: use argv_array for index-pack / unpack-objects ...
2015-10-15Merge branch 'js/clone-dissociate'Junio C Hamano
"git clone --dissociate" runs a big "git repack" process at the end, and it helps to close file descriptors that are open on the packs and their idx files before doing so on filesystems that cannot remove a file that is still open. * js/clone-dissociate: clone --dissociate: avoid locking pack files sha1_file.c: add a function to release all packs sha1_file: consolidate code to close a pack's file descriptor t5700: demonstrate a Windows file locking issue with `git clone --dissociate`
2015-10-07sha1_file.c: add a function to release all packsJohannes Schindelin
On Windows, files that are in use cannot be removed or renamed. That means that we have to release pack files when we are about to, say, repack them. Let's introduce a convenient function to close all the pack files and their idx files. While at it, we consolidate the close windows/close fd/close index stanza in `free_pack_by_name()` into the `close_pack()` function that is used by the new `close_all_packs()` function to avoid repeated code. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-05sha1_file: consolidate code to close a pack's file descriptorJohannes Schindelin
There was a lot of repeated code to close the file descriptor of a given pack. Let's just refactor this code into a single function. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-05avoid sprintf and strcpy with flex arraysJeff King
When we are allocating a struct with a FLEX_ARRAY member, we generally compute the size of the array and then sprintf or strcpy into it. Normally we could improve a dynamic allocation like this by using xstrfmt, but it doesn't work here; we have to account for the size of the rest of the struct. But we can improve things a bit by storing the length that we use for the allocation, and then feeding it to xsnprintf or memcpy, which makes it more obvious that we are not writing more than the allocated number of bytes. It would be nice if we had some kind of helper for allocating generic flex arrays, but it doesn't work that well: - the call signature is a little bit unwieldy: d = flex_struct(sizeof(*d), offsetof(d, path), fmt, ...); You need offsetof here instead of just writing to the end of the base size, because we don't know how the struct is packed (partially this is because FLEX_ARRAY might not be zero, though we can account for that; but the size of the struct may actually be rounded up for alignment, and we can't know that). - some sites do clever things, like over-allocating because they know they will write larger things into the buffer later (e.g., struct packed_git here). So we're better off to just write out each allocation (or add type-specific helpers, though many of these are one-off allocations anyway). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-05write_loose_object: convert to strbufJeff King
When creating a loose object tempfile, we use a fixed PATH_MAX-sized buffer, and strcpy directly into it. This isn't buggy, because we do a rough check of the size, but there's no verification that our guesstimate of the required space is enough (in fact, it's several bytes too big for the current naming scheme). Let's switch to a strbuf, which makes this much easier to verify. The allocation overhead should be negligible, since we are replacing a static buffer with a static strbuf, and we'll only need to allocate on the first call. While we're here, we can also document a subtle interaction with mkstemp that would be easy to overlook. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25sha1_get_pack_name: use a strbufJeff King
We do some manual memory computation here, and there's no check that our 60 is not overflowed by the raw sprintf (it isn't, because the "which" parameter is never longer than "pack"). We can simplify this greatly with a strbuf. Technically the end result is not identical, as the original took care not to rewrite the object directory on each call for performance reasons. We could do that here, too (by saving the baselen and resetting to it), but it's not worth the complexity; this function is not called a lot (generally once per packfile that we open). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25use strip_suffix and xstrfmt to replace suffixJeff King
When we want to convert "foo.pack" to "foo.idx", we do it by duplicating the original string and then munging the bytes in place. Let's use strip_suffix and xstrfmt instead, which has several advantages: 1. It's more clear what the intent is. 2. It does not implicitly rely on the fact that strlen(".idx") <= strlen(".pack") to avoid an overflow. 3. We communicate the assumption that the input file ends with ".pack" (and get a run-time check that this is so). 4. We drop calls to strcpy, which makes auditing the code base easier. Likewise, we can do this to convert ".pack" to ".bitmap", avoiding some manual memory computation. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25add_packed_git: convert strcpy into xsnprintfJeff King
We have the path "foo.idx", and we create a buffer big enough to hold "foo.pack" and "foo.keep", and then strcpy straight into it. This isn't a bug (we have enough space), but it's very hard to tell from the strcpy that this is so. Let's instead use strip_suffix to take off the ".idx", record the size of our allocation, and use xsnprintf to make sure we don't violate our assumptions. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25use xsnprintf for generating git object headersJeff King
We generally use 32-byte buffers to format git's "type size" header fields. These should not generally overflow unless you can produce some truly gigantic objects (and our types come from our internal array of constant strings). But it is a good idea to use xsnprintf to make sure this is the case. Note that we slightly modify the interface to write_sha1_file_prepare, which nows uses "hdrlen" as an "in" parameter as well as an "out" (on the way in it stores the allocated size of the header, and on the way out it returns the ultimate size of the header). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-09Sync with 2.5.2Junio C Hamano
2015-09-04Sync with 2.4.9Junio C Hamano
2015-09-04Sync with 2.3.9Junio C Hamano
2015-09-04Sync with 2.2.3Junio C Hamano
2015-09-04read_info_alternates: handle paths larger than PATH_MAXJeff King
This function assumes that the relative_base path passed into it is no larger than PATH_MAX, and writes into a fixed-size buffer. However, this path may not have actually come from the filesystem; for example, add_submodule_odb generates a path using a strbuf and passes it in. This is hard to trigger in practice, though, because the long submodule directory would have to exist on disk before we would try to open its info/alternates file. We can easily avoid the bug, though, by simply creating the filename on the heap. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-04Merge branch 'cb/open-noatime-clear-errno' into maintJunio C Hamano
When trying to see that an object does not exist, a state errno leaked from our "first try to open a packfile with O_NOATIME and then if it fails retry without it" logic on a system that refuses O_NOATIME. This confused us and caused us to die, saying that the packfile is unreadable, when we should have just reported that the object does not exist in that packfile to the caller. * cb/open-noatime-clear-errno: git_open_noatime: return with errno=0 on success
2015-08-25Merge branch 'cb/open-noatime-clear-errno'Junio C Hamano
When trying to see that an object does not exist, a state errno leaked from our "first try to open a packfile with O_NOATIME and then if it fails retry without it" logic on a system that refuses O_NOATIME. This confused us and caused us to die, saying that the packfile is unreadable, when we should have just reported that the object does not exist in that packfile to the caller. * cb/open-noatime-clear-errno: git_open_noatime: return with errno=0 on success
2015-08-19Merge branch 'jk/git-path'Junio C Hamano
git_path() and mkpath() are handy helper functions but it is easy to misuse, as the callers need to be careful to keep the number of active results below 4. Their uses have been reduced. * jk/git-path: memoize common git-path "constant" files get_repo_path: refactor path-allocation find_hook: keep our own static buffer refs.c: remove_empty_directories can take a strbuf refs.c: avoid git_path assignment in lock_ref_sha1_basic refs.c: avoid repeated git_path calls in rename_tmp_log refs.c: simplify strbufs in reflog setup and writing path.c: drop git_path_submodule refs.c: remove extra git_path calls from read_loose_refs remote.c: drop extraneous local variable from migrate_file prefer mkpathdup to mkpath in assignments prefer git_pathdup to git_path in some possibly-dangerous cases add_to_alternates_file: don't add duplicate entries t5700: modernize style cache.h: complete set of git_path_submodule helpers cache.h: clarify documentation for git_path, et al
2015-08-19Merge branch 'jc/finalize-temp-file'Junio C Hamano
Long overdue micro clean-up. * jc/finalize-temp-file: sha1_file.c: rename move_temp_to_file() to finalize_object_file()
2015-08-12git_open_noatime: return with errno=0 on successClemens Buchacher
In read_sha1_file_extended we die if read_object fails with a fatal error. We detect a fatal error if errno is non-zero and is not ENOENT. If the object could not be read because it does not exist, this is not considered a fatal error and we want to return NULL. Somewhere down the line, read_object calls git_open_noatime to open a pack index file, for example. We first try open with O_NOATIME. If O_NOATIME fails with EPERM, we retry without O_NOATIME. When the second open succeeds, errno is however still set to EPERM from the first attempt. When we finally determine that the object does not exist, read_object returns NULL and read_sha1_file_extended dies with a fatal error: fatal: failed to read object <sha1>: Operation not permitted Fix this by resetting errno to zero before we call open again. Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Clemens Buchacher <clemens.buchacher@intel.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-10add_to_alternates_file: don't add duplicate entriesJeff King
The add_to_alternates_file function blindly uses hold_lock_file_for_append to copy the existing contents, and then adds the new line to it. This has two minor problems: 1. We might add duplicate entries, which are ugly and inefficient. 2. We do not check that the file ends with a newline, in which case we would bogusly append to the final line. This is quite unlikely in practice, though, as we call this function only from git-clone, so presumably we are the only writers of the file (and we always add a newline). Instead of using hold_lock_file_for_append, let's copy the file line by line, which ensures all records are properly terminated. If we see an extra line, we can simply abort the update (there is no point in even copying the rest, as we know that it would be identical to the original). As a bonus, we also get rid of some calls to the static-buffer mkpath and git_path functions. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-10sha1_file.c: rename move_temp_to_file() to finalize_object_file()Junio C Hamano
Since 5a688fe4 ("core.sharedrepository = 0mode" should set, not loosen, 2009-03-25), we kept reminding ourselves: NEEDSWORK: this should be renamed to finalize_temp_file() as "moving" is only a part of what it does, when no patch between master to pu changes the call sites of this function. without doing anything about it. Let's do so. The purpose of this function was not to move but to finalize. The detail of the primarily implementation of finalizing was to link the temporary file to its final name and then to unlink, which wasn't even "moving". The alternative implementation did "move" by calling rename(2), which is a fun tangent. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-27Merge branch 'jk/fix-refresh-utime' into maintJunio C Hamano
Fix a small bug in our use of umask() return value. * jk/fix-refresh-utime: check_and_freshen_file: fix reversed success-check
2015-07-27Merge branch 'jk/index-pack-reduce-recheck' into maintJunio C Hamano
Disable "have we lost a race with competing repack?" check while receiving a huge object transfer that runs index-pack. * jk/index-pack-reduce-recheck: index-pack: avoid excessive re-reading of pack directory
2015-07-10Merge branch 'jk/fix-refresh-utime'Junio C Hamano
Fix a small bug in our use of umask() return value. * jk/fix-refresh-utime: check_and_freshen_file: fix reversed success-check
2015-07-09Merge branch 'jk/maint-for-each-packed-object'Junio C Hamano
The for_each_packed_object() API function did not iterate over objects in a packfile that hasn't been used yet. * jk/maint-for-each-packed-object: for_each_packed_object: automatically open pack index
2015-07-08check_and_freshen_file: fix reversed success-checkJeff King
When we want to write out a loose object file, we have always first made sure we don't already have the object somewhere. Since 33d4221 (write_sha1_file: freshen existing objects, 2014-10-15), we also update the timestamp on the file, so that a simultaneous prune knows somebody is likely to reference it soon. If our utime() call fails, we treat this the same as not having the object in the first place; the safe thing to do is write out another copy. However, the loose-object check accidentally inverts the utime() check; it returns failure _only_ when the utime() call actually succeeded. Thus it was failing to protect us there, and in the normal case where utime() succeeds, it caused us to pointlessly write out and link the object. This passed our freshening tests, because writing out the new object is certainly _one_ way of updating its utime. So the normal case was inefficient, but not wrong. While we're here, let's also drop a comment in front of the check_and_freshen functions, making a note of their return type (since it is not our usual "0 for success, -1 for error"). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-25Merge branch 'jk/diagnose-config-mmap-failure' into maintJunio C Hamano
The configuration reader/writer uses mmap(2) interface to access the files; when we find a directory, it barfed with "Out of memory?". * jk/diagnose-config-mmap-failure: xmmap(): drop "Out of memory?" config.c: rewrite ENODEV into EISDIR when mmap fails config.c: avoid xmmap error messages config.c: fix mmap leak when writing config read-cache.c: drop PROT_WRITE from mmap of index
2015-06-24Merge branch 'jk/index-pack-reduce-recheck'Junio C Hamano
Disable "have we lost a race with competing repack?" check while receiving a huge object transfer that runs index-pack. * jk/index-pack-reduce-recheck: index-pack: avoid excessive re-reading of pack directory
2015-06-22for_each_packed_object: automatically open pack indexJeff King
When for_each_packed_object is called, we call prepare_packed_git() to make sure we have the actual list of packs. But the latter does not actually open the pack indices, meaning that pack->nr_objects may simply be 0 if the pack has not otherwise been used since the program started. In practice, this didn't come up for the current callers, because they iterate the packed objects only after iterating all reachable objects (so for it to matter you would have to have a pack consisting only of unreachable objects). But it is a dangerous and confusing interface that should be fixed for future callers. Note that we do not end the iteration when a pack cannot be opened, but we do return an error. That lets you complete the iteration even in actively-repacked repository where an .idx file may racily go away, but it also lets callers know that they may not have gotten the complete list (which the current reachability-check caller does care about). We have to tweak one of the prune tests due to the changed return value; an earlier test creates bogus .idx files and does not clean them up. Having to make this tweak is a good thing; it means we will not prune in a broken repository, and the test confirms that we do not negatively impact a more lenient caller, count-objects. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-16Merge branch 'jh/filter-empty-contents' into maintJunio C Hamano
The clean/smudge interface did not work well when filtering an empty contents (failed and then passed the empty input through). It can be argued that a filter that produces anything but empty for an empty input is nonsense, but if the user wants to do strange things, then why not? * jh/filter-empty-contents: sha1_file: pass empty buffer to index empty file
2015-06-11Merge branch 'jk/diagnose-config-mmap-failure'Junio C Hamano
The configuration reader/writer uses mmap(2) interface to access the files; when we find a directory, it barfed with "Out of memory?". * jk/diagnose-config-mmap-failure: xmmap(): drop "Out of memory?" config.c: rewrite ENODEV into EISDIR when mmap fails config.c: avoid xmmap error messages config.c: fix mmap leak when writing config read-cache.c: drop PROT_WRITE from mmap of index
2015-06-09index-pack: avoid excessive re-reading of pack directoryJeff King
Since 45e8a74 (has_sha1_file: re-check pack directory before giving up, 2013-08-30), we spend extra effort for has_sha1_file to give the right answer when somebody else is repacking. Usually this effort does not matter, because after finding that the object does not exist, the next step is usually to die(). However, some code paths make a large number of has_sha1_file checks which are _not_ expected to return 1. The collision test in index-pack.c is such a case. On a local system, this can cause a performance slowdown of around 5%. But on a system with high-latency system calls (like NFS), it can be much worse. This patch introduces a "quick" flag to has_sha1_file which callers can use when they would prefer high performance at the cost of false negatives during repacks. There may be other code paths that can use this, but the index-pack one is the most obviously critical, so we'll start with switching that one. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-05Merge branch 'jk/sha1-file-reduce-useless-warnings' into maintJunio C Hamano
* jk/sha1-file-reduce-useless-warnings: sha1_file: squelch "packfile cannot be accessed" warnings
2015-06-01Merge branch 'jh/filter-empty-contents'Junio C Hamano
The clean/smudge interface did not work well when filtering an empty contents (failed and then passed the empty input through). It can be argued that a filter that produces anything but empty for an empty input is nonsense, but if the user wants to do strange things, then why not? * jh/filter-empty-contents: sha1_file: pass empty buffer to index empty file
2015-05-28xmmap(): drop "Out of memory?"Junio C Hamano
We show that message with die_errno(), but the OS is ought to know why mmap(2) failed much better than we do. There is no reason for us to say "Out of memory?" here. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-28config.c: avoid xmmap error messagesJeff King
The config-writing code uses xmmap to map the existing config file, which will die if the map fails. This has two downsides: 1. The error message is not very helpful, as it lacks any context about the file we are mapping: $ mkdir foo $ git config --file=foo some.key value fatal: Out of memory? mmap failed: No such device 2. We normally do not die in this code path; instead, we'd rather report the error and return an appropriate exit status (which is part of the public interface documented in git-config.1). This patch introduces a "gentle" form of xmmap which lets us produce our own error message. We do not want to use mmap directly, because we would like to use the other compatibility elements of xmmap (e.g., handling 0-length maps portably). The end result is: $ git.compile config --file=foo some.key value error: unable to mmap 'foo': No such device $ echo $? 3 Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-26Merge branch 'jc/hash-object' into maintJunio C Hamano
"hash-object --literally" introduced in v2.2 was not prepared to take a really long object type name. * jc/hash-object: write_sha1_file(): do not use a separate sha1[] array t1007: add hash-object --literally tests hash-object --literally: fix buffer overrun with extra-long object type git-hash-object.txt: document --literally option
2015-05-19Merge branch 'kn/cat-file-literally'Junio C Hamano
Add the "--allow-unknown-type" option to "cat-file" to allow inspecting loose objects of an experimental or a broken type. * kn/cat-file-literally: t1006: add tests for git cat-file --allow-unknown-type cat-file: teach cat-file a '--allow-unknown-type' option cat-file: make the options mutually exclusive sha1_file: support reading from a loose object of unknown type
2015-05-18sha1_file: pass empty buffer to index empty fileJim Hill
`git add` of an empty file with a filter pops complaints from `copy_fd` about a bad file descriptor. This traces back to these lines in sha1_file.c:index_core: if (!size) { ret = index_mem(sha1, NULL, size, type, path, flags); The problem here is that content to be added to the index can be supplied from an fd, or from a memory buffer, or from a pathname. This call is supplying a NULL buffer pointer and a zero size. Downstream logic takes the complete absence of a buffer to mean the data is to be found elsewhere -- for instance, these, from convert.c: if (params->src) { write_err = (write_in_full(child_process.in, params->src, params->size) < 0); } else { write_err = copy_fd(params->fd, child_process.in); } ~If there's a buffer, write from that, otherwise the data must be coming from an open fd.~ Perfectly reasonable logic in a routine that's going to write from either a buffer or an fd. So change `index_core` to supply an empty buffer when indexing an empty file. There's a patch out there that instead changes the logic quoted above to take a `-1` fd to mean "use the buffer", but it seems to me that the distinction between a missing buffer and an empty one carries intrinsic semantics, where the logic change is adapting the code to handle incorrect arguments. Signed-off-by: Jim Hill <gjthill@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-13Merge branch 'jk/prune-mtime' into maintJunio C Hamano
Access to objects in repositories that borrow from another one on a slow NFS server unnecessarily got more expensive due to recent code becoming more cautious in a naive way not to lose objects to pruning. * jk/prune-mtime: sha1_file: only freshen packs once per run sha1_file: freshen pack objects before loose reachable: only mark local objects as recent
2015-05-11Merge branch 'jc/hash-object'Junio C Hamano
"hash-object --literally" introduced in v2.2 was not prepared to take a really long object type name. * jc/hash-object: write_sha1_file(): do not use a separate sha1[] array t1007: add hash-object --literally tests hash-object --literally: fix buffer overrun with extra-long object type git-hash-object.txt: document --literally option
2015-05-11Merge branch 'jk/sha1-file-reduce-useless-warnings'Junio C Hamano
* jk/sha1-file-reduce-useless-warnings: sha1_file: squelch "packfile cannot be accessed" warnings
2015-05-11Merge branch 'nd/multiple-work-trees'Junio C Hamano
A replacement for contrib/workdir/git-new-workdir that does not rely on symbolic links and make sharing of objects and refs safer by making the borrowee and borrowers aware of each other. * nd/multiple-work-trees: (41 commits) prune --worktrees: fix expire vs worktree existence condition t1501: fix test with split index t2026: fix broken &&-chain t2026 needs procondition SANITY git-checkout.txt: a note about multiple checkout support for submodules checkout: add --ignore-other-wortrees checkout: pass whole struct to parse_branchname_arg instead of individual flags git-common-dir: make "modules/" per-working-directory directory checkout: do not fail if target is an empty directory t2025: add a test to make sure grafts is working from a linked checkout checkout: don't require a work tree when checking out into a new one git_path(): keep "info/sparse-checkout" per work-tree count-objects: report unused files in $GIT_DIR/worktrees/... gc: support prune --worktrees gc: factor out gc.pruneexpire parsing code gc: style change -- no SP before closing parenthesis checkout: clean up half-prepared directories in --to mode checkout: reject if the branch is already checked out elsewhere prune: strategies for linked checkouts checkout: support checking out into a new working directory ...
2015-05-06sha1_file: support reading from a loose object of unknown typeKarthik Nayak
Update sha1_loose_object_info() to optionally allow it to read from a loose object file of unknown/bogus type; as the function usually returns the type of the object it read in the form of enum for known types, add an optional "typename" field to receive the name of the type in textual form and a flag to indicate the reading of a loose object file of unknown/bogus type. Add parse_sha1_header_extended() which acts as a wrapper around parse_sha1_header() allowing more information to be obtained. Add unpack_sha1_header_to_strbuf() to unpack sha1 headers of unknown/corrupt objects which have a unknown sha1 header size to a strbuf structure. This was written by Junio C Hamano but tested by me. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Hepled-by: Jeff King <peff@peff.net> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-06Merge branch 'jk/prune-mtime'Junio C Hamano
Access to objects in repositories that borrow from another one on a slow NFS server unnecessarily got more expensive due to recent code becoming more cautious in a naive way not to lose objects to pruning. * jk/prune-mtime: sha1_file: only freshen packs once per run sha1_file: freshen pack objects before loose reachable: only mark local objects as recent
2015-05-05write_sha1_file(): do not use a separate sha1[] arrayJunio C Hamano
In the beginning, write_sha1_file() did not have a way to tell the caller the name of the object it wrote to the caller. This was changed in d6d3f9d0 (This implements the new "recursive tree" write-tree., 2005-04-09) by adding the "returnsha1" parameter to the function so that the callers who are interested in the value can optionally pass a pointer to receive it. It turns out that all callers do want to know the name of the object it just has written. Nobody passes a NULL to this parameter, hence it is not necessary to use a separate sha1[] array to receive the result from write_sha1_file_prepare(), and copy the result to the returnsha1 supplied by the caller. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-05hash-object --literally: fix buffer overrun with extra-long object typeEric Sunshine
"hash-object" learned in 5ba9a93 (hash-object: add --literally option, 2014-09-11) to allow crafting a corrupt/broken object of unknown type. When the user-provided type is particularly long, however, it can overflow the relatively small stack-based character array handed to write_sha1_file_prepare() by hash_sha1_file() and write_sha1_file(), leading to stack corruption (and crash). Introduce a custom helper to allow arbitrarily long typenames just for "hash-object --literally". [jc: Eric's original used a strbuf in the more common codepaths, and I rewrote it to avoid penalizing the non-literally code. Bugs are mine] Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-04-20sha1_file: only freshen packs once per runJeff King
Since 33d4221 (write_sha1_file: freshen existing objects, 2014-10-15), we update the mtime of existing objects that we would have written out (had they not existed). For the common case in which many objects are packed, we may update the mtime on a single packfile repeatedly. This can result in a noticeable performance problem if calling utime() is expensive (e.g., because your storage is on NFS). We can fix this by keeping a per-pack flag that lets us freshen only once per program invocation. An alternative would be to keep the packed_git.mtime flag up to date as we freshen, and freshen only once every N seconds. In practice, it's not worth the complexity. We are racing against prune expiration times here, which inherently must be set to accomodate reasonable program running times (because they really care about the time between an object being written and it becoming referenced, and the latter is typically the last step a program takes). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-04-20sha1_file: freshen pack objects before looseJeff King
When writing out an object file, we first check whether it already exists and if so optimize out the write. Prior to 33d4221, we did this by calling has_sha1_file(), which will check for packed objects followed by loose. Since that commit, we check loose objects first. For the common case of a repository whose objects are mostly packed, this means we will make a lot of extra access() system calls checking for loose objects. We should follow the same packed-then-loose order that all of our other lookups use. Reported-by: Stefan Saasen <ssaasen@atlassian.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>