summaryrefslogtreecommitdiff
path: root/refs.c
AgeCommit message (Collapse)Author
2013-01-10Merge branch 'jk/repack-ref-racefix' into maintJunio C Hamano
* jk/repack-ref-racefix: refs: do not use cached refs in repack_without_ref
2012-12-21refs: do not use cached refs in repack_without_refJeff King
When we delete a ref that is packed, we rewrite the whole packed-refs file and simply omit the ref that no longer exists. However, we base the rewrite on whatever happens to be in our refs cache, not what is necessarily on disk. That opens us up to a race condition if another process is simultaneously packing the refs, as we will overwrite their newly-made pack-refs file with our potentially stale data, losing commits. You can demonstrate the race like this: # setup some repositories git init --bare parent && (cd parent && git config core.logallrefupdates true) && git clone parent child && (cd child && git commit --allow-empty -m base) # in one terminal, repack the refs repeatedly cd parent && while true; do git pack-refs --all done # in another terminal, simultaneously push updates to # master, and create and delete an unrelated ref cd child && while true; do git push origin HEAD:newbranch && git commit --allow-empty -m foo us=`git rev-parse master` && git push origin master && git push origin :newbranch && them=`git --git-dir=../parent rev-parse master` && if test "$them" != "$us"; then echo >&2 "$them" != "$us" exit 1 fi done In many cases the two processes will conflict over locking the packed-refs file, and the deletion of newbranch will simply fail. But eventually you will hit the race, which happens like this: 1. We push a new commit to master. It is already packed (from the looping pack-refs call). We write the new value (let us call it B) to $GIT_DIR/refs/heads/master, but the old value (call it A) remains in the packed-refs file. 2. We push the deletion of newbranch, spawning a receive-pack process. Receive-pack advertises all refs to the client, causing it to iterate over each ref; it caches the packed refs in memory, which points at the stale value A. 3. Meanwhile, a separate pack-refs process is running. It runs to completion, updating the packed-refs file to point master at B, and deleting $GIT_DIR/refs/heads/master which also pointed at B. 4. Back in the receive-pack process, we get the instruction to delete :newbranch. We take a lock on packed-refs (which works, as the other pack-refs process has already finished). We then rewrite the contents using the cached refs, which contain the stale value A. The resulting packed-refs file points master once again at A. The loose ref which would override it to point at B was deleted (rightfully) in step 3. As a result, master now points at A. The only trace that B ever existed in the parent is in the reflog: the final entry will show master moving from A to B, even though the ref still points at A (so you can detect this race after the fact, because the next reflog entry will move from A to C). We can fix this by invalidating the packed-refs cache after we have taken the lock. This means that we will re-read the packed-refs file, and since we have the lock, we will be sure that what we read will be atomically up-to-date when we write (it may be out of date with respect to loose refs, but that is OK, as loose refs take precedence). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-26Merge branch 'jh/update-ref-d-through-symref'Junio C Hamano
"update-ref -d --deref SYM" to delete a ref through a symbolic ref that points to it did not remove it correctly. * jh/update-ref-d-through-symref: Fix failure to delete a packed ref through a symref t1400-update-ref: Add test verifying bug with symrefs in delete_ref()
2012-11-09Merge branch 'rs/lock-correct-ref-during-delete'Jeff King
When "update-ref -d --no-deref SYM" tried to delete a symbolic ref SYM, it incorrectly locked the underlying reference pointed by SYM, not the symbolic ref itself. * rs/lock-correct-ref-during-delete: refs: lock symref that is to be deleted, not its target
2012-10-21Fix failure to delete a packed ref through a symrefJohan Herland
When deleting a ref through a symref (e.g. using 'git update-ref -d HEAD' to delete refs/heads/master), we would remove the loose ref, but a packed version of the same ref would remain, the end result being that instead of deleting refs/heads/master we would appear to reset it to its state as of the last repack. This patch fixes the issue, by making sure we pass the correct ref name when invoking repack_without_ref() from within delete_ref(). Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-16refs: lock symref that is to be deleted, not its targetRené Scharfe
When delete_ref is called on a symref then it locks its target and then either deletes the target or the symref, depending on whether the flag REF_NODEREF was set in the parameter delopt. Instead, simply pass the flag to lock_ref_sha1_basic, which will then either lock the target or the symref, and delete the locked ref. This reimplements part of eca35a25 (Fix git branch -m for symrefs.). Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-05peel_ref: check object type before loadingJeff King
The point of peel_ref is to dereference tags; if the base object is not a tag, then we can return early without even loading the object into memory. This patch accomplishes that by checking sha1_object_info for the type. For a packed object, we can get away with just looking in the pack index. For a loose object, we only need to inflate the first couple of header bytes. This is a bit of a gamble; if we do find a tag object, then we will end up loading the content anyway, and the extra lookup will have been wasteful. However, if it is not a tag object, then we save loading the object entirely. Depending on the ratio of non-tags to tags in the input, this can be a minor win or minor loss. However, it does give us one potential major win: if a ref points to a large blob (e.g., via an unannotated tag), then we can avoid looking at it entirely. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-05peel_ref: do not return a null sha1Jeff King
The idea of the peel_ref function is to dereference tag objects recursively until we hit a non-tag, and return the sha1. Conceptually, it should return 0 if it is successful (and fill in the sha1), or -1 if there was nothing to peel. However, the current behavior is much more confusing. For a regular loose ref, the behavior is as described above. But there is an optimization to reuse the peeled-ref value for a ref that came from a packed-refs file. If we have such a ref, we return its peeled value, even if that peeled value is null (indicating that we know the ref definitely does _not_ peel). It might seem like such information is useful to the caller, who would then know not to bother loading and trying to peel the object. Except that they should not bother loading and trying to peel the object _anyway_, because that fallback is already handled by peel_ref. In other words, the whole point of calling this function is that it handles those details internally, and you either get a sha1, or you know that it is not peel-able. This patch catches the null sha1 case internally and converts it into a -1 return value (i.e., there is nothing to peel). This simplifies callers, which do not need to bother checking themselves. Two callers are worth noting: - in pack-objects, a comment indicates that there is a difference between non-peelable tags and unannotated tags. But that is not the case (before or after this patch). Whether you get a null sha1 has to do with internal details of how peel_ref operated. - in show-ref, if peel_ref returns a failure, the caller tries to decide whether to try peeling manually based on whether the REF_ISPACKED flag is set. But this doesn't make any sense. If the flag is set, that does not necessarily mean the ref came from a packed-refs file with the "peeled" extension. But it doesn't matter, because even if it didn't, there's no point in trying to peel it ourselves, as peel_ref would already have done so. In other words, the fallback peeling is guaranteed to fail. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-05peel_ref: use faster deref_tag_noverifyJeff King
When we are asked to peel a ref to a sha1, we internally call deref_tag, which will recursively parse each tagged object until we reach a non-tag. This has the benefit that we will verify our ability to load and parse the pointed-to object. However, there is a performance downside: we may not need to load that object at all (e.g., if we are listing peeled simply listing peeled refs), or it may be a large object that should follow a streaming code path (e.g., an annotated tag of a large blob). It makes more sense for peel_ref to choose the fast thing rather than performing the extra check, for two reasons: 1. We will already sometimes short-circuit the tag parsing in favor of a peeled entry from a packed-refs file. So we are already favoring speed in some cases, and it is not wise for a caller to rely on peel_ref to detect corruption. 2. We already silently ignore much larger corruptions, like a ref that points to a non-existent object, or a tag object that exists but is corrupted. 2. peel_ref is not the right place to check for such a database corruption. It is returning only the sha1 anyway, not the actual object. Any callers which use that sha1 to load an object will soon discover the corruption anyway, so we are really just pushing back the discovery to later in the program. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-29Merge branch 'rs/refs-string-slice'Junio C Hamano
Avoid unnecessary temporary allocations while looking for matching refs inside refs API. By René Scharfe (3) and Junio C Hamano (1) * rs/refs-string-slice: refs: do not create ref_entry when searching refs: use strings directly in find_containing_dir() refs: convert parameter of create_dir_entry() to length-limited string refs: convert parameter of search_ref_dir() to length-limited string
2012-05-29Merge branch 'mh/ref-api-lazy-loose'Junio C Hamano
The code to lazily read loose refs unnecessarily read the refs in a subhierarchy by mistake when we free the data for the subhierarchy. By Michael Haggerty * mh/ref-api-lazy-loose: free_ref_entry(): do not trigger reading of loose refs
2012-05-25Merge branch 'mh/ref-api'Junio C Hamano
Fixes a performance regression in the earlier series.
2012-05-24Avoid sorting if references are added to ref_cache in orderMichael Haggerty
The old code allowed many references to be efficiently added to a single directory, because it just appended the references to the containing directory unsorted without doing any searching (and therefore without requiring any intermediate sorting). But the old code was inefficient when a large number of subdirectories were added to a directory, because the directory always had to be searched to see if the new subdirectory already existed, and this search required the directory to be sorted first. The same was repeated for every new subdirectory, so the time scaled like O(N^2), where N is the number of subdirectories within a single directory. In practice, references are often added to the ref_cache in lexicographic order, for example when reading the packed-refs file. So build some intelligence into add_entry_to_dir() to optimize for the case of references and/or subdirectories being added in lexicographic order: if the existing entries were already sorted, and the new entry comes after the last existing entry, then adjust ref_dir::sorted to reflect the fact that the ref_dir is still sorted. Thanks to Peff for pointing out the performance regression that inspired this change. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-22refs: do not create ref_entry when searchingJunio C Hamano
The search_ref_dir() function is about looking up an existing ref_entry in a sorted array of ref_entry stored in dir->entries, but it still allocates a new ref_entry and frees it before returning. This is only because the call to bsearch(3) was coded in a suboptimal way. Unlike the comparison function given to qsort(3), the first parameter to its comparison function does not need to point at an object that is shaped like an element in the array. Introduce a new comparison function that takes a counted string as the key and an element in an array of ref_entry and give it to bsearch(), so that we do not have to allocate a new ref_entry that we will never return to the caller anyway. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-22refs: use strings directly in find_containing_dir()René Scharfe
Convert the parameter subdirname of search_for_subdir() to a length-limted string and then simply pass the interesting slice of the refname from find_containing_dir(), thereby avoiding to duplicate the string. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-22refs: convert parameter of create_dir_entry() to length-limited stringRené Scharfe
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-22refs: convert parameter of search_ref_dir() to length-limited stringRené Scharfe
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-20free_ref_entry(): do not trigger reading of loose refsMichael Haggerty
Do not call get_ref_dir() from within free_ref_entry(), because that triggers the reading of loose refs, only for them to be freed immediately. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-10Merge branch 'mh/ref-api-lazy-loose'Junio C Hamano
Refs API is updated to lazily read sub-hierarchies of refs/ namespace, so that we do not have to grab everything from the filesystem when we are only interested in listing branches, for example. By Michael Haggerty (17) and Junio C Hamano (1) * mh/ref-api-lazy-loose: refs: fix find_containing_dir() regression refs: read loose references lazily read_loose_refs(): eliminate ref_cache argument struct ref_dir: store a reference to the enclosing ref_cache search_for_subdir(): return (ref_dir *) instead of (ref_entry *) get_ref_dir(): add function for getting a ref_dir from a ref_entry read_loose_refs(): rename function from get_ref_dir() refs: wrap top-level ref_dirs in ref_entries find_containing_dir(): use strbuf in implementation of this function bisect: copy filename string obtained from git_path() do_for_each_reflog(): use a strbuf to hold logfile name do_for_each_reflog(): return early on error get_ref_dir(): take the containing directory as argument refs.c: extract function search_for_subdir() get_ref_dir(): require that the dirname argument ends in '/' get_ref_dir(): rename "base" parameter to "dirname" get_ref_dir(): use a strbuf to hold refname get_ref_dir(): return early if directory cannot be read
2012-05-04refs: fix find_containing_dir() regressionJunio C Hamano
The function used to return NULL when asked to find the containing directory for a ref that does not exist, allowing the caller to omit iteration altogether. But a misconversion in an earlier change "refs.c: extract function search_for_subdir()" started returning the top-level directory entry, forcing callers to walk everything. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03refs: read loose references lazilyMichael Haggerty
Instead of reading the whole directory of loose references the first time any are needed, only read them on demand, one directory at a time. Use a new ref_entry flag bit REF_INCOMPLETE to indicate that the entry represents a REF_DIR that hasn't been read yet. Whenever any entries from such a directory are needed, read all of the loose references from that directory. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03read_loose_refs(): eliminate ref_cache argumentMichael Haggerty
The ref_cache can now be read from the ref_dir. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03struct ref_dir: store a reference to the enclosing ref_cacheMichael Haggerty
This means that a directory ref_entry contains all of the information needed by read_loose_refs(). Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03search_for_subdir(): return (ref_dir *) instead of (ref_entry *)Michael Haggerty
That is what all the callers want, so give it to them. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03get_ref_dir(): add function for getting a ref_dir from a ref_entryMichael Haggerty
Convert all accesses of a ref_dir within a ref_entry to use this function. This function will later be responsible for reading loose references from disk on demand. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03read_loose_refs(): rename function from get_ref_dir()Michael Haggerty
The new name better describes the function's purpose, and also makes the old name available for a more suitable purpose. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03refs: wrap top-level ref_dirs in ref_entriesMichael Haggerty
Make it turtles all the way down. This affects the loose and packed fields of ref_cache instances. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03find_containing_dir(): use strbuf in implementation of this functionMichael Haggerty
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03do_for_each_reflog(): use a strbuf to hold logfile nameMichael Haggerty
This simplifies the bookkeeping and allows an (artificial) restriction on refname component length to be removed. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03do_for_each_reflog(): return early on errorMichael Haggerty
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03get_ref_dir(): take the containing directory as argumentMichael Haggerty
Previously, the "dir" argument to get_ref_dir() was a pointer to the top-level ref_dir. Change the function to expect a pointer to the ref_dir corresponding to dirname. This allows entries to be added directly to dir, without having to recurse through the reference trie each time (i.e., we can use add_entry_to_dir() instead of add_ref()). Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03refs.c: extract function search_for_subdir()Michael Haggerty
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03get_ref_dir(): require that the dirname argument ends in '/'Michael Haggerty
This removes some conditional code and makes it consistent with the way that direntry names are stored. Please note that this function is never used on the top-level .git directory; it is always called for directories at level .git/refs or deeper. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03get_ref_dir(): rename "base" parameter to "dirname"Michael Haggerty
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03get_ref_dir(): use a strbuf to hold refnameMichael Haggerty
This simplifies the bookkeeping and allows an (artificial) restriction on refname component length to be removed. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-02Merge branch 'nd/i18n'Junio C Hamano
More message strings marked for i18n. By Nguyễn Thái Ngọc Duy (10) and Jonathan Nieder (1) * nd/i18n: help: replace underlining "help -a" headers using hyphens with a blank line i18n: bundle: mark strings for translation i18n: index-pack: mark strings for translation i18n: apply: update say_patch_name to give translators complete sentence i18n: apply: mark strings for translation i18n: remote: mark strings for translation i18n: make warn_dangling_symref() automatically append \n i18n: help: mark strings for translation i18n: mark relative dates for translation strbuf: convenience format functions with \n automatically appended Makefile: feed all header files to xgettext
2012-04-25get_ref_dir(): return early if directory cannot be readMichael Haggerty
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-24i18n: make warn_dangling_symref() automatically append \nNguyễn Thái Ngọc Duy
This helps remove \n from translatable strings Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-10do_for_each_ref(): only iterate over the subtree that was requestedMichael Haggerty
If the base argument has a "/" chararacter, then only iterate over the reference subdir whose name is the part up to the last "/". Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-10refs: store references hierarchicallyMichael Haggerty
Store references hierarchically in a tree that matches the pseudo-directory structure of the reference names. Add a new kind of ref_entry (with flag REF_DIR) to represent a whole subdirectory of references. Sort ref_dirs one subdirectory at a time. NOTE: the dirs can now be sorted as a side-effect of other function calls. Therefore, it would be problematic to do something from a each_ref_fn callback that could provoke the sorting of a directory that is currently being iterated over (i.e., the directory containing the entry that is being processed or any of its parents). This is a bit far-fetched, because a directory is always sorted just before being iterated over. Therefore, read-only accesses cannot trigger the sorting of a directory whose iteration has already started. But if a callback function would add a reference to a parent directory of the reference in the iteration, then try to resolve a reference under that directory, a re-sort could be triggered and cause the iteration to work incorrectly. Nevertheless...add a comment in refs.h warning against modifications during iteration. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-10sort_ref_dir(): simplify logicMichael Haggerty
Use the more usual indexing idiom for clarity. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-10refs.c: rename ref_array -> ref_dirMichael Haggerty
This purely textual change is in preparation for storing references hierarchically, when the old ref_array structure will represent one "directory" of references. Rename functions that deal with this structure analogously, and also rename the structure's "refs" member to "entries". Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-10struct ref_entry: nest the value part in a unionMichael Haggerty
This change is obviously silly by itself, but it is a step towards adding a second member to the union. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-10check_refname_component(): return 0 for zero-length componentsMichael Haggerty
Return 0 (instead of -1) for zero-length components. Move the interpretation of zero-length components as illegal to check_refname_format(). This will make it easier to extend check_refname_format() to also check whether directory names are valid. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-10free_ref_entry(): new functionMichael Haggerty
Add a function free_ref_entry(). This function will become nontrivial when ref_entry (soon) becomes polymorphic. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-10names_conflict(): simplify implementationMichael Haggerty
Save a bunch of lines of code and a couple of strlen() calls. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-10repack_without_ref(): reimplement using do_for_each_ref_in_array()Michael Haggerty
It costs a bit of boilerplate, but it means that the function can be ignorant of how cached refs are stored. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-10do_for_each_ref_in_arrays(): new functionMichael Haggerty
Extract function do_for_each_ref_in_arrays() from do_for_each_ref(). The new function will be a useful building block for storing refs hierarchically. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-10do_for_each_ref_in_array(): new functionMichael Haggerty
Extract function do_for_each_ref_in_array() from do_for_each_ref(). The new function will be a useful building block for storing refs hierarchically. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-10refs: manage current_ref within do_one_ref()Michael Haggerty
Set and clear current_ref within do_one_ref() instead of setting it here and leaving it to somebody else to clear it. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>