summaryrefslogtreecommitdiff
path: root/Documentation/technical
diff options
context:
space:
mode:
authorJeff King <peff@peff.net>2017-01-19 16:33:50 (GMT)
committerJunio C Hamano <gitster@pobox.com>2017-01-19 19:17:20 (GMT)
commitabd5a0026868e4c8e59126469dd76fe078689a27 (patch)
treeb6eccd71464f94de5e0edf066a14b117f98df4ba /Documentation/technical
parentad36dc8b4b165bf9eb3576b42a241164e312d48c (diff)
downloadgit-abd5a0026868e4c8e59126469dd76fe078689a27.zip
git-abd5a0026868e4c8e59126469dd76fe078689a27.tar.gz
git-abd5a0026868e4c8e59126469dd76fe078689a27.tar.bz2
clear_delta_base_cache(): don't modify hashmap while iterating
On Thu, Jan 19, 2017 at 03:03:46PM +0100, Ulrich Spörlein wrote: > > I suspect the patch below may fix things for you. It works around it by > > walking over the lru list (either is fine, as they both contain all > > entries, and since we're clearing everything, we don't care about the > > order). > > Confirmed. With the patch applied, I can import the whole 55G in one go > without any crashes or aborts. Thanks much! Thanks. Here it is rolled up with a commit message. -- >8 -- Subject: clear_delta_base_cache(): don't modify hashmap while iterating Removing entries while iterating causes fast-import to access an already-freed `struct packed_git`, leading to various confusing errors. What happens is that clear_delta_base_cache() drops the whole contents of the cache by iterating over the hashmap, calling release_delta_base_cache() on each entry. That function removes the item from the hashmap. The hashmap code may then shrink the table, but the hashmap_iter struct retains an offset from the old table. As a result, the next call to hashmap_iter_next() may claim that the iteration is done, even though some items haven't been visited. The only caller of clear_delta_base_cache() is fast-import, which wants to clear the cache because it is discarding the packed_git struct for its temporary pack. So by failing to remove all of the entries, we still have references to the freed packed_git. To make things even more confusing, this doesn't seem to trigger with the test suite, because it depends on complexities like the size of the hash table, which entries got cleared, whether we try to access them before they're evicted from the cache, etc. So I've been able to identify the problem with large imports like freebsd's svn import, or a fast-export of linux.git. But nothing that would be reasonable to run as part of the normal test suite. We can fix this easily by iterating over the lru linked list instead of the hashmap. They both contain the same entries, and we can use the "safe" variant of the list iterator, which exists for exactly this case. Let's also add a warning to the hashmap API documentation to reduce the chances of getting bit by this again. Reported-by: Ulrich Spörlein <uqs@freebsd.org> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 'Documentation/technical')
-rw-r--r--Documentation/technical/api-hashmap.txt4
1 files changed, 3 insertions, 1 deletions
diff --git a/Documentation/technical/api-hashmap.txt b/Documentation/technical/api-hashmap.txt
index 28f5a8b..a3f020c 100644
--- a/Documentation/technical/api-hashmap.txt
+++ b/Documentation/technical/api-hashmap.txt
@@ -188,7 +188,9 @@ Returns the removed entry, or NULL if not found.
`void *hashmap_iter_next(struct hashmap_iter *iter)`::
`void *hashmap_iter_first(struct hashmap *map, struct hashmap_iter *iter)`::
- Used to iterate over all entries of a hashmap.
+ Used to iterate over all entries of a hashmap. Note that it is
+ not safe to add or remove entries to the hashmap while
+ iterating.
+
`hashmap_iter_init` initializes a `hashmap_iter` structure.
+