summaryrefslogtreecommitdiff
path: root/hashmap.c
AgeCommit message (Collapse)Author
2019-11-10Fix spelling errors in code commentsElijah Newren
Reported-by: Jens Schleusener <Jens.Schleusener@fossies.org> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07hashmap: remove type arg from hashmap_{get,put,remove}_entryEric Wong
Since these macros already take a `keyvar' pointer of a known type, we can rely on OFFSETOF_VAR to get the correct offset without relying on non-portable `__typeof__' and `offsetof'. Argument order is also rearranged, so `keyvar' and `member' are sequential as they are used as: `keyvar->member' Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07hashmap: introduce hashmap_free_entriesEric Wong
`hashmap_free_entries' behaves like `container_of' and passes the offset of the hashmap_entry struct to the internal `hashmap_free_' function, allowing the function to free any struct pointer regardless of where the hashmap_entry field is located. `hashmap_free' no longer takes any arguments aside from the hashmap itself. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07hashmap: hashmap_{put,remove} return hashmap_entry *Eric Wong
And add *_entry variants to perform container_of as necessary to simplify most callers. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07hashmap: use *_entry APIs for iterationEric Wong
Inspired by list_for_each_entry in the Linux kernel. Once again, these are somewhat compromised usability-wise by compilers lacking __typeof__ support. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07hashmap_cmp_fn takes hashmap_entry paramsEric Wong
Another step in eliminating the requirement of hashmap_entry being the first member of a struct. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07hashmap_get{,_from_hash} return "struct hashmap_entry *"Eric Wong
Update callers to use hashmap_get_entry, hashmap_get_entry_from_hash or container_of as appropriate. This is another step towards eliminating the requirement of hashmap_entry being the first field in a struct. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07hashmap_get_next returns "struct hashmap_entry *"Eric Wong
This is a step towards removing the requirement for hashmap_entry being the first field of a struct. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07hashmap_put takes "struct hashmap_entry *"Eric Wong
This is less error-prone than "void *" as the compiler now detects invalid types being passed. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07hashmap_remove takes "const struct hashmap_entry *"Eric Wong
This is less error-prone than "const void *" as the compiler now detects invalid types being passed. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07hashmap_get takes "const struct hashmap_entry *"Eric Wong
This is less error-prone than "const void *" as the compiler now detects invalid types being passed. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07hashmap_add takes "struct hashmap_entry *"Eric Wong
This is less error-prone than "void *" as the compiler now detects invalid types being passed. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07hashmap_get_next takes "const struct hashmap_entry *"Eric Wong
This is less error-prone than "const void *" as the compiler now detects invalid types being passed. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-07hashmap_entry_init takes "struct hashmap_entry *"Eric Wong
C compilers do type checking to make life easier for us. So rely on that and update all hashmap_entry_init callers to take "struct hashmap_entry *" to avoid future bugs while improving safety and readability. Signed-off-by: Eric Wong <e@80x24.org> Reviewed-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-07hashmap: add API to disable item counting when threadedJeff Hostetler
This is to address concerns raised by ThreadSanitizer on the mailing list about threaded unprotected R/W access to map.size with my previous "disallow rehash" change (0607e10009ee4e37cb49b4cec8d28a9dda1656a4). See: https://public-inbox.org/git/adb37b70139fd1e2bac18bfd22c8b96683ae18eb.1502780344.git.martin.agren@gmail.com/ Add API to hashmap to disable item counting and thus automatic rehashing. Also include API to later re-enable them. When item counting is disabled, the map.size field is invalid. So to prevent accidents, the field has been renamed and an accessor function hashmap_get_size() has been added. All direct references to this field have been been updated. And the name of the field changed to map.private_size to communicate this. Here is the relevant output from ThreadSanitizer showing the problem: WARNING: ThreadSanitizer: data race (pid=10554) Read of size 4 at 0x00000082d488 by thread T2 (mutexes: write M16): #0 hashmap_add hashmap.c:209 #1 hash_dir_entry_with_parent_and_prefix name-hash.c:302 #2 handle_range_dir name-hash.c:347 #3 handle_range_1 name-hash.c:415 #4 lazy_dir_thread_proc name-hash.c:471 #5 <null> <null> Previous write of size 4 at 0x00000082d488 by thread T1 (mutexes: write M31): #0 hashmap_add hashmap.c:209 #1 hash_dir_entry_with_parent_and_prefix name-hash.c:302 #2 handle_range_dir name-hash.c:347 #3 handle_range_1 name-hash.c:415 #4 handle_range_dir name-hash.c:380 #5 handle_range_1 name-hash.c:415 #6 lazy_dir_thread_proc name-hash.c:471 #7 <null> <null> Martin gives instructions for running TSan on test t3008 in this post: https://public-inbox.org/git/CAN0heSoJDL9pWELD6ciLTmWf-a=oyxe4EXXOmCKvsG5MSuzxsA@mail.gmail.com/ Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30hashmap.h: compare function has access to a data fieldStefan Beller
When using the hashmap a common need is to have access to caller provided data in the compare function. A couple of times we abuse the keydata field to pass in the data needed. This happens for example in patch-ids.c. This patch changes the function signature of the compare function to have one more void pointer available. The pointer given for each invocation of the compare function must be defined in the init function of the hashmap and is just passed through. Documentation of this new feature is deferred to a later patch. This is a rather mechanical conversion, just adding the new pass-through parameter. However while at it improve the naming of the fields of all compare functions used by hashmaps by ensuring unused parameters are prefixed with 'unused_' and naming the parameters what they are (instead of 'unused' make it 'unused_keydata'). Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-22hashmap: add disallow_rehash settingJeff Hostetler
Teach hashmap to allow rehashes to be suppressed. This is useful when hashmaps are accessed by multiple threads. It still requires the caller to properly manage their locking. This just prevents unexpected rehashing during inserts and deletes. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-22hashmap: allow memihash computation to be continuedJeff Hostetler
Add variant of memihash() to allow the hash computation to be continued. There are times when we compute the hash on a full path and then the hash on just the path to the parent directory. This can be expensive on large repositories. With this, we can hash the parent directory first. And then continue the computation to include the "/filename". Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22convert trivial cases to FLEX_ARRAY macrosJeff King
Using FLEX_ARRAY macros reduces the amount of manual computation size we have to do. It also ensures we don't overflow size_t, and it makes sure we write the same number of bytes that we allocated. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-07hashmap: add string interning APIKarsten Blees
Interning short strings with high probability of duplicates can reduce the memory footprint and speed up comparisons. Add strintern() and memintern() APIs that use a hashmap to manage the pool of unique, interned strings. Note: strintern(getenv()) could be used to sanitize git's use of getenv(), in case we ever encounter a platform where a call to getenv() invalidates previous getenv() results (which is allowed by POSIX). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-11-18add a hashtable implementation that supports O(1) removalKarsten Blees
The existing hashtable implementation (in hash.[ch]) uses open addressing (i.e. resolve hash collisions by distributing entries across the table). Thus, removal is difficult to implement with less than O(n) complexity. Resolving collisions of entries with identical hashes (e.g. via chaining) is left to the client code. Add a hashtable implementation that supports O(1) removal and is slightly easier to use due to builtin entry chaining. Supports all basic operations init, free, get, add, remove and iteration. Also includes ready-to-use hash functions based on the public domain FNV-1 algorithm (http://www.isthe.com/chongo/tech/comp/fnv). The per-entry data structure (hashmap_entry) is piggybacked in front of the client's data structure to save memory. See test-hashmap.c for usage examples. The hashtable is resized by a factor of four when 80% full. With these settings, average memory consumption is about 2/3 of hash.[ch], and insertion is about twice as fast due to less frequent resizing. Lookups are also slightly faster, because entries are strictly confined to their bucket (i.e. no data of other buckets needs to be traversed). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>