summaryrefslogtreecommitdiff
path: root/khash.h
AgeCommit message (Collapse)Author
2023-06-21hash-ll, hashmap: move oidhash() to hash-llElijah Newren
oidhash() was used by both hashmap and khash, which makes sense. However, the location of this function in hashmap.[ch] meant that khash.h had to depend upon hashmap.h, making people unfamiliar with khash think that it was built upon hashmap. (Or at least, I personally was confused for a while about this in the past.) Move this function to hash-ll, so that khash.h can stop depending upon hashmap.h. This has another benefit as well: it allows us to remove hashmap.h's dependency on hash-ll.h. While some callers of hashmap.h were making use of oidhash, most were not, so this change provides another way to reduce the number of includes. Diff best viewed with `--color-moved`. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21khash: name the structs that khash declaresElijah Newren
khash.h lets you instantiate custom hash types that map between two types. These are defined as a struct, as you might expect, and khash typedef's that to kh_foo_t. But it declares the struct anonymously, which doesn't give a name to the struct type itself; there is no "struct kh_foo". This has two small downsides: - when using khash, we declare "kh_foo_t *the_foo". This is unlike our usual naming style, which is "struct kh_foo *the_foo". - you can't forward-declare a typedef of an unnamed struct type in C. So we might do something like this in a header file: struct kh_foo; struct bar { struct kh_foo *the_foo; }; to avoid having to include the header that defines the real kh_foo. But that doesn't work with the typedef'd name. Without the "struct" keyword, the compiler doesn't know we mean that kh_foo is a type. So let's always give khash structs the name that matches our conventions ("struct kh_foo" to match "kh_foo_t"). We'll keep doing the typedef to retain compatibility with existing callers. Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-24hash-ll.h: split out of hash.h to remove dependency on repository.hElijah Newren
hash.h depends upon and includes repository.h, due to the definition and use of the_hash_algo (defined as the_repository->hash_algo). However, most headers trying to include hash.h are only interested in the layout of the structs like object_id. Move the parts of hash.h that do not depend upon repository.h into a new file hash-ll.h (the "low level" parts of hash.h), and adjust other files to use this new header where the convenience inline functions aren't needed. This allows hash.h and object.h to be fairly small, minimal headers. It also exposes a lot of hidden dependencies on both path.h (which was brought in by repository.h) and repository.h (which was previously implicitly brought in by object.h), so also adjust other files to be more explicit about what they depend upon. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-24treewide: remove unnecessary cache.h includesElijah Newren
We had several header files include cache.h unnecessarily. Remove those. These have all been verified via both ensuring that gcc -E $HEADER | grep '"cache.h"' found no hits and that cat >temp.c <<EOF && #include "git-compat-util.h" #include "$HEADER" int main() {} EOF gcc -c temp.c successfully compiles without warnings. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-06khash: clarify that allocations never failRené Scharfe
We use our standard allocation functions and macros (xcalloc, ALLOC_ARRAY, REALLOC_ARRAY) in our version of khash.h. They terminate the program on error instead, so code that's using them doesn't have to handle allocation failures. Make this behavior explicit by turning kh_resize_ into a void function and removing the related unreachable error handling code. Helped-by: Jeff King <peff@peff.net> Signed-off-by: René Scharfe <l.s.r@web.de> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-20hashmap: convert sha1hash() to oidhash()Jeff King
There are no callers left of sha1hash() that do not simply pass the "hash" member of a "struct object_id". Let's get rid of the outdated sha1-specific function and provide one that operates on the whole struct (even though the technique, taking the first few bytes of the hash, will remain the same). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-20khash: rename oid helper functionsJeff King
For use in object_id hash tables, we have oid_hash() and oid_equal(). But these are confusingly similar to the existing oideq() and the oidhash() we plan to add to replace sha1hash(). The big difference from those functions is that rather than accepting a const pointer to the "struct object_id", we take the arguments by value (which is a khash internal convention). So let's make that obvious by calling them oidhash_by_value() and oideq_by_value(). Those names are fairly horrendous to type, but we rarely need to do so; they are passed to the khash implementation macro and then only used internally. Callers get to use the nice kh_put_oid_map(), etc. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-20khash: drop sha1-specific map typesJeff King
All of the callers of khash_sha1 and khash_sha1_pos have been removed, in favor of using maps that use "struct object_id" as their keys. Let's drop these now-obsolete types. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-20khash: rename kh_oid_t to kh_oid_setJeff King
khash lets us define a hash as either a map or a set (i.e., with no "value" type). For the oid maps we define, "oid" is the set and "oid_map" is the map. As the bug in the previous commit shows, it's easy to pick the wrong one. So let's make the names more distinct: "oid_set" and "oid_map". An alternative naming scheme would be to actually name the type after the key/value types. So e.g., "oid" _would_ be the set, since it has no value type. And "oid_map" would become "oid_void" or similar (and "oid_pos" becomes "oid_int"). That's better in some ways: it's more regular, and a given map type can be more reasily reused in multiple contexts (e.g., something storing an "int" that isn't a "pos"). But it's also slightly less descriptive. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-20khash: drop broken oid_map typedefJeff King
Commit 5a8643eff1 (khash: move oid hash table definition, 2019-02-19) added a khash "oid_map" type to match the existing "oid" type, which is a simple set (i.e., just keys, no values). But in setting up the khash_oid_map typedef, it accidentally referred to "kh_oid_t", which is the set type. Nobody noticed the breakage because there are not yet any callers; the type was added just as a match to the existing sha1 types (whose map type confusingly _is_ called khash_sha1, and it has no matching set type). We could easily fix this with s/oid/oid_map/ in the typedef. But let's take this a step further, and just drop the typedef entirely. These typedefs were added by 5a8643eff1 to match the khash_sha1 typedefs. But the actual khash-derived type names are descriptive enough; this is just adding an extra layer of indirection. The khash names do not quite follow our usual style (e.g., they end in "_t"), but since we end up using other khash names (e.g., khiter_t, kh_get_oid()) anyway, just typedef-ing the struct name is not really helping much. And there are already many cases where we use the raw khash type names anyway (e.g., the "set" variant defined just above us does not have such a typedef!). So let's drop this typedef, and the matching oid_pos one (which actually _does_ have a user, but we can easily convert it). We'll leave the khash_sha1 typedef around. The ultimate fate of its callers should be conversion to kh_oid_map_t, so there's no point in going through the noise of changing the names now. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-05-13Merge branch 'dl/no-extern-in-func-decl'Junio C Hamano
Mechanically and systematically drop "extern" from function declarlation. * dl/no-extern-in-func-decl: *.[ch]: manually align parameter lists *.[ch]: remove extern from function declarations using sed *.[ch]: remove extern from function declarations using spatch
2019-05-05*.[ch]: manually align parameter listsDenton Liu
In previous patches, extern was mechanically removed from function declarations without care to formatting, causing parameter lists to be misaligned. Manually format changed sections such that the parameter lists should be realigned. Viewing this patch with 'git diff -w' should produce no output. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-05-05*.[ch]: remove extern from function declarations using sedDenton Liu
There has been a push to remove extern from function declarations. Finish the job by removing all instances of "extern" for function declarations in headers using sed. This was done by running the following on my system with sed 4.2.2: $ git ls-files \*.{c,h} | grep -v ^compat/ | xargs sed -i'' -e 's/^\(\s*\)extern \([^(]*([^*]\)/\1\2/' Files under `compat/` are intentionally excluded as some are directly copied from external sources and we should avoid churning them as much as possible. Then, leftover instances of extern were found by running $ git grep -w -C3 extern \*.{c,h} and manually checking the output. No other instances were found. Note that the regex used specifically excludes function variables which _should_ be left as extern. Not the most elegant way to do it but it gets the job done. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-01khash: move oid hash table definitionbrian m. carlson
Move the oid khash table definition to khash.h and define a typedef for it, similar to the one we have for unsigned char pointers. Define variants that are maps as well. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-24khash: silence -Wunused-function for delta-islandsCarlo Marcelo Arenas Belón
showing the following when compiled with latest clang (OpenBSD, Fedors and macOS): delta-islands.c:23:1: warning: unused function 'kh_destroy_str' [-Wunused-function] delta-islands.c:23:1: warning: unused function 'kh_clear_str' [-Wunused-function] delta-islands.c:23:1: warning: unused function 'kh_del_str' [-Wunused-function] Reported-by: René Scharfe <l.s.r@web.de> Suggested-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-04khash: factor out kh_release_*René Scharfe
Add a function for releasing the khash-internal allocations, but not the khash structure itself. It can be used with on-stack khash structs. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-15Add missing includes and forward declarationsElijah Newren
I looped over the toplevel header files, creating a temporary two-line C program for each consisting of #include "git-compat-util.h" #include $HEADER This patch is the result of manually fixing errors in compiling those tiny programs. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22convert trivial cases to ALLOC_ARRAYJeff King
Each of these cases can be converted to use ALLOC_ARRAY or REALLOC_ARRAY, which has two advantages: 1. It automatically checks the array-size multiplication for overflow. 2. It always uses sizeof(*array) for the element-size, so that it can never go out of sync with the declared type of the array. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-18use REALLOC_ARRAY for changing the allocation size of arraysRené Scharfe
Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-07hashmap: factor out getting a hash code from a SHA1Karsten Blees
Copying the first bytes of a SHA1 is duplicated in six places, however, the implications (the actual value would depend on the endianness of the platform) is documented only once. Add a properly documented API for this. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30pack-bitmap: add support for bitmap indexesVicent Marti
A bitmap index is a `.bitmap` file that can be found inside `$GIT_DIR/objects/pack/`, next to its corresponding packfile, and contains precalculated reachability information for selected commits. The full specification of the format for these bitmap indexes can be found in `Documentation/technical/bitmap-format.txt`. For a given commit SHA1, if it happens to be available in the bitmap index, its bitmap will represent every single object that is reachable from the commit itself. The nth bit in the bitmap is the nth object in the packfile; if it's set to 1, the object is reachable. By using the bitmaps available in the index, this commit implements several new functions: - `prepare_bitmap_git` - `prepare_bitmap_walk` - `traverse_bitmap_commit_list` - `reuse_partial_packfile_from_bitmap` The `prepare_bitmap_walk` function tries to build a bitmap of all the objects that can be reached from the commit roots of a given `rev_info` struct by using the following algorithm: - If all the interesting commits for a revision walk are available in the index, the resulting reachability bitmap is the bitwise OR of all the individual bitmaps. - When the full set of WANTs is not available in the index, we perform a partial revision walk using the commits that don't have bitmaps as roots, and limiting the revision walk as soon as we reach a commit that has a corresponding bitmap. The earlier OR'ed bitmap with all the indexed commits can now be completed as this walk progresses, so the end result is the full reachability list. - For revision walks with a HAVEs set (a set of commits that are deemed uninteresting), first we perform the same method as for the WANTs, but using our HAVEs as roots, in order to obtain a full reachability bitmap of all the uninteresting commits. This bitmap then can be used to: a) limit the subsequent walk when building the WANTs bitmap b) finding the final set of interesting commits by performing an AND-NOT of the WANTs and the HAVEs. If `prepare_bitmap_walk` runs successfully, the resulting bitmap is stored and the equivalent of a `traverse_commit_list` call can be performed by using `traverse_bitmap_commit_list`; the bitmap version of this call yields the objects straight from the packfile index (without having to look them up or parse them) and hence is several orders of magnitude faster. As an extra optimization, when `prepare_bitmap_walk` succeeds, the `reuse_partial_packfile_from_bitmap` call can be attempted: it will find the amount of objects at the beginning of the on-disk packfile that can be reused as-is, and return an offset into the packfile. The source packfile can then be loaded and the bytes up to `offset` can be written directly to the result without having to consider the entires inside the packfile individually. If the `prepare_bitmap_walk` call fails (e.g. because no bitmap files are available), the `rev_info` struct is left untouched, and can be used to perform a manual rev-walk using `traverse_commit_list`. Hence, this new set of functions are a generic API that allows to perform the equivalent of git rev-list --objects [roots...] [^uninteresting...] for any set of commits, even if they don't have specific bitmaps generated for them. In further patches, we'll use this bitmap traversal optimization to speed up the `pack-objects` and `rev-list` commands. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>