path: root/object.c
Age | Commit message | Author
2006-02-12 | Fix object re-hashing | Linus Torvalds
The hashed object lookup had a subtle bug in re-hashing: it did

	for (i = 0; i < count; i++)
		if (objs[i]) {
			.. rehash ..

where "count" was the old hash count. On the face of it that is obvious, since it clearly re-hashes all the old objects.

However, it's wrong.

If the last old hash entry before re-hashing was in use (or became in use by the re-hashing), then re-hashing could have inserted an object into the hash entries with idx >= count due to overflow. When we then re-hash the last old entry, that old entry might become empty, which means that the overflow entries should be re-hashed again.

In other words, the loop has to be fixed to traverse the whole array, rather than just the old count.

(There's room for a slight optimization: instead of counting all the way up, we can break when we see the first empty slot that is above the old "count". At that point we know we don't have any collisions that we might have to fix up any more. This patch only does the trivial fix.)

[jc: with trivial fix on trivial fix]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
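The failure mode described in this message can be reproduced on a toy table. The following is only a sketch under stated assumptions, not the code from object.c: it uses small integer keys instead of object names, `key % size` as the hash, and the names `struct table`, `table_find`, `table_insert` and `table_grow` are hypothetical. The `scan_limit` parameter exists purely so that both loop bounds can be compared.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical open-addressing table; key 0 marks an empty slot. */
struct table {
	unsigned int *slots;
	int size;
};

/* Return the slot holding key, or -1 - (first empty slot on its probe path). */
static int table_find(const struct table *t, unsigned int key)
{
	int i = (int)(key % (unsigned int)t->size);

	while (t->slots[i]) {
		if (t->slots[i] == key)
			return i;
		if (++i == t->size)
			i = 0;	/* linear probing wraps around */
	}
	return -1 - i;
}

static void table_init(struct table *t, int size)
{
	t->slots = calloc((size_t)size, sizeof(*t->slots));
	assert(t->slots);
	t->size = size;
}

/* Insert without growing; the caller must keep at least one slot free. */
static void table_insert(struct table *t, unsigned int key)
{
	int i = table_find(t, key);

	if (i < 0)
		t->slots[-1 - i] = key;
}

/*
 * Grow in place, then re-hash the slots in [0, scan_limit).
 * scan_limit == old size reproduces the bug described above;
 * scan_limit == new_size is the "trivial fix".
 */
static void table_grow(struct table *t, int new_size, int scan_limit)
{
	int i, old_size = t->size;

	t->slots = realloc(t->slots, (size_t)new_size * sizeof(*t->slots));
	assert(t->slots);
	memset(t->slots + old_size, 0,
	       (size_t)(new_size - old_size) * sizeof(*t->slots));
	t->size = new_size;

	for (i = 0; i < scan_limit; i++) {
		if (t->slots[i]) {
			int j = table_find(t, t->slots[i]);

			if (j != i) {	/* not reachable from its new home slot */
				j = -1 - j;
				t->slots[j] = t->slots[i];
				t->slots[i] = 0;
			}
		}
	}
}
```

With an old size of 4, keys 7 and 11 both hash to slot 3, so 11 wraps to slot 0. Growing to 8 moves 11 to slot 4 (an overflow entry above the old count) and then moves 7 from slot 3 to slot 7. If the scan stops at the old count, 11 is stranded behind the now-empty slot 3; scanning the whole array moves it back to slot 3.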
2006-02-12 | hashtable-based objects: minimum fixups. | Junio C Hamano
Calling hashtable_index from find_object before objs is created would result in a division-by-zero failure. Avoid it.

Also, the given object name may not be suitably aligned for unsigned int; avoid dereferencing a cast pointer.

Signed-off-by: Junio C Hamano <junkio@cox.net>
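Both fixups can be illustrated in a few lines. This is a minimal sketch, assuming a 20-byte object name and a table size passed in as a parameter; `name_to_slot` and `nr_slots` are hypothetical names, not the identifiers used in object.c.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: derive a slot from the leading bytes of an object name. */
static int name_to_slot(const unsigned char *name, unsigned int nr_slots)
{
	unsigned int prefix;

	assert(name != NULL);
	if (!nr_slots)
		return -1;	/* table not allocated yet: never compute "% 0" */

	/*
	 * Copy the bytes instead of dereferencing (unsigned int *)name,
	 * because name may sit at any alignment.
	 */
	memcpy(&prefix, name, sizeof(prefix));
	return (int)(prefix % nr_slots);
}
```

The -1 return for an empty table is a choice made for this sketch; the point is only that the modulo is never evaluated with a zero divisor and that the read goes through memcpy.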
2006-02-12 | Use a hashtable for objects instead of a sorted list | Johannes Schindelin
In a simple test, this brings down the CPU time from 47 sec to 22 sec.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-01-07 | [PATCH] Compilation: zero-length array declaration. | Junio C Hamano
ISO C99 (and GCC 3.x or later) lets you write a flexible array at the end of a structure, like this:

	struct frotz {
		int xyzzy;
		char nitfol[]; /* more */
	};

GCC 2.95 and 2.96 let you do this with "char nitfol[0]"; unfortunately this is not allowed by ISO C90.

This patch declares such a construct like this:

	struct frotz {
		int xyzzy;
		char nitfol[FLEX_ARRAY]; /* more */
	};

and git-compat-util.h defines FLEX_ARRAY to 0 for gcc 2.95 and empty for others.

If you are using a C90 C compiler, you should be able to override this with CFLAGS=-DFLEX_ARRAY=1 from the command line of "make".

Signed-off-by: Junio C Hamano <junkio@cox.net>
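As an illustration of how such a structure is typically allocated and used, here is a minimal sketch. It assumes a C99 compiler, so FLEX_ARRAY is defined as empty locally rather than taken from git-compat-util.h, and `frotz_new` is a hypothetical helper that does not appear in the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#ifndef FLEX_ARRAY
#define FLEX_ARRAY /* empty: C99 flexible array member (assumed here) */
#endif

struct frotz {
	int xyzzy;
	char nitfol[FLEX_ARRAY]; /* more */
};

/* Hypothetical helper: allocate the header and the string in one block. */
static struct frotz *frotz_new(int xyzzy, const char *text)
{
	size_t len;
	struct frotz *f;

	assert(text != NULL);
	len = strlen(text);
	f = malloc(sizeof(*f) + len + 1);	/* +1 for the terminating NUL */
	if (!f)
		return NULL;
	f->xyzzy = xyzzy;
	memcpy(f->nitfol, text, len + 1);
	return f;
}
```

When FLEX_ARRAY is overridden to 1, `sizeof(*f)` already includes at least one array element, so this allocation is slightly larger than needed but still correct.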
2005-12-07 | qsort() ptrdiff_t may be larger than int | Junio C Hamano
Morten Welinder <mwelinder@gmail.com> writes:

> The code looks wrong. It assumes that pointers are no larger than ints.
> If pointers are larger than ints, the code does not necessarily compute
> a consistent ordering and qsort is allowed to do whatever it wants.
>
> Morten
>
> static int compare_object_pointers(const void *a, const void *b)
> {
> 	const struct object * const *pa = a;
> 	const struct object * const *pb = b;
> 	return *pa - *pb;
> }

Signed-off-by: Junio C Hamano <junkio@cox.net>
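One way to avoid truncating a pointer difference to int is to compare the pointers explicitly and return only -1, 0 or 1. The sketch below shows that approach; it is not necessarily the exact change made by this commit. The `struct object` layout and the `sort_object_pointers` wrapper are hypothetical, and the pointers are assumed to point into the same array so that relational comparison is well defined.

```c
#include <assert.h>
#include <stdlib.h>

/* Placeholder layout for this sketch only. */
struct object {
	unsigned char sha1[20];
};

static int compare_object_pointers(const void *a, const void *b)
{
	const struct object * const *pa = a;
	const struct object * const *pb = b;

	/* Compare instead of subtracting: the result always fits in an int. */
	if (*pa == *pb)
		return 0;
	return *pa < *pb ? -1 : 1;
}

/* Hypothetical wrapper: sort an array of object pointers by address. */
static void sort_object_pointers(const struct object **ptrs, size_t nr)
{
	assert(ptrs != NULL || nr == 0);
	if (nr)
		qsort(ptrs, nr, sizeof(*ptrs), compare_object_pointers);
}
```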
2005-11-15 | Rework object refs tracking to reduce memory usage | Sergey Vlasov
Store pointers to referenced objects in a variable-sized array instead of a linked list. This cuts down memory usage of utilities which use object references; e.g., git-fsck-objects --full on the git.git repository consumes about 2 MB of memory tracked by Massif instead of 7 MB before the change. Object refs are still the biggest consumer of memory (57%), but the malloc overhead for a single block instead of a linked list is substantially smaller.

Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
Signed-off-by: Junio C Hamano <junkio@cox.net>
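The general shape of the idea, a count followed by an array of pointers in one allocation, can be sketched as follows. All names here (`struct ref_array`, `ref_array_alloc`, `ref_array_set`) are hypothetical and the layout is an assumption for illustration; the structure actually introduced by the patch may differ.

```c
#include <assert.h>
#include <stdlib.h>

struct object;	/* opaque in this sketch */

/* One malloc block per referencing object, instead of one list node per reference. */
struct ref_array {
	unsigned int count;
	struct object *ref[];	/* C99 flexible array member */
};

/* Allocate room for count references, all initially NULL. */
static struct ref_array *ref_array_alloc(unsigned int count)
{
	unsigned int i;
	struct ref_array *refs;

	refs = malloc(sizeof(*refs) + count * sizeof(refs->ref[0]));
	if (!refs)
		return NULL;
	refs->count = count;
	for (i = 0; i < count; i++)
		refs->ref[i] = NULL;
	return refs;
}

static void ref_array_set(struct ref_array *refs, unsigned int i, struct object *obj)
{
	assert(i < refs->count);
	refs->ref[i] = obj;
}
```

Compared with a linked list, this stores no per-reference "next" pointer and pays the allocator's bookkeeping overhead once per block rather than once per reference.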
2005-09-16 | [PATCH] Avoid building object ref lists when not needed | Linus Torvalds
The object parsing code builds a generic "this object references that object" list because doing a full connectivity check for fsck requires it. However, nothing else really needs it, and it's quite expensive for git-rev-list, which can have tons of objects in flight.

So, exactly like the commit buffer save thing, add a global flag to disable it, and use it in git-rev-list.

Before:

	$ /usr/bin/time git-rev-list --objects v2.6.12..HEAD | wc -l
	12.28user 0.29system 0:12.57elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
	0inputs+0outputs (0major+26718minor)pagefaults 0swaps
	59124

After this change:

	$ /usr/bin/time git-rev-list --objects v2.6.12..HEAD | wc -l
	10.33user 0.18system 0:10.54elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
	0inputs+0outputs (0major+18509minor)pagefaults 0swaps
	59124

and note how the number of pages touched by git-rev-list for this particular object list has shrunk from 26,718 (104 MB) to 18,509 (72 MB).

Calculating the total object difference between two git revisions is still clearly the most expensive git operation (both in memory and CPU time), but it's now less than 40% of what it used to be.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-09-11 | [PATCH] Add function to append to an object_list. | Daniel Barkalow
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-08-03 | [PATCH] Object library enhancements | barkalow@iabervon.org
Add function to look up an object which is entirely unknown, so that it can be put in a list. Various other functions related to lists of objects.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-06-27 | [PATCH] Remove "delta" object representation. | Junio C Hamano
Packed delta files created by git-pack-objects seem to be the way to go, and the existing "delta" object handling code has exposed the object representation details to too many places. Remove it while we refactor code to come up with a proper interface in sha1_file.c.

Signed-off-by: Junio C Hamano <junkio@cox.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-22 | [PATCH] Parse tags for absent objects | Daniel Barkalow
Handle parsing a tag for a non-present object. This adds a function that looks up an object with the lookup_* function matching a type name given as a string, so that the object gets the right storage based on the "type" line in the tag.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-08 | [PATCH] Anal retentive 'const unsigned char *sha1' | Jason McMullan
Make 'sha1' parameters const where possible.

Signed-off-by: Jason McMullan <jason.mcmullan@timesys.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-26 | Make "parse_object()" also fill in commit message buffer data. | Linus Torvalds
And teach fsck to free it to save memory.
2005-05-22 | Include file cleanups.. | Linus Torvalds
Add <limits.h> to the include files handled by "cache.h", and remove extraneous #include directives from various .c files. The rule is that "cache.h" gets all the basic stuff, so that we'll have as few system dependencies as possible.
2005-05-20 | [PATCH] delta check | Nicolas Pitre
This adds knowledge of delta objects to fsck-cache and various object parsing code. A new switch to git-fsck-cache is provided to display the maximum delta depth found in a repository.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-06 | [PATCH] don't load and decompress objects twice with parse_object() | Nicolas Pitre
It turns out that parse_object() loads and decompresses the given object, only to free it just before calling the specific object parsing function, which then mmaps and decompresses the same object again. This patch introduces the ability to parse specific objects directly from a memory buffer.

Without this patch, running git-fsck-cache on the kernel repository takes:

	real    0m13.006s
	user    0m11.421s
	sys     0m1.218s

With this patch applied:

	real    0m8.060s
	user    0m7.071s
	sys     0m0.710s

The performance increase is significant, and this is kind of a prerequisite for sane delta object support with fsck.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-04 | [PATCH] Fix memory leaks in git-fsck-cache | Sergey Vlasov
This patch fixes memory leaks in parse_object() and related functions; these leaks were very noticeable when running git-fsck-cache.

Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-28 | [PATCH] Add function to parse an object of unspecified type (take 2) | Daniel Barkalow
This adds a function that parses an object from the database when we have to look up its actual type. It also checks the hash of the file, due to its heritage as part of fsck-cache.

Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-26 | [PATCH] introduce xmalloc and xrealloc | Christopher Li
Introduce xmalloc and xrealloc to die gracefully with a descriptive message when out of memory, rather than taking a SIGSEGV.

Signed-off-by: Christopher Li <chrislgit@chrisli.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
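The pattern behind such wrappers can be sketched as below. Only the names xmalloc and xrealloc come from the commit message; the bodies, the `die_oom` helper, the message text, the exit status and the non-zero-size precondition are all assumptions made for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: report the failure and exit instead of returning NULL. */
static void die_oom(const char *what)
{
	fprintf(stderr, "fatal: out of memory, %s failed\n", what);
	exit(EXIT_FAILURE);
}

static void *xmalloc(size_t size)
{
	void *ret;

	assert(size != 0);	/* sketch-only precondition: malloc(0) may return NULL */
	ret = malloc(size);
	if (!ret)
		die_oom("malloc");
	return ret;
}

static void *xrealloc(void *ptr, size_t size)
{
	void *ret;

	assert(size != 0);	/* sketch-only precondition, as above */
	ret = realloc(ptr, size);
	if (!ret)
		die_oom("realloc");
	return ret;
}
```

Callers can then use the returned pointer without checking for NULL, because an allocation failure ends the program with a message rather than a later crash on a null dereference.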
2005-04-18 | [PATCH] Implementations of parsing functions | Daniel Barkalow
This implements the parsing functions.

Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>