path: root/index-pack.c
Age | Commit message | Author
2010-01-22make "index-pack" a built-inLinus Torvalds
This required some fairly trivial packfile function 'const' cleanup, since the builtin commands get a const char *argv[] array. Signed-off-by: Linus Torvalds <> Signed-off-by: Junio C Hamano <>
2009-11-10Let 'git <command> -h' show usage without a git dirJonathan Nieder
There is no need for "git <command> -h" to depend on being inside a repository. Reported by Gerfried Fuchs through Signed-off-by: Jonathan Nieder <> Signed-off-by: Junio C Hamano <>
2009-10-01increase portability of NORETURN declarationsErik Faye-Lund
Some compilers (including at least MSVC) support NORETURN on function declarations, but only before the function-name. This patch makes it possible to define NORETURN to something meaningful for those compilers. Signed-off-by: Erik Faye-Lund <> Signed-off-by: Jeff King <>
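The placement rule described above can be sketched as follows. This is a minimal illustration, not the patch itself: the macro name NORETURN comes from the commit message, while the compiler checks and the helpers fail() and checked_div() are assumptions made for the example.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Assumed definitions: pick whatever spelling the compiler understands.
 * Both spellings are accepted in front of the declaration.
 */
#if defined(_MSC_VER)
#define NORETURN __declspec(noreturn)
#elif defined(__GNUC__)
#define NORETURN __attribute__((__noreturn__))
#else
#define NORETURN
#endif

/* The marker goes before the function name, which all cases above accept. */
NORETURN void fail(const char *msg);

NORETURN void fail(const char *msg)
{
	fprintf(stderr, "fatal: %s\n", msg);
	exit(128);
}

/* Hypothetical caller: no return statement is needed after fail(). */
int checked_div(int a, int b)
{
	if (b == 0)
		fail("division by zero");
	return a / b;
}
```

Putting the marker first means one macro can expand to either the GCC attribute or the MSVC declspec without touching the declarations again.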
2009-07-06Merge branch 'tr/die_errno'Junio C Hamano
* tr/die_errno:
  Use die_errno() instead of die() when checking syscalls
  Convert existing die(..., strerror(errno)) to die_errno()
  die_errno(): double % in strerror() output just in case
  Introduce die_errno() that appends strerror(errno) to die()
2009-06-27Convert existing die(..., strerror(errno)) to die_errno()Thomas Rast
Change calls to die(..., strerror(errno)) to use the new die_errno(). In the process, also make slight style adjustments: at least state _something_ about the function that failed (instead of just printing the pathname), and put paths in single quotes. Signed-off-by: Thomas Rast <> Signed-off-by: Junio C Hamano <>
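The message style this conversion aims for (say what failed, quote the path, append the strerror() text) can be sketched like this. format_errno_message() is a hypothetical stand-in written for illustration; it is not git's die_errno(), which also terminates the process.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format the caller's message, then append
 * ": " and strerror(err). Returns the total length, or -1 if the
 * buffer is too small.
 */
int format_errno_message(char *buf, size_t len, int err,
			 const char *fmt, ...)
{
	va_list ap;
	int n, m;

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	if (n < 0 || (size_t)n >= len)
		return -1;

	m = snprintf(buf + n, len - (size_t)n, ": %s", strerror(err));
	if (m < 0 || (size_t)m >= len - (size_t)n)
		return -1;
	return n + m;
}
```

A call such as format_errno_message(buf, sizeof(buf), errno, "cannot open '%s'", path) states the failing operation and quotes the path, as the commit message recommends.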
2009-06-21Fix various sparse warnings in the git source codeLinus Torvalds
There are a few remaining ones, but this fixes the trivial ones. It boils down to two main issues that sparse complains about:

 - warning: Using plain integer as NULL pointer

   Sparse doesn't like you using '0' instead of 'NULL'. For various good reasons, not the least of which is just the visual confusion. A NULL pointer is not an integer, and that whole "0 works as NULL" is a historical accident and not very pretty.

   A few of these remain: zlib is a total mess, and Z_NULL is just a 0. I didn't touch those.

 - warning: symbol 'xyz' was not declared. Should it be static?

   Sparse wants to see declarations for any functions you export. A lack of a declaration tends to mean that you should either add one, or you should mark the function 'static' to show that it's in file scope.

   A few of these remain: I only did the ones that should obviously just be made static.

That 'wt_status_submodule_summary' one is debatable. It has a few related flags (like 'wt_status_use_color') which _are_ declared, and are used by builtin-commit.c. So maybe we'd like to export it at some point, but it's not declared now, and not used outside of that file, so 'static' it is in this patch.

Signed-off-by: Linus Torvalds <> Signed-off-by: Junio C Hamano <>
2009-06-18Fix big left-shifts of unsigned charLinus Torvalds
Shifting 'unsigned char' or 'unsigned short' left can result in sign extension errors, since the C integer promotion rules mean that the unsigned char/short will get implicitly promoted to a signed 'int' due to the shift (or due to other operations).

This normally doesn't matter, but if you shift things up sufficiently, it will now set the sign bit in 'int', and a subsequent cast to a bigger type (eg 'long' or 'unsigned long') will now sign-extend the value despite the original expression being unsigned.

One example of this would be something like

	unsigned long size;
	unsigned char c;

	size += c << 24;

where despite all the variables being unsigned, 'c << 24' ends up being a signed entity, and will get sign-extended when then doing the addition in an 'unsigned long' type.

Since git uses 'unsigned char' pointers extensively, we actually have this bug in a couple of places.

I may have missed some, but this is the result of looking at

	git grep '[^0-9 ][ ]*<<[ ][a-z]' -- '*.c' '*.h'
	git grep '<<[ ]*24'

which catches at least the common byte cases (shifting variables by a variable amount, and shifting by 24 bits).

I also grepped for just 'unsigned char' variables in general, and converted the ones that most obviously ended up getting implicitly cast immediately anyway (eg hash_name(), encode_85()).

In addition to just avoiding 'unsigned char', this patch also tries to use a common idiom for the delta header size thing. We had three different variations on it: "& 0x7fUL" in one place (getting the sign extension right), and "& ~0x80" and "& 0x7f" in two other places (not getting it right). Apart from making them all just avoid using "unsigned char" at all, I also unified them to then use a simple "& 0x7f".

I considered making a sparse extension which warns about doing implicit casts from unsigned types to signed types, but it gets rather complex very quickly, so this is just a hack.

Signed-off-by: Linus Torvalds <> Signed-off-by: Junio C Hamano <>
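One way to avoid the promotion problem described in this commit is to widen the byte before shifting, or to hold it in a wider variable from the start. The sketch below shows both; add_high_byte() and decode_size() are hypothetical names, and decode_size() is only modeled on the "& 0x7f" idiom the message mentions, not copied from git.

```c
#include <limits.h>

/* Widen first, so the shift is done in unsigned long and never in int. */
unsigned long add_high_byte(unsigned long size, unsigned char c)
{
	return size + ((unsigned long)c << 24);
}

/*
 * Variable-length size: 7 payload bits per byte, high bit set means
 * "more bytes follow". Reading each byte into an unsigned long keeps
 * the mask and shift unsigned throughout.
 */
unsigned long decode_size(const unsigned char **datap,
			  const unsigned char *end)
{
	const unsigned char *data = *datap;
	unsigned long cmd, size = 0;
	int i = 0;

	do {
		cmd = *data++;
		size |= (cmd & 0x7f) << i;
		i += 7;
	} while ((cmd & 0x80) && data < end &&
		 i < (int)(sizeof(size) * CHAR_BIT));

	*datap = data;
	return size;
}
```

The loop bound on i is an addition for this example, so that malformed input cannot shift by more than the width of the type.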
2009-04-06Merge branch 'jc/shared-literally'Junio C Hamano
* jc/shared-literally:
  t1301: loosen test for forced modes
  set_shared_perm(): sometimes we know what the final mode bits should look like
  move_temp_to_file(): do not forget to chmod() in "Coda hack" codepath
  Move chmod(foo, 0444) into move_temp_to_file()
  "core.sharedrepository = 0mode" should set, not loosen
2009-03-28Move chmod(foo, 0444) into move_temp_to_file()Johan Herland
When writing out a loose object or a pack (index), move_temp_to_file() is called to finalize the resulting file. These files (loose files and packs) should all have permission mode 0444 (modulo adjust_shared_perm()). Therefore, instead of doing chmod(foo, 0444) explicitly from each callsite (or even forgetting to chmod() at all), do the chmod() call from within move_temp_to_file(). Signed-off-by: Johan Herland <> Signed-off-by: Junio C Hamano <>
2009-03-16Fix various dead stores found by the clang static analyzerBenjamin Kramer
http-push.c::finish_request():
  request is initialized by the for loop
index-pack.c::free_base_data():
  b is initialized by the for loop
merge-recursive.c::process_renames():
  move compare to narrower scope, and remove unused assignments to it
  remove unused variable renames2
xdiff/xdiffi.c::xdl_recs_cmp():
  remove unused variable ec
xdiff/xemit.c::xdl_emit_diff():
  xche is always overwritten

Signed-off-by: Benjamin Kramer <> Signed-off-by: Junio C Hamano <>
2009-02-25Merge branch 'jc/maint-1.6.0-pack-directory'Junio C Hamano
* jc/maint-1.6.0-pack-directory: Make sure objects/pack exists before creating a new pack
2009-02-25Make sure objects/pack exists before creating a new packJunio C Hamano
In a repository created with git older than f49fb35 (git-init-db: create "pack" subdirectory under objects, 2005-06-27), objects/pack/ directory is not created upon initialization. It was Ok because subdirectories are created as needed inside directories init-db creates, and back then, packfiles were a recent invention.

After the said commit, new codepaths started relying on the presence of objects/pack/ directory in the repository. This was exacerbated with 8b4eb6b (Do not perform cross-directory renames when creating packs, 2008-09-22) that moved the location temporary pack files are created from objects/ directory to objects/pack/ directory, because moving temporary to the final location was done carefully with lazy leading directory creation.

Many packfile related operations in such an old repository can fail mysteriously because of this.

This commit introduces two helper functions to make things work better.

 - odb_mkstemp() is a specialized version of mkstemp() to refactor the code and teach it to create leading directories as needed;

 - odb_pack_keep() refactors the code to create a ".keep" file while creating leading directories as needed.

Signed-off-by: Junio C Hamano <>
2009-02-01Merge branch 'sp/runtime-prefix'Junio C Hamano
* sp/runtime-prefix:
  Windows: Revert to default paths and convert them by RUNTIME_PREFIX
  Compute prefix at runtime if RUNTIME_PREFIX is set
  Modify setup_path() to only add git_exec_path() to PATH
  Add calls to git_extract_argv0_path() in programs that call git_config_*
  git_extract_argv0_path(): Move check for valid argv0 from caller to callee
  Refactor git_set_argv0_path() to git_extract_argv0_path()
  Move computation of absolute paths from Makefile to runtime (in preparation for RUNTIME_PREFIX)
2009-01-26Add calls to git_extract_argv0_path() in programs that call git_config_*Steffen Prohaska
Programs that use git_config need to find the global configuration. When runtime prefix computation is enabled, this requires that git_extract_argv0_path() is called early in the program's main(). This commit adds the necessary calls. Signed-off-by: Steffen Prohaska <> Acked-by: Johannes Sixt <> Signed-off-by: Junio C Hamano <>
2009-01-22Merge branch 'lt/maint-wrap-zlib'Junio C Hamano
* lt/maint-wrap-zlib:
  Wrap inflate and other zlib routines for better error reporting

Conflicts:
  http-push.c
  http-walker.c
  sha1_file.c
2009-01-11Wrap inflate and other zlib routines for better error reportingLinus Torvalds
R. Tyler Ballance reported a mysterious transient repository corruption; after much digging, it turns out that we were not catching and reporting memory allocation errors from some calls we make to zlib. This one _just_ wraps things; it doesn't do the "retry on low memory error" part, at least not yet. It is an independent issue from the reporting. Some of the errors are expected and passed back to the caller, but we die when zlib reports it failed to allocate memory for now. Signed-off-by: Junio C Hamano <>
2009-01-05remove trailing LF in die() messagesAlexander Potashev
LF at the end of format strings given to die() is redundant because die already adds one on its own. Signed-off-by: Alexander Potashev <> Signed-off-by: Junio C Hamano <>
2008-11-13Merge branch 'np/pack-safer'Junio C Hamano
* np/pack-safer:
  t5303: fix printf format string for portability
  t5303: work around printf breakage in dash
  pack-objects: don't leak pack window reference when splitting packs
  extend test coverage for latest pack corruption resilience improvements
  pack-objects: allow "fixing" a corrupted pack without a full repack
  make find_pack_revindex() aware of the nasty world
  make check_object() resilient to pack corruptions
  make packed_object_info() resilient to pack corruptions
  make unpack_object_header() non fatal
  better validation on delta base object offsets
  close another possibility for propagating pack corruption
2008-11-03Merge branch 'np/index-pack'Junio C Hamano
* np/index-pack:
  index-pack: don't leak leaf delta result
  improve index-pack tests
  fix multiple issues in index-pack
  index-pack: smarter memory usage during delta resolution
  index-pack: rationalize delta resolution code
2008-11-02better validation on delta base object offsetsNicolas Pitre
In one case, it was possible to have a bad offset equal to 0 effectively pointing a delta onto itself and crashing git after too many recursions. In the other cases, a negative offset could result due to off_t being signed. Catch those. Signed-off-by: Nicolas Pitre <> Signed-off-by: Junio C Hamano <>
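The two failure modes named above (a zero distance that points a delta at itself, and a distance that reaches back past the start of the pack) can be checked before the base offset is used. This is a sketch under assumptions: resolve_base_offset() and its parameters are hypothetical names, and int64_t stands in for a signed off_t.

```c
#include <stdint.h>

/*
 * obj_offset: where the delta object starts in the pack.
 * distance:   decoded backwards distance to its base.
 * Returns 0 and stores the base offset, or -1 if the offset is bad.
 */
int resolve_base_offset(int64_t obj_offset, uint64_t distance,
			int64_t *base)
{
	if (obj_offset <= 0)
		return -1;
	if (distance == 0)
		return -1;	/* delta would use itself as its base */
	if (distance >= (uint64_t)obj_offset)
		return -1;	/* base would be at or before offset 0 */

	*base = obj_offset - (int64_t)distance;
	return 0;
}
```

Comparing in the unsigned domain before subtracting avoids ever producing the negative offset the commit message warns about.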
2008-10-24index-pack: don't leak leaf delta resultNicolas Pitre
Another (but minor this time) fallout from commit 9441b61 (index-pack: rationalize delta resolution code, 2008-10-17). Signed-off-by: Nicolas Pitre <> Signed-off-by: Junio C Hamano <>
2008-10-21Merge branch 'maint'Junio C Hamano
* maint: GIT rehabilitate 'git index-pack' inside the object store
2008-10-21rehabilitate 'git index-pack' inside the object storeNicolas Pitre
Before commit d0b92a3f6e it was possible to run 'git index-pack' directly in the .git/objects/pack/ directory. Restore that ability. Signed-off-by: Nicolas Pitre <> Signed-off-by: Junio C Hamano <>
2008-10-20fix multiple issues in index-packNicolas Pitre
Since commit 9441b61dc5, two issues affected correct behavior of index-pack:

1) The real_type of a delta object is the 'real_type' of its base, not the 'type' which can be a "delta type". Consequence of this is a corrupted pack index file which only needs to be recreated with a good index-pack command ('git verify-pack' will flag those).

2) The code sequence:

	result->data = patch_delta(get_base_data(base), base->obj->size,
				   delta_data, delta_size, &result->size);

   has two issues of its own since base->obj->size should instead be base->size as we want the size of the actual object data and not the size of the delta object it is represented by. Except that simply replacing base->obj->size with base->size won't make the code more correct as the C language doesn't enforce a particular ordering for the evaluation of needed arguments for a function call, hence base->size could be pushed on the stack before get_base_data() which initializes base->size is called.

Signed-off-by: Nicolas Pitre <> Tested-by: Jeff King <> Signed-off-by: Junio C Hamano <>
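The evaluation-order hazard in point 2 goes away once the call that fills in the size is sequenced in its own statement. The sketch below illustrates that; struct base_data and get_base_data() borrow their names from the commit message, but their bodies here are simplified assumptions, and fake_patch() and apply_in_order() are hypothetical stand-ins.

```c
#include <stddef.h>

struct base_data {
	const void *data;
	unsigned long size;	/* only valid once data has been loaded */
};

static const char stored_object[] = "hello";

/* Lazily "loads" the object and, as a side effect, sets base->size. */
const void *get_base_data(struct base_data *base)
{
	if (!base->data) {
		base->data = stored_object;
		base->size = sizeof(stored_object) - 1;
	}
	return base->data;
}

/* Stand-in for patch_delta(): reports the base size it was handed. */
unsigned long fake_patch(const void *base_buf, unsigned long base_size)
{
	(void)base_buf;
	return base_size;
}

unsigned long apply_in_order(struct base_data *base)
{
	/* Load first, in a separate statement; then read the size. */
	const void *buf = get_base_data(base);

	return fake_patch(buf, base->size);
}
```

Writing fake_patch(get_base_data(base), base->size) instead would leave the compiler free to read base->size before the load has set it.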
2008-10-18Do not rename read-only files during a pushPetr Baudis
Win32 does not allow renaming read-only files (at least on a Samba share), which makes a push into a local directory fail. Thus, defer the chmod() call in index-pack.c:final() until after move_temp_to_file() has been called. Signed-off-by: Petr Baudis <> Signed-off-by: Shawn O. Pearce <>
2008-10-18index-pack: smarter memory usage during delta resolutionNicolas Pitre
There is no need to keep the base object data around after its last delta has been resolved. This also means that long delta chains with only one delta per base won't grow the cache size unnecessarily as the base will be freed before recursing down. To make it easy, find_delta_children() is modified so the first and last indices are initialized in all cases. Signed-off-by: Nicolas Pitre <> Signed-off-by: Junio C Hamano <>
2008-10-18index-pack: rationalize delta resolution codeNicolas Pitre
Instead of having strange loops for walking unresolved deltas with the same base duplicated in many places, let's rework the code so this is done in a single place instead. This simplifies callers quite a bit too. Signed-off-by: Nicolas Pitre <> Signed-off-by: Junio C Hamano <>
2008-10-10Merge branch 'maint'Shawn O. Pearce
* maint:
  rebase -i: do not fail when there is no commit to cherry-pick
  test-lib: fix color reset in say_color()
  fix pread()'s short read in index-pack

Conflicts:
  csum-file.c
2008-10-10fix pread()'s short read in index-packNicolas Pitre
Since v1.6.0.2~13^2~ the completion of a thin pack uses sha1write() for its ability to compute a SHA1 on the written data. This also provides data buffering which, along with commit 92392b4a45, will confuse pread() whenever an appended object is 1) freed due to memory pressure because of the depth-first delta processing, and 2) needed again because it has many delta children, and 3) its data is still buffered by sha1write(). Let's fix the issue by simply forcing cached data out when such an object is written so it can be pread()'d at leisure. Signed-off-by: Nicolas Pitre <> Signed-off-by: Shawn O. Pearce <>
2008-10-08Merge branch 'maint'Shawn O. Pearce
* maint:
  Do not use errno when pread() returns 0
  git init: --bare/--shared overrides system/global config
  git-push.txt: Describe --repo option in more detail
  git rm: refresh index before up-to-date check
  Fix a few typos in relnotes
2008-10-08Do not use errno when pread() returns 0Samuel Tardieu
If we use pread() while at the end of the file, it will return 0, which is not an error from the operating system point of view. In this case, errno has not been set and must not be used. Signed-off-by: Samuel Tardieu <> Signed-off-by: Shawn O. Pearce <>
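The distinction drawn here, a negative return where errno is meaningful versus a zero return where it is not, can be kept visible in a small wrapper. This is a sketch only: pread_fully() and its return codes are assumptions for illustration, not code from index-pack.c.

```c
#define _XOPEN_SOURCE 700	/* for pread() under strict -std= modes */
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read exactly len bytes at offset.
 * Returns  0 on success,
 *         -1 on an OS error (errno is valid),
 *         -2 on premature end of file (errno was not set; do not use it).
 */
int pread_fully(int fd, void *buf, size_t len, off_t offset)
{
	char *p = buf;

	while (len > 0) {
		ssize_t n = pread(fd, p, len, offset);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0)
			return -2;
		p += n;
		offset += n;
		len -= (size_t)n;
	}
	return 0;
}
```

A caller can then report strerror(errno) only for -1 and print a plain "premature end of file" style message for -2.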
2008-10-03fix openssl headers conflicting with custom SHA1 implementationsNicolas Pitre
On ARM I have the following compilation errors:

    CC fast-import.o
    In file included from cache.h:8,
                     from builtin.h:6,
                     from fast-import.c:142:
    arm/sha1.h:14: error: conflicting types for 'SHA_CTX'
    /usr/include/openssl/sha.h:105: error: previous declaration of 'SHA_CTX' was here
    arm/sha1.h:16: error: conflicting types for 'SHA1_Init'
    /usr/include/openssl/sha.h:115: error: previous declaration of 'SHA1_Init' was here
    arm/sha1.h:17: error: conflicting types for 'SHA1_Update'
    /usr/include/openssl/sha.h:116: error: previous declaration of 'SHA1_Update' was here
    arm/sha1.h:18: error: conflicting types for 'SHA1_Final'
    /usr/include/openssl/sha.h:117: error: previous declaration of 'SHA1_Final' was here
    make: *** [fast-import.o] Error 1

This is because openssl header files are always included in git-compat-util.h since commit 684ec6c63c whenever NO_OPENSSL is not set, which somehow brings in <openssl/sha1.h> clashing with the custom ARM version. Compilation of git is probably broken on PPC too for the same reason.

Turns out that the only file requiring openssl/ssl.h and openssl/err.h is imap-send.c. But only moving those problematic includes there doesn't solve the issue as it also includes cache.h which brings in the conflicting local SHA1 header file.

As suggested by Jeff King, the best solution is to rename our references to SHA1 functions and structure to something git specific, and define those according to the implementation used.

Signed-off-by: Nicolas Pitre <> Signed-off-by: Shawn O. Pearce <>
2008-09-22Do not perform cross-directory renames when creating packsPetr Baudis
A comment on top of create_tmpfile() describes caveats ('can have problems on various systems (FAT, NFS, Coda)') that should apply in this situation as well. This in the end did not end up solving any of my personal problems, but it might be a useful cleanup patch nevertheless. Signed-off-by: Petr Baudis <> Acked-by: Linus Torvalds <> Signed-off-by: Junio C Hamano <>
2008-08-30index-pack: use fixup_pack_header_footer()'s validation modeNicolas Pitre
When completing a thin pack, a new header has to be written to the pack and a new SHA1 computed. Make sure that the SHA1 of what is being read back matches the SHA1 of what was written for both: the original pack and the appended objects. To do so, a couple write_or_die() calls were converted to sha1write() which has the advantage of doing some buffering as well as handling SHA1 and CRC32 checksum already. Signed-off-by: Nicolas Pitre <> Signed-off-by: Junio C Hamano <>
2008-08-30improve reliability of fixup_pack_header_footer()Nicolas Pitre
Currently, this function has the potential to read corrupted pack data from disk and give it a valid SHA1 checksum. Let's add the ability to validate SHA1 checksum of existing data along the way, including before and after any arbitrary point in the pack. Signed-off-by: Nicolas Pitre <> Signed-off-by: Junio C Hamano <>
2008-08-26index-pack: setup git repositoryNguyễn Thái Ngọc Duy
"git index-pack" is an independent command and does not set up the git repository, yet it still needs pack.indexversion. It may miss that setting if it is run in a subdirectory of the repository. Signed-off-by: Nguyễn Thái Ngọc Duy <> Signed-off-by: Junio C Hamano <>
2008-07-25Merge branch 'maint'Junio C Hamano
* maint:
  Makefile: fix shell quoting
  tests: propagate $(TAR) down from the toplevel Makefile
  index-pack.c: correctly initialize appended objects
  send-email: find body-encoding correctly
2008-07-25index-pack.c: correctly initialize appended objectsBjörn Steinbrink
When index-pack completes a thin pack it appends objects to the pack. Since the commit 92392b4(index-pack: Honor core.deltaBaseCacheLimit when resolving deltas) such an object can be pruned in case of memory pressure, and will be read back again by get_data_from_pack(). For this to work, the fields in object_entry structure need to be initialized properly. Noticed by Pierre Habouzit. Signed-off-by: Björn Steinbrink <> Acked-by: Nicolas Pitre <> Acked-by: Shawn O. Pearce <> Signed-off-by: Junio C Hamano <>
2008-07-17Merge branch 'sb/dashless'Junio C Hamano
* sb/dashless:
  Make usage strings dash-less
  t/: Use "test_must_fail git" instead of "! git"
  t/ exit with small negative int is ok with test_must_fail

Conflicts:
  builtin-blame.c
  builtin-mailinfo.c
  builtin-mailsplit.c
  builtin-shortlog.c
  t/
  t/
2008-07-17Merge branch 'sp/maint-index-pack'Junio C Hamano
* sp/maint-index-pack:
  index-pack: Honor core.deltaBaseCacheLimit when resolving deltas
  index-pack: Track the object_entry that creates each base_data
  index-pack: Chain the struct base_data on the stack for traversal
  index-pack: Refactor base arguments of resolve_delta into a struct
2008-07-15index-pack: Honor core.deltaBaseCacheLimit when resolving deltasShawn O. Pearce
If we are trying to resolve deltas for a long delta chain composed of multi-megabyte objects we can easily run into requiring 500M+ of memory to hold each object in the chain on the call stack while we recurse into the dependent objects and resolve them. We now use a simple delta cache that discards objects near the bottom of the call stack first, as they are the least recently used objects in this current delta chain. If we recurse out of a chain we may find the base object is no longer available, as it was free'd to keep memory under the deltaBaseCacheLimit. In such cases we must unpack the base object again, which will require recursing back to the root of the top of the delta chain as we released that root first. The astute reader will probably realize that we can still exceed the delta base cache limit, but this happens only if the most recent base plus the delta plus the inflated dependent sum up to more than the base cache limit. Due to the way patch_delta is currently implemented we cannot operate in less memory anyway. Signed-off-by: Shawn O. Pearce <> Signed-off-by: Junio C Hamano <>
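The eviction policy described above (walk the chain from the bottom of the call stack and free inflated data until usage drops under the limit, never touching the entry currently in use) might look roughly like this. It is a simplified sketch: struct cached_base and prune_bases() are hypothetical, and the real code keeps its chain head and byte count in globals rather than passing them in.

```c
#include <stdlib.h>

struct cached_base {
	struct cached_base *child;	/* next entry toward the top of the stack */
	void *data;			/* inflated object, NULL once pruned */
	size_t size;
};

/*
 * Free data starting from the oldest entry until 'used' fits under
 * 'limit'. The entry 'retain' is skipped because the caller is about
 * to use it. Returns the new number of bytes in use.
 */
size_t prune_bases(struct cached_base *bottom,
		   const struct cached_base *retain,
		   size_t used, size_t limit)
{
	struct cached_base *b;

	for (b = bottom; b && used > limit; b = b->child) {
		if (b->data && b != retain) {
			free(b->data);
			b->data = NULL;
			used -= b->size;
		}
	}
	return used;
}
```

A pruned entry keeps its place in the chain with data set to NULL, which is what lets a later lookup notice the miss and unpack the object again.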
2008-07-15index-pack: Track the object_entry that creates each base_dataShawn O. Pearce
If we free the data stored within a base_data we need the struct object_entry to get the data back again for use with another dependent delta. Storing the object_entry* in base_data makes it simple to call get_data_from_pack() to recover the compressed information. This however means that we must add the missing base object to the end of our packfile prior to calling resolve_delta() on each of the dependent deltas. Adding the base first ensures we can read the base back from the pack we are indexing, as if it had been included by the remote side. Signed-off-by: Shawn O. Pearce <> Signed-off-by: Junio C Hamano <>
2008-07-15index-pack: Chain the struct base_data on the stack for traversalShawn O. Pearce
We need to release earlier inflated base objects when memory gets low, which means we need to be able to walk up or down the stack to locate the objects we want to release, and free their data. The new link/unlink routines allow inserting and removing the struct base_data during recursion inside resolve_delta, and the global base_cache gives us the head of the chain (bottom of the stack) so we can traverse it. Signed-off-by: Shawn O. Pearce <> Signed-off-by: Junio C Hamano <>
2008-07-15index-pack: Refactor base arguments of resolve_delta into a structShawn O. Pearce
We need to discard base objects which are not recently used if our memory gets low, such as when we are unpacking a long delta chain of a very large object. To support tracking the available base objects we combine the pointer and size into a struct. Future changes would allow the data pointer to be free'd and marked NULL if memory gets low. Signed-off-by: Shawn O. Pearce <> Signed-off-by: Junio C Hamano <>
2008-07-13Make usage strings dash-lessStephan Beyer
When you misuse a git command, you are shown the usage string. But this is currently shown in the dashed form. So if you just copy what you see, it will not work, when the dashed form is no longer supported. This patch makes git commands show the dash-less version. For shell scripts that do not specify OPTIONS_SPEC, generates a dash-less usage string now. Signed-off-by: Stephan Beyer <> Signed-off-by: Junio C Hamano <>
2008-07-06Fix some warnings (on cygwin) to allow -WerrorRamsay Jones
When printing values of type uint32_t, we should use PRIu32, and should not assume that it is unsigned int. On 32-bit platforms, it could be defined as unsigned long. The same caution applies to ntohl(). Signed-off-by: Ramsay Jones <> Signed-off-by: Junio C Hamano <>
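A minimal illustration of the PRIu32 advice: the macro from <inttypes.h> expands to the right conversion for uint32_t whatever underlying type the platform picked. format_count() is a hypothetical helper used only for this example.

```c
#include <inttypes.h>
#include <stdio.h>

/* Formats "<n> objects" without assuming uint32_t is unsigned int. */
int format_count(char *buf, size_t len, uint32_t n)
{
	return snprintf(buf, len, "%" PRIu32 " objects", n);
}
```

For a value that comes from ntohl(), an explicit cast to uint32_t before printing with PRIu32 keeps the format and the argument type in agreement.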
2008-05-31Make pack creation always fsync() the resultLinus Torvalds
This means that we can depend on packs always being stable on disk, simplifying a lot of the object serialization worries. And unlike loose objects, serializing pack creation IO isn't going to be a performance killer. Signed-off-by: Linus Torvalds <> Signed-off-by: Junio C Hamano <>
2008-05-14Provide git_config with a callback-data parameterJohannes Schindelin
git_config() only had a function parameter, but no callback data parameter. This assumes that all callback functions only modify global variables. With this patch, every callback gets a void * parameter, and it is hoped that this will help the libification effort. Signed-off-by: Johannes Schindelin <> Signed-off-by: Junio C Hamano <>
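The benefit of a callback-data parameter is that a callback can fill in a caller-owned structure instead of a global. The sketch below shows the pattern on a made-up in-memory table; for_each_setting(), struct pack_options and read_pack_option() are hypothetical names, and only the general shape (key, value, void pointer) follows the commit message.

```c
#include <stdlib.h>
#include <string.h>

typedef int (*setting_fn)(const char *key, const char *value, void *cb_data);

struct setting {
	const char *key;
	const char *value;
};

/* Hypothetical in-memory settings standing in for a parsed config file. */
static const struct setting settings[] = {
	{ "core.compression", "9" },
	{ "pack.indexversion", "2" },
};

/* Calls fn for every setting, passing cb_data through untouched. */
int for_each_setting(setting_fn fn, void *cb_data)
{
	size_t i;

	for (i = 0; i < sizeof(settings) / sizeof(settings[0]); i++) {
		int ret = fn(settings[i].key, settings[i].value, cb_data);

		if (ret)
			return ret;
	}
	return 0;
}

struct pack_options {
	unsigned long index_version;
};

/* The callback writes into the caller's struct, not into a global. */
int read_pack_option(const char *key, const char *value, void *cb_data)
{
	struct pack_options *opts = cb_data;

	if (!strcmp(key, "pack.indexversion"))
		opts->index_version = strtoul(value, NULL, 10);
	return 0;
}
```

Two callers can now run the same callback against different pack_options instances, which is the property the libification effort needs.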
2008-02-29index-pack: introduce checking modeMartin Koegler
Adds a strict option, which bails out if the pack would introduce broken objects or links into the repository. Signed-off-by: Martin Koegler <> Signed-off-by: Junio C Hamano <>
2007-11-14Merge branch 'np/progress'Junio C Hamano
* np/progress:
  nicer display of thin pack completion
  make display of total transferred fully accurate
  remove dead code from the csum-file interface
  git-fetch: be even quieter.
  make display of total transferred more accurate
  sideband.c: ESC is spelled '\033' not '\e' for portability.
  fix display overlap between remote and local progress