path: root/bulk-checkin.c
AgeCommit message (Collapse)Author
2019-09-06bulk-checkin: zero-initialize hashfile_checkpointJeff King
We declare a "struct hashfile_checkpoint" but only sometimes actually call hashfile_checkpoint() on it. That makes it not immediately obvious that it's valid when we later access its members. In fact, the code is fine: we fill it in unconditionally in the while(1) loop as long as "idx" is non-NULL. And then if "idx" is NULL, we exit early from the function (because we're just computing the hash, not actually writing), before we look at the struct. However, this does seem to confuse gcc 9.2.1's -Wmaybe-uninitialized when compiled with "-flto -O2" (probably because with LTO it can now realize that our call to hashfile_truncate() does not set the members either). Let's zero-initialize the struct to tell the compiler, as well as any readers of the code, that all is well. Reported-by: Stephan Beyer <> Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
2019-01-08convert has_sha1_file() callers to has_object_file()Jeff King
The only remaining callers of has_sha1_file() actually have an object_id already. They can use the "object" variant, rather than dereferencing the hash themselves. The code changes here were completely generated by the included coccinelle patch. Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
2018-08-29convert "oidcmp() == 0" to oideq()Jeff King
Using the more restrictive oideq() should, in the long run, give the compiler more opportunities to optimize these callsites. For now, this conversion should be a complete noop with respect to the generated code. The result is also perhaps a little more readable, as it avoids the "zero is equal" idiom. Since it's so prevalent in C, I think seasoned programmers tend not to even notice it anymore, but it can sometimes make for awkward double negations (e.g., we can drop a few !!oidcmp() instances here). This patch was generated almost entirely by the included coccinelle patch. This mechanical conversion should be completely safe, because we check explicitly for cases where oidcmp() is compared to 0, which is what oideq() is doing under the hood. Note that we don't have to catch "!oidcmp()" separately; coccinelle's standard isomorphisms make sure the two are treated equivalently. I say "almost" because I did hand-edit the coccinelle output to fix up a few style violations (it mostly keeps the original formatting, but sometimes unwraps long lines). Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
2018-07-18Merge branch 'sb/object-store-grafts'Junio C Hamano
The conversion to pass "the_repository" and then "a_repository" throughout the object access API continues. * sb/object-store-grafts: commit: allow lookup_commit_graft to handle arbitrary repositories commit: allow prepare_commit_graft to handle arbitrary repositories shallow: migrate shallow information into the object parser path.c: migrate global git_path_* to take a repository argument cache: convert get_graft_file to handle arbitrary repositories commit: convert read_graft_file to handle arbitrary repositories commit: convert register_commit_graft to handle arbitrary repositories commit: convert commit_graft_pos() to handle arbitrary repositories shallow: add repository argument to is_repository_shallow shallow: add repository argument to check_shallow_file_for_update shallow: add repository argument to register_shallow shallow: add repository argument to set_alternate_shallow_file commit: add repository argument to lookup_commit_graft commit: add repository argument to prepare_commit_graft commit: add repository argument to read_graft_file commit: add repository argument to register_commit_graft commit: add repository argument to commit_graft_pos object: move grafts to object parser object-store: move object access functions to object-store.h
2018-05-30Merge branch 'js/use-bug-macro'Junio C Hamano
Developer support update, by using BUG() macro instead of die() to mark codepaths that should not happen more clearly. * js/use-bug-macro: BUG_exit_code: fix sparse "symbol not declared" warning Convert remaining die*(BUG) messages Replace all die("BUG: ...") calls by BUG() ones run-command: use BUG() to report bugs, not die() test-tool: help verifying BUG() code paths
2018-05-16object-store: move object access functions to object-store.hStefan Beller
This should make these functions easier to find and cache.h less overwhelming to read. In particular, this moves: - read_object_file - oid_object_info - write_object_file As a result, most of the codebase needs to #include object-store.h. In this patch the #include is only added to files that would fail to compile otherwise. It would be better to #include wherever identifiers from the header are used. That can happen later when we have better tooling for it. Signed-off-by: Stefan Beller <> Signed-off-by: Junio C Hamano <>
2018-05-08Merge branch 'ds/commit-graph'Junio C Hamano
Precompute and store information necessary for ancestry traversal in a separate file to optimize graph walking. * ds/commit-graph: commit-graph: implement "--append" option commit-graph: build graph from starting commits commit-graph: read only from specific pack-indexes commit: integrate commit graph with commit parsing commit-graph: close under reachability commit-graph: add core.commitGraph setting commit-graph: implement git commit-graph read commit-graph: implement git-commit-graph write commit-graph: implement write_commit_graph() commit-graph: create git-commit-graph builtin graph: add commit graph design document commit-graph: add format document csum-file: refactor finalize_hashfile() method csum-file: rename hashclose() to finalize_hashfile()
2018-05-06Replace all die("BUG: ...") calls by BUG() onesJohannes Schindelin
In d8193743e08 (usage.c: add BUG() function, 2017-05-12), a new macro was introduced to use for reporting bugs instead of die(). It was then subsequently used to convert one single caller in 588a538ae55 (setup_git_env: convert die("BUG") to BUG(), 2017-05-12). The cover letter of the patch series containing this patch (cf is not terribly clear why only one call site was converted, or what the plan is for other, similar calls to die() to report bugs. Let's just convert all remaining ones in one fell swoop. This trick was performed by this invocation: sed -i 's/die("BUG: /BUG("/g' $(git grep -l 'die("BUG' \*.c) Signed-off-by: Johannes Schindelin <> Signed-off-by: Junio C Hamano <>
2018-04-11Merge branch 'sb/packfiles-in-repository'Junio C Hamano
Refactoring of the internal global data structure continues. * sb/packfiles-in-repository: packfile: keep prepare_packed_git() private packfile: allow find_pack_entry to handle arbitrary repositories packfile: add repository argument to find_pack_entry packfile: allow reprepare_packed_git to handle arbitrary repositories packfile: allow prepare_packed_git to handle arbitrary repositories packfile: allow prepare_packed_git_one to handle arbitrary repositories packfile: add repository argument to reprepare_packed_git packfile: add repository argument to prepare_packed_git packfile: add repository argument to prepare_packed_git_one packfile: allow install_packed_git to handle arbitrary repositories packfile: allow rearrange_packed_git to handle arbitrary repositories packfile: allow prepare_packed_git_mru to handle arbitrary repositories
2018-04-02csum-file: refactor finalize_hashfile() methodDerrick Stolee
If we want to use a hashfile on the temporary file for a lockfile, then we need finalize_hashfile() to fully write the trailing hash but also keep the file descriptor open. Do this by adding a new CSUM_HASH_IN_STREAM flag along with a functional change that checks this flag before writing the checksum to the stream. This differs from previous behavior since it would be written if either CSUM_CLOSE or CSUM_FSYNC is provided. Signed-off-by: Derrick Stolee <> Signed-off-by: Junio C Hamano <>
2018-04-02csum-file: rename hashclose() to finalize_hashfile()Derrick Stolee
The hashclose() method behaves very differently depending on the flags parameter. In particular, the file descriptor is not always closed. Perform a simple rename of "hashclose()" to "finalize_hashfile()" in preparation for functional changes. Signed-off-by: Derrick Stolee <> Signed-off-by: Junio C Hamano <>
2018-03-26packfile: add repository argument to reprepare_packed_gitStefan Beller
See previous patch for explanation. Signed-off-by: Stefan Beller <> Signed-off-by: Junio C Hamano <> Signed-off-by: Nguyễn Thái Ngọc Duy <> Signed-off-by: Junio C Hamano <>
2018-03-14bulk-checkin: convert index_bulk_checkin to struct object_idbrian m. carlson
Convert the index_bulk_checkin function, and the static functions it calls, to use pointers to struct object_id. Signed-off-by: brian m. carlson <> Signed-off-by: Junio C Hamano <>
2018-03-06Merge branch 'bw/c-plus-plus'Junio C Hamano
Avoid using identifiers that clash with C++ keywords. Even though it is not a goal to compile Git with C++ compilers, changes like this help use of code analysis tools that targets C++ on our codebase. * bw/c-plus-plus: (37 commits) replace: rename 'new' variables trailer: rename 'template' variables tempfile: rename 'template' variables wrapper: rename 'template' variables environment: rename 'namespace' variables diff: rename 'template' variables environment: rename 'template' variables init-db: rename 'template' variables unpack-trees: rename 'new' variables trailer: rename 'new' variables submodule: rename 'new' variables split-index: rename 'new' variables remote: rename 'new' variables ref-filter: rename 'new' variables read-cache: rename 'new' variables line-log: rename 'new' variables imap-send: rename 'new' variables http: rename 'new' variables entry: rename 'new' variables diffcore-delta: rename 'new' variables ...
2018-02-14object: rename function 'typename' to 'type_name'Brandon Williams
Rename C++ keyword in order to bring the codebase closer to being able to be compiled with a C++ compiler. Signed-off-by: Brandon Williams <> Signed-off-by: Junio C Hamano <>
2018-02-02bulk-checkin: abstract SHA-1 usagebrian m. carlson
Convert uses of the direct SHA-1 functions to use the_hash_algo instead. Signed-off-by: brian m. carlson <> Signed-off-by: Junio C Hamano <>
2018-02-02csum-file: rename sha1file to hashfilebrian m. carlson
Rename struct sha1file to struct hashfile, along with all of its related functions. The transformation in this commit was made by global search-and-replace. Signed-off-by: Junio C Hamano <>
2017-09-27distinguish error versus short read from read_in_full()Jeff King
Many callers of read_in_full() expect to see the exact number of bytes requested, but their error handling lumps together true read errors and short reads due to unexpected EOF. We can give more specific error messages by separating these cases (showing errno when appropriate, and otherwise describing the short read). Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
2017-08-23pack: move {,re}prepare_packed_git and approximate_object_countJonathan Tan
Signed-off-by: Jonathan Tan <> Signed-off-by: Junio C Hamano <>
2017-05-08pack: convert struct pack_idx_entry to struct object_idbrian m. carlson
Convert struct pack_idx_entry to use struct object_id by changing the definition and applying the following semantic patch, plus the standard object_id transforms: @@ struct pack_idx_entry E1; @@ - E1.sha1 + E1.oid.hash @@ struct pack_idx_entry *E1; @@ - E1->sha1 + E1->oid.hash Signed-off-by: brian m. carlson <> Signed-off-by: Junio C Hamano <>
2017-03-24encode_in_pack_object_header: respect output buffer lengthJeff King
The encode_in_pack_object_header() writes a variable-length header to an output buffer, but it doesn't actually know long the buffer is. At first glance, this looks like it might be possible to overflow. In practice, this is probably impossible. The smallest buffer we use is 10 bytes, which would hold the header for an object up to 2^67 bytes. Obviously we're not likely to see such an object, but we might worry that an object could lie about its size (causing us to overflow before we realize it does not actually have that many bytes). But the argument is passed as a uintmax_t. Even on systems that have __int128 available, uintmax_t is typically restricted to 64-bit by the ABI. So it's unlikely that a system exists where this could be exploited. Still, it's easy enough to use a normal out/len pair and make sure we don't write too far. That protects the hypothetical 128-bit system, makes it harder for callers to accidentally specify a too-small buffer, and makes the resulting code easier to audit. Note that the one caller in fast-import tried to catch such a case, but did so _after_ the call (at which point we'd have already overflowed!). This check can now go away. Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
2016-11-16compression: unify pack.compression configuration parsingJunio C Hamano
There are three codepaths that use a variable whose name is pack_compression_level to affect how objects and deltas sent to a packfile is compressed. Unlike zlib_compression_level that controls the loose object compression, however, this variable was static to each of these codepaths. Two of them read the pack.compression configuration variable, using core.compression as the default, and one of them also allowed overriding it from the command line. The other codepath in bulk-checkin did not pay any attention to the configuration. Unify the configuration parsing to git_default_config(), where we implement the parsing of core.loosecompression and core.compression and make the former override the latter, by moving code to parse pack.compression and also allow core.compression to give default to this variable. Signed-off-by: Junio C Hamano <>
2015-09-25use xsnprintf for generating git object headersJeff King
We generally use 32-byte buffers to format git's "type size" header fields. These should not generally overflow unless you can produce some truly gigantic objects (and our types come from our internal array of constant strings). But it is a good idea to use xsnprintf to make sure this is the case. Note that we slightly modify the interface to write_sha1_file_prepare, which nows uses "hdrlen" as an "in" parameter as well as an "out" (on the way in it stores the allocated size of the header, and on the way out it returns the ultimate size of the header). Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
2015-05-06Merge branch 'bc/object-id'Junio C Hamano
Identify parts of the code that knows that we use SHA-1 hash to name our objects too much, and use (1) symbolic constants instead of hardcoded 20 as byte count and/or (2) use struct object_id instead of unsigned char [20] for object names. * bc/object-id: apply: convert threeway_stage to object_id patch-id: convert to use struct object_id commit: convert parts to struct object_id diff: convert struct combine_diff_path to object_id bulk-checkin.c: convert to use struct object_id zip: use GIT_SHA1_HEXSZ for trailers archive.c: convert to use struct object_id bisect.c: convert leaf functions to use struct object_id define utility functions for object IDs define a structure for object IDs
2015-03-17Merge branch 'rs/deflate-init-cleanup'Junio C Hamano
Code simplification. * rs/deflate-init-cleanup: zlib: initialize git_zstream in git_deflate_init{,_gzip,_raw}
2015-03-14bulk-checkin.c: convert to use struct object_idbrian m. carlson
Signed-off-by: brian m. carlson <> Signed-off-by: Junio C Hamano <>
2015-03-05zlib: initialize git_zstream in git_deflate_init{,_gzip,_raw}René Scharfe
Clear the git_zstream variable at the start of git_deflate_init() etc. so that callers don't have to do that. Signed-off-by: Rene Scharfe <> Signed-off-by: Junio C Hamano <>
2014-09-15cleanups: ensure that git-compat-util.h is included firstDavid Aguilar
CodingGuidelines states that the first #include in C files should be git-compat-util.h or another header file that includes it, such as cache.h or builtin.h. Signed-off-by: David Aguilar <> Signed-off-by: Junio C Hamano <>
2014-03-03finish_tmp_packfile():use strbuf for pathname constructionSun He
The old version fixes a maximum length on the buffer, which could be a problem if one is not certain of the length of get_object_directory(). Using strbuf can avoid the protential bug. Helped-by: Michael Haggerty <> Helped-by: Eric Sunshine <> Signed-off-by: Sun He <> Signed-off-by: Junio C Hamano <>
2013-08-20stream_to_pack: xread does not guarantee to read all requested bytesJohannes Sixt
The deflate loop in bulk-checkin::stream_to_pack expects to get all bytes from a file that it requests to read in a single function call. But it used xread(), which does not give that guarantee. Replace it by read_in_full(). Signed-off-by: Johannes Sixt <> Signed-off-by: Junio C Hamano <>
2011-12-01bulk-checkin: replace fast-import based implementationJunio C Hamano
This extends the earlier approach to stream a large file directly from the filesystem to its own packfile, and allows "git add" to send large files directly into a single pack. Older code used to spawn fast-import, but the new bulk-checkin API replaces it. Signed-off-by: Junio C Hamano <>