summaryrefslogtreecommitdiff
path: root/builtin/receive-pack.c
AgeCommit message (Collapse)Author
2017-06-15config: don't include config.h by defaultBrandon Williams
Stop including config.h by default in cache.h. Instead only include config.h in those files which require use of the config system. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-24Merge branch 'dt/xgethostname-nul-termination'Junio C Hamano
gethostname(2) may not NUL terminate the buffer if hostname does not fit; unfortunately there is no easy way to see if our buffer was too small, but at least this will make sure we will not end up using garbage past the end of the buffer. * dt/xgethostname-nul-termination: xgethostname: handle long hostnames use HOST_NAME_MAX to size buffers for gethostname(2)
2017-04-24Merge branch 'jk/quarantine-received-objects'Junio C Hamano
Add finishing touches to a recent topic. * jk/quarantine-received-objects: refs: reject ref updates while GIT_QUARANTINE_PATH is set receive-pack: document user-visible quarantine effects receive-pack: drop tmp_objdir_env from run_update_hook
2017-04-20Merge branch 'bc/object-id'Junio C Hamano
Conversion from unsigned char [40] to struct object_id continues. * bc/object-id: Documentation: update and rename api-sha1-array.txt Rename sha1_array to oid_array Convert sha1_array_for_each_unique and for_each_abbrev to object_id Convert sha1_array_lookup to take struct object_id Convert remaining callers of sha1_array_lookup to object_id Make sha1_array_append take a struct object_id * sha1-array: convert internal storage for struct sha1_array to object_id builtin/pull: convert to struct object_id submodule: convert check_for_new_submodule_commits to object_id sha1_name: convert disambiguate_hint_fn to take object_id sha1_name: convert struct disambiguate_state to object_id test-sha1-array: convert most code to struct object_id parse-options-cb: convert sha1_array_append caller to struct object_id fsck: convert init_skiplist to struct object_id builtin/receive-pack: convert portions to struct object_id builtin/pull: convert portions to struct object_id builtin/diff: convert to struct object_id Convert GIT_SHA1_RAWSZ used for allocation to GIT_MAX_RAWSZ Convert GIT_SHA1_HEXSZ used for allocation to GIT_MAX_HEXSZ Define new hash-size constants for allocating memory
2017-04-19xgethostname: handle long hostnamesDavid Turner
If the full hostname doesn't fit in the buffer supplied to gethostname, POSIX does not specify whether the buffer will be null-terminated, so to be safe, we should do it ourselves. Introduce new function, xgethostname, which ensures that there is always a \0 at the end of the buffer. Signed-off-by: David Turner <dturner@twosigma.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-19use HOST_NAME_MAX to size buffers for gethostname(2)René Scharfe
POSIX limits the length of host names to HOST_NAME_MAX. Export the fallback definition from daemon.c and use this constant to make all buffers used with gethostname(2) big enough for any possible result and a terminating NUL. Inspired-by: David Turner <dturner@twosigma.com> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: David Turner <dturner@twosigma.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-17Merge branch 'jk/snprintf-cleanups'Junio C Hamano
Code clean-up. * jk/snprintf-cleanups: daemon: use an argv_array to exec children gc: replace local buffer with git_path transport-helper: replace checked snprintf with xsnprintf convert unchecked snprintf into xsnprintf combine-diff: replace malloc/snprintf with xstrfmt replace unchecked snprintf calls with heap buffers receive-pack: print --pack-header directly into argv array name-rev: replace static buffer with strbuf create_branch: use xstrfmt for reflog message create_branch: move msg setup closer to point of use avoid using mksnpath for refs avoid using fixed PATH_MAX buffers for refs fetch: use heap buffer to format reflog tag: use strbuf to format tag header diff: avoid fixed-size buffer for patch-ids odb_mkstemp: use git_path_buf odb_mkstemp: write filename into strbuf do not check odb_mkstemp return value for errors
2017-04-17receive-pack: drop tmp_objdir_env from run_update_hookJeff King
Since 722ff7f87 (receive-pack: quarantine objects until pre-receive accepts, 2016-10-03), we have to feed the pre-receive hook the tmp_objdir environment, so that git programs run from the hook know where to find the objects. That commit modified run_update_hook() to do the same, but there it is a noop. By the time we get to the update hooks, we have already migrated the objects from quarantine, and so tmp_objdir_env() will always return NULL. We can drop this useless call. Note that the ordering here and the lack of support for the update hook is intentional. The update hook calls are interspersed with actual ref updates, and we must migrate the objects before any refs are updated (since otherwise those refs would appear broken to outside processes). So the only other options are: - remain in quarantine for the _first_ ref, but not the others. This is sufficiently confusing that it can be rejected outright. - run all the individual update hooks first, then migrate, then update all the refs. But this changes the repository state that the update hooks see (i.e., whether or not refs from the same push are updated yet or not). So the functionality is fine and remains unchanged with this patch; we're just cleaning up a useless and confusing line of code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-31Rename sha1_array to oid_arraybrian m. carlson
Since this structure handles an array of object IDs, rename it to struct oid_array. Also rename the accessor functions and the initialization constant. This commit was produced mechanically by providing non-Documentation files to the following Perl one-liners: perl -pi -E 's/struct sha1_array/struct oid_array/g' perl -pi -E 's/\bsha1_array_/oid_array_/g' perl -pi -E 's/SHA1_ARRAY_INIT/OID_ARRAY_INIT/g' Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-31Convert sha1_array_for_each_unique and for_each_abbrev to object_idbrian m. carlson
Make sha1_array_for_each_unique take a callback using struct object_id. Since one of these callbacks is an argument to for_each_abbrev, convert those as well. Rename various functions, replacing "sha1" with "oid". Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-31Make sha1_array_append take a struct object_id *brian m. carlson
Convert the callers to pass struct object_id by changing the function declaration and definition and applying the following semantic patch: @@ expression E1, E2; @@ - sha1_array_append(E1, E2.hash) + sha1_array_append(E1, &E2) @@ expression E1, E2; @@ - sha1_array_append(E1, E2->hash) + sha1_array_append(E1, E2) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-30receive-pack: print --pack-header directly into argv arrayJeff King
After receive-pack reads the pack header from the client, it feeds the already-read part to index-pack and unpack-objects via their --pack-header command-line options. To do so, we format it into a fixed buffer, then duplicate it into the child's argv_array. Our buffer is long enough to handle any possible input, so this isn't wrong. But it's more complicated than it needs to be; we can just argv_array_pushf() the final value and avoid the intermediate copy. This drops the magic number and is more efficient, too. Note that we need to push to the argv_array in order, which means we can't do the push until we are in the "unpack-objects versus index-pack" conditional. Rather than duplicate the slightly complicated format specifier, I pushed it into a helper function. Signed-off-by: Jeff King <peff@peff.net>
2017-03-30Merge branch 'bc/push-cert-receive-fix'Junio C Hamano
"git receive-pack" could have been forced to die by attempting allocate an unreasonably large amount of memory with a crafted push certificate; this has been fixed. * bc/push-cert-receive-fix: builtin/receive-pack: fix incorrect pointer arithmetic
2017-03-28sha1-array: convert internal storage for struct sha1_array to object_idbrian m. carlson
Make the internal storage for struct sha1_array use an array of struct object_id internally. Update the users of this struct which inspect its internals. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-28builtin/receive-pack: convert portions to struct object_idbrian m. carlson
Convert some hardcoded constants into uses of parse_oid_hex. Additionally, convert all uses of struct command, and miscellaneous other functions necessary for that. This work is necessary to be able to convert sha1_array_append later on. To avoid needing to specify a constant, reject shallow lines with the wrong length instead of simply ignoring them. Note that in queue_command we are guaranteed to have a NUL-terminated buffer or at least one byte of overflow that we can safely read, so the linelen check can be elided. We would die in such a case, but not read invalid memory. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-28builtin/receive-pack: fix incorrect pointer arithmeticbrian m. carlson
If we had already processed the last newline in a push certificate, we would end up subtracting NULL from the end-of-certificate pointer when computing the length of the line. This would have resulted in an absurdly large length, and possibly a buffer overflow. Instead, subtract the beginning-of-certificate pointer from the end-of-certificate pointer, which is what's expected. Note that this situation should never occur, since not only do we require the certificate to be newline terminated, but the signature will only be read from the beginning of a line. Nevertheless, it seems prudent to correct it. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-27Convert GIT_SHA1_RAWSZ used for allocation to GIT_MAX_RAWSZbrian m. carlson
Since we will likely be introducing a new hash function at some point, and that hash function might be longer than 20 bytes, use the constant GIT_MAX_RAWSZ, which is designed to be suitable for allocations, instead of GIT_SHA1_RAWSZ. This will ease the transition down the line by distinguishing between places where we need to allocate memory suitable for the largest hash from those where we need to handle the current hash. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-24Merge branch 'rs/update-hook-optim'Junio C Hamano
Code clean-up. * rs/update-hook-optim: receive-pack: simplify run_update_post_hook()
2017-03-18receive-pack: simplify run_update_post_hook()René Scharfe
Instead of counting the arguments to see if there are any and then building the full command use a single loop and add the hook command just before the first argument. This reduces duplication and overall code size. Signed-off-by: Rene Scharfe <l.s.r@web.de> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-17Merge branch 'bc/object-id'Junio C Hamano
"uchar [40]" to "struct object_id" conversion continues. * bc/object-id: wt-status: convert to struct object_id builtin/merge-base: convert to struct object_id Convert object iteration callbacks to struct object_id sha1_file: introduce an nth_packed_object_oid function refs: simplify parsing of reflog entries refs: convert each_reflog_ent_fn to struct object_id reflog-walk: convert struct reflog_info to struct object_id builtin/replace: convert to struct object_id Convert remaining callers of resolve_refdup to object_id builtin/merge: convert to struct object_id builtin/clone: convert to struct object_id builtin/branch: convert to struct object_id builtin/grep: convert to struct object_id builtin/fmt-merge-message: convert to struct object_id builtin/fast-export: convert to struct object_id builtin/describe: convert to struct object_id builtin/diff-tree: convert to struct object_id builtin/commit: convert to struct object_id hex: introduce parse_oid_hex
2017-03-14Merge branch 'jk/push-deadlock-regression-fix'Junio C Hamano
"git push" had a handful of codepaths that could lead to a deadlock when unexpected error happened, which has been fixed. * jk/push-deadlock-regression-fix: send-pack: report signal death of pack-objects send-pack: read "unpack" status even on pack-objects failure send-pack: improve unpack-status error messages send-pack: use skip_prefix for parsing unpack status send-pack: extract parsing of "unpack" response receive-pack: fix deadlock when we cannot create tmpdir
2017-03-07receive-pack: fix deadlock when we cannot create tmpdirJeff King
The err_fd descriptor passed to the unpack() function is intended to be handed off to the child index-pack, and our async muxer will read until it gets EOF. However, if we encounter an error before handing off the descriptor, we must manually close(err_fd). Otherwise we will be waiting for our muxer to finish, while the muxer is waiting for EOF on err_fd. We fixed an identical deadlock already in 49ecfa13f (receive-pack: close sideband fd on early pack errors, 2013-04-19). But since then, the function grew a new early-return in 722ff7f87 (receive-pack: quarantine objects until pre-receive accepts, 2016-10-03), when we fail to create a temporary directory. This return needs the same treatment. Reported-by: Horst Schirmeier <horst@schirmeier.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-22Convert remaining callers of resolve_refdup to object_idbrian m. carlson
There are a few leaf functions in various files that call resolve_refdup. Convert these functions to use struct object_id internally to prepare for transitioning resolve_refdup itself. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08receive-pack: avoid duplicates between our refs and alternatesJeff King
We de-duplicate ".have" refs among themselves, but never check if they are duplicates of our local refs. It's not unreasonable that they would be if we are a "--shared" or "--reference" clone of a similar repository; we'd have all the same tags. We can handle this by inserting our local refs into the oidset, but obviously not suppressing duplicates (since the refnames are important). Note that this also switches the order in which we advertise refs, processing ours first and then any alternates. The order shouldn't matter (and arguably showing our refs first makes more sense). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08receive-pack: treat namespace .have lines like alternatesJeff King
Namely, de-duplicate them. We use the same set as the alternates, since we call them both ".have" (i.e., there is no value in showing one versus the other). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08receive-pack: fix misleading namespace/.have commentJeff King
The comment claims that we handle alternate ".have" lines through this function, but that hasn't been the case since 85f251045 (write_head_info(): handle "extra refs" locally, 2012-01-06). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08receive-pack: use oidset to de-duplicate .have linesJeff King
If you have an alternate object store with a very large number of refs, the peak memory usage of the sha1_array can grow high, even if most of them are duplicates that end up not being printed at all. The similar for_each_alternate_ref() code-paths in fetch-pack solve this by using flags in "struct object" to de-duplicate (and so are relying on obj_hash at the core). But we don't have a "struct object" at all in this case. We could call lookup_unknown_object() to get one, but if our goal is reducing memory footprint, it's not great: - an unknown object is as large as the largest object type (a commit), which is bigger than an oidset entry - we can free the memory after our ref advertisement, but "struct object" entries persist forever (and the receive-pack may hang around for a long time, as the bottleneck is often client upload bandwidth). So let's use an oidset. Note that unlike a sha1-array it doesn't sort the output as a side effect. However, our output is at least stable, because for_each_alternate_ref() will give us the sha1s in ref-sorted order. In one particularly pathological case with an alternate that has 60,000 unique refs out of 80 million total, this reduced the peak heap usage of "git receive-pack . </dev/null" from 13GB to 14MB. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-08for_each_alternate_ref: pass name/oid instead of ref structJeff King
Breaking down the fields in the interface makes it easier to change the backend of for_each_alternate_ref to something that doesn't use "struct ref" internally. The only field that callers actually look at is the oid, anyway. The refname is kept in the interface as a plausible thing for future code to want. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-02Merge branch 'rs/receive-pack-cleanup'Junio C Hamano
Code clean-up. * rs/receive-pack-cleanup: receive-pack: call string_list_clear() unconditionally
2017-01-30receive-pack: call string_list_clear() unconditionallyRené Scharfe
string_list_clear() handles empty lists just fine, so remove the redundant check. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-05receive-pack: improve English grammar of denyCurrentBranch messageAlex Henrie
The article "the" is required here. Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-31Merge branch 'ls/filter-process'Junio C Hamano
The smudge/clean filter API expect an external process is spawned to filter the contents for each path that has a filter defined. A new type of "process" filter API has been added to allow the first request to run the filter for a path to spawn a single process, and all filtering need is served by this single process for multiple paths, reducing the process creation overhead. * ls/filter-process: contrib/long-running-filter: add long running filter example convert: add filter.<driver>.process option convert: prepare filter.<driver>.process option convert: make apply_filter() adhere to standard Git error handling pkt-line: add functions to read/write flush terminated packet streams pkt-line: add packet_write_gently() pkt-line: add packet_flush_gently() pkt-line: add packet_write_fmt_gently() pkt-line: extract set_packet_header() pkt-line: rename packet_write() to packet_write_fmt() run-command: add clean_on_exit_handler run-command: move check_pipe() from write_or_die to run_command convert: modernize tests convert: quote filter names in error messages
2016-10-26find_unique_abbrev: use 4-buffer ringJeff King
Some code paths want to format multiple abbreviated sha1s in the same output line. Because we use a single static buffer for our return value, they have to either break their output into several calls or allocate their own arrays and use find_unique_abbrev_r(). Intead, let's mimic sha1_to_hex() and use a ring of several buffers, so that the return value stays valid through multiple calls. This shortens some of the callers, and makes it harder to for them to make a silly mistake. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-17Merge branch 'jk/quarantine-received-objects'Junio C Hamano
In order for the receiving end of "git push" to inspect the received history and decide to reject the push, the objects sent from the sending end need to be made available to the hook and the mechanism for the connectivity check, and this was done traditionally by storing the objects in the receiving repository and letting "git gc" to expire it. Instead, store the newly received objects in a temporary area, and make them available by reusing the alternate object store mechanism to them only while we decide if we accept the check, and once we decide, either migrate them to the repository or purge them immediately. * jk/quarantine-received-objects: tmp-objdir: do not migrate files starting with '.' tmp-objdir: put quarantine information in the environment receive-pack: quarantine objects until pre-receive accepts tmp-objdir: introduce API for temporary object directories check_connected: accept an env argument
2016-10-17pkt-line: rename packet_write() to packet_write_fmt()Lars Schneider
packet_write() should be called packet_write_fmt() because it is a printf-like function that takes a format string as first parameter. packet_write_fmt() should be used for text strings only. Arbitrary binary data should use a new packet_write() function that is introduced in a subsequent patch. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10receive-pack: quarantine objects until pre-receive acceptsJeff King
When a client pushes objects to us, index-pack checks the objects themselves and then installs them into place. If we then reject the push due to a pre-receive hook, we cannot just delete the packfile; other processes may be depending on it. We have to do a normal reachability check at this point via `git gc`. But such objects may hang around for weeks due to the gc.pruneExpire grace period. And worse, during that time they may be exploded from the pack into inefficient loose objects. Instead, this patch teaches receive-pack to put the new objects into a "quarantine" temporary directory. We make these objects available to the connectivity check and to the pre-receive hook, and then install them into place only if it is successful (and otherwise remove them as tempfiles). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-26sha1_array: let callbacks interrupt iterationJeff King
The callbacks for iterating a sha1_array must have a void return. This is unlike our usual for_each semantics, where a callback may interrupt iteration and have its value propagated. Let's switch it to the usual form, which will enable its use in more places (e.g., where we are replacing an existing iteration with a different data structure). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-21Merge branch 'va/i18n'Junio C Hamano
More i18n. * va/i18n: i18n: update-index: mark warnings for translation i18n: show-branch: mark plural strings for translation i18n: show-branch: mark error messages for translation i18n: receive-pack: mark messages for translation notes: spell first word of error messages in lowercase i18n: notes: mark error messages for translation i18n: merge-recursive: mark verbose message for translation i18n: merge-recursive: mark error messages for translation i18n: config: mark error message for translation i18n: branch: mark option description for translation i18n: blame: mark error messages for translation
2016-09-15i18n: receive-pack: mark messages for translationVasco Almeida
Mark messages refuse_unconfigured_deny_msg and refuse_unconfigured_deny_delete_current_msg for translation. Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-24receive-pack: allow a maximum input size to be specifiedJeff King
Receive-pack feeds its input to either index-pack or unpack-objects, which will happily accept as many bytes as a sender is willing to provide. Let's allow an arbitrary cutoff point where we will stop writing bytes to disk. Cleaning up what has already been written to disk is a related problem that is not addressed by this patch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-17Merge branch 'jk/tighten-alloc'Junio C Hamano
Small code and comment clean-up. * jk/tighten-alloc: receive-pack: use FLEX_ALLOC_MEM in queue_command() correct FLEXPTR_* example in comment
2016-08-14receive-pack: use FLEX_ALLOC_MEM in queue_command()René Scharfe
Use the macro FLEX_ALLOC_MEM instead of open-coding it. This shortens and simplifies the code a bit. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-03Merge branch 'jk/push-progress'Junio C Hamano
"git push" and "git clone" learned to give better progress meters to the end user who is waiting on the terminal. * jk/push-progress: receive-pack: send keepalives during quiet periods receive-pack: turn on connectivity progress receive-pack: relay connectivity errors to sideband receive-pack: turn on index-pack resolving progress index-pack: add flag for showing delta-resolution progress clone: use a real progress meter for connectivity check check_connected: add progress flag check_connected: relay errors to alternate descriptor check_everything_connected: use a struct with named options check_everything_connected: convert to argv_array rev-list: add optional progress reporting check_everything_connected: always pass --quiet to rev-list
2016-07-20receive-pack: send keepalives during quiet periodsJeff King
After a client has sent us the complete pack, we may spend some time processing the data and running hooks. If the client asked us to be quiet, receive-pack won't send any progress data during the index-pack or connectivity-check steps. And hooks may or may not produce their own progress output. In these cases, the network connection is totally silent from both ends. Git itself doesn't care about this (it will wait forever), but other parts of the system (e.g., firewalls, load-balancers, etc) might hang up the connection. So we'd like to send some sort of keepalive to let the network and the client side know that we're still alive and processing. We can use the same trick we did in 05e9515 (upload-pack: send keepalive packets during pack computation, 2013-09-08). Namely, we will send an empty sideband data packet every `N` seconds that we do not relay any stderr data over the sideband channel. As with 05e9515, this means that we won't bother sending keepalives when there's actual progress data, but will kick in when it has been disabled (or if there is a lull in the progress data). The concept is simple, but the details are subtle enough that they need discussing here. Before the client sends us the pack, we don't want to do any keepalives. We'll have sent our ref advertisement, and we're waiting for them to send us the pack (and tell us that they support sidebands at all). While we're receiving the pack from the client (or waiting for it to start), there's no need for keepalives; it's up to them to keep the connection active by sending data. Moreover, it would be wrong for us to do so. When we are the server in the smart-http protocol, we must treat our connection as half-duplex. So any keepalives we send while receiving the pack would potentially be buffered by the webserver. Not only does this make them useless (since they would not be delivered in a timely manner), but it could actually cause a deadlock if we fill up the buffer with keepalives. (It wouldn't be wrong to send keepalives in this phase for a full-duplex connection like ssh; it's simply pointless, as it is the client's responsibility to speak). As soon as we've gotten all of the pack data, then the client is waiting for us to speak, and we should start keepalives immediately. From here until the end of the connection, we send one any time we are not otherwise sending data. But there's a catch. Receive-pack doesn't know the moment we've gotten all the data. It passes the descriptor to index-pack, who reads all of the data, and then starts resolving the deltas. We have to communicate that back. To make this work, we instruct the sideband muxer to enable keepalives in three phases: 1. In the beginning, not at all. 2. While reading from index-pack, wait for a signal indicating end-of-input, and then start them. 3. Afterwards, always. The signal from index-pack in phase 2 has to come over the stderr channel which the muxer is reading. We can't use an extra pipe because the portable run-command interface only gives us stderr and stdout. Stdout is already used to pass the .keep filename back to receive-pack. We could also send a signal there, but then we would find out about it in the main thread. And the keepalive needs to be done by the async muxer thread (since it's the one writing sideband data back to the client). And we can't reliably signal the async thread from the main thread, because the async code sometimes uses threads and sometimes uses forked processes. Therefore the signal must come over the stderr channel, where it may be interspersed with other random human-readable messages from index-pack. This patch makes the signal a single NUL byte. This is easy to parse, should not appear in any normal stderr output, and we don't have to worry about any timing issues (like seeing half the signal bytes in one read(), and half in a subsequent one). This is a bit ugly, but it's simple to code and should work reliably. Another option would be to stop using an async thread for muxing entirely, and just poll() both stderr and stdout of index-pack from the main thread. This would work for index-pack (because we aren't doing anything useful in the main thread while it runs anyway). But it would make the connectivity check and the hook muxers much more complicated, as they need to simultaneously feed the sub-programs while reading their stderr. The index-pack phase is the only one that needs this signaling, so it could simply behave differently than the other two. That would mean having two separate implementations of copy_to_sideband (and the keepalive code), though. And it still doesn't get rid of the signaling; it just means we can write a nicer message like "END_OF_INPUT" or something on stdout, since we don't have to worry about separating it from the stderr cruft. One final note: this signaling trick is only done with index-pack, not with unpack-objects. There's no point in doing it for the latter, because by definition it only kicks in for a small number of objects, where keepalives are not as useful (and this conveniently lets us avoid duplicating the implementation). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-20receive-pack: turn on connectivity progressJeff King
When we receive a large push, the server side of the connection may spend a lot of time (30s or more for a full push of linux.git) walking the object graph without producing any output. Let's give the user some indication that we're actually working. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-20receive-pack: relay connectivity errors to sidebandJeff King
If the connectivity check encounters a problem when receiving a push, the error output goes to receive-pack's stderr, whose destination depends on the protocol used (ssh tends to send it to the user, though without a "remote" prefix; http will generally eat it in the server's error log). The information should consistently go back to the user, as there is a reasonable chance their client is buggy and generating a bad pack. We can do so by muxing it over the sideband as we do with other sub-process stderr. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-20receive-pack: turn on index-pack resolving progressJeff King
When we receive a large push, the server side may have to spend a lot of CPU processing the incoming packfile. During the "receiving" phase, we are typically network bound, and the client is writing its own progress to the user. But during the delta resolution phase, we may spend minutes (e.g., for a full push of linux.git) without making any indication to the user that the connection has not hung. Let's ask index-pack to produce progress output for this phase (unless the client asked us to be quiet, of course). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-20check_everything_connected: use a struct with named optionsJeff King
The number of variants of check_everything_connected has grown over the years, so that the "real" function takes several possibly-zero, possibly-NULL arguments. We hid the complexity behind some wrapper functions, but this doesn't scale well when we want to add new options. If we add more wrapper variants to handle the new options, then we can get a combinatorial explosion when those options might be used together (right now nobody wants to use both "shallow" and "transport" together, so we get by with just a few wrappers). If instead we add new parameters to each function, each of which can have a default value, then callers who want the defaults end up with confusing invocations like: check_everything_connected(fn, 0, data, -1, 0, NULL); where it is unclear which parameter is which (and every caller needs updated when we add new options). Instead, let's add a struct to hold all of the optional parameters. This is a little more verbose for the callers (who have to declare the struct and fill it in), but it makes their code much easier to follow, because every option is named as it is set (and unused options do not have to be mentioned at all). Note that we could also stick the iteration function and its callback data into the option struct, too. But since those are required for each call, by avoiding doing so, we can let very simple callers just pass "NULL" for the options and not worry about the struct at all. While we're touching each site, let's also rename the function to check_connected(). The existing name was quite long, and not all of the wrappers even used the full name. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-14receive-pack: implement advertising and receiving push optionsStefan Beller
The pre/post receive hook may be interested in more information from the user. This information can be transmitted when both client and server support the "push-options" capability, which when used is a phase directly after update commands ended by a flush pkt. Similar to the atomic option, the server capability can be disabled via the `receive.advertisePushOptions` config variable. While documenting this, fix a nit in the `receive.advertiseAtomic` wording. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-14push options: {pre,post}-receive hook learns about push optionsStefan Beller
The environment variable GIT_PUSH_OPTION_COUNT is set to the number of push options sent, and GIT_PUSH_OPTION_{0,1,..} is set to the transmitted option. The code is not executed as the push options are set to NULL, nor is the new capability advertised. There was some discussion back and forth how to present these push options to the user as there are some ways to do it: Keep all options in one environment variable ============================================ + easiest way to implement in Git - This would make things hard to parse correctly in the hook. Put the options in files instead, filenames are in GIT_PUSH_OPTION_FILES ====================================== + After a discussion about environment variables and shells, we may not want to put user data into an environment variable (see [1] for example). + We could transmit binaries, i.e. we're not bound to C strings as we are when using environment variables to the user. + Maybe easier to parse than constructing environment variable names GIT_PUSH_OPTION_{0,1,..} yourself - cleanup of the temporary files is hard to do reliably - we have race conditions with multiple clients pushing, hence we'd need to use mkstemp. That's not too bad, but still. Use environment variables, but restrict to key/value pairs ========================================================== (When the user pushes a push option `foo=bar`, we'd GIT_PUSH_OPTION_foo=bar) + very easy to parse for a simple model of push options - it's not sufficient for more elaborate models, e.g. it doesn't allow doubles (e.g. cc=reviewer@email) Present the options in different environment variables ====================================================== (This is implemented) * harder to parse as a user, but we have a sample hook for that. - doesn't allow binary files + allows the same option twice, i.e. is not restrictive about options, except for binary files. + doesn't clutter a remote directory with (possibly stale) temporary files As we first want to focus on getting simple strings to work reliably, we go with the last option for now. If we want to do transmission of binaries later, we can just attach a 'side-channel', e.g. "any push option that contains a '\0' is put into a file instead of the environment variable and we'd have new GIT_PUSH_OPTION_FILES, GIT_PUSH_OPTION_FILENAME_{0,1,..} environment variables". [1] 'Shellshock' https://lwn.net/Articles/614218/ Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>