summaryrefslogtreecommitdiff
path: root/daemon.c
AgeCommit message (Collapse)Author
2016-02-26Merge branch 'jk/tighten-alloc'Junio C Hamano
Update various codepaths to avoid manually-counted malloc(). * jk/tighten-alloc: (22 commits) ewah: convert to REALLOC_ARRAY, etc convert ewah/bitmap code to use xmalloc diff_populate_gitlink: use a strbuf transport_anonymize_url: use xstrfmt git-compat-util: drop mempcpy compat code sequencer: simplify memory allocation of get_message test-path-utils: fix normalize_path_copy output buffer size fetch-pack: simplify add_sought_entry fast-import: simplify allocation in start_packfile write_untracked_extension: use FLEX_ALLOC helper prepare_{git,shell}_cmd: use argv_array use st_add and st_mult for allocation size computation convert trivial cases to FLEX_ARRAY macros use xmallocz to avoid size arithmetic convert trivial cases to ALLOC_ARRAY convert manual allocations to argv_array argv-array: add detach function add helpers for allocating flex-array structs harden REALLOC_ARRAY and xcalloc against size_t overflow tree-diff: catch integer overflow in combine_diff_path allocation ...
2016-02-22convert manual allocations to argv_arrayJeff King
There are many manual argv allocations that predate the argv_array API. Switching to that API brings a few advantages: 1. We no longer have to manually compute the correct final array size (so it's one less thing we can screw up). 2. In many cases we had to make a separate pass to count, then allocate, then fill in the array. Now we can do it in one pass, making the code shorter and easier to follow. 3. argv_array handles memory ownership for us, making it more obvious when things should be free()d and and when not. Most of these cases are pretty straightforward. In some, we switch from "run_command_v" to "run_command" which lets us directly use the argv_array embedded in "struct child_process". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-15strbuf: introduce strbuf_getline_{lf,nul}()Junio C Hamano
The strbuf_getline() interface allows a byte other than LF or NUL as the line terminator, but this is only because I wrote these codepaths anticipating that there might be a value other than NUL and LF that could be useful when I introduced line_termination long time ago. No useful caller that uses other value has emerged. By now, it is clear that the interface is overly broad without a good reason. Many codepaths have hardcoded preference to read either LF terminated or NUL terminated records from their input, and then call strbuf_getline() with LF or NUL as the third parameter. This step introduces two thin wrappers around strbuf_getline(), namely, strbuf_getline_lf() and strbuf_getline_nul(), and mechanically rewrites these call sites to call either one of them. The changes contained in this patch are: * introduction of these two functions in strbuf.[ch] * mechanical conversion of all callers to strbuf_getline() with either '\n' or '\0' as the third parameter to instead call the respective thin wrapper. After this step, output from "git grep 'strbuf_getline('" would become a lot smaller. An interim goal of this series is to make this an empty set, so that we can have strbuf_getline_crlf() take over the shorter name strbuf_getline(). Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-11-03Merge branch 'rs/daemon-plug-child-leak'Junio C Hamano
"git daemon" uses "run_command()" without "finish_command()", so it needs to release resources itself, which it forgot to do. * rs/daemon-plug-child-leak: daemon: plug memory leak run-command: factor out child_process_clear()
2015-11-02daemon: plug memory leakRené Scharfe
Call child_process_clear() when a child ends to release the memory allocated for its environment. This is necessary because unlike all other users of start_command() we don't call finish_command(), which would have taken care of that for us. This leak was introduced by f063d38b (daemon: use cld->env_array when re-spawning). Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-05daemon: use cld->env_array when re-spawningJeff King
This avoids an ugly strcat into a fixed-size buffer. It's not wrong (the buffer is plenty large enough for an IPv6 address plus some minor formatting), but it takes some effort to verify that. Unfortunately we are still stuck with some fixed-size buffers to hold the output of inet_ntop. But at least we now pass very easy-to-verify parameters, rather than doing a manual computation to account for other data in the buffer. As a side effect, this also fixes the case where we might pass an uninitialized portbuf buffer through the environment. This probably couldn't happen in practice, as it would mean that addr->sa_family was neither AF_INET nor AF_INET6 (and that is all we are listening on). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25convert trivial sprintf / strcpy calls to xsnprintfJeff King
We sometimes sprintf into fixed-size buffers when we know that the buffer is large enough to fit the input (either because it's a constant, or because it's numeric input that is bounded in size). Likewise with strcpy of constant strings. However, these sites make it hard to audit sprintf and strcpy calls for buffer overflows, as a reader has to cross-reference the size of the array with the input. Let's use xsnprintf instead, which communicates to a reader that we don't expect this to overflow (and catches the mistake in case we do). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-25write_file(): drop caller-supplied LF from calls to create a one-liner fileJunio C Hamano
All of the callsites covered by this change call write_file() or write_file_gently() to create a one-liner file. Drop the caller supplied LF and let these callees to append it as necessary. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-24write_file(): drop "fatal" parameterJunio C Hamano
All callers except three passed 1 for the "fatal" parameter to ask this function to die upon error, but to a casual reader of the code, it was not all obvious what that 1 meant. Instead, split the function into two based on a common write_file_v() that takes the flag, introduce write_file_gently() as a new way to attempt creating a file without dying on error, and make three callers to call it. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-11Merge branch 'jc/daemon-no-ipv6-for-2.4.1'Junio C Hamano
"git daemon" fails to build from the source under NO_IPV6 configuration (regression in 2.4). * jc/daemon-no-ipv6-for-2.4.1: daemon: unbreak NO_IPV6 build regression
2015-05-11Merge branch 'nd/multiple-work-trees'Junio C Hamano
A replacement for contrib/workdir/git-new-workdir that does not rely on symbolic links and make sharing of objects and refs safer by making the borrowee and borrowers aware of each other. * nd/multiple-work-trees: (41 commits) prune --worktrees: fix expire vs worktree existence condition t1501: fix test with split index t2026: fix broken &&-chain t2026 needs procondition SANITY git-checkout.txt: a note about multiple checkout support for submodules checkout: add --ignore-other-wortrees checkout: pass whole struct to parse_branchname_arg instead of individual flags git-common-dir: make "modules/" per-working-directory directory checkout: do not fail if target is an empty directory t2025: add a test to make sure grafts is working from a linked checkout checkout: don't require a work tree when checking out into a new one git_path(): keep "info/sparse-checkout" per work-tree count-objects: report unused files in $GIT_DIR/worktrees/... gc: support prune --worktrees gc: factor out gc.pruneexpire parsing code gc: style change -- no SP before closing parenthesis checkout: clean up half-prepared directories in --to mode checkout: reject if the branch is already checked out elsewhere prune: strategies for linked checkouts checkout: support checking out into a new working directory ...
2015-05-05daemon: unbreak NO_IPV6 build regressionJunio C Hamano
When 01cec54e (daemon: deglobalize hostname information, 2015-03-07) wrapped the global variables such as hostname inside a struct, it forgot to convert one location that spelled "hostname" that needs to be updated to "hi->hostname". This was inside NO_IPV6 block, and was not caught by anybody. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-10daemon: deglobalize hostname informationRené Scharfe
Move the variables related to the client-supplied hostname into its own struct, let execute() own an instance of that instead of storing the information in global variables and pass the struct to any function that needs to access it as a parameter. The lifetime of the variables is easier to see this way. Allocated memory is released within execute(). The strbufs don't have to be reset anymore because they are written to only once at most: parse_host_arg() is only called once by execute() and lookup_hostname() guards against being called twice using hostname_lookup_done. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-10daemon: use strbuf for hostname infoRené Scharfe
Convert hostname, canon_hostname, ip_address and tcp_port to strbuf. This allows to get rid of the helpers strbuf_addstr_or_null() and STRARG because a strbuf always represents a valid (initially empty) string. sanitize_client() is not needed anymore and sanitize_client_strbuf() takes its place and name. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-03Merge branch 'jk/daemon-interpolate'Junio C Hamano
The "interpolated-path" option of "git daemon" inserted any string client declared on the "host=" capability request without checking. Sanitize and limit %H and %CH to a saner and a valid DNS name. * jk/daemon-interpolate: daemon: sanitize incoming virtual hostname t5570: test git-daemon's --interpolated-path option git_connect: let user override virtual-host we send to daemon
2015-03-03Merge branch 'rs/daemon-interpolate'Junio C Hamano
"git daemon" looked up the hostname even when "%CH" and "%IP" interpolations are not requested, which was unnecessary. * rs/daemon-interpolate: daemon: use callback to build interpolated path daemon: look up client-supplied hostname lazily
2015-02-17daemon: use callback to build interpolated pathRené Scharfe
Provide a callback function for strbuf_expand() instead of using the helper strbuf_expand_dict_cb(). While the resulting code is longer, it only looks up the canonical hostname and IP address if at least one of the placeholders %CH and %IP are used with --interpolated-path. Use a struct for passing the directory to the callback function instead of passing it directly to avoid having to cast away its const qualifier. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-17daemon: look up client-supplied hostname lazilyRené Scharfe
Look up canonical hostname and IP address using getaddrinfo(3) or gethostbyname(3) only if --interpolated-path or --access-hook were specified. Do that by introducing getter functions for canon_hostname and ip_address and using them for all read accesses. These wrappers call the new helper lookup_hostname(), which sets the variables only at its first call. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-17daemon: sanitize incoming virtual hostnameJeff King
We use the daemon_avoid_alias function to make sure that the pathname the user gives us is sane. However, after applying that check, we might then interpolate the path using a string given by the server admin, but which may contain more untrusted data from the client. We should be sure to sanitize this data, as well. We cannot use daemon_avoid_alias here, as it is more strict than we need in requiring a leading '/'. At the same time, we can be much more strict here. We are interpreting a hostname, which should not contain slashes or excessive runs of dots, as those things are not allowed in DNS names. Note that in addition to cleansing the hostname field, we must check the "canonical hostname" (%CH) as well as the port (%P), which we take as a raw string. For the canonical hostname, this comes from an actual DNS lookup on the accessed IP, which makes it a much less likely vector for problems. But it does not hurt to sanitize it in the same way. Unfortunately we cannot test this case easily, as it would involve a custom hostname lookup. We do not need to check %IP, as it comes straight from inet_ntop, so must have a sane form. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-01use new wrapper write_file() for simple file writingNguyễn Thái Ngọc Duy
This fixes common problems in these code about error handling, forgetting to close the file handle after fprintf() fails, or not printing out the error string.. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-29Merge branch 'rs/daemon-fixes' into maintJunio C Hamano
* rs/daemon-fixes: daemon: remove write-only variable maxfd daemon: fix error message after bind() daemon: handle gethostbyname() error
2014-10-14Merge branch 'rs/daemon-fixes'Junio C Hamano
"git daemon" (with NO_IPV6 build configuration) used to incorrectly use the hostname even when gethostbyname() reported that the given hostname is not found. * rs/daemon-fixes: daemon: remove write-only variable maxfd daemon: fix error message after bind() daemon: handle gethostbyname() error
2014-10-01daemon: remove write-only variable maxfdRené Scharfe
It became unused when 6573faff (NO_IPV6 support for git daemon) replaced select() with poll(). Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-01daemon: fix error message after bind()René Scharfe
Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-01daemon: handle gethostbyname() errorRené Scharfe
If the user-supplied hostname can't be found then we should not use it. We already avoid doing that in the non-NO_IPV6 case by checking if the return value of getaddrinfo() is zero (success). Do the same in the NO_IPV6 case and make sure the return value of gethostbyname() isn't NULL before dereferencing this pointer. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-11Merge branch 'rs/child-process-init'Junio C Hamano
Code clean-up. * rs/child-process-init: run-command: inline prepare_run_command_v_opt() run-command: call run_command_v_opt_cd_env() instead of duplicating it run-command: introduce child_process_init() run-command: introduce CHILD_PROCESS_INIT
2014-08-20run-command: introduce CHILD_PROCESS_INITRené Scharfe
Most struct child_process variables are cleared using memset first after declaration. Provide a macro, CHILD_PROCESS_INIT, that can be used to initialize them statically instead. That's shorter, doesn't require a function call and is slightly more readable (especially given that we already have STRBUF_INIT, ARGV_ARRAY_INIT etc.). Helped-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-07daemon.c: replace `git_config()` with `git_config_get_bool()` familyTanay Abhra
Use `git_config_get_bool()` family instead of `git_config()` to take advantage of the config-set API which provides a cleaner control flow. Signed-off-by: Tanay Abhra <tanayabh@gmail.com> Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-16Merge branch 'cc/replace-edit'Junio C Hamano
Teach "git replace" an "--edit" mode. * cc/replace-edit: replace: use argv_array in export_object avoid double close of descriptors handed to run_command replace: replace spaces with tabs in indentation
2014-06-25avoid double close of descriptors handed to run_commandJeff King
When a file descriptor is given to run_command via the "in", "out", or "err" parameters, run_command takes ownership. The descriptor will be closed in the parent process whether the process is spawned successfully or not, and closing it again is wrong. In practice this has not caused problems, because we usually close() right after start_command returns, meaning no other code has opened a descriptor in the meantime. So we just get EBADF and ignore it (rather than accidentally closing somebody else's descriptor!). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-20daemon: use skip_prefix to avoid magic numbersJeff King
Like earlier cases, we can use skip_prefix to avoid magic numbers that must match the length of starts_with prefixes. However, the numbers are a little more complicated here, as we keep parsing past the prefix. We can solve it by keeping a running pointer as we parse; its final value is the location we want. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-20use skip_prefix to avoid magic numbersJeff King
It's a common idiom to match a prefix and then skip past it with a magic number, like: if (starts_with(foo, "bar")) foo += 3; This is easy to get wrong, since you have to count the prefix string yourself, and there's no compiler check if the string changes. We can use skip_prefix to avoid the magic numbers here. Note that some of these conversions could be much shorter. For example: if (starts_with(arg, "--foo=")) { bar = arg + 6; continue; } could become: if (skip_prefix(arg, "--foo=", &bar)) continue; However, I have left it as: if (skip_prefix(arg, "--foo=", &v)) { bar = v; continue; } to visually match nearby cases which need to actually process the string. Like: if (skip_prefix(arg, "--foo=", &v)) { bar = atoi(v); continue; } Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-18daemon: mark some strings as constJeff King
None of these strings is modified; marking them as const will help later refactoring. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-16Merge branch 'jk/daemon-tolower'Junio C Hamano
* jk/daemon-tolower: daemon/config: factor out duplicate xstrdup_tolower
2014-05-23daemon/config: factor out duplicate xstrdup_tolowerJeff King
We have two implementations of the same function; let's drop that to one. We take the name from daemon.c, but the implementation (which is just slightly more efficient) from the config code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-10daemon: move daemonize() to libgit.aNguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-10Merge branch 'nd/daemon-informative-errors-typofix'Junio C Hamano
* nd/daemon-informative-errors-typofix: daemon: be strict at parsing parameters --[no-]informative-errors
2013-12-20daemon: be strict at parsing parameters --[no-]informative-errorsNguyễn Thái Ngọc Duy
Use strcmp() instead of starts_with()/!prefixcmp() to stop accepting --informative-errors-just-a-little Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-05replace {pre,suf}fixcmp() with {starts,ends}_with()Christian Couder
Leaving only the function definitions and declarations so that any new topic in flight can still make use of the old functions, replace existing uses of the prefixcmp() and suffixcmp() with new API functions. The change can be recreated by mechanically applying this: $ git grep -l -e prefixcmp -e suffixcmp -- \*.c | grep -v strbuf\\.c | xargs perl -pi -e ' s|!prefixcmp\(|starts_with\(|g; s|prefixcmp\(|!starts_with\(|g; s|!suffixcmp\(|ends_with\(|g; s|suffixcmp\(|!ends_with\(|g; ' on the result of preparatory changes in this series. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-25Merge branch 'sb/misc-fixes'Junio C Hamano
Assorted code cleanups and a minor fix. * sb/misc-fixes: diff.c: Do not initialize a variable, which gets reassigned anyway. commit: Fix a memory leak in determine_author_info daemon.c:handle: Remove unneeded check for null pointer.
2013-07-22Merge branch 'tr/protect-low-3-fds'Junio C Hamano
When "git" is spawned in such a way that any of the low 3 file descriptors is closed, our first open() may yield file descriptor 2, and writing error message to it would screw things up in a big way. * tr/protect-low-3-fds: git: ensure 0/1/2 are open in main() daemon/shell: refactor redirection of 0/1/2 from /dev/null
2013-07-17daemon/shell: refactor redirection of 0/1/2 from /dev/nullThomas Rast
Both daemon.c and shell.c contain logic to open FDs 0/1/2 from /dev/null if they are not already open. Move the function in daemon.c to setup.c and use it in shell.c, too. While there, remove a 'not' that inverted the meaning of the comment. The point is indeed to *avoid* messing up. Signed-off-by: Thomas Rast <trast@inf.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15daemon.c:handle: Remove unneeded check for null pointer.Stefan Beller
addr doesn't need to be checked at that line as it it already accessed 7 lines before in the if (addr->sa_family). Signed-off-by: Stefan Beller <stefanbeller@googlemail.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-01Merge branch 'jk/pkt-line-cleanup'Junio C Hamano
Clean up pkt-line API, implementation and its callers to make them more robust. * jk/pkt-line-cleanup: do not use GIT_TRACE_PACKET=3 in tests remote-curl: always parse incoming refs remote-curl: move ref-parsing code up in file remote-curl: pass buffer straight to get_remote_heads teach get_remote_heads to read from a memory buffer pkt-line: share buffer/descriptor reading implementation pkt-line: provide a LARGE_PACKET_MAX static buffer pkt-line: move LARGE_PACKET_MAX definition from sideband pkt-line: teach packet_read_line to chomp newlines pkt-line: provide a generic reading function with options pkt-line: drop safe_write function pkt-line: move a misplaced comment write_or_die: raise SIGPIPE when we get EPIPE upload-archive: use argv_array to store client arguments upload-archive: do not copy repo name send-pack: prefer prefixcmp over memcmp in receive_status fetch-pack: fix out-of-bounds buffer offset in get_ack upload-pack: remove packet debugging harness upload-pack: do not add duplicate objects to shallow list upload-pack: use get_sha1_hex to parse "shallow" lines
2013-03-25Merge branch 'dm/ni-maxhost-may-be-missing' into maint-1.8.1Junio C Hamano
Some sources failed to compile on systems that lack NI_MAXHOST in their system header. * dm/ni-maxhost-may-be-missing: git-compat-util.h: Provide missing netdb.h definitions
2013-03-19Merge branch 'dm/ni-maxhost-may-be-missing'Junio C Hamano
On systems without NI_MAXHOST in their system header files, connect.c (hence most of the transport) did not compile. * dm/ni-maxhost-may-be-missing: git-compat-util.h: Provide missing netdb.h definitions
2013-02-25git-compat-util.h: Provide missing netdb.h definitionsDavid Michael
Some platforms may lack the NI_MAXHOST and NI_MAXSERV values in their system headers, so ensure they are available. Signed-off-by: David Michael <fedora.dm0@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-24pkt-line: share buffer/descriptor reading implementationJeff King
The packet_read function reads from a descriptor. The packet_get_line function is similar, but reads from an in-memory buffer, and uses a completely separate implementation. This patch teaches the generic packet_read function to accept either source, and we can do away with packet_get_line's implementation. There are two other differences to account for between the old and new functions. The first is that we used to read into a strbuf, but now read into a fixed size buffer. The only two callers are fine with that, and in fact it simplifies their code, since they can use the same static-buffer interface as the rest of the packet_read_line callers (and we provide a similar convenience wrapper for reading from a buffer rather than a descriptor). This is technically an externally-visible behavior change in that we used to accept arbitrary sized packets up to 65532 bytes, and now cap out at LARGE_PACKET_MAX, 65520. In practice this doesn't matter, as we use it only for parsing smart-http headers (of which there is exactly one defined, and it is small and fixed-size). And any extension headers would be breaking the protocol to go over LARGE_PACKET_MAX anyway. The other difference is that packet_get_line would return on error rather than dying. However, both callers of packet_get_line are actually improved by dying. The first caller does its own error checking, but we can drop that; as a result, we'll actually get more specific reporting about protocol breakage when packet_read dies internally. The only downside is that packet_read will not print the smart-http URL that failed, but that's not a big deal; anybody not debugging can already see the remote's URL already, and anybody debugging would want to run with GIT_CURL_VERBOSE anyway to see way more information. The second caller, which is just trying to skip past any extra smart-http headers (of which there are none defined, but which we allow to keep room for future expansion), did not error check at all. As a result, it would treat an error just like a flush packet. The resulting mess would generally cause an error later in get_remote_heads, but now we get error reporting much closer to the source of the problem. Brown-paper-bag-fixes-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20pkt-line: provide a LARGE_PACKET_MAX static bufferJeff King
Most of the callers of packet_read_line just read into a static 1000-byte buffer (callers which handle arbitrary binary data already use LARGE_PACKET_MAX). This works fine in practice, because: 1. The only variable-sized data in these lines is a ref name, and refs tend to be a lot shorter than 1000 characters. 2. When sending ref lines, git-core always limits itself to 1000 byte packets. However, the only limit given in the protocol specification in Documentation/technical/protocol-common.txt is LARGE_PACKET_MAX; the 1000 byte limit is mentioned only in pack-protocol.txt, and then only describing what we write, not as a specific limit for readers. This patch lets us bump the 1000-byte limit to LARGE_PACKET_MAX. Even though git-core will never write a packet where this makes a difference, there are two good reasons to do this: 1. Other git implementations may have followed protocol-common.txt and used a larger maximum size. We don't bump into it in practice because it would involve very long ref names. 2. We may want to increase the 1000-byte limit one day. Since packets are transferred before any capabilities, it's difficult to do this in a backwards-compatible way. But if we bump the size of buffer the readers can handle, eventually older versions of git will be obsolete enough that we can justify bumping the writers, as well. We don't have plans to do this anytime soon, but there is no reason not to start the clock ticking now. Just bumping all of the reading bufs to LARGE_PACKET_MAX would waste memory. Instead, since most readers just read into a temporary buffer anyway, let's provide a single static buffer that all callers can use. We can further wrap this detail away by having the packet_read_line wrapper just use the buffer transparently and return a pointer to the static storage. That covers most of the cases, and the remaining ones already read into their own LARGE_PACKET_MAX buffers. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20pkt-line: teach packet_read_line to chomp newlinesJeff King
The packets sent during ref negotiation are all terminated by newline; even though the code to chomp these newlines is short, we end up doing it in a lot of places. This patch teaches packet_read_line to auto-chomp the trailing newline; this lets us get rid of a lot of inline chomping code. As a result, some call-sites which are not reading line-oriented data (e.g., when reading chunks of packfiles alongside sideband) transition away from packet_read_line to the generic packet_read interface. This patch converts all of the existing callsites. Since the function signature of packet_read_line does not change (but its behavior does), there is a possibility of new callsites being introduced in later commits, silently introducing an incompatibility. However, since a later patch in this series will change the signature, such a commit would have to be merged directly into this commit, not to the tip of the series; we can therefore ignore the issue. This is an internal cleanup and should produce no change of behavior in the normal case. However, there is one corner case to note. Callers of packet_read_line have never been able to tell the difference between a flush packet ("0000") and an empty packet ("0004"), as both cause packet_read_line to return a length of 0. Readers treat them identically, even though Documentation/technical/protocol-common.txt says we must not; it also says that implementations should not send an empty pkt-line. By stripping out the newline before the result gets to the caller, we will now treat the newline-only packet ("0005\n") the same as an empty packet, which in turn gets treated like a flush packet. In practice this doesn't matter, as neither empty nor newline-only packets are part of git's protocols (at least not for the line-oriented bits, and readers who are not expecting line-oriented packets will be calling packet_read directly, anyway). But even if we do decide to care about the distinction later, it is orthogonal to this patch. The right place to tighten would be to stop treating empty packets as flush packets, and this change does not make doing so any harder. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>