summaryrefslogtreecommitdiff
path: root/connect.c
AgeCommit message (Collapse)Author
2015-04-20Merge branch 'tb/connect-ipv6-parse-fix'Junio C Hamano
An earlier update to the parser that disects an address broke an address, followed by a colon, followed by an empty string (instead of the port number). * tb/connect-ipv6-parse-fix: connect.c: ignore extra colon after hostname
2015-04-09connect.c: ignore extra colon after hostnameTorsten Bögershausen
Ignore an extra ':' at the end of the hostname in URL's like "ssh://example.com:/path/to/repo" The colon is meant to separate a port number from the hostname. If the port is empty, the colon should be ignored, see RFC 3986. It had been working for URLs with ssh:// scheme, but was unintentionally broken in 86ceb3, "allow ssh://user@[2001:db8::1]/repo.git" Reported-by: Reid Woodbury Jr. <reidw@rawsound.com> Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-10connect.c: do not leak "conn" after showing diagnosisStefan Beller
When git_connect() is called to see how the URL is parsed for debugging purposes with CONNECT_DIAG_URL set, the variable conn is leaked. At this point in the codeflow, it only has its memory and no other resource is associated with it, so it is sufficient to clean it up by just freeing it. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-05Merge branch 'tb/connect-ipv6-parse-fix'Junio C Hamano
We did not parse username followed by literal IPv6 address in SSH transport URLs, e.g. ssh://user@[2001:db8::1]:22/repo.git correctly. * tb/connect-ipv6-parse-fix: t5500: show user name and host in diag-url t5601: add more test cases for IPV6 connect.c: allow ssh://user@[2001:db8::1]/repo.git
2015-03-05Merge branch 'rs/simple-cleanups'Junio C Hamano
Code cleanups. * rs/simple-cleanups: sha1_name: use strlcpy() to copy strings pretty: use starts_with() to check for a prefix for-each-ref: use skip_prefix() to avoid duplicate string comparison connect: use strcmp() for string comparison
2015-03-03Merge branch 'jk/daemon-interpolate'Junio C Hamano
The "interpolated-path" option of "git daemon" inserted any string client declared on the "host=" capability request without checking. Sanitize and limit %H and %CH to a saner and a valid DNS name. * jk/daemon-interpolate: daemon: sanitize incoming virtual hostname t5570: test git-daemon's --interpolated-path option git_connect: let user override virtual-host we send to daemon
2015-02-22t5500: show user name and host in diag-urlTorsten Bögershausen
The URL for ssh may have include a username before the hostname, like ssh://user@host/repo. When literal IPV6 addresses are used together with a username, the substring "user@[::1]" must be converted into "user@::1". Make that conversion visible for the user, and write userandhost in the diagnostics Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-22connect.c: allow ssh://user@[2001:db8::1]/repo.gitTorsten Bögershausen
The ssh:// syntax was added in 2386d658 (Add first cut at "git protocol" connect logic., 2005-07-13), it accepted ssh://user@2001:db8::1/repo.git, which is now legacy. Over the years the parser was improved to support [] and port numbers, but the combination of ssh://user@[2001:db8::1]:222/repo.git did never work. The only only way to use a user name, a literall IPV6 address and a port number was ssh://[user@2001:db8::1]:222/repo.git (Thanks to Christian Taube <lists@hcf.yourweb.de> for reporting this long standing issue) New users would use ssh://user@[2001:db8::1]:222/repo.git, so change the parser to handle it correctly. Support the old legacy URLs as well, to be backwards compatible, and avoid regressions for users which upgrade an existing installation to a later Git version. Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-22connect: use strcmp() for string comparisonRené Scharfe
Get rid of magic string length constants and simply compare the strings using strcmp(). This makes the intent of the code a bit clearer. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-17git_connect: let user override virtual-host we send to daemonJeff King
When we connect to a git-daemon at a given host and port, we actually send the string "localhost:9418" to the other side, which allows it to do virtual-hosting lookups. For testing and debugging, we'd like to be able to send arbitrary strings, rather than the hostname we actually connected to. Using "insteadOf" config does not work for this purpose, as the hostname determination happens at a very low level, right before we feed the hostname to our lookup routines. You could use /etc/hosts or similar to get around this, but we cannot do that portably from our test suite. Instead, this patch provides an environment variable that can be used to send an arbitrary string. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-22Merge branch 'mh/simplify-repack-without-refs'Junio C Hamano
"git remote update --prune" to drop many refs has been optimized. * mh/simplify-repack-without-refs: sort_string_list(): rename to string_list_sort() prune_remote(): iterate using for_each_string_list_item() prune_remote(): rename local variable repack_without_refs(): make the refnames argument a string_list prune_remote(): sort delete_refs_list references en masse prune_remote(): initialize both delete_refs lists in a single loop prune_remote(): exit early if there are no stale references
2014-11-25sort_string_list(): rename to string_list_sort()Michael Haggerty
The new name is more consistent with the names of other string_list-related functions. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-11-10git_connect: set ssh shell command in GIT_SSH_COMMANDThomas Quinot
It may be impractical to install a wrapper script for GIT_SSH when additional parameters need to be passed. Provide an alternative way of specifying a shell command to be run, including command line arguments, by means of the GIT_SSH_COMMAND environment variable, which behaves like GIT_SSH but is passed to the shell. The special circuitry to modify parameters in the case of using PuTTY's plink/tortoiseplink is activated only when using GIT_SSH; in the case of using GIT_SSH_COMMAND, it is deliberately left up to the user to make any required parameters adaptation before calling the underlying ssh implementation. Signed-off-by: Thomas Quinot <thomas@quinot.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-19Merge branch 'rs/more-uses-of-skip-prefix'Junio C Hamano
Code clean-up. * rs/more-uses-of-skip-prefix: pack-write: simplify index_pack_lockfile using skip_prefix() and xstrfmt() connect: simplify check_ref() using skip_prefix() and starts_with()
2014-09-02connect: simplify check_ref() using skip_prefix() and starts_with()René Scharfe
Both callers of check_ref() pass in NUL-terminated strings for name. Remove the len parameter and then use skip_prefix() and starts_with() instead of memcmp() to check if it starts with certain strings. This gets rid of several magic string length constants and a strlen() call. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-20run-command: introduce child_process_init()René Scharfe
Add a helper function for initializing those struct child_process variables for which the macro CHILD_PROCESS_INIT can't be used. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-20run-command: introduce CHILD_PROCESS_INITRené Scharfe
Most struct child_process variables are cleared using memset first after declaration. Provide a macro, CHILD_PROCESS_INIT, that can be used to initialize them statically instead. That's shorter, doesn't require a function call and is slightly more readable (especially given that we already have STRBUF_INIT, ARGV_ARRAY_INIT etc.). Helped-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-21Merge branch 'maint'Junio C Hamano
* maint: use xmemdupz() to allocate copies of strings given by start and length use xcalloc() to allocate zero-initialized memory
2014-07-21use xmemdupz() to allocate copies of strings given by start and lengthRené Scharfe
Use xmemdupz() to allocate the memory, copy the data and make sure to NUL-terminate the result, all in one step. The resulting code is shorter, doesn't contain the constants 1 and '\0', and avoids duplicating function parameters. For blame, the last copied byte (o->file.ptr[o->file.size]) is always set to NUL by fake_working_tree_commit() or read_sha1_file(), so no information is lost by the conversion to using xmemdupz(). Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-20use skip_prefix to avoid magic numbersJeff King
It's a common idiom to match a prefix and then skip past it with a magic number, like: if (starts_with(foo, "bar")) foo += 3; This is easy to get wrong, since you have to count the prefix string yourself, and there's no compiler check if the string changes. We can use skip_prefix to avoid the magic numbers here. Note that some of these conversions could be much shorter. For example: if (starts_with(arg, "--foo=")) { bar = arg + 6; continue; } could become: if (skip_prefix(arg, "--foo=", &bar)) continue; However, I have left it as: if (skip_prefix(arg, "--foo=", &v)) { bar = v; continue; } to visually match nearby cases which need to actually process the string. Like: if (skip_prefix(arg, "--foo=", &v)) { bar = atoi(v); continue; } Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-15git_connect: use argv_arrayJeff King
This avoids magic numbers when we allocate fixed-size argv arrays, and makes it more obvious that we are not overflowing. It is also the first step to fixing a memory leak. When git_connect returns a child_process struct, the argv array in the struct is dynamically allocated, but the individual strings are not (they are either owned elsewhere, or are freed). Later, in finish_connect, we free the array but leave the strings alone. This works for the child_process created by git_connect, but if we use transport_take_over, we may also end up with a child_process created by transport-helper's get_helper. In that case, the strings are freshly allocated, and we would want to free them. However, we have no idea in finish_connect which type we have. By consistently using run-command's internal argv-array, we do not have to worry about this issue at all; finish_command takes care of it for us, and we can drop our manual free entirely. Note that this actually makes the get_helper leak slightly worse; now we are leaking both the strings and the array. But when we adjust it in a future patch, that leak will go away entirely. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-25Merge branch 'nd/indent-fix-connect-c'Junio C Hamano
* nd/indent-fix-connect-c: connect.c: SP after "}", not TAB
2014-03-13connect.c: SP after "}", not TABNguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-17Merge branch 'nd/shallow-clone'Junio C Hamano
Fetching from a shallow-cloned repository used to be forbidden, primarily because the codepaths involved were not carefully vetted and we did not bother supporting such usage. This attempts to allow object transfer out of a shallow-cloned repository in a controlled way (i.e. the receiver become a shallow repository with truncated history). * nd/shallow-clone: (31 commits) t5537: fix incorrect expectation in test case 10 shallow: remove unused code send-pack.c: mark a file-local function static git-clone.txt: remove shallow clone limitations prune: clean .git/shallow after pruning objects clone: use git protocol for cloning shallow repo locally send-pack: support pushing from a shallow clone via http receive-pack: support pushing to a shallow clone via http smart-http: support shallow fetch/clone remote-curl: pass ref SHA-1 to fetch-pack as well send-pack: support pushing to a shallow clone receive-pack: allow pushes that update .git/shallow connected.c: add new variant that runs with --shallow-file add GIT_SHALLOW_FILE to propagate --shallow-file to subprocesses receive/send-pack: support pushing from a shallow clone receive-pack: reorder some code in unpack() fetch: add --update-shallow to accept refs that update .git/shallow upload-pack: make sure deepening preserves shallow roots fetch: support fetching from a shallow repository clone: support remote shallow repository ...
2013-12-17Merge branch 'tb/clone-ssh-with-colon-for-port'Junio C Hamano
Be more careful when parsing remote repository URL given in the scp-style host:path notation. * tb/clone-ssh-with-colon-for-port: git_connect(): use common return point connect.c: refactor url parsing git_connect(): refactor the port handling for ssh git fetch: support host:/~repo t5500: add test cases for diag-url git fetch-pack: add --diag-url git_connect: factor out discovery of the protocol and its parts git_connect: remove artificial limit of a remote command t5601: add tests for ssh t5601: remove clear_ssh, refactor setup_ssh_wrapper
2013-12-11connect.c: teach get_remote_heads to parse "shallow" linesNguyễn Thái Ngọc Duy
No callers pass a non-empty pointer as shallow_points at this stage. As a result, all clients still refuse to talk to shallow repository on the other end. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-11remote.h: replace struct extra_have_objects with struct sha1_arrayNguyễn Thái Ngọc Duy
The latter can do everything the former can and is used in many more places. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-09git_connect(): use common return pointTorsten Bögershausen
Use only one return point from git_connect(), doing the free(); return conn; only at one place in the code. There may be a little confusion what the variable "host" is for. At some places it is only the host part, at other places it may include the port number, so change host into hostandport here. Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-09connect.c: refactor url parsingTorsten Bögershausen
Make the function is_local() in transport.c public, rename it into url_is_local_not_ssh() and use it in both transport.c and connect.c Use a protocol "local" for URLs for the local file system. One note about using file:// under Windows: The (absolute) path on Unix like system typically starts with "/". When the host is empty, it can be omitted, so that a shell scriptlet url=file://$pwd will give a URL like "file:///home/user/repo". Windows does not have the same concept of a root directory located in "/". When parsing the URL allow "file://C:/user/repo" (even if RFC1738 indicates that "file:///C:/user/repo" should be used). Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-09git_connect(): refactor the port handling for sshTorsten Bögershausen
Use get_host_and_port() even for ssh. Remove the variable port git_connect(), and simplify parse_connect_url() Use only one return point in git_connect(), doing the free() and return conn. t5601 had 2 corner test cases which now pass. Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-09git fetch: support host:/~repoTorsten Bögershausen
The documentation (in urls.txt) says that "ssh://host:/~repo", "host:/~repo" or "host:~repo" specify the repository "repo" in the home directory at "host". This has not been working for "host:/~repo". Before commit 356bec "Support [address] in URLs", the comparison "url != hostname" could be used to determine if the URL had a scheme or not: "ssh://host/host" != "host". However, after 356bec "[::1]" was converted into "::1", yielding url != hostname as well. To fix this regression, don't use "if (url != hostname)", but look at the separator instead. Rename the variable "c" into "separator" to make it easier to read. Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-09git fetch-pack: add --diag-urlTorsten Bögershausen
The main purpose is to trace the URL parser called by git_connect() in connect.c The main features of the parser can be listed as this: - parse out host and path for URLs with a scheme (git:// file:// ssh://) - parse host names embedded by [] correctly - extract the port number, if present - separate URLs like "file" (which are local) from URLs like "host:repo" which should use ssh Add the new parameter "--diag-url" to "git fetch-pack", which prints the value for protocol, host and path to stderr and exits. Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-09git_connect: factor out discovery of the protocol and its partsJohannes Sixt
git_connect has grown large due to the many different protocols syntaxes that are supported. Move the part of the function that parses the URL to connect to into a separate function for readability. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-09git_connect: remove artificial limit of a remote commandJohannes Sixt
Since day one, function git_connect() had a limit on the command line of the command that is invoked to make a connection. 7a33bcbe converted the code that constructs the command to strbuf. This would have been the right time to remove the limit, but it did not happen. Remove it now. git_connect() uses start_command() to invoke the command; consequently, the limits of the system still apply, but are diagnosed only at execve() time. But these limits are more lenient than the 1K that git_connect() imposed. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-05replace {pre,suf}fixcmp() with {starts,ends}_with()Christian Couder
Leaving only the function definitions and declarations so that any new topic in flight can still make use of the old functions, replace existing uses of the prefixcmp() and suffixcmp() with new API functions. The change can be recreated by mechanically applying this: $ git grep -l -e prefixcmp -e suffixcmp -- \*.c | grep -v strbuf\\.c | xargs perl -pi -e ' s|!prefixcmp\(|starts_with\(|g; s|prefixcmp\(|!starts_with\(|g; s|!suffixcmp\(|ends_with\(|g; s|suffixcmp\(|!ends_with\(|g; ' on the result of preparatory changes in this series. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-30Merge branch 'jc/upload-pack-send-symref'Junio C Hamano
One long-standing flaw in the pack transfer protocol used by "git clone" was that there was no way to tell the other end which branch "HEAD" points at, and the receiving end needed to guess. A new capability has been defined in the pack protocol to convey this information so that cloning from a repository with more than one branches pointing at the same commit where the HEAD is at now reliably sets the initial branch in the resulting repository. * jc/upload-pack-send-symref: t5570: Update for clone-progress-to-stderr branch t5570: Update for symref capability clone: test the new HEAD detection logic connect: annotate refs with their symref information in get_remote_head() connect.c: make parse_feature_value() static upload-pack: send non-HEAD symbolic refs upload-pack: send symbolic ref information as capability upload-pack.c: do not pass confusing cb_data to mark_our_ref() t5505: fix "set-head --auto with ambiguous HEAD" test
2013-10-14Merge branch 'nd/clone-local-with-colon'Jonathan Nieder
* nd/clone-local-with-colon: clone: tighten "local paths with colons" check a bit
2013-09-27clone: tighten "local paths with colons" check a bitNguyễn Thái Ngọc Duy
commit 6000334 (clone: allow cloning local paths with colons in them - 2013-05-04) made it possible to specify a path that has colons in it without file://, e.g. ../foo:bar/somewhere. But the check was a bit sloppy. Consider the url '[foo]:bar'. The '[]' unwrapping code will turn the string to 'foo\0:bar'. In effect this new string is the same as 'foo/:bar' in the check "path < strchrnul(host, '/')", which mistakes it for a local path (with '/' before the first ':') when it's actually not. So disable the check for '/' before ':' when the URL has been mangled by '[]' unwrapping. [jn: with tests from Jeff King] Noticed-by: Morten Stenshorne <mstensho@opera.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2013-09-18connect: annotate refs with their symref information in get_remote_head()Junio C Hamano
By doing this, clients of upload-pack can now reliably tell what ref a symbolic ref points at; the updated test in t5505 used to expect failure due to the ambiguity and made sure we give diagnostics, but we no longer need to be so pessimistic. Make sure we correctly learn which branch HEAD points at from the other side instead. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-18connect.c: make parse_feature_value() staticJunio C Hamano
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-08cache.h: move remote/connect API out of itJunio C Hamano
The definition of "struct ref" in "cache.h", a header file so central to the system, always confused me. This structure is not about the local ref used by sha1-name API to name local objects. It is what refspecs are expanded into, after finding out what refs the other side has, to define what refs are updated after object transfer succeeds to what values. It belongs to "remote.h" together with "struct refspec". While we are at it, also move the types and functions related to the Git transport connection to a new header file connect.h Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-02Merge branch 'nd/clone-local-with-colon'Junio C Hamano
"git clone foo/bar:baz" cannot be a request to clone from a remote over git-over-ssh specified in the scp style. Detect this case and clone from a local repository at "foo/bar:baz". * nd/clone-local-with-colon: clone: allow cloning local paths with colons in them
2013-05-07clone: allow cloning local paths with colons in themNguyễn Thái Ngọc Duy
Usually "foo:bar" is interpreted as an ssh url. This patch allows to clone from such paths by putting at least one slash before the colon (i.e. /path/to/foo:bar or just ./foo:bar). file://foo:bar should also work, but local optimizations are off in that case, which may be unwanted. While at there, warn the users about --local being ignored in this case. Reported-by: William Giokas <1007380@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-24teach get_remote_heads to read from a memory bufferJeff King
Now that we can read packet data from memory as easily as a descriptor, get_remote_heads can take either one as a source. This will allow further refactoring in remote-curl. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-24pkt-line: share buffer/descriptor reading implementationJeff King
The packet_read function reads from a descriptor. The packet_get_line function is similar, but reads from an in-memory buffer, and uses a completely separate implementation. This patch teaches the generic packet_read function to accept either source, and we can do away with packet_get_line's implementation. There are two other differences to account for between the old and new functions. The first is that we used to read into a strbuf, but now read into a fixed size buffer. The only two callers are fine with that, and in fact it simplifies their code, since they can use the same static-buffer interface as the rest of the packet_read_line callers (and we provide a similar convenience wrapper for reading from a buffer rather than a descriptor). This is technically an externally-visible behavior change in that we used to accept arbitrary sized packets up to 65532 bytes, and now cap out at LARGE_PACKET_MAX, 65520. In practice this doesn't matter, as we use it only for parsing smart-http headers (of which there is exactly one defined, and it is small and fixed-size). And any extension headers would be breaking the protocol to go over LARGE_PACKET_MAX anyway. The other difference is that packet_get_line would return on error rather than dying. However, both callers of packet_get_line are actually improved by dying. The first caller does its own error checking, but we can drop that; as a result, we'll actually get more specific reporting about protocol breakage when packet_read dies internally. The only downside is that packet_read will not print the smart-http URL that failed, but that's not a big deal; anybody not debugging can already see the remote's URL already, and anybody debugging would want to run with GIT_CURL_VERBOSE anyway to see way more information. The second caller, which is just trying to skip past any extra smart-http headers (of which there are none defined, but which we allow to keep room for future expansion), did not error check at all. As a result, it would treat an error just like a flush packet. The resulting mess would generally cause an error later in get_remote_heads, but now we get error reporting much closer to the source of the problem. Brown-paper-bag-fixes-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20pkt-line: provide a LARGE_PACKET_MAX static bufferJeff King
Most of the callers of packet_read_line just read into a static 1000-byte buffer (callers which handle arbitrary binary data already use LARGE_PACKET_MAX). This works fine in practice, because: 1. The only variable-sized data in these lines is a ref name, and refs tend to be a lot shorter than 1000 characters. 2. When sending ref lines, git-core always limits itself to 1000 byte packets. However, the only limit given in the protocol specification in Documentation/technical/protocol-common.txt is LARGE_PACKET_MAX; the 1000 byte limit is mentioned only in pack-protocol.txt, and then only describing what we write, not as a specific limit for readers. This patch lets us bump the 1000-byte limit to LARGE_PACKET_MAX. Even though git-core will never write a packet where this makes a difference, there are two good reasons to do this: 1. Other git implementations may have followed protocol-common.txt and used a larger maximum size. We don't bump into it in practice because it would involve very long ref names. 2. We may want to increase the 1000-byte limit one day. Since packets are transferred before any capabilities, it's difficult to do this in a backwards-compatible way. But if we bump the size of buffer the readers can handle, eventually older versions of git will be obsolete enough that we can justify bumping the writers, as well. We don't have plans to do this anytime soon, but there is no reason not to start the clock ticking now. Just bumping all of the reading bufs to LARGE_PACKET_MAX would waste memory. Instead, since most readers just read into a temporary buffer anyway, let's provide a single static buffer that all callers can use. We can further wrap this detail away by having the packet_read_line wrapper just use the buffer transparently and return a pointer to the static storage. That covers most of the cases, and the remaining ones already read into their own LARGE_PACKET_MAX buffers. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20pkt-line: teach packet_read_line to chomp newlinesJeff King
The packets sent during ref negotiation are all terminated by newline; even though the code to chomp these newlines is short, we end up doing it in a lot of places. This patch teaches packet_read_line to auto-chomp the trailing newline; this lets us get rid of a lot of inline chomping code. As a result, some call-sites which are not reading line-oriented data (e.g., when reading chunks of packfiles alongside sideband) transition away from packet_read_line to the generic packet_read interface. This patch converts all of the existing callsites. Since the function signature of packet_read_line does not change (but its behavior does), there is a possibility of new callsites being introduced in later commits, silently introducing an incompatibility. However, since a later patch in this series will change the signature, such a commit would have to be merged directly into this commit, not to the tip of the series; we can therefore ignore the issue. This is an internal cleanup and should produce no change of behavior in the normal case. However, there is one corner case to note. Callers of packet_read_line have never been able to tell the difference between a flush packet ("0000") and an empty packet ("0004"), as both cause packet_read_line to return a length of 0. Readers treat them identically, even though Documentation/technical/protocol-common.txt says we must not; it also says that implementations should not send an empty pkt-line. By stripping out the newline before the result gets to the caller, we will now treat the newline-only packet ("0005\n") the same as an empty packet, which in turn gets treated like a flush packet. In practice this doesn't matter, as neither empty nor newline-only packets are part of git's protocols (at least not for the line-oriented bits, and readers who are not expecting line-oriented packets will be calling packet_read directly, anyway). But even if we do decide to care about the distinction later, it is orthogonal to this patch. The right place to tighten would be to stop treating empty packets as flush packets, and this change does not make doing so any harder. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20pkt-line: provide a generic reading function with optionsJeff King
Originally we had a single function for reading packetized data: packet_read_line. Commit 46284dd grew a more "gentle" form, packet_read, that returns an error instead of dying upon reading a truncated input stream. However, it is not clear from the names which should be called, or what the difference is. Let's instead make packet_read be a generic public interface that can take option flags, and update the single callsite that uses it. This is less code, more clear, and paves the way for introducing more options into the generic interface later. The function signature is changed, so there should be no hidden conflicts with topics in flight. While we're at it, we'll document how error conditions are handled based on the options, and rename the confusing "return_line_fail" option to "gentle_on_eof". While we are cleaning up the names, we can drop the "return_line_fail" checks in packet_read_internal entirely. They look like this: ret = safe_read(..., return_line_fail); if (return_line_fail && ret < 0) ... The check for return_line_fail is a no-op; safe_read will only ever return an error value if return_line_fail was true in the first place. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-29Merge branch 'jc/capabilities'Junio C Hamano
Some capabilities were asked by fetch-pack even when upload-pack did not advertise that they are available. Fix fetch-pack not to do so. * jc/capabilities: fetch-pack: mention server version with verbose output parse_feature_request: make it easier to see feature values fetch-pack: do not ask for unadvertised capabilities do not send client agent unless server does first send-pack: fix capability-sending logic include agent identifier in capability string
2012-08-14parse_feature_request: make it easier to see feature valuesJeff King
We already take care to parse key/value capabilities like "foo=bar", but the code does not provide a good way of actually finding out what is on the right-hand side of the "=". A server using "parse_feature_request" could accomplish this with some extra parsing. You must skip past the "key" portion manually, check for "=" versus NUL or space, and then find the length by searching for the next space (or NUL). But clients can't even do that, since the "server_supports" interface does not even return the pointer. Instead, let's have our parser share more information by providing a pointer to the value and its length. The "parse_feature_value" function returns a pointer to the feature's value portion, along with the length of the value. If the feature is missing, NULL is returned. If it does not have an "=", then a zero-length value is returned. Similarly, "server_feature_value" behaves in the same way, but always checks the static server_feature_list variable. We can then implement "server_supports" in terms of "server_feature_value". We cannot implement the original "parse_feature_request" in terms of our new function, because it returned a pointer to the beginning of the feature. However, no callers actually cared about the value of the returned pointer, so we can simplify it to a boolean just as we do for "server_supports". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>