summaryrefslogtreecommitdiff
path: root/pkt-line.c
AgeCommit message (Collapse)Author
2020-07-07Merge branch 'bc/sha-256-part-2'Junio C Hamano
SHA-256 migration work continues. * bc/sha-256-part-2: (44 commits) remote-testgit: adapt for object-format bundle: detect hash algorithm when reading refs t5300: pass --object-format to git index-pack t5704: send object-format capability with SHA-256 t5703: use object-format serve option t5702: offer an object-format capability in the test t/helper: initialize the repository for test-sha1-array remote-curl: avoid truncating refs with ls-remote t1050: pass algorithm to index-pack when outside repo builtin/index-pack: add option to specify hash algorithm remote-curl: detect algorithm for dumb HTTP by size builtin/ls-remote: initialize repository based on fetch t5500: make hash independent serve: advertise object-format capability for protocol v2 connect: parse v2 refs with correct hash algorithm connect: pass full packet reader when parsing v2 refs Documentation/technical: document object-format for protocol v2 t1302: expect repo format version 1 for SHA-256 builtin/show-index: provide options to determine hash algo t5302: modernize test formatting ...
2020-05-27pkt-line: add a member for hash algorithmbrian m. carlson
Add a member for the hash algorithm currently in use to the packet reader so it can parse references correctly. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-24pkt-line: define PACKET_READ_RESPONSE_ENDDenton Liu
In a future commit, we will use PACKET_READ_RESPONSE_END to separate messages proxied by remote-curl. To prepare for this, add the PACKET_READ_RESPONSE_END enum value. In switch statements that need a case added, die() or BUG() when a PACKET_READ_RESPONSE_END is unexpected. Otherwise, mirror how PACKET_READ_DELIM is implemented (especially in cases where packets are being forwarded). Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-19pkt-line: extern packet_length()Denton Liu
In a future commit, we will be manually processing packets and we will need to access the length header. In order to simplify this, extern packet_length() so that the logic can be reused. Change the function parameter from `const char *linelen` to `const char lenbuf_hex[4]`. Even though these two types behave identically as function parameters, use the array notation to semantically indicate exactly what this function is expecting as an argument. Also, rename it from linelen to lenbuf_hex as the former sounds like it should be an integral type which is misleading. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-05-16pkt-line: drop 'const'-ness of a param to set_packet_header()Junio C Hamano
The function's definition has a paramter of type "int" qualified as "const". The fact that the incoming parameter is used as read-only in the fuction is an implementation detail that the callers should not have to be told in the prototype declaring it (and "const" there has no effect, as C passes parameters by value). The prototype defined for the function in pkt-line.h lacked the matching "const" for this reason, but apparently some compilers (e.g. MS Visual C 2017) complain about the parameter type mismatch. Let's squelch it by removing the "const" that is pointless in the definition of a small and trivial function like this, which would not help optimizing compilers nor reading humans that much. Noticed-by: Johannes Schindelin <johannes.schindelin@gmx.de> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-15pkt-line: prepare buffer before handling ERR packetsJeff King
Since 2d103c31c2 (pack-protocol.txt: accept error packets in any context, 2018-12-29), the pktline code will detect an ERR packet and die automatically, saving the caller from dealing with it. But we do so too early in the function, before we have terminated the buffer with a NUL. As a result, passing the ERR message to die() may result in us printing random cruft from a previous packet. This doesn't trigger memory tools like ASan because we reuse the same buffer over and over (so the contents are valid and initialized; they're just stale). We can see demonstrate this by tightening the regex we use to match the error message in t5516; without this patch, git-fetch will accidentally print the capabilities from the (much longer) initial packet we received. By moving the ERR code later in the function we get a few other benefits, too: - we'll now chomp any newline sent by the other side (which is what we want, since die() will add its own newline) - we'll now mention the ERR packet with GIT_TRACE_PACKET Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-20Merge branch 'jk/no-sigpipe-during-network-transport'Junio C Hamano
On platforms where "git fetch" is killed with SIGPIPE (e.g. OSX), the upload-pack that runs on the other end that hangs up after detecting an error could cause "git fetch" to die with a signal, which led to a flakey test. "git fetch" now ignores SIGPIPE during the network portion of its operation (this is not a problem as we check the return status from our write(2)s). * jk/no-sigpipe-during-network-transport: fetch: ignore SIGPIPE during network operation fetch: avoid calling write_or_die()
2019-03-05fetch: avoid calling write_or_die()Jeff King
The write_or_die() function has one quirk that a caller might not expect: when it sees EPIPE from the write() call, it translates that into a death by SIGPIPE. This doesn't change the overall behavior (the program exits either way), but it does potentially confuse test scripts looking for a non-signal exit code. Let's switch away from using write_or_die() in a few code paths, which will give us more consistent exit codes. It also gives us the opportunity to write more descriptive error messages, since we have context that write_or_die() does not. Note that this won't do much by itself, since we'd typically be killed by SIGPIPE before write_or_die() even gets a chance to do its thing. That will be addressed in the next patch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-03remote-curl: use post_rpc() for protocol v2 alsoJonathan Tan
When transmitting and receiving POSTs for protocol v0 and v1, remote-curl uses post_rpc() (and associated functions), but when doing the same for protocol v2, it uses a separate set of functions (proxy_rpc() and others). Besides duplication of code, this has caused at least one bug: the auth retry mechanism that was implemented in v0/v1 was not implemented in v2. To fix this issue and avoid it in the future, make remote-curl also use post_rpc() when handling protocol v2. Because line lengths are written to the HTTP request in protocol v2 (unlike in protocol v0/v1), this necessitates changes in post_rpc() and some of the functions it uses; perform these changes too. A test has been included to ensure that the code for both the unchunked and chunked variants of the HTTP request is exercised. Note: stateless_connect() has been updated to use the lower-level packet reading functions instead of struct packet_reader. The low-level control is necessary here because we cannot change the destination buffer of struct packet_reader while it is being used; struct packet_buffer has a peeking mechanism which relies on the destination buffer being present in between a peek and a read. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-17{fetch,upload}-pack: sideband v2 fetch responseJonathan Tan
Currently, a response to a fetch request has sideband support only while the packfile is being sent, meaning that the server cannot send notices until the start of the packfile. Extend sideband support in protocol v2 fetch responses to the whole response. upload-pack will advertise it if the uploadpack.allowsidebandall configuration variable is set, and fetch-pack will automatically request it if advertised. If the sideband is to be used throughout the whole response, upload-pack will use it to send errors instead of prefixing a PKT-LINE payload with "ERR ". This will be tested in a subsequent patch. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-17sideband: reverse its dependency on pkt-lineJonathan Tan
A subsequent patch will teach struct packet_reader a new field that, if set, instructs it to interpret read data as multiplexed. This will create a dependency from pkt-line to sideband. To avoid a circular dependency, split recv_sideband() into 2 parts: the reading loop (left in recv_sideband()) and the processing of the contents (in demultiplex_sideband()), and move the former into pkt-line. This reverses the direction of dependency: sideband no longer depends on pkt-line, and pkt-line now depends on sideband. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-15pkt-line: introduce struct packet_writerJonathan Tan
A future patch will allow the client to request multiplexing of the entire fetch response (and not only during packfile transmission), which in turn allows the server to send progress and keepalive messages at any time during the response. It will be convenient for a future patch if writing options (specifically, whether the written data is to be multiplexed) could be controlled from a single place, so create struct packet_writer to serve as that place, and modify upload-pack to use it. Currently, it only stores the output fd, but a subsequent patch will (as described above) introduce an option to determine if the written data is to be multiplexed. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-02pack-protocol.txt: accept error packets in any contextMasaya Suzuki
In the Git pack protocol definition, an error packet may appear only in a certain context. However, servers can face a runtime error (e.g. I/O error) at an arbitrary timing. This patch changes the protocol to allow an error packet to be sent instead of any packet. Without this protocol spec change, when a server cannot process a request, there's no way to tell that to a client. Since the server cannot produce a valid response, it would be forced to cut a connection without telling why. With this protocol spec change, the server can be more gentle in this situation. An old client may see these error packets as an unexpected packet, but this is not worse than having an unexpected EOF. Following this protocol spec change, the error packet handling code is moved to pkt-line.c. Implementation wise, this implementation uses pkt-line to communicate with a subprocess. Since this is not a part of Git protocol, it's possible that a packet that is not supposed to be an error packet is mistakenly parsed as an error packet. This error packet handling is enabled only for the Git pack protocol parsing code considering this. Signed-off-by: Masaya Suzuki <masayasuzuki@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-23pkt-line.c: mark more strings for translationNguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-23Update messages in preparation for i18nNguyễn Thái Ngọc Duy
Many messages will be marked for translation in the following commits. This commit updates some of them to be more consistent and reduce diff noise in those commits. Changes are - keep the first letter of die(), error() and warning() in lowercase - no full stop in die(), error() or warning() if it's single sentence messages - indentation - some messages are turned to BUG(), or prefixed with "BUG:" and will not be marked for i18n - some messages are improved to give more information - some messages are broken down by sentence to be i18n friendly (on the same token, combine multiple warning() into one big string) - the trailing \n is converted to printf_ln if possible, or deleted if not redundant - errno_errno() is used instead of explicit strerror() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-30Merge branch 'js/use-bug-macro'Junio C Hamano
Developer support update, by using BUG() macro instead of die() to mark codepaths that should not happen more clearly. * js/use-bug-macro: BUG_exit_code: fix sparse "symbol not declared" warning Convert remaining die*(BUG) messages Replace all die("BUG: ...") calls by BUG() ones run-command: use BUG() to report bugs, not die() test-tool: help verifying BUG() code paths
2018-05-06Replace all die("BUG: ...") calls by BUG() onesJohannes Schindelin
In d8193743e08 (usage.c: add BUG() function, 2017-05-12), a new macro was introduced to use for reporting bugs instead of die(). It was then subsequently used to convert one single caller in 588a538ae55 (setup_git_env: convert die("BUG") to BUG(), 2017-05-12). The cover letter of the patch series containing this patch (cf 20170513032414.mfrwabt4hovujde2@sigill.intra.peff.net) is not terribly clear why only one call site was converted, or what the plan is for other, similar calls to die() to report bugs. Let's just convert all remaining ones in one fell swoop. This trick was performed by this invocation: sed -i 's/die("BUG: /BUG("/g' $(git grep -l 'die("BUG' \*.c) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-15pkt-line: add packet_buf_write_len functionBrandon Williams
Add the 'packet_buf_write_len()' function which allows for writing an arbitrary length buffer into a 'struct strbuf' and formatting it in packet-line format. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14pkt-line: add delim packet supportBrandon Williams
One of the design goals of protocol-v2 is to improve the semantics of flush packets. Currently in protocol-v1, flush packets are used both to indicate a break in a list of packet lines as well as an indication that one side has finished speaking. This makes it particularly difficult to implement proxies as a proxy would need to completely understand git protocol instead of simply looking for a flush packet. To do this, introduce the special deliminator packet '0001'. A delim packet can then be used as a deliminator between lists of packet lines while flush packets can be reserved to indicate the end of a response. Documentation for how this packet will be used in protocol v2 will included in a future patch. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14pkt-line: allow peeking a packet line without consuming itBrandon Williams
Sometimes it is advantageous to be able to peek the next packet line without consuming it (e.g. to be able to determine the protocol version a server is speaking). In order to do that introduce 'struct packet_reader' which is an abstraction around the normal packet reading logic. This enables a caller to be able to peek a single line at a time using 'packet_reader_peek()' and having a caller consume a line by calling 'packet_reader_read()'. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14pkt-line: introduce packet_read_with_statusBrandon Williams
The current pkt-line API encodes the status of a pkt-line read in the length of the read content. An error is indicated with '-1', a flush with '0' (which can be confusing since a return value of '0' can also indicate an empty pkt-line), and a positive integer for the length of the read content otherwise. This doesn't leave much room for allowing the addition of additional special packets in the future. To solve this introduce 'packet_read_with_status()' which reads a packet and returns the status of the read encoded as an 'enum packet_status' type. This allows for easily identifying between special and normal packets as well as errors. It also enables easily adding a new special packet in the future. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-06Merge branch 'bw/protocol-v1'Junio C Hamano
A new mechanism to upgrade the wire protocol in place is proposed and demonstrated that it works with the older versions of Git without harming them. * bw/protocol-v1: Documentation: document Extra Parameters ssh: introduce a 'simple' ssh variant i5700: add interop test for protocol transition http: tell server that the client understands v1 connect: tell server that the client understands v1 connect: teach client to recognize v1 server response upload-pack, receive-pack: introduce protocol version 1 daemon: recognize hidden request arguments protocol: introduce protocol extension mechanisms pkt-line: add packet_write function connect: in ref advertisement, shallows are last
2017-10-17pkt-line: add packet_write functionBrandon Williams
Add a function which can be used to write the contents of an arbitrary buffer. This makes it easy to build up data in a buffer before writing the packet instead of formatting the entire contents of the packet using 'packet_write_fmt()'. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-27prefer "!=" when checking read_in_full() resultJeff King
Comparing the result of read_in_full() using less-than is potentially dangerous, as a negative return value may be converted to an unsigned type and be considered a success. This is discussed further in 561598cfcf (read_pack_header: handle signed/unsigned comparison in read result, 2017-09-13). Each of these instances is actually fine in practice: - in get-tar-commit-id, the HEADERSIZE macro expands to a signed integer. If it were switched to an unsigned type (e.g., a size_t), then it would be a bug. - the other two callers check for a short read only after handling a negative return separately. This is a fine practice, but we'd prefer to model "!=" as a general rule. So all of these cases can be considered cleanups and not actual bugfixes. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25Merge branch 'jk/write-in-full-fix'Junio C Hamano
Many codepaths did not diagnose write failures correctly when disks go full, due to their misuse of write_in_full() helper function, which have been corrected. * jk/write-in-full-fix: read_pack_header: handle signed/unsigned comparison in read result config: flip return value of store_write_*() notes-merge: use ssize_t for write_in_full() return value pkt-line: check write_in_full() errors against "< 0" convert less-trivial versions of "write_in_full() != len" avoid "write_in_full(fd, buf, len) != len" pattern get-tar-commit-id: check write_in_full() return against 0 config: avoid "write_in_full(fd, buf, len) < len" pattern
2017-09-19Merge branch 'ma/pkt-line-leakfix'Junio C Hamano
A leakfix. * ma/pkt-line-leakfix: pkt-line: re-'static'-ify buffer in packet_write_fmt_1()
2017-09-14pkt-line: check write_in_full() errors against "< 0"Jeff King
As with the previous two commits, we prefer to check write_in_full()'s return value to see if it is negative, rather than comparing it to the input length. These cases actually flip the logic to check for success, making conversion a little different than in other cases. We could of course write: if (write_in_full(...) >= 0) return 0; return error(...); But our usual method of spelling write() error checks is just "< 0". So let's flip the logic for each of these conditionals to our usual style. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06pkt-line: re-'static'-ify buffer in packet_write_fmt_1()Martin Ågren
The static-ness was silently dropped in commit 70428d1a5 ("pkt-line: add packet_write_fmt_gently()", 2016-10-16). As a result, for each call to packet_write_fmt_1, we allocate and leak a buffer. We could keep the strbuf non-static and instead make sure we always release it before returning (but not before we die, so that we don't touch errno). That would also prepare us for threaded use. But until that needs to happen, let's just restore the static-ness so that we get back to a situation where we (eventually) do not continuosly keep allocating memory. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-26sub-process: refactor handshake to common functionJonathan Tan
Refactor, into a common function, the version and capability negotiation done when invoking a long-running process as a clean or smudge filter. This will be useful for other Git code that needs to interact similarly with a long-running process. As you can see in the change to t0021, this commit changes the error message reported when the long-running process does not introduce itself with the expected "server"-terminated line. Originally, the error message reports that the filter "does not support filter protocol version 2", differentiating between the old single-file filter protocol and the new multi-file filter protocol - I have updated it to something more generic and useful. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08convert: move packet_write_line() into pkt-line as packet_writel()Ben Peart
Add packet_writel() which writes multiple lines in a single call and then calls packet_flush_gently(). Update convert.c to use the new packet_writel() function from pkt-line. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08pkt-line: add packet_read_line_gently()Ben Peart
Add packet_read_line_gently() to enable reading a line without dying on EOF. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08pkt-line: fix packet_read_line() to handle len < 0 errorsBen Peart
Update packet_read_line() to test for len > 0 to avoid potential bug if read functions return lengths less than zero to indicate errors. Signed-off-by: Ben Peart <benpeart@microsoft.com> Found/Fixed-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-17pkt-line: add functions to read/write flush terminated packet streamsLars Schneider
write_packetized_from_fd() and write_packetized_from_buf() write a stream of packets. All content packets use the maximal packet size except for the last one. After the last content packet a `flush` control packet is written. read_packetized_to_strbuf() reads arbitrary sized packets until it detects a `flush` packet. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-17pkt-line: add packet_write_gently()Lars Schneider
packet_write_fmt_gently() uses format_packet() which lets the caller only send string data via "%s". That means it cannot be used for arbitrary data that may contain NULs. Add packet_write_gently() which writes arbitrary data and does not die in case of an error. The function is used by other pkt-line functions in a subsequent patch. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-17pkt-line: add packet_flush_gently()Lars Schneider
packet_flush() would die in case of a write error even though for some callers an error would be acceptable. Add packet_flush_gently() which writes a pkt-line flush packet like packet_flush() but does not die in case of an error. The function is used in a subsequent patch. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-17pkt-line: add packet_write_fmt_gently()Lars Schneider
packet_write_fmt() would die in case of a write error even though for some callers an error would be acceptable. Add packet_write_fmt_gently() which writes a formatted pkt-line like packet_write_fmt() but does not die in case of an error. The function is used in a subsequent patch. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-17pkt-line: extract set_packet_header()Lars Schneider
Extracted set_packet_header() function converts an integer to a 4 byte hex string. Make this function locally available so that other pkt-line functions could use it. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-17pkt-line: rename packet_write() to packet_write_fmt()Lars Schneider
packet_write() should be called packet_write_fmt() because it is a printf-like function that takes a format string as first parameter. packet_write_fmt() should be used for text strings only. Arbitrary binary data should use a new packet_write() function that is introduced in a subsequent patch. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07introduce hex2chr() for converting two hexadecimal digits to a characterRené Scharfe
Add and use a helper function that decodes the char value of two hexadecimal digits. It returns a negative number on error, avoids running over the end of the given string and doesn't shift negative values. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-01pkt-line: show packets in async processes as "sideband"Jeff King
If you run "GIT_TRACE_PACKET=1 git push", you may get confusing output like (line prefixes omitted for clarity): packet: push< \1000eunpack ok0019ok refs/heads/master0000 packet: push< unpack ok packet: push< ok refs/heads/master packet: push< 0000 packet: push< 0000 Why do we see the data twice, once apparently wrapped inside another pkt-line, and once unwrapped? Why do we get two flush packets? The answer is that we start an async process to demux the sideband data. The first entry comes from the sideband process reading the data, and the second from push itself. Likewise, the first flush is inside the demuxed packet, and the second is an actual sideband flush. We can make this a bit more clear by marking the sideband demuxer explicitly as "sideband" rather than "push". The most elegant way to do this would be to simply call packet_trace_identity() inside the sideband demuxer. But we can't do that reliably, because it relies on a global variable, which might be shared if pthreads are in use. What we really need is thread-local storage for packet_trace_identity. But the async code does not provide an interface for that, and it would be messy to add it here (we'd have to care about pthreads, initializing our pthread_key_t ahead of time, etc). So instead, let us just assume that any async process is handling sideband data. That's always true now, and is likely to remain so in the future. The output looks like: packet: sideband< \1000eunpack ok0019ok refs/heads/master0000 packet: push< unpack ok packet: push< ok refs/heads/master packet: push< 0000 packet: sideband< 0000 Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-16pkt-line: support tracing verbatim pack contentsJeff King
When debugging the pack protocol, it is sometimes useful to store the verbatim pack that we sent or received on the wire. Looking at the on-disk result is often not helpful for a few reasons: 1. If the operation is a clone, we destroy the repo on failure, leaving nothing on disk. 2. If the pack is small, we unpack it immediately, and the full pack never hits the disk. 3. If we feed the pack to "index-pack --fix-thin", the resulting pack has the extra delta bases added to it. We already have a GIT_TRACE_PACKET mechanism for tracing packets. Let's extend it with GIT_TRACE_PACKFILE to dump the verbatim packfile. There are a few other positive fallouts that come from rearranging this code: - We currently disable the packet trace after seeing the PACK header, even though we may get human-readable lines on other sidebands; now we include them in the trace. - We currently try to print "PACK ..." in the trace to indicate that the packfile has started. But because we disable packet tracing, we never printed this line. We will now do so. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-15pkt-line: tighten sideband PACK check when tracingJeff King
To find the start of the pack data, we accept the word PACK at the beginning of any sideband channel, even though what we really want is to find the pack data on channel 1. In practice this doesn't matter, as sideband-2 messages tend to start with "error:" or similar, but it is a good idea to be explicit (especially as we add more code in this area, we will rely on this assumption). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-15pkt-line: simplify starts_with checks in packet tracingJeff King
We carefully check that our pkt buffer has enough characters before seeing if it starts with "PACK". The intent is to avoid reading random memory if we get a short buffer like "PAC". However, we know that the traced packets are always NUL-terminated. They come from one of these sources: 1. A string literal. 2. `format_packet`, which uses a strbuf. 3. `packet_read`, which defensively NUL-terminates what we read. We can therefore drop the length checks, as we know we will hit the trailing NUL if we have a short input. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-10pkt-line: allow writing of LARGE_PACKET_MAX buffersJeff King
When we send out pkt-lines with refnames, we use a static 1000-byte buffer. This means that the maximum size of a ref over the git protocol is around 950 bytes (the exact size depends on the protocol line being written, but figure on a sha1 plus some boilerplate). This is enough for any sane workflow, but occasionally odd things happen (e.g., a bug may create a ref "foo/foo/foo/..." accidentally). With the current code, you cannot even use "push" to delete such a ref from a remote. Let's switch to using a strbuf, with a hard-limit of LARGE_PACKET_MAX (which is specified by the protocol). This matches the size of the readers, as of 74543a0 (pkt-line: provide a LARGE_PACKET_MAX static buffer, 2013-02-20). Versions of git older than that will complain about our large packets, but it's really no worse than the current behavior. Right now the sender barfs with "impossibly long line" trying to send the packet, and afterwards the reader will barf with "protocol error: bad line length %d", which is arguably better anyway. Note that we're not really _solving_ the problem here, but just bumping the limits. In theory, the length of a ref is unbounded, and pkt-line can only represent sizes up to 65531 bytes. So we are just bumping the limit, not removing it. But hopefully 64K should be enough for anyone. As a bonus, by using a strbuf for the formatting we can eliminate an unnecessary copy in format_buf_write. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-14trace: improve trace performanceKarsten Blees
The trace API currently rechecks the environment variable and reopens the trace file on every API call. This has the ugly side effect that errors (e.g. file cannot be opened, or the user specified a relative path) are also reported on every call. Performance can be improved by about factor three by remembering the environment state and keeping the file open. Replace the 'const char *key' parameter in the API with a pointer to a 'struct trace_key' that bundles the environment variable name with additional, trace-internal state. Change the call sites of these APIs to use a static 'struct trace_key' instead of a string constant. In trace.c::get_trace_fd(), save and reuse the file descriptor in 'struct trace_key'. Add a 'trace_disable()' API, so that packet_trace() can cleanly disable tracing when it encounters packed data (instead of using unsetenv()). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-05replace {pre,suf}fixcmp() with {starts,ends}_with()Christian Couder
Leaving only the function definitions and declarations so that any new topic in flight can still make use of the old functions, replace existing uses of the prefixcmp() and suffixcmp() with new API functions. The change can be recreated by mechanically applying this: $ git grep -l -e prefixcmp -e suffixcmp -- \*.c | grep -v strbuf\\.c | xargs perl -pi -e ' s|!prefixcmp\(|starts_with\(|g; s|prefixcmp\(|!starts_with\(|g; s|!suffixcmp\(|ends_with\(|g; s|suffixcmp\(|!ends_with\(|g; ' on the result of preparatory changes in this series. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-24pkt-line: share buffer/descriptor reading implementationJeff King
The packet_read function reads from a descriptor. The packet_get_line function is similar, but reads from an in-memory buffer, and uses a completely separate implementation. This patch teaches the generic packet_read function to accept either source, and we can do away with packet_get_line's implementation. There are two other differences to account for between the old and new functions. The first is that we used to read into a strbuf, but now read into a fixed size buffer. The only two callers are fine with that, and in fact it simplifies their code, since they can use the same static-buffer interface as the rest of the packet_read_line callers (and we provide a similar convenience wrapper for reading from a buffer rather than a descriptor). This is technically an externally-visible behavior change in that we used to accept arbitrary sized packets up to 65532 bytes, and now cap out at LARGE_PACKET_MAX, 65520. In practice this doesn't matter, as we use it only for parsing smart-http headers (of which there is exactly one defined, and it is small and fixed-size). And any extension headers would be breaking the protocol to go over LARGE_PACKET_MAX anyway. The other difference is that packet_get_line would return on error rather than dying. However, both callers of packet_get_line are actually improved by dying. The first caller does its own error checking, but we can drop that; as a result, we'll actually get more specific reporting about protocol breakage when packet_read dies internally. The only downside is that packet_read will not print the smart-http URL that failed, but that's not a big deal; anybody not debugging can already see the remote's URL already, and anybody debugging would want to run with GIT_CURL_VERBOSE anyway to see way more information. The second caller, which is just trying to skip past any extra smart-http headers (of which there are none defined, but which we allow to keep room for future expansion), did not error check at all. As a result, it would treat an error just like a flush packet. The resulting mess would generally cause an error later in get_remote_heads, but now we get error reporting much closer to the source of the problem. Brown-paper-bag-fixes-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20pkt-line: provide a LARGE_PACKET_MAX static bufferJeff King
Most of the callers of packet_read_line just read into a static 1000-byte buffer (callers which handle arbitrary binary data already use LARGE_PACKET_MAX). This works fine in practice, because: 1. The only variable-sized data in these lines is a ref name, and refs tend to be a lot shorter than 1000 characters. 2. When sending ref lines, git-core always limits itself to 1000 byte packets. However, the only limit given in the protocol specification in Documentation/technical/protocol-common.txt is LARGE_PACKET_MAX; the 1000 byte limit is mentioned only in pack-protocol.txt, and then only describing what we write, not as a specific limit for readers. This patch lets us bump the 1000-byte limit to LARGE_PACKET_MAX. Even though git-core will never write a packet where this makes a difference, there are two good reasons to do this: 1. Other git implementations may have followed protocol-common.txt and used a larger maximum size. We don't bump into it in practice because it would involve very long ref names. 2. We may want to increase the 1000-byte limit one day. Since packets are transferred before any capabilities, it's difficult to do this in a backwards-compatible way. But if we bump the size of buffer the readers can handle, eventually older versions of git will be obsolete enough that we can justify bumping the writers, as well. We don't have plans to do this anytime soon, but there is no reason not to start the clock ticking now. Just bumping all of the reading bufs to LARGE_PACKET_MAX would waste memory. Instead, since most readers just read into a temporary buffer anyway, let's provide a single static buffer that all callers can use. We can further wrap this detail away by having the packet_read_line wrapper just use the buffer transparently and return a pointer to the static storage. That covers most of the cases, and the remaining ones already read into their own LARGE_PACKET_MAX buffers. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20pkt-line: teach packet_read_line to chomp newlinesJeff King
The packets sent during ref negotiation are all terminated by newline; even though the code to chomp these newlines is short, we end up doing it in a lot of places. This patch teaches packet_read_line to auto-chomp the trailing newline; this lets us get rid of a lot of inline chomping code. As a result, some call-sites which are not reading line-oriented data (e.g., when reading chunks of packfiles alongside sideband) transition away from packet_read_line to the generic packet_read interface. This patch converts all of the existing callsites. Since the function signature of packet_read_line does not change (but its behavior does), there is a possibility of new callsites being introduced in later commits, silently introducing an incompatibility. However, since a later patch in this series will change the signature, such a commit would have to be merged directly into this commit, not to the tip of the series; we can therefore ignore the issue. This is an internal cleanup and should produce no change of behavior in the normal case. However, there is one corner case to note. Callers of packet_read_line have never been able to tell the difference between a flush packet ("0000") and an empty packet ("0004"), as both cause packet_read_line to return a length of 0. Readers treat them identically, even though Documentation/technical/protocol-common.txt says we must not; it also says that implementations should not send an empty pkt-line. By stripping out the newline before the result gets to the caller, we will now treat the newline-only packet ("0005\n") the same as an empty packet, which in turn gets treated like a flush packet. In practice this doesn't matter, as neither empty nor newline-only packets are part of git's protocols (at least not for the line-oriented bits, and readers who are not expecting line-oriented packets will be calling packet_read directly, anyway). But even if we do decide to care about the distinction later, it is orthogonal to this patch. The right place to tighten would be to stop treating empty packets as flush packets, and this change does not make doing so any harder. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-20pkt-line: provide a generic reading function with optionsJeff King
Originally we had a single function for reading packetized data: packet_read_line. Commit 46284dd grew a more "gentle" form, packet_read, that returns an error instead of dying upon reading a truncated input stream. However, it is not clear from the names which should be called, or what the difference is. Let's instead make packet_read be a generic public interface that can take option flags, and update the single callsite that uses it. This is less code, more clear, and paves the way for introducing more options into the generic interface later. The function signature is changed, so there should be no hidden conflicts with topics in flight. While we're at it, we'll document how error conditions are handled based on the options, and rename the confusing "return_line_fail" option to "gentle_on_eof". While we are cleaning up the names, we can drop the "return_line_fail" checks in packet_read_internal entirely. They look like this: ret = safe_read(..., return_line_fail); if (return_line_fail && ret < 0) ... The check for return_line_fail is a no-op; safe_read will only ever return an error value if return_line_fail was true in the first place. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>