summaryrefslogtreecommitdiff
path: root/compat
AgeCommit message (Collapse)Author
2020-02-27mingw: workaround for hangs when sending STDINAlexandr Miloslavskiy
Explanation ----------- The problem here is flawed `poll()` implementation. When it tries to see if pipe can be written without blocking, it eventually calls `NtQueryInformationFile()` and tests `WriteQuotaAvailable`. However, the meaning of quota was misunderstood. The value of quota is reduced when either some data was written to a pipe, *or* there is a pending read on the pipe. Therefore, if there is a pending read of size >= than the pipe's buffer size, poll() will think that pipe is not writable and will hang forever, usually that means deadlocking both pipe users. I have studied the problem and found that Windows pipes track two values: `QuotaUsed` and `BytesInQueue`. The code in `poll()` apparently wants to know `BytesInQueue` instead of quota. Unfortunately, `BytesInQueue` can only be requested from read end of the pipe, while `poll()` receives write end. The git's implementation of `poll()` was copied from gnulib, which also contains a flawed implementation up to today. I also had a look at implementation in cygwin, which is also broken in a subtle way. It uses this code in `pipe_data_available()`: fpli.WriteQuotaAvailable = (fpli.OutboundQuota - fpli.ReadDataAvailable) However, `ReadDataAvailable` always returns 0 for the write end of the pipe, turning the code into an obfuscated version of returning pipe's total buffer size, which I guess will in turn have `poll()` always say that pipe is writable. The commit that introduced the code doesn't say anything about this change, so it could be some debugging code that slipped in. These are the typical sizes used in git: 0x2000 - default read size in `strbuf_read()` 0x1000 - default read size in CRT, used by `strbuf_getwholeline()` 0x2000 - pipe buffer size in compat\mingw.c As a consequence, as soon as child process uses `strbuf_read()`, `poll()` in parent process will hang forever, deadlocking both processes. This results in two observable behaviors: 1) If parent process begins sending STDIN quickly (and usually that's the case), then first `poll()` will succeed and first block will go through. MAX_IO_SIZE_DEFAULT is 8MB, so if STDIN exceeds 8MB, then it will deadlock. 2) If parent process waits a little bit for any reason (including OS scheduler) and child is first to issue `strbuf_read()`, then it will deadlock immediately even on small STDINs. The problem is illustrated by `git stash push`, which will currently read the entire patch into memory and then send it to `git apply` via STDIN. If patch exceeds 8MB, git hangs on Windows. Possible solutions ------------------ 1) Somehow obtain `BytesInQueue` instead of `QuotaUsed` I did a pretty thorough search and didn't find any ways to obtain the value from write end of the pipe. 2) Also give read end of the pipe to `poll()` That can be done, but it will probably invite some dirty code, because `poll()` * can accept multiple pipes at once * can accept things that are not pipes * is expected to have a well known signature. 3) Make `poll()` always reply "writable" for write end of the pipe Afterall it seems that cygwin (accidentally?) does that for years. Also, it should be noted that `pump_io_round()` writes 8MB blocks, completely ignoring the fact that pipe's buffer size is only 8KB, which means that pipe gets clogged many times during that single write. This may invite a deadlock, if child's STDERR/STDOUT gets clogged while it's trying to deal with 8MB of STDIN. Such deadlocks could be defeated with writing less than pipe's buffer size per round, and always reading everything from STDOUT/STDERR before starting next round. Therefore, making `poll()` always reply "writable" shouldn't cause any new issues or block any future solutions. 4) Increase the size of the pipe's buffer The difference between `BytesInQueue` and `QuotaUsed` is the size of pending reads. Therefore, if buffer is bigger than size of reads, `poll()` won't hang so easily. However, I found that for example `strbuf_read()` will get more and more hungry as it reads large inputs, eventually surpassing any reasonable pipe buffer size. Chosen solution --------------- Make `poll()` always reply "writable" for write end of the pipe. Hopefully one day someone will find a way to implement it properly. Reproduction ------------ printf "%8388608s" X >large_file.txt git stash push --include-untracked -- large_file.txt I have decided not to include this as test to avoid slowing down the test suite. I don't expect the specific problem to come back, and chances are that `git stash push` will be reworked to avoid sending the entire patch via STDIN. Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-24use strpbrk(3) to search for characters from a given setRené Scharfe
We can check if certain characters are present in a string by calling strchr(3) on each of them, or we can pass them all to a single strpbrk(3) call. The latter is shorter, less repetitive and slightly more efficient, so let's do that instead. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-17Merge branch 'js/mingw-open-in-gdb'Junio C Hamano
Dev support. * js/mingw-open-in-gdb: mingw: add a helper function to attach GDB to the current process
2020-02-14Merge branch 'jk/asan-build-fix' into maintJunio C Hamano
Work around test breakages caused by custom regex engine used in libasan, when address sanitizer is used with more recent versions of gcc and clang. * jk/asan-build-fix: Makefile: use compat regex with SANITIZE=address
2020-02-14mingw: add a helper function to attach GDB to the current processJohannes Schindelin
When debugging Git, the criss-cross spawning of processes can make things quite a bit difficult, especially when a Unix shell script is thrown in the mix that calls a `git.exe` that then segfaults. To help debugging such things, we introduce the `open_in_gdb()` function which can be called at a code location where the segfault happens (or as close as one can get); This will open a new MinTTY window with a GDB that already attached to the current process. Inspired by Derrick Stolee. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-12Merge branch 'jk/clang-sanitizer-fixes'Junio C Hamano
C pedantry ;-) fix. * jk/clang-sanitizer-fixes: obstack: avoid computing offsets from NULL pointer xdiff: avoid computing non-zero offset from NULL pointer avoid computing zero offsets from NULL pointer merge-recursive: use subtraction to flip stage merge-recursive: silence -Wxor-used-as-pow warning
2020-02-05Merge branch 'js/add-p-leftover-bits'Junio C Hamano
The final leg of rewriting "add -i/-p" in C. * js/add-p-leftover-bits: ci: include the built-in `git add -i` in the `linux-gcc` job built-in add -p: handle Escape sequences more efficiently built-in add -p: handle Escape sequences in interactive.singlekey mode built-in add -p: respect the `interactive.singlekey` config setting terminal: add a new function to read a single keystroke terminal: accommodate Git for Windows' default terminal terminal: make the code of disable_echo() reusable built-in add -p: handle diff.algorithm built-in add -p: support interactive.diffFilter t3701: adjust difffilter test
2020-01-30Merge branch 'jk/asan-build-fix'Junio C Hamano
Work around test breakages caused by custom regex engine used in libasan, when address sanitizer is used with more recent versions of gcc and clang. * jk/asan-build-fix: Makefile: use compat regex with SANITIZE=address
2020-01-29obstack: avoid computing offsets from NULL pointerJeff King
As with the previous two commits, UBSan with clang-11 complains about computing offsets from a NULL pointer. The failures in t4013 (and elsewhere) look like this: kwset.c:102:23: runtime error: applying non-zero offset 107820859019600 to null pointer ... not ok 79 - git log -SF master # magic is (not used) That line is not enlightening: ... = obstack_alloc(&kwset->obstack, sizeof (struct trie)); because obstack is implemented almost entirely in macros, and the actual problem is five macros deep (I temporarily converted them to inline functions to get better compiler errors, which was tedious but worked reasonably well). The actual problem is in these pointer-alignment macros: /* If B is the base of an object addressed by P, return the result of aligning P to the next multiple of A + 1. B and P must be of type char *. A + 1 must be a power of 2. */ #define __BPTR_ALIGN(B, P, A) ((B) + (((P) - (B) + (A)) & ~(A))) /* Similar to _BPTR_ALIGN (B, P, A), except optimize the common case where pointers can be converted to integers, aligned as integers, and converted back again. If PTR_INT_TYPE is narrower than a pointer (e.g., the AS/400), play it safe and compute the alignment relative to B. Otherwise, use the faster strategy of computing the alignment relative to 0. */ #define __PTR_ALIGN(B, P, A) \ __BPTR_ALIGN (sizeof (PTR_INT_TYPE) < sizeof (void *) ? (B) : (char *) 0, \ P, A) If we have a sufficiently-large integer pointer type, then we do the computation using a NULL pointer constant. That turns __BPTR_ALIGN() into something like: NULL + (P - NULL + A) & ~A and UBSan is complaining about adding the full value of P to that initial NULL. We can fix this by doing our math as an integer type, and then casting the result back to a pointer. The problem case only happens when we know that the integer type is large enough, so there should be no issue with truncation. Another option would be just simplify out all the 0's from __BPTR_ALIGN() for the NULL-pointer case. That probably wouldn't work for a platform where the NULL pointer isn't all-zeroes, but Git already wouldn't work on such a platform (due to our use of memset to set pointers in structs to NULL). But I tried here to keep as close to the original as possible. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-16Sync with maintJunio C Hamano
* maint: msvc: accommodate for vcpkg's upgrade to OpenSSL v1.1.x
2020-01-16Makefile: use compat regex with SANITIZE=addressJeff King
Recent versions of the gcc and clang Address Sanitizer produce test failures related to regexec(). This triggers with gcc-10 and clang-8 (but not gcc-9 nor clang-7). Running: make CC=gcc-10 SANITIZE=address test results in failures in t4018, t3206, and t4062. The cause seems to be that when built with ASan, we use a different version of regexec() than normal. And this version doesn't understand the REG_STARTEND flag. Here's my evidence supporting that. The failure in t4062 is an ASan warning: expecting success of 4062.2 '-G matches': git diff --name-only -G "^(0{64}){64}$" HEAD^ >out && test 4096-zeroes.txt = "$(cat out)" ================================================================= ==672994==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7fa76f672000 at pc 0x7fa7726f75b6 bp 0x7ffe41bdda70 sp 0x7ffe41bdd220 READ of size 4097 at 0x7fa76f672000 thread T0 #0 0x7fa7726f75b5 (/lib/x86_64-linux-gnu/libasan.so.6+0x4f5b5) #1 0x562ae0c9c40e in regexec_buf /home/peff/compile/git/git-compat-util.h:1117 #2 0x562ae0c9c40e in diff_grep /home/peff/compile/git/diffcore-pickaxe.c:52 #3 0x562ae0c9cc28 in pickaxe_match /home/peff/compile/git/diffcore-pickaxe.c:166 [...] In this case we're looking in a buffer which was mmap'd via reuse_worktree_file(), and whose size is 4096 bytes. But libasan's regex tries to look at byte 4097 anyway! If we tweak Git like this: diff --git a/diff.c b/diff.c index 8e2914c031..cfae60c120 100644 --- a/diff.c +++ b/diff.c @@ -3880,7 +3880,7 @@ static int reuse_worktree_file(struct index_state *istate, */ if (ce_uptodate(ce) || (!lstat(name, &st) && !ie_match_stat(istate, ce, &st, 0))) - return 1; + return 0; return 0; } to use a regular buffer (with a trailing NUL) instead of an mmap, then the complaint goes away. The other failures are actually diff output with an incorrect funcname header. If I instrument xdiff to show the funcname matching like so: diff --git a/xdiff-interface.c b/xdiff-interface.c index 8509f9ea22..f6c3dc1986 100644 --- a/xdiff-interface.c +++ b/xdiff-interface.c @@ -197,6 +197,7 @@ struct ff_regs { struct ff_reg { regex_t re; int negate; + char *printable; } *array; }; @@ -218,7 +219,12 @@ static long ff_regexp(const char *line, long len, for (i = 0; i < regs->nr; i++) { struct ff_reg *reg = regs->array + i; - if (!regexec_buf(&reg->re, line, len, 2, pmatch, 0)) { + int ret = regexec_buf(&reg->re, line, len, 2, pmatch, 0); + warning("regexec %s:\n regex: %s\n buf: %.*s", + ret == 0 ? "matched" : "did not match", + reg->printable, + (int)len, line); + if (!ret) { if (reg->negate) return -1; break; @@ -264,6 +270,7 @@ void xdiff_set_find_func(xdemitconf_t *xecfg, const char *value, int cflags) expression = value; if (regcomp(&reg->re, expression, cflags)) die("Invalid regexp to look for hunk header: %s", expression); + reg->printable = xstrdup(expression); free(buffer); value = ep + 1; } then when compiling with ASan and gcc-10, running the diff from t4018.66 produces this: $ git diff -U1 cpp-skip-access-specifiers warning: regexec did not match: regex: ^[ ]*[A-Za-z_][A-Za-z_0-9]*:[[:space:]]*($|/[/*]) buf: private: warning: regexec matched: regex: ^((::[[:space:]]*)?[A-Za-z_].*)$ buf: private: diff --git a/cpp-skip-access-specifiers b/cpp-skip-access-specifiers index 4d4a9db..ebd6f42 100644 --- a/cpp-skip-access-specifiers +++ b/cpp-skip-access-specifiers @@ -6,3 +6,3 @@ private: void DoSomething(); int ChangeMe; }; void DoSomething(); - int ChangeMe; + int IWasChanged; }; That first regex should match (and is negated, so it should be telling us _not_ to match "private:"). But it wouldn't if regexec() is looking at the whole buffer, and not just the length-limited line we've fed to regexec_buf(). So this is consistent again with REG_STARTEND being ignored. The correct output (compiling without ASan, or gcc-9 with Asan) looks like this: warning: regexec matched: regex: ^[ ]*[A-Za-z_][A-Za-z_0-9]*:[[:space:]]*($|/[/*]) buf: private: [...more lines that we end up not using...] warning: regexec matched: regex: ^((::[[:space:]]*)?[A-Za-z_].*)$ buf: class RIGHT : public Baseclass diff --git a/cpp-skip-access-specifiers b/cpp-skip-access-specifiers index 4d4a9db..ebd6f42 100644 --- a/cpp-skip-access-specifiers +++ b/cpp-skip-access-specifiers @@ -6,3 +6,3 @@ class RIGHT : public Baseclass void DoSomething(); - int ChangeMe; + int IWasChanged; }; So it really does seem like libasan's regex engine is ignoring REG_STARTEND. We should be able to work around it by compiling with NO_REGEX, which would use our local regexec(). But to make matters even more interesting, this isn't enough by itself. Because ASan has support from the compiler, it doesn't seem to intercept our call to regexec() at the dynamic library level. It actually recognizes when we are compiling a call to regexec() and replaces it with ASan-specific code at that point. And unlike most of our other compat code, where we might have git_mmap() or similar, the actual symbol name in the compiled compat/regex code is regexec(). So just compiling with NO_REGEX isn't enough; we still end up in libasan! We can work around that by having the preprocessor replace regexec with git_regexec (both in the callers and in the actual implementation), and we truly end up with a call to our custom regex code, even when compiling with ASan. That's probably a good thing to do anyway, as it means anybody looking at the symbols later (e.g., in a debugger) would have a better indication of which function is which. So we'll do the same for the other common regex functions (even though just regexec() is enough to fix this ASan problem). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-16msvc: accommodate for vcpkg's upgrade to OpenSSL v1.1.xJohannes Schindelin
With the upgrade, the library names changed from libeay32/ssleay32 to libcrypto/libssl. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-15built-in add -p: handle Escape sequences more efficientlyJohannes Schindelin
When `interactive.singlekey = true`, we react immediately to keystrokes, even to Escape sequences (e.g. when pressing a cursor key). The problem with Escape sequences is that we do not really know when they are done, and as a heuristic we poll standard input for half a second to make sure that we got all of it. While waiting half a second is not asking for a whole lot, it can become quite annoying over time, therefore with this patch, we read the terminal capabilities (if available) and extract known Escape sequences from there, then stop polling immediately when we detected that the user pressed a key that generated such a known sequence. This recapitulates the remaining part of b5cc003253c8 (add -i: ignore terminal escape sequences, 2011-05-17). Note: We do *not* query the terminal capabilities directly. That would either require a lot of platform-specific code, or it would require linking to a library such as ncurses. Linking to a library in the built-ins is something we try very hard to avoid (we even kicked the libcurl dependency to a non-built-in remote helper, just to shave off a tiny fraction of a second from Git's startup time). And the platform-specific code would be a maintenance nightmare. Even worse: in Git for Windows' case, we would need to query MSYS2 pseudo terminals, which `git.exe` simply cannot do (because it is intentionally *not* an MSYS2 program). To address this, we simply spawn `infocmp -L -1` and parse its output (which works even in Git for Windows, because that helper is included in the end-user facing installations). This is done only once, as in the Perl version, but it is done only when the first Escape sequence is encountered, not upon startup of `git add -i`; This saves on startup time, yet makes reacting to the first Escape sequence slightly more sluggish. But it allows us to keep the terminal-related code encapsulated in the `compat/terminal.c` file. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-15built-in add -p: handle Escape sequences in interactive.singlekey modeJohannes Schindelin
This recapitulates part of b5cc003253c8 (add -i: ignore terminal escape sequences, 2011-05-17): add -i: ignore terminal escape sequences On the author's terminal, the up-arrow input sequence is ^[[A, and thus fat-fingering an up-arrow into 'git checkout -p' is quite dangerous: git-add--interactive.perl will ignore the ^[ and [ characters and happily treat A as "discard everything". As a band-aid fix, use Term::Cap to get all terminal capabilities. Then use the heuristic that any capability value that starts with ^[ (i.e., \e in perl) must be a key input sequence. Finally, given an input that starts with ^[, read more characters until we have read a full escape sequence, then return that to the caller. We use a timeout of 0.5 seconds on the subsequent reads to avoid getting stuck if the user actually input a lone ^[. Since none of the currently recognized keys start with ^[, the net result is that the sequence as a whole will be ignored and the help displayed. Note that we leave part for later which uses "Term::Cap to get all terminal capabilities", for several reasons: 1. it is actually not really necessary, as the timeout of 0.5 seconds should be plenty sufficient to catch Escape sequences, 2. it is cleaner to keep the change to special-case Escape sequences separate from the change that reads all terminal capabilities to speed things up, and 3. in practice, relying on the terminal capabilities is a bit overrated, as the information could be incomplete, or plain wrong. For example, in this developer's tmux sessions, the terminal capabilities claim that the "cursor up" sequence is ^[M, but the actual sequence produced by the "cursor up" key is ^[[A. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-15terminal: add a new function to read a single keystrokeJohannes Schindelin
Typically, input on the command-line is line-based. It is actually not really easy to get single characters (or better put: keystrokes). We provide two implementations here: - One that handles `/dev/tty` based systems as well as native Windows. The former uses the `tcsetattr()` function to put the terminal into "raw mode", which allows us to read individual keystrokes, one by one. The latter uses `stty.exe` to do the same, falling back to direct Win32 Console access. Thanks to the refactoring leading up to this commit, this is a single function, with the platform-specific details hidden away in conditionally-compiled code blocks. - A fall-back which simply punts and reads back an entire line. Note that the function writes the keystroke into an `strbuf` rather than a `char`, in preparation for reading Escape sequences (e.g. when the user hit an arrow key). This is also required for UTF-8 sequences in case the keystroke corresponds to a non-ASCII letter. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-15terminal: accommodate Git for Windows' default terminalJohannes Schindelin
Git for Windows' Git Bash runs in MinTTY by default, which does not have a Win32 Console instance, but uses MSYS2 pseudo terminals instead. This is a problem, as Git for Windows does not want to use the MSYS2 emulation layer for Git itself, and therefore has no direct way to interact with that pseudo terminal. As a workaround, use the `stty` utility (which is included in Git for Windows, and which *is* an MSYS2 program, so it knows how to deal with the pseudo terminal). Note: If Git runs in a regular CMD or PowerShell window, there *is* a regular Win32 Console to work with. This is not a problem for the MSYS2 `stty`: it copes with this scenario just fine. Also note that we introduce support for more bits than would be necessary for a mere `disable_echo()` here, in preparation for the upcoming `enable_non_canonical()` function. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-15terminal: make the code of disable_echo() reusableJohannes Schindelin
We are about to introduce the function `enable_non_canonical()`, which shares almost the complete code with `disable_echo()`. Let's prepare for that, by refactoring out that shared code. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-02Merge branch 'js/mingw-reserved-filenames'Junio C Hamano
Forbid pathnames that the platform's filesystem cannot represent on MinGW. * js/mingw-reserved-filenames: mingw: refuse paths containing reserved names mingw: short-circuit the conversion of `/dev/null` to UTF-16
2019-12-22mingw: refuse paths containing reserved namesJohannes Schindelin
There are a couple of reserved names that cannot be file names on Windows, such as `AUX`, `NUL`, etc. For an almost complete list, see https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file If one would try to create a directory named `NUL`, it would actually "succeed", i.e. the call would return success, but nothing would be created. Worse, even adding a file extension to the reserved name does not make it a valid file name. To understand the rationale behind that behavior, see https://devblogs.microsoft.com/oldnewthing/20031022-00/?p=42073 Let's just disallow them all. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-22mingw: short-circuit the conversion of `/dev/null` to UTF-16Johannes Schindelin
In the next commit, we want to disallow accessing any path that contains any segment that is equivalent to `NUL`. In particular, we want to disallow accessing `NUL` (e.g. to prevent any repository from being checked out that contains a file called `NUL`, as that is not a valid file name on Windows). However, there are legitimate use cases within Git itself to write to the Null device. As Git is really a Linux project, it does not abstract that idea, though, but instead uses `/dev/null` to describe this intention. So let's side-step the validation _specifically_ in the case that we want to write to (or read from) `/dev/null`, via a dedicated short-cut in the code that skips the call to `validate_win32_path()`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-16Merge branch 'dd/time-reentrancy'Junio C Hamano
Avoid gmtime() and localtime() and prefer their reentrant counterparts. * dd/time-reentrancy: mingw: use {gm,local}time_s as backend for {gm,local}time_r archive-zip.c: switch to reentrant localtime_r date.c: switch to reentrant {gm,local}time_r
2019-12-10Merge branch 'js/mingw-inherit-only-std-handles'Junio C Hamano
Work around a issue where a FD that is left open when spawning a child process and is kept open in the child can interfere with the operation in the parent process on Windows. * js/mingw-inherit-only-std-handles: mingw: forbid translating ERROR_SUCCESS to an errno value mingw: do set `errno` correctly when trying to restrict handle inheritance mingw: restrict file handle inheritance only on Windows 7 and later mingw: spawned processes need to inherit only standard handles mingw: work around incorrect standard handles mingw: demonstrate that all file handles are inherited by child processes
2019-12-10Sync with Git 2.24.1Junio C Hamano
2019-12-06Sync with 2.23.1Johannes Schindelin
* maint-2.23: (44 commits) Git 2.23.1 Git 2.22.2 Git 2.21.1 mingw: sh arguments need quoting in more circumstances mingw: fix quoting of empty arguments for `sh` mingw: use MSYS2 quoting even when spawning shell scripts mingw: detect when MSYS2's sh is to be spawned more robustly t7415: drop v2.20.x-specific work-around Git 2.20.2 t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x Git 2.19.3 Git 2.18.2 Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters ...
2019-12-06Sync with 2.22.2Johannes Schindelin
* maint-2.22: (43 commits) Git 2.22.2 Git 2.21.1 mingw: sh arguments need quoting in more circumstances mingw: fix quoting of empty arguments for `sh` mingw: use MSYS2 quoting even when spawning shell scripts mingw: detect when MSYS2's sh is to be spawned more robustly t7415: drop v2.20.x-specific work-around Git 2.20.2 t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x Git 2.19.3 Git 2.18.2 Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors ...
2019-12-06Sync with 2.21.1Johannes Schindelin
* maint-2.21: (42 commits) Git 2.21.1 mingw: sh arguments need quoting in more circumstances mingw: fix quoting of empty arguments for `sh` mingw: use MSYS2 quoting even when spawning shell scripts mingw: detect when MSYS2's sh is to be spawned more robustly t7415: drop v2.20.x-specific work-around Git 2.20.2 t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x Git 2.19.3 Git 2.18.2 Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh ...
2019-12-06mingw: sh arguments need quoting in more circumstancesJohannes Schindelin
Previously, we failed to quote characters such as '*', '(' and the likes. Let's fix this. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-06mingw: fix quoting of empty arguments for `sh`Johannes Schindelin
When constructing command-lines to spawn processes, it is an unfortunate but necessary decision to quote arguments differently: MSYS2 has different dequoting rules (inherited from Cygwin) than the rest of Windows. To accommodate that, Git's Windows compatibility layer has two separate quoting helpers, one for MSYS2 (which it uses exclusively when spawning `sh`) and the other for regular Windows executables. The MSYS2 one had an unfortunate bug where a `,` somehow slipped in, instead of the `;`. As a consequence, empty arguments would not be enclosed in a pair of double quotes, but the closing double quote was skipped. Let's fix this. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-06mingw: use MSYS2 quoting even when spawning shell scriptsJohannes Schindelin
At the point where `mingw_spawn_fd()` is called, we already have a full path to the script interpreter in that scenario, and we pass it in as the executable to run, while the `argv` reflect what the script should receive as command-line. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-06mingw: detect when MSYS2's sh is to be spawned more robustlyJohannes Schindelin
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-06Sync with 2.20.2Johannes Schindelin
* maint-2.20: (36 commits) Git 2.20.2 t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x Git 2.19.3 Git 2.18.2 Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories ...
2019-12-06Sync with 2.19.3Johannes Schindelin
* maint-2.19: (34 commits) Git 2.19.3 Git 2.18.2 Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams ...
2019-12-06Sync with 2.18.2Johannes Schindelin
* maint-2.18: (33 commits) Git 2.18.2 Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up ...
2019-12-06Sync with 2.17.3Johannes Schindelin
* maint-2.17: (32 commits) Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names ...
2019-12-06Sync with 2.16.6Johannes Schindelin
* maint-2.16: (31 commits) Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names path: safeguard `.git` against NTFS Alternate Streams Accesses ...
2019-12-06Sync with 2.15.4Johannes Schindelin
* maint-2.15: (29 commits) Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names path: safeguard `.git` against NTFS Alternate Streams Accesses clone --recurse-submodules: prevent name squatting on Windows is_ntfs_dotgit(): only verify the leading segment ...
2019-12-06Sync with 2.14.6Johannes Schindelin
* maint-2.14: (28 commits) Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names path: safeguard `.git` against NTFS Alternate Streams Accesses clone --recurse-submodules: prevent name squatting on Windows is_ntfs_dotgit(): only verify the leading segment test-path-utils: offer to run a protectNTFS/protectHFS benchmark ...
2019-12-05Merge branch 'win32-accommodate-funny-drive-names'Johannes Schindelin
While the only permitted drive letters for physical drives on Windows are letters of the US-English alphabet, this restriction does not apply to virtual drives assigned via `subst <letter>: <path>`. To prevent targeted attacks against systems where "funny" drive letters such as `1` or `!` are assigned, let's handle them as regular drive letters on Windows. This fixes CVE-2019-1351. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-05Merge branch 'win32-filenames-cannot-have-trailing-spaces-or-periods'Johannes Schindelin
On Windows, filenames cannot have trailing spaces or periods, when opening such paths, they are stripped automatically. Read: you can open the file `README` via the file name `README . . .`. This ambiguity can be used in combination with other security bugs to cause e.g. remote code execution during recursive clones. This patch series fixes that. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-05mingw: handle `subst`-ed "DOS drives"Johannes Schindelin
Over a decade ago, in 25fe217b86c (Windows: Treat Windows style path names., 2008-03-05), Git was taught to handle absolute Windows paths, i.e. paths that start with a drive letter and a colon. Unbeknownst to us, while drive letters of physical drives are limited to letters of the English alphabet, there is a way to assign virtual drive letters to arbitrary directories, via the `subst` command, which is _not_ limited to English letters. It is therefore possible to have absolute Windows paths of the form `1:\what\the\hex.txt`. Even "better": pretty much arbitrary Unicode letters can also be used, e.g. `ä:\tschibät.sch`. While it can be sensibly argued that users who set up such funny drive letters really seek adverse consequences, the Windows Operating System is known to be a platform where many users are at the mercy of administrators who have their very own idea of what constitutes a reasonable setup. Therefore, let's just make sure that such funny paths are still considered absolute paths by Git, on Windows. In addition to Unicode characters, pretty much any character is a valid drive letter, as far as `subst` is concerned, even `:` and `"` or even a space character. While it is probably the opposite of smart to use them, let's safeguard `is_dos_drive_prefix()` against all of them. Note: `[::1]:repo` is a valid URL, but not a valid path on Windows. As `[` is now considered a valid drive letter, we need to be very careful to avoid misinterpreting such a string as valid local path in `url_is_local_not_ssh()`. To do that, we use the just-introduced function `is_valid_path()` (which will label the string as invalid file name because of the colon characters). This fixes CVE-2019-1351. Reported-by: Nicolas Joly <Nicolas.Joly@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-05mingw: refuse to access paths with trailing spaces or periodsJohannes Schindelin
When creating a directory on Windows whose path ends in a space or a period (or chains thereof), the Win32 API "helpfully" trims those. For example, `mkdir("abc ");` will return success, but actually create a directory called `abc` instead. This stems back to the DOS days, when all file names had exactly 8 characters plus exactly 3 characters for the file extension, and the only way to have shorter names was by padding with spaces. Sadly, this "helpful" behavior is a bit inconsistent: after a successful `mkdir("abc ");`, a `mkdir("abc /def")` will actually _fail_ (because the directory `abc ` does not actually exist). Even if it would work, we now have a serious problem because a Git repository could contain directories `abc` and `abc `, and on Windows, they would be "merged" unintentionally. As these paths are illegal on Windows, anyway, let's disallow any accesses to such paths on that Operating System. For practical reasons, this behavior is still guarded by the config setting `core.protectNTFS`: it is possible (and at least two regression tests make use of it) to create commits without involving the worktree. In such a scenario, it is of course possible -- even on Windows -- to create such file names. Among other consequences, this patch disallows submodules' paths to end in spaces on Windows (which would formerly have confused Git enough to try to write into incorrect paths, anyway). While this patch does not fix a vulnerability on its own, it prevents an attack vector that was exploited in demonstrations of a number of recently-fixed security bugs. The regression test added to `t/t7417-submodule-path-url.sh` reflects that attack vector. Note that we have to adjust the test case "prevent git~1 squatting on Windows" in `t/t7415-submodule-names.sh` because of a very subtle issue. It tries to clone two submodules whose names differ only in a trailing period character, and as a consequence their git directories differ in the same way. Previously, when Git tried to clone the second submodule, it thought that the git directory already existed (because on Windows, when you create a directory with the name `b.` it actually creates `b`), but with this patch, the first submodule's clone will fail because of the illegal name of the git directory. Therefore, when cloning the second submodule, Git will take a different code path: a fresh clone (without an existing git directory). Both code paths fail to clone the second submodule, both because the the corresponding worktree directory exists and is not empty, but the error messages are worded differently. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-05mingw: refuse to access paths with illegal charactersJohannes Schindelin
Certain characters are not admissible in file names on Windows, even if Cygwin/MSYS2 (and therefore, Git for Windows' Bash) pretend that they are, e.g. `:`, `<`, `>`, etc Let's disallow those characters explicitly in Windows builds of Git. Note: just like trailing spaces or periods, it _is_ possible on Windows to create commits adding files with such illegal characters, as long as the operation leaves the worktree untouched. To allow for that, we continue to guard `is_valid_win32_path()` behind the config setting `core.protectNTFS`, so that users _can_ continue to do that, as long as they turn the protections off via that config setting. Among other problems, this prevents Git from trying to write to an "NTFS Alternate Data Stream" (which refers to metadata stored alongside a file, under a special name: "<filename>:<stream-name>"). This fix therefore also prevents an attack vector that was exploited in demonstrations of a number of recently-fixed security bugs. Further reading on illegal characters in Win32 filenames: https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-05mingw: fix quoting of argumentsJohannes Schindelin
We need to be careful to follow proper quoting rules. For example, if an argument contains spaces, we have to quote them. Double-quotes need to be escaped. Backslashes need to be escaped, but only if they are followed by a double-quote character. We need to be _extra_ careful to consider the case where an argument ends in a backslash _and_ needs to be quoted: in this case, we append a double-quote character, i.e. the backslash now has to be escaped! The current code, however, fails to recognize that, and therefore can turn an argument that ends in a single backslash into a quoted argument that now ends in an escaped double-quote character. This allows subsequent command-line parameters to be split and part of them being mistaken for command-line options, e.g. through a maliciously-crafted submodule URL during a recursive clone. Technically, we would not need to quote _all_ arguments which end in a backslash _unless_ the argument needs to be quoted anyway. For example, `test\` would not need to be quoted, while `test \` would need to be. To keep the code simple, however, and therefore easier to reason about and ensure its correctness, we now _always_ quote an argument that ends in a backslash. This addresses CVE-2019-1350. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-02mingw: forbid translating ERROR_SUCCESS to an errno valueJohannes Schindelin
Johannes Sixt pointed out that the `err_win_to_posix()` function mishandles `ERROR_SUCCESS`: it maps it to `ENOSYS`. The only purpose of this function is to map Win32 API errors to `errno` ones, and there is actually no equivalent to `ERROR_SUCCESS`: the idea of `errno` is that it will only be set in case of an error, and left alone in case of success. Therefore, as pointed out by Junio Hamano, it is a bug to call this function when there was not even any error to map. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-02mingw: use {gm,local}time_s as backend for {gm,local}time_rDoan Tran Cong Danh
Since Windows doesn't provide gmtime_r(3) and localtime_r(3), we're providing a compat version by using non-reentrant gmtime(3) and localtime(3) as backend. Then, we copy the returned data into the buffer. By doing that, in case of failure, we will dereference a NULL pointer returned by gmtime(3), and localtime(3), and we always return a valid pointer instead of NULL. Drop the memcpy(3) by using gmtime_s(), and use localtime_s() as the backend on Windows, and make sure we will return NULL in case of failure. Cc: Johannes Sixt <j6t@kdbg.org> Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-01Merge branch 'rs/use-copy-array-in-mingw-shell-command-preparation'Junio C Hamano
Code cleanup. * rs/use-copy-array-in-mingw-shell-command-preparation: mingw: use COPY_ARRAY for copying array
2019-12-01Merge branch 'en/doc-typofix'Junio C Hamano
Docfix. * en/doc-typofix: Fix spelling errors in no-longer-updated-from-upstream modules multimail: fix a few simple spelling errors sha1dc: fix trivial comment spelling error Fix spelling errors in test commands Fix spelling errors in messages shown to users Fix spelling errors in names of tests Fix spelling errors in comments of testcases Fix spelling errors in code comments Fix spelling errors in documentation outside of Documentation/ Documentation: fix a bunch of typos, both old and new
2019-11-30mingw: do set `errno` correctly when trying to restrict handle inheritanceJohannes Schindelin
In 9a780a384de (mingw: spawned processes need to inherit only standard handles, 2019-11-22), we taught the Windows-specific part to restrict which file handles are passed on to the spawned processes. Since this logic seemed to be a bit fragile across Windows versions (we _still_ support Windows Vista in Git for Windows, for example), a fall-back was added to try spawning the process again, this time without restricting which file handles are to be inherited by the spawned process. In the common case (i.e. when the process could not be spawned for reasons _other_ than the file handle inheritance), the fall-back attempt would still fail, of course. Crucially, one thing we missed in that code path was to set `errno` appropriately. This should have been caught by t0061.2 which expected `errno` to be `ENOENT` after trying to start a process for a non-existing executable, but `errno` was set to `ENOENT` prior to the `CreateProcessW()` call: while looking for the config settings for trace2, Git tries to access `xdg_config` and `user_config` via `access_or_die()`, and as neither of those config files exists when running the test case (because in Git's test suite, `HOME` points to the test directory), the `errno` has the expected value, but for the wrong reasons. Let's fix that by making sure that `errno` is set correctly. It even appears that `errno` was set in the _wrong_ case previously: `CreateProcessW()` returns non-zero upon success, but `errno` was set only in the non-zero case. It would be nice if we could somehow fix t0061 to make sure that this does not regress again. One approach that seemed like it should work, but did not, was to set `errno` to 0 in the test helper that is used by t0061.2. However, when `mingw_spawnvpe()` wants to see whether the file in question is a script, it calls `parse_interpreter()`, which in turn tries to `open()` the file. Obviously, this call fails, and sets `errno` to `ENOENT`, deep inside the call chain started from that test helper. Instead, we force re-set `errno` at the beginning of the function `mingw_spawnve_fd()`, which _should_ be safe given that callers of that function will want to look at `errno` if -1 was returned. And if that `errno` is 0 ("No error"), regression tests like t0061.2 will kick in. Reported-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Acked-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-23mingw: restrict file handle inheritance only on Windows 7 and laterJohannes Schindelin
Turns out that it don't work so well on Vista, see https://github.com/git-for-windows/git/issues/1742 for details. According to https://devblogs.microsoft.com/oldnewthing/?p=8873, it *should* work on Windows Vista and later. But apparently there are issues on Windows Vista when pipes are involved. Given that Windows Vista is past its end of life (official support ended on April 11th, 2017), let's not spend *too* much time on this issue and just disable the file handle inheritance restriction on any Windows version earlier than Windows 7. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-23mingw: spawned processes need to inherit only standard handlesJohannes Schindelin
By default, CreateProcess() does not inherit any open file handles, unless the bInheritHandles parameter is set to TRUE. Which we do need to set because we need to pass in stdin/stdout/stderr to talk to the child processes. Sadly, this means that all file handles (unless marked via O_NOINHERIT) are inherited. This lead to problems in VFS for Git, where a long-running read-object hook is used to hydrate missing objects, and depending on the circumstances, might only be called *after* Git opened a file handle. Ideally, we would not open files without O_NOINHERIT unless *really* necessary (i.e. when we want to pass the opened file handle as standard handle into a child process), but apparently it is all-too-easy to introduce incorrect open() calls: this happened, and prevented updating a file after the read-object hook was started because the hook still held a handle on said file. Happily, there is a solution: as described in the "Old New Thing" https://blogs.msdn.microsoft.com/oldnewthing/20111216-00/?p=8873 there is a way, starting with Windows Vista, that lets us define precisely which handles should be inherited by the child process. And since we bumped the minimum Windows version for use with Git for Windows to Vista with v2.10.1 (i.e. a *long* time ago), we can use this method. So let's do exactly that. We need to make sure that the list of handles to inherit does not contain duplicates; Otherwise CreateProcessW() would fail with ERROR_INVALID_ARGUMENT. While at it, stop setting errno to ENOENT unless it really is the correct value. Also, fall back to not limiting handle inheritance under certain error conditions (e.g. on Windows 7, which is a lot stricter in what handles you can specify to limit to). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>