author Carlo Marcelo Arenas Belón <> 2019-08-28 14:54:44 (GMT)
committer Junio C Hamano <> 2019-09-09 18:50:08 (GMT)
commit ad7c543e3b0f80befd26f4115f8fec4285a018bf
tree dec38a9cdfb93d1519241a6e1caddbe5b4911143 /grep.h
parent 75b2f01a0f642b39b0f29b6218515df9b5eb798e
grep: skip UTF8 checks explicitly

18547aacf5 ("grep/pcre: support utf-8", 2016-06-25), which was released
with git 2.10, added the PCRE_UTF8 flag to PCRE1 matching, including a
call to has_non_ascii() to try to avoid breakage if there was non-UTF-8
encoded content in the haystack.

Usually PCRE is compiled with JIT support (even though it is not the
default), and therefore the codepath used includes calling pcre_jit_exec,
which skips UTF-8 validation by design (which might result in crashes or
hangs). When JIT support wasn't compiled in, we use pcre_exec instead,
with the possibility that grep might be aborted if invalid UTF-8 is found
in the haystack.

PCRE1 has provided a flag since Mar 5, 2007 that can be used to skip the
checks explicitly, so use that to make both codepaths equivalent (the
flag is ignored by pcre_jit_exec).

This fix is only implemented for PCRE1 because PCRE2 is likely to have
a better solution (without the risks) in the future.

Helped-by: Johannes Schindelin <>
Helped-by: Eric Sunshine <>
Helped-by: Ævar Arnfjörð Bjarmason <>
Suggested-by: Junio C Hamano <>
Signed-off-by: Carlo Marcelo Arenas Belón <>
Signed-off-by: Junio C Hamano <>

Diffstat (limited to 'grep.h')
1 files changed, 3 insertions, 0 deletions
diff --git a/grep.h b/grep.h
index 1875880..9c8797a 100644
--- a/grep.h
+++ b/grep.h
@@ -3,6 +3,9 @@
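 #include "color.h"
 #ifdef USE_LIBPCRE1
 #include <pcre.h>
+#ifndef PCRE_NO_UTF8_CHECK
+#define PCRE_NO_UTF8_CHECK 0
+#endif
 #ifdef PCRE_CONFIG_JIT
 #if PCRE_MAJOR >= 8 && PCRE_MINOR >= 32
 #ifndef NO_LIBPCRE1_JIT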