path: root/gettext.c
AgeCommit message (Collapse)Author
2021-02-10Merge branch 'ab/detox-gettext-tests'Junio C Hamano
Get rid of "GETTEXT_POISON" support altogether, which may or may not be controversial. * ab/detox-gettext-tests: tests: remove uses of GIT_TEST_GETTEXT_POISON=false tests: remove support for GIT_TEST_GETTEXT_POISON ci: remove GETTEXT_POISON jobs
2021-01-21tests: remove support for GIT_TEST_GETTEXT_POISONÆvar Arnfjörð Bjarmason
This removes the ability to inject "poison" gettext() messages via the GIT_TEST_GETTEXT_POISON special test setup. I initially added this as a compile-time option in bb946bba761 (i18n: add GETTEXT_POISON to simulate unfriendly translator, 2011-02-22), and most recently modified to be toggleable at runtime in 6cdccfce1e0 (i18n: make GETTEXT_POISON a runtime option, 2018-11-08).. The reason for its removal is that the trade-off of maintaining it v.s. what it's getting us has long since flipped. When gettext was integrated in 5e9637c6297 (i18n: add infrastructure for translating Git with gettext, 2011-11-18) there was understandable concern on the Git ML that in marking messages for translation en-masse we'd inadvertently mark plumbing messages. The GETTEXT_POISON facility was a way to smoke those out via our test suite. Nowadays however we're done (or almost entirely done) with any marking of messages for translation. New messages are usually marked by their authors, who'll know whether it makes sense to translate them or not. If not any errors in marking the messages are much more likely to be spotted in review than in the the initial deluge of i18n patches in the 2011-2012 era. So let's just remove this. This leaves the test suite in a state where we still have a lot of test_i18n, C_LOCALE_OUTPUT etc. uses. Subsequent commits will remove those too. The change to t/ is a selective revert of the relevant part of f2d17068fd (i18n: rebase-interactive: mark comments of squash for translation, 2016-06-17), and the comment in t/ is from c7108bf9ed (i18n: rebase: mark messages for translation, 2012-07-25). Signed-off-by: Ævar Arnfjörð Bjarmason <> Signed-off-by: Junio C Hamano <>
2021-01-11gettext.c: remove/reword a mostly-useless commentÆvar Arnfjörð Bjarmason
Mostly remove the comment I added 5e9637c6297 (i18n: add infrastructure for translating Git with gettext, 2011-11-18). Since then we had a fix in 9c0495d23e6 (gettext.c: detect the vsnprintf bug at runtime, 2013-12-01) so we're not running with the "set back to C locale" hack on any modern system. So having more than 1/4 of the file taken up by a digression about a glibc bug that mostly doesn't happen to anyone anymore is just a needless distraction. Shorten the comment to make a brief mention of the bug, and where to find more info by looking at the git history for this now-removed comment. Signed-off-by: Ævar Arnfjörð Bjarmason <> Signed-off-by: Junio C Hamano <>
2019-07-25Merge branch 'ab/test-env'Junio C Hamano
Many GIT_TEST_* environment variables control various aspects of how our tests are run, but a few followed "non-empty is true, empty or unset is false" while others followed the usual "there are a few ways to spell true, like yes, on, etc., and also ways to spell false, like no, off, etc." convention. * ab/test-env: env--helper: mark a file-local symbol as static tests: make GIT_TEST_FAIL_PREREQS a boolean tests: replace test_tristate with "git env--helper" tests README: re-flow a previously changed paragraph tests: make GIT_TEST_GETTEXT_POISON a boolean t6040 test: stop using global "script" variable config.c: refactor die_bad_number() to not call gettext() early env--helper: new undocumented builtin wrapping git_env_*() config tests: simplify include cycle test
2019-07-03gettext: always use UTF-8 on native WindowsKarsten Blees
On native Windows, Git exclusively uses UTF-8 for console output (both with MinTTY and native Win32 Console). Gettext uses `setlocale()` to determine the output encoding for translated text, however, MSVCRT's `setlocale()` does not support UTF-8. As a result, translated text is encoded in system encoding (as per `GetAPC()`), and non-ASCII chars are mangled in console output. Side note: There is actually a code page for UTF-8: 65001. In practice, it does not work as expected at least on Windows 7, though, so we cannot use it in Git. Besides, if we overrode the code page, any process spawned from Git would inherit that code page (as opposed to the code page configured for the current user), which would quite possibly break e.g. diff or merge helpers. So we really cannot override the code page. In `init_gettext_charset()`, Git calls gettext's `bind_textdomain_codeset()` with the character set obtained via `locale_charset()`; Let's override that latter function to force the encoding to UTF-8 on native Windows. In Git for Windows' SDK, there is a `libcharset.h` and therefore we define `HAVE_LIBCHARSET_H` in the MINGW-specific section in `config.mak.uname`, therefore we need to add the override before that conditionally-compiled code block. Rather than simply defining `locale_charset()` to return the string `"UTF-8"`, though, we are careful not to break `LC_ALL=C`: the `ab/no-kwset` patch series, for example, needs to have a way to prevent Git from expecting UTF-8-encoded input. Signed-off-by: Karsten Blees <> Signed-off-by: Johannes Schindelin <> Signed-off-by: Junio C Hamano <>
2019-06-21tests: make GIT_TEST_GETTEXT_POISON a booleanÆvar Arnfjörð Bjarmason
Change the GIT_TEST_GETTEXT_POISON variable from being "non-empty?" to being a more standard boolean variable. Since it needed to be checked in both C code and shellscript (via test -n) it was one of the remaining shellscript-like variables. Now that we have "env--helper" we can change that. There's a couple of tricky edge cases that arise because we're using git_env_bool() early, and the config-reading "env--helper". If GIT_TEST_GETTEXT_POISON is set to an invalid value die_bad_number() will die, but to do so it would usually call gettext(). Let's detect the special case of GIT_TEST_GETTEXT_POISON and always emit that message in the C locale, lest we infinitely loop. As seen in the updated tests in there's also a caveat related to "env--helper" needing to read the config for trace2 purposes. Since the C_LOCALE_OUTPUT prerequisite is lazy and relies on "env--helper" we could get invalid results if we failed to read the config (e.g. because we'd loop on includes) when combined with e.g. "test_i18ngrep" wanting to check with "env--helper" if GIT_TEST_GETTEXT_POISON was true or not. I'm crossing my fingers and hoping that a test similar to the one I removed in the earlier "config tests: simplify include cycle test" change in this series won't happen again, and testing for this explicitly in "env--helper"'s own tests. This change breaks existing uses of e.g. GIT_TEST_GETTEXT_POISON=YesPlease, which we've documented in po/README and other places. As noted in [1] we might want to consider also accepting "YesPlease" in "env--helper" as a special-case. But as the lack of uproar over 6cdccfce1e ("i18n: make GETTEXT_POISON a runtime option", 2018-11-08) demonstrates the audience for this option is a really narrow set of git developers, who shouldn't have much trouble modifying their test scripts, so I think it's better to deal with that minor headache now and make all the relevant GIT_TEST_* variables boolean in the same way than carry the "YesPlease" special-case forward. 1. Signed-off-by: Ævar Arnfjörð Bjarmason <> Signed-off-by: Junio C Hamano <>
2018-11-09i18n: make GETTEXT_POISON a runtime optionÆvar Arnfjörð Bjarmason
Change the GETTEXT_POISON compile-time + runtime GIT_GETTEXT_POISON test parameter to only be a GIT_TEST_GETTEXT_POISON=<non-empty?> runtime parameter, to be consistent with other parameters documented in "Running tests with special setups" in t/README. When I added GETTEXT_POISON in bb946bba76 ("i18n: add GETTEXT_POISON to simulate unfriendly translator", 2011-02-22) I was concerned with ensuring that the _() function would get constant folded if NO_GETTEXT was defined, and likewise that GETTEXT_POISON would be compiled out unless it was defined. But as the benchmark in my [1] shows doing a one-off runtime getenv("GIT_TEST_[...]") is trivial, and since GETTEXT_POISON was originally added the GIT_TEST_* env variables have become the common idiom for turning on special test setups. So change GETTEXT_POISON to work the same way. Now the GETTEXT_POISON=YesPlease compile-time option is gone, and running the tests with GIT_TEST_GETTEXT_POISON=[YesPlease|] can be toggled on/off without recompiling. This allows for conditionally amending tests to test with/without poison, similar to what 859fdc0c3c ("commit-graph: define GIT_TEST_COMMIT_GRAPH", 2018-08-29) did for GIT_TEST_COMMIT_GRAPH. Do some of that, now we e.g. always run the test. I did enough there to remove the GETTEXT_POISON prerequisite, but its inverse C_LOCALE_OUTPUT is still around, and surely some tests using it can be converted to e.g. always set GIT_TEST_GETTEXT_POISON=. Notes on the implementation: * We still compile a dedicated GETTEXT_POISON build in Travis CI. Perhaps this should be revisited and integrated into the "linux-gcc" build, see ae59a4e44f ("travis: run tests with GIT_TEST_SPLIT_INDEX", 2018-01-07) for prior art in that area. Then again maybe not, see [2]. * We now skip a test in under GIT_TEST_GETTEXT_POISON=YesPlease that wasn't skipped before. This test relies on C locale output, but due to an edge case in how the previous implementation of GETTEXT_POISON worked (reading it from GIT-BUILD-OPTIONS) wasn't enabling poison correctly. Now it does, and needs to be skipped. * The getenv() function is not reentrant, so out of paranoia about code of the form: printf(_("%s"), getenv("some-env")); call use_gettext_poison() in our early setup in git_setup_gettext() so we populate the "poison_requested" variable in a codepath that's won't suffer from that race condition. * We error out in the Makefile if you're still saying GETTEXT_POISON=YesPlease to prompt users to change their invocation. * We should not print out poisoned messages during the test initialization itself to keep it more readable, so the test library hides the variable if set in $GIT_TEST_GETTEXT_POISON_ORIG during setup. See [3]. See also [4] for more on the motivation behind this patch, and the history of the GETTEXT_POISON facility. 1. 2. 3. 4. Signed-off-by: Ævar Arnfjörð Bjarmason <> Signed-off-by: Junio C Hamano <>
2018-05-08Merge branch 'js/runtime-prefix'Junio C Hamano
* js/runtime-prefix: Avoid multiple PREFIX definitions git_setup_gettext: plug memory leak gettext: avoid initialization if the locale dir is not present
2018-05-08Merge branch 'dj/runtime-prefix'Junio C Hamano
A build-time option has been added to allow Git to be told to refer to its associated files relative to the main binary, in the same way that has been possible on Windows for quite some time, for Linux, BSDs and Darwin. * dj/runtime-prefix: Makefile: quote $INSTLIBDIR when passing it to sed Makefile: remove unused @@PERLLIBDIR@@ substitution variable mingw/msvc: use the new-style RUNTIME_PREFIX helper exec_cmd: provide a new-style RUNTIME_PREFIX helper for Windows exec_cmd: RUNTIME_PREFIX on some POSIX systems Makefile: add Perl runtime prefix support Makefile: generate Perl header from template file
2018-04-24git_setup_gettext: plug memory leakJohannes Schindelin
The system_path() function returns a freshly-allocated string. We need to release it. Signed-off-by: Johannes Schindelin <> Signed-off-by: Junio C Hamano <>
2018-04-24gettext: avoid initialization if the locale dir is not presentJohannes Schindelin
The runtime of a simple `git.exe version` call on Windows is currently dominated by the gettext setup, adding a whopping ~150ms to the ~210ms total. Given that this cost is added to each and every git.exe invocation goes through common-main's invocation of git_setup_gettext(), and given that scripts have to call git.exe dozens, if not hundreds, of times, this is a substantial performance penalty. This is particularly pointless when considering that Git for Windows ships without localization (to keep the installer's size to a bearable ~34MB): all that time setting up gettext is for naught. To be clear, Git for Windows *needs* to be compiled with localization, for the following reasons: - to allow users to copy add-on localization in case they want it, and - to fix the nasty error message BUG: your vsnprintf is broken (returned -1) by using libgettext's override of vsnprintf() that does not share the behavior of msvcrt.dll's version of vsnprintf(). So let's be smart about it and skip setting up gettext if the locale directory is not even present. Since localization might be missing for not-yet-supported locales, this will not break anything. Signed-off-by: Johannes Schindelin <> Signed-off-by: Junio C Hamano <>
2018-04-11exec_cmd: RUNTIME_PREFIX on some POSIX systemsDan Jacques
Enable Git to resolve its own binary location using a variety of OS-specific and generic methods, including: - procfs via "/proc/self/exe" (Linux) - _NSGetExecutablePath (Darwin) - KERN_PROC_PATHNAME sysctl on BSDs. - argv0, if absolute (all, including Windows). This is used to enable RUNTIME_PREFIX support for non-Windows systems, notably Linux and Darwin. When configured with RUNTIME_PREFIX, Git will do a best-effort resolution of its executable path and automatically use this as its "exec_path" for relative helper and data lookups, unless explicitly overridden. Small incidental formatting cleanup of "exec_cmd.c". Signed-off-by: Dan Jacques <> Thanks-to: Robbie Iannucci <> Thanks-to: Junio C Hamano <> Signed-off-by: Junio C Hamano <>
2016-07-01gettext: add is_utf8_locale()Nguyễn Thái Ngọc Duy
This function returns true if git is running under an UTF-8 locale. pcre in the next patch will need this. is_encoding_utf8() is used instead of strcmp() to catch both "utf-8" and "utf8" suffixes. When built with no gettext support, we peek in several env variables to detect UTF-8. pcre library might support utf-8 even if libc is built without locale support.. The peeking code is a copy from compat/regex/regcomp.c Helped-by: Ævar Arnfjörð Bjarmason <> Signed-off-by: Nguyễn Thái Ngọc Duy <> Signed-off-by: Junio C Hamano <>
2015-06-29introduce "format" date-modeJeff King
This feeds the format directly to strftime. Besides being a little more flexible, the main advantage is that your system strftime may know more about your locale's preferred format (e.g., how to spell the days of the week). Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
2015-02-26gettext.c: move get_preferred_languages() from http.cJeff King
Calling setlocale(LC_MESSAGES, ...) directly from http.c, without including <locale.h>, was causing compilation warnings. Move the helper function to gettext.c that already includes the header and where locale-related issues are handled. Signed-off-by: Jeff King <> Signed-off-by: Junio C Hamano <>
2013-12-05gettext.c: detect the vsnprintf bug at runtimeNguyễn Thái Ngọc Duy
Bug 6530 [1] in glibc causes "git show v0.99.6~1" to fail with error "your vsnprintf is broken". The workaround avoids that, but it corrupts system error messages in non-C locales. The bug has been fixed since 2.17. We could know running glibc version with gnu_get_libc_version(). But version is not a sure way to detect the bug because downstream may back port the fix to older versions. Do a runtime test that immitates the call flow that leads to "your vsnprintf is broken". Only enable the workaround if the test fails. Tested on Gentoo Linux, glibc 2.16.0 and 2.17, amd64. [1] Signed-off-by: Nguyễn Thái Ngọc Duy <> Signed-off-by: Junio C Hamano <>
2012-09-14fetch: align per-ref summary report in UTF-8 localesNguyễn Thái Ngọc Duy
fetch does printf("%-*s", width, "foo") where "foo" can be a utf-8 string, but width is in bytes, not columns. For ASCII it's fine as one byte takes one column. For utf-8, this may result in misaligned ref summary table. Introduce gettext_width() function that returns the string length in columns (currently only supports utf-8 locales). Make the code use TRANSPORT_SUMMARY(x) where the length is compensated properly in non-English locales. Signed-off-by: Nguyễn Thái Ngọc Duy <> Signed-off-by: Junio C Hamano <>
2011-12-06i18n: add infrastructure for translating Git with gettextÆvar Arnfjörð Bjarmason
Change the skeleton implementation of i18n in Git to one that can show localized strings to users for our C, Shell and Perl programs using either GNU libintl or the Solaris gettext implementation. This new internationalization support is enabled by default. If gettext isn't available, or if Git is compiled with NO_GETTEXT=YesPlease, Git falls back on its current behavior of showing interface messages in English. When using the autoconf script we'll auto-detect if the gettext libraries are installed and act appropriately. This change is somewhat large because as well as adding a C, Shell and Perl i18n interface we're adding a lot of tests for them, and for those tests to work we need a skeleton PO file to actually test translations. A minimal Icelandic translation is included for this purpose. Icelandic includes multi-byte characters which makes it easy to test various edge cases, and it's a language I happen to understand. The rest of the commit message goes into detail about various sub-parts of this commit. = Installation Gettext .mo files will be installed and looked for in the standard $(prefix)/share/locale path. GIT_TEXTDOMAINDIR can also be set to override that, but that's only intended to be used to test Git itself. = Perl Perl code that's to be localized should use the new Git::I18n module. It imports a __ function into the caller's package by default. Instead of using the high level Locale::TextDomain interface I've opted to use the low-level (equivalent to the C interface) Locale::Messages module, which Locale::TextDomain itself uses. Locale::TextDomain does a lot of redundant work we don't need, and some of it would potentially introduce bugs. It tries to set the $TEXTDOMAIN based on package of the caller, and has its own hardcoded paths where it'll search for messages. I found it easier just to completely avoid it rather than try to circumvent its behavior. In any case, this is an issue wholly internal Git::I18N. Its guts can be changed later if that's deemed necessary. See <> for a further elaboration on this topic. = Shell Shell code that's to be localized should use the git-sh-i18n library. It's basically just a wrapper for the system's If isn't available we'll fall back on gettext(1) if it's available. The latter is available without the former on Solaris, which has its own non-GNU gettext implementation. We also need to emulate eval_gettext() there. If neither are present we'll use a dumb printf(1) fall-through wrapper. = About libcharset.h and langinfo.h We use libcharset to query the character set of the current locale if it's available. I.e. we'll use it instead of nl_langinfo if HAVE_LIBCHARSET_H is set. The GNU gettext manual recommends using langinfo.h's nl_langinfo(CODESET) to acquire the current character set, but on systems that have libcharset.h's locale_charset() using the latter is either saner, or the only option on those systems. GNU and Solaris have a nl_langinfo(CODESET), FreeBSD can use either, but MinGW and some others need to use libcharset.h's locale_charset() instead. =Credits This patch is based on work by Jeff Epler <> who did the initial Makefile / C work, and a lot of comments from the Git mailing list, including Jonathan Nieder, Jakub Narebski, Johannes Sixt, Erik Faye-Lund, Peter Krefting, Junio C Hamano, Thomas Rast and others. [jc: squashed a small Makefile fix from Ramsay] Signed-off-by: Ævar Arnfjörð Bjarmason <> Signed-off-by: Ramsay Jones <> Signed-off-by: Junio C Hamano <>
2011-03-08i18n: do not poison translations unless GIT_GETTEXT_POISON envvar is setJonathan Nieder
Tweak the GETTEXT_POISON facility so it is activated at run time instead of compile time. If the GIT_GETTEXT_POISON environment variable is set, _(msg) will result in gibberish as before; but if the GIT_GETTEXT_POISON variable is not set, it will return the message for human-readable output. So the behavior of mistranslated and untranslated git can be compared without rebuilding git in between. For simplicity we always set the GIT_GETTEXT_POISON variable in tests. This does not affect builds without the GETTEXT_POISON compile-time option set, so non-i18n git will not be slowed down. Signed-off-by: Jonathan Nieder <> Signed-off-by: Junio C Hamano <>