summaryrefslogtreecommitdiff
path: root/t/t0204-gettext-reencode-sanity.sh
diff options
context:
space:
mode:
authorÆvar Arnfjörð Bjarmason <avarab@gmail.com>2011-11-17 23:14:42 (GMT)
committerJunio C Hamano <gitster@pobox.com>2011-12-06 04:46:55 (GMT)
commit5e9637c629702e3d41ad01d95956d1835d7338e0 (patch)
tree345bc74dcb47d1054db0577c3e95caa805fb9ee7 /t/t0204-gettext-reencode-sanity.sh
parentbc1bbe0c19a6ff39522b4fa3259f34150e308e1f (diff)
downloadgit-5e9637c629702e3d41ad01d95956d1835d7338e0.zip
git-5e9637c629702e3d41ad01d95956d1835d7338e0.tar.gz
git-5e9637c629702e3d41ad01d95956d1835d7338e0.tar.bz2
i18n: add infrastructure for translating Git with gettext
Change the skeleton implementation of i18n in Git to one that can show localized strings to users for our C, Shell and Perl programs using either GNU libintl or the Solaris gettext implementation. This new internationalization support is enabled by default. If gettext isn't available, or if Git is compiled with NO_GETTEXT=YesPlease, Git falls back on its current behavior of showing interface messages in English. When using the autoconf script we'll auto-detect if the gettext libraries are installed and act appropriately. This change is somewhat large because as well as adding a C, Shell and Perl i18n interface we're adding a lot of tests for them, and for those tests to work we need a skeleton PO file to actually test translations. A minimal Icelandic translation is included for this purpose. Icelandic includes multi-byte characters which makes it easy to test various edge cases, and it's a language I happen to understand. The rest of the commit message goes into detail about various sub-parts of this commit. = Installation Gettext .mo files will be installed and looked for in the standard $(prefix)/share/locale path. GIT_TEXTDOMAINDIR can also be set to override that, but that's only intended to be used to test Git itself. = Perl Perl code that's to be localized should use the new Git::I18n module. It imports a __ function into the caller's package by default. Instead of using the high level Locale::TextDomain interface I've opted to use the low-level (equivalent to the C interface) Locale::Messages module, which Locale::TextDomain itself uses. Locale::TextDomain does a lot of redundant work we don't need, and some of it would potentially introduce bugs. It tries to set the $TEXTDOMAIN based on package of the caller, and has its own hardcoded paths where it'll search for messages. I found it easier just to completely avoid it rather than try to circumvent its behavior. In any case, this is an issue wholly internal Git::I18N. Its guts can be changed later if that's deemed necessary. See <AANLkTilYD_NyIZMyj9dHtVk-ylVBfvyxpCC7982LWnVd@mail.gmail.com> for a further elaboration on this topic. = Shell Shell code that's to be localized should use the git-sh-i18n library. It's basically just a wrapper for the system's gettext.sh. If gettext.sh isn't available we'll fall back on gettext(1) if it's available. The latter is available without the former on Solaris, which has its own non-GNU gettext implementation. We also need to emulate eval_gettext() there. If neither are present we'll use a dumb printf(1) fall-through wrapper. = About libcharset.h and langinfo.h We use libcharset to query the character set of the current locale if it's available. I.e. we'll use it instead of nl_langinfo if HAVE_LIBCHARSET_H is set. The GNU gettext manual recommends using langinfo.h's nl_langinfo(CODESET) to acquire the current character set, but on systems that have libcharset.h's locale_charset() using the latter is either saner, or the only option on those systems. GNU and Solaris have a nl_langinfo(CODESET), FreeBSD can use either, but MinGW and some others need to use libcharset.h's locale_charset() instead. =Credits This patch is based on work by Jeff Epler <jepler@unpythonic.net> who did the initial Makefile / C work, and a lot of comments from the Git mailing list, including Jonathan Nieder, Jakub Narebski, Johannes Sixt, Erik Faye-Lund, Peter Krefting, Junio C Hamano, Thomas Rast and others. [jc: squashed a small Makefile fix from Ramsay] Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 't/t0204-gettext-reencode-sanity.sh')
-rwxr-xr-xt/t0204-gettext-reencode-sanity.sh78
1 files changed, 78 insertions, 0 deletions
diff --git a/t/t0204-gettext-reencode-sanity.sh b/t/t0204-gettext-reencode-sanity.sh
new file mode 100755
index 0000000..189af90
--- /dev/null
+++ b/t/t0204-gettext-reencode-sanity.sh
@@ -0,0 +1,78 @@
+#!/bin/sh
+#
+# Copyright (c) 2010 Ævar Arnfjörð Bjarmason
+#
+
+test_description="Gettext reencoding of our *.po/*.mo files works"
+
+. ./lib-gettext.sh
+
+
+test_expect_success GETTEXT_LOCALE 'gettext: Emitting UTF-8 from our UTF-8 *.mo files / Icelandic' '
+ printf "TILRAUN: Halló Heimur!" >expect &&
+ LANGUAGE=is LC_ALL="$is_IS_locale" gettext "TEST: Hello World!" >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success GETTEXT_LOCALE 'gettext: Emitting UTF-8 from our UTF-8 *.mo files / Runes' '
+ printf "TILRAUN: ᚻᛖ ᚳᚹᚫᚦ ᚦᚫᛏ ᚻᛖ ᛒᚢᛞᛖ ᚩᚾ ᚦᚫᛗ ᛚᚪᚾᛞᛖ ᚾᚩᚱᚦᚹᛖᚪᚱᛞᚢᛗ ᚹᛁᚦ ᚦᚪ ᚹᛖᛥᚫ" >expect &&
+ LANGUAGE=is LC_ALL="$is_IS_locale" gettext "TEST: Old English Runes" >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success GETTEXT_ISO_LOCALE 'gettext: Emitting ISO-8859-1 from our UTF-8 *.mo files / Icelandic' '
+ printf "TILRAUN: Halló Heimur!" | iconv -f UTF-8 -t ISO8859-1 >expect &&
+ LANGUAGE=is LC_ALL="$is_IS_iso_locale" gettext "TEST: Hello World!" >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success GETTEXT_ISO_LOCALE 'gettext: Emitting ISO-8859-1 from our UTF-8 *.mo files / Runes' '
+ LANGUAGE=is LC_ALL="$is_IS_iso_locale" gettext "TEST: Old English Runes" >runes &&
+
+ if grep "^TEST: Old English Runes$" runes
+ then
+ say "Your system can not handle this complexity and returns the string as-is"
+ else
+ # Both Solaris and GNU libintl will return this stream of
+ # question marks, so it is s probably portable enough
+ printf "TILRAUN: ?? ???? ??? ?? ???? ?? ??? ????? ??????????? ??? ?? ????" >runes-expect &&
+ test_cmp runes-expect runes
+ fi
+'
+
+test_expect_success GETTEXT_LOCALE 'gettext: Fetching a UTF-8 msgid -> UTF-8' '
+ printf "TILRAUN: ‚einfaldar‘ og „tvöfaldar“ gæsalappir" >expect &&
+ LANGUAGE=is LC_ALL="$is_IS_locale" gettext "TEST: ‘single’ and “double” quotes" >actual &&
+ test_cmp expect actual
+'
+
+# How these quotes get transliterated depends on the gettext implementation:
+#
+# Debian: ,einfaldar' og ,,tvöfaldar" [GNU libintl]
+# FreeBSD: `einfaldar` og "tvöfaldar" [GNU libintl]
+# Solaris: ?einfaldar? og ?tvöfaldar? [Solaris libintl]
+#
+# Just make sure the contents are transliterated, and don't use grep -q
+# so that these differences are emitted under --verbose for curious
+# eyes.
+test_expect_success GETTEXT_ISO_LOCALE 'gettext: Fetching a UTF-8 msgid -> ISO-8859-1' '
+ LANGUAGE=is LC_ALL="$is_IS_iso_locale" gettext "TEST: ‘single’ and “double” quotes" >actual &&
+ grep "einfaldar" actual &&
+ grep "$(echo tvöfaldar | iconv -f UTF-8 -t ISO8859-1)" actual
+'
+
+test_expect_success GETTEXT_LOCALE 'gettext.c: git init UTF-8 -> UTF-8' '
+ printf "Bjó til tóma Git lind" >expect &&
+ LANGUAGE=is LC_ALL="$is_IS_locale" git init repo >actual &&
+ test_when_finished "rm -rf repo" &&
+ grep "^$(cat expect) " actual
+'
+
+test_expect_success GETTEXT_ISO_LOCALE 'gettext.c: git init UTF-8 -> ISO-8859-1' '
+ printf "Bjó til tóma Git lind" >expect &&
+ LANGUAGE=is LC_ALL="$is_IS_iso_locale" git init repo >actual &&
+ test_when_finished "rm -rf repo" &&
+ grep "^$(cat expect | iconv -f UTF-8 -t ISO8859-1) " actual
+'
+
+test_done