author: Junio C Hamano <gitster@pobox.com> 2014-06-06 18:29:38 (GMT)
committer: Junio C Hamano <gitster@pobox.com> 2014-06-06 18:29:38 (GMT)
commit: 334d40e951fa3b3961135b3183633706d976c4bd
tree: 445e33f7e58e9e7e9b30be0952b6bf493ac0931c /update_unicode.sh
parent: a0460132a740d8ff0c08dcbd54520f1b795298b9
parent: 9c94389c3ee02df891100b894c1790a524268d91
Merge branch 'tb/unicode-6.3-zero-width'

Update the logic to compute the display width needed for utf8
strings and allow us to more easily maintain the tables used in
that logic.

We may want to let the users choose if codepoints with ambiguous
widths are treated as a double or single width in a follow-up patch.

* tb/unicode-6.3-zero-width:
  utf8: make it easier to auto-update git_wcwidth()
  utf8.c: use a table for double_width

Diffstat (limited to 'update_unicode.sh')
-rwxr-xr-x update_unicode.sh 37
1 files changed, 37 insertions, 0 deletions
diff --git a/update_unicode.sh b/update_unicode.sh
new file mode 100755
index 0000000..000b937
--- /dev/null
+++ b/update_unicode.sh
@@ -0,0 +1,37 @@
+#!/bin/sh
+#See http://www.unicode.org/reports/tr44/
+#
+#Me Enclosing_Mark an enclosing combining mark
+#Mn Nonspacing_Mark a nonspacing combining mark (zero advance width)
+#Cf Format a format control character
+#
+UNICODEWIDTH_H=../unicode_width.h
+if ! test -d unicode; then
+ mkdir unicode
+fi &&
+( cd unicode &&
+ if ! test -f UnicodeData.txt; then
+ wget http://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt
+ fi &&
+ if ! test -f EastAsianWidth.txt; then
+ wget http://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt
+ fi &&
+ if ! test -d uniset; then
+ git clone https://github.com/depp/uniset.git
+ fi &&
+ (
+ cd uniset &&
+ if ! test -x uniset; then
+ autoreconf -i &&
+ ./configure --enable-warnings=-Werror CFLAGS='-O0 -ggdb'
+ fi &&
+ make
+ ) &&
+ echo "static const struct interval zero_width[] = {" >$UNICODEWIDTH_H &&
+ UNICODE_DIR=. ./uniset/uniset --32 cat:Me,Mn,Cf + U+1160..U+11FF - U+00AD |
+ grep -v plane >>$UNICODEWIDTH_H &&
+ echo "};" >>$UNICODEWIDTH_H &&
+ echo "static const struct interval double_width[] = {" >>$UNICODEWIDTH_H &&
+ UNICODE_DIR=. ./uniset/uniset --32 eaw:F,W >>$UNICODEWIDTH_H &&
+ echo "};" >>$UNICODEWIDTH_H
+)
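
The script above builds the header by truncating it with `>` and then appending each piece with `>>`, so a step that fails partway through the `&&` chain (for example `grep` exiting non-zero) can leave a partial `unicode_width.h` behind. The following is a minimal sketch of one way to avoid that, not something this commit does: `emit_table` and `write_header_atomically` are hypothetical helper names introduced here for illustration only.

```sh
#!/bin/sh
# Sketch only: these helpers are hypothetical and are not part of the
# commit above.

# Wrap interval lines read from stdin in a C array definition.
# $1 is the array name, e.g. zero_width or double_width.
emit_table () {
	printf 'static const struct interval %s[] = {\n' "$1" &&
	cat &&
	printf '};\n'
}

# Run the command given after the first argument with its stdout sent
# to a temporary file, and rename that file to "$1" only if the command
# succeeds. On failure the temporary file is removed and any existing
# "$1" is left untouched.
write_header_atomically () {
	target=$1
	shift
	tmp=$target.tmp.$$
	if "$@" >"$tmp"
	then
		mv "$tmp" "$target"
	else
		rm -f "$tmp"
		return 1
	fi
}
```

Under these assumptions, the generation step could be written as a shell function that pipes each `uniset` invocation into `emit_table zero_width` or `emit_table double_width`, and that function would then be passed to `write_header_atomically ../unicode_width.h`. Note that a plain pipeline still reports only the exit status of its last command, so a failure of `uniset` itself would need a separate check.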