<HTML>
<HEAD>
<!-- This HTML file has been created by texi2html 1.52a
     from gettext.texi on 9 December 2003 -->

<TITLE>GNU gettext utilities - 13 Other Programming Languages</TITLE>
</HEAD>
<BODY>
Go to the <A HREF="gettext_1.html">first</A>, <A HREF="gettext_12.html">previous</A>, <A HREF="gettext_14.html">next</A>, <A HREF="gettext_22.html">last</A> section, <A HREF="gettext_toc.html">table of contents</A>.
<P><HR><P>


<H1><A NAME="SEC217" HREF="gettext_toc.html#TOC217">13 Other Programming Languages</A></H1>

<P>
|
16 |
While the presentation of <CODE>gettext</CODE> focuses mostly on C and |
17 |
implicitly applies to C++ as well, its scope is far broader than that: |
18 |
Many programming languages, scripting languages and other textual data |
19 |
like GUI resources or package descriptions can make use of the gettext |
20 |
approach. |
21 |
|
22 |
</P>
|
23 |
|
24 |
|
25 |
|
26 |
<H2><A NAME="SEC218" HREF="gettext_toc.html#TOC218">13.1 The Language Implementor's View</A></H2> |
27 |
<P>
|
28 |
<A NAME="IDX1047"></A> |
29 |
<A NAME="IDX1048"></A> |
30 |
|
31 |
</P>
|
32 |
<P>
|
33 |
All programming and scripting languages that have the notion of strings
are eligible to support <CODE>gettext</CODE>. Supporting <CODE>gettext</CODE>
means the following:
36 |
|
37 |
</P>
|
38 |
|
39 |
<OL>
|
40 |
<LI>
|
41 |
|
42 |
You should add to the language a syntax for translatable strings. In
principle, a function call of <CODE>gettext</CODE> would do, but a shorthand
syntax helps keep internationalized programs legible. For
example, in C we use the syntax <CODE>_("string")</CODE>, and in GNU awk we use
the shorthand <CODE>_"string"</CODE>.
47 |
|
48 |
<LI>
|
49 |
|
50 |
You should arrange that evaluation of such a translatable string at |
51 |
runtime calls the <CODE>gettext</CODE> function, or performs equivalent |
52 |
processing. |
53 |
|
54 |
<LI>
|
55 |
|
56 |
Similarly, you should make the functions <CODE>ngettext</CODE>, |
57 |
<CODE>dcgettext</CODE>, <CODE>dcngettext</CODE> available from within the language. |
58 |
These functions are less often used, but are nevertheless necessary for |
59 |
particular purposes: <CODE>ngettext</CODE> for correct plural handling, and |
60 |
<CODE>dcgettext</CODE> and <CODE>dcngettext</CODE> for obeying other locale |
61 |
environment variables than <CODE>LC_MESSAGES</CODE>, such as <CODE>LC_TIME</CODE> or |
62 |
<CODE>LC_MONETARY</CODE>. For these latter functions, you need to make the |
63 |
<CODE>LC_*</CODE> constants, available in the C header <CODE>&lt;locale.h&gt;</CODE>,
64 |
referenceable from within the language, usually either as enumeration |
65 |
values or as strings. |
66 |
|
67 |
<LI>
|
68 |
|
69 |
You should allow the programmer to designate a message domain, either by |
70 |
making the <CODE>textdomain</CODE> function available from within the |
71 |
language, or by introducing a magic variable called <CODE>TEXTDOMAIN</CODE>. |
72 |
Similarly, you should allow the programmer to designate where to search |
73 |
for message catalogs, by providing access to the <CODE>bindtextdomain</CODE> |
74 |
function. |
75 |
|
76 |
<LI>
|
77 |
|
78 |
You should either perform a <CODE>setlocale (LC_ALL, "")</CODE> call during |
79 |
the startup of your language runtime, or allow the programmer to do so. |
80 |
Remember that gettext will act as a no-op if the <CODE>LC_MESSAGES</CODE> and |
81 |
<CODE>LC_CTYPE</CODE> locale facets are not both set. |
82 |
|
83 |
<LI>
|
84 |
|
85 |
A programmer should have a way to extract translatable strings from a
program into a PO file. The GNU <CODE>xgettext</CODE> program is being
extended to support a wide range of programming languages. Please
contact the GNU <CODE>gettext</CODE> maintainers to help them do this. If
the string extractor is best integrated into your language's parser, GNU
<CODE>xgettext</CODE> can function as a front end to your string extractor.
91 |
|
92 |
<LI>
|
93 |
|
94 |
The language's library should have a string formatting facility where |
95 |
the arguments of a format string are denoted by a positional number or a |
96 |
name. This is needed because for some languages and some messages with |
97 |
more than one substitutable argument, the translation will need to |
98 |
output the substituted arguments in different order. See section <A HREF="gettext_3.html#SEC18">3.5 Special Comments preceding Keywords</A>. |
99 |
|
100 |
<LI>
|
101 |
|
102 |
If the language has more than one implementation, and not all of the |
103 |
implementations use <CODE>gettext</CODE>, but the programs should be portable |
104 |
across implementations, you should provide a no-i18n emulation, that |
105 |
makes the other implementations accept programs written for yours, |
106 |
without actually translating the strings. |
107 |
|
108 |
<LI>
|
109 |
|
110 |
To help the programmer in the task of marking translatable strings, |
111 |
which is usually performed using the Emacs PO mode, you are welcome to |
112 |
contact the GNU <CODE>gettext</CODE> maintainers, so they can add support for |
113 |
your language to <TT>`po-mode.el´</TT>. |
114 |
</OL>
|
115 |
|
116 |
<P>
|
117 |
On the implementation side, three approaches are possible, with |
118 |
different effects on portability and copyright: |
119 |
|
120 |
</P>
|
121 |
|
122 |
<UL>
|
123 |
<LI>
|
124 |
|
125 |
You may integrate the GNU <CODE>gettext</CODE>'s <TT>`intl/´</TT> directory in |
126 |
your package, as described in section <A HREF="gettext_12.html#SEC189">12 The Maintainer's View</A>. This allows you to |
127 |
have internationalization on all kinds of platforms. Note that when you |
128 |
then distribute your package, it legally falls under the GNU General |
129 |
Public License, and the GNU project will be glad about your contribution |
130 |
to the Free Software pool. |
131 |
|
132 |
<LI>
|
133 |
|
134 |
You may link against GNU <CODE>gettext</CODE> functions if they are found in |
135 |
the C library. For example, an autoconf test for <CODE>gettext()</CODE> and |
136 |
<CODE>ngettext()</CODE> will detect this situation. For the moment, this test |
137 |
will succeed on GNU systems and not on other platforms. No severe |
138 |
copyright restrictions apply. |
139 |
|
140 |
<LI>
|
141 |
|
142 |
You may emulate or reimplement the GNU <CODE>gettext</CODE> functionality. |
143 |
This has the advantage of full portability and no copyright |
144 |
restrictions, but also the drawback that you have to reimplement the GNU |
145 |
<CODE>gettext</CODE> features (such as the <CODE>LANGUAGE</CODE> environment |
146 |
variable, the locale aliases database, the automatic charset conversion, |
147 |
and plural handling). |
148 |
</UL>
|
149 |
|
150 |
|
151 |
|
152 |
<H2><A NAME="SEC219" HREF="gettext_toc.html#TOC219">13.2 The Programmer's View</A></H2> |
153 |
|
154 |
<P>
|
155 |
For the programmer, the general procedure is the same as for the C |
156 |
language. The Emacs PO mode supports other languages, and the GNU |
157 |
<CODE>xgettext</CODE> string extractor recognizes other languages based on the |
158 |
file extension or a command-line option. In some languages, |
159 |
<CODE>setlocale</CODE> is not needed because it is already performed by the |
160 |
underlying language runtime. |
161 |
|
162 |
</P>
|
163 |
|
164 |
|
165 |
<H2><A NAME="SEC220" HREF="gettext_toc.html#TOC220">13.3 The Translator's View</A></H2> |
166 |
|
167 |
<P>
|
168 |
The translator works exactly as in the C language case. The only |
169 |
difference is that when translating format strings, she has to be aware |
170 |
of the language's particular syntax for positional arguments in format |
171 |
strings. |
172 |
|
173 |
</P>
|
174 |
|
175 |
|
176 |
|
177 |
<H3><A NAME="SEC221" HREF="gettext_toc.html#TOC221">13.3.1 C Format Strings</A></H3> |
178 |
|
179 |
<P>
|
180 |
C format strings are described in POSIX (IEEE P1003.1 2001), section |
181 |
XSH 3 fprintf(), |
182 |
<A HREF="http://www.opengroup.org/onlinepubs/007904975/functions/fprintf.html">http://www.opengroup.org/onlinepubs/007904975/functions/fprintf.html</A>. |
183 |
See also the fprintf(3) manual page, |
184 |
<A HREF="http://www.linuxvalley.it/encyclopedia/ldp/manpage/man3/printf.3.php">http://www.linuxvalley.it/encyclopedia/ldp/manpage/man3/printf.3.php</A>, |
185 |
<A HREF="http://informatik.fh-wuerzburg.de/student/i510/man/printf.html">http://informatik.fh-wuerzburg.de/student/i510/man/printf.html</A>. |
186 |
|
187 |
</P>
|
188 |
<P>
|
189 |
Although format strings with positions that reorder arguments, such as |
190 |
|
191 |
</P>
|
192 |
|
193 |
<PRE>
|
194 |
"Only %2$d bytes free on '%1$s'." |
195 |
</PRE>
|
196 |
|
197 |
<P>
|
198 |
which is semantically equivalent to |
199 |
|
200 |
</P>
|
201 |
|
202 |
<PRE>
|
203 |
"'%s' has only %d bytes free." |
204 |
</PRE>
|
205 |
|
206 |
<P>
|
207 |
are a POSIX/XSI feature and not specified by ISO C 99, translators can rely |
208 |
on this reordering ability: On the few platforms where <CODE>printf()</CODE>, |
209 |
<CODE>fprintf()</CODE> etc. don't support this feature natively, <TT>`libintl.a´</TT> |
210 |
or <TT>`libintl.so´</TT> provides replacement functions, and GNU <CODE>&lt;libintl.h&gt;</CODE>
211 |
activates these replacement functions automatically. |
212 |
|
213 |
</P>
|
214 |
|
215 |
|
216 |
<H3><A NAME="SEC222" HREF="gettext_toc.html#TOC222">13.3.2 Objective C Format Strings</A></H3> |
217 |
|
218 |
<P>
|
219 |
Objective C format strings are like C format strings. They support an
additional format directive: "%@", which when executed consumes an argument
of type <CODE>Object *</CODE>.
222 |
|
223 |
</P>
|
224 |
|
225 |
|
226 |
<H3><A NAME="SEC223" HREF="gettext_toc.html#TOC223">13.3.3 Shell Format Strings</A></H3> |
227 |
|
228 |
<P>
|
229 |
Shell format strings, as supported by GNU gettext and the <SAMP>`envsubst´</SAMP> |
230 |
program, are strings with references to shell variables in the form |
231 |
<CODE>$<VAR>variable</VAR></CODE> or <CODE>${<VAR>variable</VAR>}</CODE>. References of the form |
232 |
<CODE>${<VAR>variable</VAR>-<VAR>default</VAR>}</CODE>, |
233 |
<CODE>${<VAR>variable</VAR>:-<VAR>default</VAR>}</CODE>, |
234 |
<CODE>${<VAR>variable</VAR>=<VAR>default</VAR>}</CODE>, |
235 |
<CODE>${<VAR>variable</VAR>:=<VAR>default</VAR>}</CODE>, |
236 |
<CODE>${<VAR>variable</VAR>+<VAR>replacement</VAR>}</CODE>, |
237 |
<CODE>${<VAR>variable</VAR>:+<VAR>replacement</VAR>}</CODE>, |
238 |
<CODE>${<VAR>variable</VAR>?<VAR>ignored</VAR>}</CODE>, |
239 |
<CODE>${<VAR>variable</VAR>:?<VAR>ignored</VAR>}</CODE>, |
240 |
that would be valid inside shell scripts, are not supported. The |
241 |
<VAR>variable</VAR> names must be nonempty, consist solely of ASCII
alphanumeric or underscore characters, and not start with a digit; otherwise such
a variable reference is ignored.
244 |
|
245 |
</P>
|
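<P>
The naming rule above can be checked with a few lines of shell. The following
is a minimal sketch, not part of GNU gettext: the function name
<CODE>is_substitutable_name</CODE> is hypothetical and introduced only for
illustration.
</P>

```shell
# is_substitutable_name NAME  (hypothetical helper, not part of gettext)
# Succeeds if NAME is nonempty, consists only of ASCII letters, digits and
# underscores, and does not start with a digit.
is_substitutable_name() (
  LC_ALL=C   # make the bracket ranges below plain ASCII ranges
  case $1 in
    '' | [0-9]* | *[!A-Za-z0-9_]*) false ;;
    *) true ;;
  esac
)

for name in program_name filecount 1st 'file-count' ''; do
  if is_substitutable_name "$name"; then
    printf '%s\n' "\$$name would be substituted"
  else
    printf '%s\n' "\$$name would be ignored"
  fi
done
```

<P>
The function body runs in a subshell so that the <CODE>LC_ALL</CODE> assignment
does not leak into the calling script.
</P>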
246 |
|
247 |
|
248 |
<H3><A NAME="SEC224" HREF="gettext_toc.html#TOC224">13.3.4 Python Format Strings</A></H3> |
249 |
|
250 |
<P>
|
251 |
Python format strings are described in |
252 |
Python Library reference / |
253 |
2. Built-in Types, Exceptions and Functions / |
254 |
2.2. Built-in Types / |
255 |
2.2.6. Sequence Types / |
256 |
2.2.6.2. String Formatting Operations. |
257 |
<A HREF="http://www.python.org/doc/2.2.1/lib/typesseq-strings.html">http://www.python.org/doc/2.2.1/lib/typesseq-strings.html</A>. |
258 |
|
259 |
</P>
|
260 |
|
261 |
|
262 |
<H3><A NAME="SEC225" HREF="gettext_toc.html#TOC225">13.3.5 Lisp Format Strings</A></H3> |
263 |
|
264 |
<P>
|
265 |
Lisp format strings are described in the Common Lisp HyperSpec, |
266 |
chapter 22.3 Formatted Output, |
267 |
<A HREF="http://www.lisp.org/HyperSpec/Body/sec_22-3.html">http://www.lisp.org/HyperSpec/Body/sec_22-3.html</A>. |
268 |
|
269 |
</P>
|
270 |
|
271 |
|
272 |
<H3><A NAME="SEC226" HREF="gettext_toc.html#TOC226">13.3.6 Emacs Lisp Format Strings</A></H3> |
273 |
|
274 |
<P>
|
275 |
Emacs Lisp format strings are documented in the Emacs Lisp reference, |
276 |
section Formatting Strings, |
277 |
<A HREF="http://www.gnu.org/manual/elisp-manual-21-2.8/html_chapter/elisp_4.html#SEC75">http://www.gnu.org/manual/elisp-manual-21-2.8/html_chapter/elisp_4.html#SEC75</A>. |
278 |
Note that as of version 21, XEmacs supports numbered argument specifications |
279 |
in format strings while FSF Emacs doesn't. |
280 |
|
281 |
</P>
|
282 |
|
283 |
|
284 |
<H3><A NAME="SEC227" HREF="gettext_toc.html#TOC227">13.3.7 librep Format Strings</A></H3> |
285 |
|
286 |
<P>
|
287 |
librep format strings are documented in the librep manual, section |
288 |
Formatted Output, |
289 |
<A HREF="http://librep.sourceforge.net/librep-manual.html#Formatted%20Output">http://librep.sourceforge.net/librep-manual.html#Formatted%20Output</A>, |
290 |
<A HREF="http://www.gwinnup.org/research/docs/librep.html#SEC122">http://www.gwinnup.org/research/docs/librep.html#SEC122</A>. |
291 |
|
292 |
</P>
|
293 |
|
294 |
|
295 |
<H3><A NAME="SEC228" HREF="gettext_toc.html#TOC228">13.3.8 Smalltalk Format Strings</A></H3> |
296 |
|
297 |
<P>
|
298 |
Smalltalk format strings are described in the GNU Smalltalk documentation, |
299 |
class <CODE>CharArray</CODE>, methods <SAMP>`bindWith:´</SAMP> and |
300 |
<SAMP>`bindWithArguments:´</SAMP>. |
301 |
<A HREF="http://www.gnu.org/software/smalltalk/gst-manual/gst_68.html#SEC238">http://www.gnu.org/software/smalltalk/gst-manual/gst_68.html#SEC238</A>. |
302 |
In summary, a directive starts with <SAMP>`%´</SAMP> and is followed by <SAMP>`%´</SAMP> |
303 |
or a nonzero digit (<SAMP>`1´</SAMP> to <SAMP>`9´</SAMP>). |
304 |
|
305 |
</P>
|
306 |
|
307 |
|
308 |
<H3><A NAME="SEC229" HREF="gettext_toc.html#TOC229">13.3.9 Java Format Strings</A></H3> |
309 |
|
310 |
<P>
|
311 |
Java format strings are described in the JDK documentation for class |
312 |
<CODE>java.text.MessageFormat</CODE>, |
313 |
<A HREF="http://java.sun.com/j2se/1.4/docs/api/java/text/MessageFormat.html">http://java.sun.com/j2se/1.4/docs/api/java/text/MessageFormat.html</A>. |
314 |
See also the ICU documentation |
315 |
<A HREF="http://oss.software.ibm.com/icu/apiref/classMessageFormat.html">http://oss.software.ibm.com/icu/apiref/classMessageFormat.html</A>. |
316 |
|
317 |
</P>
|
318 |
|
319 |
|
320 |
<H3><A NAME="SEC230" HREF="gettext_toc.html#TOC230">13.3.10 awk Format Strings</A></H3> |
321 |
|
322 |
<P>
|
323 |
awk format strings are described in the gawk documentation, section |
324 |
Printf, |
325 |
<A HREF="http://www.gnu.org/manual/gawk/html_node/Printf.html#Printf">http://www.gnu.org/manual/gawk/html_node/Printf.html#Printf</A>. |
326 |
|
327 |
</P>
|
328 |
|
329 |
|
330 |
<H3><A NAME="SEC231" HREF="gettext_toc.html#TOC231">13.3.11 Object Pascal Format Strings</A></H3> |
331 |
|
332 |
<P>
|
333 |
Where is this documented? |
334 |
|
335 |
</P>
|
336 |
|
337 |
|
338 |
<H3><A NAME="SEC232" HREF="gettext_toc.html#TOC232">13.3.12 YCP Format Strings</A></H3> |
339 |
|
340 |
<P>
|
341 |
YCP sformat strings are described in the libycp documentation |
342 |
<A HREF="file:/usr/share/doc/packages/libycp/YCP-builtins.html">file:/usr/share/doc/packages/libycp/YCP-builtins.html</A>. |
343 |
In summary, a directive starts with <SAMP>`%´</SAMP> and is followed by <SAMP>`%´</SAMP> |
344 |
or a nonzero digit (<SAMP>`1´</SAMP> to <SAMP>`9´</SAMP>). |
345 |
|
346 |
</P>
|
347 |
|
348 |
|
349 |
<H3><A NAME="SEC233" HREF="gettext_toc.html#TOC233">13.3.13 Tcl Format Strings</A></H3> |
350 |
|
351 |
<P>
|
352 |
Tcl format strings are described in the <TT>`format.n´</TT> manual page, |
353 |
<A HREF="http://www.scriptics.com/man/tcl8.3/TclCmd/format.htm">http://www.scriptics.com/man/tcl8.3/TclCmd/format.htm</A>. |
354 |
|
355 |
</P>
|
356 |
|
357 |
|
358 |
<H3><A NAME="SEC234" HREF="gettext_toc.html#TOC234">13.3.14 Perl Format Strings</A></H3> |
359 |
|
360 |
<P>
|
361 |
There are two kinds of format strings in Perl: those acceptable to the
Perl built-in function <CODE>printf</CODE>, labelled as <SAMP>`perl-format´</SAMP>,
and those acceptable to the <CODE>libintl-perl</CODE> function <CODE>__x</CODE>,
labelled as <SAMP>`perl-brace-format´</SAMP>.
365 |
|
366 |
</P>
|
367 |
<P>
|
368 |
Perl <CODE>printf</CODE> format strings are described in the <CODE>sprintf</CODE> |
369 |
section of <SAMP>`man perlfunc´</SAMP>. |
370 |
|
371 |
</P>
|
372 |
<P>
|
373 |
Perl brace format strings are described in the
<TT>`Locale::TextDomain(3pm)´</TT> manual page of the CPAN package
libintl-perl. In brief, Perl brace format strings use placeholders enclosed in
braces (<SAMP>`{´</SAMP> and <SAMP>`}´</SAMP>). Each placeholder must have the syntax
of a simple identifier.
378 |
|
379 |
</P>
|
380 |
|
381 |
|
382 |
<H3><A NAME="SEC235" HREF="gettext_toc.html#TOC235">13.3.15 PHP Format Strings</A></H3> |
383 |
|
384 |
<P>
|
385 |
PHP format strings are described in the documentation of the PHP function |
386 |
<CODE>sprintf</CODE>, in <TT>`phpdoc/manual/function.sprintf.html´</TT> or |
387 |
<A HREF="http://www.php.net/manual/en/function.sprintf.php">http://www.php.net/manual/en/function.sprintf.php</A>. |
388 |
|
389 |
</P>
|
390 |
|
391 |
|
392 |
<H3><A NAME="SEC236" HREF="gettext_toc.html#TOC236">13.3.16 GCC internal Format Strings</A></H3> |
393 |
|
394 |
<P>
|
395 |
These format strings are used inside the GCC sources. In such a format |
396 |
string, a directive starts with <SAMP>`%´</SAMP>, is optionally followed by a |
397 |
size specifier <SAMP>`l´</SAMP>, an optional flag <SAMP>`+´</SAMP>, another optional flag |
398 |
<SAMP>`#´</SAMP>, and is finished by a specifier: <SAMP>`%´</SAMP> denotes a literal |
399 |
percent sign, <SAMP>`c´</SAMP> denotes a character, <SAMP>`s´</SAMP> denotes a string, |
400 |
<SAMP>`i´</SAMP> and <SAMP>`d´</SAMP> denote an integer, <SAMP>`o´</SAMP>, <SAMP>`u´</SAMP>, <SAMP>`x´</SAMP> |
401 |
denote an unsigned integer, <SAMP>`.*s´</SAMP> denotes a string preceded by a |
402 |
width specification, <SAMP>`H´</SAMP> denotes a <SAMP>`location_t *´</SAMP> pointer, |
403 |
<SAMP>`D´</SAMP> denotes a general declaration, <SAMP>`F´</SAMP> denotes a function |
404 |
declaration, <SAMP>`T´</SAMP> denotes a type, <SAMP>`A´</SAMP> denotes a function argument, |
405 |
<SAMP>`C´</SAMP> denotes a tree code, <SAMP>`E´</SAMP> denotes an expression, <SAMP>`L´</SAMP> |
406 |
denotes a programming language, <SAMP>`O´</SAMP> denotes a binary operator, |
407 |
<SAMP>`P´</SAMP> denotes a function parameter, <SAMP>`Q´</SAMP> denotes an assignment |
408 |
operator, <SAMP>`V´</SAMP> denotes a const/volatile qualifier. |
409 |
|
410 |
</P>
|
411 |
|
412 |
|
413 |
<H3><A NAME="SEC237" HREF="gettext_toc.html#TOC237">13.3.17 Qt Format Strings</A></H3> |
414 |
|
415 |
<P>
|
416 |
Qt format strings are described in the documentation of the QString class |
417 |
<A HREF="file:/usr/lib/qt-3.0.5/doc/html/qstring.html">file:/usr/lib/qt-3.0.5/doc/html/qstring.html</A>. |
418 |
In summary, a directive consists of a <SAMP>`%´</SAMP> followed by a digit. The same |
419 |
directive cannot occur more than once in a format string. |
420 |
|
421 |
</P>
|
422 |
|
423 |
|
424 |
<H2><A NAME="SEC238" HREF="gettext_toc.html#TOC238">13.4 The Maintainer's View</A></H2> |
425 |
|
426 |
<P>
|
427 |
For the maintainer, the general procedure differs from the C language |
428 |
case in two ways. |
429 |
|
430 |
</P>
|
431 |
|
432 |
<UL>
|
433 |
<LI>
|
434 |
|
435 |
For those languages that don't use GNU gettext, the <TT>`intl/´</TT> directory |
436 |
is not needed and can be omitted. This means that the maintainer calls the |
437 |
<CODE>gettextize</CODE> program without the <SAMP>`--intl´</SAMP> option, and that he |
438 |
invokes the <CODE>AM_GNU_GETTEXT</CODE> autoconf macro via |
439 |
<SAMP>`AM_GNU_GETTEXT([external])´</SAMP>. |
440 |
|
441 |
<LI>
|
442 |
|
443 |
If only a single programming language is used, the <CODE>XGETTEXT_OPTIONS</CODE> |
444 |
variable in <TT>`po/Makevars´</TT> (see section <A HREF="gettext_12.html#SEC196">12.4.3 <TT>`Makefile´</TT> pieces in <TT>`po/´</TT></A>) should be adjusted to |
445 |
match the <CODE>xgettext</CODE> options for that particular programming language. |
446 |
If the package uses more than one programming language with <CODE>gettext</CODE> |
447 |
support, it becomes necessary to change the POT file construction rule |
448 |
in <TT>`po/Makefile.in.in´</TT>. It is recommended to make one <CODE>xgettext</CODE> |
449 |
invocation per programming language, each with the options appropriate for |
450 |
that language, and to combine the resulting files using <CODE>msgcat</CODE>. |
451 |
</UL>
|
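<P>
The recommendation of one <CODE>xgettext</CODE> invocation per language,
followed by <CODE>msgcat</CODE>, could be sketched as below. This is an
illustration under assumptions, not the rule from <TT>`po/Makefile.in.in´</TT>:
the function name <CODE>build_pot</CODE>, the flat source directory layout and
the choice of <CODE>_</CODE> as the only C keyword are all hypothetical.
</P>

```shell
# build_pot SRCDIR OUTFILE  (hypothetical helper for illustration)
# Runs xgettext once per language over SRCDIR/*.c and SRCDIR/*.sh,
# then merges the two intermediate POT files into OUTFILE with msgcat.
build_pot() {
  srcdir=$1
  outfile=$2
  tmpdir=$(mktemp -d) || return 1

  # --force-po writes a POT file even when no strings were found,
  # so that msgcat always has two inputs.
  xgettext --language=C --keyword=_ --force-po \
           --output="$tmpdir/c.pot" "$srcdir"/*.c &&
  xgettext --language=Shell --force-po \
           --output="$tmpdir/sh.pot" "$srcdir"/*.sh &&
  msgcat --output-file="$outfile" "$tmpdir/c.pot" "$tmpdir/sh.pot"
  status=$?

  rm -rf "$tmpdir"
  return "$status"
}
```

<P>
A call such as <CODE>build_pot src po/hello-sketch.pot</CODE> (both paths are
placeholders) would then produce a single POT file covering both languages.
</P>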
452 |
|
453 |
|
454 |
|
455 |
<H2><A NAME="SEC239" HREF="gettext_toc.html#TOC239">13.5 Individual Programming Languages</A></H2> |
456 |
|
457 |
|
458 |
|
459 |
<H3><A NAME="SEC240" HREF="gettext_toc.html#TOC240">13.5.1 C, C++, Objective C</A></H3> |
460 |
<P>
|
461 |
<A NAME="IDX1049"></A> |
462 |
|
463 |
</P>
|
464 |
<DL COMPACT> |
465 |
|
466 |
<DT>RPMs
|
467 |
<DD>
|
468 |
gcc, gpp, gobjc, glibc, gettext |
469 |
|
470 |
<DT>File extension
|
471 |
<DD>
|
472 |
For C: <CODE>c</CODE>, <CODE>h</CODE>. |
473 |
<BR>For C++: <CODE>C</CODE>, <CODE>c++</CODE>, <CODE>cc</CODE>, <CODE>cxx</CODE>, <CODE>cpp</CODE>, <CODE>hpp</CODE>. |
474 |
<BR>For Objective C: <CODE>m</CODE>. |
475 |
|
476 |
<DT>String syntax
|
477 |
<DD>
|
478 |
<CODE>"abc"</CODE> |
479 |
|
480 |
<DT>gettext shorthand
|
481 |
<DD>
|
482 |
<CODE>_("abc")</CODE> |
483 |
|
484 |
<DT>gettext/ngettext functions
|
485 |
<DD>
|
486 |
<CODE>gettext</CODE>, <CODE>dgettext</CODE>, <CODE>dcgettext</CODE>, <CODE>ngettext</CODE>, |
487 |
<CODE>dngettext</CODE>, <CODE>dcngettext</CODE> |
488 |
|
489 |
<DT>textdomain
|
490 |
<DD>
|
491 |
<CODE>textdomain</CODE> function |
492 |
|
493 |
<DT>bindtextdomain
|
494 |
<DD>
|
495 |
<CODE>bindtextdomain</CODE> function |
496 |
|
497 |
<DT>setlocale
|
498 |
<DD>
|
499 |
Programmer must call <CODE>setlocale (LC_ALL, "")</CODE> |
500 |
|
501 |
<DT>Prerequisite
|
502 |
<DD>
|
503 |
<CODE>#include &lt;libintl.h&gt;</CODE>
<BR><CODE>#include &lt;locale.h&gt;</CODE>
<BR><CODE>#define _(string) gettext (string)</CODE>
506 |
|
507 |
<DT>Use or emulate GNU gettext
|
508 |
<DD>
|
509 |
Use |
510 |
|
511 |
<DT>Extractor
|
512 |
<DD>
|
513 |
<CODE>xgettext -k_</CODE> |
514 |
|
515 |
<DT>Formatting with positions
|
516 |
<DD>
|
517 |
<CODE>fprintf "%2$d %1$d"</CODE> |
518 |
<BR>In C++: <CODE>autosprintf "%2$d %1$d"</CODE> |
519 |
(see section `Introduction' in <CITE>GNU autosprintf</CITE>) |
520 |
|
521 |
<DT>Portability
|
522 |
<DD>
|
523 |
autoconf (gettext.m4) and #if ENABLE_NLS |
524 |
|
525 |
<DT>po-mode marking
|
526 |
<DD>
|
527 |
yes |
528 |
</DL>
|
529 |
|
530 |
<P>
|
531 |
The following examples are available in the <TT>`examples´</TT> directory: |
532 |
<CODE>hello-c</CODE>, <CODE>hello-c-gnome</CODE>, <CODE>hello-c++</CODE>, <CODE>hello-c++-qt</CODE>, |
533 |
<CODE>hello-c++-kde</CODE>, <CODE>hello-c++-gnome</CODE>, <CODE>hello-objc</CODE>, |
534 |
<CODE>hello-objc-gnustep</CODE>, <CODE>hello-objc-gnome</CODE>. |
535 |
|
536 |
</P>
|
537 |
|
538 |
|
539 |
<H3><A NAME="SEC241" HREF="gettext_toc.html#TOC241">13.5.2 sh - Shell Script</A></H3> |
540 |
<P>
|
541 |
<A NAME="IDX1050"></A> |
542 |
|
543 |
</P>
|
544 |
<DL COMPACT> |
545 |
|
546 |
<DT>RPMs
|
547 |
<DD>
|
548 |
bash, gettext |
549 |
|
550 |
<DT>File extension
|
551 |
<DD>
|
552 |
<CODE>sh</CODE> |
553 |
|
554 |
<DT>String syntax
|
555 |
<DD>
|
556 |
<CODE>"abc"</CODE>, <CODE>'abc'</CODE>, <CODE>abc</CODE> |
557 |
|
558 |
<DT>gettext shorthand
|
559 |
<DD>
|
560 |
<CODE>"`gettext \"abc\"`"</CODE> |
561 |
|
562 |
<DT>gettext/ngettext functions
|
563 |
<DD>
|
564 |
<A NAME="IDX1051"></A> |
565 |
<A NAME="IDX1052"></A> |
566 |
<CODE>gettext</CODE>, <CODE>ngettext</CODE> programs |
567 |
<BR><CODE>eval_gettext</CODE>, <CODE>eval_ngettext</CODE> shell functions |
568 |
|
569 |
<DT>textdomain
|
570 |
<DD>
|
571 |
<A NAME="IDX1053"></A> |
572 |
environment variable <CODE>TEXTDOMAIN</CODE> |
573 |
|
574 |
<DT>bindtextdomain
|
575 |
<DD>
|
576 |
<A NAME="IDX1054"></A> |
577 |
environment variable <CODE>TEXTDOMAINDIR</CODE> |
578 |
|
579 |
<DT>setlocale
|
580 |
<DD>
|
581 |
automatic |
582 |
|
583 |
<DT>Prerequisite
|
584 |
<DD>
|
585 |
<CODE>. gettext.sh</CODE> |
586 |
|
587 |
<DT>Use or emulate GNU gettext
|
588 |
<DD>
|
589 |
use |
590 |
|
591 |
<DT>Extractor
|
592 |
<DD>
|
593 |
<CODE>xgettext</CODE> |
594 |
|
595 |
<DT>Formatting with positions
|
596 |
<DD>
|
597 |
--- |
598 |
|
599 |
<DT>Portability
|
600 |
<DD>
|
601 |
fully portable |
602 |
|
603 |
<DT>po-mode marking
|
604 |
<DD>
|
605 |
--- |
606 |
</DL>
|
607 |
|
608 |
<P>
|
609 |
An example is available in the <TT>`examples´</TT> directory: <CODE>hello-sh</CODE>. |
610 |
|
611 |
</P>
|
612 |
|
613 |
|
614 |
|
615 |
<H4><A NAME="SEC242" HREF="gettext_toc.html#TOC242">13.5.2.1 Preparing Shell Scripts for Internationalization</A></H4> |
616 |
<P>
|
617 |
<A NAME="IDX1055"></A> |
618 |
|
619 |
</P>
|
620 |
<P>
|
621 |
Preparing a shell script for internationalization is conceptually similar |
622 |
to the steps described in section <A HREF="gettext_3.html#SEC13">3 Preparing Program Sources</A>. The concrete steps for shell |
623 |
scripts are as follows. |
624 |
|
625 |
</P>
|
626 |
|
627 |
<OL>
|
628 |
<LI>
|
629 |
|
630 |
Insert the line |
631 |
|
632 |
|
633 |
<PRE>
|
634 |
. gettext.sh |
635 |
</PRE>
|
636 |
|
637 |
near the top of the script. <CODE>gettext.sh</CODE> is a shell function library |
638 |
that provides the functions |
639 |
<CODE>eval_gettext</CODE> (see section <A HREF="gettext_13.html#SEC247">13.5.2.6 Invoking the <CODE>eval_gettext</CODE> function</A>) and |
640 |
<CODE>eval_ngettext</CODE> (see section <A HREF="gettext_13.html#SEC248">13.5.2.7 Invoking the <CODE>eval_ngettext</CODE> function</A>). |
641 |
You have to ensure that <CODE>gettext.sh</CODE> can be found in the <CODE>PATH</CODE>. |
642 |
|
643 |
<LI>
|
644 |
|
645 |
Set and export the <CODE>TEXTDOMAIN</CODE> and <CODE>TEXTDOMAINDIR</CODE> environment |
646 |
variables. Usually <CODE>TEXTDOMAIN</CODE> is the package or program name, and |
647 |
<CODE>TEXTDOMAINDIR</CODE> is the absolute pathname corresponding to |
648 |
<CODE>$prefix/share/locale</CODE>, where <CODE>$prefix</CODE> is the installation location. |
649 |
|
650 |
|
651 |
<PRE>
|
652 |
TEXTDOMAIN=@PACKAGE@ |
653 |
export TEXTDOMAIN |
654 |
TEXTDOMAINDIR=@LOCALEDIR@ |
655 |
export TEXTDOMAINDIR |
656 |
</PRE>
|
657 |
|
658 |
<LI>
|
659 |
|
660 |
Prepare the strings for translation, as described in section <A HREF="gettext_3.html#SEC15">3.2 Preparing Translatable Strings</A>. |
661 |
|
662 |
<LI>
|
663 |
|
664 |
Simplify translatable strings so that they don't contain command substitution |
665 |
(<CODE>"`...`"</CODE> or <CODE>"$(...)"</CODE>), variable access with defaulting (like |
666 |
<CODE>${<VAR>variable</VAR>-<VAR>default</VAR>}</CODE>), access to positional arguments |
667 |
(like <CODE>$0</CODE>, <CODE>$1</CODE>, ...) or highly volatile shell variables (like |
668 |
<CODE>$?</CODE>). This can always be done through simple local code restructuring. |
669 |
For example, |
670 |
|
671 |
|
672 |
<PRE>
|
673 |
echo "Usage: $0 [OPTION] FILE..." |
674 |
</PRE>
|
675 |
|
676 |
becomes |
677 |
|
678 |
|
679 |
<PRE>
|
680 |
program_name=$0 |
681 |
echo "Usage: $program_name [OPTION] FILE..." |
682 |
</PRE>
|
683 |
|
684 |
Similarly, |
685 |
|
686 |
|
687 |
<PRE>
|
688 |
echo "Remaining files: `ls | wc -l`" |
689 |
</PRE>
|
690 |
|
691 |
becomes |
692 |
|
693 |
|
694 |
<PRE>
|
695 |
filecount="`ls | wc -l`" |
696 |
echo "Remaining files: $filecount" |
697 |
</PRE>
|
698 |
|
699 |
<LI>
|
700 |
|
701 |
For each translatable string, change the output command <SAMP>`echo´</SAMP> or |
702 |
<SAMP>`$echo´</SAMP> to <SAMP>`gettext´</SAMP> (if the string contains no references to |
703 |
shell variables) or to <SAMP>`eval_gettext´</SAMP> (if it refers to shell variables), |
704 |
followed by a no-argument <SAMP>`echo´</SAMP> command (to account for the terminating |
705 |
newline). Similarly, for cases with plural handling, replace a conditional |
706 |
<SAMP>`echo´</SAMP> command with an invocation of <SAMP>`ngettext´</SAMP> or |
707 |
<SAMP>`eval_ngettext´</SAMP>, followed by a no-argument <SAMP>`echo´</SAMP> command. |
708 |
</OL>
|
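<P>
Putting the five steps together, a prepared script might look like the
following sketch. Several things in it are assumptions made for illustration:
the domain name <CODE>hello-sketch</CODE>, the <CODE>/usr/local</CODE> prefix,
the function names <CODE>usage</CODE> and <CODE>report</CODE>, and above all
the fallback branch, which defines untranslated stand-ins so that the sketch
still runs where <CODE>gettext.sh</CODE> is not installed. Note that the
message is passed in single quotes, so that the variable reference reaches
<CODE>eval_gettext</CODE> unexpanded.
</P>

```shell
#!/bin/sh
# Step 1: load the helper library. The else branch is a hypothetical
# stand-in (no translation, variable expansion only) for systems where
# gettext.sh is not in PATH.
if command -v gettext.sh >/dev/null 2>&1; then
  . gettext.sh
else
  # The stand-ins use eval, so they must only see trusted, literal msgids.
  sketch_expand() {
    eval "cat <<_SKETCH_EOT_
$1
_SKETCH_EOT_
"
  }
  eval_gettext() { printf '%s' "$(sketch_expand "$1")"; }
  eval_ngettext() {
    if [ "$3" -eq 1 ]; then eval_gettext "$1"; else eval_gettext "$2"; fi
  }
fi

# Step 2: select the message domain and the catalog location.
TEXTDOMAIN=hello-sketch                 # hypothetical package name
export TEXTDOMAIN
TEXTDOMAINDIR=/usr/local/share/locale   # assumes prefix=/usr/local
export TEXTDOMAINDIR

# Step 4: $0 is copied into a named variable before use in a message.
program_name=$0

# Step 5: eval_gettext replaces echo; a bare echo supplies the newline.
usage() {
  eval_gettext 'Usage: $program_name [OPTION] FILE...'; echo
}

# Plural handling: eval_ngettext picks the form that matches the count.
report() {
  filecount=$1
  eval_ngettext 'Remaining file: $filecount' \
                'Remaining files: $filecount' "$filecount"; echo
}

usage
report 3
```

<P>
Without a catalog for the chosen domain, both functions print the original
English messages with the variables substituted.
</P>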
709 |
|
710 |
|
711 |
|
712 |
<H4><A NAME="SEC243" HREF="gettext_toc.html#TOC243">13.5.2.2 Contents of <CODE>gettext.sh</CODE></A></H4> |
713 |
|
714 |
<P>
|
715 |
<CODE>gettext.sh</CODE>, contained in the run-time package of GNU gettext, provides |
716 |
the following: |
717 |
|
718 |
</P>
|
719 |
|
720 |
<UL>
|
721 |
<LI>$echo
|
722 |
|
723 |
The variable <CODE>echo</CODE> is set to a command that outputs its first argument |
724 |
and a newline, without interpreting backslashes in the argument string. |
725 |
|
726 |
<LI>eval_gettext
|
727 |
|
728 |
See section <A HREF="gettext_13.html#SEC247">13.5.2.6 Invoking the <CODE>eval_gettext</CODE> function</A>. |
729 |
|
730 |
<LI>eval_ngettext
|
731 |
|
732 |
See section <A HREF="gettext_13.html#SEC248">13.5.2.7 Invoking the <CODE>eval_ngettext</CODE> function</A>. |
733 |
</UL>
|
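<P>
The point of <CODE>$echo</CODE> is that the plain <CODE>echo</CODE> of some
shells interprets backslashes. A minimal sketch of the behaviour described
above follows; the assignment is only a stand-in chosen so that the example
runs without sourcing <CODE>gettext.sh</CODE>, and the value the real library
selects may differ from shell to shell.
</P>

```shell
# Stand-in for the variable that gettext.sh would set (an assumption made
# for this sketch): print the first argument and a newline, verbatim.
echo='printf %s\n'

# Backslashes in the argument reach the output unchanged.
$echo 'Path: C:\new\table'
```

<P>
The unquoted expansion of <CODE>$echo</CODE> is deliberate: the shell splits
it into the command name and its format argument.
</P>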
734 |
|
735 |
|
736 |
|
737 |
<H4><A NAME="SEC244" HREF="gettext_toc.html#TOC244">13.5.2.3 Invoking the <CODE>gettext</CODE> program</A></H4> |
738 |
|
739 |
<P>
|
740 |
<A NAME="IDX1056"></A> |
741 |
<A NAME="IDX1057"></A> |
742 |
|
743 |
<PRE>
|
744 |
gettext [<VAR>option</VAR>] [[<VAR>textdomain</VAR>] <VAR>msgid</VAR>] |
745 |
gettext [<VAR>option</VAR>] -s [<VAR>msgid</VAR>]... |
746 |
</PRE>
|
747 |
|
748 |
<P>
|
749 |
<A NAME="IDX1058"></A> |
750 |
The <CODE>gettext</CODE> program displays the native language translation of a |
751 |
textual message. |
752 |
|
753 |
</P>
|
754 |
<P>
|
755 |
<STRONG>Arguments</STRONG> |
756 |
|
757 |
</P>
|
758 |
<DL COMPACT> |
759 |
|
760 |
<DT><SAMP>`-d <VAR>textdomain</VAR>´</SAMP> |
761 |
<DD>
|
762 |
<DT><SAMP>`--domain=<VAR>textdomain</VAR>´</SAMP> |
763 |
<DD>
|
764 |
<A NAME="IDX1059"></A> |
765 |
<A NAME="IDX1060"></A> |
766 |
Retrieve translated messages from <VAR>textdomain</VAR>. Usually a <VAR>textdomain</VAR> |
767 |
corresponds to a package, a program, or a module of a program. |
768 |
|
769 |
<DT><SAMP>`-e´</SAMP> |
770 |
<DD>
|
771 |
<A NAME="IDX1061"></A> |
772 |
Enable expansion of some escape sequences. This option is for compatibility
with the <SAMP>`echo´</SAMP> program or shell built-in. The escape sequences
<SAMP>`\b´</SAMP>, <SAMP>`\c´</SAMP>, <SAMP>`\f´</SAMP>, <SAMP>`\n´</SAMP>, <SAMP>`\r´</SAMP>, <SAMP>`\t´</SAMP>, <SAMP>`\v´</SAMP>,
<SAMP>`\\´</SAMP>, and <SAMP>`\´</SAMP> followed by one to three octal digits, are interpreted
as the <SAMP>`echo´</SAMP> program interprets them.
777 |
|
778 |
<DT><SAMP>`-E´</SAMP> |
779 |
<DD>
|
780 |
<A NAME="IDX1062"></A> |
781 |
This option is only for compatibility with the <SAMP>`echo´</SAMP> program or shell |
782 |
built-in. It has no effect. |
783 |
|
784 |
<DT><SAMP>`-h´</SAMP> |
785 |
<DD>
|
786 |
<DT><SAMP>`--help´</SAMP> |
787 |
<DD>
|
788 |
<A NAME="IDX1063"></A> |
789 |
<A NAME="IDX1064"></A> |
790 |
Display this help and exit. |
791 |
|
792 |
<DT><SAMP>`-n´</SAMP> |
793 |
<DD>
|
794 |
<A NAME="IDX1065"></A> |
795 |
Suppress the trailing newline. This option only has an effect together with
<CODE>-s</CODE>; in that mode <CODE>gettext</CODE> otherwise adds a newline to
the output.
797 |
|
798 |
<DT><SAMP>`-V´</SAMP> |
799 |
<DD>
|
800 |
<DT><SAMP>`--version´</SAMP> |
801 |
<DD>
|
802 |
<A NAME="IDX1066"></A> |
803 |
<A NAME="IDX1067"></A> |
804 |
Output version information and exit. |
805 |
|
806 |
<DT><SAMP>`[<VAR>textdomain</VAR>] <VAR>msgid</VAR>´</SAMP> |
807 |
<DD>
|
808 |
Retrieve translated message corresponding to <VAR>msgid</VAR> from <VAR>textdomain</VAR>. |
809 |
|
810 |
</DL>
|
811 |
|
812 |
<P>
|
813 |
If the <VAR>textdomain</VAR> parameter is not given, the domain is determined from |
814 |
the environment variable <CODE>TEXTDOMAIN</CODE>. If the message catalog is not |
815 |
found in the regular directory, another location can be specified with the |
816 |
environment variable <CODE>TEXTDOMAINDIR</CODE>. |
817 |
|
818 |
</P>
|
819 |
<P>
|
820 |
When used with the <CODE>-s</CODE> option, the program behaves like the <SAMP>`echo´</SAMP>
command, except that it does not simply copy its arguments to standard output:
those messages found in the selected catalog are translated.
823 |
|
824 |
</P>
|
825 |
|
826 |
|
827 |
<H4><A NAME="SEC245" HREF="gettext_toc.html#TOC245">13.5.2.4 Invoking the <CODE>ngettext</CODE> program</A></H4> |
828 |
|
829 |
<P>
|
830 |
<A NAME="IDX1068"></A> |
831 |
<A NAME="IDX1069"></A> |
832 |
|
833 |
<PRE>
|
834 |
ngettext [<VAR>option</VAR>] [<VAR>textdomain</VAR>] <VAR>msgid</VAR> <VAR>msgid-plural</VAR> <VAR>count</VAR> |
835 |
</PRE>
|
836 |
|
837 |
<P>
|
838 |
<A NAME="IDX1070"></A> |
839 |
The <CODE>ngettext</CODE> program displays the native language translation of a |
840 |
textual message whose grammatical form depends on a number. |
841 |
|
842 |
</P>
|
843 |
<P>
|
844 |
<STRONG>Arguments</STRONG> |
845 |
|
846 |
</P>
|
847 |
<DL COMPACT> |
848 |
|
849 |
<DT><SAMP>`-d <VAR>textdomain</VAR>´</SAMP> |
850 |
<DD>
|
851 |
<DT><SAMP>`--domain=<VAR>textdomain</VAR>´</SAMP> |
852 |
<DD>
|
853 |
<A NAME="IDX1071"></A> |
854 |
<A NAME="IDX1072"></A> |
855 |
Retrieve translated messages from <VAR>textdomain</VAR>. Usually a <VAR>textdomain</VAR> |
856 |
corresponds to a package, a program, or a module of a program. |
857 |
|
858 |
<DT><SAMP>`-e´</SAMP> |
859 |
<DD>
|
860 |
<A NAME="IDX1073"></A> |
861 |
Enable expansion of some escape sequences. This option is for compatibility |
862 |
with the <SAMP>`gettext´</SAMP> program. The escape sequences |
863 |
<SAMP>`\b´</SAMP>, <SAMP>`\c´</SAMP>, <SAMP>`\f´</SAMP>, <SAMP>`\n´</SAMP>, <SAMP>`\r´</SAMP>, <SAMP>`\t´</SAMP>, <SAMP>`\v´</SAMP>, |
864 |
<SAMP>`\\´</SAMP>, and <SAMP>`\´</SAMP> followed by one to three octal digits, are interpreted
the same way the <SAMP>`echo´</SAMP> program interprets them.
866 |
|
867 |
<DT><SAMP>`-E´</SAMP> |
868 |
<DD>
|
869 |
<A NAME="IDX1074"></A> |
870 |
This option is only for compatibility with the <SAMP>`gettext´</SAMP> program. It has |
871 |
no effect. |
872 |
|
873 |
<DT><SAMP>`-h´</SAMP> |
874 |
<DD>
|
875 |
<DT><SAMP>`--help´</SAMP> |
876 |
<DD>
|
877 |
<A NAME="IDX1075"></A> |
878 |
<A NAME="IDX1076"></A> |
879 |
Display this help and exit. |
880 |
|
881 |
<DT><SAMP>`-V´</SAMP> |
882 |
<DD>
|
883 |
<DT><SAMP>`--version´</SAMP> |
884 |
<DD>
|
885 |
<A NAME="IDX1077"></A> |
886 |
<A NAME="IDX1078"></A> |
887 |
Output version information and exit. |
888 |
|
889 |
<DT><SAMP>`<VAR>textdomain</VAR>´</SAMP> |
890 |
<DD>
|
891 |
Retrieve translated message from <VAR>textdomain</VAR>. |
892 |
|
893 |
<DT><SAMP>`<VAR>msgid</VAR> <VAR>msgid-plural</VAR>´</SAMP> |
894 |
<DD>
|
895 |
Translate <VAR>msgid</VAR> (English singular) / <VAR>msgid-plural</VAR> (English plural). |
896 |
|
897 |
<DT><SAMP>`<VAR>count</VAR>´</SAMP> |
898 |
<DD>
|
899 |
Choose singular/plural form based on this value. |
900 |
|
901 |
</DL>
|
902 |
|
903 |
<P>
|
904 |
If the <VAR>textdomain</VAR> parameter is not given, the domain is determined from |
905 |
the environment variable <CODE>TEXTDOMAIN</CODE>. If the message catalog is not |
906 |
found in the regular directory, another location can be specified with the |
907 |
environment variable <CODE>TEXTDOMAINDIR</CODE>. |
908 |
|
909 |
</P>
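<P>
As a rough illustration of how <VAR>count</VAR> selects between <VAR>msgid</VAR> and
<VAR>msgid-plural</VAR>, here is a minimal sketch that uses Python's standard
<CODE>gettext</CODE> module rather than the <CODE>ngettext</CODE> program itself. The
domain name <CODE>example-domain</CODE>, the directory <CODE>/nonexistent/locale</CODE>
and the helper <CODE>removed_files_message</CODE> are assumptions made for this example.
Because no catalog exists in that directory, the lookup falls back to the English
rule: the singular form for a count of 1, the plural form otherwise.
</P>

```python
import gettext

# Hypothetical domain and catalog directory, standing in for the
# -d textdomain argument and the TEXTDOMAINDIR environment variable.
DOMAIN = "example-domain"
gettext.bindtextdomain(DOMAIN, "/nonexistent/locale")


def removed_files_message(count):
    """Pick the singular or plural template for count, then fill it in."""
    template = gettext.dngettext(
        DOMAIN, "%d file removed", "%d files removed", count
    )
    return template % count


if __name__ == "__main__":
    for n in (0, 1, 5):
        print(removed_files_message(n))
```

<P>
When a catalog for the user's language is found, the plural rule stored in that
catalog is used instead, so languages with more than two forms are handled too.
</P>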
|
910 |
|
911 |
|
912 |
<H4><A NAME="SEC246" HREF="gettext_toc.html#TOC246">13.5.2.5 Invoking the <CODE>envsubst</CODE> program</A></H4> |
913 |
|
914 |
<P>
|
915 |
<A NAME="IDX1079"></A> |
916 |
<A NAME="IDX1080"></A> |
917 |
|
918 |
<PRE>
|
919 |
envsubst [<VAR>option</VAR>] [<VAR>shell-format</VAR>] |
920 |
</PRE>
|
921 |
|
922 |
<P>
|
923 |
<A NAME="IDX1081"></A> |
924 |
<A NAME="IDX1082"></A> |
925 |
<A NAME="IDX1083"></A> |
926 |
The <CODE>envsubst</CODE> program substitutes the values of environment variables. |
927 |
|
928 |
</P>
|
929 |
<P>
|
930 |
<STRONG>Operation mode</STRONG> |
931 |
|
932 |
</P>
|
933 |
<DL COMPACT> |
934 |
|
935 |
<DT><SAMP>`-v´</SAMP> |
936 |
<DD>
|
937 |
<DT><SAMP>`--variables´</SAMP> |
938 |
<DD>
|
939 |
<A NAME="IDX1084"></A> |
940 |
<A NAME="IDX1085"></A> |
941 |
Output the variables occurring in <VAR>shell-format</VAR>. |
942 |
|
943 |
</DL>
|
944 |
|
945 |
<P>
|
946 |
<STRONG>Informative output</STRONG> |
947 |
|
948 |
</P>
|
949 |
<DL COMPACT> |
950 |
|
951 |
<DT><SAMP>`-h´</SAMP> |
952 |
<DD>
|
953 |
<DT><SAMP>`--help´</SAMP> |
954 |
<DD>
|
955 |
<A NAME="IDX1086"></A> |
956 |
<A NAME="IDX1087"></A> |
957 |
Display this help and exit. |
958 |
|
959 |
<DT><SAMP>`-V´</SAMP> |
960 |
<DD>
|
961 |
<DT><SAMP>`--version´</SAMP> |
962 |
<DD>
|
963 |
<A NAME="IDX1088"></A> |
964 |
<A NAME="IDX1089"></A> |
965 |
Output version information and exit. |
966 |
|
967 |
</DL>
|
968 |
|
969 |
<P>
|
970 |
In normal operation mode, standard input is copied to standard output, |
971 |
with references to environment variables of the form <CODE>$VARIABLE</CODE> or |
972 |
<CODE>${VARIABLE}</CODE> being replaced with the corresponding values. If a |
973 |
<VAR>shell-format</VAR> is given, only those environment variables that are |
974 |
referenced in <VAR>shell-format</VAR> are substituted; otherwise all environment
variable references occurring in standard input are substituted.
976 |
|
977 |
</P>
|
978 |
<P>
|
979 |
These substitutions are a subset of the substitutions that a shell performs |
980 |
on unquoted and double-quoted strings. Other kinds of substitutions done |
981 |
by a shell, such as <CODE>${<VAR>variable</VAR>-<VAR>default</VAR>}</CODE> or |
982 |
<CODE>$(<VAR>command-list</VAR>)</CODE> or <CODE>`<VAR>command-list</VAR>`</CODE>, are not performed |
983 |
by the <CODE>envsubst</CODE> program, for security reasons.
984 |
|
985 |
</P>
|
986 |
<P>
|
987 |
When <CODE>--variables</CODE> is used, standard input is ignored, and the output |
988 |
consists of the environment variables that are referenced in |
989 |
<VAR>shell-format</VAR>, one per line. |
990 |
|
991 |
</P>
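<P>
The substitution rules described above can be approximated in a few lines of
Python. This is only a sketch of the documented behaviour, not the code of the
<CODE>envsubst</CODE> program: the function names <CODE>envsubst</CODE> and
<CODE>variables</CODE>, the <CODE>env</CODE> parameter, the replacement of unset
variables by the empty string and the order in which <CODE>variables</CODE> lists
names are all assumptions of this example.
</P>

```python
import os
import re

# Matches $VARIABLE or ${VARIABLE}. Other shell constructs such as
# ${VAR-default}, $(command) or `command` deliberately do not match.
_REFERENCE = re.compile(
    r"\$(?:\{([A-Za-z_][A-Za-z0-9_]*)\}|([A-Za-z_][A-Za-z0-9_]*))"
)


def variables(shell_format):
    """List the variable names referenced in shell_format, in order of first use."""
    names = []
    for braced, bare in _REFERENCE.findall(shell_format):
        name = braced or bare
        if name not in names:
            names.append(name)
    return names


def envsubst(text, shell_format=None, env=None):
    """Substitute variable references in text, optionally limited to shell_format."""
    env = os.environ if env is None else env
    allowed = None if shell_format is None else set(variables(shell_format))

    def replace(match):
        name = match.group(1) or match.group(2)
        if allowed is not None and name not in allowed:
            return match.group(0)  # not named in shell_format: left untouched
        return env.get(name, "")   # assumption: unset variables become empty

    return _REFERENCE.sub(replace, text)


if __name__ == "__main__":
    sample_env = {"USER": "alice", "HOME": "/home/alice"}
    print(envsubst("Hi $USER, your home is ${HOME}", env=sample_env))
    print(envsubst("Hi $USER, your home is ${HOME}", "$USER", sample_env))
    print(variables("$USER ${HOME}"))
```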
|
992 |
|
993 |
|
994 |
<H4><A NAME="SEC247" HREF="gettext_toc.html#TOC247">13.5.2.6 Invoking the <CODE>eval_gettext</CODE> function</A></H4> |
995 |
|
996 |
<P>
|
997 |
<A NAME="IDX1090"></A> |
998 |
|
999 |
<PRE>
|
1000 |
eval_gettext <VAR>msgid</VAR> |
1001 |
</PRE>
|
1002 |
|
1003 |
<P>
|
1004 |
<A NAME="IDX1091"></A> |
1005 |
This function outputs the native language translation of a textual message, |
1006 |
performing dollar-substitution on the result. Note that only shell variables |
1007 |
mentioned in <VAR>msgid</VAR> will be dollar-substituted in the result. |
1008 |
|
1009 |
</P>
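<P>
The restriction to variables mentioned in <VAR>msgid</VAR> matters because the
translation is not under the programmer's control. The following Python sketch
illustrates the idea only; it is not how the shell function is implemented. The
name <CODE>eval_gettext_sketch</CODE>, the dictionary used as a catalog and the
sample messages are hypothetical.
</P>

```python
import re

_REFERENCE = re.compile(
    r"\$(?:\{([A-Za-z_][A-Za-z0-9_]*)\}|([A-Za-z_][A-Za-z0-9_]*))"
)


def eval_gettext_sketch(msgid, catalog, shell_vars):
    """Translate msgid, then expand only the variables that msgid itself names."""
    allowed = {braced or bare for braced, bare in _REFERENCE.findall(msgid)}
    translated = catalog.get(msgid, msgid)  # untranslated messages pass through

    def replace(match):
        name = match.group(1) or match.group(2)
        if name not in allowed:
            return match.group(0)  # introduced by the translation: not expanded
        return shell_vars.get(name, "")

    return _REFERENCE.sub(replace, translated)


# Hypothetical catalog whose translation mentions an extra variable.
catalog = {"Hello, $USER": "Bonjour, $USER (PATH=$PATH)"}
result = eval_gettext_sketch(
    "Hello, $USER", catalog, {"USER": "alice", "PATH": "/bin"}
)

if __name__ == "__main__":
    print(result)
```

<P>
In this sketch <CODE>$USER</CODE> is expanded because it occurs in the original
message, while <CODE>$PATH</CODE>, which only the translation mentions, is left
as literal text.
</P>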
|
1010 |
|
1011 |
|
1012 |
<H4><A NAME="SEC248" HREF="gettext_toc.html#TOC248">13.5.2.7 Invoking the <CODE>eval_ngettext</CODE> function</A></H4> |
1013 |
|
1014 |
<P>
|
1015 |
<A NAME="IDX1092"></A> |
1016 |
|
1017 |
<PRE>
|
1018 |
eval_ngettext <VAR>msgid</VAR> <VAR>msgid-plural</VAR> <VAR>count</VAR> |
1019 |
</PRE>
|
1020 |
|
1021 |
<P>
|
1022 |
<A NAME="IDX1093"></A> |
1023 |
This function outputs the native language translation of a textual message |
1024 |
whose grammatical form depends on a number, performing dollar-substitution |
1025 |
on the result. Note that only shell variables mentioned in <VAR>msgid</VAR> or |
1026 |
<VAR>msgid-plural</VAR> will be dollar-substituted in the result. |
1027 |
|
1028 |
</P>
|
1029 |
|
1030 |
|
1031 |
<H3><A NAME="SEC249" HREF="gettext_toc.html#TOC249">13.5.3 bash - Bourne-Again Shell Script</A></H3> |
1032 |
<P>
|
1033 |
<A NAME="IDX1094"></A> |
1034 |
|
1035 |
</P>
|
1036 |
<P>
|
1037 |
GNU <CODE>bash</CODE> 2.0 or newer has a special shorthand for translating a |
1038 |
string and substituting variable values in it: <CODE>$"msgid"</CODE>. But |
1039 |
the use of this construct is <STRONG>discouraged</STRONG>, due to the security |
1040 |
holes it opens and due to its portability problems. |
1041 |
|
1042 |
</P>
|
1043 |
<P>
|
1044 |
The security holes of <CODE>$"..."</CODE> come from the fact that, after looking up
the translation of the string, <CODE>bash</CODE> processes it as it processes
any double-quoted string: it performs dollar and backquote expansion, as <SAMP>`eval´</SAMP>
does. This can lead to unintended command execution in two ways:
1048 |
|
1049 |
</P>
|
1050 |
|
1051 |
<OL>
|
1052 |
<LI>
|
1053 |
|
1054 |
In a locale whose encoding is one of BIG5, BIG5-HKSCS, GBK, GB18030, SHIFT_JIS, |
1055 |
JOHAB, some double-byte characters have a second byte whose value is |
1056 |
<CODE>0x60</CODE>. For example, the byte sequence <CODE>\xe0\x60</CODE> is a single |
1057 |
character in these locales. Many versions of <CODE>bash</CODE> (all versions |
1058 |
up to bash-2.05, and newer versions on platforms without the <CODE>mbsrtowcs()</CODE>
function) don't know about character boundaries and see a backquote character
where there is only the second byte of a double-byte character. Thus <CODE>bash</CODE> can start
executing part of the translation as a command list. This situation can occur
1062 |
even without the translator being aware of it: if the translator provides |
1063 |
translations in the UTF-8 encoding, it is the <CODE>gettext()</CODE> function which |
1064 |
will, during its conversion from the translator's encoding to the user's |
1065 |
locale's encoding, produce the dangerous <CODE>\x60</CODE> bytes. |
1066 |
|
1067 |
<LI>
|
1068 |
|
1069 |
A translator could - intentionally or inadvertently - use backquotes
1070 |
<CODE>"`...`"</CODE> or dollar-parentheses <CODE>"$(...)"</CODE> in her translations. |
1071 |
The enclosed strings would be executed as command lists by the shell. |
1072 |
</OL>
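<P>
The first problem can be made concrete with a short Python sketch that looks for
characters whose BIG5 encoding ends in the byte <CODE>0x60</CODE>, the ASCII
backquote. The function name <CODE>chars_with_backquote_byte</CODE> and the choice
to scan the code point range U+4E00 to U+9FFF are assumptions of this example;
the same scan could be run with other encodings from the list above.
</P>

```python
def chars_with_backquote_byte(encoding="big5", start=0x4E00, stop=0x9FFF):
    """Return (character, encoded bytes) pairs whose second byte is 0x60."""
    hits = []
    for code_point in range(start, stop + 1):
        ch = chr(code_point)
        try:
            raw = ch.encode(encoding)
        except UnicodeEncodeError:
            continue  # this character does not exist in the encoding
        if len(raw) == 2 and raw[1] == 0x60:
            hits.append((ch, raw))
    return hits


if __name__ == "__main__":
    hits = chars_with_backquote_byte()
    print(len(hits), "characters contain a 0x60 byte in BIG5")
    for ch, raw in hits[:5]:
        # The same character is harmless in UTF-8: no byte equals 0x60 there.
        print(repr(raw), repr(ch.encode("utf-8")))
```

<P>
A program that scans such text byte by byte, without regard to character
boundaries, would take each of these second bytes for a backquote.
</P>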
|
1073 |
|
1074 |
<P>
|
1075 |
The portability problem is that <CODE>bash</CODE> must be built with |
1076 |
internationalization support; this is normally not the case on systems |
1077 |
that don't have the <CODE>gettext()</CODE> function in libc. |
1078 |
|
1079 |
</P>
|
1080 |
|
1081 |
|
1082 |
<H3><A NAME="SEC250" HREF="gettext_toc.html#TOC250">13.5.4 Python</A></H3> |
1083 |
<P>
|
1084 |
<A NAME="IDX1095"></A> |
1085 |
|
1086 |
</P>
|
1087 |
<DL COMPACT> |
1088 |
|
1089 |
<DT>RPMs
|
1090 |
<DD>
|
1091 |
python |
1092 |
|
1093 |
<DT>File extension
|
1094 |
<DD>
|
1095 |
<CODE>py</CODE> |
1096 |
|
1097 |
<DT>String syntax
|
1098 |
<DD>
|
1099 |
<CODE>'abc'</CODE>, <CODE>u'abc'</CODE>, <CODE>r'abc'</CODE>, <CODE>ur'abc'</CODE>, |
1100 |
<BR><CODE>"abc"</CODE>, <CODE>u"abc"</CODE>, <CODE>r"abc"</CODE>, <CODE>ur"abc"</CODE>, |
1101 |
<BR><CODE>'''abc'''</CODE>, <CODE>u'''abc'''</CODE>, <CODE>r'''abc'''</CODE>, <CODE>ur'''abc'''</CODE>,
1102 |
<BR><CODE>"""abc"""</CODE>, <CODE>u"""abc"""</CODE>, <CODE>r"""abc"""</CODE>, <CODE>ur"""abc"""</CODE> |
1103 |
|
1104 |
<DT>gettext shorthand
|
1105 |
<DD>
|
1106 |
<CODE>_('abc')</CODE> etc. |
1107 |
|
1108 |
<DT>gettext/ngettext functions
|
1109 |
<DD>
|
1110 |
<CODE>gettext.gettext</CODE>, <CODE>gettext.dgettext</CODE>, |
1111 |
<CODE>gettext.ngettext</CODE>, <CODE>gettext.dngettext</CODE>, |
1112 |
also <CODE>ugettext</CODE>, <CODE>ungettext</CODE> |
1113 |
|
1114 |
<DT>textdomain
|
1115 |
<DD>
|
1116 |
<CODE>gettext.textdomain</CODE> function, or |
1117 |
<CODE>gettext.install(<VAR>domain</VAR>)</CODE> function |
1118 |
|
1119 |
<DT>bindtextdomain
|
1120 |
<DD>
|
1121 |
<CODE>gettext.bindtextdomain</CODE> function, or |
1122 |
<CODE>gettext.install(<VAR>domain</VAR>,<VAR>localedir</VAR>)</CODE> function |
1123 |
|
1124 |
<DT>setlocale
|
1125 |
<DD>
|
1126 |
not used by the gettext emulation |
1127 |
|
1128 |
<DT>Prerequisite
|
1129 |
<DD>
|
1130 |
<CODE>import gettext</CODE> |
1131 |
|
1132 |
<DT>Use or emulate GNU gettext
|
1133 |
<DD>
|
1134 |
emulate. Bug: uses only the first found .mo file, not all of them |
1135 |
|
1136 |
<DT>Extractor
|
1137 |
<DD>
|
1138 |
<CODE>xgettext</CODE> |
1139 |
|
1140 |
<DT>Formatting with positions
|
1141 |
<DD>
|
1142 |
<CODE>'...%(ident)d...' % { 'ident': value }</CODE> |
1143 |
|
1144 |
<DT>Portability
|
1145 |
<DD>
|
1146 |
fully portable |
1147 |
|
1148 |
<DT>po-mode marking
|
1149 |
<DD>
|
1150 |
--- |
1151 |
</DL>
|
1152 |
|
1153 |
<P>
|
1154 |
An example is available in the <TT>`examples´</TT> directory: <CODE>hello-python</CODE>. |
1155 |
|
1156 |
</P>
|
1157 |
|
1158 |
|
1159 |
<H3><A NAME="SEC251" HREF="gettext_toc.html#TOC251">13.5.5 GNU clisp - Common Lisp</A></H3> |
1160 |
<P>
|
1161 |
<A NAME="IDX1096"></A> |
1162 |
<A NAME="IDX1097"></A> |
1163 |
<A NAME="IDX1098"></A> |
1164 |
|
1165 |
</P>
|
1166 |
<DL COMPACT> |
1167 |
|
1168 |
<DT>RPMs
|
1169 |
<DD>
|
1170 |
clisp 2.28 or newer |
1171 |
|
1172 |
<DT>File extension
|
1173 |
<DD>
|
1174 |
<CODE>lisp</CODE> |
1175 |
|
1176 |
<DT>String syntax
|
1177 |
<DD>
|
1178 |
<CODE>"abc"</CODE> |
1179 |
|
1180 |
<DT>gettext shorthand
|
1181 |
<DD>
|
1182 |
<CODE>(_ "abc")</CODE>, <CODE>(ENGLISH "abc")</CODE> |
1183 |
|
1184 |
<DT>gettext/ngettext functions
|
1185 |
<DD>
|
1186 |
<CODE>i18n:gettext</CODE>, <CODE>i18n:ngettext</CODE> |
1187 |
|
1188 |
<DT>textdomain
|
1189 |
<DD>
|
1190 |
<CODE>i18n:textdomain</CODE> |
1191 |
|
1192 |
<DT>bindtextdomain
|
1193 |
<DD>
|
1194 |
<CODE>i18n:textdomaindir</CODE> |
1195 |
|
1196 |
<DT>setlocale
|
1197 |
<DD>
|
1198 |
automatic |
1199 |
|
1200 |
<DT>Prerequisite
|
1201 |
<DD>
|
1202 |
--- |
1203 |
|
1204 |
<DT>Use or emulate GNU gettext
|
1205 |
<DD>
|
1206 |
use |
1207 |
|
1208 |
<DT>Extractor
|
1209 |
<DD>
|
1210 |
<CODE>xgettext -k_ -kENGLISH</CODE> |
1211 |
|
1212 |
<DT>Formatting with positions
|
1213 |
<DD>
|
1214 |
<CODE>format "~1@*~D ~0@*~D"</CODE> |
1215 |
|
1216 |
<DT>Portability
|
1217 |
<DD>
|
1218 |
On platforms without gettext, no translation. |
1219 |
|
1220 |
<DT>po-mode marking
|
1221 |
<DD>
|
1222 |
--- |
1223 |
</DL>
|
1224 |
|
1225 |
<P>
|
1226 |
An example is available in the <TT>`examples´</TT> directory: <CODE>hello-clisp</CODE>. |
1227 |
|
1228 |
</P>
|
1229 |
|
1230 |
|
1231 |
<H3><A NAME="SEC252" HREF="gettext_toc.html#TOC252">13.5.6 GNU clisp C sources</A></H3> |
1232 |
<P>
|
1233 |
<A NAME="IDX1099"></A> |
1234 |
|
1235 |
</P>
|
1236 |
<DL COMPACT> |
1237 |
|
1238 |
<DT>RPMs
|
1239 |
<DD>
|
1240 |
clisp |
1241 |
|
1242 |
<DT>File extension
|
1243 |
<DD>
|
1244 |
<CODE>d</CODE> |
1245 |
|
1246 |
<DT>String syntax
|
1247 |
<DD>
|
1248 |
<CODE>"abc"</CODE> |
1249 |
|
1250 |
<DT>gettext shorthand
|
1251 |
<DD>
|
1252 |
<CODE>ENGLISH ? "abc" : ""</CODE> |
1253 |
<BR><CODE>GETTEXT("abc")</CODE> |
1254 |
<BR><CODE>GETTEXTL("abc")</CODE> |
1255 |
|
1256 |
<DT>gettext/ngettext functions
|
1257 |
<DD>
|
1258 |
<CODE>clgettext</CODE>, <CODE>clgettextl</CODE> |
1259 |
|
1260 |
<DT>textdomain
|
1261 |
<DD>
|
1262 |
--- |
1263 |
|
1264 |
<DT>bindtextdomain
|
1265 |
<DD>
|
1266 |
--- |
1267 |
|
1268 |
<DT>setlocale
|
1269 |
<DD>
|
1270 |
automatic |
1271 |
|
1272 |
<DT>Prerequisite
|
1273 |
<DD>
|
1274 |
<CODE>#include "lispbibl.c"</CODE> |
1275 |
|
1276 |
<DT>Use or emulate GNU gettext
|
1277 |
<DD>
|
1278 |
use |
1279 |
|
1280 |
<DT>Extractor
|
1281 |
<DD>
|
1282 |
<CODE>clisp-xgettext</CODE> |
1283 |
|
1284 |
<DT>Formatting with positions
|
1285 |
<DD>
|
1286 |
<CODE>fprintf "%2$d %1$d"</CODE> |
1287 |
|
1288 |
<DT>Portability
|
1289 |
<DD>
|
1290 |
On platforms without gettext, no translation. |
1291 |
|
1292 |
<DT>po-mode marking
|
1293 |
<DD>
|
1294 |
--- |
1295 |
</DL>
|
1296 |
|
1297 |
|
1298 |
|
1299 |
<H3><A NAME="SEC253" HREF="gettext_toc.html#TOC253">13.5.7 Emacs Lisp</A></H3> |
1300 |
<P>
|
1301 |
<A NAME="IDX1100"></A> |
1302 |
|
1303 |
</P>
|
1304 |
<DL COMPACT> |
1305 |
|
1306 |
<DT>RPMs
|
1307 |
<DD>
|
1308 |
emacs, xemacs |
1309 |
|
1310 |
<DT>File extension
|
1311 |
<DD>
|
1312 |
<CODE>el</CODE> |
1313 |
|
1314 |
<DT>String syntax
|
1315 |
<DD>
|
1316 |
<CODE>"abc"</CODE> |
1317 |
|
1318 |
<DT>gettext shorthand
|
1319 |
<DD>
|
1320 |
<CODE>(_"abc")</CODE> |
1321 |
|
1322 |
<DT>gettext/ngettext functions
|
1323 |
<DD>
|
1324 |
<CODE>gettext</CODE>, <CODE>dgettext</CODE> (xemacs only) |
1325 |
|
1326 |
<DT>textdomain
|
1327 |
<DD>
|
1328 |
<CODE>domain</CODE> special form (xemacs only) |
1329 |
|
1330 |
<DT>bindtextdomain
|
1331 |
<DD>
|
1332 |
<CODE>bind-text-domain</CODE> function (xemacs only) |
1333 |
|
1334 |
<DT>setlocale
|
1335 |
<DD>
|
1336 |
automatic |
1337 |
|
1338 |
<DT>Prerequisite
|
1339 |
<DD>
|
1340 |
--- |
1341 |
|
1342 |
<DT>Use or emulate GNU gettext
|
1343 |
<DD>
|
1344 |
use |
1345 |
|
1346 |
<DT>Extractor
|
1347 |
<DD>
|
1348 |
<CODE>xgettext</CODE> |
1349 |
|
1350 |
<DT>Formatting with positions
|
1351 |
<DD>
|
1352 |
<CODE>format "%2$d %1$d"</CODE> |
1353 |
|
1354 |
<DT>Portability
|
1355 |
<DD>
|
1356 |
Only XEmacs. Without <CODE>I18N3</CODE> defined at build time, no translation. |
1357 |
|
1358 |
<DT>po-mode marking
|
1359 |
<DD>
|
1360 |
--- |
1361 |
</DL>
|
1362 |
|
1363 |
|
1364 |
|
1365 |
<H3><A NAME="SEC254" HREF="gettext_toc.html#TOC254">13.5.8 librep</A></H3> |
1366 |
<P>
|
1367 |
<A NAME="IDX1101"></A> |
1368 |
|
1369 |
</P>
|
1370 |
<DL COMPACT> |
1371 |
|
1372 |
<DT>RPMs
|
1373 |
<DD>
|
1374 |
librep 0.15.3 or newer |
1375 |
|
1376 |
<DT>File extension
|
1377 |
<DD>
|
1378 |
<CODE>jl</CODE> |
1379 |
|
1380 |
<DT>String syntax
|
1381 |
<DD>
|
1382 |
<CODE>"abc"</CODE> |
1383 |
|
1384 |
<DT>gettext shorthand
|
1385 |
<DD>
|
1386 |
<CODE>(_"abc")</CODE> |
1387 |
|
1388 |
<DT>gettext/ngettext functions
|
1389 |
<DD>
|
1390 |
<CODE>gettext</CODE> |
1391 |
|
1392 |
<DT>textdomain
|
1393 |
<DD>
|
1394 |
<CODE>textdomain</CODE> function |
1395 |
|
1396 |
<DT>bindtextdomain
|
1397 |
<DD>
|
1398 |
<CODE>bindtextdomain</CODE> function |
1399 |
|
1400 |
<DT>setlocale
|
1401 |
<DD>
|
1402 |
--- |
1403 |
|
1404 |
<DT>Prerequisite
|
1405 |
<DD>
|
1406 |
<CODE>(require 'rep.i18n.gettext)</CODE> |
1407 |
|
1408 |
<DT>Use or emulate GNU gettext
|
1409 |
<DD>
|
1410 |
use |
1411 |
|
1412 |
<DT>Extractor
|
1413 |
<DD>
|
1414 |
<CODE>xgettext</CODE> |
1415 |
|
1416 |
<DT>Formatting with positions
|
1417 |
<DD>
|
1418 |
<CODE>format "%2$d %1$d"</CODE> |
1419 |
|
1420 |
<DT>Portability
|
1421 |
<DD>
|
1422 |
On platforms without gettext, no translation. |
1423 |
|
1424 |
<DT>po-mode marking
|
1425 |
<DD>
|
1426 |
--- |
1427 |
</DL>
|
1428 |
|
1429 |
<P>
|
1430 |
An example is available in the <TT>`examples´</TT> directory: <CODE>hello-librep</CODE>. |
1431 |
|
1432 |
</P>
|
1433 |
|
1434 |
|
1435 |
<H3><A NAME="SEC255" HREF="gettext_toc.html#TOC255">13.5.9 GNU Smalltalk</A></H3> |
1436 |
<P>
|
1437 |
<A NAME="IDX1102"></A> |
1438 |
|
1439 |
</P>
|
1440 |
<DL COMPACT> |
1441 |
|
1442 |
<DT>RPMs
|
1443 |
<DD>
|
1444 |
smalltalk |
1445 |
|
1446 |
<DT>File extension
|
1447 |
<DD>
|
1448 |
<CODE>st</CODE> |
1449 |
|
1450 |
<DT>String syntax
|
1451 |
<DD>
|
1452 |
<CODE>'abc'</CODE> |
1453 |
|
1454 |
<DT>gettext shorthand
|
1455 |
<DD>
|
1456 |
<CODE>NLS ? 'abc'</CODE> |
1457 |
|
1458 |
<DT>gettext/ngettext functions
|
1459 |
<DD>
|
1460 |
<CODE>LcMessagesDomain>>#at:</CODE>, <CODE>LcMessagesDomain>>#at:plural:with:</CODE> |
1461 |
|
1462 |
<DT>textdomain
|
1463 |
<DD>
|
1464 |
<CODE>LcMessages>>#domain:localeDirectory:</CODE> (returns a <CODE>LcMessagesDomain</CODE> |
1465 |
object).<BR>
|
1466 |
Example: <CODE>I18N Locale default messages domain: 'gettext' localeDirectory: '/usr/local/share/locale'</CODE>
1467 |
|
1468 |
<DT>bindtextdomain
|
1469 |
<DD>
|
1470 |
<CODE>LcMessages>>#domain:localeDirectory:</CODE>, see above. |
1471 |
|
1472 |
<DT>setlocale
|
1473 |
<DD>
|
1474 |
Automatic if you use <CODE>I18N Locale default</CODE>. |
1475 |
|
1476 |
<DT>Prerequisite
|
1477 |
<DD>
|
1478 |
<CODE>PackageLoader fileInPackage: 'I18N'!</CODE> |
1479 |
|
1480 |
<DT>Use or emulate GNU gettext
|
1481 |
<DD>
|
1482 |
emulate |
1483 |
|
1484 |
<DT>Extractor
|
1485 |
<DD>
|
1486 |
<CODE>xgettext</CODE> |
1487 |
|
1488 |
<DT>Formatting with positions
|
1489 |
<DD>
|
1490 |
<CODE>'%1 %2' bindWith: 'Hello' with: 'world'</CODE> |
1491 |
|
1492 |
<DT>Portability
|
1493 |
<DD>
|
1494 |
fully portable |
1495 |
|
1496 |
<DT>po-mode marking
|
1497 |
<DD>
|
1498 |
--- |
1499 |
</DL>
|
1500 |
|
1501 |
<P>
|
1502 |
An example is available in the <TT>`examples´</TT> directory: |
1503 |
<CODE>hello-smalltalk</CODE>. |
1504 |
|
1505 |
</P>
|
1506 |
|
1507 |
|
1508 |
<H3><A NAME="SEC256" HREF="gettext_toc.html#TOC256">13.5.10 Java</A></H3> |
1509 |
<P>
|
1510 |
<A NAME="IDX1103"></A> |
1511 |
|
1512 |
</P>
|
1513 |
<DL COMPACT> |
1514 |
|
1515 |
<DT>RPMs
|
1516 |
<DD>
|
1517 |
java, java2 |
1518 |
|
1519 |
<DT>File extension
|
1520 |
<DD>
|
1521 |
<CODE>java</CODE> |
1522 |
|
1523 |
<DT>String syntax
|
1524 |
<DD>
|
<CODE>"abc"</CODE>
1526 |
|
1527 |
<DT>gettext shorthand
|
1528 |
<DD>
|
1529 |
<CODE>_("abc")</CODE>
1530 |
|
1531 |
<DT>gettext/ngettext functions
|
1532 |
<DD>
|
1533 |
<CODE>GettextResource.gettext</CODE>, <CODE>GettextResource.ngettext</CODE> |
1534 |
|
1535 |
<DT>textdomain
|
1536 |
<DD>
|
1537 |
---, use <CODE>ResourceBundle.getBundle</CODE> instead
1538 |
|
1539 |
<DT>bindtextdomain
|
1540 |
<DD>
|
1541 |
---, use CLASSPATH instead |
1542 |
|
1543 |
<DT>setlocale
|
1544 |
<DD>
|
1545 |
automatic |
1546 |
|
1547 |
<DT>Prerequisite
|
1548 |
<DD>
|
1549 |
--- |
1550 |
|
1551 |
<DT>Use or emulate GNU gettext
|
1552 |
<DD>
|
1553 |
---, uses a Java specific message catalog format |
1554 |
|
1555 |
<DT>Extractor
|
1556 |
<DD>
|
1557 |
<CODE>xgettext -k_</CODE> |
1558 |
|
1559 |
<DT>Formatting with positions
|
1560 |
<DD>
|
1561 |
<CODE>MessageFormat.format "{1,number} {0,number}"</CODE> |
1562 |
|
1563 |
<DT>Portability
|
1564 |
<DD>
|
1565 |
fully portable |
1566 |
|
1567 |
<DT>po-mode marking
|
1568 |
<DD>
|
1569 |
--- |
1570 |
</DL>
|
1571 |
|
1572 |
<P>
|
1573 |
Before marking strings as internationalizable, uses of the string |
1574 |
concatenation operator need to be converted to <CODE>MessageFormat</CODE> |
1575 |
applications. For example, <CODE>"file "+filename+" not found"</CODE> becomes |
1576 |
<CODE>MessageFormat.format("file {0} not found", new Object[] { filename })</CODE>. |
1577 |
Only after this is done can the strings be marked and extracted.
1578 |
|
1579 |
</P>
|
1580 |
<P>
|
1581 |
GNU gettext uses the native Java internationalization mechanism, namely |
1582 |
<CODE>ResourceBundle</CODE>s. There are two formats of <CODE>ResourceBundle</CODE>s: |
1583 |
<CODE>.properties</CODE> files and <CODE>.class</CODE> files. The <CODE>.properties</CODE> |
1584 |
format is a text file which the translators can directly edit, like PO
files, but which doesn't support plural forms. The <CODE>.class</CODE>
format, by contrast, is compiled from <CODE>.java</CODE> source code and can support plural
forms (provided it is accessed through an appropriate API, see below).
1588 |
|
1589 |
</P>
|
1590 |
<P>
|
1591 |
To convert a PO file to a <CODE>.properties</CODE> file, the <CODE>msgcat</CODE> |
1592 |
program can be used with the option <CODE>--properties-output</CODE>. To convert |
1593 |
a <CODE>.properties</CODE> file back to a PO file, the <CODE>msgcat</CODE> program |
1594 |
can be used with the option <CODE>--properties-input</CODE>. All the tools |
1595 |
that manipulate PO files can work with <CODE>.properties</CODE> files as well, |
1596 |
if given the <CODE>--properties-input</CODE> and/or <CODE>--properties-output</CODE> |
1597 |
option. |
1598 |
|
1599 |
</P>
|
1600 |
<P>
|
1601 |
To convert a PO file to a ResourceBundle class, the <CODE>msgfmt</CODE> program |
1602 |
can be used with the option <CODE>--java</CODE> or <CODE>--java2</CODE>. To convert a |
1603 |
ResourceBundle back to a PO file, the <CODE>msgunfmt</CODE> program can be used |
1604 |
with the option <CODE>--java</CODE>. |
1605 |
|
1606 |
</P>
|
1607 |
<P>
|
1608 |
Two different programmatic APIs can be used to access ResourceBundles. |
1609 |
Note that both APIs work with all kinds of ResourceBundles, whether |
1610 |
GNU gettext generated classes, or other <CODE>.class</CODE> or <CODE>.properties</CODE> |
1611 |
files. |
1612 |
|
1613 |
</P>
|
1614 |
|
1615 |
<OL>
|
1616 |
<LI>
|
1617 |
|
1618 |
The <CODE>java.util.ResourceBundle</CODE> API. |
1619 |
|
1620 |
In particular, its <CODE>getString</CODE> function returns a string translation. |
1621 |
Note that a missing translation yields a <CODE>MissingResourceException</CODE>. |
1622 |
|
1623 |
This has the advantage of being the standard API, and it does not require
any additional libraries, only the <CODE>msgcat</CODE>-generated <CODE>.properties</CODE>
files or the <CODE>msgfmt</CODE>-generated <CODE>.class</CODE> files. However, it cannot do
plural handling, even if the resource was generated by <CODE>msgfmt</CODE> from
a PO file with plural handling.
1628 |
|
1629 |
<LI>
|
1630 |
|
1631 |
The <CODE>gnu.gettext.GettextResource</CODE> API. |
1632 |
|
1633 |
Reference documentation in Javadoc 1.1 style format |
1634 |
is in the <A HREF="javadoc1/tree.html">javadoc1 directory</A> and |
1635 |
in Javadoc 2 style format |
1636 |
in the <A HREF="javadoc2/index.html">javadoc2 directory</A>. |
1637 |
|
1638 |
Its <CODE>gettext</CODE> function returns a string translation. Note that when |
1639 |
a translation is missing, the <VAR>msgid</VAR> argument is returned unchanged. |
1640 |
|
1641 |
This has the advantage of having the <CODE>ngettext</CODE> function for plural |
1642 |
handling. |
1643 |
|
1644 |
<A NAME="IDX1104"></A> |
1645 |
To use this API, one needs the <CODE>libintl.jar</CODE> file which is part of |
1646 |
the GNU gettext package and distributed under the LGPL. |
1647 |
</OL>
|
1648 |
|
1649 |
<P>
|
1650 |
Three examples, using the second API, are available in the <TT>`examples´</TT> |
1651 |
directory: <CODE>hello-java</CODE>, <CODE>hello-java-awt</CODE>, <CODE>hello-java-swing</CODE>. |
1652 |
|
1653 |
</P>
|
1654 |
|
1655 |
|
1656 |
<H3><A NAME="SEC257" HREF="gettext_toc.html#TOC257">13.5.11 GNU awk</A></H3> |
1657 |
<P>
|
1658 |
<A NAME="IDX1105"></A> |
1659 |
<A NAME="IDX1106"></A> |
1660 |
|
1661 |
</P>
|
1662 |
<DL COMPACT> |
1663 |
|
1664 |
<DT>RPMs
|
1665 |
<DD>
|
1666 |
gawk 3.1 or newer |
1667 |
|
1668 |
<DT>File extension
|
1669 |
<DD>
|
1670 |
<CODE>awk</CODE> |
1671 |
|
1672 |
<DT>String syntax
|
1673 |
<DD>
|
1674 |
<CODE>"abc"</CODE> |
1675 |
|
1676 |
<DT>gettext shorthand
|
1677 |
<DD>
|
1678 |
<CODE>_"abc"</CODE> |
1679 |
|
1680 |
<DT>gettext/ngettext functions
|
1681 |
<DD>
|
1682 |
<CODE>dcgettext</CODE>, missing <CODE>dcngettext</CODE> in gawk-3.1.0 |
1683 |
|
1684 |
<DT>textdomain
|
1685 |
<DD>
|
1686 |
<CODE>TEXTDOMAIN</CODE> variable |
1687 |
|
1688 |
<DT>bindtextdomain
|
1689 |
<DD>
|
1690 |
<CODE>bindtextdomain</CODE> function |
1691 |
|
1692 |
<DT>setlocale
|
1693 |
<DD>
|
1694 |
automatic, but missing <CODE>setlocale (LC_MESSAGES, "")</CODE> in gawk-3.1.0 |
1695 |
|
1696 |
<DT>Prerequisite
|
1697 |
<DD>
|
1698 |
--- |
1699 |
|
1700 |
<DT>Use or emulate GNU gettext
|
1701 |
<DD>
|
1702 |
use |
1703 |
|
1704 |
<DT>Extractor
|
1705 |
<DD>
|
1706 |
<CODE>xgettext</CODE> |
1707 |
|
1708 |
<DT>Formatting with positions
|
1709 |
<DD>
|
1710 |
<CODE>printf "%2$d %1$d"</CODE> (GNU awk only) |
1711 |
|
1712 |
<DT>Portability
|
1713 |
<DD>
|
1714 |
On platforms without gettext, no translation. On non-GNU awks, you must |
1715 |
define <CODE>dcgettext</CODE>, <CODE>dcngettext</CODE> and <CODE>bindtextdomain</CODE> |
1716 |
yourself. |
1717 |
|
1718 |
<DT>po-mode marking
|
1719 |
<DD>
|
1720 |
--- |
1721 |
</DL>
|
1722 |
|
1723 |
<P>
|
1724 |
An example is available in the <TT>`examples´</TT> directory: <CODE>hello-gawk</CODE>. |
1725 |
|
1726 |
</P>
|
1727 |
|
1728 |
|
1729 |
<H3><A NAME="SEC258" HREF="gettext_toc.html#TOC258">13.5.12 Pascal - Free Pascal Compiler</A></H3> |
1730 |
<P>
|
1731 |
<A NAME="IDX1107"></A> |
1732 |
<A NAME="IDX1108"></A> |
1733 |
<A NAME="IDX1109"></A> |
1734 |
|
1735 |
</P>
|
1736 |
<DL COMPACT> |
1737 |
|
1738 |
<DT>RPMs
|
1739 |
<DD>
|
1740 |
fpk |
1741 |
|
1742 |
<DT>File extension
|
1743 |
<DD>
|
1744 |
<CODE>pp</CODE>, <CODE>pas</CODE> |
1745 |
|
1746 |
<DT>String syntax
|
1747 |
<DD>
|
1748 |
<CODE>'abc'</CODE> |
1749 |
|
1750 |
<DT>gettext shorthand
|
1751 |
<DD>
|
1752 |
automatic |
1753 |
|
1754 |
<DT>gettext/ngettext functions
|
1755 |
<DD>
|
1756 |
---, use <CODE>ResourceString</CODE> data type instead |
1757 |
|
1758 |
<DT>textdomain
|
1759 |
<DD>
|
1760 |
---, use <CODE>TranslateResourceStrings</CODE> function instead |
1761 |
|
1762 |
<DT>bindtextdomain
|
1763 |
<DD>
|
1764 |
---, use <CODE>TranslateResourceStrings</CODE> function instead |
1765 |
|
1766 |
<DT>setlocale
|
1767 |
<DD>
|
1768 |
automatic, but uses only LANG, not LC_MESSAGES or LC_ALL |
1769 |
|
1770 |
<DT>Prerequisite
|
1771 |
<DD>
|
1772 |
<CODE>{$mode delphi}</CODE> or <CODE>{$mode objfpc}</CODE><BR><CODE>uses gettext;</CODE> |
1773 |
|
1774 |
<DT>Use or emulate GNU gettext
|
1775 |
<DD>
|
1776 |
emulate partially |
1777 |
|
1778 |
<DT>Extractor
|
1779 |
<DD>
|
1780 |
<CODE>ppc386</CODE> followed by <CODE>xgettext</CODE> or <CODE>rstconv</CODE> |
1781 |
|
1782 |
<DT>Formatting with positions
|
1783 |
<DD>
|
1784 |
<CODE>uses sysutils;</CODE><BR><CODE>format "%1:d %0:d"</CODE> |
1785 |
|
1786 |
<DT>Portability
|
1787 |
<DD>
|
1788 |
? |
1789 |
|
1790 |
<DT>po-mode marking
|
1791 |
<DD>
|
1792 |
--- |
1793 |
</DL>
|
1794 |
|
1795 |
<P>
|
1796 |
The Pascal compiler has special support for the <CODE>ResourceString</CODE> data |
1797 |
type. It generates a <CODE>.rst</CODE> file. This is then converted to a |
1798 |
<CODE>.pot</CODE> file by use of <CODE>xgettext</CODE> or <CODE>rstconv</CODE>. At runtime, |
1799 |
a <CODE>.mo</CODE> file corresponding to translations of this <CODE>.pot</CODE> file |
1800 |
can be loaded using the <CODE>TranslateResourceStrings</CODE> function in the |
1801 |
<CODE>gettext</CODE> unit. |
1802 |
|
1803 |
</P>
|
1804 |
<P>
|
1805 |
An example is available in the <TT>`examples´</TT> directory: <CODE>hello-pascal</CODE>. |
1806 |
|
1807 |
</P>
|
1808 |
|
1809 |
|
1810 |
<H3><A NAME="SEC259" HREF="gettext_toc.html#TOC259">13.5.13 wxWindows library</A></H3> |
1811 |
<P>
|
1812 |
<A NAME="IDX1110"></A> |
1813 |
|
1814 |
</P>
|
1815 |
<DL COMPACT> |
1816 |
|
1817 |
<DT>RPMs
|
1818 |
<DD>
|
1819 |
wxGTK, gettext |
1820 |
|
1821 |
<DT>File extension
|
1822 |
<DD>
|
1823 |
<CODE>cpp</CODE> |
1824 |
|
1825 |
<DT>String syntax
|
1826 |
<DD>
|
1827 |
<CODE>"abc"</CODE> |
1828 |
|
1829 |
<DT>gettext shorthand
|
1830 |
<DD>
|
1831 |
<CODE>_("abc")</CODE> |
1832 |
|
1833 |
<DT>gettext/ngettext functions
|
1834 |
<DD>
|
1835 |
<CODE>wxLocale::GetString</CODE>, <CODE>wxGetTranslation</CODE> |
1836 |
|
1837 |
<DT>textdomain
|
1838 |
<DD>
|
1839 |
<CODE>wxLocale::AddCatalog</CODE> |
1840 |
|
1841 |
<DT>bindtextdomain
|
1842 |
<DD>
|
1843 |
<CODE>wxLocale::AddCatalogLookupPathPrefix</CODE> |
1844 |
|
1845 |
<DT>setlocale
|
1846 |
<DD>
|
1847 |
<CODE>wxLocale::Init</CODE>, <CODE>wxSetLocale</CODE> |
1848 |
|
1849 |
<DT>Prerequisite
|
1850 |
<DD>
|
1851 |
<CODE>#include <wx/intl.h></CODE> |
1852 |
|
1853 |
<DT>Use or emulate GNU gettext
|
1854 |
<DD>
|
1855 |
emulate, see <CODE>include/wx/intl.h</CODE> and <CODE>src/common/intl.cpp</CODE> |
1856 |
|
1857 |
<DT>Extractor
|
1858 |
<DD>
|
1859 |
<CODE>xgettext</CODE> |
1860 |
|
1861 |
<DT>Formatting with positions
|
1862 |
<DD>
|
1863 |
--- |
1864 |
|
1865 |
<DT>Portability
|
1866 |
<DD>
|
1867 |
fully portable |
1868 |
|
1869 |
<DT>po-mode marking
|
1870 |
<DD>
|
1871 |
yes |
1872 |
</DL>
|
1873 |
|
1874 |
|
1875 |
|
1876 |
<H3><A NAME="SEC260" HREF="gettext_toc.html#TOC260">13.5.14 YCP - YaST2 scripting language</A></H3> |
1877 |
<P>
|
1878 |
<A NAME="IDX1111"></A> |
1879 |
<A NAME="IDX1112"></A> |
1880 |
|
1881 |
</P>
|
1882 |
<DL COMPACT> |
1883 |
|
1884 |
<DT>RPMs
|
1885 |
<DD>
|
1886 |
libycp, libycp-devel, yast2-core, yast2-core-devel |
1887 |
|
1888 |
<DT>File extension
|
1889 |
<DD>
|
1890 |
<CODE>ycp</CODE> |
1891 |
|
1892 |
<DT>String syntax
|
1893 |
<DD>
|
1894 |
<CODE>"abc"</CODE> |
1895 |
|
1896 |
<DT>gettext shorthand
|
1897 |
<DD>
|
1898 |
<CODE>_("abc")</CODE> |
1899 |
|
1900 |
<DT>gettext/ngettext functions
|
1901 |
<DD>
|
1902 |
<CODE>_()</CODE> with 1 or 3 arguments |
1903 |
|
1904 |
<DT>textdomain
|
1905 |
<DD>
|
1906 |
<CODE>textdomain</CODE> statement |
1907 |
|
1908 |
<DT>bindtextdomain
|
1909 |
<DD>
|
1910 |
--- |
1911 |
|
1912 |
<DT>setlocale
|
1913 |
<DD>
|
1914 |
--- |
1915 |
|
1916 |
<DT>Prerequisite
|
1917 |
<DD>
|
1918 |
--- |
1919 |
|
1920 |
<DT>Use or emulate GNU gettext
|
1921 |
<DD>
|
1922 |
use |
1923 |
|
1924 |
<DT>Extractor
|
1925 |
<DD>
|
1926 |
<CODE>xgettext</CODE> |
1927 |
|
1928 |
<DT>Formatting with positions
|
1929 |
<DD>
|
1930 |
<CODE>sformat "%2 %1"</CODE> |
1931 |
|
1932 |
<DT>Portability
|
1933 |
<DD>
|
1934 |
fully portable |
1935 |
|
1936 |
<DT>po-mode marking
|
1937 |
<DD>
|
1938 |
--- |
1939 |
</DL>
|
1940 |
|
1941 |
<P>
|
1942 |
An example is available in the <TT>`examples´</TT> directory: <CODE>hello-ycp</CODE>. |
1943 |
|
1944 |
</P>
|
1945 |
|
1946 |
|
1947 |
<H3><A NAME="SEC261" HREF="gettext_toc.html#TOC261">13.5.15 Tcl - Tk's scripting language</A></H3> |
1948 |
<P>
|
1949 |
<A NAME="IDX1113"></A> |
1950 |
<A NAME="IDX1114"></A> |
1951 |
|
1952 |
</P>
|
1953 |
<DL COMPACT> |
1954 |
|
1955 |
<DT>RPMs
|
1956 |
<DD>
|
1957 |
tcl |
1958 |
|
1959 |
<DT>File extension
|
1960 |
<DD>
|
1961 |
<CODE>tcl</CODE> |
1962 |
|
1963 |
<DT>String syntax
|
1964 |
<DD>
|
1965 |
<CODE>"abc"</CODE> |
1966 |
|
1967 |
<DT>gettext shorthand
|
1968 |
<DD>
|
1969 |
<CODE>[_ "abc"]</CODE> |
1970 |
|
1971 |
<DT>gettext/ngettext functions
|
1972 |
<DD>
|
1973 |
<CODE>::msgcat::mc</CODE> |
1974 |
|
1975 |
<DT>textdomain
|
1976 |
<DD>
|
1977 |
--- |
1978 |
|
1979 |
<DT>bindtextdomain
|
1980 |
<DD>
|
1981 |
---, use <CODE>::msgcat::mcload</CODE> instead |
1982 |
|
1983 |
<DT>setlocale
|
1984 |
<DD>
|
1985 |
automatic, uses LANG, but ignores LC_MESSAGES and LC_ALL |
1986 |
|
1987 |
<DT>Prerequisite
|
1988 |
<DD>
|
1989 |
<CODE>package require msgcat</CODE> |
1990 |
<BR><CODE>proc _ {s} {return [::msgcat::mc $s]}</CODE> |
1991 |
|
1992 |
<DT>Use or emulate GNU gettext
|
1993 |
<DD>
|
1994 |
---, uses a Tcl specific message catalog format |
1995 |
|
1996 |
<DT>Extractor
|
1997 |
<DD>
|
1998 |
<CODE>xgettext -k_</CODE> |
1999 |
|
2000 |
<DT>Formatting with positions
|
2001 |
<DD>
|
2002 |
<CODE>format "%2\$d %1\$d"</CODE> |
2003 |
|
2004 |
<DT>Portability
|
2005 |
<DD>
|
2006 |
fully portable |
2007 |
|
2008 |
<DT>po-mode marking
|
2009 |
<DD>
|
2010 |
--- |
2011 |
</DL>
|
2012 |
|
2013 |
<P>
|
2014 |
Two examples are available in the <TT>`examples´</TT> directory: |
2015 |
<CODE>hello-tcl</CODE>, <CODE>hello-tcl-tk</CODE>. |
2016 |
|
2017 |
</P>
|
2018 |
<P>
|
2019 |
Before marking strings as internationalizable, substitutions of variables
into the string need to be converted to <CODE>format</CODE> applications. For
example, <CODE>"file $filename not found"</CODE> becomes
<CODE>[format "file %s not found" $filename]</CODE>.
Only after this is done can the strings be marked and extracted.
After marking, this example becomes
<CODE>[format [_ "file %s not found"] $filename]</CODE> or
<CODE>[msgcat::mc "file %s not found" $filename]</CODE>. Note that the
<CODE>msgcat::mc</CODE> function implicitly calls <CODE>format</CODE> when more than one
argument is given.
2029 |
|
2030 |
</P>
|
2031 |
|
2032 |
|
2033 |
<H3><A NAME="SEC262" HREF="gettext_toc.html#TOC262">13.5.16 Perl</A></H3> |
2034 |
<P>
|
2035 |
<A NAME="IDX1115"></A> |
2036 |
|
2037 |
</P>
|
2038 |
<DL COMPACT> |
2039 |
|
2040 |
<DT>RPMs
|
2041 |
<DD>
|
2042 |
perl |
2043 |
|
2044 |
<DT>File extension
|
2045 |
<DD>
|
2046 |
<CODE>pl</CODE>, <CODE>PL</CODE>, <CODE>pm</CODE>, <CODE>cgi</CODE> |
2047 |
|
2048 |
<DT>String syntax
|
2049 |
<DD>
|
2050 |
|
2051 |
<UL>
|
2052 |
|
2053 |
<LI><CODE>"abc"</CODE> |
2054 |
|
2055 |
<LI><CODE>'abc'</CODE> |
2056 |
|
2057 |
<LI><CODE>qq (abc)</CODE> |
2058 |
|
2059 |
<LI><CODE>q (abc)</CODE> |
2060 |
|
2061 |
<LI><CODE>qr /abc/</CODE> |
2062 |
|
2063 |
<LI><CODE>qx (/bin/date)</CODE> |
2064 |
|
2065 |
<LI><CODE>/pattern match/</CODE> |
2066 |
|
2067 |
<LI><CODE>?pattern match?</CODE> |
2068 |
|
2069 |
<LI><CODE>s/substitution/operators/</CODE> |
2070 |
|
2071 |
<LI><CODE>$tied_hash{"message"}</CODE> |
2072 |
|
2073 |
<LI><CODE>$tied_hash_reference->{"message"}</CODE> |
2074 |
|
2075 |
<LI>etc., issue the command <SAMP>`man perlsyn´</SAMP> for details |
2076 |
|
2077 |
</UL>
|
2078 |
|
2079 |
<DT>gettext shorthand
|
2080 |
<DD>
|
2081 |
<CODE>__</CODE> (double underscore) |
2082 |
|
2083 |
<DT>gettext/ngettext functions
|
2084 |
<DD>
|
2085 |
<CODE>gettext</CODE>, <CODE>dgettext</CODE>, <CODE>dcgettext</CODE>, <CODE>ngettext</CODE>, |
2086 |
<CODE>dngettext</CODE>, <CODE>dcngettext</CODE> |
2087 |
|
2088 |
<DT>textdomain
|
2089 |
<DD>
|
2090 |
<CODE>textdomain</CODE> function |
2091 |
|
2092 |
<DT>bindtextdomain
|
2093 |
<DD>
|
2094 |
<CODE>bindtextdomain</CODE> function |
2095 |
|
2096 |
<DT>bind_textdomain_codeset
|
2097 |
<DD>
|
2098 |
<CODE>bind_textdomain_codeset</CODE> function |
2099 |
|
2100 |
<DT>setlocale
|
2101 |
<DD>
|
2102 |
Use <CODE>setlocale (LC_ALL, "");</CODE> |
2103 |
|
2104 |
<DT>Prerequisite
|
2105 |
<DD>
|
2106 |
<CODE>use POSIX;</CODE> |
2107 |
<BR><CODE>use Locale::TextDomain;</CODE> (included in the package libintl-perl |
2108 |
which is available on the Comprehensive Perl Archive Network CPAN, |
2109 |
http://www.cpan.org/). |
2110 |
|
2111 |
<DT>Use or emulate GNU gettext
|
2112 |
<DD>
|
2113 |
platform dependent: gettext_pp emulates, gettext_xs uses GNU gettext |
2114 |
|
2115 |
<DT>Extractor
|
2116 |
<DD>
|
2117 |
<CODE>xgettext -k__ -k\$__ -k%__ -k__x -k__n:1,2 -k__nx:1,2 -k__xn:1,2 -kN__ -k</CODE> |
2118 |
|
2119 |
<DT>Formatting with positions
|
2120 |
<DD>
|
2121 |
Both kinds of format strings support formatting with positions. |
2122 |
<BR><CODE>printf "%2\$d %1\$d", ...</CODE> (requires Perl 5.8.0 or newer) |
2123 |
<BR><CODE>__expand("[new] replaces [old]", old => $oldvalue, new => $newvalue)</CODE> |
2124 |
|
2125 |
<DT>Portability
|
2126 |
<DD>
|
2127 |
The <CODE>libintl-perl</CODE> package is platform independent but is not |
2128 |
part of the Perl core. The programmer is responsible for |
2129 |
providing a dummy implementation of the required functions if the |
2130 |
package is not installed on the target system. |
2131 |
|
2132 |
<DT>po-mode marking
|
2133 |
<DD>
|
2134 |
--- |
2135 |
|
2136 |
<DT>Documentation
|
2137 |
<DD>
|
2138 |
Included in <CODE>libintl-perl</CODE>, available on CPAN |
2139 |
(http://www.cpan.org/). |
2140 |
|
2141 |
</DL>
|
2142 |
|
2143 |
<P>
|
2144 |
An example is available in the <TT>`examples´</TT> directory: <CODE>hello-perl</CODE>. |
2145 |
|
2146 |
</P>
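<P>
As a rough illustration of the positional <CODE>printf</CODE> formats listed above, the following sketch uses only core Perl (explicit argument indexes need Perl 5.8.0 or newer). The format strings and file names are invented for the example; in a real program the reordered format would come from a translator's PO file rather than from the source.
</P>

```perl
use strict;
use warnings;

# Invented sample data, not taken from any real catalog.
my @args = ('new.txt', 'old.txt');

# The original message uses the arguments in their natural order.
my $original = sprintf '%1$s replaces %2$s', @args;

# A translation may need a different word order; explicit indexes
# let it reorder the arguments without touching the caller.
my $reordered = sprintf '%2$s is replaced by %1$s', @args;

print "$original\n$reordered\n";
```

<P>
The format strings are single-quoted so that Perl does not try to interpolate <CODE>$s</CODE>; inside double quotes the dollar signs must be escaped, as in the table entry above.
</P>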
|
2147 |
<P>
|
2148 |
<A NAME="IDX1116"></A> |
2149 |
|
2150 |
</P>
|
2151 |
<P>
|
2152 |
The <CODE>xgettext</CODE> parser backend for Perl differs significantly from
the parser backends for other programming languages, just as Perl
itself differs significantly from other programming languages. The
Perl parser backend offers many more string marking facilities than
the other backends, but it also has some Perl-specific limitations, the
worst probably being that it is imperfect.
2158 |
|
2159 |
</P>
|
2160 |
|
2161 |
|
2162 |
|
2163 |
<H4><A NAME="SEC263" HREF="gettext_toc.html#TOC263">13.5.16.1 General Problems Parsing Perl Code</A></H4> |
2164 |
|
2165 |
<P>
|
2166 |
It is often said that only Perl can parse Perl. This is not true.
Perl cannot be <EM>parsed</EM> at all; it can only be <EM>executed</EM>.
Perl has various built-in ambiguities that can only be resolved at runtime.
2169 |
|
2170 |
</P>
|
2171 |
<P>
|
2172 |
The following example may illustrate one common problem: |
2173 |
|
2174 |
</P>
|
2175 |
|
2176 |
<PRE>
|
2177 |
print gettext "Hello World!"; |
2178 |
</PRE>
|
2179 |
|
2180 |
<P>
|
2181 |
Although this example looks like a bullet-proof case of a function |
2182 |
invocation, it is not: |
2183 |
|
2184 |
</P>
|
2185 |
|
2186 |
<PRE>
|
2187 |
open gettext, ">testfile" or die;
|
2188 |
print gettext "Hello world!" |
2189 |
</PRE>
|
2190 |
|
2191 |
<P>
|
2192 |
In this context, the string <CODE>gettext</CODE> looks more like a |
2193 |
file handle. But not necessarily: |
2194 |
|
2195 |
</P>
|
2196 |
|
2197 |
<PRE>
|
2198 |
use Locale::Messages qw (:libintl_h); |
2199 |
open gettext ">testfile" or die;
|
2200 |
print gettext "Hello world!"; |
2201 |
</PRE>
|
2202 |
|
2203 |
<P>
|
2204 |
Now, the file is probably syntactically incorrect, provided that the module |
2205 |
<CODE>Locale::Messages</CODE> found first in the Perl include path exports a |
2206 |
function <CODE>gettext</CODE>. But what if the module |
2207 |
<CODE>Locale::Messages</CODE> really looks like this? |
2208 |
|
2209 |
</P>
|
2210 |
|
2211 |
<PRE>
|
2212 |
use vars qw (*gettext); |
2213 |
|
2214 |
1; |
2215 |
</PRE>
|
2216 |
|
2217 |
<P>
|
2218 |
In this case, the string <CODE>gettext</CODE> will be interpreted as a file |
2219 |
handle again, and the above example will create a file <TT>`testfile´</TT> |
2220 |
and write the string "Hello world!" into it. Even advanced |
2221 |
control flow analysis will not really help: |
2222 |
|
2223 |
</P>
|
2224 |
|
2225 |
<PRE>
|
2226 |
if (0.5 < rand) {
   eval "use Sane";
} else {
   eval "use InSane";
}
print gettext "Hello world!";
2232 |
</PRE>
|
2233 |
|
2234 |
<P>
|
2235 |
If the module <CODE>Sane</CODE> exports a function <CODE>gettext</CODE> that does |
2236 |
what we expect, and the module <CODE>InSane</CODE> opens a file for writing |
2237 |
and associates the <EM>handle</EM> <CODE>gettext</CODE> with this output |
2238 |
stream, we are clueless again about what will happen at runtime. It is |
2239 |
completely unpredictable. The truth is that Perl has so many ways to |
2240 |
fill its symbol table at runtime that it is impossible to interpret a |
2241 |
particular piece of code without executing it. |
2242 |
|
2243 |
</P>
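<P>
The dependence on the symbol table can be reproduced with core Perl alone. In the sketch below, the same statement <CODE>print gettext "Hello world!"</CODE> is compiled once in a package that defines a sub named <CODE>gettext</CODE> and once in a package that does not. The package names and the stand-in <CODE>gettext</CODE> are hypothetical and exist only for this illustration; the stand-in merely tags its argument instead of translating it.
</P>

```perl
use strict;
use warnings;

package WithFunction;

# Hypothetical stand-in for a real gettext(): it only tags its argument.
sub gettext { return "translated: $_[0]" }

sub demo {
    my $out = '';
    open my $fh, '>', \$out or die "cannot open in-memory file: $!";
    my $old = select $fh;            # make $fh the default output handle
    print gettext "Hello world!";    # parsed as print(gettext("Hello world!"))
    select $old;
    close $fh;
    return $out;
}

package WithHandle;
no warnings 'reserved';              # silence the lowercase-bareword warning

# No sub named gettext is visible here, so the bareword is a file handle.
sub demo {
    my $out = '';
    open gettext, '>', \$out or die "cannot open in-memory file: $!";
    print gettext "Hello world!";    # parsed as print {*gettext} "Hello world!"
    close gettext;
    return $out;
}

package main;

print WithFunction::demo(), "\n";
print WithHandle::demo(), "\n";
```

<P>
Both subs capture what was written into an in-memory file, so the difference shows up in the returned strings: only the first one has passed through the function.
</P>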
|
2244 |
<P>
|
2245 |
Of course, <CODE>xgettext</CODE> will not execute your Perl sources while |
2246 |
scanning for translatable strings, but rather use heuristics in order |
2247 |
to guess what you meant. |
2248 |
|
2249 |
</P>
|
2250 |
<P>
|
2251 |
Another problem is the ambiguity of the slash and the question mark. |
2252 |
Their interpretation depends on the context: |
2253 |
|
2254 |
</P>
|
2255 |
|
2256 |
<PRE>
|
2257 |
# A pattern match.
print "OK\n" if /foobar/;

# A division.
print 1 / 2;

# Another pattern match.
print "OK\n" if ?foobar?;

# Conditional.
print $x ? "foo" : "bar";
2268 |
</PRE>
|
2269 |
|
2270 |
<P>
|
2271 |
The slash may either act as the division operator or introduce a
pattern match, whereas the question mark may act as the ternary
conditional operator or as a pattern match, too. Other programming
languages like <CODE>awk</CODE> present similar problems, but the consequences of a
misinterpretation are particularly nasty with Perl sources. In <CODE>awk</CODE>,
for instance, a statement can never exceed one line and the parser
can recover from a parsing error at the next newline and interpret
the rest of the input stream correctly. Perl is different, as a
pattern match is terminated by the next appearance of the delimiter
(the slash or the question mark) in the input stream, regardless of
the semantic context. If a slash is really a division sign but
misinterpreted as a pattern match, the rest of the input file is most
probably parsed incorrectly.
2284 |
|
2285 |
</P>
|
2286 |
<P>
|
2287 |
If you find that <CODE>xgettext</CODE> fails to extract strings from
portions of your sources, you should therefore look out for slashes
or question marks preceding these sections. You may have come
across a bug in <CODE>xgettext</CODE>'s Perl parser (and of course you
should report that bug). In the meantime, consider
reformulating your code in a manner less challenging to <CODE>xgettext</CODE>.
2293 |
|
2294 |
</P>
|
2295 |
|
2296 |
|
2297 |
<H4><A NAME="SEC264" HREF="gettext_toc.html#TOC264">13.5.16.2 Which keywords will xgettext look for?</A></H4> |
2298 |
<P>
|
2299 |
<A NAME="IDX1117"></A> |
2300 |
|
2301 |
</P>
|
2302 |
<P>
|
2303 |
Unless you instruct <CODE>xgettext</CODE> otherwise by invoking it with one |
2304 |
of the options <CODE>--keyword</CODE> or <CODE>-k</CODE>, it will recognize the |
2305 |
following keywords in your Perl sources: |
2306 |
|
2307 |
</P>
|
2308 |
|
2309 |
<UL>
|
2310 |
|
2311 |
<LI><CODE>gettext</CODE> |
2312 |
|
2313 |
<LI><CODE>dgettext</CODE> |
2314 |
|
2315 |
<LI><CODE>dcgettext</CODE> |
2316 |
|
2317 |
<LI><CODE>ngettext:1,2</CODE> |
2318 |
|
2319 |
The first (singular) and the second (plural) argument will be |
2320 |
extracted. |
2321 |
|
2322 |
<LI><CODE>dngettext:1,2</CODE> |
2323 |
|
2324 |
The first (singular) and the second (plural) argument will be |
2325 |
extracted. |
2326 |
|
2327 |
<LI><CODE>dcngettext:1,2</CODE> |
2328 |
|
2329 |
The first (singular) and the second (plural) argument will be |
2330 |
extracted. |
2331 |
|
2332 |
<LI><CODE>gettext_noop</CODE> |
2333 |
|
2334 |
<LI><CODE>%gettext</CODE> |
2335 |
|
2336 |
The keys of lookups into the hash <CODE>%gettext</CODE> will be extracted. |
2337 |
|
2338 |
<LI><CODE>$gettext</CODE> |
2339 |
|
2340 |
The keys of lookups into the hash reference <CODE>$gettext</CODE> will be extracted. |
2341 |
|
2342 |
</UL>
|
2343 |
|
2344 |
|
2345 |
|
2346 |
<H4><A NAME="SEC265" HREF="gettext_toc.html#TOC265">13.5.16.3 How to Extract Hash Keys</A></H4> |
2347 |
<P>
|
2348 |
<A NAME="IDX1118"></A> |
2349 |
|
2350 |
</P>
|
2351 |
<P>
|
2352 |
Translating messages at runtime is normally performed by looking up the |
2353 |
original string in the translation database and returning the |
2354 |
translated version. The "natural" Perl implementation is a hash |
2355 |
lookup, and, of course, <CODE>xgettext</CODE> supports such practice. |
2356 |
|
2357 |
</P>
|
2358 |
|
2359 |
<PRE>
|
2360 |
print __"Hello world!"; |
2361 |
print $__{"Hello world!"}; |
2362 |
print $__->{"Hello world!"};
|
2363 |
print $$__{"Hello world!"}; |
2364 |
</PRE>
|
2365 |
|
2366 |
<P>
|
2367 |
The above four lines all do the same thing. The Perl module |
2368 |
<CODE>Locale::TextDomain</CODE> exports by default a hash <CODE>%__</CODE> that |
2369 |
is tied to the function <CODE>__()</CODE>. It also exports a reference |
2370 |
<CODE>$__</CODE> to <CODE>%__</CODE>. |
2371 |
|
2372 |
</P>
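<P>
How a hash can be "tied" to a function may be easier to see in a small model. The sketch below uses only core Perl; the class <CODE>TranslatingHash</CODE>, the function <CODE>translate</CODE> and the one-entry catalog are hypothetical and are not how <CODE>Locale::TextDomain</CODE> is actually implemented. It only shows that the three hash spellings above end up in the same function call.
</P>

```perl
use strict;
use warnings;

# Hypothetical toy catalog; a real program would read a compiled MO file.
my %catalog = ('Hello world!' => 'Hallo Welt!');

# Stand-in for the translation function (not the real __()).
sub translate {
    my ($msgid) = @_;
    return exists $catalog{$msgid} ? $catalog{$msgid} : $msgid;
}

package TranslatingHash;

# Minimal tie class: every fetch is forwarded to main::translate().
sub TIEHASH { my ($class) = @_; return bless {}, $class }
sub FETCH   { my ($self, $key) = @_; return main::translate($key) }

package main;

tie my %t, 'TranslatingHash';
my $t = \%t;                              # a reference to the tied hash

print $t{"Hello world!"}, "\n";           # plain hash lookup
print $t->{"Hello world!"}, "\n";         # lookup through the reference
print $$t{"Hello world!"}, "\n";          # same, older dereference syntax
print "Greeting: $t{'Hello world!'}\n";   # lookups also interpolate in strings
```

<P>
The last line hints at why the hash form is attractive: unlike a function call, a hash lookup can be interpolated directly into a double-quoted string.
</P>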
|
2373 |
<P>
|
2374 |
If an argument to the <CODE>xgettext</CODE> option <CODE>--keyword</CODE>
(or <CODE>-k</CODE>) starts with a percent sign, the rest of the keyword is
interpreted as the name of a hash. If it starts with a dollar
sign, the rest of the keyword is interpreted as a reference to a
hash.
2379 |
|
2380 |
</P>
|
2381 |
<P>
|
2382 |
Note that you can omit the quotation marks (single or double) around |
2383 |
the hash key (almost) whenever Perl itself allows it: |
2384 |
|
2385 |
</P>
|
2386 |
|
2387 |
<PRE>
|
2388 |
print $gettext{Error}; |
2389 |
</PRE>
|
2390 |
|
2391 |
<P>
|
2392 |
The exact rule is: you can omit the surrounding quotes when the hash
key is a valid C (!) identifier, i.e. when it starts with an
underscore or an ASCII letter and is followed by an arbitrary number
of underscores, ASCII letters or digits. Other Unicode characters
are <EM>not</EM> allowed, regardless of the <CODE>use utf8</CODE> pragma.
2397 |
|
2398 |
</P>
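<P>
The rule can be written down as a regular expression. The helper below is a hypothetical illustration of the rule as stated here, not code taken from <CODE>xgettext</CODE>, and the sample keys are invented.
</P>

```perl
use strict;
use warnings;

# Hypothetical helper: true if $key matches the rule quoted above
# (an ASCII letter or underscore, then ASCII letters, digits or underscores).
sub key_may_be_unquoted {
    my ($key) = @_;
    return $key =~ /\A[A-Za-z_][A-Za-z0-9_]*\z/ ? 1 : 0;
}

for my $key ('Error', '_private', 'file2', '2files', 'not found', "caf\x{e9}") {
    printf "%-12s %s\n", $key,
        key_may_be_unquoted($key) ? 'may be unquoted' : 'needs quotes';
}
```

<P>
Keys that fail the test, such as a key containing a space or a non-ASCII letter, simply have to be written with quotes.
</P>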
|
2399 |
|
2400 |
|
2401 |
<H4><A NAME="SEC266" HREF="gettext_toc.html#TOC266">13.5.16.4 What are Strings And Quote-like Expressions?</A></H4> |
2402 |
<P>
|
2403 |
<A NAME="IDX1119"></A> |
2404 |
|
2405 |
</P>
|
2406 |
<P>
|
2407 |
Perl offers a plethora of different string constructs. Those that can |
2408 |
be used either as arguments to functions or inside braces for hash |
2409 |
lookups are generally supported by <CODE>xgettext</CODE>. |
2410 |
|
2411 |
</P>
|
2412 |
|
2413 |
<UL>
|
2414 |
<LI><STRONG>double-quoted strings</STRONG> |
2415 |
|
2416 |
<BR>
|
2417 |
|
2418 |
<PRE>
|
2419 |
print gettext "Hello World!"; |
2420 |
</PRE>
|
2421 |
|
2422 |
<LI><STRONG>single-quoted strings</STRONG> |
2423 |
|
2424 |
<BR>
|
2425 |
|
2426 |
<PRE>
|
2427 |
print gettext 'Hello World!'; |
2428 |
</PRE>
|
2429 |
|
2430 |
<LI><STRONG>the operator qq</STRONG> |
2431 |
|
2432 |
<BR>
|
2433 |
|
2434 |
<PRE>
|
2435 |
print gettext qq |Hello World!|; |
2436 |
print gettext qq <E-mail: <guido\@imperia.net>>; |
2437 |
</PRE>
|
2438 |
|
2439 |
The operator <CODE>qq</CODE> is fully supported. You can use arbitrary |
2440 |
delimiters, including the four bracketing delimiters (round, angle, |
2441 |
square, curly) that nest. |
2442 |
|
2443 |
<LI><STRONG>the operator q</STRONG> |
2444 |
|
2445 |
<BR>
|
2446 |
|
2447 |
<PRE>
|
2448 |
print gettext q |Hello World!|; |
2449 |
print gettext q <E-mail: <guido@imperia.net>>; |
2450 |
</PRE>
|
2451 |
|
2452 |
The operator <CODE>q</CODE> is fully supported. You can use arbitrary |
2453 |
delimiters, including the four bracketing delimiters (round, angle, |
2454 |
square, curly) that nest. |
2455 |
|
2456 |
<LI><STRONG>the operator qx</STRONG> |
2457 |
|
2458 |
<BR>
|
2459 |
|
2460 |
<PRE>
|
2461 |
print gettext qx ;LANGUAGE=C /bin/date; |
2462 |
print gettext qx [/usr/bin/ls | grep '^[A-Z]*']; |
2463 |
</PRE>
|
2464 |
|
2465 |
The operator <CODE>qx</CODE> is fully supported. You can use arbitrary |
2466 |
delimiters, including the four bracketing delimiters (round, angle, |
2467 |
square, curly) that nest. |
2468 |
|
2469 |
The example is actually a useless use of <CODE>gettext</CODE>. It will |
2470 |
invoke the <CODE>gettext</CODE> function on the output of the command |
2471 |
specified with the <CODE>qx</CODE> operator. The feature was included |
2472 |
in order to make the interface consistent (the parser will extract |
2473 |
all strings and quote-like expressions). |
2474 |
|
2475 |
<LI><STRONG>here documents</STRONG> |
2476 |
|
2477 |
<BR>
|
2478 |
|
2479 |
<PRE>
|
2480 |
print gettext <<'EOF'; |
2481 |
program not found in $PATH |
2482 |
EOF |
2483 |
|
2484 |
print ngettext <<EOF, <<"EOF"; |
2485 |
one file deleted |
2486 |
EOF |
2487 |
several files deleted |
2488 |
EOF |
2489 |
</PRE>
|
2490 |
|
2491 |
Here-documents are recognized. If the delimiter is enclosed in single |
2492 |
quotes, the string is not interpolated. If it is enclosed in double |
2493 |
quotes or has no quotes at all, the string is interpolated. |
2494 |
|
2495 |
Delimiters that start with a digit are not supported! |
2496 |
|
2497 |
</UL>
|
2498 |
|
2499 |
|
2500 |
|
2501 |
<H4><A NAME="SEC267" HREF="gettext_toc.html#TOC267">13.5.16.5 Invalid Uses Of String Interpolation</A></H4> |
2502 |
<P>
|
2503 |
<A NAME="IDX1120"></A> |
2504 |
|
2505 |
</P>
|
2506 |
<P>
|
2507 |
Perl is capable of interpolating variables into strings. This offers |
2508 |
some nice features in localized programs but can also lead to |
2509 |
problems. |
2510 |
|
2511 |
</P>
|
2512 |
<P>
|
2513 |
A common error is a construct like the following: |
2514 |
|
2515 |
</P>
|
2516 |
|
2517 |
<PRE>
|
2518 |
print gettext "This is the program $0!\n"; |
2519 |
</PRE>
|
2520 |
|
2521 |
<P>
|
2522 |
Perl will interpolate at runtime the value of the variable <CODE>$0</CODE> |
2523 |
into the argument of the <CODE>gettext()</CODE> function. Hence, this |
2524 |
argument is not a string constant but a variable argument (<CODE>$0</CODE> |
2525 |
is a global variable that holds the name of the Perl script being |
2526 |
executed). The interpolation is performed by Perl before the string |
2527 |
argument is passed to <CODE>gettext()</CODE> and will therefore depend on |
2528 |
the name of the script which can only be determined at runtime. |
2529 |
Consequently, a translation can almost never be found
at runtime (unless, by accident, the interpolated string happens to be
in the message catalog).
2532 |
|
2533 |
</P>
|
2534 |
<P>
|
2535 |
The <CODE>xgettext</CODE> program will therefore terminate parsing with a fatal |
2536 |
error if it encounters a variable inside of an extracted string. In |
2537 |
general, this will happen for all kinds of string interpolations that |
2538 |
cannot be safely performed at compile time. If you absolutely know |
2539 |
what you are doing, you can always circumvent this behavior: |
2540 |
|
2541 |
</P>
|
2542 |
|
2543 |
<PRE>
|
2544 |
my $know_what_i_am_doing = "This is program $0!\n"; |
2545 |
print gettext $know_what_i_am_doing; |
2546 |
</PRE>
|
2547 |
|
2548 |
<P>
|
2549 |
Since the parser only recognizes strings and quote-like expressions, |
2550 |
but not variables or other terms, the above construct will be |
2551 |
accepted. You will have to find another way, however, to let your |
2552 |
original string make it into your message catalog. |
2553 |
|
2554 |
</P>
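<P>
One conventional way out, sketched here under stated assumptions, is to keep the message constant by using a placeholder and to substitute the value only after the lookup. The catalog and the <CODE>gettext</CODE> stand-in below are hypothetical and merely model a lookup; a fixed script name is used instead of <CODE>$0</CODE> so that the result does not depend on how the example is run.
</P>

```perl
use strict;
use warnings;

# Toy catalog standing in for a compiled message catalog (hypothetical data).
my %catalog = (
    "This is the program %s!\n" => "Dies ist das Programm %s!\n",
);

# Stand-in for the real gettext(): returns the msgid itself when no
# translation is known, as GNU gettext does.
sub gettext {
    my ($msgid) = @_;
    return exists $catalog{$msgid} ? $catalog{$msgid} : $msgid;
}

my $script = 'backup.pl';    # fixed value instead of $0

# Interpolated first: the lookup key changes with the script name
# and is therefore not found in the catalog.
my $unstable = gettext "This is the program $script!\n";

# Constant msgid with a placeholder: looked up first, filled in afterwards.
my $stable = sprintf gettext("This is the program %s!\n"), $script;

print $unstable, $stable;
```

<P>
Because the second call passes a plain string constant, an extractor has a fixed message to put into the catalog, and the translator sees the placeholder in context.
</P>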
|
2555 |
<P>
|
2556 |
If <CODE>xgettext</CODE> is invoked with the option <CODE>--extract-all</CODE>
(or <CODE>-a</CODE>), variable interpolation will be accepted. Rationale: you will
generally use this option in order to prepare your sources for
internationalization.
2560 |
|
2561 |
</P>
|
2562 |
<P>
|
2563 |
Please see the manual page <SAMP>`man perlop´</SAMP> for details of strings and |
2564 |
quote-like expressions that are subject to interpolation and those |
2565 |
that are not. Safe interpolations (that will not lead to a fatal |
2566 |
error) are: |
2567 |
|
2568 |
</P>
|
2569 |
|
2570 |
<UL>
|
2571 |
|
2572 |
<LI>the escape sequences <CODE>\t</CODE> (tab, HT, TAB), <CODE>\n</CODE> |
2573 |
|
2574 |
(newline, NL), <CODE>\r</CODE> (return, CR), <CODE>\f</CODE> (form feed, FF), |
2575 |
<CODE>\b</CODE> (backspace, BS), <CODE>\a</CODE> (alarm, bell, BEL), and <CODE>\e</CODE> |
2576 |
(escape, ESC). |
2577 |
|
2578 |
<LI>octal chars, like <CODE>\033</CODE> |
2579 |
|
2580 |
<BR>
|
2581 |
Note that octal escapes in the range of 400-777 are translated into a |
2582 |
UTF-8 representation, regardless of the presence of the <CODE>use utf8</CODE> pragma. |
2583 |
|
2584 |
<LI>hex chars, like <CODE>\x1b</CODE> |
2585 |
|
2586 |
<LI>wide hex chars, like <CODE>\x{263a}</CODE> |
2587 |
|
2588 |
<BR>
|
2589 |
Note that this escape is translated into a UTF-8 representation, |
2590 |
regardless of the presence of the <CODE>use utf8</CODE> pragma. |
2591 |
|
2592 |
<LI>control chars, like <CODE>\c[</CODE> (CTRL-[) |
2593 |
|
2594 |
<LI>named Unicode chars, like <CODE>\N{LATIN CAPITAL LETTER C WITH CEDILLA}</CODE> |
2595 |
|
2596 |
<BR>
|
2597 |
Note that this escape is translated into a UTF-8 representation, |
2598 |
regardless of the presence of the <CODE>use utf8</CODE> pragma. |
2599 |
</UL>
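<P>
These escapes are safe because each one denotes a fixed character that is known without running the program. The following core-Perl sketch only illustrates which characters some of the spellings stand for; it says nothing about how <CODE>xgettext</CODE> handles them internally.
</P>

```perl
use strict;
use warnings;
use charnames ':full';    # needed for \N{...} on Perl versions before 5.16

# Four spellings of the same ESC character (code point 27).
my @esc = ("\033", "\x1b", "\e", "\c[");
my $all_same = !grep { $_ ne chr 27 } @esc;

# Wide hex and named escapes denote exactly one character each.
my $smiley  = "\x{263a}";
my $cedilla = "\N{LATIN CAPITAL LETTER C WITH CEDILLA}";

printf "ESC spellings agree: %s\n", $all_same ? 'yes' : 'no';
printf "U+%04X U+%04X\n", ord $smiley, ord $cedilla;
```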
|
2600 |
|
2601 |
<P>
|
2602 |
The following escapes are considered partially safe: |
2603 |
|
2604 |
</P>
|
2605 |
|
2606 |
<UL>
|
2607 |
|
2608 |
<LI><CODE>\l</CODE> lowercase next char |
2609 |
|
2610 |
<LI><CODE>\u</CODE> uppercase next char |
2611 |
|
2612 |
<LI><CODE>\L</CODE> lowercase till \E |
2613 |
|
2614 |
<LI><CODE>\U</CODE> uppercase till \E |
2615 |
|
2616 |
<LI><CODE>\E</CODE> end case modification |
2617 |
|
2618 |
<LI><CODE>\Q</CODE> quote non-word characters till \E |
2619 |
|
2620 |
</UL>
|
2621 |
|
2622 |
<P>
|
2623 |
These escapes are only considered safe if the string consists of |
2624 |
ASCII characters only. Translation of characters outside the range |
2625 |
defined by ASCII is locale-dependent and can actually only be performed |
2626 |
at runtime; <CODE>xgettext</CODE> doesn't do these locale-dependent translations |
2627 |
at extraction time. |
2628 |
|
2629 |
</P>
|
2630 |
<P>
|
2631 |
Except for the modifier <CODE>\Q</CODE>, these translations, albeit valid, |
2632 |
are generally useless and only obfuscate your sources. If a |
2633 |
translation can be safely performed at compile time you can just as |
2634 |
well write what you mean. |
2635 |
|
2636 |
</P>
|
2637 |
|
2638 |
|
2639 |
<H4><A NAME="SEC268" HREF="gettext_toc.html#TOC268">13.5.16.6 Valid Uses Of String Interpolation</A></H4> |
2640 |
<P>
|
2641 |
<A NAME="IDX1121"></A> |
2642 |
|
2643 |
</P>
|
2644 |
<P>
|
2645 |
Perl is often used to generate sources for other programming languages
or arbitrary file formats. Web applications that output HTML code
are a prominent example of such usage.
2648 |
|
2649 |
</P>
|
2650 |
<P>
|
2651 |
You will often come across situations where you want to intersperse |
2652 |
code written in the target (programming) language with translatable |
2653 |
messages, like in the following HTML example: |
2654 |
|
2655 |
</P>
|
2656 |
|
2657 |
<PRE>
|
2658 |
print gettext <<EOF; |
2659 |
<h1>My Homepage</h1> |
2660 |
<script language="JavaScript"><!-- |
2661 |
for (i = 0; i < 100; ++i) {
|
2662 |
alert ("Thank you so much for visiting my homepage!"); |
2663 |
} |
2664 |
//--></script> |
2665 |
EOF |
2666 |
</PRE>
|
2667 |
|
2668 |
<P>
|
2669 |
The parser will extract the entire here document, and it will appear
in full in the resulting PO file, including the JavaScript snippet
embedded in the HTML code. If you overuse constructs like
the above, you run the risk that the translators of your package
will look for a less challenging project. You should consider an
alternative expression here:
2675 |
|
2676 |
</P>
|
2677 |
|
2678 |
<PRE>
|
2679 |
print <<EOF; |
2680 |
<h1>$gettext{"My Homepage"}</h1> |
2681 |
<script language="JavaScript"><!-- |
2682 |
for (i = 0; i < 100; ++i) {
|
2683 |
alert ("$gettext{'Thank you so much for visiting my homepage!'}"); |
2684 |
} |
2685 |
//--></script> |
2686 |
EOF |
2687 |
</PRE>
|
2688 |
|
2689 |
<P>
|
2690 |
Only the translatable portions of the code will be extracted here, and
the resulting PO file will be noticeably more readable.
2692 |
|
2693 |
</P>
|
2694 |
<P>
|
2695 |
You can interpolate hash lookups in all strings or quote-like |
2696 |
expressions that are subject to interpolation (see the manual page |
2697 |
<SAMP>`man perlop´</SAMP> for details). Double interpolation is invalid, however: |
2698 |
|
2699 |
</P>
|
2700 |
|
2701 |
<PRE>
|
2702 |
# TRANSLATORS: Replace "the earth" with the name of your planet. |
2703 |
print gettext qq{Welcome to $gettext->{"the earth"}};
|
2704 |
</PRE>
|
2705 |
|
2706 |
<P>
|
2707 |
The <CODE>qq</CODE>-quoted string is first recognized as an argument to <CODE>gettext</CODE>
and checked for invalid variable interpolation. The
dollar sign of the hash dereference will therefore terminate the parser
with an "invalid interpolation" error.
2711 |
|
2712 |
</P>
|
2713 |
<P>
|
2714 |
It is valid to interpolate hash lookups in regular expressions: |
2715 |
|
2716 |
</P>
|
2717 |
|
2718 |
<PRE>
|
2719 |
if ($var =~ /$gettext{"the earth"}/) { |
2720 |
print gettext "Match!\n"; |
2721 |
} |
2722 |
s/$gettext{"U. S. A."}/$gettext{"U. S. A."} $gettext{"(dial +0)"}/g; |
2723 |
</PRE>
|
2724 |
|
2725 |
|
2726 |
|
2727 |
<H4><A NAME="SEC269" HREF="gettext_toc.html#TOC269">13.5.16.7 When To Use Parentheses</A></H4> |
2728 |
<P>
|
2729 |
<A NAME="IDX1122"></A> |
2730 |
|
2731 |
</P>
|
2732 |
<P>
|
2733 |
In Perl, parentheses around function arguments are mostly optional.
<CODE>xgettext</CODE> will always assume that all
recognized keywords (except for hashes and hash references) are names
of properly prototyped functions, and will (hopefully) only require
parentheses where Perl itself requires them. All constructs in the
following example are therefore ok to use:
2739 |
|
2740 |
</P>
|
2741 |
|
2742 |
<PRE>
|
2743 |
print gettext ("Hello World!\n"); |
2744 |
print gettext "Hello World!\n"; |
2745 |
print dgettext ($package => "Hello World!\n");
|
2746 |
print dgettext $package, "Hello World!\n"; |
2747 |
|
2748 |
# The "fat comma" => turns the left-hand side argument into a
|
2749 |
# single-quoted string! |
2750 |
print dgettext smellovision => "Hello World!\n";
|
2751 |
|
2752 |
# The following assignment only works with prototyped functions. |
2753 |
# Otherwise, the functions will act as "greedy" list operators and |
2754 |
# eat up all following arguments. |
2755 |
my $anonymous_hash = { |
2756 |
planet => gettext "earth",
|
2757 |
cakes => ngettext "one cake", "several cakes", $n,
|
2758 |
still => $works,
|
2759 |
}; |
2760 |
# The same without fat comma: |
2761 |
my $other_hash = { |
2762 |
'planet', gettext "earth", |
2763 |
'cakes', ngettext "one cake", "several cakes", $n, |
2764 |
'still', $works, |
2765 |
}; |
2766 |
|
2767 |
# Parentheses are only significant for the first argument. |
2768 |
print dngettext 'package', ("one cake", "several cakes", $n), $discarded; |
2769 |
</PRE>
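<P>
The remark about "greedy" list operators can be checked with a small core-Perl experiment. Both functions below are hypothetical stand-ins that only wrap their arguments; the single-argument prototype <CODE>($)</CODE> is an assumption made for this sketch and is not a statement about how any real <CODE>gettext</CODE> binding is declared.
</P>

```perl
use strict;
use warnings;

# Hypothetical stand-ins; neither is the real gettext().
# Prototype ($): behaves like a named unary operator, takes one argument.
sub unary_gettext ($) { return "T($_[0])" }

# No prototype: behaves like a list operator and takes everything that follows.
sub greedy_gettext { return 'T(' . join('|', @_) . ')' }

# The prototyped call stops at the first comma ...
my @with_prototype = (planet => unary_gettext "earth", still => 'works');

# ... while the unprototyped one swallows the rest of the list.
my @without_prototype = (planet => greedy_gettext "earth", still => 'works');

print scalar(@with_prototype), " elements: @with_prototype\n";
print scalar(@without_prototype), " elements: @without_prototype\n";
```

<P>
Writing the call with explicit parentheses, as in <CODE>gettext ("earth")</CODE>, avoids the question entirely and is the safer choice when the prototype of the function is not known.
</P>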
|
2770 |
|
2771 |
|
2772 |
|
2773 |
<H4><A NAME="SEC270" HREF="gettext_toc.html#TOC270">13.5.16.8 How To Grok with Long Lines</A></H4> |
2774 |
<P>
|
2775 |
<A NAME="IDX1123"></A> |
2776 |
|
2777 |
</P>
|
2778 |
<P>
|
2779 |
The necessity of long messages can often lead to a cumbersome or |
2780 |
unreadable coding style. Perl has several options that may prevent |
2781 |
you from writing unreadable code, and |
2782 |
<CODE>xgettext</CODE> does its best to do likewise. This is where the dot |
2783 |
operator (the string concatenation operator) may come in handy: |
2784 |
|
2785 |
</P>
|
2786 |
|
2787 |
<PRE>
|
2788 |
print gettext ("This is a very long" |
2789 |
. " message that is still" |
2790 |
. " readable, because" |
2791 |
. " it is split into" |
2792 |
. " multiple lines.\n"); |
2793 |
</PRE>
|
2794 |
|
2795 |
<P>
|
2796 |
Perl is smart enough to concatenate these constant string fragments |
2797 |
into one long string at compile time, and so is |
2798 |
<CODE>xgettext</CODE>. You will only find one long message in the resulting |
2799 |
POT file. |
2800 |
|
2801 |
</P>
|
2802 |
<P>
|
2803 |
Note that the future Perl 6 will probably use the underscore |
2804 |
(<SAMP>`_´</SAMP>) as the string concatenation operator, and the dot |
2805 |
(<SAMP>`.´</SAMP>) for dereferencing. This new syntax is not yet supported by |
2806 |
<CODE>xgettext</CODE>. |
2807 |
|
2808 |
</P>
|
2809 |
<P>
|
2810 |
If embedded newline characters are not an issue, or even desired, you |
2811 |
may also insert newline characters inside quoted strings wherever you |
2812 |
feel like it: |
2813 |
|
2814 |
</P>
|
2815 |
|
2816 |
<PRE>
|
2817 |
print gettext ("<em>In HTML output |
2818 |
embedded newlines are generally no |
2819 |
problem, since adjacent whitespace |
2820 |
is always rendered into a single |
2821 |
space character.</em>"); |
2822 |
</PRE>
|
2823 |
|
2824 |
<P>
|
2825 |
You may also consider using here documents:
2826 |
|
2827 |
</P>
|
2828 |
|
2829 |
<PRE>
|
2830 |
print gettext <<EOF; |
2831 |
<em>In HTML output |
2832 |
embedded newlines are generally no |
2833 |
problem, since adjacent whitespace |
2834 |
is always rendered into a single |
2835 |
space character.</em> |
2836 |
EOF |
2837 |
</PRE>
|
2838 |
|
2839 |
<P>
|
2840 |
Please do not forget that the line breaks are real, i.e. they
translate into newline characters that will consequently show up in
the resulting POT file.
2843 |
|
2844 |
</P>
|
2845 |
|
2846 |
|
2847 |
<H4><A NAME="SEC271" HREF="gettext_toc.html#TOC271">13.5.16.9 Bugs, Pitfalls, And Things That Do Not Work</A></H4> |
2848 |
<P>
|
2849 |
<A NAME="IDX1124"></A> |
2850 |
|
2851 |
</P>
|
2852 |
<P>
|
2853 |
The foregoing sections should have proven that |
2854 |
<CODE>xgettext</CODE> is quite smart in extracting translatable strings from |
2855 |
Perl sources. Yet, some more or less exotic constructs that could be |
2856 |
expected to work actually do not work. |
2857 |
|
2858 |
</P>
|
2859 |
<P>
|
2860 |
One of the more relevant limitations can be found in the |
2861 |
implementation of variable interpolation inside quoted strings. Only |
2862 |
simple hash lookups can be used there: |
2863 |
|
2864 |
</P>
|
2865 |
|
2866 |
<PRE>
|
2867 |
print <<EOF; |
2868 |
$gettext{"The dot operator" |
2869 |
. " does not work" |
2870 |
. " here!"} |
2871 |
Likewise, you cannot @{[ gettext ("interpolate function calls") ]} |
2872 |
inside quoted strings or quote-like expressions. |
2873 |
EOF |
2874 |
</PRE>
|
2875 |
|
2876 |
<P>
|
2877 |
This is valid Perl code and will actually trigger invocations of the |
2878 |
<CODE>gettext</CODE> function at runtime. Yet, the Perl parser in |
2879 |
<CODE>xgettext</CODE> will fail to recognize the strings. A less obvious |
2880 |
example can be found in the interpolation of regular expressions: |
2881 |
|
2882 |
</P>
|
2883 |
|
2884 |
<PRE>
|
2885 |
s/<!--START_OF_WEEK-->/gettext ("Sunday")/e; |
2886 |
</PRE>
|
2887 |
|
2888 |
<P>
|
2889 |
The modifier <CODE>e</CODE> will cause the substitution to be interpreted as |
2890 |
an evaluable statement. Consequently, at runtime the function |
2891 |
<CODE>gettext()</CODE> is called, but again, the parser fails to extract the |
2892 |
string "Sunday". Use a temporary variable as a simple workaround if |
2893 |
you really happen to need this feature: |
2894 |
|
2895 |
</P>
|
2896 |
|
2897 |
<PRE>
|
2898 |
my $sunday = gettext "Sunday"; |
2899 |
s/<!--START_OF_WEEK-->/$sunday/; |
2900 |
</PRE>
|
2901 |
|
2902 |
<P>
|
2903 |
Hash slices would also be handy but are not recognized: |
2904 |
|
2905 |
</P>
|
2906 |
|
2907 |
<PRE>
|
2908 |
my @weekdays = @gettext{'Sunday', 'Monday', 'Tuesday', 'Wednesday', |
2909 |
'Thursday', 'Friday', 'Saturday'}; |
2910 |
# Or even: |
2911 |
@weekdays = @gettext{qw (Sunday Monday Tuesday Wednesday Thursday |
2912 |
Friday Saturday) }; |
2913 |
</PRE>
|
2914 |
|
2915 |
<P>
|
2916 |
This is perfectly valid usage of the tied hash <CODE>%gettext</CODE> but the |
2917 |
strings are not recognized and therefore will not be extracted. |
2918 |
|
2919 |
</P>
|
2920 |
<P>
|
2921 |
Another caveat of the current version is its rudimentary support for |
2922 |
non-ASCII characters in identifiers. You may encounter serious |
2923 |
problems if you use identifiers with characters outside the range of |
2924 |
'A'-'Z', 'a'-'z', '0'-'9' and the underscore '_'. |
2925 |
|
2926 |
</P>
|
2927 |
<P>
|
2928 |
Maybe some of these missing features will be implemented in future |
2929 |
versions, but since you can always make do without them at minimal effort, |
2930 |
these todos have very low priority. |
2931 |
|
2932 |
</P>
|
2933 |
<P>
|
2934 |
A nasty problem arises with brace format strings that already contain braces |
2935 |
as part of the normal text, for example the usage strings typically |
2936 |
encountered in programs: |
2937 |
|
2938 |
</P>
|
2939 |
|
2940 |
<PRE>
|
2941 |
die "usage: $0 {OPTIONS} FILENAME...\n"; |
2942 |
</PRE>
|
2943 |
|
2944 |
<P>
|
2945 |
If you want to internationalize this code with Perl brace format strings, |
2946 |
you will run into a problem: |
2947 |
|
2948 |
</P>
|
2949 |
|
2950 |
<PRE>
|
2951 |
die __x ("usage: {program} {OPTIONS} FILENAME...\n", program => $0);
|
2952 |
</PRE>
|
2953 |
|
2954 |
<P>
|
2955 |
Whereas <SAMP>`{program}´</SAMP> is a placeholder, <SAMP>`{OPTIONS}´</SAMP> |
2956 |
is not and should probably be translated. Yet, there is no way to teach |
2957 |
the Perl parser in <CODE>xgettext</CODE> to recognize the first one, and leave |
2958 |
the other one alone. |
2959 |
|
2960 |
</P>
|
2961 |
<P>
|
2962 |
There are two possible work-arounds for this problem. If you are |
2963 |
sure that your program will run under Perl 5.8.0 or newer (these |
2964 |
Perl versions handle positional parameters in <CODE>printf()</CODE>) or |
2965 |
if you are sure that the translator will not have to reorder the arguments |
2966 |
in her translation -- for example if you have only one brace placeholder |
2967 |
in your string, or if it describes a syntax, like in this one --, you can |
2968 |
mark the string as <CODE>no-perl-brace-format</CODE> and use <CODE>printf()</CODE>: |
2969 |
|
2970 |
</P>
|
2971 |
|
2972 |
<PRE>
|
2973 |
# xgettext: no-perl-brace-format |
2974 |
die sprintf (gettext ("usage: %s {OPTIONS} FILENAME...\n"), $0); |
2975 |
</PRE>
|
2976 |
|
2977 |
<P>
|
2978 |
If you want to use the more portable Perl brace format, you will have to |
2979 |
put placeholders in place of the literal braces: |
2980 |
|
2981 |
</P>
|
2982 |
|
2983 |
<PRE>
|
2984 |
die __x ("usage: {program} {[}OPTIONS{]} FILENAME...\n", |
2985 |
program => $0, '[' => '{', ']' => '}'); |
2986 |
</PRE>
|
2987 |
|
2988 |
<P>
|
2989 |
Perl brace format strings have no escaping mechanism. No matter what such an |
2990 |
escaping mechanism looked like, it would either give the programmer a |
2991 |
hard time, make translating Perl brace format strings heavy-going, or |
2992 |
result in a performance penalty at runtime, when the format directives |
2993 |
get executed. Most of the time you will happily get along with |
2994 |
<CODE>printf()</CODE> for this special case. |
2995 |
|
2996 |
</P>
|
2997 |
|
2998 |
|
2999 |
<H3><A NAME="SEC272" HREF="gettext_toc.html#TOC272">13.5.17 PHP Hypertext Preprocessor</A></H3> |
3000 |
<P>
|
3001 |
<A NAME="IDX1125"></A> |
3002 |
|
3003 |
</P>
|
3004 |
<DL COMPACT> |
3005 |
|
3006 |
<DT>RPMs
|
3007 |
<DD>
|
3008 |
mod_php4, mod_php4-core, phpdoc |
3009 |
|
3010 |
<DT>File extension
|
3011 |
<DD>
|
3012 |
<CODE>php</CODE>, <CODE>php3</CODE>, <CODE>php4</CODE> |
3013 |
|
3014 |
<DT>String syntax
|
3015 |
<DD>
|
3016 |
<CODE>"abc"</CODE>, <CODE>'abc'</CODE> |
3017 |
|
3018 |
<DT>gettext shorthand
|
3019 |
<DD>
|
3020 |
<CODE>_("abc")</CODE> |
3021 |
|
3022 |
<DT>gettext/ngettext functions
|
3023 |
<DD>
|
3024 |
<CODE>gettext</CODE>, <CODE>dgettext</CODE>, <CODE>dcgettext</CODE>; starting with PHP 4.2.0 |
3025 |
also <CODE>ngettext</CODE>, <CODE>dngettext</CODE>, <CODE>dcngettext</CODE> |
3026 |
|
3027 |
<DT>textdomain
|
3028 |
<DD>
|
3029 |
<CODE>textdomain</CODE> function |
3030 |
|
3031 |
<DT>bindtextdomain
|
3032 |
<DD>
|
3033 |
<CODE>bindtextdomain</CODE> function |
3034 |
|
3035 |
<DT>setlocale
|
3036 |
<DD>
|
3037 |
Programmer must call <CODE>setlocale (LC_ALL, "")</CODE> |
3038 |
|
3039 |
<DT>Prerequisite
|
3040 |
<DD>
|
3041 |
--- |
3042 |
|
3043 |
<DT>Use or emulate GNU gettext
|
3044 |
<DD>
|
3045 |
use |
3046 |
|
3047 |
<DT>Extractor
|
3048 |
<DD>
|
3049 |
<CODE>xgettext</CODE> |
3050 |
|
3051 |
<DT>Formatting with positions
|
3052 |
<DD>
|
3053 |
<CODE>printf "%2\$d %1\$d"</CODE> |
3054 |
|
3055 |
<DT>Portability
|
3056 |
<DD>
|
3057 |
On platforms without gettext, the functions are not available. |
3058 |
|
3059 |
<DT>po-mode marking
|
3060 |
<DD>
|
3061 |
--- |
3062 |
</DL>
|
3063 |
|
3064 |
<P>
|
3065 |
An example is available in the <TT>`examples´</TT> directory: <CODE>hello-php</CODE>. |
3066 |
|
3067 |
</P>
|
3068 |
|
3069 |
|
3070 |
<H3><A NAME="SEC273" HREF="gettext_toc.html#TOC273">13.5.18 Pike</A></H3> |
3071 |
<P>
|
3072 |
<A NAME="IDX1126"></A> |
3073 |
|
3074 |
</P>
|
3075 |
<DL COMPACT> |
3076 |
|
3077 |
<DT>RPMs
|
3078 |
<DD>
|
3079 |
roxen |
3080 |
|
3081 |
<DT>File extension
|
3082 |
<DD>
|
3083 |
<CODE>pike</CODE> |
3084 |
|
3085 |
<DT>String syntax
|
3086 |
<DD>
|
3087 |
<CODE>"abc"</CODE> |
3088 |
|
3089 |
<DT>gettext shorthand
|
3090 |
<DD>
|
3091 |
--- |
3092 |
|
3093 |
<DT>gettext/ngettext functions
|
3094 |
<DD>
|
3095 |
<CODE>gettext</CODE>, <CODE>dgettext</CODE>, <CODE>dcgettext</CODE> |
3096 |
|
3097 |
<DT>textdomain
|
3098 |
<DD>
|
3099 |
<CODE>textdomain</CODE> function |
3100 |
|
3101 |
<DT>bindtextdomain
|
3102 |
<DD>
|
3103 |
<CODE>bindtextdomain</CODE> function |
3104 |
|
3105 |
<DT>setlocale
|
3106 |
<DD>
|
3107 |
<CODE>setlocale</CODE> function |
3108 |
|
3109 |
<DT>Prerequisite
|
3110 |
<DD>
|
3111 |
<CODE>import Locale.Gettext;</CODE> |
3112 |
|
3113 |
<DT>Use or emulate GNU gettext
|
3114 |
<DD>
|
3115 |
use |
3116 |
|
3117 |
<DT>Extractor
|
3118 |
<DD>
|
3119 |
--- |
3120 |
|
3121 |
<DT>Formatting with positions
|
3122 |
<DD>
|
3123 |
--- |
3124 |
|
3125 |
<DT>Portability
|
3126 |
<DD>
|
3127 |
On platforms without gettext, the functions are not available. |
3128 |
|
3129 |
<DT>po-mode marking
|
3130 |
<DD>
|
3131 |
--- |
3132 |
</DL>
|
3133 |
|
3134 |
|
3135 |
|
3136 |
<H3><A NAME="SEC274" HREF="gettext_toc.html#TOC274">13.5.19 GNU Compiler Collection sources</A></H3> |
3137 |
<P>
|
3138 |
<A NAME="IDX1127"></A> |
3139 |
|
3140 |
</P>
|
3141 |
<DL COMPACT> |
3142 |
|
3143 |
<DT>RPMs
|
3144 |
<DD>
|
3145 |
gcc |
3146 |
|
3147 |
<DT>File extension
|
3148 |
<DD>
|
3149 |
<CODE>c</CODE>, <CODE>h</CODE> |
3150 |
|
3151 |
<DT>String syntax
|
3152 |
<DD>
|
3153 |
<CODE>"abc"</CODE> |
3154 |
|
3155 |
<DT>gettext shorthand
|
3156 |
<DD>
|
3157 |
<CODE>_("abc")</CODE> |
3158 |
|
3159 |
<DT>gettext/ngettext functions
|
3160 |
<DD>
|
3161 |
<CODE>gettext</CODE>, <CODE>dgettext</CODE>, <CODE>dcgettext</CODE>, <CODE>ngettext</CODE>, |
3162 |
<CODE>dngettext</CODE>, <CODE>dcngettext</CODE> |
3163 |
|
3164 |
<DT>textdomain
|
3165 |
<DD>
|
3166 |
<CODE>textdomain</CODE> function |
3167 |
|
3168 |
<DT>bindtextdomain
|
3169 |
<DD>
|
3170 |
<CODE>bindtextdomain</CODE> function |
3171 |
|
3172 |
<DT>setlocale
|
3173 |
<DD>
|
3174 |
Programmer must call <CODE>setlocale (LC_ALL, "")</CODE> |
3175 |
|
3176 |
<DT>Prerequisite
|
3177 |
<DD>
|
3178 |
<CODE>#include "intl.h"</CODE> |
3179 |
|
3180 |
<DT>Use or emulate GNU gettext
|
3181 |
<DD>
|
3182 |
use |
3183 |
|
3184 |
<DT>Extractor
|
3185 |
<DD>
|
3186 |
<CODE>xgettext -k_</CODE> |
3187 |
|
3188 |
<DT>Formatting with positions
|
3189 |
<DD>
|
3190 |
--- |
3191 |
|
3192 |
<DT>Portability
|
3193 |
<DD>
|
3194 |
Uses autoconf macros |
3195 |
|
3196 |
<DT>po-mode marking
|
3197 |
<DD>
|
3198 |
yes |
3199 |
</DL>
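<P>
As a rough sketch of how the rows of this table fit together in a C source
file, the following fragment defines the <CODE>_</CODE> shorthand, performs the
<CODE>setlocale</CODE>, <CODE>bindtextdomain</CODE> and <CODE>textdomain</CODE>
calls, and uses <CODE>gettext</CODE> and <CODE>ngettext</CODE>. It assumes that
the public header <CODE>&lt;libintl.h&gt;</CODE> stands in for GCC's internal
<CODE>"intl.h"</CODE>. The names <CODE>PACKAGE</CODE>, <CODE>LOCALEDIR</CODE>,
<CODE>init_i18n</CODE>, <CODE>format_file_count</CODE> and <CODE>greeting</CODE>
are hypothetical and are not taken from the GCC sources.
</P>

```c
#include <assert.h>
#include <libintl.h>
#include <locale.h>
#include <stdio.h>

/* Shorthand marker from the table; "xgettext -k_" extracts its argument. */
#define _(msgid) gettext (msgid)

/* Hypothetical values for illustration, not taken from the GCC sources. */
#define PACKAGE "example-pkg"
#define LOCALEDIR "/usr/share/locale"

/* Select the user's locale and the message catalog; call once at startup. */
static void
init_i18n (void)
{
  setlocale (LC_ALL, "");
  bindtextdomain (PACKAGE, LOCALEDIR);
  textdomain (PACKAGE);
}

/* Format a count using the plural form chosen by ngettext.
   Returns the value of snprintf, i.e. the length of the full message. */
static int
format_file_count (char *buf, size_t size, int n)
{
  assert (buf != NULL && size > 0);
  return snprintf (buf, size,
                   ngettext ("%d file", "%d files", (unsigned long) n), n);
}

/* Return the translated greeting, or the msgid itself if no catalog
   for PACKAGE is installed. */
static const char *
greeting (void)
{
  return _("Hello, world!");
}
```

<P>
A program would call <CODE>init_i18n ()</CODE> once at startup and then use
<CODE>greeting ()</CODE> or <CODE>format_file_count ()</CODE> wherever the
messages are needed. When no catalog for the text domain is installed, both
functions fall back to the untranslated English strings.
</P>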
|
3200 |
|
3201 |
|
3202 |
|
3203 |
<H2><A NAME="SEC275" HREF="gettext_toc.html#TOC275">13.6 Internationalizable Data</A></H2> |
3204 |
|
3205 |
<P>
|
3206 |
Here is a list of other data formats which can be internationalized |
3207 |
using GNU gettext. |
3208 |
|
3209 |
</P>
|
3210 |
|
3211 |
|
3212 |
|
3213 |
<H3><A NAME="SEC276" HREF="gettext_toc.html#TOC276">13.6.1 POT - Portable Object Template</A></H3> |
3214 |
|
3215 |
<DL COMPACT> |
3216 |
|
3217 |
<DT>RPMs
|
3218 |
<DD>
|
3219 |
gettext |
3220 |
|
3221 |
<DT>File extension
|
3222 |
<DD>
|
3223 |
<CODE>pot</CODE>, <CODE>po</CODE> |
3224 |
|
3225 |
<DT>Extractor
|
3226 |
<DD>
|
3227 |
<CODE>xgettext</CODE> |
3228 |
</DL>
|
3229 |
|
3230 |
|
3231 |
|
3232 |
<H3><A NAME="SEC277" HREF="gettext_toc.html#TOC277">13.6.2 Resource String Table</A></H3> |
3233 |
<P>
|
3234 |
<A NAME="IDX1128"></A> |
3235 |
|
3236 |
</P>
|
3237 |
<DL COMPACT> |
3238 |
|
3239 |
<DT>RPMs
|
3240 |
<DD>
|
3241 |
fpk |
3242 |
|
3243 |
<DT>File extension
|
3244 |
<DD>
|
3245 |
<CODE>rst</CODE> |
3246 |
|
3247 |
<DT>Extractor
|
3248 |
<DD>
|
3249 |
<CODE>xgettext</CODE>, <CODE>rstconv</CODE> |
3250 |
</DL>
|
3251 |
|
3252 |
|
3253 |
|
3254 |
<H3><A NAME="SEC278" HREF="gettext_toc.html#TOC278">13.6.3 Glade - GNOME user interface description</A></H3> |
3255 |
|
3256 |
<DL COMPACT> |
3257 |
|
3258 |
<DT>RPMs
|
3259 |
<DD>
|
3260 |
glade, libglade, glade2, libglade2, intltool |
3261 |
|
3262 |
<DT>File extension
|
3263 |
<DD>
|
3264 |
<CODE>glade</CODE>, <CODE>glade2</CODE> |
3265 |
|
3266 |
<DT>Extractor
|
3267 |
<DD>
|
3268 |
<CODE>xgettext</CODE>, <CODE>libglade-xgettext</CODE>, <CODE>xml-i18n-extract</CODE>, <CODE>intltool-extract</CODE> |
3269 |
</DL>
|
3270 |
|
3271 |
<P><HR><P> |
3272 |
Go to the <A HREF="gettext_1.html">first</A>, <A HREF="gettext_12.html">previous</A>, <A HREF="gettext_14.html">next</A>, <A HREF="gettext_22.html">last</A> section, <A HREF="gettext_toc.html">table of contents</A>. |
3273 |
</BODY>
|
3274 |
</HTML>
|