<HTML>
2
<HEAD>
3
<!-- This HTML file has been created by texi2html 1.52a
4
     from gettext.texi on 9 December 2003 -->
5

    
6
<TITLE>GNU gettext utilities - 13  Other Programming Languages</TITLE>
7
</HEAD>
8
<BODY>
9
Go to the <A HREF="gettext_1.html">first</A>, <A HREF="gettext_12.html">previous</A>, <A HREF="gettext_14.html">next</A>, <A HREF="gettext_22.html">last</A> section, <A HREF="gettext_toc.html">table of contents</A>.
10
<P><HR><P>
11

    
12

    
13
<H1><A NAME="SEC217" HREF="gettext_toc.html#TOC217">13  Other Programming Languages</A></H1>
14

    
15
<P>
16
While the presentation of <CODE>gettext</CODE> focuses mostly on C and
17
implicitly applies to C++ as well, its scope is far broader than that:
18
Many programming languages, scripting languages and other textual data
19
like GUI resources or package descriptions can make use of the gettext
20
approach.
21

    
22
</P>
23

    
24

    
25

    
26
<H2><A NAME="SEC218" HREF="gettext_toc.html#TOC218">13.1  The Language Implementor's View</A></H2>
27
<P>
28
<A NAME="IDX1047"></A>
29
<A NAME="IDX1048"></A>
30

    
31
</P>
32
<P>
33
All programming and scripting languages that have the notion of strings
34
are eligible to support <CODE>gettext</CODE>.  Supporting <CODE>gettext</CODE>
35
means the following:
36

    
37
</P>
38

    
39
<OL>
40
<LI>
41

    
42
You should add to the language a syntax for translatable strings.  In
43
principle, a function call of <CODE>gettext</CODE> would do, but a shorthand
44
syntax helps keep internationalized programs legible.  For
45
example, in C we use the syntax <CODE>_("string")</CODE>, and in GNU awk we use
46
the shorthand <CODE>_"string"</CODE>.
47

    
48
<LI>
49

    
50
You should arrange that evaluation of such a translatable string at
51
runtime calls the <CODE>gettext</CODE> function, or performs equivalent
52
processing.
53

    
54
<LI>
55

    
56
Similarly, you should make the functions <CODE>ngettext</CODE>,
57
<CODE>dcgettext</CODE>, <CODE>dcngettext</CODE> available from within the language.
58
These functions are less often used, but are nevertheless necessary for
59
particular purposes: <CODE>ngettext</CODE> for correct plural handling, and
60
<CODE>dcgettext</CODE> and <CODE>dcngettext</CODE> for obeying locale
environment variables other than <CODE>LC_MESSAGES</CODE>, such as <CODE>LC_TIME</CODE> or
62
<CODE>LC_MONETARY</CODE>.  For these latter functions, you need to make the
63
<CODE>LC_*</CODE> constants, available in the C header <CODE>&#60;locale.h&#62;</CODE>,
64
referenceable from within the language, usually either as enumeration
65
values or as strings.
66

    
67
<LI>
68

    
69
You should allow the programmer to designate a message domain, either by
70
making the <CODE>textdomain</CODE> function available from within the
71
language, or by introducing a magic variable called <CODE>TEXTDOMAIN</CODE>.
72
Similarly, you should allow the programmer to designate where to search
73
for message catalogs, by providing access to the <CODE>bindtextdomain</CODE>
74
function.
75

    
76
<LI>
77

    
78
You should either perform a <CODE>setlocale (LC_ALL, "")</CODE> call during
79
the startup of your language runtime, or allow the programmer to do so.
80
Remember that gettext will act as a no-op if the <CODE>LC_MESSAGES</CODE> and
81
<CODE>LC_CTYPE</CODE> locale facets are not both set.
82

    
83
<LI>
84

    
85
A programmer should have a way to extract translatable strings from a
86
program into a PO file.  The GNU <CODE>xgettext</CODE> program is being
87
extended to support very different programming languages.  Please
88
contact the GNU <CODE>gettext</CODE> maintainers to help them do this.  If
89
the string extractor is best integrated into your language's parser, GNU
90
<CODE>xgettext</CODE> can function as a front end to your string extractor.
91

    
92
<LI>
93

    
94
The language's library should have a string formatting facility where
95
the arguments of a format string are denoted by a positional number or a
96
name.  This is needed because for some languages and some messages with
97
more than one substitutable argument, the translation will need to
98
output the substituted arguments in different order.  See section <A HREF="gettext_3.html#SEC18">3.5  Special Comments preceding Keywords</A>.
99

    
100
<LI>
101

    
102
If the language has more than one implementation, and not all of the
103
implementations use <CODE>gettext</CODE>, but the programs should be portable
104
across implementations, you should provide a no-i18n emulation that
105
makes the other implementations accept programs written for yours,
106
without actually translating the strings.
107

    
108
<LI>
109

    
110
To help the programmer in the task of marking translatable strings,
111
which is usually performed using the Emacs PO mode, you are welcome to
112
contact the GNU <CODE>gettext</CODE> maintainers, so they can add support for
113
your language to <TT>`po-mode.el&acute;</TT>.
114
</OL>
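<P>
As one concrete illustration of the checklist above, the following sketch uses
the <CODE>gettext</CODE> module of Python's standard library, which offers the
shorthand, plural, domain and catalog-directory pieces under the same names as
the C functions.  The domain name <CODE>example-app</CODE> and the directory
<CODE>locale</CODE> are hypothetical placeholders, not part of any real package;
when no catalog is found there, the calls simply return the untranslated strings.
</P>

```python
import gettext

# Hypothetical message domain and catalog directory (assumptions of this sketch).
DOMAIN = "example-app"
LOCALEDIR = "locale"

# Item 4: designate where catalogs live and which domain is current.
gettext.bindtextdomain(DOMAIN, LOCALEDIR)
gettext.textdomain(DOMAIN)

# Item 1: a shorthand for marking translatable strings.
_ = gettext.gettext


def greeting():
    # Item 2: evaluating the marked string calls gettext at run time.
    return _("Hello, world!")


def file_count(n):
    # Item 3: ngettext selects the singular or plural form for n.
    return gettext.ngettext("%d file", "%d files", n) % n


if __name__ == "__main__":
    print(greeting())
    print(file_count(1))
    print(file_count(3))
```

<P>
Other languages will spell these pieces differently; the point is only that
each item of the list maps to something the programmer can call or write.
</P>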
115

    
116
<P>
117
On the implementation side, three approaches are possible, with
118
different effects on portability and copyright:
119

    
120
</P>
121

    
122
<UL>
123
<LI>
124

    
125
You may integrate GNU <CODE>gettext</CODE>'s <TT>`intl/&acute;</TT> directory into
126
your package, as described in section <A HREF="gettext_12.html#SEC189">12  The Maintainer's View</A>.  This allows you to
127
have internationalization on all kinds of platforms.  Note that when you
128
then distribute your package, it legally falls under the GNU General
129
Public License, and the GNU project will be glad about your contribution
130
to the Free Software pool.
131

    
132
<LI>
133

    
134
You may link against GNU <CODE>gettext</CODE> functions if they are found in
135
the C library.  For example, an autoconf test for <CODE>gettext()</CODE> and
136
<CODE>ngettext()</CODE> will detect this situation.  For the moment, this test
137
will succeed on GNU systems and not on other platforms.  No severe
138
copyright restrictions apply.
139

    
140
<LI>
141

    
142
You may emulate or reimplement the GNU <CODE>gettext</CODE> functionality.
143
This has the advantage of full portability and no copyright
144
restrictions, but also the drawback that you have to reimplement the GNU
145
<CODE>gettext</CODE> features (such as the <CODE>LANGUAGE</CODE> environment
146
variable, the locale aliases database, the automatic charset conversion,
147
and plural handling).
148
</UL>
149

    
150

    
151

    
152
<H2><A NAME="SEC219" HREF="gettext_toc.html#TOC219">13.2  The Programmer's View</A></H2>
153

    
154
<P>
155
For the programmer, the general procedure is the same as for the C
156
language.  The Emacs PO mode supports other languages, and the GNU
157
<CODE>xgettext</CODE> string extractor recognizes other languages based on the
158
file extension or a command-line option.  In some languages,
159
<CODE>setlocale</CODE> is not needed because it is already performed by the
160
underlying language runtime.
161

    
162
</P>
163

    
164

    
165
<H2><A NAME="SEC220" HREF="gettext_toc.html#TOC220">13.3  The Translator's View</A></H2>
166

    
167
<P>
168
The translator works exactly as in the C language case.  The only
169
difference is that when translating format strings, she has to be aware
170
of the language's particular syntax for positional arguments in format
171
strings.
172

    
173
</P>
174

    
175

    
176

    
177
<H3><A NAME="SEC221" HREF="gettext_toc.html#TOC221">13.3.1  C Format Strings</A></H3>
178

    
179
<P>
180
C format strings are described in POSIX (IEEE P1003.1 2001), section
181
XSH 3 fprintf(),
182
<A HREF="http://www.opengroup.org/onlinepubs/007904975/functions/fprintf.html">http://www.opengroup.org/onlinepubs/007904975/functions/fprintf.html</A>.
183
See also the fprintf(3) manual page,
184
<A HREF="http://www.linuxvalley.it/encyclopedia/ldp/manpage/man3/printf.3.php">http://www.linuxvalley.it/encyclopedia/ldp/manpage/man3/printf.3.php</A>,
185
<A HREF="http://informatik.fh-wuerzburg.de/student/i510/man/printf.html">http://informatik.fh-wuerzburg.de/student/i510/man/printf.html</A>.
186

    
187
</P>
188
<P>
189
Although format strings with positions that reorder arguments, such as
190

    
191
</P>
192

    
193
<PRE>
194
"Only %2$d bytes free on '%1$s'."
195
</PRE>
196

    
197
<P>
198
which is semantically equivalent to
199

    
200
</P>
201

    
202
<PRE>
203
"'%s' has only %d bytes free."
204
</PRE>
205

    
206
<P>
207
are a POSIX/XSI feature and not specified by ISO C 99, translators can rely
208
on this reordering ability: On the few platforms where <CODE>printf()</CODE>,
209
<CODE>fprintf()</CODE> etc. don't support this feature natively, <TT>`libintl.a&acute;</TT>
210
or <TT>`libintl.so&acute;</TT> provides replacement functions, and GNU <CODE>&#60;libintl.h&#62;</CODE>
211
activates these replacement functions automatically.
212

    
213
</P>
214

    
215

    
216
<H3><A NAME="SEC222" HREF="gettext_toc.html#TOC222">13.3.2  Objective C Format Strings</A></H3>
217

    
218
<P>
219
Objective C format strings are like C format strings.  They support an
220
additional format directive: "%@", which when executed consumes an argument
221
of type <CODE>Object *</CODE>.
222

    
223
</P>
224

    
225

    
226
<H3><A NAME="SEC223" HREF="gettext_toc.html#TOC223">13.3.3  Shell Format Strings</A></H3>
227

    
228
<P>
229
Shell format strings, as supported by GNU gettext and the <SAMP>`envsubst&acute;</SAMP>
230
program, are strings with references to shell variables in the form
231
<CODE>$<VAR>variable</VAR></CODE> or <CODE>${<VAR>variable</VAR>}</CODE>.  References of the form
232
<CODE>${<VAR>variable</VAR>-<VAR>default</VAR>}</CODE>,
233
<CODE>${<VAR>variable</VAR>:-<VAR>default</VAR>}</CODE>,
234
<CODE>${<VAR>variable</VAR>=<VAR>default</VAR>}</CODE>,
235
<CODE>${<VAR>variable</VAR>:=<VAR>default</VAR>}</CODE>,
236
<CODE>${<VAR>variable</VAR>+<VAR>replacement</VAR>}</CODE>,
237
<CODE>${<VAR>variable</VAR>:+<VAR>replacement</VAR>}</CODE>,
238
<CODE>${<VAR>variable</VAR>?<VAR>ignored</VAR>}</CODE>,
239
<CODE>${<VAR>variable</VAR>:?<VAR>ignored</VAR>}</CODE>,
240
that would be valid inside shell scripts, are not supported.  The
241
<VAR>variable</VAR> names must consist solely of alphanumeric or underscore
242
ASCII characters, not start with a digit and be nonempty; otherwise such
243
a variable reference is ignored.
244

    
245
</P>
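<P>
The rule for which references count can be sketched in Python as below.  This
is an illustration of the rule as stated above, not the code of
<SAMP>`envsubst&acute;</SAMP>; the function name <CODE>substitute</CODE> is
hypothetical, and expanding an unset variable to the empty string is an
assumption of the sketch.
</P>

```python
import re

# $name or ${name}; a name is ASCII letters, digits and underscores,
# is nonempty and does not start with a digit.
_NAME = r"[A-Za-z_][A-Za-z0-9_]*"
_REFERENCE = re.compile(r"\$(?:(%s)|\{(%s)\})" % (_NAME, _NAME))


def substitute(text, variables):
    """Replace supported variable references; leave everything else as is."""

    def replace(match):
        name = match.group(1) or match.group(2)
        # Assumption of this sketch: an unset variable expands to nothing.
        return variables.get(name, "")

    return _REFERENCE.sub(replace, text)


if __name__ == "__main__":
    values = {"src": "a.txt", "dst": "b.txt"}
    print(substitute("Copying $src to ${dst}.", values))
    # Unsupported forms pass through unchanged.
    print(substitute("${src-default} and $1", values))
```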
246

    
247

    
248
<H3><A NAME="SEC224" HREF="gettext_toc.html#TOC224">13.3.4  Python Format Strings</A></H3>
249

    
250
<P>
251
Python format strings are described in
252
Python Library reference /
253
2. Built-in Types, Exceptions and Functions /
254
2.2. Built-in Types /
255
2.2.6. Sequence Types /
256
2.2.6.2. String Formatting Operations.
257
<A HREF="http://www.python.org/doc/2.2.1/lib/typesseq-strings.html">http://www.python.org/doc/2.2.1/lib/typesseq-strings.html</A>.
258

    
259
</P>
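<P>
For the translator, the relevant point is that Python's <CODE>%</CODE>
operator accepts named arguments of the form
<CODE>%(<VAR>name</VAR>)s</CODE>, so a translation can use them in any order.
The short sketch below shows this; the message and the translated wording are
made up for illustration.
</P>

```python
# Original message, as it would appear as msgid in a PO file.
msgid = "'%(volume)s' has only %(free)d bytes free."

# Hypothetical translation that mentions the arguments in the opposite order.
msgstr = "Only %(free)d bytes free on '%(volume)s'."

# The program passes the arguments by name, so the order does not matter.
arguments = {"volume": "/home", "free": 1024}

original = msgid % arguments
reordered = msgstr % arguments

if __name__ == "__main__":
    print(original)
    print(reordered)
```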
260

    
261

    
262
<H3><A NAME="SEC225" HREF="gettext_toc.html#TOC225">13.3.5  Lisp Format Strings</A></H3>
263

    
264
<P>
265
Lisp format strings are described in the Common Lisp HyperSpec,
266
chapter 22.3 Formatted Output,
267
<A HREF="http://www.lisp.org/HyperSpec/Body/sec_22-3.html">http://www.lisp.org/HyperSpec/Body/sec_22-3.html</A>.
268

    
269
</P>
270

    
271

    
272
<H3><A NAME="SEC226" HREF="gettext_toc.html#TOC226">13.3.6  Emacs Lisp Format Strings</A></H3>
273

    
274
<P>
275
Emacs Lisp format strings are documented in the Emacs Lisp reference,
276
section Formatting Strings,
277
<A HREF="http://www.gnu.org/manual/elisp-manual-21-2.8/html_chapter/elisp_4.html#SEC75">http://www.gnu.org/manual/elisp-manual-21-2.8/html_chapter/elisp_4.html#SEC75</A>.
278
Note that as of version 21, XEmacs supports numbered argument specifications
279
in format strings while FSF Emacs doesn't.
280

    
281
</P>
282

    
283

    
284
<H3><A NAME="SEC227" HREF="gettext_toc.html#TOC227">13.3.7  librep Format Strings</A></H3>
285

    
286
<P>
287
librep format strings are documented in the librep manual, section
288
Formatted Output,
289
<A HREF="http://librep.sourceforge.net/librep-manual.html#Formatted%20Output">http://librep.sourceforge.net/librep-manual.html#Formatted%20Output</A>,
290
<A HREF="http://www.gwinnup.org/research/docs/librep.html#SEC122">http://www.gwinnup.org/research/docs/librep.html#SEC122</A>.
291

    
292
</P>
293

    
294

    
295
<H3><A NAME="SEC228" HREF="gettext_toc.html#TOC228">13.3.8  Smalltalk Format Strings</A></H3>
296

    
297
<P>
298
Smalltalk format strings are described in the GNU Smalltalk documentation,
299
class <CODE>CharArray</CODE>, methods <SAMP>`bindWith:&acute;</SAMP> and
300
<SAMP>`bindWithArguments:&acute;</SAMP>.
301
<A HREF="http://www.gnu.org/software/smalltalk/gst-manual/gst_68.html#SEC238">http://www.gnu.org/software/smalltalk/gst-manual/gst_68.html#SEC238</A>.
302
In summary, a directive starts with <SAMP>`%&acute;</SAMP> and is followed by <SAMP>`%&acute;</SAMP>
303
or a nonzero digit (<SAMP>`1&acute;</SAMP> to <SAMP>`9&acute;</SAMP>).
304

    
305
</P>
306

    
307

    
308
<H3><A NAME="SEC229" HREF="gettext_toc.html#TOC229">13.3.9  Java Format Strings</A></H3>
309

    
310
<P>
311
Java format strings are described in the JDK documentation for class
312
<CODE>java.text.MessageFormat</CODE>,
313
<A HREF="http://java.sun.com/j2se/1.4/docs/api/java/text/MessageFormat.html">http://java.sun.com/j2se/1.4/docs/api/java/text/MessageFormat.html</A>.
314
See also the ICU documentation
315
<A HREF="http://oss.software.ibm.com/icu/apiref/classMessageFormat.html">http://oss.software.ibm.com/icu/apiref/classMessageFormat.html</A>.
316

    
317
</P>
318

    
319

    
320
<H3><A NAME="SEC230" HREF="gettext_toc.html#TOC230">13.3.10  awk Format Strings</A></H3>
321

    
322
<P>
323
awk format strings are described in the gawk documentation, section
324
Printf,
325
<A HREF="http://www.gnu.org/manual/gawk/html_node/Printf.html#Printf">http://www.gnu.org/manual/gawk/html_node/Printf.html#Printf</A>.
326

    
327
</P>
328

    
329

    
330
<H3><A NAME="SEC231" HREF="gettext_toc.html#TOC231">13.3.11  Object Pascal Format Strings</A></H3>
331

    
332
<P>
333
Where is this documented?
334

    
335
</P>
336

    
337

    
338
<H3><A NAME="SEC232" HREF="gettext_toc.html#TOC232">13.3.12  YCP Format Strings</A></H3>
339

    
340
<P>
341
YCP sformat strings are described in the libycp documentation
342
<A HREF="file:/usr/share/doc/packages/libycp/YCP-builtins.html">file:/usr/share/doc/packages/libycp/YCP-builtins.html</A>.
343
In summary, a directive starts with <SAMP>`%&acute;</SAMP> and is followed by <SAMP>`%&acute;</SAMP>
344
or a nonzero digit (<SAMP>`1&acute;</SAMP> to <SAMP>`9&acute;</SAMP>).
345

    
346
</P>
347

    
348

    
349
<H3><A NAME="SEC233" HREF="gettext_toc.html#TOC233">13.3.13  Tcl Format Strings</A></H3>
350

    
351
<P>
352
Tcl format strings are described in the <TT>`format.n&acute;</TT> manual page,
353
<A HREF="http://www.scriptics.com/man/tcl8.3/TclCmd/format.htm">http://www.scriptics.com/man/tcl8.3/TclCmd/format.htm</A>.
354

    
355
</P>
356

    
357

    
358
<H3><A NAME="SEC234" HREF="gettext_toc.html#TOC234">13.3.14  Perl Format Strings</A></H3>
359

    
360
<P>
361
There are two kinds of format strings in Perl: those acceptable to the
362
Perl built-in function <CODE>printf</CODE>, labelled as <SAMP>`perl-format&acute;</SAMP>,
363
and those acceptable to the <CODE>libintl-perl</CODE> function <CODE>__x</CODE>,
364
labelled as <SAMP>`perl-brace-format&acute;</SAMP>.
365

    
366
</P>
367
<P>
368
Perl <CODE>printf</CODE> format strings are described in the <CODE>sprintf</CODE>
369
section of <SAMP>`man perlfunc&acute;</SAMP>.
370

    
371
</P>
372
<P>
373
Perl brace format strings are described in the
374
<TT>`Locale::TextDomain(3pm)&acute;</TT> manual page of the CPAN package
375
libintl-perl.  In brief, Perl brace format uses placeholders enclosed in
braces (<SAMP>`{&acute;</SAMP> and <SAMP>`}&acute;</SAMP>).  Each placeholder must have the syntax
of a simple identifier.
378

    
379
</P>
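<P>
To show what a brace format string looks like from the translator's side, the
following Python sketch expands placeholders of that shape.  It is not the
<CODE>libintl-perl</CODE> implementation; the function name
<CODE>expand</CODE> is hypothetical, and leaving a placeholder untouched when
no argument of that name is given is an assumption of the sketch.
</P>

```python
import re

# A placeholder is a simple identifier between braces, for example {file}.
_PLACEHOLDER = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand(template, **values):
    """Fill {name} placeholders from keyword arguments."""

    def replace(match):
        name = match.group(1)
        # Assumption: a placeholder without a matching argument stays as is.
        return str(values[name]) if name in values else match.group(0)

    return _PLACEHOLDER.sub(replace, template)


if __name__ == "__main__":
    print(expand("Error in {file} at line {line}.", file="a.pl", line=3))
    # A translation may use the same placeholders in a different order.
    print(expand("Line {line} of {file} is wrong.", file="a.pl", line=3))
```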
380

    
381

    
382
<H3><A NAME="SEC235" HREF="gettext_toc.html#TOC235">13.3.15  PHP Format Strings</A></H3>
383

    
384
<P>
385
PHP format strings are described in the documentation of the PHP function
386
<CODE>sprintf</CODE>, in <TT>`phpdoc/manual/function.sprintf.html&acute;</TT> or
387
<A HREF="http://www.php.net/manual/en/function.sprintf.php">http://www.php.net/manual/en/function.sprintf.php</A>.
388

    
389
</P>
390

    
391

    
392
<H3><A NAME="SEC236" HREF="gettext_toc.html#TOC236">13.3.16  GCC internal Format Strings</A></H3>
393

    
394
<P>
395
These format strings are used inside the GCC sources.  In such a format
396
string, a directive starts with <SAMP>`%&acute;</SAMP>, is optionally followed by a
397
size specifier <SAMP>`l&acute;</SAMP>, an optional flag <SAMP>`+&acute;</SAMP>, another optional flag
398
<SAMP>`#&acute;</SAMP>, and is finished by a specifier: <SAMP>`%&acute;</SAMP> denotes a literal
399
percent sign, <SAMP>`c&acute;</SAMP> denotes a character, <SAMP>`s&acute;</SAMP> denotes a string,
400
<SAMP>`i&acute;</SAMP> and <SAMP>`d&acute;</SAMP> denote an integer, <SAMP>`o&acute;</SAMP>, <SAMP>`u&acute;</SAMP>, <SAMP>`x&acute;</SAMP>
401
denote an unsigned integer, <SAMP>`.*s&acute;</SAMP> denotes a string preceded by a
402
width specification, <SAMP>`H&acute;</SAMP> denotes a <SAMP>`location_t *&acute;</SAMP> pointer,
403
<SAMP>`D&acute;</SAMP> denotes a general declaration, <SAMP>`F&acute;</SAMP> denotes a function
404
declaration, <SAMP>`T&acute;</SAMP> denotes a type, <SAMP>`A&acute;</SAMP> denotes a function argument,
405
<SAMP>`C&acute;</SAMP> denotes a tree code, <SAMP>`E&acute;</SAMP> denotes an expression, <SAMP>`L&acute;</SAMP>
406
denotes a programming language, <SAMP>`O&acute;</SAMP> denotes a binary operator,
407
<SAMP>`P&acute;</SAMP> denotes a function parameter, <SAMP>`Q&acute;</SAMP> denotes an assignment
408
operator, <SAMP>`V&acute;</SAMP> denotes a const/volatile qualifier.
409

    
410
</P>
411

    
412

    
413
<H3><A NAME="SEC237" HREF="gettext_toc.html#TOC237">13.3.17  Qt Format Strings</A></H3>
414

    
415
<P>
416
Qt format strings are described in the documentation of the QString class
417
<A HREF="file:/usr/lib/qt-3.0.5/doc/html/qstring.html">file:/usr/lib/qt-3.0.5/doc/html/qstring.html</A>.
418
In summary, a directive consists of a <SAMP>`%&acute;</SAMP> followed by a digit. The same
419
directive cannot occur more than once in a format string.
420

    
421
</P>
422

    
423

    
424
<H2><A NAME="SEC238" HREF="gettext_toc.html#TOC238">13.4  The Maintainer's View</A></H2>
425

    
426
<P>
427
For the maintainer, the general procedure differs from the C language
428
case in two ways.
429

    
430
</P>
431

    
432
<UL>
433
<LI>
434

    
435
For those languages that don't use GNU gettext, the <TT>`intl/&acute;</TT> directory
436
is not needed and can be omitted.  This means that the maintainer calls the
437
<CODE>gettextize</CODE> program without the <SAMP>`--intl&acute;</SAMP> option, and that he
438
invokes the <CODE>AM_GNU_GETTEXT</CODE> autoconf macro via
439
<SAMP>`AM_GNU_GETTEXT([external])&acute;</SAMP>.
440

    
441
<LI>
442

    
443
If only a single programming language is used, the <CODE>XGETTEXT_OPTIONS</CODE>
444
variable in <TT>`po/Makevars&acute;</TT> (see section <A HREF="gettext_12.html#SEC196">12.4.3  <TT>`Makefile&acute;</TT> pieces in <TT>`po/&acute;</TT></A>) should be adjusted to
445
match the <CODE>xgettext</CODE> options for that particular programming language.
446
If the package uses more than one programming language with <CODE>gettext</CODE>
447
support, it becomes necessary to change the POT file construction rule
448
in <TT>`po/Makefile.in.in&acute;</TT>.  It is recommended to make one <CODE>xgettext</CODE>
449
invocation per programming language, each with the options appropriate for
450
that language, and to combine the resulting files using <CODE>msgcat</CODE>.
451
</UL>
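<P>
The recommendation in the last item can be sketched as a small Python helper
that only builds the command lines and does not run them.  The helper name
<CODE>pot_commands</CODE>, the intermediate file names and the
<CODE>options_by_language</CODE> parameter are hypothetical; the extension
table follows the file extensions listed for C, C++, Objective C and shell
scripts in this chapter.
</P>

```python
import os

# File extension to xgettext language name, for the languages whose
# extensions are listed in this chapter.
LANGUAGE_BY_EXTENSION = {
    ".c": "C", ".h": "C",
    ".C": "C++", ".c++": "C++", ".cc": "C++",
    ".cxx": "C++", ".cpp": "C++", ".hpp": "C++",
    ".m": "ObjectiveC",
    ".sh": "Shell",
}


def pot_commands(domain, sources, options_by_language=None):
    """Return one xgettext command per language plus a final msgcat command."""
    options = options_by_language or {}

    # Group the source files by programming language.
    by_language = {}
    for path in sources:
        extension = os.path.splitext(path)[1]
        language = LANGUAGE_BY_EXTENSION[extension]
        by_language.setdefault(language, []).append(path)

    commands = []
    partial_files = []
    for language in sorted(by_language):
        # Hypothetical naming scheme for the per-language POT files.
        partial = "%s-%s.pot" % (domain, language)
        partial_files.append(partial)
        commands.append(
            ["xgettext", "--language=" + language]
            + options.get(language, [])
            + ["-o", partial]
            + by_language[language]
        )

    # Combine the per-language files into the package's POT file.
    commands.append(["msgcat", "-o", domain + ".pot"] + partial_files)
    return commands


if __name__ == "__main__":
    for command in pot_commands("hello", ["main.c", "run.sh"], {"C": ["-k_"]}):
        print(" ".join(command))
```

<P>
In a real package the equivalent commands would go into the POT file
construction rule of <TT>`po/Makefile.in.in&acute;</TT> rather than into a
separate script.
</P>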
452

    
453

    
454

    
455
<H2><A NAME="SEC239" HREF="gettext_toc.html#TOC239">13.5  Individual Programming Languages</A></H2>
456

    
457

    
458

    
459
<H3><A NAME="SEC240" HREF="gettext_toc.html#TOC240">13.5.1  C, C++, Objective C</A></H3>
460
<P>
461
<A NAME="IDX1049"></A>
462

    
463
</P>
464
<DL COMPACT>
465

    
466
<DT>RPMs
467
<DD>
468
gcc, gpp, gobjc, glibc, gettext
469

    
470
<DT>File extension
471
<DD>
472
For C: <CODE>c</CODE>, <CODE>h</CODE>.
473
<BR>For C++: <CODE>C</CODE>, <CODE>c++</CODE>, <CODE>cc</CODE>, <CODE>cxx</CODE>, <CODE>cpp</CODE>, <CODE>hpp</CODE>.
474
<BR>For Objective C: <CODE>m</CODE>.
475

    
476
<DT>String syntax
477
<DD>
478
<CODE>"abc"</CODE>
479

    
480
<DT>gettext shorthand
481
<DD>
482
<CODE>_("abc")</CODE>
483

    
484
<DT>gettext/ngettext functions
485
<DD>
486
<CODE>gettext</CODE>, <CODE>dgettext</CODE>, <CODE>dcgettext</CODE>, <CODE>ngettext</CODE>,
487
<CODE>dngettext</CODE>, <CODE>dcngettext</CODE>
488

    
489
<DT>textdomain
490
<DD>
491
<CODE>textdomain</CODE> function
492

    
493
<DT>bindtextdomain
494
<DD>
495
<CODE>bindtextdomain</CODE> function
496

    
497
<DT>setlocale
498
<DD>
499
Programmer must call <CODE>setlocale (LC_ALL, "")</CODE>
500

    
501
<DT>Prerequisite
502
<DD>
503
<CODE>#include &#60;libintl.h&#62;</CODE>
504
<BR><CODE>#include &#60;locale.h&#62;</CODE>
505
<BR><CODE>#define _(string) gettext (string)</CODE>
506

    
507
<DT>Use or emulate GNU gettext
508
<DD>
509
Use
510

    
511
<DT>Extractor
512
<DD>
513
<CODE>xgettext -k_</CODE>
514

    
515
<DT>Formatting with positions
516
<DD>
517
<CODE>fprintf "%2$d %1$d"</CODE>
518
<BR>In C++: <CODE>autosprintf "%2$d %1$d"</CODE>
519
(see section `Introduction' in <CITE>GNU autosprintf</CITE>)
520

    
521
<DT>Portability
522
<DD>
523
autoconf (gettext.m4) and #if ENABLE_NLS
524

    
525
<DT>po-mode marking
526
<DD>
527
yes
528
</DL>
529

    
530
<P>
531
The following examples are available in the <TT>`examples&acute;</TT> directory:
532
<CODE>hello-c</CODE>, <CODE>hello-c-gnome</CODE>, <CODE>hello-c++</CODE>, <CODE>hello-c++-qt</CODE>, 
533
<CODE>hello-c++-kde</CODE>, <CODE>hello-c++-gnome</CODE>, <CODE>hello-objc</CODE>, 
534
<CODE>hello-objc-gnustep</CODE>, <CODE>hello-objc-gnome</CODE>.
535

    
536
</P>
537

    
538

    
539
<H3><A NAME="SEC241" HREF="gettext_toc.html#TOC241">13.5.2  sh - Shell Script</A></H3>
540
<P>
541
<A NAME="IDX1050"></A>
542

    
543
</P>
544
<DL COMPACT>
545

    
546
<DT>RPMs
547
<DD>
548
bash, gettext
549

    
550
<DT>File extension
551
<DD>
552
<CODE>sh</CODE>
553

    
554
<DT>String syntax
555
<DD>
556
<CODE>"abc"</CODE>, <CODE>'abc'</CODE>, <CODE>abc</CODE>
557

    
558
<DT>gettext shorthand
559
<DD>
560
<CODE>"`gettext \"abc\"`"</CODE>
561

    
562
<DT>gettext/ngettext functions
563
<DD>
564
<A NAME="IDX1051"></A>
565
<A NAME="IDX1052"></A>
566
<CODE>gettext</CODE>, <CODE>ngettext</CODE> programs
567
<BR><CODE>eval_gettext</CODE>, <CODE>eval_ngettext</CODE> shell functions
568

    
569
<DT>textdomain
570
<DD>
571
<A NAME="IDX1053"></A>
572
environment variable <CODE>TEXTDOMAIN</CODE>
573

    
574
<DT>bindtextdomain
575
<DD>
576
<A NAME="IDX1054"></A>
577
environment variable <CODE>TEXTDOMAINDIR</CODE>
578

    
579
<DT>setlocale
580
<DD>
581
automatic
582

    
583
<DT>Prerequisite
584
<DD>
585
<CODE>. gettext.sh</CODE>
586

    
587
<DT>Use or emulate GNU gettext
588
<DD>
589
use
590

    
591
<DT>Extractor
592
<DD>
593
<CODE>xgettext</CODE>
594

    
595
<DT>Formatting with positions
596
<DD>
597
---
598

    
599
<DT>Portability
600
<DD>
601
fully portable
602

    
603
<DT>po-mode marking
604
<DD>
605
---
606
</DL>
607

    
608
<P>
609
An example is available in the <TT>`examples&acute;</TT> directory: <CODE>hello-sh</CODE>.
610

    
611
</P>
612

    
613

    
614

    
615
<H4><A NAME="SEC242" HREF="gettext_toc.html#TOC242">13.5.2.1  Preparing Shell Scripts for Internationalization</A></H4>
616
<P>
617
<A NAME="IDX1055"></A>
618

    
619
</P>
620
<P>
621
Preparing a shell script for internationalization is conceptually similar
622
to the steps described in section <A HREF="gettext_3.html#SEC13">3  Preparing Program Sources</A>.  The concrete steps for shell
623
scripts are as follows.
624

    
625
</P>
626

    
627
<OL>
628
<LI>
629

    
630
Insert the line
631

    
632

    
633
<PRE>
634
. gettext.sh
635
</PRE>
636

    
637
near the top of the script.  <CODE>gettext.sh</CODE> is a shell function library
638
that provides the functions
639
<CODE>eval_gettext</CODE> (see section <A HREF="gettext_13.html#SEC247">13.5.2.6  Invoking the <CODE>eval_gettext</CODE> function</A>) and
640
<CODE>eval_ngettext</CODE> (see section <A HREF="gettext_13.html#SEC248">13.5.2.7  Invoking the <CODE>eval_ngettext</CODE> function</A>).
641
You have to ensure that <CODE>gettext.sh</CODE> can be found in the <CODE>PATH</CODE>.
642

    
643
<LI>
644

    
645
Set and export the <CODE>TEXTDOMAIN</CODE> and <CODE>TEXTDOMAINDIR</CODE> environment
646
variables.  Usually <CODE>TEXTDOMAIN</CODE> is the package or program name, and
647
<CODE>TEXTDOMAINDIR</CODE> is the absolute pathname corresponding to
648
<CODE>$prefix/share/locale</CODE>, where <CODE>$prefix</CODE> is the installation location.
649

    
650

    
651
<PRE>
652
TEXTDOMAIN=@PACKAGE@
653
export TEXTDOMAIN
654
TEXTDOMAINDIR=@LOCALEDIR@
655
export TEXTDOMAINDIR
656
</PRE>
657

    
658
<LI>
659

    
660
Prepare the strings for translation, as described in section <A HREF="gettext_3.html#SEC15">3.2  Preparing Translatable Strings</A>.
661

    
662
<LI>
663

    
664
Simplify translatable strings so that they don't contain command substitution
665
(<CODE>"`...`"</CODE> or <CODE>"$(...)"</CODE>), variable access with defaulting (like
666
<CODE>${<VAR>variable</VAR>-<VAR>default</VAR>}</CODE>), access to positional arguments
667
(like <CODE>$0</CODE>, <CODE>$1</CODE>, ...) or highly volatile shell variables (like
668
<CODE>$?</CODE>). This can always be done through simple local code restructuring.
669
For example,
670

    
671

    
672
<PRE>
673
echo "Usage: $0 [OPTION] FILE..."
674
</PRE>
675

    
676
becomes
677

    
678

    
679
<PRE>
680
program_name=$0
681
echo "Usage: $program_name [OPTION] FILE..."
682
</PRE>
683

    
684
Similarly,
685

    
686

    
687
<PRE>
688
echo "Remaining files: `ls | wc -l`"
689
</PRE>
690

    
691
becomes
692

    
693

    
694
<PRE>
695
filecount="`ls | wc -l`"
696
echo "Remaining files: $filecount"
697
</PRE>
698

    
699
<LI>
700

    
701
For each translatable string, change the output command <SAMP>`echo&acute;</SAMP> or
702
<SAMP>`$echo&acute;</SAMP> to <SAMP>`gettext&acute;</SAMP> (if the string contains no references to
703
shell variables) or to <SAMP>`eval_gettext&acute;</SAMP> (if it refers to shell variables),
704
followed by a no-argument <SAMP>`echo&acute;</SAMP> command (to account for the terminating
705
newline). Similarly, for cases with plural handling, replace a conditional
706
<SAMP>`echo&acute;</SAMP> command with an invocation of <SAMP>`ngettext&acute;</SAMP> or
707
<SAMP>`eval_ngettext&acute;</SAMP>, followed by a no-argument <SAMP>`echo&acute;</SAMP> command.
708
</OL>
709

    
710

    
711

    
712
<H4><A NAME="SEC243" HREF="gettext_toc.html#TOC243">13.5.2.2  Contents of <CODE>gettext.sh</CODE></A></H4>
713

    
714
<P>
715
<CODE>gettext.sh</CODE>, contained in the run-time package of GNU gettext, provides
716
the following:
717

    
718
</P>
719

    
720
<UL>
721
<LI>$echo
722

    
723
The variable <CODE>echo</CODE> is set to a command that outputs its first argument
724
and a newline, without interpreting backslashes in the argument string.
725

    
726
<LI>eval_gettext
727

    
728
See section <A HREF="gettext_13.html#SEC247">13.5.2.6  Invoking the <CODE>eval_gettext</CODE> function</A>.
729

    
730
<LI>eval_ngettext
731

    
732
See section <A HREF="gettext_13.html#SEC248">13.5.2.7  Invoking the <CODE>eval_ngettext</CODE> function</A>.
733
</UL>
734

    
735

    
736

    
737
<H4><A NAME="SEC244" HREF="gettext_toc.html#TOC244">13.5.2.3  Invoking the <CODE>gettext</CODE> program</A></H4>
738

    
739
<P>
740
<A NAME="IDX1056"></A>
741
<A NAME="IDX1057"></A>
742

    
743
<PRE>
744
gettext [<VAR>option</VAR>] [[<VAR>textdomain</VAR>] <VAR>msgid</VAR>]
745
gettext [<VAR>option</VAR>] -s [<VAR>msgid</VAR>]...
746
</PRE>
747

    
748
<P>
749
<A NAME="IDX1058"></A>
750
The <CODE>gettext</CODE> program displays the native language translation of a
751
textual message.
752

    
753
</P>
754
<P>
755
<STRONG>Arguments</STRONG>
756

    
757
</P>
758
<DL COMPACT>
759

    
760
<DT><SAMP>`-d <VAR>textdomain</VAR>&acute;</SAMP>
761
<DD>
762
<DT><SAMP>`--domain=<VAR>textdomain</VAR>&acute;</SAMP>
763
<DD>
764
<A NAME="IDX1059"></A>
765
<A NAME="IDX1060"></A>
766
Retrieve translated messages from <VAR>textdomain</VAR>.  Usually a <VAR>textdomain</VAR>
767
corresponds to a package, a program, or a module of a program.
768

    
769
<DT><SAMP>`-e&acute;</SAMP>
770
<DD>
771
<A NAME="IDX1061"></A>
772
Enable expansion of some escape sequences.  This option is for compatibility
773
with the <SAMP>`echo&acute;</SAMP> program or shell built-in.  The escape sequences
774
<SAMP>`\b&acute;</SAMP>, <SAMP>`\c&acute;</SAMP>, <SAMP>`\f&acute;</SAMP>, <SAMP>`\n&acute;</SAMP>, <SAMP>`\r&acute;</SAMP>, <SAMP>`\t&acute;</SAMP>, <SAMP>`\v&acute;</SAMP>,
775
<SAMP>`\\&acute;</SAMP>, and <SAMP>`\&acute;</SAMP> followed by one to three octal digits, are interpreted
776
the way the <SAMP>`echo&acute;</SAMP> program does.
777

    
778
<DT><SAMP>`-E&acute;</SAMP>
779
<DD>
780
<A NAME="IDX1062"></A>
781
This option is only for compatibility with the <SAMP>`echo&acute;</SAMP> program or shell
782
built-in.  It has no effect.
783

    
784
<DT><SAMP>`-h&acute;</SAMP>
785
<DD>
786
<DT><SAMP>`--help&acute;</SAMP>
787
<DD>
788
<A NAME="IDX1063"></A>
789
<A NAME="IDX1064"></A>
790
Display this help and exit.
791

    
792
<DT><SAMP>`-n&acute;</SAMP>
793
<DD>
794
<A NAME="IDX1065"></A>
795
Suppress trailing newline.  By default, <CODE>gettext</CODE> adds a newline to
796
the output.
797

    
798
<DT><SAMP>`-V&acute;</SAMP>
799
<DD>
800
<DT><SAMP>`--version&acute;</SAMP>
801
<DD>
802
<A NAME="IDX1066"></A>
803
<A NAME="IDX1067"></A>
804
Output version information and exit.
805

    
806
<DT><SAMP>`[<VAR>textdomain</VAR>] <VAR>msgid</VAR>&acute;</SAMP>
807
<DD>
808
Retrieve translated message corresponding to <VAR>msgid</VAR> from <VAR>textdomain</VAR>.
809

    
810
</DL>
811

    
812
<P>
813
If the <VAR>textdomain</VAR> parameter is not given, the domain is determined from
814
the environment variable <CODE>TEXTDOMAIN</CODE>.  If the message catalog is not
815
found in the regular directory, another location can be specified with the
816
environment variable <CODE>TEXTDOMAINDIR</CODE>.
817

    
818
</P>
819
<P>
820
When used with the <CODE>-s</CODE> option, the program behaves like the <SAMP>`echo&acute;</SAMP>
command, except that it does not simply copy its arguments to stdout: those
arguments found as messages in the selected catalog are output in translated form.
823

    
824
</P>
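<P>
The way the domain and the catalog directory are determined can be mirrored in
a few lines of Python, shown here as a sketch and not as the program's actual
code.  The function name <CODE>lookup</CODE> and its <CODE>environ</CODE>
parameter are hypothetical, and returning the message unchanged when no domain
is known is an assumption of the sketch.  The language itself is still taken
from the process environment by Python's <CODE>gettext</CODE> module.
</P>

```python
import gettext
import os


def lookup(msgid, textdomain=None, environ=None):
    """Translate msgid, taking domain and directory from the environment."""
    environ = os.environ if environ is None else environ

    # An explicit textdomain argument wins over the TEXTDOMAIN variable.
    domain = textdomain or environ.get("TEXTDOMAIN")
    if not domain:
        # Assumption: without any domain the message stays untranslated.
        return msgid

    # None lets the gettext module use its default catalog directory.
    localedir = environ.get("TEXTDOMAINDIR")

    # fallback=True yields a catalog that returns msgid when no file is found.
    catalog = gettext.translation(domain, localedir, fallback=True)
    return catalog.gettext(msgid)


if __name__ == "__main__":
    print(lookup("Hello, world!"))
```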
825

    
826

    
827
<H4><A NAME="SEC245" HREF="gettext_toc.html#TOC245">13.5.2.4  Invoking the <CODE>ngettext</CODE> program</A></H4>
828

    
829
<P>
830
<A NAME="IDX1068"></A>
831
<A NAME="IDX1069"></A>
832

    
833
<PRE>
834
ngettext [<VAR>option</VAR>] [<VAR>textdomain</VAR>] <VAR>msgid</VAR> <VAR>msgid-plural</VAR> <VAR>count</VAR>
835
</PRE>
836

    
837
<P>
838
<A NAME="IDX1070"></A>
839
The <CODE>ngettext</CODE> program displays the native language translation of a
840
textual message whose grammatical form depends on a number.
841

    
842
</P>
843
<P>
844
<STRONG>Arguments</STRONG>
845

    
846
</P>
847
<DL COMPACT>
848

    
849
<DT><SAMP>`-d <VAR>textdomain</VAR>&acute;</SAMP>
850
<DD>
851
<DT><SAMP>`--domain=<VAR>textdomain</VAR>&acute;</SAMP>
852
<DD>
853
<A NAME="IDX1071"></A>
854
<A NAME="IDX1072"></A>
855
Retrieve translated messages from <VAR>textdomain</VAR>.  Usually a <VAR>textdomain</VAR>
856
corresponds to a package, a program, or a module of a program.
857

    
858
<DT><SAMP>`-e&acute;</SAMP>
859
<DD>
860
<A NAME="IDX1073"></A>
861
Enable expansion of some escape sequences.  This option is for compatibility
with the <SAMP>`gettext&acute;</SAMP> program.  The escape sequences
<SAMP>`\b&acute;</SAMP>, <SAMP>`\c&acute;</SAMP>, <SAMP>`\f&acute;</SAMP>, <SAMP>`\n&acute;</SAMP>, <SAMP>`\r&acute;</SAMP>, <SAMP>`\t&acute;</SAMP>, <SAMP>`\v&acute;</SAMP>,
<SAMP>`\\&acute;</SAMP>, and <SAMP>`\&acute;</SAMP> followed by one to three octal digits, are interpreted
as the <SAMP>`echo&acute;</SAMP> program interprets them.
866

    
867
<DT><SAMP>`-E&acute;</SAMP>
868
<DD>
869
<A NAME="IDX1074"></A>
870
This option is only for compatibility with the <SAMP>`gettext&acute;</SAMP> program.  It has
871
no effect.
872

    
873
<DT><SAMP>`-h&acute;</SAMP>
874
<DD>
875
<DT><SAMP>`--help&acute;</SAMP>
876
<DD>
877
<A NAME="IDX1075"></A>
878
<A NAME="IDX1076"></A>
879
Display this help and exit.
880

    
881
<DT><SAMP>`-V&acute;</SAMP>
882
<DD>
883
<DT><SAMP>`--version&acute;</SAMP>
884
<DD>
885
<A NAME="IDX1077"></A>
886
<A NAME="IDX1078"></A>
887
Output version information and exit.
888

    
889
<DT><SAMP>`<VAR>textdomain</VAR>&acute;</SAMP>
890
<DD>
891
Retrieve translated message from <VAR>textdomain</VAR>.
892

    
893
<DT><SAMP>`<VAR>msgid</VAR> <VAR>msgid-plural</VAR>&acute;</SAMP>
894
<DD>
895
Translate <VAR>msgid</VAR> (English singular) / <VAR>msgid-plural</VAR> (English plural).
896

    
897
<DT><SAMP>`<VAR>count</VAR>&acute;</SAMP>
898
<DD>
899
Choose singular/plural form based on this value.
900

    
901
</DL>
902

    
903
<P>
904
If the <VAR>textdomain</VAR> parameter is not given, the domain is determined from
905
the environment variable <CODE>TEXTDOMAIN</CODE>.  If the message catalog is not
906
found in the regular directory, another location can be specified with the
907
environment variable <CODE>TEXTDOMAINDIR</CODE>.
908

    
909
</P>
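<P>
The singular/plural choice described above can be sketched in Python.  This is
only an illustration, assuming Python 3 and its standard <CODE>gettext</CODE>
module; the helper name <CODE>ngettext_fallback</CODE> and the sample strings are
hypothetical.  It shows what a lookup returns when no message catalog is
found: the English rule is applied to <VAR>count</VAR>.
</P>

```python
import gettext


def ngettext_fallback(msgid, msgid_plural, count):
    """Pick the untranslated form the way a catalog-less lookup does."""
    # NullTranslations has no catalog, so it applies the English rule:
    # the singular form for count == 1, the plural form otherwise.
    return gettext.NullTranslations().ngettext(msgid, msgid_plural, count)


if __name__ == "__main__":
    for n in (0, 1, 2):
        print(ngettext_fallback("%d file", "%d files", n) % n)
```

<P>
With a real catalog, the plural rule of the target language replaces the
English rule, which is why <VAR>count</VAR> must be passed as a separate argument.
</P>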
910

    
911

    
912
<H4><A NAME="SEC246" HREF="gettext_toc.html#TOC246">13.5.2.5  Invoking the <CODE>envsubst</CODE> program</A></H4>
913

    
914
<P>
915
<A NAME="IDX1079"></A>
916
<A NAME="IDX1080"></A>
917

    
918
<PRE>
919
envsubst [<VAR>option</VAR>] [<VAR>shell-format</VAR>]
920
</PRE>
921

    
922
<P>
923
<A NAME="IDX1081"></A>
924
<A NAME="IDX1082"></A>
925
<A NAME="IDX1083"></A>
926
The <CODE>envsubst</CODE> program substitutes the values of environment variables.
927

    
928
</P>
929
<P>
930
<STRONG>Operation mode</STRONG>
931

    
932
</P>
933
<DL COMPACT>
934

    
935
<DT><SAMP>`-v&acute;</SAMP>
936
<DD>
937
<DT><SAMP>`--variables&acute;</SAMP>
938
<DD>
939
<A NAME="IDX1084"></A>
940
<A NAME="IDX1085"></A>
941
Output the variables occurring in <VAR>shell-format</VAR>.
942

    
943
</DL>
944

    
945
<P>
946
<STRONG>Informative output</STRONG>
947

    
948
</P>
949
<DL COMPACT>
950

    
951
<DT><SAMP>`-h&acute;</SAMP>
952
<DD>
953
<DT><SAMP>`--help&acute;</SAMP>
954
<DD>
955
<A NAME="IDX1086"></A>
956
<A NAME="IDX1087"></A>
957
Display this help and exit.
958

    
959
<DT><SAMP>`-V&acute;</SAMP>
960
<DD>
961
<DT><SAMP>`--version&acute;</SAMP>
962
<DD>
963
<A NAME="IDX1088"></A>
964
<A NAME="IDX1089"></A>
965
Output version information and exit.
966

    
967
</DL>
968

    
969
<P>
970
In normal operation mode, standard input is copied to standard output,
971
with references to environment variables of the form <CODE>$VARIABLE</CODE> or
972
<CODE>${VARIABLE}</CODE> being replaced with the corresponding values.  If a
973
<VAR>shell-format</VAR> is given, only those environment variables that are
974
referenced in <VAR>shell-format</VAR> are substituted; otherwise all environment
variable references occurring in standard input are substituted.
976

    
977
</P>
978
<P>
979
These substitutions are a subset of the substitutions that a shell performs
980
on unquoted and double-quoted strings.  Other kinds of substitutions done
981
by a shell, such as <CODE>${<VAR>variable</VAR>-<VAR>default</VAR>}</CODE> or
982
<CODE>$(<VAR>command-list</VAR>)</CODE> or <CODE>`<VAR>command-list</VAR>`</CODE>, are not performed
983
by the <CODE>envsubst</CODE> program, for security reasons.
984

    
985
</P>
986
<P>
987
When <CODE>--variables</CODE> is used, standard input is ignored, and the output
988
consists of the environment variables that are referenced in
989
<VAR>shell-format</VAR>, one per line.
990

    
991
</P>
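<P>
The substitution rules above can be modelled with a short Python sketch.  It
is not the <CODE>envsubst</CODE> implementation; the function names
<CODE>variables</CODE> and <CODE>substitute</CODE> are hypothetical, and two points
are assumptions of the sketch: unset variables are replaced by the empty
string, and the variable list contains each name only once.
</P>

```python
import re

# $NAME or ${NAME}, where NAME is a shell-style identifier.
_VAR_REF = re.compile(
    r"\$(?:\{([A-Za-z_][A-Za-z0-9_]*)\}|([A-Za-z_][A-Za-z0-9_]*))")


def variables(shell_format):
    """Return the variable names referenced in shell_format, in order."""
    names = []
    for braced, plain in _VAR_REF.findall(shell_format):
        name = braced or plain
        if name not in names:          # sketch choice: list each name once
            names.append(name)
    return names


def substitute(text, env, shell_format=None):
    """Replace $NAME and ${NAME} in text with values taken from env."""
    allowed = None if shell_format is None else set(variables(shell_format))

    def replace(match):
        name = match.group(1) or match.group(2)
        if allowed is not None and name not in allowed:
            return match.group(0)      # not named in shell_format: keep as is
        return env.get(name, "")       # assumption: unset becomes empty

    return _VAR_REF.sub(replace, text)


if __name__ == "__main__":
    env = {"HOME": "/home/u", "USER": "u"}
    print(substitute("cd $HOME as ${USER}", env))
    print(substitute("cd $HOME as ${USER}", env, "$HOME"))
    print(variables("$HOME and ${USER}"))
```

<P>
Because the pattern only recognizes plain variable references, forms such as
<CODE>$(date)</CODE> or <CODE>${X-default}</CODE> pass through unchanged, which
mirrors the restriction described above.
</P>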
992

    
993

    
994
<H4><A NAME="SEC247" HREF="gettext_toc.html#TOC247">13.5.2.6  Invoking the <CODE>eval_gettext</CODE> function</A></H4>
995

    
996
<P>
997
<A NAME="IDX1090"></A>
998

    
999
<PRE>
1000
eval_gettext <VAR>msgid</VAR>
1001
</PRE>
1002

    
1003
<P>
1004
<A NAME="IDX1091"></A>
1005
This function outputs the native language translation of a textual message,
1006
performing dollar-substitution on the result.  Note that only shell variables
1007
mentioned in <VAR>msgid</VAR> will be dollar-substituted in the result.
1008

    
1009
</P>
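<P>
Why only the variables mentioned in <VAR>msgid</VAR> are substituted can be
illustrated with a Python model.  This is a sketch, not the shell function
itself: <CODE>translate</CODE> stands for any lookup function, and the catalog
entry in the usage part is made up for the illustration.
</P>

```python
import re

# $NAME or ${NAME}, where NAME is a shell-style identifier.
_VAR_REF = re.compile(
    r"\$(?:\{([A-Za-z_][A-Za-z0-9_]*)\}|([A-Za-z_][A-Za-z0-9_]*))")


def eval_gettext(msgid, translate, env):
    """Translate msgid, then substitute only variables that msgid mentions."""
    allowed = {braced or plain for braced, plain in _VAR_REF.findall(msgid)}
    translated = translate(msgid)

    def replace(match):
        name = match.group(1) or match.group(2)
        if name not in allowed:
            return match.group(0)   # introduced by the translation: untouched
        return env.get(name, "")

    return _VAR_REF.sub(replace, translated)


if __name__ == "__main__":
    # Hypothetical catalog whose translation mentions an extra variable.
    catalog = {"Hello, $USER": "Bonjour, $USER ($HOME)"}
    env = {"USER": "ann", "HOME": "/home/ann"}
    print(eval_gettext("Hello, $USER", lambda m: catalog.get(m, m), env))
```

<P>
In this model a translation cannot cause the value of a variable to be
revealed unless the programmer already referenced that variable in the
original message.
</P>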
1010

    
1011

    
1012
<H4><A NAME="SEC248" HREF="gettext_toc.html#TOC248">13.5.2.7  Invoking the <CODE>eval_ngettext</CODE> function</A></H4>
1013

    
1014
<P>
1015
<A NAME="IDX1092"></A>
1016

    
1017
<PRE>
1018
eval_ngettext <VAR>msgid</VAR> <VAR>msgid-plural</VAR> <VAR>count</VAR>
1019
</PRE>
1020

    
1021
<P>
1022
<A NAME="IDX1093"></A>
1023
This function outputs the native language translation of a textual message
1024
whose grammatical form depends on a number, performing dollar-substitution
1025
on the result.  Note that only shell variables mentioned in <VAR>msgid</VAR> or
1026
<VAR>msgid-plural</VAR> will be dollar-substituted in the result.
1027

    
1028
</P>
1029

    
1030

    
1031
<H3><A NAME="SEC249" HREF="gettext_toc.html#TOC249">13.5.3  bash - Bourne-Again Shell Script</A></H3>
1032
<P>
1033
<A NAME="IDX1094"></A>
1034

    
1035
</P>
1036
<P>
1037
GNU <CODE>bash</CODE> 2.0 or newer has a special shorthand for translating a
1038
string and substituting variable values in it: <CODE>$"msgid"</CODE>.  But
1039
the use of this construct is <STRONG>discouraged</STRONG>, due to the security
1040
holes it opens and due to its portability problems.
1041

    
1042
</P>
1043
<P>
1044
The security holes of <CODE>$"..."</CODE> come from the fact that, after looking up
the translation of the string, <CODE>bash</CODE> processes it like any other
double-quoted string: it performs dollar and backquote processing, as <SAMP>`eval&acute;</SAMP>
does.
1048

    
1049
</P>
1050

    
1051
<OL>
1052
<LI>
1053

    
1054
In a locale whose encoding is one of BIG5, BIG5-HKSCS, GBK, GB18030, SHIFT_JIS,
1055
JOHAB, some double-byte characters have a second byte whose value is
1056
<CODE>0x60</CODE>.  For example, the byte sequence <CODE>\xe0\x60</CODE> is a single
1057
character in these locales.  Many versions of <CODE>bash</CODE> (all versions
up to bash-2.05, and newer versions on platforms without the <CODE>mbsrtowcs()</CODE>
function) don't know about character boundaries and see a backquote character
where there is only a particular Chinese character.  Thus <CODE>bash</CODE> can start
executing part of the translation as a command list.  This situation can occur
1062
even without the translator being aware of it: if the translator provides
1063
translations in the UTF-8 encoding, it is the <CODE>gettext()</CODE> function which
1064
will, during its conversion from the translator's encoding to the user's
1065
locale's encoding, produce the dangerous <CODE>\x60</CODE> bytes.
1066

    
1067
<LI>
1068

    
1069
A translator could - voluntarily or inadvertently - use backquotes
1070
<CODE>"`...`"</CODE> or dollar-parentheses <CODE>"$(...)"</CODE> in her translations.
1071
The enclosed strings would be executed as command lists by the shell.
1072
</OL>
1073

    
1074
<P>
1075
The portability problem is that <CODE>bash</CODE> must be built with
1076
internationalization support; this is normally not the case on systems
1077
that don't have the <CODE>gettext()</CODE> function in libc.
1078

    
1079
</P>
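<P>
The first security problem can be observed from Python, whose standard codecs
include some of the encodings listed above.  The following sketch (the
function name is hypothetical) searches for ideographs whose two-byte
encoding ends in the byte <CODE>0x60</CODE>, the ASCII backquote, and shows that
the UTF-8 form of the same characters contains no such byte.
</P>

```python
def chars_with_backquote_byte(encoding, limit=3):
    """Find CJK characters whose two-byte encoding ends in 0x60."""
    found = []
    for code_point in range(0x4E00, 0x9FA6):    # CJK Unified Ideographs
        char = chr(code_point)
        try:
            raw = char.encode(encoding)
        except UnicodeEncodeError:
            continue                            # not in this encoding
        if len(raw) == 2 and raw[1] == 0x60:
            found.append((char, raw))
            if len(found) == limit:
                break
    return found


if __name__ == "__main__":
    for encoding in ("big5", "gbk"):
        for char, raw in chars_with_backquote_byte(encoding):
            print(encoding, repr(char), raw.hex(),
                  "utf-8:", char.encode("utf-8").hex())
```

<P>
A program that scans such a byte string without regard to character
boundaries sees a backquote in the middle of a character, which is the
situation described in the first item above.
</P>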
1080

    
1081

    
1082
<H3><A NAME="SEC250" HREF="gettext_toc.html#TOC250">13.5.4  Python</A></H3>
1083
<P>
1084
<A NAME="IDX1095"></A>
1085

    
1086
</P>
1087
<DL COMPACT>
1088

    
1089
<DT>RPMs
1090
<DD>
1091
python
1092

    
1093
<DT>File extension
1094
<DD>
1095
<CODE>py</CODE>
1096

    
1097
<DT>String syntax
1098
<DD>
1099
<CODE>'abc'</CODE>, <CODE>u'abc'</CODE>, <CODE>r'abc'</CODE>, <CODE>ur'abc'</CODE>,
1100
<BR><CODE>"abc"</CODE>, <CODE>u"abc"</CODE>, <CODE>r"abc"</CODE>, <CODE>ur"abc"</CODE>,
1101
<BR><CODE>'''abc'''</CODE>, <CODE>u'''abc'''</CODE>, <CODE>r'''abc'''</CODE>, <CODE>ur'''abc'''</CODE>,
1102
<BR><CODE>"""abc"""</CODE>, <CODE>u"""abc"""</CODE>, <CODE>r"""abc"""</CODE>, <CODE>ur"""abc"""</CODE>
1103

    
1104
<DT>gettext shorthand
1105
<DD>
1106
<CODE>_('abc')</CODE> etc.
1107

    
1108
<DT>gettext/ngettext functions
1109
<DD>
1110
<CODE>gettext.gettext</CODE>, <CODE>gettext.dgettext</CODE>,
1111
<CODE>gettext.ngettext</CODE>, <CODE>gettext.dngettext</CODE>,
1112
also <CODE>ugettext</CODE>, <CODE>ungettext</CODE>
1113

    
1114
<DT>textdomain
1115
<DD>
1116
<CODE>gettext.textdomain</CODE> function, or
1117
<CODE>gettext.install(<VAR>domain</VAR>)</CODE> function
1118

    
1119
<DT>bindtextdomain
1120
<DD>
1121
<CODE>gettext.bindtextdomain</CODE> function, or
1122
<CODE>gettext.install(<VAR>domain</VAR>,<VAR>localedir</VAR>)</CODE> function
1123

    
1124
<DT>setlocale
1125
<DD>
1126
not used by the gettext emulation
1127

    
1128
<DT>Prerequisite
1129
<DD>
1130
<CODE>import gettext</CODE>
1131

    
1132
<DT>Use or emulate GNU gettext
1133
<DD>
1134
emulate.  Bug: uses only the first found .mo file, not all of them
1135

    
1136
<DT>Extractor
1137
<DD>
1138
<CODE>xgettext</CODE>
1139

    
1140
<DT>Formatting with positions
1141
<DD>
1142
<CODE>'...%(ident)d...' % { 'ident': value }</CODE>
1143

    
1144
<DT>Portability
1145
<DD>
1146
fully portable
1147

    
1148
<DT>po-mode marking
1149
<DD>
1150
---
1151
</DL>
1152

    
1153
<P>
1154
An example is available in the <TT>`examples&acute;</TT> directory: <CODE>hello-python</CODE>.
1155

    
1156
</P>
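<P>
A minimal usage sketch of the functions listed above, assuming Python 3 (where
<CODE>gettext</CODE> and <CODE>ngettext</CODE> return text strings).  The domain
name <CODE>hello-example</CODE> and the helper names are hypothetical; with
<CODE>fallback=True</CODE> the lookup returns the original message when no
<CODE>.mo</CODE> file is found, so the sketch runs without any catalog installed.
</P>

```python
import gettext
import tempfile


def load_translations(domain, localedir):
    """Return a translations object, falling back to untranslated text."""
    # fallback=True yields a NullTranslations object when no .mo file is
    # found, so lookups return the msgid instead of raising an error.
    return gettext.translation(domain, localedir=localedir, fallback=True)


def report(translations, count, name):
    """Build a message using named placeholders, which can be reordered."""
    _ = translations.gettext
    template = translations.ngettext(
        "%(count)d file in %(name)s", "%(count)d files in %(name)s", count)
    return _("Summary:") + " " + template % {"count": count, "name": name}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as empty_localedir:
        translations = load_translations("hello-example", empty_localedir)
        print(report(translations, 3, "/tmp"))
```

<P>
Named placeholders such as <CODE>%(count)d</CODE> let a translator change the
order of the arguments without any change to the program.
</P>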
1157

    
1158

    
1159
<H3><A NAME="SEC251" HREF="gettext_toc.html#TOC251">13.5.5  GNU clisp - Common Lisp</A></H3>
1160
<P>
1161
<A NAME="IDX1096"></A>
1162
<A NAME="IDX1097"></A>
1163
<A NAME="IDX1098"></A>
1164

    
1165
</P>
1166
<DL COMPACT>
1167

    
1168
<DT>RPMs
1169
<DD>
1170
clisp 2.28 or newer
1171

    
1172
<DT>File extension
1173
<DD>
1174
<CODE>lisp</CODE>
1175

    
1176
<DT>String syntax
1177
<DD>
1178
<CODE>"abc"</CODE>
1179

    
1180
<DT>gettext shorthand
1181
<DD>
1182
<CODE>(_ "abc")</CODE>, <CODE>(ENGLISH "abc")</CODE>
1183

    
1184
<DT>gettext/ngettext functions
1185
<DD>
1186
<CODE>i18n:gettext</CODE>, <CODE>i18n:ngettext</CODE>
1187

    
1188
<DT>textdomain
1189
<DD>
1190
<CODE>i18n:textdomain</CODE>
1191

    
1192
<DT>bindtextdomain
1193
<DD>
1194
<CODE>i18n:textdomaindir</CODE>
1195

    
1196
<DT>setlocale
1197
<DD>
1198
automatic
1199

    
1200
<DT>Prerequisite
1201
<DD>
1202
---
1203

    
1204
<DT>Use or emulate GNU gettext
1205
<DD>
1206
use
1207

    
1208
<DT>Extractor
1209
<DD>
1210
<CODE>xgettext -k_ -kENGLISH</CODE>
1211

    
1212
<DT>Formatting with positions
1213
<DD>
1214
<CODE>format "~1@*~D ~0@*~D"</CODE>
1215

    
1216
<DT>Portability
1217
<DD>
1218
On platforms without gettext, no translation.
1219

    
1220
<DT>po-mode marking
1221
<DD>
1222
---
1223
</DL>
1224

    
1225
<P>
1226
An example is available in the <TT>`examples&acute;</TT> directory: <CODE>hello-clisp</CODE>.
1227

    
1228
</P>
1229

    
1230

    
1231
<H3><A NAME="SEC252" HREF="gettext_toc.html#TOC252">13.5.6  GNU clisp C sources</A></H3>
1232
<P>
1233
<A NAME="IDX1099"></A>
1234

    
1235
</P>
1236
<DL COMPACT>
1237

    
1238
<DT>RPMs
1239
<DD>
1240
clisp
1241

    
1242
<DT>File extension
1243
<DD>
1244
<CODE>d</CODE>
1245

    
1246
<DT>String syntax
1247
<DD>
1248
<CODE>"abc"</CODE>
1249

    
1250
<DT>gettext shorthand
1251
<DD>
1252
<CODE>ENGLISH ? "abc" : ""</CODE>
1253
<BR><CODE>GETTEXT("abc")</CODE>
1254
<BR><CODE>GETTEXTL("abc")</CODE>
1255

    
1256
<DT>gettext/ngettext functions
1257
<DD>
1258
<CODE>clgettext</CODE>, <CODE>clgettextl</CODE>
1259

    
1260
<DT>textdomain
1261
<DD>
1262
---
1263

    
1264
<DT>bindtextdomain
1265
<DD>
1266
---
1267

    
1268
<DT>setlocale
1269
<DD>
1270
automatic
1271

    
1272
<DT>Prerequisite
1273
<DD>
1274
<CODE>#include "lispbibl.c"</CODE>
1275

    
1276
<DT>Use or emulate GNU gettext
1277
<DD>
1278
use
1279

    
1280
<DT>Extractor
1281
<DD>
1282
<CODE>clisp-xgettext</CODE>
1283

    
1284
<DT>Formatting with positions
1285
<DD>
1286
<CODE>fprintf "%2$d %1$d"</CODE>
1287

    
1288
<DT>Portability
1289
<DD>
1290
On platforms without gettext, no translation.
1291

    
1292
<DT>po-mode marking
1293
<DD>
1294
---
1295
</DL>
1296

    
1297

    
1298

    
1299
<H3><A NAME="SEC253" HREF="gettext_toc.html#TOC253">13.5.7  Emacs Lisp</A></H3>
1300
<P>
1301
<A NAME="IDX1100"></A>
1302

    
1303
</P>
1304
<DL COMPACT>
1305

    
1306
<DT>RPMs
1307
<DD>
1308
emacs, xemacs
1309

    
1310
<DT>File extension
1311
<DD>
1312
<CODE>el</CODE>
1313

    
1314
<DT>String syntax
1315
<DD>
1316
<CODE>"abc"</CODE>
1317

    
1318
<DT>gettext shorthand
1319
<DD>
1320
<CODE>(_"abc")</CODE>
1321

    
1322
<DT>gettext/ngettext functions
1323
<DD>
1324
<CODE>gettext</CODE>, <CODE>dgettext</CODE> (xemacs only)
1325

    
1326
<DT>textdomain
1327
<DD>
1328
<CODE>domain</CODE> special form (xemacs only)
1329

    
1330
<DT>bindtextdomain
1331
<DD>
1332
<CODE>bind-text-domain</CODE> function (xemacs only)
1333

    
1334
<DT>setlocale
1335
<DD>
1336
automatic
1337

    
1338
<DT>Prerequisite
1339
<DD>
1340
---
1341

    
1342
<DT>Use or emulate GNU gettext
1343
<DD>
1344
use
1345

    
1346
<DT>Extractor
1347
<DD>
1348
<CODE>xgettext</CODE>
1349

    
1350
<DT>Formatting with positions
1351
<DD>
1352
<CODE>format "%2$d %1$d"</CODE>
1353

    
1354
<DT>Portability
1355
<DD>
1356
Only XEmacs.  Without <CODE>I18N3</CODE> defined at build time, no translation.
1357

    
1358
<DT>po-mode marking
1359
<DD>
1360
---
1361
</DL>
1362

    
1363

    
1364

    
1365
<H3><A NAME="SEC254" HREF="gettext_toc.html#TOC254">13.5.8  librep</A></H3>
1366
<P>
1367
<A NAME="IDX1101"></A>
1368

    
1369
</P>
1370
<DL COMPACT>
1371

    
1372
<DT>RPMs
1373
<DD>
1374
librep 0.15.3 or newer
1375

    
1376
<DT>File extension
1377
<DD>
1378
<CODE>jl</CODE>
1379

    
1380
<DT>String syntax
1381
<DD>
1382
<CODE>"abc"</CODE>
1383

    
1384
<DT>gettext shorthand
1385
<DD>
1386
<CODE>(_"abc")</CODE>
1387

    
1388
<DT>gettext/ngettext functions
1389
<DD>
1390
<CODE>gettext</CODE>
1391

    
1392
<DT>textdomain
1393
<DD>
1394
<CODE>textdomain</CODE> function
1395

    
1396
<DT>bindtextdomain
1397
<DD>
1398
<CODE>bindtextdomain</CODE> function
1399

    
1400
<DT>setlocale
1401
<DD>
1402
---
1403

    
1404
<DT>Prerequisite
1405
<DD>
1406
<CODE>(require 'rep.i18n.gettext)</CODE>
1407

    
1408
<DT>Use or emulate GNU gettext
1409
<DD>
1410
use
1411

    
1412
<DT>Extractor
1413
<DD>
1414
<CODE>xgettext</CODE>
1415

    
1416
<DT>Formatting with positions
1417
<DD>
1418
<CODE>format "%2$d %1$d"</CODE>
1419

    
1420
<DT>Portability
1421
<DD>
1422
On platforms without gettext, no translation.
1423

    
1424
<DT>po-mode marking
1425
<DD>
1426
---
1427
</DL>
1428

    
1429
<P>
1430
An example is available in the <TT>`examples&acute;</TT> directory: <CODE>hello-librep</CODE>.
1431

    
1432
</P>
1433

    
1434

    
1435
<H3><A NAME="SEC255" HREF="gettext_toc.html#TOC255">13.5.9  GNU Smalltalk</A></H3>
1436
<P>
1437
<A NAME="IDX1102"></A>
1438

    
1439
</P>
1440
<DL COMPACT>
1441

    
1442
<DT>RPMs
1443
<DD>
1444
smalltalk
1445

    
1446
<DT>File extension
1447
<DD>
1448
<CODE>st</CODE>
1449

    
1450
<DT>String syntax
1451
<DD>
1452
<CODE>'abc'</CODE>
1453

    
1454
<DT>gettext shorthand
1455
<DD>
1456
<CODE>NLS ? 'abc'</CODE>
1457

    
1458
<DT>gettext/ngettext functions
1459
<DD>
1460
<CODE>LcMessagesDomain&#62;&#62;#at:</CODE>, <CODE>LcMessagesDomain&#62;&#62;#at:plural:with:</CODE>
1461

    
1462
<DT>textdomain
1463
<DD>
1464
<CODE>LcMessages&#62;&#62;#domain:localeDirectory:</CODE> (returns a <CODE>LcMessagesDomain</CODE>
1465
object).<BR>
1466
Example: <CODE>I18N Locale default messages domain: 'gettext' localeDirectory: '/usr/local/share/locale'</CODE>
1467

    
1468
<DT>bindtextdomain
1469
<DD>
1470
<CODE>LcMessages&#62;&#62;#domain:localeDirectory:</CODE>, see above.
1471

    
1472
<DT>setlocale
1473
<DD>
1474
Automatic if you use <CODE>I18N Locale default</CODE>.
1475

    
1476
<DT>Prerequisite
1477
<DD>
1478
<CODE>PackageLoader fileInPackage: 'I18N'!</CODE>
1479

    
1480
<DT>Use or emulate GNU gettext
1481
<DD>
1482
emulate
1483

    
1484
<DT>Extractor
1485
<DD>
1486
<CODE>xgettext</CODE>
1487

    
1488
<DT>Formatting with positions
1489
<DD>
1490
<CODE>'%1 %2' bindWith: 'Hello' with: 'world'</CODE>
1491

    
1492
<DT>Portability
1493
<DD>
1494
fully portable
1495

    
1496
<DT>po-mode marking
1497
<DD>
1498
---
1499
</DL>
1500

    
1501
<P>
1502
An example is available in the <TT>`examples&acute;</TT> directory:
1503
<CODE>hello-smalltalk</CODE>.
1504

    
1505
</P>
1506

    
1507

    
1508
<H3><A NAME="SEC256" HREF="gettext_toc.html#TOC256">13.5.10  Java</A></H3>
1509
<P>
1510
<A NAME="IDX1103"></A>
1511

    
1512
</P>
1513
<DL COMPACT>
1514

    
1515
<DT>RPMs
1516
<DD>
1517
java, java2
1518

    
1519
<DT>File extension
1520
<DD>
1521
<CODE>java</CODE>
1522

    
1523
<DT>String syntax
1524
<DD>
<CODE>"abc"</CODE>
1526

    
1527
<DT>gettext shorthand
1528
<DD>
1529
<CODE>_("abc")</CODE>
1530

    
1531
<DT>gettext/ngettext functions
1532
<DD>
1533
<CODE>GettextResource.gettext</CODE>, <CODE>GettextResource.ngettext</CODE>
1534

    
1535
<DT>textdomain
1536
<DD>
1537
---, use <CODE>ResourceBundle.getBundle</CODE> instead
1538

    
1539
<DT>bindtextdomain
1540
<DD>
1541
---, use CLASSPATH instead
1542

    
1543
<DT>setlocale
1544
<DD>
1545
automatic
1546

    
1547
<DT>Prerequisite
1548
<DD>
1549
---
1550

    
1551
<DT>Use or emulate GNU gettext
1552
<DD>
1553
---, uses a Java specific message catalog format
1554

    
1555
<DT>Extractor
1556
<DD>
1557
<CODE>xgettext -k_</CODE>
1558

    
1559
<DT>Formatting with positions
1560
<DD>
1561
<CODE>MessageFormat.format "{1,number} {0,number}"</CODE>
1562

    
1563
<DT>Portability
1564
<DD>
1565
fully portable
1566

    
1567
<DT>po-mode marking
1568
<DD>
1569
---
1570
</DL>
1571

    
1572
<P>
1573
Before marking strings as internationalizable, uses of the string
1574
concatenation operator need to be converted to <CODE>MessageFormat</CODE>
1575
applications.  For example, <CODE>"file "+filename+" not found"</CODE> becomes
1576
<CODE>MessageFormat.format("file {0} not found", new Object[] { filename })</CODE>.
1577
Only after this is done can the strings be marked and extracted.
1578

    
1579
</P>
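<P>
The reason for this conversion is independent of Java, and can be shown with a
small Python analogue.  The sketch uses Python's <CODE>str.format</CODE> in place
of <CODE>MessageFormat</CODE>; the function name and the sample translation are
made up for the illustration.
</P>

```python
def not_found_message(translate, filename):
    """Format a whole-sentence template with a numbered placeholder."""
    # One complete msgid with a placeholder, instead of
    # "file " + filename + " not found", which would split the sentence
    # into fragments that cannot be translated as a unit.
    return translate("file {0} not found").format(filename)


if __name__ == "__main__":
    # Hypothetical catalog in which the placeholder moves to the front.
    catalog = {"file {0} not found": "{0}: Datei nicht gefunden"}
    print(not_found_message(lambda m: m, "a.txt"))
    print(not_found_message(lambda m: catalog.get(m, m), "a.txt"))
```

<P>
With string concatenation, the translator would see only the fragments and
could not move the file name to another position in the sentence.
</P>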
1580
<P>
1581
GNU gettext uses the native Java internationalization mechanism, namely
1582
<CODE>ResourceBundle</CODE>s.  There are two formats of <CODE>ResourceBundle</CODE>s:
1583
<CODE>.properties</CODE> files and <CODE>.class</CODE> files.  The <CODE>.properties</CODE>
1584
format is a text file which the translators can directly edit, like PO
files, but which doesn't support plural forms.  The <CODE>.class</CODE> format,
by contrast, is compiled from <CODE>.java</CODE> source code and can support plural
forms (provided it is accessed through an appropriate API, see below).
1588

    
1589
</P>
1590
<P>
1591
To convert a PO file to a <CODE>.properties</CODE> file, the <CODE>msgcat</CODE>
1592
program can be used with the option <CODE>--properties-output</CODE>.  To convert
1593
a <CODE>.properties</CODE> file back to a PO file, the <CODE>msgcat</CODE> program
1594
can be used with the option <CODE>--properties-input</CODE>.  All the tools
1595
that manipulate PO files can work with <CODE>.properties</CODE> files as well,
1596
if given the <CODE>--properties-input</CODE> and/or <CODE>--properties-output</CODE>
1597
option.
1598

    
1599
</P>
1600
<P>
1601
To convert a PO file to a ResourceBundle class, the <CODE>msgfmt</CODE> program
1602
can be used with the option <CODE>--java</CODE> or <CODE>--java2</CODE>.  To convert a
1603
ResourceBundle back to a PO file, the <CODE>msgunfmt</CODE> program can be used
1604
with the option <CODE>--java</CODE>.
1605

    
1606
</P>
1607
<P>
1608
Two different programmatic APIs can be used to access ResourceBundles.
1609
Note that both APIs work with all kinds of ResourceBundles, whether
1610
GNU gettext generated classes, or other <CODE>.class</CODE> or <CODE>.properties</CODE>
1611
files.
1612

    
1613
</P>
1614

    
1615
<OL>
1616
<LI>
1617

    
1618
The <CODE>java.util.ResourceBundle</CODE> API.
1619

    
1620
In particular, its <CODE>getString</CODE> function returns a string translation.
1621
Note that a missing translation yields a <CODE>MissingResourceException</CODE>.
1622

    
1623
This has the advantage of being the standard API.  And it does not require
1624
any additional libraries, only the <CODE>msgcat</CODE> generated <CODE>.properties</CODE>
1625
files or the <CODE>msgfmt</CODE> generated <CODE>.class</CODE> files.  But it cannot do
1626
plural handling, even if the resource was generated by <CODE>msgfmt</CODE> from
1627
a PO file with plural handling.
1628

    
1629
<LI>
1630

    
1631
The <CODE>gnu.gettext.GettextResource</CODE> API.
1632

    
1633
Reference documentation in Javadoc 1.1 style format
1634
is in the <A HREF="javadoc1/tree.html">javadoc1 directory</A> and
1635
in Javadoc 2 style format
1636
in the <A HREF="javadoc2/index.html">javadoc2 directory</A>.
1637

    
1638
Its <CODE>gettext</CODE> function returns a string translation.  Note that when
1639
a translation is missing, the <VAR>msgid</VAR> argument is returned unchanged.
1640

    
1641
This has the advantage of having the <CODE>ngettext</CODE> function for plural
1642
handling.
1643

    
1644
<A NAME="IDX1104"></A>
1645
To use this API, one needs the <CODE>libintl.jar</CODE> file which is part of
1646
the GNU gettext package and distributed under the LGPL.
1647
</OL>
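<P>
The difference between the two APIs when a translation is missing can be
modelled in a few lines of Python.  This is only a model of the behaviour
described above, not Java code: a dictionary stands in for the resource
bundle, and all names in the sketch are hypothetical.
</P>

```python
class MissingResource(KeyError):
    """Stands in for Java's MissingResourceException in this sketch."""


def get_string(bundle, key):
    """Model of the first API: a missing key is reported as an error."""
    try:
        return bundle[key]
    except KeyError:
        raise MissingResource(key) from None


def gettext_lookup(bundle, msgid):
    """Model of the second API: a missing translation yields msgid."""
    return bundle.get(msgid, msgid)


if __name__ == "__main__":
    bundle = {"Hello": "Bonjour"}       # hypothetical catalog content
    print(gettext_lookup(bundle, "Hello"))
    print(gettext_lookup(bundle, "Goodbye"))
    try:
        get_string(bundle, "Goodbye")
    except MissingResource as error:
        print("missing:", error)
```

<P>
With the first API, the caller therefore has to handle the error case itself
if untranslated messages are to be displayed in their original form.
</P>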
1648

    
1649
<P>
1650
Three examples, using the second API, are available in the <TT>`examples&acute;</TT>
1651
directory: <CODE>hello-java</CODE>, <CODE>hello-java-awt</CODE>, <CODE>hello-java-swing</CODE>.
1652

    
1653
</P>
1654

    
1655

    
1656
<H3><A NAME="SEC257" HREF="gettext_toc.html#TOC257">13.5.11  GNU awk</A></H3>
1657
<P>
1658
<A NAME="IDX1105"></A>
1659
<A NAME="IDX1106"></A>
1660

    
1661
</P>
1662
<DL COMPACT>
1663

    
1664
<DT>RPMs
1665
<DD>
1666
gawk 3.1 or newer
1667

    
1668
<DT>File extension
1669
<DD>
1670
<CODE>awk</CODE>
1671

    
1672
<DT>String syntax
1673
<DD>
1674
<CODE>"abc"</CODE>
1675

    
1676
<DT>gettext shorthand
1677
<DD>
1678
<CODE>_"abc"</CODE>
1679

    
1680
<DT>gettext/ngettext functions
1681
<DD>
1682
<CODE>dcgettext</CODE>, <CODE>dcngettext</CODE> (the latter is missing in gawk-3.1.0)
1683

    
1684
<DT>textdomain
1685
<DD>
1686
<CODE>TEXTDOMAIN</CODE> variable
1687

    
1688
<DT>bindtextdomain
1689
<DD>
1690
<CODE>bindtextdomain</CODE> function
1691

    
1692
<DT>setlocale
1693
<DD>
1694
automatic, but missing <CODE>setlocale (LC_MESSAGES, "")</CODE> in gawk-3.1.0
1695

    
1696
<DT>Prerequisite
1697
<DD>
1698
---
1699

    
1700
<DT>Use or emulate GNU gettext
1701
<DD>
1702
use
1703

    
1704
<DT>Extractor
1705
<DD>
1706
<CODE>xgettext</CODE>
1707

    
1708
<DT>Formatting with positions
1709
<DD>
1710
<CODE>printf "%2$d %1$d"</CODE> (GNU awk only)
1711

    
1712
<DT>Portability
1713
<DD>
1714
On platforms without gettext, no translation.  On non-GNU awks, you must
1715
define <CODE>dcgettext</CODE>, <CODE>dcngettext</CODE> and <CODE>bindtextdomain</CODE>
1716
yourself.
1717

    
1718
<DT>po-mode marking
1719
<DD>
1720
---
1721
</DL>
1722

    
1723
<P>
1724
An example is available in the <TT>`examples&acute;</TT> directory: <CODE>hello-gawk</CODE>.
1725

    
1726
</P>
1727

    
1728

    
1729
<H3><A NAME="SEC258" HREF="gettext_toc.html#TOC258">13.5.12  Pascal - Free Pascal Compiler</A></H3>
1730
<P>
1731
<A NAME="IDX1107"></A>
1732
<A NAME="IDX1108"></A>
1733
<A NAME="IDX1109"></A>
1734

    
1735
</P>
1736
<DL COMPACT>
1737

    
1738
<DT>RPMs
1739
<DD>
1740
fpk
1741

    
1742
<DT>File extension
1743
<DD>
1744
<CODE>pp</CODE>, <CODE>pas</CODE>
1745

    
1746
<DT>String syntax
1747
<DD>
1748
<CODE>'abc'</CODE>
1749

    
1750
<DT>gettext shorthand
1751
<DD>
1752
automatic
1753

    
1754
<DT>gettext/ngettext functions
1755
<DD>
1756
---, use <CODE>ResourceString</CODE> data type instead
1757

    
1758
<DT>textdomain
1759
<DD>
1760
---, use <CODE>TranslateResourceStrings</CODE> function instead
1761

    
1762
<DT>bindtextdomain
1763
<DD>
1764
---, use <CODE>TranslateResourceStrings</CODE> function instead
1765

    
1766
<DT>setlocale
1767
<DD>
1768
automatic, but uses only LANG, not LC_MESSAGES or LC_ALL
1769

    
1770
<DT>Prerequisite
1771
<DD>
1772
<CODE>{$mode delphi}</CODE> or <CODE>{$mode objfpc}</CODE><BR><CODE>uses gettext;</CODE>
1773

    
1774
<DT>Use or emulate GNU gettext
1775
<DD>
1776
emulate partially
1777

    
1778
<DT>Extractor
1779
<DD>
1780
<CODE>ppc386</CODE> followed by <CODE>xgettext</CODE> or <CODE>rstconv</CODE>
1781

    
1782
<DT>Formatting with positions
1783
<DD>
1784
<CODE>uses sysutils;</CODE><BR><CODE>format "%1:d %0:d"</CODE>
1785

    
1786
<DT>Portability
1787
<DD>
1788
?
1789

    
1790
<DT>po-mode marking
1791
<DD>
1792
---
1793
</DL>
1794

    
1795
<P>
1796
The Pascal compiler has special support for the <CODE>ResourceString</CODE> data
1797
type.  It generates a <CODE>.rst</CODE> file.  This is then converted to a
1798
<CODE>.pot</CODE> file by use of <CODE>xgettext</CODE> or <CODE>rstconv</CODE>.  At runtime,
1799
a <CODE>.mo</CODE> file corresponding to translations of this <CODE>.pot</CODE> file
1800
can be loaded using the <CODE>TranslateResourceStrings</CODE> function in the
1801
<CODE>gettext</CODE> unit.
1802

    
1803
</P>
1804
<P>
1805
An example is available in the <TT>`examples&acute;</TT> directory: <CODE>hello-pascal</CODE>.
1806

    
1807
</P>
1808

    
1809

    
1810
<H3><A NAME="SEC259" HREF="gettext_toc.html#TOC259">13.5.13  wxWindows library</A></H3>
1811
<P>
1812
<A NAME="IDX1110"></A>
1813

    
1814
</P>
1815
<DL COMPACT>
1816

    
1817
<DT>RPMs
1818
<DD>
1819
wxGTK, gettext
1820

    
1821
<DT>File extension
1822
<DD>
1823
<CODE>cpp</CODE>
1824

    
1825
<DT>String syntax
1826
<DD>
1827
<CODE>"abc"</CODE>
1828

    
1829
<DT>gettext shorthand
1830
<DD>
1831
<CODE>_("abc")</CODE>
1832

    
1833
<DT>gettext/ngettext functions
1834
<DD>
1835
<CODE>wxLocale::GetString</CODE>, <CODE>wxGetTranslation</CODE>
1836

    
1837
<DT>textdomain
1838
<DD>
1839
<CODE>wxLocale::AddCatalog</CODE>
1840

    
1841
<DT>bindtextdomain
1842
<DD>
1843
<CODE>wxLocale::AddCatalogLookupPathPrefix</CODE>
1844

    
1845
<DT>setlocale
1846
<DD>
1847
<CODE>wxLocale::Init</CODE>, <CODE>wxSetLocale</CODE>
1848

    
1849
<DT>Prerequisite
1850
<DD>
1851
<CODE>#include &#60;wx/intl.h&#62;</CODE>
1852

    
1853
<DT>Use or emulate GNU gettext
1854
<DD>
1855
emulate, see <CODE>include/wx/intl.h</CODE> and <CODE>src/common/intl.cpp</CODE>
1856

    
1857
<DT>Extractor
1858
<DD>
1859
<CODE>xgettext</CODE>
1860

    
1861
<DT>Formatting with positions
1862
<DD>
1863
---
1864

    
1865
<DT>Portability
1866
<DD>
1867
fully portable
1868

    
1869
<DT>po-mode marking
1870
<DD>
1871
yes
1872
</DL>
1873

    
1874

    
1875

    
1876
<H3><A NAME="SEC260" HREF="gettext_toc.html#TOC260">13.5.14  YCP - YaST2 scripting language</A></H3>
1877
<P>
1878
<A NAME="IDX1111"></A>
1879
<A NAME="IDX1112"></A>
1880

    
1881
</P>
1882
<DL COMPACT>
1883

    
1884
<DT>RPMs
1885
<DD>
1886
libycp, libycp-devel, yast2-core, yast2-core-devel
1887

    
1888
<DT>File extension
1889
<DD>
1890
<CODE>ycp</CODE>
1891

    
1892
<DT>String syntax
1893
<DD>
1894
<CODE>"abc"</CODE>
1895

    
1896
<DT>gettext shorthand
1897
<DD>
1898
<CODE>_("abc")</CODE>
1899

    
1900
<DT>gettext/ngettext functions
1901
<DD>
1902
<CODE>_()</CODE> with 1 or 3 arguments
1903

    
1904
<DT>textdomain
1905
<DD>
1906
<CODE>textdomain</CODE> statement
1907

    
1908
<DT>bindtextdomain
1909
<DD>
1910
---
1911

    
1912
<DT>setlocale
1913
<DD>
1914
---
1915

    
1916
<DT>Prerequisite
1917
<DD>
1918
---
1919

    
1920
<DT>Use or emulate GNU gettext
1921
<DD>
1922
use
1923

    
1924
<DT>Extractor
1925
<DD>
1926
<CODE>xgettext</CODE>
1927

    
1928
<DT>Formatting with positions
1929
<DD>
1930
<CODE>sformat "%2 %1"</CODE>
1931

    
1932
<DT>Portability
1933
<DD>
1934
fully portable
1935

    
1936
<DT>po-mode marking
1937
<DD>
1938
---
1939
</DL>
1940

    
1941
<P>
1942
An example is available in the <TT>`examples&acute;</TT> directory: <CODE>hello-ycp</CODE>.
1943

    
1944
</P>
1945

    
1946

    
1947
<H3><A NAME="SEC261" HREF="gettext_toc.html#TOC261">13.5.15  Tcl - Tk's scripting language</A></H3>
1948
<P>
1949
<A NAME="IDX1113"></A>
1950
<A NAME="IDX1114"></A>
1951

    
1952
</P>
1953
<DL COMPACT>
1954

    
1955
<DT>RPMs
1956
<DD>
1957
tcl
1958

    
1959
<DT>File extension
1960
<DD>
1961
<CODE>tcl</CODE>
1962

    
1963
<DT>String syntax
1964
<DD>
1965
<CODE>"abc"</CODE>
1966

    
1967
<DT>gettext shorthand
1968
<DD>
1969
<CODE>[_ "abc"]</CODE>
1970

    
1971
<DT>gettext/ngettext functions
1972
<DD>
1973
<CODE>::msgcat::mc</CODE>
1974

    
1975
<DT>textdomain
1976
<DD>
1977
---
1978

    
1979
<DT>bindtextdomain
1980
<DD>
1981
---, use <CODE>::msgcat::mcload</CODE> instead
1982

    
1983
<DT>setlocale
1984
<DD>
1985
automatic, uses LANG, but ignores LC_MESSAGES and LC_ALL
1986

    
1987
<DT>Prerequisite
1988
<DD>
1989
<CODE>package require msgcat</CODE>
1990
<BR><CODE>proc _ {s} {return [::msgcat::mc $s]}</CODE>
1991

    
1992
<DT>Use or emulate GNU gettext
1993
<DD>
1994
---, uses a Tcl specific message catalog format
1995

    
1996
<DT>Extractor
1997
<DD>
1998
<CODE>xgettext -k_</CODE>
1999

    
2000
<DT>Formatting with positions
2001
<DD>
2002
<CODE>format "%2\$d %1\$d"</CODE>
2003

    
2004
<DT>Portability
2005
<DD>
2006
fully portable
2007

    
2008
<DT>po-mode marking
2009
<DD>
2010
---
2011
</DL>
2012

    
2013
<P>
2014
Two examples are available in the <TT>`examples&acute;</TT> directory:
2015
<CODE>hello-tcl</CODE>, <CODE>hello-tcl-tk</CODE>.
2016

    
2017
</P>
2018
<P>
2019
Before marking strings as internationalizable, substitutions of variables
2020
into the string need to be converted to <CODE>format</CODE> applications.  For
2021
example, <CODE>"file $filename not found"</CODE> becomes
2022
<CODE>[format "file %s not found" $filename]</CODE>.
2023
Only after this is done can the strings be marked and extracted.
2024
After marking, this example becomes
2025
<CODE>[format [_ "file %s not found"] $filename]</CODE> or
2026
<CODE>[msgcat::mc "file %s not found" $filename]</CODE>.  Note that the
2027
<CODE>msgcat::mc</CODE> function implicitly calls <CODE>format</CODE> when more than one
2028
argument is given.
2029

    
2030
</P>
2031

    
2032

    
2033
<H3><A NAME="SEC262" HREF="gettext_toc.html#TOC262">13.5.16  Perl</A></H3>
2034
<P>
2035
<A NAME="IDX1115"></A>
2036

    
2037
</P>
2038
<DL COMPACT>
2039

    
2040
<DT>RPMs
2041
<DD>
2042
perl
2043

    
2044
<DT>File extension
2045
<DD>
2046
<CODE>pl</CODE>, <CODE>PL</CODE>, <CODE>pm</CODE>, <CODE>cgi</CODE>
2047

    
2048
<DT>String syntax
2049
<DD>
2050

    
2051
<UL>
2052

    
2053
<LI><CODE>"abc"</CODE>
2054

    
2055
<LI><CODE>'abc'</CODE>
2056

    
2057
<LI><CODE>qq (abc)</CODE>
2058

    
2059
<LI><CODE>q (abc)</CODE>
2060

    
2061
<LI><CODE>qr /abc/</CODE>
2062

    
2063
<LI><CODE>qx (/bin/date)</CODE>
2064

    
2065
<LI><CODE>/pattern match/</CODE>
2066

    
2067
<LI><CODE>?pattern match?</CODE>
2068

    
2069
<LI><CODE>s/substitution/operators/</CODE>
2070

    
2071
<LI><CODE>$tied_hash{"message"}</CODE>
2072

    
2073
<LI><CODE>$tied_hash_reference-&#62;{"message"}</CODE>
2074

    
2075
<LI>etc., issue the command <SAMP>`man perlsyn&acute;</SAMP> for details
2076

    
2077
</UL>
2078

    
2079
<DT>gettext shorthand
2080
<DD>
2081
<CODE>__</CODE> (double underscore)
2082

    
2083
<DT>gettext/ngettext functions
2084
<DD>
2085
<CODE>gettext</CODE>, <CODE>dgettext</CODE>, <CODE>dcgettext</CODE>, <CODE>ngettext</CODE>,
2086
<CODE>dngettext</CODE>, <CODE>dcngettext</CODE>
2087

    
2088
<DT>textdomain
2089
<DD>
2090
<CODE>textdomain</CODE> function
2091

    
2092
<DT>bindtextdomain
2093
<DD>
2094
<CODE>bindtextdomain</CODE> function
2095

    
2096
<DT>bind_textdomain_codeset
2097
<DD>
2098
<CODE>bind_textdomain_codeset</CODE> function
2099

    
2100
<DT>setlocale
2101
<DD>
2102
Use <CODE>setlocale (LC_ALL, "");</CODE>
2103

    
2104
<DT>Prerequisite
2105
<DD>
2106
<CODE>use POSIX;</CODE>
2107
<BR><CODE>use Locale::TextDomain;</CODE> (included in the package libintl-perl
2108
which is available on the Comprehensive Perl Archive Network CPAN,
2109
http://www.cpan.org/).
2110

    
2111
<DT>Use or emulate GNU gettext
2112
<DD>
2113
platform dependent: gettext_pp emulates, gettext_xs uses GNU gettext
2114

    
2115
<DT>Extractor
2116
<DD>
2117
<CODE>xgettext -k__ -k\$__ -k%__ -k__x -k__n:1,2 -k__nx:1,2 -k__xn:1,2 -kN__ -k</CODE>
2118

    
2119
<DT>Formatting with positions
2120
<DD>
2121
Both kinds of format strings support formatting with positions.
2122
<BR><CODE>printf "%2\$d %1\$d", ...</CODE> (requires Perl 5.8.0 or newer)
2123
<BR><CODE>__expand("[new] replaces [old]", old =&#62; $oldvalue, new =&#62; $newvalue)</CODE>
2124

    
2125
<DT>Portability
2126
<DD>
2127
The <CODE>libintl-perl</CODE> package is platform independent but is not
2128
part of the Perl core.  The programmer is responsible for
2129
providing a dummy implementation of the required functions if the 
2130
package is not installed on the target system.
2131

    
2132
<DT>po-mode marking
2133
<DD>
2134
---
2135

    
2136
<DT>Documentation
2137
<DD>
2138
Included in <CODE>libintl-perl</CODE>, available on CPAN
2139
(http://www.cpan.org/).
2140

    
2141
</DL>
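<P>
The Portability entry above leaves it to the programmer to supply a dummy
implementation when <CODE>libintl-perl</CODE> is not installed.  The
following is only a minimal sketch of one way to do that, not the method
prescribed by the package: it assumes that an untranslated message is an
acceptable fallback, and it installs the fallback under the name
<CODE>gettext</CODE> so that the default <CODE>xgettext</CODE> keyword still
applies.
</P>

```perl
use strict;
use warnings;

BEGIN {
    # Locale::Messages ships with libintl-perl.  can() returns a code
    # reference to its gettext(), or undef if the module cannot be loaded.
    my $real = eval { require Locale::Messages; Locale::Messages->can('gettext') };

    # Assumed fallback: hand the message back untranslated.
    *gettext = $real || sub { return $_[0] };
}

my $message = gettext("Hello world!\n");
print $message;
```

<P>
Because the function is installed inside a <CODE>BEGIN</CODE> block, it is
already known when the rest of the file is compiled.
</P>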
2142

    
2143
<P>
2144
An example is available in the <TT>`examples&acute;</TT> directory: <CODE>hello-perl</CODE>.
2145

    
2146
</P>
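<P>
As a small illustration of the "Formatting with positions" entry above, the
following sketch shows how explicit argument indexes let a translation
reorder arguments without any change to the calling code.  The helper name
<CODE>render</CODE> and the second format string are made up for this
example; explicit indexes require Perl 5.8.0 or newer.
</P>

```perl
use strict;
use warnings;

# Hypothetical helper: apply a (possibly translated) format to its arguments.
sub render {
    my ($format, @args) = @_;
    return sprintf($format, @args);
}

# Single quotes keep Perl from interpolating "$s" into the format.
my $original  = '%1$s replaces %2$s';
my $reordered = '%2$s is replaced by %1$s';   # stands in for a translation

print render($original,  'new.txt', 'old.txt'), "\n";
print render($reordered, 'new.txt', 'old.txt'), "\n";
```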
2147
<P>
2148
<A NAME="IDX1116"></A>
2149

    
2150
</P>
2151
<P>
2152
The <CODE>xgettext</CODE> parser backend for Perl differs significantly from
2153
the parser backends for other programming languages, just as Perl
2154
itself differs significantly from other programming languages.  The
2155
Perl parser backend offers many more string marking facilities than
2156
the other backends, but it also has some Perl-specific limitations, the
worst probably being that it is imperfect.
2158

    
2159
</P>
2160

    
2161

    
2162

    
2163
<H4><A NAME="SEC263" HREF="gettext_toc.html#TOC263">13.5.16.1  General Problems Parsing Perl Code</A></H4>
2164

    
2165
<P>
2166
It is often heard that only Perl can parse Perl.  This is not true.
2167
Perl cannot be <EM>parsed</EM> at all; it can only be <EM>executed</EM>.
2168
Perl has various built-in ambiguities that can only be resolved at runtime.
2169

    
2170
</P>
2171
<P>
2172
The following example may illustrate one common problem:
2173

    
2174
</P>
2175

    
2176
<PRE>
print gettext "Hello World!";
</PRE>
2179

    
2180
<P>
2181
Although this example looks like a bullet-proof case of a function
2182
invocation, it is not:
2183

    
2184
</P>
2185

    
2186
<PRE>
open gettext, "&#62;testfile" or die;
print gettext "Hello world!"
</PRE>
2190

    
2191
<P>
2192
In this context, the string <CODE>gettext</CODE> looks more like a
2193
file handle.  But not necessarily:
2194

    
2195
</P>
2196

    
2197
<PRE>
use Locale::Messages qw (:libintl_h);
open gettext "&#62;testfile" or die;
print gettext "Hello world!";
</PRE>
2202

    
2203
<P>
2204
Now, the file is probably syntactically incorrect, provided that the module
2205
<CODE>Locale::Messages</CODE> found first in the Perl include path exports a
2206
function <CODE>gettext</CODE>.  But what if the module
2207
<CODE>Locale::Messages</CODE> really looks like this?
2208

    
2209
</P>
2210

    
2211
<PRE>
use vars qw (*gettext);

1;
</PRE>
2216

    
2217
<P>
2218
In this case, the string <CODE>gettext</CODE> will be interpreted as a file
2219
handle again, and the above example will create a file <TT>`testfile&acute;</TT>
2220
and write the string "Hello world!" into it.  Even advanced
2221
control flow analysis will not really help:
2222

    
2223
</P>
2224

    
2225
<PRE>
if (0.5 &#60; rand) {
   eval "use Sane";
} else {
   eval "use InSane";
}
print gettext "Hello world!";
</PRE>
2233

    
2234
<P>
2235
If the module <CODE>Sane</CODE> exports a function <CODE>gettext</CODE> that does
2236
what we expect, and the module <CODE>InSane</CODE> opens a file for writing
2237
and associates the <EM>handle</EM> <CODE>gettext</CODE> with this output
2238
stream, we are clueless again about what will happen at runtime.  It is
2239
completely unpredictable.  The truth is that Perl has so many ways to
2240
fill its symbol table at runtime that it is impossible to interpret a
2241
particular piece of code without executing it.
2242

    
2243
</P>
2244
<P>
2245
Of course, <CODE>xgettext</CODE> will not execute your Perl sources while
2246
scanning for translatable strings, but rather use heuristics in order
2247
to guess what you meant.
2248

    
2249
</P>
2250
<P>
2251
Another problem is the ambiguity of the slash and the question mark.
2252
Their interpretation depends on the context:
2253

    
2254
</P>
2255

    
2256
<PRE>
# A pattern match.
print "OK\n" if /foobar/;

# A division.
print 1 / 2;

# Another pattern match.
print "OK\n" if ?foobar?;

# Conditional.
print $x ? "foo" : "bar";
</PRE>
2269

    
2270
<P>
2271
The slash may either act as the division operator or introduce a
2272
pattern match, whereas the question mark may act as the ternary
2273
conditional operator or as a pattern match, too.  Other programming
2274
languages like <CODE>awk</CODE> present similar problems, but the consequences of a
2275
misinterpretation are particularly nasty with Perl sources.  In <CODE>awk</CODE>
2276
for instance, a statement can never exceed one line and the parser
2277
can recover from a parsing error at the next newline and interpret
2278
the rest of the input stream correctly.  Perl is different, as a
2279
pattern match is terminated by the next appearance of the delimiter
2280
(the slash or the question mark) in the input stream, regardless of
2281
the semantic context.  If a slash is really a division sign but
2282
mis-interpreted as a pattern match, the rest of the input file is most
2283
probably parsed incorrectly.
2284

    
2285
</P>
2286
<P>
2287
If you find that <CODE>xgettext</CODE> fails to extract strings from
2288
portions of your sources, you should therefore look out for slashes
2289
and/or question marks preceding these sections.  You may have come
2290
across a bug in <CODE>xgettext</CODE>'s Perl parser (and of course you
2291
should report that bug).  In the meantime you should consider
reformulating your code in a manner less challenging to <CODE>xgettext</CODE>.
2293

    
2294
</P>
2295

    
2296

    
2297
<H4><A NAME="SEC264" HREF="gettext_toc.html#TOC264">13.5.16.2  Which keywords will xgettext look for?</A></H4>
2298
<P>
2299
<A NAME="IDX1117"></A>
2300

    
2301
</P>
2302
<P>
2303
Unless you instruct <CODE>xgettext</CODE> otherwise by invoking it with one
2304
of the options <CODE>--keyword</CODE> or <CODE>-k</CODE>, it will recognize the
2305
following keywords in your Perl sources:
2306

    
2307
</P>
2308

    
2309
<UL>
2310

    
2311
<LI><CODE>gettext</CODE>
2312

    
2313
<LI><CODE>dgettext</CODE>
2314

    
2315
<LI><CODE>dcgettext</CODE>
2316

    
2317
<LI><CODE>ngettext:1,2</CODE>
2318

    
2319
The first (singular) and the second (plural) argument will be
2320
extracted.
2321

    
2322
<LI><CODE>dngettext:1,2</CODE>
2323

    
2324
The first (singular) and the second (plural) argument will be
2325
extracted.
2326

    
2327
<LI><CODE>dcngettext:1,2</CODE>
2328

    
2329
The first (singular) and the second (plural) argument will be
2330
extracted.
2331

    
2332
<LI><CODE>gettext_noop</CODE>
2333

    
2334
<LI><CODE>%gettext</CODE>
2335

    
2336
The keys of lookups into the hash <CODE>%gettext</CODE> will be extracted.
2337

    
2338
<LI><CODE>$gettext</CODE>
2339

    
2340
The keys of lookups into the hash reference <CODE>$gettext</CODE> will be extracted.
2341

    
2342
</UL>
2343

    
2344

    
2345

    
2346
<H4><A NAME="SEC265" HREF="gettext_toc.html#TOC265">13.5.16.3  How to Extract Hash Keys</A></H4>
2347
<P>
2348
<A NAME="IDX1118"></A>
2349

    
2350
</P>
2351
<P>
2352
Translating messages at runtime is normally performed by looking up the
2353
original string in the translation database and returning the
2354
translated version.  The "natural" Perl implementation is a hash
2355
lookup, and, of course, <CODE>xgettext</CODE> supports such practice.
2356

    
2357
</P>
2358

    
2359
<PRE>
print __"Hello world!";
print $__{"Hello world!"};
print $__-&#62;{"Hello world!"};
print $$__{"Hello world!"};
</PRE>
2365

    
2366
<P>
2367
The above four lines all do the same thing.  The Perl module 
2368
<CODE>Locale::TextDomain</CODE> exports by default a hash <CODE>%__</CODE> that
2369
is tied to the function <CODE>__()</CODE>.  It also exports a reference
2370
<CODE>$__</CODE> to <CODE>%__</CODE>.
2371

    
2372
</P>
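<P>
To make the idea of a hash that is tied to a function more concrete, here is
a minimal, self-contained sketch using only core Perl.  It is not the
implementation used by <CODE>Locale::TextDomain</CODE>; the package name
<CODE>TranslatingHash</CODE>, the function <CODE>lookup</CODE>, the variable
<CODE>%tr</CODE> and the one-entry catalog are all invented for illustration.
</P>

```perl
use strict;
use warnings;

# Hypothetical stand-in for a message catalog.
my %catalog = ('Hello world!' => 'Hallo Welt!');

# Hypothetical lookup function playing the role of __().
sub lookup {
    my ($msgid) = @_;
    return exists $catalog{$msgid} ? $catalog{$msgid} : $msgid;
}

package TranslatingHash;

# tie() calls TIEHASH once; every later read of the hash calls FETCH.
sub TIEHASH { my ($class) = @_; return bless {}, $class }
sub FETCH   { my ($self, $key) = @_; return main::lookup($key) }

package main;

tie my %tr, 'TranslatingHash';
my $tr = \%tr;    # a reference to the tied hash

# All four lines print the same translated string.
print lookup("Hello world!"), "\n";
print $tr{"Hello world!"}, "\n";
print $tr->{"Hello world!"}, "\n";
print $$tr{"Hello world!"}, "\n";
```

<P>
Because a hash lookup, unlike a function call, is interpolated inside
double-quoted strings and here documents, the tied hash is the form that can
be embedded directly in such text.
</P>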
2373
<P>
2374
If an argument to the <CODE>xgettext</CODE> option <CODE>--keyword</CODE>
(or <CODE>-k</CODE>) starts with a percent sign, the rest of the keyword is
2376
interpreted as the name of a hash.  If it starts with a dollar
2377
sign, the rest of the keyword is interpreted as a reference to a
2378
hash.
2379

    
2380
</P>
2381
<P>
2382
Note that you can omit the quotation marks (single or double) around
2383
the hash key (almost) whenever Perl itself allows it:
2384

    
2385
</P>
2386

    
2387
<PRE>
print $gettext{Error};
</PRE>
2390

    
2391
<P>
2392
The exact rule is: you can omit the surrounding quotes when the hash
key is a valid C (!) identifier, i.e. when it starts with an
underscore or an ASCII letter and is followed by an arbitrary number
of underscores, ASCII letters or digits.  Other Unicode characters
are <EM>not</EM> allowed, regardless of the <CODE>use utf8</CODE> pragma.
2397

    
2398
</P>
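<P>
The rule can be written down as a regular expression.  The sketch below is
only an illustration of the rule as stated in the text; the function name
<CODE>key_may_be_unquoted</CODE> is made up, and the check says nothing about
how <CODE>xgettext</CODE> implements it internally.
</P>

```perl
use strict;
use warnings;

# Returns 1 if $key is a valid C identifier (ASCII letter or underscore,
# followed by ASCII letters, digits or underscores), otherwise 0.
sub key_may_be_unquoted {
    my ($key) = @_;
    return $key =~ /\A[A-Za-z_][A-Za-z0-9_]*\z/ ? 1 : 0;
}

for my $key ('Error', '_private', 'file2', '2files', 'not found', 'my-key') {
    printf "%-10s %s\n", $key,
        key_may_be_unquoted($key) ? 'bare key is fine' : 'needs quotes';
}
```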
2399

    
2400

    
2401
<H4><A NAME="SEC266" HREF="gettext_toc.html#TOC266">13.5.16.4  What are Strings And Quote-like Expressions?</A></H4>
2402
<P>
2403
<A NAME="IDX1119"></A>
2404

    
2405
</P>
2406
<P>
2407
Perl offers a plethora of different string constructs.  Those that can
2408
be used either as arguments to functions or inside braces for hash
2409
lookups are generally supported by <CODE>xgettext</CODE>.  
2410

    
2411
</P>
2412

    
2413
<UL>
2414
<LI><STRONG>double-quoted strings</STRONG>
2415

    
2416
<BR>
2417

    
2418
<PRE>
print gettext "Hello World!";
</PRE>
2421

    
2422
<LI><STRONG>single-quoted strings</STRONG>
2423

    
2424
<BR>
2425

    
2426
<PRE>
print gettext 'Hello World!';
</PRE>
2429

    
2430
<LI><STRONG>the operator qq</STRONG>
2431

    
2432
<BR>
2433

    
2434
<PRE>
print gettext qq |Hello World!|;
print gettext qq &#60;E-mail: &#60;guido\@imperia.net&#62;&#62;;
</PRE>
2438

    
2439
The operator <CODE>qq</CODE> is fully supported.  You can use arbitrary
2440
delimiters, including the four bracketing delimiters (round, angle,
2441
square, curly) that nest.
2442

    
2443
<LI><STRONG>the operator q</STRONG>
2444

    
2445
<BR>
2446

    
2447
<PRE>
print gettext q |Hello World!|;
print gettext q &#60;E-mail: &#60;guido@imperia.net&#62;&#62;;
</PRE>
2451

    
2452
The operator <CODE>q</CODE> is fully supported.  You can use arbitrary
2453
delimiters, including the four bracketing delimiters (round, angle,
2454
square, curly) that nest.
2455

    
2456
<LI><STRONG>the operator qx</STRONG>
2457

    
2458
<BR>
2459

    
2460
<PRE>
# The first two semicolons delimit the command; the third ends the statement.
print gettext qx ;LANGUAGE=C /bin/date;;
print gettext qx [/usr/bin/ls | grep '^[A-Z]*'];
</PRE>
2464

    
2465
The operator <CODE>qx</CODE> is fully supported.  You can use arbitrary
2466
delimiters, including the four bracketing delimiters (round, angle,
2467
square, curly) that nest.
2468

    
2469
The example is actually a useless use of <CODE>gettext</CODE>.  It will
2470
invoke the <CODE>gettext</CODE> function on the output of the command
2471
specified with the <CODE>qx</CODE> operator.  The feature was included
2472
in order to make the interface consistent (the parser will extract
2473
all strings and quote-like expressions).
2474

    
2475
<LI><STRONG>here documents</STRONG>
2476

    
2477
<BR>
2478

    
2479
<PRE>
print gettext &#60;&#60;'EOF';
program not found in $PATH
EOF

print ngettext &#60;&#60;EOF, &#60;&#60;"EOF";
one file deleted
EOF
several files deleted
EOF
</PRE>
2490

    
2491
Here-documents are recognized.  If the delimiter is enclosed in single
2492
quotes, the string is not interpolated.  If it is enclosed in double
2493
quotes or has no quotes at all, the string is interpolated.
2494

    
2495
Delimiters that start with a digit are not supported!
2496

    
2497
</UL>
2498

    
2499

    
2500

    
2501
<H4><A NAME="SEC267" HREF="gettext_toc.html#TOC267">13.5.16.5  Invalid Uses Of String Interpolation</A></H4>
2502
<P>
2503
<A NAME="IDX1120"></A>
2504

    
2505
</P>
2506
<P>
2507
Perl is capable of interpolating variables into strings.  This offers
2508
some nice features in localized programs but can also lead to
2509
problems.
2510

    
2511
</P>
2512
<P>
2513
A common error is a construct like the following:
2514

    
2515
</P>
2516

    
2517
<PRE>
print gettext "This is the program $0!\n";
</PRE>
2520

    
2521
<P>
2522
Perl will interpolate at runtime the value of the variable <CODE>$0</CODE>
2523
into the argument of the <CODE>gettext()</CODE> function.  Hence, this
2524
argument is not a string constant but a variable argument (<CODE>$0</CODE>
2525
is a global variable that holds the name of the Perl script being
2526
executed).  The interpolation is performed by Perl before the string
2527
argument is passed to <CODE>gettext()</CODE> and will therefore depend on
2528
the name of the script which can only be determined at runtime.
2529
Consequently, it is almost impossible that a translation can be looked
2530
up at runtime (except if, by accident, the interpolated string is found
2531
in the message catalog).
2532

    
2533
</P>
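<P>
The effect can be reproduced without any message catalog files.  In the
following sketch a plain hash and a function called <CODE>lookup</CODE> stand
in for the catalog and for <CODE>gettext()</CODE>; both, as well as the
German text and the program name, are assumptions made for the example.  It
also shows the usual remedy of keeping the message constant and filling in
the value afterwards.
</P>

```perl
use strict;
use warnings;

# Hypothetical one-entry catalog and lookup function standing in for gettext().
my %catalog = ('This is the program %s!' => 'Dies ist das Programm %s!');

sub lookup {
    my ($msgid) = @_;
    return exists $catalog{$msgid} ? $catalog{$msgid} : $msgid;
}

my $name = 'frob';

# Interpolation happens first, so the key is "This is the program frob!",
# which is not in the catalog; the message stays untranslated.
print lookup("This is the program $name!"), "\n";

# With a placeholder, the constant message is found and filled in afterwards.
print sprintf(lookup('This is the program %s!'), $name), "\n";
```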
2534
<P>
2535
The <CODE>xgettext</CODE> program will therefore terminate parsing with a fatal
2536
error if it encounters a variable inside of an extracted string.  In
2537
general, this will happen for all kinds of string interpolations that
2538
cannot be safely performed at compile time.  If you absolutely know
2539
what you are doing, you can always circumvent this behavior:
2540

    
2541
</P>
2542

    
2543
<PRE>
my $know_what_i_am_doing = "This is program $0!\n";
print gettext $know_what_i_am_doing;
</PRE>
2547

    
2548
<P>
2549
Since the parser only recognizes strings and quote-like expressions,
2550
but not variables or other terms, the above construct will be
2551
accepted.  You will have to find another way, however, to let your
2552
original string make it into your message catalog.
2553

    
2554
</P>
2555
<P>
2556
If <CODE>xgettext</CODE> is invoked with the option <CODE>--extract-all</CODE>
(or <CODE>-a</CODE>), variable interpolation will be accepted.  Rationale: You will
2558
generally use this option in order to prepare your sources for
2559
internationalization.
2560

    
2561
</P>
2562
<P>
2563
Please see the manual page <SAMP>`man perlop&acute;</SAMP> for details of strings and
2564
quote-like expressions that are subject to interpolation and those
2565
that are not.  Safe interpolations (that will not lead to a fatal
2566
error) are:
2567

    
2568
</P>
2569

    
2570
<UL>
2571

    
2572
<LI>the escape sequences <CODE>\t</CODE> (tab, HT, TAB), <CODE>\n</CODE>
2573

    
2574
(newline, NL), <CODE>\r</CODE> (return, CR), <CODE>\f</CODE> (form feed, FF),
2575
<CODE>\b</CODE> (backspace, BS), <CODE>\a</CODE> (alarm, bell, BEL), and <CODE>\e</CODE>
2576
(escape, ESC).
2577

    
2578
<LI>octal chars, like <CODE>\033</CODE>
2579

    
2580
<BR>
2581
Note that octal escapes in the range of 400-777 are translated into a 
2582
UTF-8 representation, regardless of the presence of the <CODE>use utf8</CODE> pragma.
2583

    
2584
<LI>hex chars, like <CODE>\x1b</CODE>
2585

    
2586
<LI>wide hex chars, like <CODE>\x{263a}</CODE>
2587

    
2588
<BR>
2589
Note that this escape is translated into a UTF-8 representation,
2590
regardless of the presence of the <CODE>use utf8</CODE> pragma.
2591

    
2592
<LI>control chars, like <CODE>\c[</CODE> (CTRL-[)
2593

    
2594
<LI>named Unicode chars, like <CODE>\N{LATIN CAPITAL LETTER C WITH CEDILLA}</CODE>
2595

    
2596
<BR>
2597
Note that this escape is translated into a UTF-8 representation,
2598
regardless of the presence of the <CODE>use utf8</CODE> pragma.
2599
</UL>
2600

    
2601
<P>
2602
The following escapes are considered partially safe:
2603

    
2604
</P>
2605

    
2606
<UL>
2607

    
2608
<LI><CODE>\l</CODE> lowercase next char
2609

    
2610
<LI><CODE>\u</CODE> uppercase next char
2611

    
2612
<LI><CODE>\L</CODE> lowercase till \E
2613

    
2614
<LI><CODE>\U</CODE> uppercase till \E
2615

    
2616
<LI><CODE>\E</CODE> end case modification
2617

    
2618
<LI><CODE>\Q</CODE> quote non-word characters till \E
2619

    
2620
</UL>
2621

    
2622
<P>
2623
These escapes are only considered safe if the string consists of
2624
ASCII characters only.  Translation of characters outside the range
2625
defined by ASCII is locale-dependent and can actually only be performed 
2626
at runtime; <CODE>xgettext</CODE> doesn't do these locale-dependent translations
2627
at extraction time.
2628

    
2629
</P>
2630
<P>
2631
Except for the modifier <CODE>\Q</CODE>, these translations, albeit valid,
2632
are generally useless and only obfuscate your sources.  If a
2633
translation can be safely performed at compile time you can just as
2634
well write what you mean.
2635

    
2636
</P>
2637

    
2638

    
2639
<H4><A NAME="SEC268" HREF="gettext_toc.html#TOC268">13.5.16.6  Valid Uses Of String Interpolation</A></H4>
2640
<P>
2641
<A NAME="IDX1121"></A>
2642

    
2643
</P>
2644
<P>
2645
Perl is often used to generate sources for other programming languages
2646
or arbitrary file formats.  Web applications that output HTML code
are a prominent example of such usage.
2648

    
2649
</P>
2650
<P>
2651
You will often come across situations where you want to intersperse
2652
code written in the target (programming) language with translatable
2653
messages, like in the following HTML example:
2654

    
2655
</P>
2656

    
2657
<PRE>
print gettext &#60;&#60;EOF;
&#60;h1&#62;My Homepage&#60;/h1&#62;
&#60;script language="JavaScript"&#62;&#60;!--
for (i = 0; i &#60; 100; ++i) {
    alert ("Thank you so much for visiting my homepage!");
}
//--&#62;&#60;/script&#62;
EOF
</PRE>
2667

    
2668
<P>
2669
The parser will extract the entire here document, and it will appear
2670
entirely in the resulting PO file, including the JavaScript snippet
2671
embedded in the HTML code.  If you overuse constructs like
the above, you run the risk that the translators of your package
will look for a less challenging project.  You should consider an
alternative formulation here:
2675

    
2676
</P>
2677

    
2678
<PRE>
print &#60;&#60;EOF;
&#60;h1&#62;$gettext{"My Homepage"}&#60;/h1&#62;
&#60;script language="JavaScript"&#62;&#60;!--
for (i = 0; i &#60; 100; ++i) {
    alert ("$gettext{'Thank you so much for visiting my homepage!'}");
}
//--&#62;&#60;/script&#62;
EOF
</PRE>
2688

    
2689
<P>
2690
Only the translatable portions of the code will be extracted here, and
the resulting PO file will be considerably easier to read.
2692

    
2693
</P>
2694
<P>
2695
You can interpolate hash lookups in all strings or quote-like
2696
expressions that are subject to interpolation (see the manual page
2697
<SAMP>`man perlop&acute;</SAMP> for details).  Double interpolation is invalid, however:
2698

    
2699
</P>
2700

    
2701
<PRE>
# TRANSLATORS: Replace "the earth" with the name of your planet.
print gettext qq{Welcome to $gettext-&#62;{"the earth"}};
</PRE>
2705

    
2706
<P>
2707
The <CODE>qq</CODE>-quoted string is first recognized as an argument to
<CODE>gettext</CODE> and is therefore checked for invalid variable
interpolation.  The dollar sign of the hash dereference then terminates
the parser with an "invalid interpolation" error.
2711

    
2712
</P>
2713
<P>
2714
It is valid to interpolate hash lookups in regular expressions:
2715

    
2716
</P>
2717

    
2718
<PRE>
if ($var =~ /$gettext{"the earth"}/) {
   print gettext "Match!\n";
}
s/$gettext{"U. S. A."}/$gettext{"U. S. A."} $gettext{"(dial +0)"}/g;
</PRE>
2724

    
2725

    
2726

    
2727
<H4><A NAME="SEC269" HREF="gettext_toc.html#TOC269">13.5.16.7  When To Use Parentheses</A></H4>
2728
<P>
2729
<A NAME="IDX1122"></A>
2730

    
2731
</P>
2732
<P>
2733
In Perl, parentheses around function arguments are mostly optional.
2734
<CODE>xgettext</CODE> will always assume that all
2735
recognized keywords (except for hashes and hash references) are names
2736
of properly prototyped functions, and will (hopefully) only require
2737
parentheses where Perl itself requires them.  All constructs in the
2738
following example are therefore ok to use:
2739

    
2740
</P>
2741

    
2742
<PRE>
print gettext ("Hello World!\n");
print gettext "Hello World!\n";
print dgettext ($package =&#62; "Hello World!\n");
print dgettext $package, "Hello World!\n";

# The "fat comma" =&#62; turns the left-hand side argument into a
# single-quoted string!
print dgettext smellovision =&#62; "Hello World!\n";

# The following assignment only works with prototyped functions.
# Otherwise, the functions will act as "greedy" list operators and
# eat up all following arguments.
my $anonymous_hash = {
   planet =&#62; gettext "earth",
   cakes =&#62; ngettext "one cake", "several cakes", $n,
   still =&#62; $works,
};
# The same without fat comma:
my $other_hash = {
   'planet', gettext "earth",
   'cakes', ngettext "one cake", "several cakes", $n,
   'still', $works,
};

# Parentheses are only significant for the first argument.
print dngettext 'package', ("one cake", "several cakes", $n), $discarded;
</PRE>
2770

    
2771

    
2772

    
2773
<H4><A NAME="SEC270" HREF="gettext_toc.html#TOC270">13.5.16.8  How To Grok with Long Lines</A></H4>
2774
<P>
2775
<A NAME="IDX1123"></A>
2776

    
2777
</P>
2778
<P>
2779
The necessity of long messages can often lead to a cumbersome or
2780
unreadable coding style.  Perl has several options that may prevent
2781
you from writing unreadable code, and
2782
<CODE>xgettext</CODE> does its best to do likewise.  This is where the dot
2783
operator (the string concatenation operator) may come in handy:
2784

    
2785
</P>
2786

    
2787
<PRE>
print gettext ("This is a very long"
               . " message that is still"
               . " readable, because"
               . " it is split into"
               . " multiple lines.\n");
</PRE>
2794

    
2795
<P>
2796
Perl is smart enough to concatenate these constant string fragments
2797
into one long string at compile time, and so is
2798
<CODE>xgettext</CODE>.  You will only find one long message in the resulting
2799
POT file.
2800

    
2801
</P>
2802
<P>
2803
Note that the future Perl 6 will probably use the underscore
2804
(<SAMP>`_&acute;</SAMP>) as the string concatenation operator, and the dot 
2805
(<SAMP>`.&acute;</SAMP>) for dereferencing.  This new syntax is not yet supported by
2806
<CODE>xgettext</CODE>.
2807

    
2808
</P>
2809
<P>
2810
If embedded newline characters are not an issue, or even desired, you
2811
may also insert newline characters inside quoted strings wherever you
2812
feel like it:
2813

    
2814
</P>
2815

    
2816
<PRE>
print gettext ("&#60;em&#62;In HTML output
embedded newlines are generally no
problem, since adjacent whitespace
is always rendered into a single
space character.&#60;/em&#62;");
</PRE>
2823

    
2824
<P>
2825
You may also consider using here documents:
2826

    
2827
</P>
2828

    
2829
<PRE>
print gettext &#60;&#60;EOF;
&#60;em&#62;In HTML output
embedded newlines are generally no
problem, since adjacent whitespace
is always rendered into a single
space character.&#60;/em&#62;
EOF
</PRE>
2838

    
2839
<P>
2840
Please do not forget that the line breaks are real, i.e. they
2841
translate into newline characters that will consequently show up in
2842
the resulting POT file.
2843

    
2844
</P>
2845

    
2846

    
2847
<H4><A NAME="SEC271" HREF="gettext_toc.html#TOC271">13.5.16.9  Bugs, Pitfalls, And Things That Do Not Work</A></H4>
2848
<P>
2849
<A NAME="IDX1124"></A>
2850

    
2851
</P>
2852
<P>
2853
The foregoing sections should have proven that
2854
<CODE>xgettext</CODE> is quite smart in extracting translatable strings from
2855
Perl sources.  Yet some more or less exotic constructs that could be
expected to work actually do not.
2857

    
2858
</P>
2859
<P>
2860
One of the more relevant limitations can be found in the
2861
implementation of variable interpolation inside quoted strings.  Only
2862
simple hash lookups can be used there:
2863

    
2864
</P>
2865

    
2866
<PRE>
print &#60;&#60;EOF;
$gettext{"The dot operator"
          . " does not work"
          . " here!"}
Likewise, you cannot @{[ gettext ("interpolate function calls") ]}
inside quoted strings or quote-like expressions.
EOF
</PRE>
2875

    
2876
<P>
2877
This is valid Perl code and will actually trigger invocations of the
2878
<CODE>gettext</CODE> function at runtime.  Yet, the Perl parser in
2879
<CODE>xgettext</CODE> will fail to recognize the strings.  A less obvious
2880
example can be found in the interpolation of regular expressions:
2881

    
2882
</P>
2883

    
2884
<PRE>
s/&#60;!--START_OF_WEEK--&#62;/gettext ("Sunday")/e;
</PRE>
2887

    
2888
<P>
2889
The modifier <CODE>e</CODE> will cause the substitution to be interpreted as
2890
an evaluable statement.  Consequently, at runtime the function
2891
<CODE>gettext()</CODE> is called, but again, the parser fails to extract the
2892
string "Sunday".  Use a temporary variable as a simple workaround if
2893
you really happen to need this feature:
2894

    
2895
</P>
2896

    
2897
<PRE>
my $sunday = gettext "Sunday";
s/&#60;!--START_OF_WEEK--&#62;/$sunday/;
</PRE>
2901

    
2902
<P>
2903
Hash slices would also be handy but are not recognized:
2904

    
2905
</P>
2906

    
2907
<PRE>
my @weekdays = @gettext{'Sunday', 'Monday', 'Tuesday', 'Wednesday',
                        'Thursday', 'Friday', 'Saturday'};
# Or even:
@weekdays = @gettext{qw (Sunday Monday Tuesday Wednesday Thursday
                         Friday Saturday) };
</PRE>
2914

    
2915
<P>
2916
This is perfectly valid usage of the tied hash <CODE>%gettext</CODE> but the
2917
strings are not recognized and therefore will not be extracted.
2918

    
2919
</P>
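<P>
One possible way around the missing support for hash slices, sketched here
under the assumption that a slightly longer list is acceptable, is to call
the function once per string.  Each string is then the argument of a
recognized keyword.  The stub <CODE>gettext</CODE> below exists only so that
the sketch runs without <CODE>libintl-perl</CODE>; a real program would
import the function instead.
</P>

```perl
use strict;
use warnings;

# Stub for illustration only: returns the message untranslated.
sub gettext { return $_[0] }

# One explicit call per string instead of a hash slice.
my @weekdays = (
    gettext ("Sunday"),    gettext ("Monday"),   gettext ("Tuesday"),
    gettext ("Wednesday"), gettext ("Thursday"), gettext ("Friday"),
    gettext ("Saturday"),
);

print join (", ", @weekdays), "\n";
```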
2920
<P>
2921
Another caveat of the current version is its rudimentary support for
2922
non-ASCII characters in identifiers.  You may encounter serious
2923
problems if you use identifiers with characters outside the range of
2924
'A'-'Z', 'a'-'z', '0'-'9' and the underscore '_'.
2925

    
2926
</P>
2927
<P>
2928
Maybe some of these missing features will be implemented in future
2929
versions, but since you can always make do without them at minimal effort,
2930
these todos have very low priority.
2931

    
2932
</P>
2933
<P>
2934
Brace format strings that already contain braces as part of the normal
text are a nasty problem, for example the usage strings typically
encountered in programs:
2937

    
2938
</P>
2939

    
2940
<PRE>
die "usage: $0 {OPTIONS} FILENAME...\n";
</PRE>
2943

    
2944
<P>
2945
If you want to internationalize this code with Perl brace format strings,
2946
you will run into a problem:
2947

    
2948
</P>
2949

    
2950
<PRE>
die __x ("usage: {program} {OPTIONS} FILENAME...\n", program =&#62; $0);
</PRE>
2953

    
2954
<P>
2955
Whereas <SAMP>`{program}&acute;</SAMP> is a placeholder, <SAMP>`{OPTIONS}&acute;</SAMP>
2956
is not and should probably be translated. Yet, there is no way to teach
2957
the Perl parser in <CODE>xgettext</CODE> to recognize the first one, and leave
2958
the other one alone.
2959

    
2960
</P>
2961
<P>
2962
There are two possible work-arounds for this problem.  If you are
2963
sure that your program will run under Perl 5.8.0 or newer (these
2964
Perl versions handle positional parameters in <CODE>printf()</CODE>) or
2965
if you are sure that the translator will not have to reorder the arguments
2966
in her translation -- for example if you have only one brace placeholder
2967
in your string, or if it describes a syntax, like in this one --, you can
2968
mark the string as <CODE>no-perl-brace-format</CODE> and use <CODE>printf()</CODE>:
2969

    
2970
</P>
2971

    
2972
<PRE>
# xgettext: no-perl-brace-format
die sprintf ("usage: %s {OPTIONS} FILENAME...\n", $0);
</PRE>
2976

    
2977
<P>
2978
If you want to use the more portable Perl brace format, you will have to
put placeholders in place of the literal braces:
2980

    
2981
</P>
2982

    
2983
<PRE>
die __x ("usage: {program} {[}OPTIONS{]} FILENAME...\n",
         program =&#62; $0, '[' =&#62; '{', ']' =&#62; '}');
</PRE>
2987

    
2988
<P>
2989
Perl brace format strings know no escaping mechanism.  No matter what such
an escaping mechanism looked like, it would either give the programmer a
2991
hard time, make translating Perl brace format strings heavy-going, or
2992
result in a performance penalty at runtime, when the format directives
2993
get executed.  Most of the time you will happily get along with
2994
<CODE>printf()</CODE> for this special case.
2995

    
2996
</P>
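<P>
To see why the placeholder trick for literal braces works, it may help to
look at a simplified model of brace expansion.  The function
<CODE>expand_braces</CODE> below is a hypothetical stand-in written for this
illustration, not the code from <CODE>libintl-perl</CODE>.  It assumes that a
brace group is replaced only when a value with that name was passed in, and
that every other brace group is left untouched.
</P>

```perl
use strict;
use warnings;

# Simplified model: replace {name} only if a value for name was supplied.
sub expand_braces {
    my ($format, %args) = @_;
    $format =~ s/\{([^{}]+)\}/exists $args{$1} ? $args{$1} : '{' . $1 . '}'/ge;
    return $format;
}

# {OPTIONS} has no matching argument and therefore survives as literal text.
print expand_braces ("usage: {program} {OPTIONS} FILENAME...\n",
                     program => 'frob');

# The placeholders {[} and {]} expand to the literal braces.
print expand_braces ("usage: {program} {[}OPTIONS{]} FILENAME...\n",
                     program => 'frob', '[' => '{', ']' => '}');
```

<P>
In this model both calls produce the same text.  Only the second format
consists entirely of placeholders, so the text between them is unambiguous
for the extractor and for the translator.
</P>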


<H3><A NAME="SEC272" HREF="gettext_toc.html#TOC272">13.5.17  PHP Hypertext Preprocessor</A></H3>
<P>
<A NAME="IDX1125"></A>

</P>
<DL COMPACT>

<DT>RPMs
<DD>
mod_php4, mod_php4-core, phpdoc

<DT>File extension
<DD>
<CODE>php</CODE>, <CODE>php3</CODE>, <CODE>php4</CODE>

<DT>String syntax
<DD>
<CODE>"abc"</CODE>, <CODE>'abc'</CODE>

<DT>gettext shorthand
<DD>
<CODE>_("abc")</CODE>

<DT>gettext/ngettext functions
<DD>
<CODE>gettext</CODE>, <CODE>dgettext</CODE>, <CODE>dcgettext</CODE>; starting with PHP 4.2.0
also <CODE>ngettext</CODE>, <CODE>dngettext</CODE>, <CODE>dcngettext</CODE>

<DT>textdomain
<DD>
<CODE>textdomain</CODE> function

<DT>bindtextdomain
<DD>
<CODE>bindtextdomain</CODE> function

<DT>setlocale
<DD>
Programmer must call <CODE>setlocale (LC_ALL, "")</CODE>

<DT>Prerequisite
<DD>
---

<DT>Use or emulate GNU gettext
<DD>
use

<DT>Extractor
<DD>
<CODE>xgettext</CODE>

<DT>Formatting with positions
<DD>
<CODE>printf "%2\$d %1\$d"</CODE>

<DT>Portability
<DD>
On platforms without gettext, the functions are not available.

<DT>po-mode marking
<DD>
---
</DL>

<P>
An example is available in the <TT>`examples&acute;</TT> directory: <CODE>hello-php</CODE>.

</P>


<H3><A NAME="SEC273" HREF="gettext_toc.html#TOC273">13.5.18  Pike</A></H3>
<P>
<A NAME="IDX1126"></A>

</P>
<DL COMPACT>

<DT>RPMs
<DD>
roxen

<DT>File extension
<DD>
<CODE>pike</CODE>

<DT>String syntax
<DD>
<CODE>"abc"</CODE>

<DT>gettext shorthand
<DD>
---

<DT>gettext/ngettext functions
<DD>
<CODE>gettext</CODE>, <CODE>dgettext</CODE>, <CODE>dcgettext</CODE>

<DT>textdomain
<DD>
<CODE>textdomain</CODE> function

<DT>bindtextdomain
<DD>
<CODE>bindtextdomain</CODE> function

<DT>setlocale
<DD>
<CODE>setlocale</CODE> function

<DT>Prerequisite
<DD>
<CODE>import Locale.Gettext;</CODE>

<DT>Use or emulate GNU gettext
<DD>
use

<DT>Extractor
<DD>
---

<DT>Formatting with positions
<DD>
---

<DT>Portability
<DD>
On platforms without gettext, the functions are not available.

<DT>po-mode marking
<DD>
---
</DL>



<H3><A NAME="SEC274" HREF="gettext_toc.html#TOC274">13.5.19  GNU Compiler Collection sources</A></H3>
<P>
<A NAME="IDX1127"></A>

</P>
<DL COMPACT>

<DT>RPMs
<DD>
gcc

<DT>File extension
<DD>
<CODE>c</CODE>, <CODE>h</CODE>

<DT>String syntax
<DD>
<CODE>"abc"</CODE>

<DT>gettext shorthand
<DD>
<CODE>_("abc")</CODE>

<DT>gettext/ngettext functions
<DD>
<CODE>gettext</CODE>, <CODE>dgettext</CODE>, <CODE>dcgettext</CODE>, <CODE>ngettext</CODE>,
<CODE>dngettext</CODE>, <CODE>dcngettext</CODE>

<DT>textdomain
<DD>
<CODE>textdomain</CODE> function

<DT>bindtextdomain
<DD>
<CODE>bindtextdomain</CODE> function

<DT>setlocale
<DD>
Programmer must call <CODE>setlocale (LC_ALL, "")</CODE>

<DT>Prerequisite
<DD>
<CODE>#include "intl.h"</CODE>

<DT>Use or emulate GNU gettext
<DD>
use

<DT>Extractor
<DD>
<CODE>xgettext -k_</CODE>

<DT>Formatting with positions
<DD>
---

<DT>Portability
<DD>
Uses autoconf macros

<DT>po-mode marking
<DD>
yes
</DL>



<H2><A NAME="SEC275" HREF="gettext_toc.html#TOC275">13.6  Internationalizable Data</A></H2>

<P>
Here is a list of other data formats which can be internationalized
using GNU gettext.

</P>



<H3><A NAME="SEC276" HREF="gettext_toc.html#TOC276">13.6.1  POT - Portable Object Template</A></H3>

<DL COMPACT>

<DT>RPMs
<DD>
gettext

<DT>File extension
<DD>
<CODE>pot</CODE>, <CODE>po</CODE>

<DT>Extractor
<DD>
<CODE>xgettext</CODE>
</DL>



<H3><A NAME="SEC277" HREF="gettext_toc.html#TOC277">13.6.2  Resource String Table</A></H3>
<P>
<A NAME="IDX1128"></A>

</P>
<DL COMPACT>

<DT>RPMs
<DD>
fpk

<DT>File extension
<DD>
<CODE>rst</CODE>

<DT>Extractor
<DD>
<CODE>xgettext</CODE>, <CODE>rstconv</CODE>
</DL>



<H3><A NAME="SEC278" HREF="gettext_toc.html#TOC278">13.6.3  Glade - GNOME user interface description</A></H3>

<DL COMPACT>

<DT>RPMs
<DD>
glade, libglade, glade2, libglade2, intltool

<DT>File extension
<DD>
<CODE>glade</CODE>, <CODE>glade2</CODE>

<DT>Extractor
<DD>
<CODE>xgettext</CODE>, <CODE>libglade-xgettext</CODE>, <CODE>xml-i18n-extract</CODE>, <CODE>intltool-extract</CODE>
</DL>

<P><HR><P>
Go to the <A HREF="gettext_1.html">first</A>, <A HREF="gettext_12.html">previous</A>, <A HREF="gettext_14.html">next</A>, <A HREF="gettext_22.html">last</A> section, <A HREF="gettext_toc.html">table of contents</A>.
</BODY>
</HTML>