<HTML>
<HEAD>
<!-- This HTML file has been created by texi2html 1.52a
     from gettext.texi on 9 December 2003 -->

<TITLE>GNU gettext utilities - 5  Creating a New PO File</TITLE>
</HEAD>
<BODY>
Go to the <A HREF="gettext_1.html">first</A>, <A HREF="gettext_4.html">previous</A>, <A HREF="gettext_6.html">next</A>, <A HREF="gettext_22.html">last</A> section, <A HREF="gettext_toc.html">table of contents</A>.
<P><HR><P>

<H1><A NAME="SEC31" HREF="gettext_toc.html#TOC31">5  Creating a New PO File</A></H1>
<P>
<A NAME="IDX254"></A>

</P>
<P>
When starting a new translation, the translator creates a file called
<TT>`<VAR>LANG</VAR>.po&acute;</TT>, as a copy of the <TT>`<VAR>package</VAR>.pot&acute;</TT> template
file with modifications in the initial comments (at the beginning of the file)
and in the header entry (the first entry, near the beginning of the file).

</P>
<P>
The easiest way to do so is to use the <SAMP>`msginit&acute;</SAMP> program.
For example:

</P>

<PRE>
$ cd <VAR>PACKAGE</VAR>-<VAR>VERSION</VAR>
$ cd po
$ msginit
</PRE>

<P>
The alternative way is to do the copy and modifications by hand.
To do so, the translator copies <TT>`<VAR>package</VAR>.pot&acute;</TT> to
<TT>`<VAR>LANG</VAR>.po&acute;</TT>.  Then she modifies the initial comments and
the header entry of this file.

</P>
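
<P>
As a minimal sketch of the manual route, assume a hypothetical package whose
template is <TT>`hello.pot&acute;</TT> and a translation into German; both file
names are illustrative and not taken from any real package:

</P>

<PRE>
$ cd po
# hello.pot and de.po are hypothetical file names
$ cp hello.pot de.po
</PRE>

<P>
The copy still carries the template's placeholder comments and header entry,
which remain to be edited by hand.

</P>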

<H2><A NAME="SEC32" HREF="gettext_toc.html#TOC32">5.1  Invoking the <CODE>msginit</CODE> Program</A></H2>

<P>
<A NAME="IDX255"></A>
<A NAME="IDX256"></A>

<PRE>
msginit [<VAR>option</VAR>]
</PRE>

<P>
<A NAME="IDX257"></A>
<A NAME="IDX258"></A>
The <CODE>msginit</CODE> program creates a new PO file, initializing the meta
information with values from the user's environment.

</P>

<H3><A NAME="SEC33" HREF="gettext_toc.html#TOC33">5.1.1  Input file location</A></H3>

<DL COMPACT>

<DT><SAMP>`-i <VAR>inputfile</VAR>&acute;</SAMP>
<DD>
<DT><SAMP>`--input=<VAR>inputfile</VAR>&acute;</SAMP>
<DD>
<A NAME="IDX259"></A>
<A NAME="IDX260"></A>
Input POT file.

</DL>

<P>
If no <VAR>inputfile</VAR> is given, the current directory is searched for the
POT file.  If it is <SAMP>`-&acute;</SAMP>, standard input is read.

</P>
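
<P>
For illustration, the two explicit ways of naming the input might look as
follows.  This is only a sketch: <TT>`hello.pot&acute;</TT> is a hypothetical
template name.

</P>

<PRE>
# read the template from an explicitly named file (hello.pot is hypothetical)
$ msginit --input=hello.pot
# read the template from standard input
$ msginit --input=- &lt; hello.pot
</PRE>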

<H3><A NAME="SEC34" HREF="gettext_toc.html#TOC34">5.1.2  Output file location</A></H3>

<DL COMPACT>

<DT><SAMP>`-o <VAR>file</VAR>&acute;</SAMP>
<DD>
<DT><SAMP>`--output-file=<VAR>file</VAR>&acute;</SAMP>
<DD>
<A NAME="IDX261"></A>
<A NAME="IDX262"></A>
Write output to specified PO file.

</DL>

<P>
If no output file is given, the output file name depends on the
<SAMP>`--locale&acute;</SAMP> option or the user's locale setting.  If <VAR>file</VAR>
is <SAMP>`-&acute;</SAMP>, the results are written to standard output.

</P>
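
<P>
A short sketch of both cases, again assuming the hypothetical file names
<TT>`hello.pot&acute;</TT> and <TT>`de.po&acute;</TT>:

</P>

<PRE>
# write the new PO file under an explicitly chosen name
$ msginit --input=hello.pot --output-file=de.po
# write the result to standard output instead, for example to inspect it
$ msginit --input=hello.pot --output-file=-
</PRE>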

<H3><A NAME="SEC35" HREF="gettext_toc.html#TOC35">5.1.3  Input file syntax</A></H3>

<DL COMPACT>

<DT><SAMP>`-P&acute;</SAMP>
<DD>
<DT><SAMP>`--properties-input&acute;</SAMP>
<DD>
<A NAME="IDX263"></A>
<A NAME="IDX264"></A>
Assume the input file is a Java ResourceBundle in Java <CODE>.properties</CODE>
syntax, not in PO file syntax.

<DT><SAMP>`--stringtable-input&acute;</SAMP>
<DD>
<A NAME="IDX265"></A>
Assume the input file is a NeXTstep/GNUstep localized resource file in
<CODE>.strings</CODE> syntax, not in PO file syntax.

</DL>
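
<P>
As a sketch only, a template kept in one of these alternative syntaxes could be
passed like this.  The input file names <TT>`hello.properties&acute;</TT> and
<TT>`hello.strings&acute;</TT> are hypothetical.

</P>

<PRE>
# the template is in Java .properties syntax (hypothetical file name)
$ msginit --properties-input --input=hello.properties --output-file=de.po
# the template is in NeXTstep/GNUstep .strings syntax (hypothetical file name)
$ msginit --stringtable-input --input=hello.strings --output-file=de.po
</PRE>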

<H3><A NAME="SEC36" HREF="gettext_toc.html#TOC36">5.1.4  Output details</A></H3>

<DL COMPACT>

<DT><SAMP>`-l <VAR>ll_CC</VAR>&acute;</SAMP>
<DD>
<DT><SAMP>`--locale=<VAR>ll_CC</VAR>&acute;</SAMP>
<DD>
<A NAME="IDX266"></A>
<A NAME="IDX267"></A>
Set target locale.  <VAR>ll</VAR> should be a language code, and <VAR>CC</VAR> should
be a country code.  The command <SAMP>`locale -a&acute;</SAMP> can be used to output a list
of all installed locales.  The default is the user's locale setting.

<DT><SAMP>`--no-translator&acute;</SAMP>
<DD>
<A NAME="IDX268"></A>
Declares that the PO file will not have a human translator and is instead
automatically generated.

<DT><SAMP>`-p&acute;</SAMP>
<DD>
<DT><SAMP>`--properties-output&acute;</SAMP>
<DD>
<A NAME="IDX269"></A>
<A NAME="IDX270"></A>
Write out a Java ResourceBundle in Java <CODE>.properties</CODE> syntax.  Note
that this file format doesn't support plural forms and silently drops
obsolete messages.

<DT><SAMP>`--stringtable-output&acute;</SAMP>
<DD>
<A NAME="IDX271"></A>
Write out a NeXTstep/GNUstep localized resource file in <CODE>.strings</CODE> syntax.
Note that this file format doesn't support plural forms.

<DT><SAMP>`-w <VAR>number</VAR>&acute;</SAMP>
<DD>
<DT><SAMP>`--width=<VAR>number</VAR>&acute;</SAMP>
<DD>
<A NAME="IDX272"></A>
<A NAME="IDX273"></A>
Set the output page width.  Long strings in the output files will be
split across multiple lines in order to ensure that each line's width
(= number of screen columns) is less than or equal to the given <VAR>number</VAR>.

<DT><SAMP>`--no-wrap&acute;</SAMP>
<DD>
<A NAME="IDX274"></A>
Do not break long message lines.  Message lines whose width exceeds the
output page width will not be split into several lines.  Only file reference
lines which are wider than the output page width will be split.

</DL>
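
<P>
The options above can be combined with the input and output options.  The
following invocations are a sketch under assumed names: the template
<TT>`hello.pot&acute;</TT>, the output file names and the chosen locales are
all hypothetical.

</P>

<PRE>
# list the locales installed on this system
$ locale -a
# German as used in Germany, written to an explicitly named file
$ msginit --input=hello.pot --locale=de_DE --output-file=de.po
# a catalog declared as automatically generated, without a human translator
$ msginit --input=hello.pot --locale=en_US --no-translator --output-file=en.po
# split long strings so that no line is wider than 72 columns
$ msginit --input=hello.pot --locale=fr_FR --width=72 --output-file=fr.po
# do not split long message lines at all
$ msginit --input=hello.pot --locale=fr_FR --no-wrap --output-file=fr.po
</PRE>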

<H3><A NAME="SEC37" HREF="gettext_toc.html#TOC37">5.1.5  Informative output</A></H3>

<DL COMPACT>

<DT><SAMP>`-h&acute;</SAMP>
<DD>
<DT><SAMP>`--help&acute;</SAMP>
<DD>
<A NAME="IDX275"></A>
<A NAME="IDX276"></A>
Display this help and exit.

<DT><SAMP>`-V&acute;</SAMP>
<DD>
<DT><SAMP>`--version&acute;</SAMP>
<DD>
<A NAME="IDX277"></A>
<A NAME="IDX278"></A>
Output version information and exit.

</DL>

<H2><A NAME="SEC38" HREF="gettext_toc.html#TOC38">5.2  Filling in the Header Entry</A></H2>
<P>
<A NAME="IDX279"></A>

</P>
<P>
The initial comments "SOME DESCRIPTIVE TITLE", "YEAR" and
"FIRST AUTHOR &#60;EMAIL@ADDRESS&#62;, YEAR" ought to be replaced by sensible
information.  This can be done in any text editor; if Emacs is used
and it has switched to PO mode automatically (because it has recognized
the file's suffix), you can disable PO mode by typing <KBD>M-x fundamental-mode</KBD>.

</P>
<P>
Modifying the header entry can already be done using PO mode: in Emacs,
type <KBD>M-x po-mode RET</KBD> and then <KBD>RET</KBD> again to start editing the
entry.  You should fill in the following fields.

</P>
<DL COMPACT>

<DT>Project-Id-Version
<DD>
This is the name and version of the package.

<DT>Report-Msgid-Bugs-To
<DD>
This has already been filled in by <CODE>xgettext</CODE>.  It contains an email
address or URL where you can report bugs in the untranslated strings:

<UL>
<LI>Strings which are not entire sentences, see the maintainer guidelines
in section <A HREF="gettext_3.html#SEC15">3.2  Preparing Translatable Strings</A>.
<LI>Strings which use unclear terms or require additional context to be
understood.
<LI>Strings which make invalid assumptions about notation of date, time or
money.
<LI>Pluralisation problems.

<LI>Incorrect English spelling.

<LI>Incorrect formatting.

</UL>

<DT>POT-Creation-Date
<DD>
This has already been filled in by <CODE>xgettext</CODE>.

<DT>PO-Revision-Date
<DD>
You don't need to fill this in.  It will be filled in by the Emacs PO mode
when you save the file.

<DT>Last-Translator
<DD>
Fill in your name and email address (without double quotes).

<DT>Language-Team
<DD>
Fill in the English name of the language, and the email address or
homepage URL of the language team you are part of.

Before starting a translation, it is a good idea to get in touch with
your translation team, not only to make sure you don't do duplicated work,
but also to coordinate difficult linguistic issues.

<A NAME="IDX280"></A>
In the Free Translation Project, each translation team has its own mailing
list.  The up-to-date list of teams can be found at the Free Translation
Project's homepage, <A HREF="http://www.iro.umontreal.ca/contrib/po/HTML/">http://www.iro.umontreal.ca/contrib/po/HTML/</A>,
in the "National teams" area.

<DT>Content-Type
<DD>
<A NAME="IDX281"></A>
<A NAME="IDX282"></A>
Replace <SAMP>`CHARSET&acute;</SAMP> with the character encoding used for your language,
in your locale, or UTF-8.  This field is needed for correct operation of the
<CODE>msgmerge</CODE> and <CODE>msgfmt</CODE> programs, as well as for users whose
locale's character encoding differs from yours (see section <A HREF="gettext_10.html#SEC165">10.2.4  How to specify the output character set <CODE>gettext</CODE> uses</A>).

<A NAME="IDX283"></A>
You get the character encoding of your locale by running the shell command
<SAMP>`locale charmap&acute;</SAMP>.  If the result is <SAMP>`C&acute;</SAMP> or <SAMP>`ANSI_X3.4-1968&acute;</SAMP>,
which is equivalent to <SAMP>`ASCII&acute;</SAMP> (= <SAMP>`US-ASCII&acute;</SAMP>), it means that your
locale is not correctly configured.  In this case, ask your translation
team which charset to use.  <SAMP>`ASCII&acute;</SAMP> is not usable for any language
except Latin.

<A NAME="IDX284"></A>
Because the PO files must be portable to operating systems with less advanced
internationalization facilities, the character encodings that can be used
are limited to those supported by both GNU <CODE>libc</CODE> and GNU
<CODE>libiconv</CODE>.  These are:
<CODE>ASCII</CODE>, <CODE>ISO-8859-1</CODE>, <CODE>ISO-8859-2</CODE>, <CODE>ISO-8859-3</CODE>,
<CODE>ISO-8859-4</CODE>, <CODE>ISO-8859-5</CODE>, <CODE>ISO-8859-6</CODE>, <CODE>ISO-8859-7</CODE>,
<CODE>ISO-8859-8</CODE>, <CODE>ISO-8859-9</CODE>, <CODE>ISO-8859-13</CODE>, <CODE>ISO-8859-14</CODE>,
<CODE>ISO-8859-15</CODE>,
<CODE>KOI8-R</CODE>, <CODE>KOI8-U</CODE>, <CODE>KOI8-T</CODE>,
<CODE>CP850</CODE>, <CODE>CP866</CODE>, <CODE>CP874</CODE>,
<CODE>CP932</CODE>, <CODE>CP949</CODE>, <CODE>CP950</CODE>, <CODE>CP1250</CODE>, <CODE>CP1251</CODE>,
<CODE>CP1252</CODE>, <CODE>CP1253</CODE>, <CODE>CP1254</CODE>, <CODE>CP1255</CODE>, <CODE>CP1256</CODE>,
<CODE>CP1257</CODE>, <CODE>GB2312</CODE>, <CODE>EUC-JP</CODE>, <CODE>EUC-KR</CODE>, <CODE>EUC-TW</CODE>,
<CODE>BIG5</CODE>, <CODE>BIG5-HKSCS</CODE>, <CODE>GBK</CODE>, <CODE>GB18030</CODE>, <CODE>SHIFT_JIS</CODE>,
<CODE>JOHAB</CODE>, <CODE>TIS-620</CODE>, <CODE>VISCII</CODE>, <CODE>GEORGIAN-PS</CODE>, <CODE>UTF-8</CODE>.

<A NAME="IDX285"></A>
In the GNU system, the following encodings are frequently used for the
corresponding languages.

<A NAME="IDX286"></A>

<UL>
<LI><CODE>ISO-8859-1</CODE> for
Afrikaans, Albanian, Basque, Breton, Catalan, Cornish, Danish, Dutch,
English, Estonian, Faroese, Finnish, French, Galician, German,
Greenlandic, Icelandic, Indonesian, Irish, Italian, Malay, Manx,
Norwegian, Occitan, Portuguese, Spanish, Swedish, Tagalog, Uzbek,
Walloon,
<LI><CODE>ISO-8859-2</CODE> for
Bosnian, Croatian, Czech, Hungarian, Polish, Romanian, Serbian, Slovak,
Slovenian,
<LI><CODE>ISO-8859-3</CODE> for Maltese,

<LI><CODE>ISO-8859-5</CODE> for Macedonian, Serbian,

<LI><CODE>ISO-8859-6</CODE> for Arabic,

<LI><CODE>ISO-8859-7</CODE> for Greek,

<LI><CODE>ISO-8859-8</CODE> for Hebrew,

<LI><CODE>ISO-8859-9</CODE> for Turkish,

<LI><CODE>ISO-8859-13</CODE> for Latvian, Lithuanian, Maori,

<LI><CODE>ISO-8859-14</CODE> for Welsh,

<LI><CODE>ISO-8859-15</CODE> for
Basque, Catalan, Dutch, English, Finnish, French, Galician, German, Irish,
Italian, Portuguese, Spanish, Swedish, Walloon,
<LI><CODE>KOI8-R</CODE> for Russian,

<LI><CODE>KOI8-U</CODE> for Ukrainian,

<LI><CODE>KOI8-T</CODE> for Tajik,

<LI><CODE>CP1251</CODE> for Bulgarian, Byelorussian,

<LI><CODE>GB2312</CODE>, <CODE>GBK</CODE>, <CODE>GB18030</CODE>
for simplified writing of Chinese,
<LI><CODE>BIG5</CODE>, <CODE>BIG5-HKSCS</CODE>
for traditional writing of Chinese,
<LI><CODE>EUC-JP</CODE> for Japanese,

<LI><CODE>EUC-KR</CODE> for Korean,

<LI><CODE>TIS-620</CODE> for Thai,

<LI><CODE>GEORGIAN-PS</CODE> for Georgian,

<LI><CODE>UTF-8</CODE> for any language, including those listed above.

</UL>

<A NAME="IDX287"></A>
<A NAME="IDX288"></A>
When single quote characters or double quote characters are used in
translations for your language, and your locale's encoding is one of the
ISO-8859-* charsets, it is best if you create your PO files in UTF-8
encoding, instead of your locale's encoding.  This is because in UTF-8
the real quote characters can be represented (single quote characters:
U+2018, U+2019, double quote characters: U+201C, U+201D), whereas none of
the ISO-8859-* charsets has them all.  Users in UTF-8 locales will see the
real quote characters, whereas users in ISO-8859-* locales will see the
vertical apostrophe and the vertical double quote instead (because that's
what the character set conversion will transliterate them to).

<A NAME="IDX289"></A>
To enter such quote characters under X11, you can change your keyboard
mapping using the <CODE>xmodmap</CODE> program.  The X11 names of the quote
characters are "leftsinglequotemark", "rightsinglequotemark",
"leftdoublequotemark", "rightdoublequotemark", "singlelowquotemark",
"doublelowquotemark".

Note that only recent versions of GNU Emacs support the UTF-8 encoding:
Emacs 20 with Mule-UCS, and Emacs 21.  As of January 2001, XEmacs doesn't
support the UTF-8 encoding.

The character encoding name can be written in either upper or lower case.
Usually upper case is preferred.

<DT>Content-Transfer-Encoding
<DD>
Set this to <CODE>8bit</CODE>.

<DT>Plural-Forms
<DD>
This field is optional.  It is only needed if the PO file has plural forms.
You can find them by searching for the <SAMP>`msgid_plural&acute;</SAMP> keyword.  The
format of the plural forms field is described in section <A HREF="gettext_10.html#SEC166">10.2.5  Additional functions for plural forms</A>.
</DL>
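
<P>
A few shell commands can support the steps above.  The following is a sketch,
not a required procedure: <TT>`de.po&acute;</TT> is a hypothetical file name, and
using the <SAMP>`--check-header&acute;</SAMP> option of <CODE>msgfmt</CODE> as a final
sanity check is a suggestion added here.

</P>

<PRE>
# show the character encoding of the current locale (for Content-Type)
$ locale charmap
# look for plural forms, to decide whether Plural-Forms is needed
$ grep -n msgid_plural de.po
# let msgfmt verify the header entry, discarding the compiled output
$ msgfmt --check-header --output-file=/dev/null de.po
</PRE>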

<P><HR><P>
Go to the <A HREF="gettext_1.html">first</A>, <A HREF="gettext_4.html">previous</A>, <A HREF="gettext_6.html">next</A>, <A HREF="gettext_22.html">last</A> section, <A HREF="gettext_toc.html">table of contents</A>.
</BODY>
</HTML>