The message catalog files (The GNU C Library)

8.1.2 Format of the message catalog files

The only reasonable way to translate all the messages of a program and store the result in a message catalog file that the catopen function can read is to hand all the message text to the translator and let them translate it. That is, we must have a file with entries which associate each set/message tuple with a specific translation. This file format is specified in the X/Open standard and is as follows:

Important: The handling of identifiers instead of numbers for the set and messages is a GNU extension. Systems strictly following the X/Open specification do not have this feature.

An example of a message catalog file is this:

$ This is a leading comment.
$quote "

$set SetOne
1 Message with ID 1.
two "   Message with ID \"two\", which gets the value 2 assigned"

$set SetTwo
$ Since the last set got the number 1 assigned this set has number 2.
4000 "The numbers can be arbitrary, they need not start at one."

This small example shows various aspects:

Lines 1 and 9 are comments since they start with $ followed by whitespace.
The quoting character is set to ". Without this, the quotes in the message definitions would have to be omitted, and the message with the identifier two would then lose its leading whitespace.
Mixing numbered messages with messages having symbolic names is no problem and the numbering happens automatically.

While this file format is pretty easy to write, it is not the best possible for use in a running program. The catopen function would have to parse the file and handle syntactic errors gracefully. This is not so easy, and the whole process is pretty slow. Therefore the catgets functions expect the data in another, more compact and ready-to-use file format. A special program, gencat, generates files in this format from the source format described above; it is explained in detail in the next section.

Files in this other format are not human readable. To be easy for programs to use, the format is binary. It is byte order independent, however, so translation files can be shared by systems of arbitrary architecture (as long as they use the GNU C Library).

The details of the binary file format are not important, since these files are always created by the gencat program. The sources of the GNU C Library include the sources for gencat, so the interested reader can look through them to learn about the file format.
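
Once gencat has produced the binary file, a program reads it through catopen and catgets. The following is a minimal sketch of that step, not a recommended interface: `copy_message` is a hypothetical helper, and the catalog name passed to it is whatever name the compiled file was installed under. In the example catalog above, the message two in SetOne would be requested as set 1, message 2. If the catalog cannot be opened, the default string is used instead.

```c
#include <assert.h>
#include <nl_types.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper for this sketch: copy message MSG of set SET
   from the catalog CATALOG_NAME into BUF, falling back to DFLT when
   the catalog or the message is not available.  Returns the length
   of the string stored in BUF.  */
size_t
copy_message (const char *catalog_name, int set, int msg,
              const char *dflt, char *buf, size_t buflen)
{
  assert (catalog_name != NULL && dflt != NULL);
  assert (buf != NULL && buflen > 0);

  const char *text = dflt;
  nl_catd catd = catopen (catalog_name, NL_CAT_LOCALE);

  /* catopen returns (nl_catd) -1 on failure; catgets returns DFLT
     itself when the message cannot be found.  */
  if (catd != (nl_catd) -1)
    text = catgets (catd, set, msg, dflt);

  /* Copy before closing, since the returned string may point into
     the catalog data.  */
  snprintf (buf, buflen, "%s", text);

  if (catd != (nl_catd) -1)
    catclose (catd);

  return strlen (buf);
}
```

The message is copied into a caller-supplied buffer so that the result stays valid after catclose; a long-running program would more likely keep the catalog open and use the pointer returned by catgets directly.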