StringCodes

From GRFSpecs
Revision as of 23:09, 20 August 2011 by Frosch (talk | contribs) (extended string code 18)

String Codes

Texts in TTD mostly use the Latin-1 (ISO-8859-1) character set (except when using UTF-8 encoding; see below), but a few characters differ. Some characters also have a special meaning. Both are listed in the following table.

Version columns give the first version with support; "no" means not supported.

Code (hex)      OpenTTD       TTDPatch             Meaning
00..1F                                             Control characters, unused except for the following:
  01            0.6           2.0                  X offset in next byte of string (variable space)
  0D            0.6           2.0                  New line
  0E            0.6           2.0                  Set small font size
  0F            0.6           2.0                  Set large font size
  1F            0.6           2.0                  X and Y offsets in next two bytes of string
20..7A          0.6           2.0                  Latin-1/ASCII characters, from space " " up to lower case "z"
7B..87                                             Formatting instructions, all take their argument from the stack if not otherwise specified
  7B            0.6           2.0                  Print signed dword
  7C            0.6           2.0                  Print signed word
  7D            0.6           2.0                  Print signed byte
  7E            0.6           2.0                  Print unsigned word
  7F            0.6           2.0                  Print dword in currency units
  80            0.6           2.0                  Print substring (text ID from stack)
  81            0.6           2.0                  Print substring (text ID in next 2 bytes of string)
  82            0.6           2.0                  Print date (day, month, year) (based on year 1920)
  83            0.6           2.0                  Print short date (month and year) (based on year 1920)
  84            0.6           2.0                  Print signed word in speed units
  85            0.6           2.0                  Discard next word from stack
  86            0.6           2.0                  Rotate down top 4 words on stack
  87            0.6           2.0                  Print signed word in litres
88..98                                             Colour codes
  88            0.6           2.0                  Blue
  89            0.6           2.0                  Light Gray
  8A            0.6           2.0                  Light Orange ("Gold")
  8B            0.6           2.0                  Red
  8C            0.6           2.0                  Purple
  8D            0.6           2.0                  Gray-Green
  8E            0.6           2.0                  Orange
  8F            0.6           2.0                  Green
  90            0.6           2.0                  Yellow
  91            0.6           2.0                  Light Green
  92            0.6           2.0                  Red-Brown
  93            0.6           2.0                  Brown
  94            0.6           2.0                  White
  95            0.6           2.0                  Light Blue
  96            0.6           2.0                  Dark Gray
  97            0.6           2.0                  Mauve (grayish purple)
  98            0.6           2.0                  Black
99              no            2.5 (2.0.1 alpha 1)  Switch to company colour that follows in next byte (enabled by enhancegui)
9A                                                 Extended format code in next byte:
  00 -or- 01    0.6           2.5                  Display 64-bit value from stack in currency units
  02            no            2.6                  Ignore next colour byte. Multiple instances will skip multiple colour bytes.
  03 WORD       0.6           2.6                  Push WORD onto the textref stack
  04 BYTE       0.6           2.6                  Un-print the previous BYTE characters.
  05            no            2.6                  For internal use only. Not valid in GRF files.
  06            0.7           2.6 (r2007)          Print byte in hex
  07            0.7           2.6 (r2007)          Print word in hex
  08            0.7           2.6 (r2007)          Print dword in hex
  09            no            2.6 (r2128)          For internal use only. Usage in NewGRFs will most likely crash TTDPatch.
  0A            no            2.6 (r2128)          For internal use only. Usage in NewGRFs will most likely crash TTDPatch.
  0B            1.0           2.6 (r2178)          Print 64-bit value in hex
  0C            1.1           2.6 (r2178)          Print name of station with id in next textrefstack word
  0D            1.1 (r21086)  no                   Print signed word in tonnes
  0E            1.1 (r21209)  no                   Set gender of string, NewGRF internal ID in next byte. Must be first in a string. [1]
  0F            1.1 (r21209)  no                   Select case for next substring, NewGRF internal ID in next byte. [1]
  10            1.1 (r21211)  no                   Begin choice list value, NewGRF internal ID in next byte. [2]
  11            1.1 (r21211)  no                   Begin choice list default [2]
  12            1.1 (r21211)  no                   End choice list [2]
  13            1.1 (r21211)  no                   Begin gender choice list, stack offset of substring to get gender from in next byte. [2]
  14            1.1 (r21211)  no                   Begin case choice list [2]
  15            1.1 (r21216)  no                   Begin plural choice list, stack offset of value to get plural for in next byte. [3]
  16            1.2 (r22778)  no                   Print dword as date (day, month, year) (based on year 0)
  17            1.2 (r22778)  no                   Print dword as short date (month and year) (based on year 0)
  18            1.2 (r22779)  no                   Print unsigned word in horse power
9B..9D                                             Reserved
9E..FF          0.6           2.0                  Latin-1 characters, except for the following:
  9E            0.6           2.0                  Euro character "€"
  9F            0.6           2.0                  Capital Y umlaut "Ÿ"
  A0            0.6           2.0                  Scroll button up
  AA            0.6           2.0                  Scroll button down
  AC            0.6           2.0                  Tick mark
  AD            0.6           2.0                  X mark
  AF            0.6           2.0                  Scroll button right
  B4            0.6           2.0                  Train symbol
  B5            0.6           2.0                  Truck symbol
  B6            0.6           2.0                  Bus symbol
  B7            0.6           2.0                  Plane symbol
  B8            0.6           2.0                  Ship symbol
  B9            0.6           2.0                  Superscript -1
  BC            0.6           2.0                  Small scroll button up
  BD            0.6           2.0                  Small scroll button down

The formatting instructions must not be used except in strings that expect them, and then they may not be out of order (with the possible exception of code 86 shuffling the internal stack). When used improperly, they will most likely crash TTD. Code 81 is always safe to use (provided that the referenced text ID uses no unsafe formatting instructions either), and will insert the given text ID (e.g. "\81\3D\A0" will insert text ID A03D, "\98Refit Aircraft"). Note however that if you want to include e.g. ID D000/D400, the 00 byte will be considered the end of string, and this will therefore break if additional texts are supposed to follow in the action 4. DCxx IDs must not be included; neither codes 80 nor 81 correctly access DCxx IDs.
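As a rough illustration of the byte order used by code 81, the following Python sketch builds the three-byte sequence for a text ID. The helper name `text_ref` is hypothetical, and the sketch is deliberately stricter than the text: it rejects any ID containing a 00 byte outright, and it refuses DCxx IDs.

```python
def text_ref(text_id: int) -> bytes:
    """Return code 81 followed by the text ID as a little-endian word."""
    if not 0 <= text_id <= 0xFFFF:
        raise ValueError("text ID must fit in one word")
    low, high = text_id & 0xFF, text_id >> 8
    if 0 in (low, high):
        # Assumption of this sketch: always refuse, since a 00 byte
        # would be read as the end of the string.
        raise ValueError("a 00 byte would end the string early")
    if high == 0xDC:
        raise ValueError("DCxx IDs cannot be accessed through code 81")
    return bytes([0x81, low, high])


print(text_ref(0xA03D).hex(" "))  # 81 3d a0
```

The output matches the "\81\3D\A0" example above: the low byte of the text ID comes first.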

Each formatting instruction removes its argument from the stack, so that the next one receives the following bytes as its argument. Code 86 takes the top four words from the stack (call them W1 through W4) and reorders them as W4 W1 W2 W3. This is used for languages in which industries or stations should be named not "Flinfingbury Power Plant" but "Power Plant Flinfingbury".
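The reordering done by code 86 can be modelled in a few lines of Python. This is only a sketch: the stack is assumed to be a list with the top word at index 0, and the function name `rotate_top4` is made up for illustration.

```python
def rotate_top4(stack: list) -> list:
    """Model code 86: reorder the top words W1 W2 W3 W4 as W4 W1 W2 W3."""
    if len(stack) < 4:
        raise ValueError("code 86 needs at least four words on the stack")
    w1, w2, w3, w4 = stack[:4]
    # Words below the top four are left untouched.
    return [w4, w1, w2, w3] + stack[4:]


print(rotate_top4(["W1", "W2", "W3", "W4", "W5"]))
# ['W4', 'W1', 'W2', 'W3', 'W5']
```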

Using genders and cases

Supported by OpenTTD 1.1; not supported by TTDPatch. When multiple strings are combined to form a single sentence, the parts affect each other in various languages. E.g. you might have some string "The <industrytype> closes next month.", where <industrytype> is replaced by the name of some industry (e.g. "brewery", "textile mill"). In German, however, the gender of the <industrytype> affects the string it is inserted into. For "brewery" ("Brauerei") the article "The" is translated as "Die"; for "textile mill" ("Textilwerk") it is translated as "Das".

In OpenTTD these interactions between strings are called "genders" and "cases". While the names refer to common grammatical constructions in various languages, the terms have a technical definition in OpenTTD. That definition might match the grammatical use of genders and cases in some languages, or it might be used to handle other grammatical constructions.

Generally, when a string I is inserted into a string O:

  • If the inner string I affects the outer O:
    • I is said to have a gender, and
    • O chooses different texts depending on that gender.
  • If the outer string O affects the inner I:
    • O sets a case for I, and
    • I chooses different texts depending on the required case.

Setting genders and cases

Supported by OpenTTD 1.1; not supported by TTDPatch. String codes 9A 0E and 9A 0F map a NewGRF internal gender or case ID to an OpenTTD gender or case. The internal ID is resolved to the appropriate OpenTTD gender or case at load time by means of the mapping. The first entry in the mapping that matches the ID from the string and names an existing OpenTTD gender or case is taken; i.e. the list of mappings is filtered by internal ID and by existence of the OpenTTD gender/case, and then the top element is used. When the gender or case ID is not known, or no OpenTTD gender or case exists with any of the mapped names, the mapping is ignored and the default gender or case is taken.

Example of gender translation table

// Gender translation table
// Current OpenTTD German translation uses m, w, n and p but
// support a (fictitious) previous version that used masculine,
// feminine, neuter and plural as gender names.
 0 * 56     00 08 01 01 02
            13
            01 "m" 00
            01 "masculine" 00
            02 "w" 00
            02 "feminine" 00
            03 "n" 00
            03 "neuter" 00
            04 "p" 00
            04 "plural" 00
            00

// Brauerei is a feminine noun in German; this sets its gender to feminine.
 1 * 40     04 0A 82 01 73 DC C3 9E 9A 0E 02 "Brauerei" 00

In this case OpenTTD looks up NewGRF internal ID 2 in the gender table, which yields "w" and "feminine" as OpenTTD gender names. Current OpenTTD matches "w"; the fictitious older version of OpenTTD would match "feminine".
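The lookup described above can be sketched in Python as follows. This is a model of the documented behaviour, not OpenTTD's actual code; `GENDER_TABLE`, `resolve_gender` and the sets of known gender names are names and values assumed for the illustration.

```python
# The gender translation table from the example, in table order.
GENDER_TABLE = [
    (1, "m"), (1, "masculine"),
    (2, "w"), (2, "feminine"),
    (3, "n"), (3, "neuter"),
    (4, "p"), (4, "plural"),
]


def resolve_gender(internal_id, table, known_genders):
    """Return the first mapped name that the running OpenTTD knows."""
    for mapped_id, name in table:
        if mapped_id == internal_id and name in known_genders:
            return name
    return None  # the caller falls back to the default gender


current = {"m", "w", "n", "p"}
older = {"masculine", "feminine", "neuter", "plural"}
print(resolve_gender(2, GENDER_TABLE, current))  # w
print(resolve_gender(2, GENDER_TABLE, older))    # feminine
print(resolve_gender(5, GENDER_TABLE, current))  # None
```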

Choosing strings depending on required gender or case

Supported by OpenTTD 1.1; not supported by TTDPatch. String codes 9A 10 to 9A 14 map an OpenTTD gender or case to the NewGRF internal gender or case ID. The mapping is resolved at load time by going through all cases or genders that OpenTTD's translation knows and mapping these to NewGRF internal IDs. This happens by filtering the mapping on the gender or case name and then using the NewGRF internal ID of the top element. If no mapping is found, the default choice list item is chosen.

The choice list string codes are related and must be used in a specific manner:

Genders: 9A 13 <offset> (9A 10 <index> <string>)+ 9A 11 <default> 9A 12

Cases: 9A 14 (9A 10 <index> <string>)+ 9A 11 <default> 9A 12

Plurals: 9A 15 <offset> (9A 10 <index> <string>)+ 9A 11 <default> 9A 12

The offset is the stack location of the substring/value you want to get the gender/plural for. It is written as the real offset plus 80 (hex), i.e. an offset of 0 becomes 80 and an offset of 1 becomes 81 in the NFO.
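To make the byte layout concrete, here is a small Python sketch that assembles choice lists. The function names are hypothetical, the plural opener 9A 15 is taken from the string code table, and the sample strings are only illustrative.

```python
def _choice_list(begin: bytes, choices, default: bytes) -> bytes:
    """Assemble: begin (9A 10 <index> <string>)+ 9A 11 <default> 9A 12."""
    if not choices:
        raise ValueError("at least one choice list value is required")
    out = bytearray(begin)
    for index, text in choices:
        out += bytes([0x9A, 0x10, index]) + text
    out += b"\x9a\x11" + default + b"\x9a\x12"
    return bytes(out)


def gender_list(offset, choices, default):
    # The stack offset is stored as the real offset plus 80 (hex).
    return _choice_list(bytes([0x9A, 0x13, 0x80 + offset]), choices, default)


def case_list(choices, default):
    return _choice_list(b"\x9a\x14", choices, default)


def plural_list(offset, choices, default):
    return _choice_list(bytes([0x9A, 0x15, 0x80 + offset]), choices, default)


print(gender_list(0, [(1, b"er"), (3, b"as")], b"ie").hex(" "))
# 9a 13 80 9a 10 01 65 72 9a 10 03 61 73 9a 11 69 65 9a 12
```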

Example of gender-dependent string

// Assuming the translation table of the previous example
// A string with a gender choice list and a stack item that gets resolved
 2 * 29     04 0A FF 01 1A DC "D" 9A 13 80 9A 10 01 "er" 9A 10 03 "as" 9A 11 "ie" 9A 12 " " 80 00

Imagine the "Brauerei" from the previous example being, as substring, on the stack. Then this string would resolve to "Die Brauerei".

What happens in OpenTTD is that whenever the "begin gender choice list" string code is found, it resolves the string at the given stack location. The first character of that resolved string is compared to the "set gender" string code, and if it matches, the (mapped) OpenTTD gender is retrieved. When there is no "set gender" string code, the first OpenTTD gender is used. The resolved OpenTTD gender is then reverse mapped to a NewGRF internal ID. If that NewGRF internal ID exists in one of the "choice list values", that (sub)string is taken (up to the next choice list value/default). If there is no reverse mapping, the string at the "choice list default" string code is used, up to the "end choice list" string code. Further processing of the string happens after the choice list, i.e. the (sub)strings in the choice list may not contain any special string codes except colour codes.
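The reverse mapping and the selection of a choice can be modelled as below. This is a simplified sketch under stated assumptions: the OpenTTD gender of the substring is passed in already resolved, the translation table is the one from the earlier example, and `reverse_map` and `pick_choice` are hypothetical names.

```python
GENDER_TABLE = [
    (1, "m"), (1, "masculine"),
    (2, "w"), (2, "feminine"),
    (3, "n"), (3, "neuter"),
    (4, "p"), (4, "plural"),
]


def reverse_map(openttd_gender, table):
    """Filter the table by gender name and return the top element's ID."""
    for internal_id, name in table:
        if name == openttd_gender:
            return internal_id
    return None


def pick_choice(openttd_gender, table, choices, default):
    """Return the choice list value for the gender, else the default."""
    internal_id = reverse_map(openttd_gender, table)
    return choices.get(internal_id, default)


choices = {1: "er", 3: "as"}  # the 9A 10 entries of the example string
print("D" + pick_choice("w", GENDER_TABLE, choices, "ie") + " Brauerei")
# Die Brauerei
```

With gender "n" the same call would select "as", giving "Das".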

Case choice lists work in a similar manner, except that instead of resolving a case from a (sub)string we "are" the substring; the string that includes this substring has set a case using the "select case" string code. As such no offset has to be given to the choice list, but the rest works in the same way as gender choice lists.

Using plural forms

Supported by OpenTTD 1.1; not supported by TTDPatch. The plural list works like a gender list; however, you have to choose one "mapping" from value to plural index by setting the plural form using Action0GeneralVariables property 15.

If, for example, plural form 0 is chosen using the Action0GeneralVariables property 15, then there are 2 plural indices. If the value at the stack with the given offset equals 1 you get plural index 1, otherwise plural index 2. These plural indices are the indices that are used in the choice lists.

Plural form  Plural index  Description
0                          Two forms:
             1             1
             2             rest
1                          Only one form:
             1             every form
2                          Two forms:
             1             0 or 1
             2             rest
3                          Three forms:
             1             ending in 1, but not ending in 11
             2             0
             3             rest
4                          Five forms:
             1             1
             2             2
             3             3-6
             4             7-10
             5             rest
5                          Three forms:
             1             ending in 1, but not ending in 11
             2             ending in 2-9, but not ending in 1[2-9]
             3             rest
6                          Three forms:
             1             ending in 1, but not ending in 11
             2             ending in 2-4, but not ending in 1[2-4]
             3             rest
7                          Three forms:
             1             0
             2             ending in 2-4, but not ending in 1[2-4]
             3             rest
8                          Four forms:
             1             ending in 01
             2             ending in 02
             3             ending in 03 or ending in 04
             4             rest
9                          Two forms:
             1             ending in 1, but not ending in 11
             2             rest
10                         Three forms:
             1             1
             2             2-4
             3             rest
11                         Two forms:
             1             ending in 0, 1, 3, 6, 7 and 8
             2             ending in 2, 4, 5 and 9
12                         Four forms:
             1             1
             2             0 or ending in 02-10
             3             ending in 11-19
             4             rest
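A few of the rules in the table can be written out in Python to show how a value maps to a plural index. The sketch covers only plural forms 0, 1, 2, 6 and 10, assumes non-negative values, and uses the hypothetical names `PLURAL_RULES` and `plural_index`.

```python
# Each rule maps a non-negative value to a plural index (1-based).
PLURAL_RULES = {
    0: lambda n: 1 if n == 1 else 2,
    1: lambda n: 1,
    2: lambda n: 1 if n in (0, 1) else 2,
    6: lambda n: (
        1 if n % 10 == 1 and n % 100 != 11
        else 2 if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14
        else 3
    ),
    10: lambda n: 1 if n == 1 else 2 if 2 <= n <= 4 else 3,
}


def plural_index(form: int, value: int) -> int:
    """Return the plural index for value under the given plural form."""
    return PLURAL_RULES[form](value)


for value in (1, 5):
    suffix = "" if plural_index(0, value) == 1 else "n"
    print(f"{value} Tonne{suffix}")
# 1 Tonne
# 5 Tonnen
```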

Example of string using plural forms

// Set the plural type to type 0
 0 * 7      00 08 01 01 02 15 00
// In case of the first stack item being 1 use "Tonne", otherwise use "Tonnen"
 1 * 34     04 0B 82 01 1A DC C3 9E "\UE07C Tonne" 9A 15 80 9A 10 01 "" 9A 11 "n" 9A 12 " Sand" 00

UTF-8 support

Supported by OpenTTD 0.6 and TTDPatch 2.5 (2.0.1 alpha 68). Since TTDPatch 2.0.1 alpha 68, TTDPatch supports UTF-8 encoded input strings. Use action 12 to define glyphs for the characters which do not exist in TTD's .grf files (possible since TTDPatch 2.0.1 alpha 73).

To indicate that a given string is in UTF-8 encoding, start it with a capital thorn (U+00DE, "Þ"), encoded in UTF-8 as usual with the bytes C3 9E. Everything in that string is then assumed to be in UTF-8 encoding, with the following exception: if characters appear that are not valid UTF-8 sequences, they are assumed to be one of the above control codes. This way, it is still possible to write, e.g. "ÞCapacity: " 87 "litres", without encoding the 87 in UTF-8 (which would instead refer to a character installed at codepoint U+0087).

In addition, this allows using the non-Unicode characters 9E, 9F, A0, AA, AC, AD, AF, B4..B9, BC and BD from the above list, which when encoded with UTF-8 would refer to their respective Unicode characters instead. To use the TTD characters, simply do not encode them using UTF-8 but enter them directly as bytes. This causes them to be an invalid UTF-8 sequence and allows TTDPatch to use the correct symbol from TTD's fonts.
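A UTF-8 mode string that mixes encoded text with raw TTD bytes could be assembled roughly as follows. The helper `utf8_string` is hypothetical and intentionally narrow: it only accepts raw bytes in the range 80..BF, which can never start a valid UTF-8 sequence, and it handles neither argument bytes of codes such as 9A nor the terminating 00.

```python
UTF8_MARKER = "\u00de".encode("utf-8")  # capital thorn, bytes C3 9E


def utf8_string(*parts) -> bytes:
    """Join text (str) and raw TTD bytes (int) into a UTF-8 mode string."""
    out = bytearray(UTF8_MARKER)
    for part in parts:
        if isinstance(part, str):
            out += part.encode("utf-8")
        elif 0x80 <= part <= 0xBF:
            # A lone continuation byte is invalid UTF-8, so it is read
            # as the TTD code or symbol with that value.
            out.append(part)
        else:
            raise ValueError("this sketch only handles raw bytes 80..BF")
    return bytes(out)


print(utf8_string("Capacity: ", 0x87, "litres"))
# b'\xc3\x9eCapacity: \x87litres'
```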

Alternatively, these symbols (in fact, the TTD character set from 20 to FF) are also mapped into the Unicode Private Use Area at U+E0xx, so to encode the truck symbol, you may use character U+E0B5 as well, although this will probably be an unprintable character in most text editors.

Finally, characters 7B..7F no longer function as the above formatting instructions, but will display regular glyphs instead (provided they are installed; by default TTD has none at these codepoints). Instead, to use these formatting instructions in UTF-8 mode, you need to use their Private Use Area codepoint at U+E0xx.
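The Private Use Area encoding of a TTD character is plain UTF-8 for the codepoint U+E0xx, which the following Python sketch computes. The function name `pua_bytes` is made up for this example.

```python
def pua_bytes(ttd_char: int) -> bytes:
    """Return the UTF-8 bytes of U+E0xx for TTD character xx (20..FF)."""
    if not 0x20 <= ttd_char <= 0xFF:
        raise ValueError("only TTD characters 20..FF are mapped to U+E0xx")
    return chr(0xE000 + ttd_char).encode("utf-8")


print(pua_bytes(0x7E).hex(" "))  # ee 81 be
print(pua_bytes(0xB5).hex(" "))  # ee 82 b5
```

The first line is the encoding of "print unsigned word" (7E) in UTF-8 mode; the second is the truck symbol.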

Basically there are three possibilities:

  1. Characters U+E020..U+E0FF in the E0xx Private Use Area do what their respective character xx would do in TTD, as do the control characters below U+0020
  2. All other valid UTF-8 sequences display actual glyphs, if they are available
  3. All invalid UTF-8 sequences do what their individual bytes would do in TTD

To summarize, here's a handy table:

Character  Encoding in UTF-8 mode  Meaning
7E         7E                      Unicode Character 'TILDE' (~)
82         82 (invalid UTF-8)      Print date (day, month, year)
82         C2 82                   Display glyph for U+0082
AC         AC (invalid UTF-8)      Tick mark
AC         C2 AC                   Unicode Character 'NOT SIGN' (¬)
E07E       EE 81 BE                Print unsigned word
E082       EE 82 82                Print date (day, month, year)
E0AC       EE 82 AC                Tick mark
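The three possibilities can be modelled with a short Python sketch that walks the bytes following the C3 9E marker. It is a simplification, not TTDPatch's or OpenTTD's actual decoder: the function name `classify` and the "ttd"/"glyph" labels are invented here, and argument bytes of codes such as 9A are not treated specially.

```python
def classify(data: bytes) -> list:
    """Classify each character or byte of a UTF-8 mode string body."""
    result = []
    i = 0
    while i < len(data):
        char = None
        # Try the shortest valid UTF-8 sequence starting at position i.
        for length in (1, 2, 3, 4):
            try:
                char = data[i:i + length].decode("utf-8")
                break
            except UnicodeDecodeError:
                continue
        if char is None:
            # Invalid UTF-8: the byte does what it would do in TTD.
            result.append(("ttd", data[i]))
            i += 1
            continue
        code = ord(char)
        if 0xE020 <= code <= 0xE0FF:
            result.append(("ttd", code - 0xE000))  # Private Use Area
        elif code < 0x20:
            result.append(("ttd", code))  # control character
        else:
            result.append(("glyph", code))
        i += length
    return result


print(classify(bytes.fromhex("82")))      # [('ttd', 130)]
print(classify(bytes.fromhex("c2ac")))    # [('glyph', 172)]
print(classify(bytes.fromhex("ee82ac")))  # [('ttd', 172)]
```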