Language and character set settings

Purpose of application locale definition

The locale settings matter at compile time and at runtime. At runtime, the locale changes the behavior of character handling functions such as UPSHIFT and DOWNSHIFT. It also changes how character strings are handled, since they can be single-byte or multibyte encoded. At compile time, errors occur if the source files contain characters that do not exist in the encoding defined by the current locale.

Always check that the locale environment variable matches the locale of your Genero application, both during development and at runtime:
$ fglrun -i mbcs
Charmap      : UTF-8
Multibyte    : yes
Stateless    : yes
Length Semantics : CHAR
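
As a minimal sketch of automating this check, the following Python snippet parses text in the "key : value" layout shown above and compares the reported charmap with the one the application expects. The function names parse_mbcs_info and charmap_matches are hypothetical helpers introduced for illustration, not part of Genero, and the snippet assumes the output has already been captured as a string.

```python
def parse_mbcs_info(text):
    """Parse 'key : value' lines (as printed by fglrun -i mbcs) into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon
            info[key.strip()] = value.strip()
    return info


def charmap_matches(info, expected):
    """Compare charmap names, ignoring case and punctuation (UTF-8 vs utf8)."""
    def normalize(name):
        return "".join(ch for ch in name.lower() if ch.isalnum())
    return normalize(info.get("Charmap", "")) == normalize(expected)


# Sample text copied from the output shown above.
sample = """Charmap      : UTF-8
Multibyte    : yes
Stateless    : yes
Length Semantics : CHAR"""

info = parse_mbcs_info(sample)
print(info["Charmap"], charmap_matches(info, "utf8"))
```

A deployment script could run such a check once at startup and stop early when the charmap differs from the one the sources were compiled with.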

Mobile platforms

On iOS and Android™ mobile platforms, the locale is automatically defined to be UTF-8. This cannot be changed.

The language conventions and system messages are defined by the device settings.

Windows™ platforms

On Windows platforms, if you do not specify the LANG environment variable, the language and character set default to the system locale, which is defined by the regional settings for non-Unicode applications. For example, on a US-English Windows system, this defaults to the 1252 code page. You typically keep the default on Windows platforms: do not set the LANG variable unless your application uses a different character set than the Windows system locale.

On Windows platforms, the syntax of the LANG variable is:
  language[_territory[.codeset]] | .codeset
For example:
C:\> set LANG=English_USA.1252

UNIX™ platforms

On UNIX-based platforms, the LC_ALL (or LANG) environment variable defines the global settings for the language used by the application.

With the LANG environment variable (or LC_ALL, on UNIX), you define the language, the territory (or country) and the codeset (or character set, or code page) to be used. The format of the value is normalized as follows, but may vary on some operating systems:
language_territory.codeset
For example:
$ LC_ALL=en_US.iso88591; export LC_ALL
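
To illustrate the language_territory.codeset layout, here is a small Python sketch that splits a locale value into its three parts. LocaleParts and split_locale are hypothetical names chosen for this example, and the function assumes the simple layout described above; it does not handle additional platform-specific suffixes.

```python
from typing import NamedTuple, Optional


class LocaleParts(NamedTuple):
    language: str
    territory: Optional[str]
    codeset: Optional[str]


def split_locale(value: str) -> LocaleParts:
    """Split 'language_territory.codeset' into its components.

    Missing parts are returned as None; a value such as '.1252'
    yields an empty language and only a codeset.
    """
    name, _, codeset = value.partition(".")
    language, _, territory = name.partition("_")
    return LocaleParts(language, territory or None, codeset or None)


print(split_locale("en_US.iso88591"))
print(split_locale("English_USA.1252"))
```

The same splitting also works for the Windows-style values shown earlier, since both platforms separate the codeset with a dot.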

What are possible locales on my platform?

Usually OS vendors define a specific set of values for the language, territory and codeset. For example, on a UNIX platform, you typically have the value "en_US.ISO8859-1" for a US English locale, while Microsoft™ Windows requires the "English_USA.1252" value. For more details about supported locales, refer to the operating system documentation.

On UNIX platforms, you can list the available locales by running the locale -a command. You may also want to read the man pages of the locale command and the setlocale function. On Windows platforms, search the Microsoft MSDN documentation for "Language and Country/Region Strings".
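
As a sketch of how a setup script might use locale -a on UNIX, the following Python code lists the installed locales and keeps those whose codeset is UTF-8. The helper names available_locales and utf8_locales are hypothetical; the code assumes that the locale command is on the PATH and that codeset spellings such as "utf8" and "UTF-8" are equivalent, which may not cover every operating system.

```python
import subprocess


def available_locales():
    """Return the locale names printed by 'locale -a' (UNIX only)."""
    out = subprocess.run(
        ["locale", "-a"],
        capture_output=True,
        encoding="utf-8",
        errors="replace",  # some systems print names in legacy encodings
        check=True,
    ).stdout
    return [line.strip() for line in out.splitlines() if line.strip()]


def utf8_locales(names):
    """Keep only the names whose codeset part is a spelling of UTF-8."""
    result = []
    for name in names:
        _, _, codeset = name.partition(".")
        codeset = codeset.split("@")[0]  # drop a modifier such as @euro
        if codeset.lower().replace("-", "") == "utf8":
            result.append(name)
    return result


# Filtering a hand-written list, so the example does not depend on the host.
names = ["C", "POSIX", "en_US.utf8", "fr_FR.ISO8859-1", "de_DE.UTF-8@euro"]
print(utf8_locales(names))
```

On a real system, utf8_locales(available_locales()) would give the candidate values for LANG or LC_ALL when the application uses UTF-8.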

UNICODE support (UTF-8)

To support multiple languages in your application, you must use UNICODE. The encoding supported by Genero for UNICODE applications is UTF-8.

On UNIX platforms, UTF-8 locales are natively supported with LANG/LC_ALL.

On Windows platforms, UTF-8 is not well supported by the operating system: setting the LANG environment variable to code page 65001 will not work. To work around this limitation, Genero provides its own UTF-8 support on Windows, which you enable by setting the LANG environment variable to the value .fglutf8:

C:\> set LANG=.fglutf8