Language and character set settings

The Genero Business Development Language (BDL) compilers and runtime system use the standard C library locale functions (setlocale) to handle character sets. The LANG (or LC_ALL) environment variable defines the global settings for the language used by the application. The locale settings matter both at compile time and at runtime. At compile time, errors occur if the source files contain characters that do not exist in the encoding defined by the current locale. At runtime, the locale changes the behavior of character handling functions such as UPSHIFT and DOWNSHIFT. It also changes the handling of character strings, which can be single-byte or multibyte encoded.
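
To illustrate the difference between single-byte and multibyte encodings, the following shell sketch counts the bytes used by the same character (é) in two encodings. The byte values are written as octal escapes so that the result does not depend on the encoding of the terminal. The byte_count helper is a hypothetical name used only for this illustration; it is not a Genero tool.

```shell
# byte_count is a hypothetical helper: it reports how many bytes it reads on stdin.
byte_count() {
    wc -c | tr -d ' '
}

# "é" in ISO-8859-1 is the single byte 0xE9 (octal 351): prints 1.
printf '\351' | byte_count

# "é" in UTF-8 is the two bytes 0xC3 0xA9 (octal 303 251): prints 2.
printf '\303\251' | byte_count
```

A program that counts characters rather than bytes therefore needs to know which encoding is in effect, which is what the locale tells it.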

Note that on Windows™ platforms, if you don't specify the LANG environment variable, the language and character set default to the system locale, which is defined by the regional settings for non-Unicode applications. For example, on a US-English Windows, this defaults to the 1252 code page. You typically leave the default on Windows platforms (i.e. you should not set the LANG variable, unless your application uses a different character set than the Windows system locale). On UNIX™ platforms, you should always check that the LANG (or LC_ALL) environment variable matches the locale of your Genero application.
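
On UNIX platforms, the standard C library gives LC_ALL precedence over the category variables (such as LC_CTYPE for character handling), which in turn take precedence over LANG. The following is a minimal sketch of that lookup order, assuming a POSIX shell; effective_ctype is a hypothetical helper, not a Genero tool, and the locale command reports the same information for all categories.

```shell
# effective_ctype is a hypothetical helper: it prints the locale name that
# applies to character handling, following the order LC_ALL, LC_CTYPE, LANG.
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        # Nothing is set: the C library falls back to the "C" locale.
        printf 'C\n'
    fi
}

effective_ctype
```

Comparing this value with the encoding of your source files and data is a quick way to spot a mismatch before running the application.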

With the LANG environment variable (or LC_ALL, on UNIX), you define the language, the territory (country) and the codeset (character set or code page) to be used. The value normally has the following format, although some operating systems use their own variants:
language_territory.codeset
For example:
$ LANG=en_US.iso88591; export LANG
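
As a sketch of how such a value breaks down into its parts, the following POSIX shell function splits a locale name of the form language_territory.codeset. The parse_locale name is hypothetical and only serves this illustration; modifiers such as @euro and OS-specific variants are not handled.

```shell
# parse_locale is a hypothetical helper: it splits a locale name of the form
# language[_territory][.codeset] and prints the three parts.
parse_locale() {
    name=$1
    territory=
    codeset=
    case $name in
        # Everything after the first dot is the codeset.
        *.*) codeset=${name#*.}; name=${name%%.*} ;;
    esac
    case $name in
        # Everything after the first underscore is the territory.
        *_*) territory=${name#*_}; name=${name%%_*} ;;
    esac
    printf 'language=%s territory=%s codeset=%s\n' "$name" "$territory" "$codeset"
}

# Prints: language=en territory=US codeset=iso88591
parse_locale "en_US.iso88591"
```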

Usually OS vendors define a specific set of values for the language, territory and codeset. For example, on a UNIX platform, you typically have the value "en_US.ISO8859-1" for a US English locale, while Microsoft™ Windows requires the "English_USA.1252" value. For more details about supported locales, please refer to the OS documentation. On UNIX platforms, you can run man locale or man setlocale to understand how the standard C library defines locale settings. For Windows, search in the MSDN library.

Note that on Windows platforms, the syntax of the LANG variable is:
  language[_territory[.codeset]]
  | .codeset
For example:
C:\> set LANG=English_USA.1252

A list of available locales can be found on UNIX platforms by running the locale -a command. You may also want to read the man pages of the locale command and the setlocale function. On Windows platforms, search the Microsoft MSDN documentation for "Language and Country/Region Strings".

Read the OS manuals to learn how to check if a locale is properly set and to find the list of locales installed on your system.
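
On UNIX platforms, the check can be scripted around locale -a. The following is a minimal sketch, assuming a POSIX shell; normalize_locale, locale_in_list and locale_installed are hypothetical helper names, not Genero tools. Because some systems list codesets with a different spelling (for example utf8 rather than UTF-8), the comparison ignores case and hyphens, which is a simplification.

```shell
# Normalize locale names for comparison: lowercase, hyphens removed,
# so that "en_US.UTF-8" and "en_US.utf8" compare equal.
normalize_locale() {
    tr '[:upper:]' '[:lower:]' | sed 's/-//g'
}

# Succeeds if the locale name given as $1 appears in the list read on stdin.
locale_in_list() {
    wanted=$(printf '%s\n' "$1" | normalize_locale)
    normalize_locale | grep -xF -- "$wanted" >/dev/null
}

# Succeeds if the locale name given as $1 is reported by "locale -a".
locale_installed() {
    locale -a 2>/dev/null | locale_in_list "$1"
}

if locale_installed "${LANG:-C}"; then
    echo "locale ${LANG:-C} is installed"
else
    echo "locale ${LANG:-C} is not listed by locale -a"
fi
```

Keeping the list comparison in its own function makes it possible to try the logic on a fixed list of names, independently of the locales installed on the machine.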

To support multiple languages in your application, you must use Unicode. The encoding supported by Genero for Unicode applications is UTF-8. On UNIX platforms, UTF-8 locales are natively supported. On Windows platforms, UTF-8 is not well supported by the operating system: setting the LANG environment variable to code page 65001 will not work. To work around this limitation, Genero implements its own UTF-8 support on Windows, which you enable by setting the LANG environment variable to the value .fglutf8:

C:\> set LANG=.fglutf8