Character set usage

The character set used to edit and compile .per form specification files is defined by the current locale.

Form elements (typically, labels) can be written with non-ASCII characters of the current codeset.

In a grid-based layout, form element positions and sizes are determined by counting the display width of the characters, not the number of bytes that encode them in the current codeset. The distinction does not matter with a single-byte character set such as ISO-8859-1 or CP-1252, where each character has a width of 1 and is encoded in 1 byte. It does matter with a multibyte character set such as BIG5 or UTF-8.

For example, in the UTF-8 multibyte codeset, a Chinese ideogram is encoded in three bytes, while its visual width is twice that of a Latin character. In the next example, the labels with three Chinese characters have the same width as the labels with six Latin characters. As a result, all the labels get the same size (6 cells), and all fields are aligned properly in a proportional font display:

LAYOUT
GRID
{
叽哱唶 [f001  ] abcdef [f002  ]
abcdef [f003  ] 叽哱唶 [f004  ]
}
END
END
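
The cell counts in the grid example above can be checked outside of the form compiler. The following Python sketch is only an approximation: it assumes that characters classified as wide or fullwidth by the Unicode East Asian Width property occupy two cells and all other characters occupy one. The function name display_width is hypothetical, and the rule used by the form compiler itself may differ in edge cases.

```python
import unicodedata


def display_width(text):
    """Approximate the number of grid cells occupied by text.

    Assumption: East Asian wide (W) and fullwidth (F) characters take
    two cells, every other character takes one. Combining marks and
    other zero-width characters are not handled.
    """
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


# Compare the cell width with the UTF-8 byte length of each label.
for label in ("叽哱唶", "abcdef"):
    print(label, display_width(label), len(label.encode("utf-8")))
```

Both labels yield a width of 6 cells, while their UTF-8 byte lengths differ (9 and 6), which is why counting bytes would misalign the fields.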

In a stack-based container, the position of form elements is logical. Unlike in a grid-based container, the current locale has no impact on the form item positions:

LAYOUT
 STACK
   GROUP
      EDIT customer.cust_num, TITLE="叽哱唶";
      EDIT customer.cust_name;
      EDIT customer.cust_address;
   END
   ...
 END
END

For maximum portability, it is recommended to write all form specification files in 7-bit ASCII, and to use localized strings to internationalize the forms.
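
One way to verify that a form specification file follows this recommendation is to scan its raw bytes for anything outside the 7-bit ASCII range. The sketch below is a minimal illustration, not part of the product tooling: the function name non_ascii_lines is hypothetical, and it works on bytes so that the result does not depend on the current locale.

```python
def non_ascii_lines(data):
    """Return the 1-based numbers of the lines containing non-ASCII bytes.

    data is the raw content of a form file, read in binary mode.
    """
    return [
        number
        for number, line in enumerate(data.splitlines(), start=1)
        if not line.isascii()  # bytes.isascii() requires Python 3.7+
    ]


# Example with in-memory content: line 2 holds a UTF-8 encoded title.
sample = b'EDIT customer.cust_name;\n' + 'TITLE="叽哱唶";\n'.encode("utf-8")
print(non_ascii_lines(sample))
```

To check a file on disk, read it with open(path, "rb").read() and pass the result to the function. An empty list means the file is pure 7-bit ASCII.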