CHARACTER data types

Informix^® supports the following character data types:

CHAR(N) with N<= 32767 bytes
VARCHAR(N[,M]) with N<=255 bytes
NCHAR(N) with N<= 32767 bytes
NVARCHAR(N[,M]) with N<=255 bytes
LVARCHAR(N), without the 255 bytes limit (max size varies according to IDS version)

In Informix, both CHAR/VARCHAR and NCHAR/NVARCHAR data types can be used to store single-byte or multibyte encoded character strings. The only difference between CHAR/VARCHAR and NCHAR/NVARCHAR is for sorting: N[VAR]CHAR types use the collation order, while [VAR]CHAR types use the byte order. The character set used to store strings in CHAR/VARCHAR/NCHAR/NVARCHAR columns is defined by the DB_LOCALE environment variable. The character set used by applications is defined by the CLIENT_LOCALE environment variable. Informix uses Byte Length Semantics (the size N that you specify in [VAR]CHAR(N)is expressed in bytes, not characters as in some other databases)

PostgreSQL provides the following character types:

CHAR(N) with N<= 10485760 characters
VARCHAR(N) with N<= 10485760 characters; The length specification is optional.
TEXT with a limit of 1GB

In PostgreSQL, CHAR, VARCHAR and TEXT types store data in single byte or multibyte character sets. For CHAR and VARCHAR, the size is specified in a number of characters, not bytes. The character set used to store data for these types is defined by the database character set, which can be specified when you create the database with the createdb tool or the CREATE DATABASE SQL command.

Note: The VARCHAR type of PostgreSQL can be used without a length specification. If no size is specified, the column accepts strings of any size. However, as Genero BDL needs to know the size of CHAR and VARCHAR columns to define fields and program variables from a schema file, you should not create tables in PostgreSQL having VARCHAR columns without size specification. If you try to extract a schema with fgldbsch, this tool will report that the VARCHAR column cannot be converted to a BDL type for the .sch file.

Automatic character set conversion between the PostgreSQL client and server is supported. You must properly specify the client character set for PostgreSQL. This can be done in different ways, with the SET CLIENT_ENCODING TO SQL command for example, or with configuration parameters. See the PostgreSQL documentation for more details.

Solution

Informix CHAR(N) types must be mapped to PostgreSQL CHAR(N) types, and Informix VARCHAR(N) or LVARCHAR(N) columns must be mapped to PostgreSQL VARCHAR(N).

Note: When creating a table from the BDL program with NCHAR or NVARCHAR types, the type names will be left as is and produce an SQL error because these types are not supported by PostgreSQL.

You can store single-byte or multibyte character strings in PostgreSQL CHAR, VARCHAR and TEXT columns.

PostgreSQL uses character length semantics: When you define a CHAR(20) and the database character set is multibyte, the column can hold more bytes/characters than the Informix CHAR(20) type, when using byte length semantics.

When using a multibyte character set (such as UTF-8), define database columns with the size in character units, and use character length semantics in BDL programs with FGL_LENGTH_SEMANTICS=CHAR.

When extracting a database schema from a PostgreSQL database, the schema extractor uses the size of the column in characters, not the octet length. If you have created a CHAR(10 (characters) ) column a in PostgreSQL database using the UTF-8 character set, the .sch file will get a size of 10, that will be interpreted according to FGL_LENGTH_SEMANTICS as a number of bytes or characters.

Do not forget to properly define the database client character set, which must correspond to the runtime system character set.

See also the section about Localization.