CHAR and VARCHAR data types

Informix®

Informix supports the following character data types:

Table 1. Informix character data types
Informix data type Description
CHAR(n) SBCS and MBCS character data (max is 32767 bytes)
VARCHAR(n[,m]) SBCS and MBCS character data (max is 255 bytes)
NCHAR(n) Same as CHAR, with specific collation order
NVARCHAR(n[,m]) Same as VARCHAR, with specific collation order
LVARCHAR(n) max size varies depending on the IDS version

With Informix, both CHAR/VARCHAR and NCHAR/NVARCHAR data types can be used to store single-byte or multibyte encoded character strings. The only difference between CHAR/VARCHAR and NCHAR/NVARCHAR is in how they use sorting: N[VAR]CHAR types use the collation order, while [VAR]CHAR types use the byte order.

The character set used to store strings in CHAR/VARCHAR/NCHAR/NVARCHAR columns is defined by the DB_LOCALE environment variable.

The character set used by applications is defined by the CLIENT_LOCALE environment variable.

Informix uses Byte Length Semantics (the size N that you specify in [VAR]CHAR(N) is expressed in bytes, not characters as in some other databases)

IBM® DB2®

IBM DB2 supports following data types to store character data:

Table 2. IBM DB2 character data types
IBM DB2 data type Description
CHAR(n) SBCS or MBCS character data using the database character set, where n is specified in bytes, or characters when the string units is CODEUNITS32. (max is 255 bytes)
VARCHAR(n) SBCS or MBCS character data using the database character set, where n is specified in bytes, or characters when the string units is CODEUNITS32. (max is 32672 bytes)
GRAPHIC(n) UNICODE character data, where n is specified in double-bytes, or characters when the string units is CODEUNITS32 (max is 127 characters)
VARGRAPHIC(n) UNICODE character data, where n is specified in double-bytes, or characters when the string units is CODEUNITS32 . (max is 16336 characters)
NCHAR(n) Synonym for CHAR or GRAPHIC, depending on the nchar_mapping database parameter, where n is specified in characters.
NVARCHAR(n) Synonym for VARCHAR or VARGRAPHIC, depending on the nchar_mapping database parameter, where n is specified in characters.

The character set used by DB2 to store CHAR and VARCHAR data is defined in the database locale section when creating a new database. If your application uses UTF-8, consider creating the DB2 database with the UTF-8 codeset.

The DB2 NLS_STRING_UNITS=CODEUNITS32 built-in global variable (string_units system variable) can be used to control the real size in bytes of character data types. See DB2 documentation for more details.

DB2 can automatically convert from/to the client and server characters sets. In the context of client applications, the DB2CODEPAGE profile variable can be used to define the application codeset. The default depends on the operating system type:
  • On Linux and UNIX operating systems, the default code page determined from the locale setting, which includes information about language, territory and code set.
  • On Windows operating systems, the default code page is derived from the ANSI code page setting in the Regional Options menu in the Control Panel.

Solution

Informix CHAR(N) columns can be converted to DB2 CHAR(N) or NCHAR(N) types, and Informix VARCHAR(N) or LVARCHAR(N) columns can be converted to DB2 VARCHAR(N) or NVARCHAR(N).

Based on the FGLPROFILE entry dbi.database.dbname.ifxemul.nationalchars, the CHAR and VARCHAR type names in SQL statements are converted to CHAR/VARCHAR or to NCHAR/NVARCHAR DB2 types, that are then interpreted by DB2 to CHAR/VARCHAR or GRAPGHIC/VARGRAPHIC types, depending on the nchar_mapping database parameter.

Check that your database schema does not use CHAR or VARCHAR types with a length exceeding the DB2 limits. Especially, the Informix CHAR type has a very long size limit compared to DB2 CHAR. For example, when using the ifxemul.nationalchars FGLPROFILE entry, if a program creates a table with a CHAR(100) column, the ODI driver will convert this to NCHAR(100), and if nchar_mapping=CHAR_CU32, DB2 will convert NCHAR(100) to a CHAR(100) with CODEUNITS32, which will result in a CHAR(400) (100*4), which exceeds the maximum size (255 bytes) of a DB2 CHAR.

When using a multibyte character set (such as UTF-8) in programs, if the DB2 database was created with the appropriate codeset (UTF-8), use the CHAR/VARCHAR column types, and use BYTE length semantics in programs. Another option is to use CHAR/VARCHAR types, with the NLS_STRING_UNITS=CODEUNITS32 option set, to force DB2 to apply a ratio of 4 for storage sizes, and use CHAR length semantics in programs. If the DB2 database codeset is not multibyte, you must use the GRAPHIC and VARGRAPHIC data types to store multibyte character data, and use CHAR length semantics in programs.

When generating a .sch database schema from a DB2 database, the fgldbsch schema extractor extracts the size of CHAR/VARCHAR columns and GRAPHIC/VARGRAPHIC columns as a number of characters. If you have created a CHAR(10) column a in DB2 database using the UTF-8 character set, the .sch file will get a size of 10, that will be interpreted by FGL_LENGTH_SEMANTICS as either a number of bytes or characters.

Properly define the database client character set, which must correspond to the runtime system character set, with the DB2CODEPAGE profile variable.
Note: On Linux and UNIX operating systems, DB2CODEPAGE can be left unset, since DB2 client will use the current application locale as default. On Microsoft Windows platforms, if the ANSI Code Page of the operating system does not match the LANG setting used by Genero applications (LANG=.fglutf8), the DB2CODEPAGE must be set. For UTF-8, DB2CODEPAGE=1208; for ISO-8859-15, DB2CODEPAGE=923.

See also the section about Localization.

The CHAR/VARCHAR type translation can be controlled with the following FGLPROFILE entries:
dbi.database.dsname.ifxemul.datatype.char = { true | false }
dbi.database.dsname.ifxemul.datatype.varchar = { true | false }
For more details see IBM Informix emulation parameters in FGLPROFILE.