CHARACTER data types

Informix® supports following character data types:

In Informix, both CHAR/VARCHAR and NCHAR/NVARCHAR data types can be used to store single-byte or multibyte encoded character strings. The only difference between CHAR/VARCHAR and NCHAR/NVARCHAR is for sorting: N[VAR]CHAR types use the collation order, while [VAR]CHAR types use the byte order. The character set used to store strings in CHAR/VARCHAR/NCHAR/NVARCHAR columns is defined by the DB_LOCALE environment variable. The character set used by applications is defined by the CLIENT_LOCALE environment variable. Informix uses Byte Length Semantics (the size N that you specify in [VAR]CHAR(N) is expressed in bytes, not characters as in some other databases)

MySQL supports the following character data types:

With MySQL version 4, CHAR/VARCHAR with a size exceeding 255 characters are silently converted to TEXT columns. With later versions, you now get an SQL error when trying to define a CHAR or VARCHAR column with a size greater than the limit. Also, before version MySQL 5.0.3, VARCHAR limit was 255 characters, starting with 5.0.3 the limit is 65535 characters.

MySQL uses character length semantics to define the size of CHAR/VARCHAR columns, while Informix and Genero use Byte Length Semantics.

MySQL can support multiple character sets, you can run the SHOW CHARACTER SET statement to list supported encodings. There are different configuration levels to define the character set used by MySQL to store data. The server character set defines the default for database character sets if not specified in the CREATE DATABASE command. You can even define a specific character set at the table and column level, but this is not recommended with Genero applications. The database character set is used to store CHAR and VARCHAR columns. The NCHAR and NATIONAL VARCHAR types use a predefined character set which can be different from the database character set. In MySQL the national character set is UTF-8.

MySQL can automatically convert from/to the client and server characters sets. In the client applications, you define the character set with the SET NAMES instruction.

Note that by default, when fetching CHAR columns from MySQL, trailing blanks are trimmed. This does not matter as long as you fetch CHAR columns into CHAR variables, but this non-standard behavior will impact CHAR fetch into VARCHAR, or other SQL areas such as string concatenation for example. You can control the behavior of CHAR trailing blanks trimming with the PAD_CHAR_TO_FULL_LENGTH sql-mode parameter. But when this mode is used, the result of the SQL LENGTH() function will be different since trailing blanks are significant for that function in MySQL.

Solution

Informix CHAR(N) types must be mapped to MySQL CHAR(N) types. Informix VARCHAR(N) or LVARCHAR(N) columns must be mapped to MySQL VARCHAR(N).

Review your database schema when using CHAR, VARCHAR and LVARCHAR columns with a size exceeding the MySQL limits: If you need to store character strings larger as the MySQL limit, you can use the MySQL text type. When using CHAR/VARCHAR/LVARCHAR types in CREATE TABLE, the SQL Translator converts to text if the size is bigger as 255. When fetching tinytext and text columns from the MySQL database, the MySQL database driver treats such columns as CHAR/VARCHAR, to emulate Informix CHAR columns with a size greater than 255.

Note: For each text column fetched from MySQL, the MySQL database driver needs to allocate a temporary string buffer of 65535 bytes . Keep in mind that text columns may have be created by CREATE TABLE using CHAR/VARCHAR/LVARCHAR with size>255. The memory used by this temporary buffer is freed when freeing the cursor.

You can store single-byte or multibyte character strings in MySQL CHAR, VARCHAR and TEXT columns.

MySQL uses character length semantics: When you define a CHAR(20) and the database character set is multibyte, the column can hold more bytes/characters than the Informix CHAR(20) type, when using byte length semantics. When using a multibyte character set (such as UTF-8), define database columns with the size in character units, and use character length semantics in 4gl programs with FGL_LENGTH_SEMANTICS=CHAR.

When extracting a database schema from a MySQL database, the schema extractor uses the size of the column in characters, not the octet length. If you have created a CHAR(10 (characters) ) column a in MySQL database using the UTF8 character set, the .sch file will get a size of 10, that will be interpreted according to FGL_LENGTH_SEMANTICS as a number of bytes or characters.

Do not forget to properly define the database client character set, which must correspond to the runtime system character set.

See also the section about Localization.