LOAD

Inserts data from a file into an existing database table.

Syntax

LOAD FROM filename [ DELIMITER delimiter]
{
  INSERT INTO table-specification [ ( column [,...]) ]
|
  insert-string
}

where table-specification is:

[dbname[@dbserver]:][owner.]table
  1. filename is a string expression containing the name of the file the data is read from.
  2. delimiter is the character used as the value delimiter. Default is | pipe or DBDELIMITER environment variable when set.
  3. The INSERT clause is a pseudo INSERT statement (without the VALUES clause), where you can specify the list of columns in parentheses.
  4. dbname identifies the database name.
  5. dbserver identifies the database server (INFORMIXSERVER).
  6. owner identifies the owner of the table, with optional double quotes.
  7. table is the name of the database table.
  8. column is a name of a table column.
  9. insert-string is a string expression containing the pseudo-INSERT statement.

Usage

The LOAD instruction reads serialized data from an input file and inserts new rows in a database table specified in the INSERT clause. A file created by the UNLOAD statement can be used as input for the LOAD statement if its values are compatible with the schema of table.

A data file used by LOAD or UNLOAD instructions looks like this (when the delimiter is the pipe), and the record has an INTEGER, VARCHAR and DATE fields:
102|Mike|12/24/2020|
192|Tom|04/29/2019|
Note: Files encoded in UTF-8 can start with the UTF-8 Byte Order Mark (BOM), a sequence of 0xEF 0xBB 0xBF bytes, also known as UNICODE U+FEFF. When reading files, Genero BDL will ignore the UTF-8 BOM, if it is present at the beginning of the file. This applies to instructions such as LOAD, as well as I/O APIs such as base.Channel.read() and readLine().

The LOAD statement must include a pseudo-INSERT statement (either directly or as text in a variable) to specify where to store the data. LOAD appends the new rows to the specified table, synonym, or view, but does not overwrite existing data. It cannot add a row that has the same key as an existing row.

Avoid use of the the dbname, dbserver and owner prefix of the table name for maximum SQL portability.

The number and the order of columns in the INSERT statement must match the values of the input file.

The LOAD instruction cannot be prepared with a PREPARE statement. However, LOAD can take a string literal as parameter, that allows to build the INSERT statement at runtime.

The variable or string following the LOAD FROM keywords must specify the name of a file of ASCII characters (or characters that are valid for the current locale) that holds the data values that are to be inserted.

Each set of data values in filename that represents a new row is called an input record. Each input record must contain the same number of delimited data values. If the INSERT clause has no list of columns, the sequence of values in each input record must match the columns of table in number and order. Each value must have the literal format of the column data type, or of a compatible data type.

If LOAD is executed within a transaction block (BEGIN WORK / COMMIT WORK), the rows inserted by the LOAD instruction are part of the transaction. With some database servers the insert rows remain locked until the COMMIT WORK or ROLLBACK WORK statement terminates the transaction. Consider locking the whole table to

If the database does not support transactions, a failing LOAD statement cannot remove any rows that were loaded before the failure occurred. You must manually remove the already loaded records from either the load file or from the receiving table, repair the erroneous records, and rerun LOAD.

If the database supports transactions, you can do the following actions:

  • Run LOAD as a singleton transaction, so that any error causes the entire LOAD statement to be automatically rolled back.
  • Run LOAD within an explicit BEGIN WORK / COMMIT WORK transaction block, so that a data error merely stops the LOAD statement in place with the transaction still open.
Note: When the LOAD instruction is not surrounded by BEGIN WORK and COMMIT WORK or ROLLBACK WORK instructions, terminating the transaction when LOAD is finished will automatically close cursors not defined WITH HOLD option. To keep cursors open, either use the WITH HOLD option in DECLARE CURSOR, or surround the DECLARE, OPEN and LOAD instruction with an explicit BEGIN WORK / COMMIT WORK.

A single character delimiter instructs LOAD to read data in the default format. When using "CSV" as delimiter specification, the LOAD instruction will read the data in CSV format. If the DELIMITER clause is not specified, the delimiter is defined by the DBDELIMITER environment variable. If the DBDELIMITER environment variable is not set, the default is a | pipe. The field delimiter can be a blank character. It cannot be backslash or any hexadecimal digit (0-9, A-F, a-f). If the delimiter specified in the LOAD command is NULL, the runtime system will use the default delimiter or DBDELIMITER if the variable is defined.

At this time, data type description of the input file fields is implicit; in order to create the SQL parameter buffers to hold the field values for inserts, the LOAD instruction uses the current database connection to get the column data types of the target table. Those data types depend on the type of database server. For example, IBM® Informix® DATE columns do not store the same data as the Oracle® DATE data type. Therefore, be careful when using the LOAD/UNLOAD instructions; if the application connects to different kinds of database servers, it can result data conversion errors.

Pay attention to numeric (DECIMAL, MONEY) and date/time values (DATE, DATETIME): These must match the current format settings (DBFORMAT, DBDATE). As a general programming pattern, use simple INSERT statements to load default and configuration data into your database, in order to be independent from the numeric and date format settings.

Default LOAD format

The next table describes the recommended representation for data values in the input file used by the LOAD instruction. Values must be serialized with a character string following the SQL data type of the receiving column of the table.

Table 1. Data representation for the default LOAD format
Data type Input Format
CHAR, VARCHAR, TEXT Values can have more characters than the declared maximum length of the column, but any extra characters are ignored. A backslash ( \ ) is required before any literal backslash or any literal delimiter character, and before any NEWLINE character anywhere in character value. Blank values can be represented as one or more blank characters between delimiters, but leading blanks must not precede other CHAR, VARCHAR, or TEXT values.
DATE In the default locale, values must be in month/day/year format unless another format is specified by DBDATE environment variable. The day and month must be a 2-digit number, and the year must be a 4-digit number.
DATETIME

DATETIME values must be in the format:

year-month-day hour:minute:second.fraction or a contiguous subset, without the DATETIME keyword or qualifiers. Time units outside the declared column precision can be omitted. The year must be a four-digit number; all other time units (except fraction) require two digits.

INTERVAL INTERVAL values must be formatted:

year-month

or

day hour:minute:second.fraction

or a contiguous subset thereof, without the INTERVAL keyword or qualifiers. Time units outside the declared column precision can be omitted. All time units (except year and fraction) require two digits.

DECIMAL, MONEY Values must use the decimal separator defined by DBFORMAT/DBMONEY. For MONEY, values can include currency symbols, but these are not required.
BYTE Values must be ASCII-hexadecimals; no leading or trailing blanks.
SERIAL, BIGSERIAL, SERIAL8 Values can be represented as 0 to tell the database server to supply a new serial value. You can specify a literal integer greater than zero, but if the column has a unique index, an error results if this number duplicates an existing value.

The NEWLINE character must terminate each input record in filename. Specify only values that the language can convert to the data type of the database column. For database columns of character data types, inserted values are truncated from the right if they exceed the declared length of the column.

NULL values of any data type must be represented by consecutive delimiters in the input file; you cannot include anything between the delimiter symbols.

The LOAD statement expects incoming data in the format specified by environment variables like DBFORMAT, DBMONEY, DBDATE, GL_DATE, and GL_DATETIME. The precedence of these format specifications is consistent with forms and reports. If there is an inconsistency, an error is reported and the LOAD is canceled.

The backslash symbol (\) serves as an escape character in the input file to indicate that the next character in a data value is a literal. The LOAD statement scans for backslash escaped elements to read special characters in the following contexts:

  • The backslash character appears anywhere in the value.
  • The delimiter character appears anywhere in the value.
  • The NEWLINE character appears anywhere in a value.

CSV LOAD format

The CSV (comma separated values) format is similar to the default format when using a simple comma delimiter, with the following differences:

  • Input values might be surrounded with " double quotes.
  • If an input value contains a comma or a NEWLINE, it is not escaped be the value must be quoted in the file.
  • Double-quote characters in input values are doubled and will be converted to a unique " character; the value must be quoted.
  • Backslash characters are not escaped in the input file and are read as; the value must be quoted.
  • Leading and trailing blanks are kept (no truncation).
  • No ending delimiter is expected at the end of the input record.

Example

MAIN
  DATABASE stores 
  BEGIN WORK
  DELETE FROM items 
  LOAD FROM "items01.unl" INSERT INTO items 
  LOAD FROM "items02.unl" INSERT INTO items 
  COMMIT WORK
END MAIN