Skip to main content

LOAD DATA (SQL)

Loads data into a table.

Synopsis

LOAD DATA FROM FILE filepath 
    [ COLUMNS (fieldname datatype, 
                          fieldname2 datatype2, ...) ] 
INTO table [ (fieldname, fieldname2, ...) 
    [ VALUES (headeritem,headeritem2, ...) ] ]
    [ USING {json_object} ]

LOAD DATA FROM JDBC connection TABLE jtable
INTO table [ (fieldname, fieldname2, ...) 
    [ VALUES (jfieldname,jfieldname2, ...) ] ]

Arguments

Argument Description
filepath (File Source Only) — The server-side location of a text file containing the data to load, specified as a complete file path enclosed in quotes.
COLUMNS (fieldname datatype) Optional (File Source Only) — The COLUMNS clause allows you to specify the sequence of data values in the filepath data file. For further details, refer to COLUMNS Clause.
INTO table The table to load the data into. A table name can be qualified (schema.tablename), or unqualified (tablename). An unqualified table name takes the default schema name. You can specify a view to load data in the table accessed through the view.
(fieldname,fieldname) Optional — The table fields to load file data into, specified in the order of the file data fields. This list of field names allows you to specify selected table fields, and to match the order of the data file items to the fields in table. A field that is not specified takes the default value, if specified in the table definition. If this clause is omitted, all user-defined fields in table must be represented in the data file.
VALUES (headeritem,headeritem2) Optional — For a data file source, the data file header names (headeritem) or COLUMNS clause names. For a JDBC data source, the JDBC table column names (jfieldname). Items must correspond positionally to the INTO clause field names.
USING {json_object}
Optional (File Source Only) — Specifies load parameters for the input data file using JSON object nested key:value pair syntax. For example, USING {"from":{"file":{"columnseparator":"^"}}}.
Used to specify the source data file column separator character, the presence of a source data file header row, and other parameters.
JSON object key:value syntax uses JSON escape characters.
If no USING clause is specified, default load parameters are used.
connection (JDBC Source Only) — A defined SQL Gateway Connection name. For further details see Loading from a JDBC Source.
TABLE jtable (JDBC Source Only) — The SQL data source table connected via JDBC. For further details see Loading from a JDBC Source.

Description

The LOAD DATA command loads data from a source into an InterSystems IRIS® SQL table. The source can be a data file or a table accessed using JDBC.

This command is intended for rapid population of a table with well-validated data. When you load data, %ROWCOUNT indicates the number of records successfully loaded. An error in input data results in that record not being loaded and loading proceeds to the next record. SQLCODE does not report this as an error; the %SQL_Diag.Result log indicates how many records failed to load.

The INTO table must be defined. If the table is empty, LOAD DATA populates the table with the source data rows. If the table already contains data, LOAD DATA adds the source data rows to the existing data.

Note:

The LOAD DATA command uses an underlying Java-based engine that requires a Java Virtual Machine (JVM) installation on your server. If you already have a JVM set up and accessible in your PATH environment variable, then the first time you use LOAD DATA, InterSystems IRIS automatically uses that JVM to start an External Language Server. To customize your External Language Server to use a specific JVM, or to use a remote server, see Managing External Server Connections.

Matching Columns

By default, LOAD DATA matches the columns from the data source to the target table by position. Note that LOAD DATA uses the SQL column order (SELECT * column order), as specified in the SqlColumnNumber property.

To match the data source columns and the target table columns the following syntax can be used:

  • Load source data positionally for all target table columns. Both source and target have the same sequence of data columns:

    LOAD DATA FROM FILE 'C://TEMP/mydata.txt' 
    INTO MyTable

    The source data for each column must validate with the data type of the matching table column. If the data source has more columns than the target table, the excess columns are ignored.

  • Load source data positionally for only some target table columns. Use INTO clause:

    LOAD DATA FROM FILE 'C://TEMP/mydata.txt' 
    INTO MyTable(field1,field2)
  • Load source data positionally when data source columns are in a different sequence than target table columns. Use INTO clause:

    LOAD DATA FROM FILE 'C://TEMP/mydata.txt' 
    INTO MyTable(field3,field1,field2,field4)
  • Load selected source data columns by source header name: Use INTO...VALUES clause and USING clause:

    LOAD DATA FROM FILE 'C://TEMP/mydata.txt' 
    INTO MyTable(field1,field3) VALUES (head2,head5) 
    USING {"from":{"file":{"header":"1"}}}

    If loading from a file source, the USING clause specifies that the first line of the source is a header and the VALUES clause specifies the source header names.

    If loading from a JDBC source the VALUES clause specifies the JDBC table column names and the USING clause is omitted.

  • Load selected source data columns from a source with no header: Use COLUMNS and INTO...VALUES clauses:

    LOAD DATA FROM FILE 'C://TEMP/mydata.txt' 
    COLUMNS (head1 INT,head2 VARCHAR(20),head3 INT,head4 VARCHAR(20),head5 INT)
    INTO MyTable(field1,field3) VALUES (head2,head5)

    The COLUMNS clause specifies source header names and data types. The VALUES clause specifies these source header names.

Loading from File Source

When loading data from a file, you must specify data for all user-defined fields specified in the INTO clause. You can also optionally include a header row.

Data Format

A data file must specify data for all of the user-defined fields specified in the INTO clause.

  • Each line in the data file specifies a separate record (row). Newline ("\n") is the default line separator. Blank lines are ignored.

    You can define a different line separator by specifying the lineseparator parameter in the USING clause. Available values are "\n", "\r" or "\r\n".

  • Data values in a record are separated by a column separator character. Comma is the default column separator character. All data fields must be indicated by column separators, including unspecified data indicated by placeholder column separators.

    You can define a different column separator character by specifying the columnseparator parameter in the USING clause.

    By default, no escape character is defined. To include the column separator character as a literal in a data value, enclose the data value with quotation marks. To include a quotation mark in a quoted data value, double the quote character (""). You can define an escape character by specifying the escapechar parameter in the USING clause.

  • All data in a data file record is validated against the table’s data criteria, including the number of data fields in the record, and the data type and data length of each field. A data file record that fails validation is passed over (not loaded). No error message is issued. Data loading continues with the next record.

  • By default, data values are specified in the order of the fields in the table (or view). You can use the COLUMNS clause to specify the data in a different order. You can use a view to load a data record to a table by supplying only values for the fields that are defined in the view.

For example, the Employees table contains fields for Name, Age, and Department. To populate this table you could use the following data file:

Frank Rogers,44,Administration
Ginger Rogers,33,Research
Fred Rogers,,Human Resources
Wilma Flintstone,55,
Barney Rubble,,

Note that all missing data must be represented by placeholder commas. If a record does not specify the correct number of commas, that record is not loaded, and loading proceeds with the next record. SQLCODE does not report this as an error; the %SQL_Diag.Result log indicates how many records failed to load.

The data file may include additional trailing items that are ignored by LOAD DATA. For example, the following data file loads the Employees table that contains fields for Name, Age, and Department (and has no City or Country fields):

Frank Rogers,44,Administration,Boston MA,USA
Ginger Rogers,33,Research,Paris,France
Fred Rogers,,Human Resources,Latrobe PA,USA
Wilma Flintstone,55,La Brea CA,USA
Barney Rubble,,London,UK

Default Values

  • If the data file represents a data field with a placeholder column separator character, that field is null; the default value specified in the table definition is not supplied.

  • If the data file does not represent a data field, the default value specified in the table definition is supplied.

For example, a table is defined with three fields: Name, Age (with the default 999), and Department. If the data file loads data directly into the table, Fred Rogers,,Human Resources loads a record with Age as NULL. If the data file loads data that specifies only Name and Department, Fred Rogers,Human Resources loads a record with Age as 999.

COLUMNS Clause

The COLUMNS clause allows you to assign names to the data columns in the filepath data file. This sequence is used to match data file values to table fields. The COLUMNS clause is used when the source data does not have a header row.

Specify a comma-separated list of fields in the order these data values are represented in the filepath file. Each entry in this list consisting of a fieldname and a datatype pair.

You must specify a fieldname/datatype pair sequentially for all source data columns loaded into the target table. (You do not have to specify fieldname/datatype for source data columns beyond those loaded into the target table.)

The datatype must be compatible with the table’s data type. The table’s data length, not the COLUMNS data length, is used to validate the data.

All fields specified in the INTO clause must be represented in the COLUMNS clause. If the COLUMNS clause is omitted, the default is to positionally match filepath values with table fields.

Header Row Format

The data file may optionally have a header row, such as the following:

Name,Department,Country,City,Street
Frank Rogers,Administration,USA,Cambridge,1 Memorial Drive
...

To specify that the data file has a header row, use the header boolean parameter, as shown in the following example:

LOAD DATA FROM FILE 'C://TEMP/mydata.txt' 
INTO Sample.Employees
USING {"from":{"file":{"header":"1"}}}

The header parameter is set to 1 (True). This identifies the first row of the data file as a header row that should not be loaded as data. The default is no header row; all rows are considered data.

Note:

If you do not specify a header parameter, a header row may nevertheless not be loaded as a table record because the header text does not validate against the field data type (for example an integer field with a header such as “Total”). However, this validation rejection should not be relied upon. Use the header parameter.

If you have set the header parameter to 1 (true), you can set the skip parameter to an integer. The skip parameter skips additional rows immediately following the header. The following example does not load the first 3 lines of the data file (the header line and the 2 additional skipped lines):

LOAD DATA FROM FILE 'C://TEMP/mydata.txt' 
INTO Sample.Employees
USING {"from":{"file":{"header":"1","skip":"2"}}}

If the header parameter is not specified or is set to 0 and you set the skip parameter, LOAD DATA completes successfully but loads no records.

The header row values correspond to the table field names. A header allows the data file items to be listed in a different order than the table fields. There are two possibilities:

  • Header values are the same as table field names. In this case, LOAD DATA automatically matches the data file items to the corresponding table fields. For example, if the header sequence is Department,Name,Country,City,StreetAddress and the table fields sequence is Name, Department, StreetAddress, City, Country the following example matches the header columns to the appropriate table fields:

    LOAD DATA FROM FILE 'C://TEMP/mydata.txt' 
    INTO Sample.Employees
    USING {"from":{"file":{"header":"1"}}}
  • Header values are not the same as table field names. In this case, LOAD DATA requires a COLUMNS clause to match the data file items to the corresponding table fields. For example, if the header sequence is Dept,FullName,Nation,City,Street and the table fields sequence is Name, Department, StreetAddress, City, Country the following example matches the header columns to the appropriate table fields:

    LOAD DATA FROM FILE 'C://TEMP/mydata.txt' 
    COLUMNS (Department VARCHAR(40),Name VARCHAR(40)),Country VARCHAR(20),City VARCHAR(20),StreetAddress VARCHAR(20) 
    INTO Sample.Employees
    USING {"from":{"file":{"header":"1"}}}

    As you can see, the COLUMNS sequence is the header row sequence, but the COLUMNS names are the table field names.

Loading from JDBC Source

When loading from a JDBC source, you specify a defined SQL Gateway Connection name and an external table accessed using that connection. For further details see Connecting the SQL Gateway via JDBC.

The LOAD DATA command INTO clause uses the VALUES keyword to match the target table column names to the JDBC table column names, as shown in the following example:

LOAD DATA FROM JDBC ConnectionName TABLE MyExternalJTable 
INTO MyTable(field1,field3) VALUES (jfield1,jfield2)

VALUES keyword positional matching of SQL column names is similar to INSERT command syntax.

%SQL_Diag Logging

A successful LOAD DATA operation generates a log entry in the %SQL_Diag.Result table and the %SQL_Diag.Message table. These tables can be displayed using a SELECT query, as follows:

SELECT * FROM %SQL_Diag.Result

The %SQL_Diag.Result errorCount column lists the number of records that failed to load and were skipped over.

The %SQL_Diag.Message log table provides detailed information on each record that failed to load.

SELECT * FROM %SQL_Diag.Message WHERE severity = 'error'

The %SQL_Diag.Message table has a foreign key to the %SQL_Diag.Result table. You can view the combination of data from these two tables by performing a JOIN operation on the resultId column of %SQL_Diag.Result and the diagResult column of %SQL_Diag.Message. For example:

SELECT * FROM %SQL_Diag.Message
JOIN %SQL_Diag.Result ON %SQL_Diag.Result.resultId = %SQL_Diag.Message.diagResult

Note that the timestamp for these logs is UTC (Coordinated Universal Time), not local time.

Examples

The examples in this section load data into the following table:

CREATE TABLE Sample.Employees (Name VARCHAR(40),Age INT DEFAULT 999,Department VARCHAR(20))

In the following example LOAD DATA copies data records such as Frank Rogers,44,Administration from a data file into the Employees table. It loads all user-defined table fields with the positionally corresponding file data item:

LOAD DATA FROM FILE 'c://TEMP/mydata.txt' INTO Sample.Employees

In the following example LOAD DATA copies data records such as Administration,Frank Rogers from a data file into the Employees table:

LOAD DATA FROM FILE 'c://TEMP/mydata.txt' INTO Sample.Employees(Department,Name)

The Age field is not specified in the list of fields. Age defaults to 999 when data is loaded.

The following example copies data from a data file into the Employees table using the following view:

CREATE VIEW Sample.VEmp (Moniker,Dept) AS SELECT Name,Department FROM Sample.Stuff

LOAD DATA copies data records such as Administration,Frank Rogers from a file using this view into the Employees table:

LOAD DATA FROM FILE 'c://TEMP/mydata.txt' INTO Sample.VEmp(Dept,Moniker)

The Age field is not part of the view definition. Age defaults to 999 when data is loaded using this view.

See Also

Feedback