Skip to main content

Federated Tables

A federated table offers query access to data that lives in disparate instances of InterSystems IRIS® data platform hosting the identical or similar schemas. Unlike a sharded table, federated tables are read-only and do not manage or distribute data among the sources based on a shard key. The source tables for a federated table, which physically store the its data and live in the distinct instances or namespaces, are managed exclusively by their host instance. As such, federated tables are read only. Federated tables offer a solution to use cases that require analytics or querying across different deployments of the same application or schema, such as multi-tenant environments.

Requirements for Creating a Federated Table

To connect source tables together as a federated table, you must first configure a sharded cluster that spans all the instances containing the relevant source data. For more information about configuring a sharded cluster, see “Deploying the Sharded Cluster.” Users that are new to sharding may find it useful to familiarize themselves with the terminology and concepts in “Horizontally Scaling for Data Volume with Sharding.”

A cluster namespace must be created for each namespace that contains one or more source tables, called a source namespace. This requirement applies to both cases where each namespace resides on a separate instance and to cases where multiple source namespaces reside on the same instance; in the latter case, multiple cluster namespaces on the same instance need to be configured so that each source namespace has a corresponding cluster namespace.

The following image shows the relationships between federated tables, source tables, cluster namespaces, and source namespaces.

On three instances of a sharded cluster, source tables project into a federated table accessible through the cluster namespac

Creating a Federated Table

Once you have configured a sharded cluster, creating a federated table consists of two parts: defining the federated table and connecting source tables to the federated table.

Defining a Federated Table

From the master namespace of the sharded cluster, you can create a federated table by using $SYSTEM.Sharding.CreateFederatedTable(). See complete documentation on how to use this method in the Class Reference.

CreateFederatedTable() registers the federated table definition in the cluster namespace that represents the federated table and defines how columns from a source table are projected into the federated table. As part of the API call, columns from the source table may be left out of the federated table or dummy columns with a constant value may be added (for example, to identify the source namespace). This method also attaches the initial source table from the specified source namespace, which is stored on the same instance. A source namespace is never impacted by the existence of a federated table that any of its tables projects to.

There are a few notes about federated table definitions:

  • Private properties in a source table are not projected to a federated table unless explicitly listed as an argument to CreateFederatedTable().

  • The types of source fields do not need to match the types of the corresponding fields in the federated tables. Implicit ype conversions are identical to those used in UNION clauses and return a data type with the highest precedence as follows: VARCHAR, DOUBLE, NUMERIC, BIGINT, INTEGER, SMALLINT, TINYINT.

  • The SqlRowIdPrivate class keyword is determined by the source table and the ID is projected with the name from the source table. If this keyword is defined, the RowID is not projected to the federated table. Note that tables created via DDL specify SqlPrivateRowId by default.

The newly created federated table is immediately accessible from any cluster namespace in the sharded cluster.

At this point, the newly created federated table only projects data from the source table in the source namespace that was referred to in the call to CreateFederatedTable(). See Connecting Source Tables to a Federated Table below for information about including data from source table in other source namespaces.

Examples

The following example demonstrates a call to CreateFederatedTable() from the terminal that creates a federated table, Hospital.Employees, in the IRISCLUSTER namespace (the default name for a cluster namespace). This federated table uses Employees.Doctors from the USER namespace as an initial source table. The federated table’s columns are identical to those of the source table.

do $SYSTEM.Sharding.CreateFederatedTable(,"Hospital.Employees", "USER", "Employees.Doctors")

The following example creates a federated table, Hospital.DiagnosisLog, in the IRISCLUSTER cluster namespace. This federated table uses Hospital.Patient from the HOSPITAL namespace as an initial source table and projects the Diagnosis and DateAdmitted columns from the source table. In this case, the DateAdmitted column is projected as AdmissionDate to the federated table. The resulting federated table in this example has only two columns.

do $SYSTEM.Sharding.CreateFederatedTable("IRISCLUSTER","Hospital.DiagnosisLog", "HOSPITAL", "Hospital.Patient", $lb($lb("Diagnosis"), $lb("DateAdmitted","AdmissionDate")))

Connecting Source Tables to a Federated Table

After creating the federated table, connect source tables from other source namespaces to it. To do so, use the $SYSTEM.Sharding.ConnectFederatedTable() method in each source namespace. See complete documentation for this method in the Class Reference.

When a source table is connected to the federated table, any table statistics that have been collected by TUNE TABLE are reported to the federated table. For optimal query performance, you should run TUNE TABLE on a source table before connecting it to a federated table, so the table statistics are most accurate. If the source table needs to be re-tuned and you wish to propagate the updated statistics to the federated table, tune the source table, then re-connect it to the federated table, specifying 1 for the Force argument to ConnectFederatedTable().

If the source table is altered or dropped, the federated table definition is not automatically updated. This is a consequence of how federated tables treat the source namespace as read-only and cannot be triggered by any events in it. When a source table is altered, you will need to call ConnectFederatedTable() in the cluster namespace corresponding to the source namespace that has changed and specify the alterations, specifying 1 for the Force argument to ConnectFederatedTable(). For example, changing a column name in the source table requires you to re-specify the column names that are projected into the federated table. Note that this step is not required for any compatible changes, such as adding a column to the source table that does not need to be projected to the federated table.

When the source tables’ structure matches exactly between different source namespaces, calls to ConnectFederatedTable() are easy to script, as you can leverage the %SYSTEM.ShardWorkMgr.Broadcast() instance method to execute them once in each cluster namespace. See the Examples for further explanation.

Examples

The following example uses ConnectFederatedTable() to connect the Employees.Nurses source table in the USER namespace to the Hospital.Employees federated table in the IRISCLUSTER namespace.

do $SYSTEM.Sharding.CreateFederatedTable("IRISCLUSTER","Hospital.Employees","USER","Employees.Nurses")

The following example employs the %SYSTEM.ShardWorkMgr.Broadcast() instance method to connect source tables, called Employees.Nurses in the HOSPITAL namespace, to a newly created federated table, also called Employees.Nurses. Note that this approach requires each instance to have identically named source tables and source namespaces, due to the same arguments being broadcast across instances. The cluster namespace argument to ConnectFederatedTable() should be left blank.

zn "IRISCLUSTER"
set status=$SYSTEM.Sharding.CreateFederatedTable(,"Employees.Nurses","HOSPITAL","Employees.Nurses")
set shardManager=$SYSTEM.ShardWorkMgr.%New()
set status=shardManager.Broadcast("DS","##class(%SYSTEM.Sharding).ConnectFederatedTable",,"Employees.Nurses","HOSPITAL","Employees.Nurses")

Querying a Federated Table

Querying a federated table is transparently similar to querying a standard table. Federated tables can be queried with Dynamic SQL, Embedded SQL, or with a database driver, such as JDBC or ODBC.

Dropping or Disconnecting a Federated Table

To drop a federated table from your sharded cluster, use $SYSTEM.Sharding.DropFederatedTable(). This method, called from the cluster namespace of any instance, drops the federated table definition from all the cluster namespaces. Further attempts to query the federated table will fail, as it no longer exists. Dropping a federated table does not affect the source tables or their data.

To remove a single source table from a federated table, call $SYSTEM.Sharding.DisconnectFederatedTable() from the cluster namespace corresponding to the source namespace that contains the source table to be disconnected. This method disconnects a source table from the federated table. When a source table is disconnected from a federated table, queries on the federated table will no longer return data from that source table. The source table can later be re-connected to the federated table with ConnectFederatedTable().

FeedbackOpens in a new tab