Designing Schemas with SchemaBuilder

SchemaBuilder is a utility that provides simple calls to construct schema definitions, which are returned as JSON strings. All construction methods are static and calls can be nested. Field types can be specified directly or inferred from Java types, classes, or objects.

There are several ways to create schemas. One of the easiest is to use SchemaBuilder.infer() to construct a schema from sample data and corresponding field names. For example:

  Object[] values = new Object[]{"Apple", 2};
  String[] labels = new String[]{"item","count"};
  String schemaFruit = SchemaBuilder.infer(values,"Test.Demo.Fruit",labels);

Schemas can also be designed using SchemaBuilder.record(), which provides builder methods such as addField() to create schema components. Builder methods can be chained until complete() is called to return the JSON schema string.

The following call to record() produces a JSON schema string identical to the one produced by infer():

  String schemaFruit = SchemaBuilder.record()
    .withName("Test.Demo.Fruit")
    .addField("item", "string")
    .addField("count", "int")
    .complete();

Both of the examples above will return the same JSON schema string (line breaks added for clarity):

  {"type":"record",
  "namespace":"Test.Demo",
  "name":"Fruit",
  "category":"persistent",
  "fields":[
    {"name":"item","type":"string"},
    {"name":"count","type":"int"}
  ]}

This schema contains the following components:

  • type — Schemas can have various types (in the example above, notice that each field is a schema with its own type), but Persister schemas will always be stored in the server Schema Registry as record types (class RecordSchema).

  • namespace — A schema name qualifier. Schema namespaces are unrelated to InterSystems IRIS database namespaces. A schema namespace is part of the qualified schema name, which also determines the fully qualified name of the corresponding ObjectScript class. For example, the Test.Demo.Fruit schema has namespace Test.Demo and name Fruit. The corresponding class would be Test.Demo.Fruit (package name Test.Demo and short class name Fruit). This class could be stored in any InterSystems IRIS namespace (for example, the USER namespace).

  • name — The unqualified schema name. This is the final part of the identifier you specify for the schema. For example, if you specify .withName("Test.Demo.Fruit"), the namespace will be Test.Demo, and the name will be Fruit. The corresponding SQL table will be named Test_Demo.Fruit (see Table Names and Schema Names in Using InterSystems SQL for related information on naming conventions).

  • category — Since InterSystems IRIS stores records as serialized instances of ObjectScript classes, Persister schemas need to differentiate between classes that extend %Library.Persistent and those that extend %Library.SerialObject (embedded objects). The default is persistent if no value is specified.

  • fields — Each field entry is specified as a schema with its own name and type. The Persister supports primitive types string, bytes, short, int, long, float, double, boolean, and null. Complex type declarations can include their own fields, nested as deep as necessary.
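
The relationship between a qualified schema name, its namespace and name components, and the SQL table name can be sketched in plain Java. The SchemaNames class and its methods below are hypothetical helpers written only for illustration; they are not part of SchemaBuilder or the Persister API. The underscore substitution is an assumption based solely on the Test_Demo.Fruit example above, and the actual SQL naming rules are covered in Using InterSystems SQL.

```java
/** Hypothetical illustration only; not part of the Persister API. */
final class SchemaNames {
    private SchemaNames() {}

    /** Returns everything before the last dot, or "" if the name is unqualified. */
    static String namespaceOf(String qualifiedName) {
        int lastDot = qualifiedName.lastIndexOf('.');
        return lastDot < 0 ? "" : qualifiedName.substring(0, lastDot);
    }

    /** Returns the final segment of the qualified name. */
    static String nameOf(String qualifiedName) {
        return qualifiedName.substring(qualifiedName.lastIndexOf('.') + 1);
    }

    /**
     * Assumption drawn from the Test_Demo.Fruit example: dots in the
     * namespace become underscores. No schema prefix is guessed for an
     * unqualified name.
     */
    static String sqlTableNameOf(String qualifiedName) {
        String namespace = namespaceOf(qualifiedName);
        String name = nameOf(qualifiedName);
        return namespace.isEmpty() ? name : namespace.replace('.', '_') + "." + name;
    }

    public static void main(String[] args) {
        String qualified = "Test.Demo.Fruit";
        System.out.println(namespaceOf(qualified));    // Test.Demo
        System.out.println(nameOf(qualified));         // Fruit
        System.out.println(sqlTableNameOf(qualified)); // Test_Demo.Fruit
    }
}
```

Splitting on the last dot mirrors the description above: the final segment is the name, and everything before it is the namespace.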

SchemaBuilder.record() provides an extra level of control when specifying field types. For example, a schema for the following object could be created by either infer() or record():

  Object[] data = new Object[5];
  data[0] = "Wilber";
  data[1] = true;
  data[2] = new Object[][] {{0,1},{2,3}};
  data[3] = java.util.UUID.randomUUID();
  data[4] = new java.util.Date();

It is very simple to create the schema by inference from data:

  String[] names = {"Name","isActive","Scores","MemberID","DateJoined"};
  String schemaJson = SchemaBuilder.infer(data,"Demo.ClubMember",names);

But the record() builder methods allow individual field types to be specified in several different ways. Types can still be inferred from data, but they can also be inferred from Java class names or specified directly as Java types. Some complex types such as dates and times can also be specified using logical type methods. The following example demonstrates all of these options:

  String schemaJson = SchemaBuilder.record()
    .withName("Demo.ClubMember")
    .addField("Name", SchemaBuilder.infer("java.lang.String"))
    .addField("isActive", SchemaBuilder.infer(true))
    .addField("Scores", java.lang.Integer[].class)
    .addField("MemberID", SchemaBuilder.uuid())
    .addField("DateJoined", SchemaBuilder.date())
    .complete();

In the first two fields, Name and isActive, types are determined by calling infer() on class name "java.lang.String" and value true. The Scores field directly specifies class type java.lang.Integer[].class. The last two fields use logical type methods uuid() and date(). This produces the following JSON schema string:

  {"type":"record",
  "name":"ClubMember",
  "namespace":"Demo",
  "category":"persistent",
  "fields":[
    {"name":"Name", "type":"string"},
    {"name":"isActive", "type":"boolean"},
    {"name":"Scores", "type":{"type":"array", "items":"int"}},
    {"name":"MemberID", "type":{"logicalType":"uuid", "type":"string"}},
    {"name":"DateJoined", "type":{"logicalType":"date", "type":"int"}}
  ]}

This schema is almost identical to the one that simple inference would produce; only the last field differs. The date() method produces a logical type for java.util.Date, while inference would produce a logical type for java.sql.Timestamp.
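
To make the idea of inferring a primitive type from a sample value more concrete, here is a toy sketch that maps Java values to the primitive type names listed earlier (string, bytes, short, int, long, float, double, boolean, and null). It is not SchemaBuilder's implementation: the PrimitiveTypeNames class is hypothetical, and apart from the string, int, and boolean cases shown in the examples above, the individual mappings are assumptions. Arrays and logical types such as uuid and date are deliberately left out.

```java
/** Hypothetical illustration only; not how SchemaBuilder.infer() is implemented. */
final class PrimitiveTypeNames {
    private PrimitiveTypeNames() {}

    /**
     * Maps a sample value to a primitive schema type name.
     * Throws for anything that is not one of the simple cases below.
     */
    static String typeNameOf(Object value) {
        if (value == null) return "null";
        if (value instanceof String) return "string";
        if (value instanceof Boolean) return "boolean";
        if (value instanceof Short) return "short";
        if (value instanceof Integer) return "int";
        if (value instanceof Long) return "long";
        if (value instanceof Float) return "float";
        if (value instanceof Double) return "double";
        if (value instanceof byte[]) return "bytes";
        throw new IllegalArgumentException(
            "No primitive type name for " + value.getClass().getName());
    }

    public static void main(String[] args) {
        // Same sample data as the Fruit example.
        Object[] values = {"Apple", 2};
        String[] labels = {"item", "count"};
        for (int i = 0; i < values.length; i++) {
            System.out.println(labels[i] + " -> " + typeNameOf(values[i]));
        }
    }
}
```

Values such as 2 are autoboxed to java.lang.Integer before the check, which is why a sample value of 2 maps to int rather than long in this sketch.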
