Using Caché Objects
Using the Caché Populate Utility

Caché includes a utility for creating pseudo-random test data for persistent classes. The creation of such data is known as data population; the utility for doing this, known as the Caché populate utility, is useful for testing persistent classes before deploying them within a real application. It is especially helpful when testing how various parts of an application will function when working against a large set of data.

The populate utility takes its name from its principal element — the %Populate class, which is part of the Caché class library. Classes that inherit from %Populate contain a method called Populate(), which allows you to generate and save class instances containing valid data. You can also customize the behavior of the %Populate class to provide data for your needs.
Along with the %Populate class, the populate utility uses %PopulateUtils. %Populate provides the interface to the utility, while %PopulateUtils is a helper class.
Data Population Basics
To use the populate utility, do the following:
  1. Modify each persistent and each serial class that you want to populate with data. Specifically, add %Populate to the end of the list of superclasses, so that the class inherits the interface methods. For example, if a class inherits directly from %Persistent, its new superclass list would be:
    Class MyApp.MyClass Extends (%Persistent,%Populate) {}
    Do not use %Populate as a primary superclass; that is, do not list it as the first class in the superclass list.
    Alternatively, when using the New Class Wizard in Studio, select Data Population on the last screen. This is equivalent to adding the %Populate class to the superclass list.
  2. In those classes, optionally specify the POPSPEC and POPORDER parameters of each property to control how the populate utility generates data for it. Do this only if you want custom data rather than the default data, which is described in “Default Behavior.”
    Later sections of this appendix provide information on these parameters.
  3. Recompile the classes.
  4. To generate the data, call the Populate() method of each persistent class. By default, this method generates 10 records for the class (including any serial objects that it references):
     Do ##class(MyApp.MyClass).Populate()
    If you prefer, you can specify the number of objects to create:
     Do ##class(MyApp.MyClass).Populate(num)
    where num is the number of objects that you want.
    Do this in the same order in which you would add records manually for the classes. That is, if Class A has a property that refers to Class B, use the following table to determine which class to populate first:

    | If the property in Class A has this form... | And Class B inherits from... | Populate this class first... |
    |---|---|---|
    | Property PropertyName as ClassB; | %SerialObject | ClassA (this populates ClassB automatically) |
    | Property PropertyName as List of ClassB; | %SerialObject | ClassA (this populates ClassB automatically) |
    | Property PropertyName as Array of ClassB; | %SerialObject | ClassA (this populates ClassB automatically) |
    | Property PropertyName as ClassB; | %Persistent | ClassB |
    | Property PropertyName as List of ClassB; | %Persistent | ClassB |
    | Property PropertyName as Array of ClassB; | %Persistent | ClassB |
    | Relationship PropertyName as ClassB [ Cardinality = one ...]; | either | ClassB |
    | Relationship PropertyName as ClassB [ Cardinality = parent ...]; | either | ClassB |
    | Relationship PropertyName as ClassB [ Cardinality = many...]; | either | ClassA |
    | Relationship PropertyName as ClassB [ Cardinality = child ...]; | either | ClassA |
Later, to remove the generated data, use either the %DeleteExtent() method (safe) or the %KillExtent() method (fast) of the persistent interface. For more information, see “Deleting Saved Objects” in the chapter “Working with Persistent Objects.”
Tip:
In practice, it is often necessary to populate classes repeatedly, as you make changes to your code. Thus it is useful to write a method or a routine to populate classes in the correct order, as well as to remove the generated data.
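One way to follow this tip is sketched below. The class name MyApp.TestData and the two populated classes, MyApp.ClassA and MyApp.ClassB, are hypothetical. The sketch assumes that ClassA has a reference property of type ClassB and that both classes extend %Persistent and %Populate.

```objectscript
/// Hypothetical helper class for rebuilding test data.
Class MyApp.TestData Extends %RegisteredObject
{

/// Removes previously generated data, then repopulates in dependency order.
/// Assumes MyApp.ClassA refers to MyApp.ClassB (both are illustrative names).
ClassMethod Rebuild(count As %Integer = 10) As %Status
{
    // Remove old data. %KillExtent() is the fast option; %DeleteExtent() is the safe one.
    Set sc = ##class(MyApp.ClassA).%KillExtent()
    If $$$ISERR(sc) { Quit sc }
    Set sc = ##class(MyApp.ClassB).%KillExtent()
    If $$$ISERR(sc) { Quit sc }

    // Populate the referenced class first, then the class that refers to it.
    Set createdB = ##class(MyApp.ClassB).Populate(count)
    Set createdA = ##class(MyApp.ClassA).Populate(count)
    Write "ClassB: ", createdB, ", ClassA: ", createdA, !
    Quit $$$OK
}

}
```

After each code change, a single call such as Do ##class(MyApp.TestData).Rebuild(500) would then reset the test data.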
Populate() Details
Formally, the Populate() class method has the following signature:
classmethod Populate(count As %Integer = 10, 
                     verbose As %Integer = 0, 
                     DeferIndices As %Integer = 1, 
                     ByRef objects As %Integer = 0, 
                     tune As %Integer = 1) as %Integer
Where:
Populate() returns the number of objects actually populated:
 Set objs = ##class(MyApp.MyClass).Populate(100)
 // objs is set to the number of objects created.
 // objs will be less than or equal to 100
In cases with defined constraints, such as a minimum or maximum length, some of the generated data may not pass validation, so that individual objects will not be saved. In these situations, Populate() may create fewer than the specified number of objects.
If errors prevent objects from being saved 1000 times in a row, with no successful saves in between, Populate() quits.
Default Behavior
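This section describes how the Populate() method generates data, by default, for the following kinds of properties: literal properties, collection properties, properties that refer to serial objects, properties that refer to persistent objects, and relationship properties.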
The Populate() method ignores stream properties.
Literal Properties
This section describes how the Populate() method, by default, generates data for properties of the forms:
Property PropertyName as Type;
Property PropertyName;
Where Type is a datatype class.
For these properties, the Populate() method first looks at the name. Some property names are handled specially, as follows:
| If the property name is any case variation of the following | Populate() invokes the following method to generate data for it |
|---|---|
| NAME | Name() |
| SSN | SSN() |
| COMPANY | Company() |
| TITLE | Title() |
| PHONE | USPhone() |
| CITY | City() |
| STREET | Street() |
| ZIP | USZip() |
| MISSION | Mission() |
| STATE | USState() |
| COLOR | Color() |
| PRODUCT | Product() |
If the property does not have one of the preceding names, then the Populate() method looks at the property type and generates suitable values. For example, if the property type is %String, the Populate() method generates random strings (respecting the MAXLEN parameter of the property). For another example, if the property type is %Integer, the Populate() method generates random integers (respecting the MINVAL and MAXVAL parameters of the property).
If the property does not have a type, Caché assumes that it is a string. This means that the Populate() method generates random strings for its values.
Exceptions
The Populate() method does not generate data for a property if the property is private, is multidimensional, is calculated, or has an initial expression.
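As an illustration of these defaults, consider the following hypothetical class. MyApp.Contact and all of its property names are made up for this sketch. The description before each property states what the rules above imply for it.

```objectscript
Class MyApp.Contact Extends (%Persistent, %Populate)
{

/// Special name: generated by Name().
Property Name As %String;

/// Special name in any case variation: generated by USPhone().
Property phone As %String;

/// No special name: random string of at most 40 characters.
Property Notes As %String(MAXLEN = 40);

/// No special name: random integer from 0 to 120.
Property Age As %Integer(MAXVAL = 120, MINVAL = 0);

/// No type, so treated as a string: random string.
Property Nickname;

/// Not populated, because it has an initial expression.
Property Status As %String [ InitialExpression = "new" ];

/// Not populated, because it is private.
Property Secret As %String [ Private ];

}
```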
Collection Properties
This section describes how the Populate() method, by default, generates data for properties of the forms:
Property PropertyName as List of Classname;
Property PropertyName as Array of Classname;
For such properties:
Properties That Refer to Serial Objects
This section describes how the Populate() method, by default, generates data for properties of the form:
Property PropertyName as SerialObject;
Where SerialObject is a class that inherits from %SerialObject.
For such properties:
Properties That Refer to Persistent Objects
This section describes how the Populate() method, by default, generates data for properties of the following form:
Property PropertyName as PersistentObject;
Where PersistentObject is a class that inherits from %Persistent.
For such properties:
For information on relationships, see the next section.
Relationship Properties
This section describes how the Populate() method, by default, generates data for properties of the following form:
Relationship PropertyName as PersistentObject;
Where PersistentObject is a class that inherits from %Persistent.
For such properties:
Specifying the POPSPEC Parameter
For a given property in a class that extends %Populate, you can customize how the Populate() method generates data for that property. To do so, do the following:
The POPSPEC parameter provides additional options for list and array properties, discussed in later subsections.
For a literal, non-collection property, another technique is to identify an SQL table column that contains values to use for this property, and then specify the POPSPEC parameter to refer to that column; see “Specifying the POPSPEC Parameter via an SQL Table.”
Note:
There is also a POPSPEC parameter defined at the class level that controls data population for an entire class. This is an older mechanism (included for compatibility) that is replaced by the property-specific POPSPEC parameter. This appendix does not discuss it further.
Specifying the POPSPEC Parameter for Non-Collection Properties
For a literal property that is not a collection, use one of the following variations:
For example:
Property HomeCity As %String(POPSPEC = "City()");
If you need to pass a string value as an argument to the given method, double the starting and closing quotation marks around that string. For example:
Property PName As %String(POPSPEC = "Name(""F"")");
Also, you can append a string to the value returned by the specified method. For example:
Property JrName As %String(POPSPEC = "Name()_"" jr."" ");
Notice that it is necessary to double the starting and closing quotation marks around that string. It is not possible to prepend a string, because the POPSPEC is assumed to start with a method.
Also see Specifying the POPSPEC Parameter via an SQL Table for a different approach.
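The following sketch combines these options in one hypothetical class. MyApp.Student, its properties, and the RandomGrade() method are illustrative names only. The leading period in ".RandomGrade()" follows the form that this appendix uses in its list property example; the sketch assumes that it calls an instance method in the same way for a non-collection property.

```objectscript
Class MyApp.Student Extends (%Persistent, %Populate)
{

/// Uses the City() method of %PopulateUtils.
Property HomeCity As %String(POPSPEC = "City()");

/// Picks one value from a comma-delimited list via ValueList().
Property Shirt As %String(POPSPEC = "ValueList("",Red,Green,Blue"")");

/// Assumption: the leading period calls an instance method of this class.
Property Grade As %String(POPSPEC = ".RandomGrade()");

/// Hypothetical generator: returns one of A, B, C, or D at random.
Method RandomGrade() As %String
{
    // $RANDOM(4) returns 0 through 3; $PIECE positions start at 1.
    Quit $PIECE("A,B,C,D", ",", $RANDOM(4) + 1)
}

}
```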
Specifying the POPSPEC Parameter for List Properties
For a property that is a list of literals or objects, you can use the following variation:
POPSPEC="basicspec:MaxNo"
Where
For example:
Property MyListProp As list Of %String(POPSPEC = ".MyInstanceMethod():15");
You can omit basicspec. For example:
Property Names As list of Name(POPSPEC=":3");
In the following examples, there are lists of several types of data. Colors is a list of strings, Kids is a list of references to persistent objects, and Addresses is a list of embedded objects:
Property Colors As list of %String(POPSPEC="ValueList("",Red,Green,Blue"")");

Property Kids As list of Person(POPSPEC=":5");

Property Addresses As list of Address(POPSPEC=":3");
To generate data for the Colors property, the Populate() method calls the ValueList() method of the %PopulateUtils class. Notice that this example passes a comma-separated list as an argument to this method. For the Kids property, there is no specified method, which results in automatically generated references. For the Addresses property, the serial Address class inherits from %Populate and data is automatically populated for instances of the class.
Specifying the POPSPEC Parameter for Array Properties
For a property that is an array of literals or objects, you can use the following variation:
POPSPEC="basicspec:MaxNo:KeySpecMethod"
Where:
The following examples show arrays of several types of data and different kinds of keys:
Property Tix As array of %Integer(POPSPEC="Integer():20:Date()");

Property Reviews As array of Review(POPSPEC=":3:Date()");

Property Actors As array of Actor(POPSPEC=":15:Name()");
The Tix property has its data generated using the Integer() method of the %PopulateUtils class; its keys are generated using the Date() method of the %PopulateUtils class. The Reviews property has no specified method, which results in automatically generated references, and has its keys also generated using the Date() method. The Actors property has no specified method, which results in automatically generated references, and has its keys generated using the Name() method of the %PopulateUtils class.
Specifying the POPSPEC Parameter via an SQL Table
For POPSPEC, rather than specifying a method that returns a random value, you can specify an SQL table name and an SQL column name to use. If you do so, then the Populate() method constructs a dynamic query to return the distinct column values from that column of that table. For this variation of POPSPEC, use the following syntax:
POPSPEC=":MaxNo:KeySpecMethod:SampleCount:Schema_Table:ColumnName"
Where:
For example:
Property P1 As %String(POPSPEC=":::100:Wasabi_Data.Outlet:Phone");
In this example, the property P1 receives a random value from a list of 100 phone numbers retrieved from the Wasabi_Data.Outlet table.
Basing One Generated Property on Another
In some cases, the set of suitable values for one property (A) might depend upon the existing value of another property (B). In such a case:
How %Populate Works
This section describes how %Populate works internally. The %Populate class contains two method generators: Populate() and PopulateSerial(). Each persistent or serial class inheriting from %Populate has one or the other of these two methods included in it (as appropriate).
We will describe only the Populate method here. The Populate() method is a loop, which is repeated for each of the requested number of objects.
Inside the loop, the code:
  1. Creates a new object
  2. Sets values for its properties
  3. Saves and closes the object
A simple property with no overriding POPSPEC parameter has a value generated using code with the form:
 Set obj.Description = ##class(%PopulateUtils).String(50)
A property that uses a library method from %PopulateUtils via a “Name:Name()” specification would instead have its value generated using code of the form:
 Set obj.Name = ##class(%PopulateUtils).Name()
An embedded Home property might create code like:
 Do obj.HomeSetObject(obj.Home.PopulateSerial())
The generator loops through all the properties of the class, and creates code for some of the properties, as follows:
  1. It checks if the property is private, is calculated, is multidimensional, or has an initial expression. If any of these are true, the generator exits.
  2. If the property has a POPSPEC override, the generator uses that and then exits.
  3. If the property is a reference, on the first time through the loop, the generator builds a list of random IDs, takes one from the list, and then exits. For the subsequent passes, the generator simply takes an ID from the list and then exits.
  4. If the property name is one of the specially handled names, the generator then uses the corresponding library method and then exits.
  5. If the generator can generate code based on the property type, it does so and then exits.
  6. Otherwise, the generator sets the property to an empty string.
Refer to the %PopulateUtils class for a list of available methods.
Custom Populate Actions and the OnPopulate() Method
For additional control over the generated data, you can define an OnPopulate() instance method. If an OnPopulate() method is defined, the Populate() method calls it for each object that it generates, after assigning values to the properties but before the object is saved to disk.
Its signature is:
Method OnPopulate() As %Status 
{
    // body of method here...
}
Note:
This is not a private method.
The method returns a %Status code, where a failure status causes the instance being populated to be discarded.
For example, if you have a stream property, Memo, and wish to assign a value to it when populating, you can provide an OnPopulate() method:
Method OnPopulate() As %Status
{
    Do ..Memo.Write("Default value")
    Quit $$$OK
}
You can override this method in subclasses of %Library.Populate.
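Because a failure status discards the instance, OnPopulate() can also act as a filter. The sketch below assumes a hypothetical class with two %Date properties, StartDate and EndDate, and rejects any generated instance whose dates are out of order.

```objectscript
Method OnPopulate() As %Status
{
    // StartDate and EndDate are hypothetical %Date properties of this class.
    // Returning an error status causes Populate() to discard this instance.
    If (..EndDate '= "") && (..EndDate < ..StartDate) {
        Quit $$$ERROR($$$GeneralError, "EndDate precedes StartDate")
    }
    Quit $$$OK
}
```

With a filter like this, Populate() may return a count lower than the number requested, since discarded instances are not saved.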
Alternative Approach: Creating a Utility Method
There is another way to use the methods of the %Populate and %PopulateUtils classes. Rather than using %Populate as a superclass, write a utility method that generates data for your classes.
In this code, for each class, iterate a desired number of times. In each iteration:
  1. Create a new object.
  2. Set each property using a suitable random (or nearly random) value.
    To generate data for a property, call a method of %Populate or %PopulateUtils or use your own method.
  3. Save the object.
As with the standard approach, it is necessary to generate data for independent classes before generating it for the dependent classes.
For examples of this approach, see the two DeepSee samples in the SAMPLES database, contained in the DeepSee and HoleFoods packages.
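A minimal sketch of such a utility method follows. MyApp.Person and its Name, City, and Age properties are hypothetical, and the class is assumed not to extend %Populate; the values come from class methods of %PopulateUtils.

```objectscript
/// Hypothetical utility: creates count MyApp.Person objects
/// and returns how many were saved successfully.
ClassMethod GeneratePeople(count As %Integer = 10) As %Integer
{
    Set saved = 0
    For i = 1:1:count {
        // 1. Create a new object.
        Set person = ##class(MyApp.Person).%New()
        // 2. Set each property using a suitable random value.
        Set person.Name = ##class(%PopulateUtils).Name()
        Set person.City = ##class(%PopulateUtils).City()
        Set person.Age = ##class(%PopulateUtils).Integer(18, 90)
        // 3. Save the object; count it only if the save succeeded.
        If $$$ISOK(person.%Save()) { Set saved = saved + 1 }
    }
    Quit saved
}
```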
Tips for Building Structure into the Data
In some cases, you might want to include certain values for only a percentage of the cases. You can use the $RANDOM function to do this.
In DeepSee.Populate, the sample method RandomTrue() returns true or false randomly, depending on a cutoff percentage that you provide as an argument. So, for example, it can return true 10% of the time or 75% of the time. (Internally, this method uses $RANDOM.)
When you generate data for a property, you can use this method to determine whether or not to assign a value:
 If ##class(DeepSee.Populate).RandomTrue(15) {
     Set object.property="something"
 }
In the example shown here, approximately 15 percent of the records will have the given value for this property.
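The source of RandomTrue() is not reproduced in this appendix. A method with the behavior described above could be written roughly as follows; this is an illustrative equivalent, not the sample's actual code.

```objectscript
/// Returns true approximately cutoff percent of the time (illustrative sketch).
ClassMethod RandomTrue(cutoff As %Numeric) As %Boolean
{
    // $RANDOM(100) returns an integer from 0 through 99, so for an
    // integer cutoff, exactly cutoff of the 100 possible values are below it.
    Quit ($RANDOM(100) < cutoff)
}
```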
In other cases, you might need to simulate a distribution. To do so, set up and use a lottery system. For example, suppose that 1/4 of the values should be A, 1/4 of the values should be B, and 1/2 the values should be C. The logic for the lottery can go like this:
  1. Choose an integer from 1 to 100, inclusive.
  2. If the number is 25 or less, return value A.
  3. If the number is between 26 and 50, inclusive, return value B.
  4. Otherwise, return value C.
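This lottery can be sketched as a class method. The method name and the return values are placeholders, and the cutoffs assume that 1 through 25 map to A and 26 through 50 map to B, so that the shares are exactly 1/4, 1/4, and 1/2.

```objectscript
/// Returns "A" or "B" about a quarter of the time each, and "C" about half the time.
ClassMethod Lottery() As %String
{
    // $RANDOM(100) returns 0 through 99; adding 1 gives 1 through 100.
    Set ticket = $RANDOM(100) + 1
    If ticket <= 25 { Quit "A" }
    If ticket <= 50 { Quit "B" }
    Quit "C"
}
```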