Creating a Custom Locale
This example provides a template for creating a custom locale with a custom table. The custom table translates between EBCDIC (the common form used in the US) and Latin-1 (ISO-8859-1). For more details, see the documentation for the respective classes.
As for any other table, first we need to get the definition for the character mappings. For this example we are using the data file from the web site http://source.icu-project.org (International Components for Unicode). The relevant data file is a text file with comment lines starting with a pound sign (#), followed by a series of translation definition lines.
A small excerpt of the file looks like:
#
#UNICODE EBCDIC_US
#_______ _________
<U0000> \x00 |0
<U0001> \x01 |0
<U0002> \x02 |0
<U0003> \x03 |0
<U0004> \x37 |0
<U0005> \x2D |0
...
The lines indicate that Unicode character Uaaaa maps to EBCDIC character \xbb (where aaaa and bb are expressed in hexadecimal). We assume that the table is reversible and that EBCDIC character \xbb maps back to Unicode character Uaaaa. This allows us to create both sides (that is, EBCDIC-to-Latin1 and Latin1-to-EBCDIC) from the same data file in a single scan. Because the Unicode range is just from 0 to 255, this is actually a Latin-1 table.
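As a rough illustration of how one definition line breaks down, the following sketch parses a single sample line copied from the excerpt above. The variable names are chosen for illustration only, and the sketch assumes that $ZHEX() receives string arguments, which it reads as hexadecimal.

```objectscript
 // Sample definition line copied from the excerpt above
 Set line = "<U0004> \x37 |0"
 // Characters 3 through 6 hold the four hex digits of the Unicode code point
 Set uni = $ZHEX($Extract(line, 3, 6))
 // The two hex digits that follow "\x" hold the EBCDIC code
 Set ebcdic = $ZHEX($Extract($Piece(line, "\x", 2), 1, 2))
 // One line yields both directions of the mapping
 Write !, "Latin-1 ", uni, " -> EBCDIC ", ebcdic
 Write !, "EBCDIC ", ebcdic, " -> Latin-1 ", uni
```

Hexadecimal 37 is decimal 55, so this one line contributes the pair 4 and 55 to both tables.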
The process first creates the SubTable object, then the Table, and finally the Locale. For the first step, the process creates two SubTables objects, initializes their Name and Type properties, and then fills in the FromTo mapping array with data read from the definition file.
SubTable names take the form Type-FromEncoding-ToEncoding. The Type for regular I/O translations is “XLT” and so the SubTable names will be XLT-Latin1-yEBCDIC and XLT-yEBCDIC-Latin1.
The following code creates the SubTables objects. In a real-world program, the code would perform a number of consistency checks that are omitted here for the sake of clarity. This example deletes any existing previous versions of the same objects (SubTables, Tables and Locales) so that you can run the example multiple times. More properly, you should check for the existence of previous objects using the class method Exists() and take a different action if they are already present.
// Names for the new SubTables (save for later)
Set nam1 = "XLT-Latin1-yEBCDIC"
Set nam2 = "XLT-yEBCDIC-Latin1"
// Delete existing SubTables instances with same ids
Do ##class(Config.NLS.SubTables).Delete(nam1)
Do ##class(Config.NLS.SubTables).Delete(nam2)
// Create two SubTable objects
Set sub1 = ##class(Config.NLS.SubTables).%New()
Set sub2 = ##class(Config.NLS.SubTables).%New()
// Set Name and Description
Set sub1.Name = nam1
Set sub1.Description = "ICU Latin-1->EBCDIC sub-table"
Set sub2.Name = nam2
Set sub2.Description = "ICU EBCDIC ->Latin-1 sub-table"
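The existence check recommended above, in place of an unconditional delete, might look like the following minimal sketch. It relies on the Exists() class method named in the text and passes only the object name. Whether your version accepts additional arguments, and what action to take when the object is found, are left as assumptions to verify against the class reference.

```objectscript
 // Hypothetical alternative to the unconditional Delete() used in this example
 Set nam1 = "XLT-Latin1-yEBCDIC"
 If ##class(Config.NLS.SubTables).Exists(nam1)
 {
     // Illustrative action only: report the conflict and stop
     Write !, "SubTable ", nam1, " already exists; leaving it untouched."
     Quit
 }
 Write !, "SubTable ", nam1, " not found; safe to create."
```

The same pattern would apply to the Tables and Locales objects created later in this example.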
The SubTables object contains a property, Type, that is a small integer indicating whether we are dealing with a multibyte translation or not. This example sets Type to zero, indicating a single-byte mapping. The mapping is initialized so that code points (characters) not defined in the data file are mapped to themselves.
// Set Type (single-to-single)
Set sub1.Type = 0
Set sub2.Type = 0
// Initialize FromTo arrays
For i = 0 : 1 : 255
{
    Do sub1.FromTo.SetAt(i, i)
    Do sub2.FromTo.SetAt(i, i)
}
Next the application reads the file. Definitions in the file override those set as the default mapping. The function $ZHEX() converts the codes from hexadecimal to decimal.
// Assume file is in the mgr directory
Set file = "glibc-EBCDIC_US-2.1.2.ucm"
// Set EOF exit trap
Set $ZTRAP = "EOF"
// Make that file the default device
Open file
Use file
For
{
    Read x
    If x?1"<U"4AN1">".E
    {
        // Take the EBCDIC digits from after "\x" so column spacing does not matter
        Set uni = $ZHEX($E(x,3,6)),ebcdic = $ZHEX($E($P(x,"\x",2),1,2))
        Do sub1.FromTo.SetAt(ebcdic,uni)
        Do sub2.FromTo.SetAt(uni,ebcdic)
    }
}
EOF // No further data
Set $ZT = ""
Close file
// Save SubTable objects
Do sub1.%Save()
Do sub2.%Save()
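One of the consistency checks omitted above is confirming that the two mappings really are inverses of each other. The following sketch is one possible check, not part of the original procedure. It assumes the sub1 and sub2 objects populated by the preceding code are still in scope and that the FromTo array supports GetAt() alongside the SetAt() calls already used.

```objectscript
 // Illustrative check: count Latin-1 code points that do not round-trip
 Set bad = 0
 For i = 0 : 1 : 255
 {
     // Latin-1 -> EBCDIC, then EBCDIC -> Latin-1
     Set e = sub1.FromTo.GetAt(i)
     Set back = sub2.FromTo.GetAt(e)
     If back '= i
     {
         Set bad = bad + 1
         Write !, "Latin-1 ", i, " -> EBCDIC ", e, " -> Latin-1 ", back
     }
 }
 Write !, bad, " code point(s) failed the round trip"
```

A nonzero count would suggest that the data file does not cover all 256 code points, because entries left at their identity default can collide with entries loaded from the file.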
The character mappings are now complete. The next step is to create the Table objects that reference the SubTables objects just defined. Table objects are really descriptors for the SubTables and have only a few properties. The following code makes the connection between the two:
// Delete existing Tables instances with same ids
Do ##class(Config.NLS.Tables).Delete("XLT", "Latin1", "yEBCDIC")
Do ##class(Config.NLS.Tables).Delete("XLT", "yEBCDIC", "Latin1")
// Create two Table objects
Set tab1 = ##class(Config.NLS.Tables).%New()
Set tab2 = ##class(Config.NLS.Tables).%New()
// Set description
Set tab1.Description = "ICU loaded Latin-1 -> EBCDIC table"
Set tab2.Description = "ICU generated EBCDIC -> Latin-1 table"
// Set From/To encodings
Set tab1.NameFrom = "Latin1"
Set tab1.NameTo = "yEBCDIC"
Set tab2.NameFrom = "yEBCDIC"
Set tab2.NameTo = "Latin1"
// Set SubTable
Set tab1.SubTableName = nam1
Set tab2.SubTableName = nam2
// Set Type
Set tab1.Type = "XLT"
Set tab2.Type = "XLT"
// Set Default Action
// 1 = Replace with replacement value
Set tab1.XLTDefaultAction = 1
Set tab2.XLTDefaultAction = 1
// Set Replacement value of "?"
Set tab1.XLTReplacementValue = $ASCII("?")
Set tab2.XLTReplacementValue = $ASCII("?")
// Set Reversibility
// 1 = Reversible
// 2 = Generated
Set tab1.XLTReversibility = 1
Set tab2.XLTReversibility = 2
// Set Translation Type
// 0 = non-modal to non-modal
Set tab1.XLTType = 0
Set tab2.XLTType = 0
// Save Table objects
Do tab1.%Save()
Do tab2.%Save()
With the Tables defined, the last step of the construction is to define a Locale object that will incorporate the new tables. One approach is to create an empty Locale object and fill in each of the properties, as was done for the Tables and SubTables. A Locale, however, is bigger and more complex, so the easiest way to make a simple change like this is to copy an existing locale and change only what we need. This process uses enu8 as the source locale and names the new one yen8. The initial y makes it clear this is a custom locale and should not be deleted on upgrades.
// Delete existing Locales instance with the same id
Do ##class(Config.NLS.Locales).Delete("yen8")
// Open source locale
Set oldloc = ##class(Config.NLS.Locales).%OpenId("enu8")
// Create clone
Set newloc = oldloc.%ConstructClone()
// Set new Name and Description
Set newloc.Name = "yen8"
Set newloc.Description = "New locale with EBCDIC table"
With the locale in place, the process now adds the EBCDIC table to the list of I/O tables that are loaded at startup. This is done by inserting a node in the array property XLTTables, as follows:
XLTTables(<TableName>) = <components>
- tablename identifies the pair of input and output tables for this locale. Because the name does not need to start with y, we use EBCDIC.
- components is a four-item list as follows:
  - The input “From” encoding
  - The input “To” encoding
  - The output “From” encoding
  - The output “To” encoding
The following code adds the table to the locale's list of I/O tables:
// Add new table to locale
Set component = $LISTBUILD("yEBCDIC", "Latin1", "Latin1", "yEBCDIC")
Do newloc.XLTTables.SetAt(component, "EBCDIC")
// Save the new locale so that it can be found by name in the steps below
Do newloc.%Save()
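To make the layout of the four-item list concrete, the following sketch rebuilds the same list and unpacks it by position. It is for illustration only and does not touch the locale object.

```objectscript
 // Same four-item list as stored under the "EBCDIC" node
 Set component = $ListBuild("yEBCDIC", "Latin1", "Latin1", "yEBCDIC")
 // Items 1 and 2 describe the input table, items 3 and 4 the output table
 Write !, "Input:  ", $List(component, 1), " -> ", $List(component, 2)
 Write !, "Output: ", $List(component, 3), " -> ", $List(component, 4)
```

Read this way, input translates from yEBCDIC to Latin1 and output translates from Latin1 to yEBCDIC, matching the two Tables defined earlier.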
Before the locale is usable by InterSystems IRIS, it must be compiled into its internal form. This is also sometimes called validating the locale. The IsValid() class method does a detailed analysis and returns two arrays, one for errors and one for warnings, with human-readable messages if the locale is not properly defined.
// Check locale consistency
If '##class(Config.NLS.Locales).IsValid("yen8", .Errors, .Warns)
{
    Write !,"Errors: "
    ZWrite Errors
    Write !,"Warnings: "
    ZWrite Warns
    Quit
}
// Compile new locale
Set status = ##class(Config.NLS.Locales).Compile("yen8")
If (##class(%SYSTEM.Status).IsError(status))
{
    Do $System.OBJ.DisplayError(status)
}
Else
{
    Write !,"Locale yen8 successfully created."
}