InterSystems IRIS for Health 2024.3

$ZCONVERT (ObjectScript)

Returns a converted value of the given string, given a conversion mode and optional arguments.

Synopsis

$ZCONVERT(string,mode,trantable,handle) 
$ZCVT(string,mode,trantable,handle)

Arguments

Argument Description
string The string to convert. It can be specified as a quoted string literal, a variable, or any expression that evaluates to a string.
mode A letter code specifying the conversion mode, either the type of case conversion or input/output encoding. Specify mode as a quoted string.
trantable Optional — The translation table to use, specified as either an integer or a quoted string.
handle Optional — An unsubscripted local variable that holds a string value. Used for multiple invocations of $ZCONVERT. The handle contains the remaining portion of string that could not be converted at the end of $ZCONVERT, and supplies this remaining portion to the next invocation of $ZCONVERT.

Description

$ZCONVERT converts a string from one form to another. The nature of the conversion depends on the arguments you use:

Two-Argument Form: Case Conversion

The two-argument form of $ZCONVERT returns a new string which differs from the input string only by case. For example:

USER>set string="abcdef"
 
USER>write $zconvert(string,"u")
ABCDEF
USER>write $zconvert(string,"l")
abcdef

The two-argument form has the syntax:

$ZCONVERT(string,mode) 
$ZCVT(string,mode)

The mode argument must evaluate to one of the following:

L or l

Lowercase translation: Convert all characters in string to lowercase. Conversion works on Unicode letters as well as ASCII letters. In some alphabets, a small number of letters only have a lowercase letter form. For example, the German eszett ($CHAR(223)) is only defined as a lowercase letter. Attempting to convert it to an uppercase letter results in the same lowercase letter:

  IF $ZCONVERT($CHAR(223),"U")=$ZCONVERT($CHAR(223),"L") {
      WRITE "uppercase and lowercase letter are the same" }
  ELSE {WRITE "uppercase and lowercase are different" }

For this reason, when converting alphanumeric strings to a single letter case it is always preferable to convert to lowercase.
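
One common use of this advice is case-insensitive comparison. The following is a minimal sketch; the variable names input and expected are illustrative and not part of the function's interface:

   SET input="Yes",expected="YES"
   // Normalize both sides to lowercase before comparing
   IF $ZCONVERT(input,"L")=$ZCONVERT(expected,"L") { WRITE "match" }
   ELSE { WRITE "no match" }

Because both strings reduce to "yes", this writes "match".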

Also see the $$$LOWER system macro.

U or u

Uppercase translation: Convert all characters in string to uppercase. Conversion works on Unicode letters as well as ASCII letters. The following example converts the Greek alphabet from lowercase to uppercase:

   FOR i=945:1:969 {WRITE $ZCONVERT($CHAR(i),"U")}

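You can perform similar letter case translations using the $TRANSLATE function, which converts only the letters you list explicitly. The following example converts the ASCII uppercase letters in text to lowercase:

  SET text="Hello World"
  WRITE $TRANSLATE(text,"ABCDEFGHIJKLMNOPQRSTUVWXYZ","abcdefghijklmnopqrstuvwxyz")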

Also see the $$$UPPER system macro.

T or t

Titlecase translation: Convert all characters in string to titlecase. Titlecase is only meaningful for those alphabets (principally Eastern European) that have three forms for a letter: uppercase, lowercase, and titlecase. For all other letters, titlecase translation is the same as uppercase translation.

Titlecase (“T”) mode converts every letter in the string to its titlecase form. Titlecase does not selectively uppercase letters based on their position in a word or string. Titlecase is the case that a letter is represented in when it is the first character of a word in a title. For standard Latin letters, the titlecase form is the same as the uppercase form.

Some languages (for example, Croatian) represent particular letters by two letter glyphs. For example, “lj” is a single letter in the Croatian alphabet. This letter has three forms: lowercase “lj”, uppercase “LJ”, and titlecase “Lj”. $ZCONVERT titlecase translation is used for this type of letter conversion.
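
The three forms can be inspected by code point. This sketch assumes a Unicode installation of InterSystems IRIS and uses the Unicode digraph characters U+01C7 (LJ), U+01C8 (Lj), and U+01C9 (lj); the expected values in the comments follow from the description above and are not output captured from a running system:

   SET lj=$CHAR(457)                     // U+01C9, lowercase digraph "lj"
   WRITE $ASCII($ZCONVERT(lj,"T")),!     // 456 would be U+01C8, titlecase "Lj"
   WRITE $ASCII($ZCONVERT(lj,"U")),!     // 455 would be U+01C7, uppercase "LJ"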

W or w

Word translation: Convert the first character of each word in string to uppercase. Any character preceded by a blank space, a quotation mark ("), an apostrophe ('), or an open parenthesis (() is considered the first character of a word. Word translation converts all other characters to lowercase. Word translation is locale specific; the above syntax rules for English may differ for other language locales.
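
As a brief illustration of the English rules just described (the sample text is arbitrary):

   // Each letter that follows a space or an open parenthesis starts a word
   WRITE $ZCONVERT("hELLO wORLD (again)","W")

Under these rules the result should be "Hello World (Again)"; other locales may treat the delimiters differently.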

“W” and “S” modes determine whether a non-blank character is the first character of a word or the first character of a sentence, and if that character is a letter, translate it to uppercase. All other letters are translated to lowercase. Case translation works on letters in any alphabet, as shown in the following example which converts Greek letters ($CHAR(945) is lowercase alpha; $CHAR(913) is uppercase alpha):

   SET greek=$CHAR(945,946,947,913,914,915)
   WRITE $ZCONVERT(greek,"W")

However the rules determining what constitutes a word or sentence are locale dependent. For example, the following example uses the Spanish inverted exclamation point $CHAR(161). The default (English) locale does not recognize this character as beginning a sentence or word. In this example, all letters in spanish are translated to lowercase:

   SET spanish=$CHAR(161)_"ola MuNdO! "_$CHAR(161)_"olA!"
   SET english="hElLo wOrLd! heLLo!"
   WRITE !,$ZCONVERT(english,"S")
   WRITE !,$ZCONVERT(spanish,"S")

S or s

Sentence translation: Convert the first character of each sentence in string to uppercase. The first non-blank character of string, and any character preceded by a period (.), question mark (?), or exclamation mark (!) is considered the first character of a sentence. (Blank spaces between the preceding punctuation character and the letter are ignored.) If this character is a letter, it is converted to uppercase. Sentence translation converts all other letter characters to lowercase. Sentence translation is locale specific; the above syntax rules for English may differ for other language locales.

See the comments for W mode.

A or a

Remove accents from a string.

AU or au

Remove accents from a string, then convert to upper case.

AL or al

Remove accents from a string, then convert to lower case.

For further case conversion options, including non-ASCII and customized case conversion, see System Classes for National Language Support.
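
The accent-removal modes can be compared side by side. This is a small sketch; the accented letter is built with $CHAR so that the example does not depend on how the source file is encoded:

   SET word="touch"_$CHAR(233)    // "touché"
   WRITE $ZCONVERT(word,"A"),!    // accents removed, case unchanged
   WRITE $ZCONVERT(word,"AU"),!   // accents removed, then uppercased
   WRITE $ZCONVERT(word,"AL"),!   // accents removed, then lowercased

Based on the mode descriptions above, the three lines should read touche, TOUCHE, and touche.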

Three-Argument Form: Encoding Translation, Escaping, and Unescaping

The three-argument form of $ZCONVERT returns a new string which is encoded differently or has been escaped or unescaped for use in a specific context (such as within a URL). The following example converts %Library.String for use within a URL:

USER>set string="%Library.String"
 
USER>write $zconvert(string,"o","URL")
%25Library.String

The three-argument form has the syntax:

$ZCONVERT(string,mode,trantable) 
$ZCVT(string,mode,trantable)

The mode argument must evaluate to one of the following:

O or o

Convert the input string to the encoding or format indicated by trantable.

I or i

Convert the input string from the encoding or format indicated by trantable.

The trantable argument can be:

  • An uppercase string value identifying an I/O translation table. In the preceding example, "URL" is a translation table. See Translation Tables for a list and details.

  • A string value specifying an I/O translation table defined by an NLS locale. For example, Latin2 or CP1252. See Translation Tables for a list and details.

  • A string value specifying a user-defined I/O translation table. A named table can be defined in a locale and points to one or two translation tables. Use a named table to define a specific system-to/from-device encoding.

  • An empty string ("") specifying the use of the default process I/O translation table. (For equivalent functionality, see the $$GetPDefIO^%NLS() function of the %NLS utility.)

  • An integer value specifying a process I/O translation object (a translation handle). Available values are 0 through 3 (0 represents the current process I/O translation object).

For “I” translations, the string may be a hexadecimal string, such as %4B (the letter “K”); hexadecimal strings are not case-sensitive.
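
The following sketch shows input translation with the URL table, first on a single escaped character and then as a round trip of the earlier %Library.String example. The variable name enc is illustrative:

   WRITE $ZCONVERT("%4B","I","URL"),!    // K
   WRITE $ZCONVERT("%4b","I","URL"),!    // K (hex digits are not case-sensitive)
   SET enc=$ZCONVERT("%Library.String","O","URL")
   WRITE enc,!                           // %25Library.String
   WRITE $ZCONVERT(enc,"I","URL"),!      // %Library.String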

You can use ZZDUMP to display the hexadecimal encoding for a string of characters. You can use $CHAR to specify a character (or string of characters) by its decimal (base 10) encoding; you can use $ZHEX to convert a hexadecimal number to a decimal number, or a decimal number to a hexadecimal number. If the translated value is a non-printing character, InterSystems IRIS displays it as a null string. If the target device cannot represent a translated character, InterSystems IRIS substitutes a question mark (?) character for the non-displayable character.
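
These helpers can be tried together. In this sketch, note that $ZHEX treats a quoted string as hexadecimal and an integer as decimal:

   WRITE $ZHEX("4B"),!          // 75 (hexadecimal to decimal)
   WRITE $ZHEX(75),!            // 4B (decimal to hexadecimal)
   WRITE $CHAR($ZHEX("4B")),!   // K
   ZZDUMP "K"                   // displays the hexadecimal code 4B for the character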

Four-Argument Form: Input/Output String

The four-argument form of $ZCONVERT enables you to invoke the function repeatedly as a way to convert extremely long strings. The four-argument form has the syntax:

$ZCONVERT(string,mode,trantable,handle) 
$ZCVT(string,mode,trantable,handle)

The handle argument is a local variable that $ZCONVERT reads at the beginning of execution and writes when it completes execution. It is used to hold information between consecutive invocations of the $ZCONVERT function. It can be used for two purposes: concatenating a string to the beginning of string, and converting extremely long strings.

To concatenate a string to the beginning of string, set handle before invoking $ZCONVERT:

   SET handle="the "
   WRITE $ZCVT("quick brown fox","O","URL",handle),!
   /*  the%20quick%20brown%20fox  */
   WRITE $ZCVT("quick brown fox","O","URL",handle),!
   /*  quick%20brown%20fox  */

Note that $ZCONVERT resets handle when it completes execution. In the previous example, it resets handle to the empty string.

The handle argument may also be used for input conversions. Specifying a handle is useful when a multibyte character sequence may be split across partial reads, such as successive reads from a stream. In these cases, $ZCONVERT uses the handle argument to hold a partial character sequence that may be the leading bytes of a multibyte sequence. If input characters that do not make up a complete translation unit remain in the buffer at the end of a $ZCONVERT call, these leftover characters are returned in the handle. At the beginning of the next $ZCONVERT call, if the handle contains data, these leftover characters are prepended to the normal input data. This is particularly valuable for UTF8 conversions, as shown in the following example:

  SET handle=""
  WHILE 'stream.AtEnd() {
     WRITE $ZCONVERT(stream.Read(20000),"I","UTF8",handle)
  } 
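
The same mechanism can be observed without a stream. The sketch below assumes a Unicode installation; the byte string and the chunk size are hypothetical values chosen so that the two-byte UTF-8 sequence for é is split across two calls:

   SET raw="caf"_$CHAR(195,169)   // UTF-8 bytes for "café"; é is the byte pair 195,169
   SET chunk=4                    // splits the input between bytes 195 and 169
   SET handle="",out=""
   FOR pos=1:chunk:$LENGTH(raw) {
      // The first call leaves byte 195 in handle; the second call prepends it
      SET out=out_$ZCONVERT($EXTRACT(raw,pos,pos+chunk-1),"I","UTF8",handle)
   }
   WRITE out=("caf"_$CHAR(233))   // expected to write 1 if the split character was reassembled

Without the handle argument, each chunk would be converted in isolation and the split character could not be recovered.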

To convert an extremely long string, it may be necessary to perform more than one string conversion by invoking $ZCONVERT multiple times. $ZCONVERT provides the optional handle argument to hold the remaining unconverted portion of string. If you specify a handle argument, it is updated by each invocation of $ZCONVERT. When the string conversion completes, $ZCONVERT sets handle to the empty string.

   SET handle=""
   SET out = $ZCVT(hugestring,"O","HTML",handle)
   IF handle '= "" {
     SET out2 = $ZCVT(handle,"O","HTML",handle)
     WRITE "Converted string is: ",out,out2  }
   ELSE {
     WRITE "Converted string is: ",out }

Examples

The following example returns "HELLO":

   WRITE $ZCONVERT("Hello","U")

The following example returns "hello":

   WRITE $ZCVT("Hello","L")

The following example returns "HELLO":

   WRITE $ZCVT("Hello","T")

The following example uses the concatenate operator (_) to append and case-convert an accented character:

   WRITE "TOUCH"_$CHAR(201),!, $ZCVT("TOUCH"_$CHAR(201),"L")

returns:

TOUCHÉ
touché

The following example converts the angle brackets in the string to HTML escape sequences for output, returning “&lt;TAG&gt;”:

   WRITE $ZCVT("<TAG>","O","HTML")

Note that how the result displays depends on the output device: the Terminal shows the escape sequences literally, whereas a web browser renders them as angle brackets.
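
Escaping and unescaping are inverse operations. This short sketch round-trips a string through the HTML table; the variable name esc is illustrative, and the commented values are what the Terminal would show:

   SET esc=$ZCVT("a<b & c>d","O","HTML")
   WRITE esc,!                      // a&lt;b &amp; c&gt;d
   WRITE $ZCVT(esc,"I","HTML"),!    // a<b & c>d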

The following example shows how $ZCONVERT substitutes a ? character for a translated character it cannot display. Both the UTF8 and the current process I/O translation object (trantable 0) conversions in this example display $CHAR(63), which is the actual ? character. UTF8 cannot display translated characters above $CHAR(127). Translation table 0 cannot display translated characters above $CHAR(255):

  FOR i=1:1:300 {
     // Skip characters that UTF8 input translation does not turn into "?"
     IF $ZCONVERT($CHAR(i),"I","UTF8")'="?" { CONTINUE }
     WRITE "UTF8 ",i,"=",$ZCONVERT($CHAR(i),"I","UTF8")
     IF $ZCONVERT($CHAR(i),"I",0)="?" {
        WRITE " trantable 0 ",i,"=",$ZCONVERT($CHAR(i),"I",0),!
     }
     ELSE { WRITE ! }
  }
