Strings

A string is a sequence of characters (letters, digits, punctuation, and so on) delimited by a matched pair of quotation marks ("):

 SET string = "This is a string"
 WRITE string

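Topics about strings include:

  • Null String / $CHAR(0)

  • Escaping Quotation Marks

  • Concatenating Strings

  • String Comparisons

  • Long Strings

  • Bit Strings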

Null String / $CHAR(0)

  • SET mystr="": sets a null or empty string. The string is defined, is of zero length, and contains no data:

      SET mystr=""
      WRITE "defined:",$DATA(mystr),!
      WRITE "length: ",$LENGTH(mystr),!
      ZZDUMP mystr
  • SET mystr=$CHAR(0): sets a string to the null character. The string is defined, is of length 1, and contains a single character with the hexadecimal value of 00:

      SET mystr=$CHAR(0)
      WRITE "defined:",$DATA(mystr),!
      WRITE "length: ",$LENGTH(mystr),!
      ZZDUMP mystr

Note that these two values are not the same. However, the bit string functions treat these values as identical.
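As a quick check of the difference between the two values, the following minimal sketch compares them directly. The variable names are illustrative only.

```objectscript
 SET empty = ""
 SET nul = $CHAR(0)
 WRITE "equal: ",(empty = nul),!                          // 0: the strings differ
 WRITE "lengths: ",$LENGTH(empty)," vs ",$LENGTH(nul),!   // 0 vs 1
 WRITE "character code: ",$ASCII(nul),!                   // 0
```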

Note that InterSystems SQL has its own interpretation of these values. Refer to NULL and the Empty String in the “Language Elements” chapter of Using Caché SQL.

Escaping Quotation Marks

You can include a " (double quote) character as a literal within a string by preceding it with another double quote character:

 SET string = "This string has ""quotes"" in it."
 WRITE string

There are no other escape character sequences within ObjectScript string literals.
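One consequence is that a backslash has no special meaning inside a string literal. The following short sketch (variable name chosen for illustration) shows that "\n" is stored as two ordinary characters, and that a doubled quote counts as a single character:

```objectscript
 SET s = "a\nb"
 WRITE s,!                 // writes a\nb; the backslash and n are literal
 WRITE $LENGTH(s),!        // 4
 WRITE $LENGTH(""""),!     // 1: a string containing one double quote
```

To place an actual linefeed in a string, concatenate $CHAR(10) rather than using an escape sequence.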

Note that literal quotation marks are specified using other escape sequences in other InterSystems software. Refer to the $ZCONVERT function for a table of these escape sequences.

Concatenating Strings

You can concatenate two strings into a single string using the _ concatenate operator:

 SET a = "Inter"
 SET b = "Systems"
 SET string = a_b
 WRITE string

By using the concatenate operator you can include non-printing characters in a string. The following string includes the linefeed ($CHAR(10)) character:

 SET lf = $CHAR(10)
 SET string = "This"_lf_"is"_lf_"a string"
 WRITE string

Note:

How non-printing characters are displayed is determined by the display device. For example, Terminal and a browser display the linefeed character and other positioning characters differently. In addition, different browsers display the positioning characters $CHAR(11) and $CHAR(12) differently.

Caché encoded strings — bit strings, List structure strings, and JSON strings — have limitations on their use of the concatenate operator. For further details, see Concatenate Encoded Strings.

Some additional considerations apply when concatenating numbers. For further details, see “Concatenating Numbers”.
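The following sketch illustrates two of these considerations as generally understood for ObjectScript: a numeric literal is converted to canonical form before it is concatenated, and operators are evaluated strictly left to right, so mixing arithmetic with concatenation can give surprising results. Treat it as an illustration rather than a complete description.

```objectscript
 WRITE 007.50_"abc",!     // 7.5abc: the number is canonicalized first
 WRITE "007.50"_"abc",!   // 007.50abc: a quoted string is left unchanged
 WRITE 1+2_3,!            // 33: (1+2) is evaluated first, then 3 is appended
 WRITE 1_2+3,!            // 15: "12" is built first, then 3 is added
```

Use parentheses to make the intended order of evaluation explicit.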

String Comparisons

You can use the equals (=) and does not equal ('=) operators to compare two strings. String equality comparisons are case-sensitive. Exercise caution when using these operators to compare a string to a number, because this comparison is a string comparison, not a numeric comparison. Therefore only a string containing a number in canonical form is equal to its corresponding number. ("-0" is not a canonical number.) This is shown in the following example:

  WRITE "Fred" = "Fred",!  // TRUE
  WRITE "Fred" = "FRED",!  // FALSE
  WRITE "-7" = -007.0,!    // TRUE
  WRITE "-007.0" = -7,!    // FALSE
  WRITE "0" = -0,!         // TRUE
  WRITE "-0" = 0,!         // FALSE
  WRITE "-0" = -0,!        // FALSE

The <, >, <=, and >= operators cannot be used to perform a string comparison. These operators treat strings as numbers and always perform a numeric comparison. Any non-numeric string is assigned a numeric value of 0 when compared using these operators.
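The following sketch shows the effect of this numeric interpretation. The string values are arbitrary examples.

```objectscript
 WRITE "apple" < "banana",!   // 0: both strings evaluate to 0
 WRITE "abc" >= "xyz",!       // 1: 0 >= 0
 WRITE "2" < "10",!           // 1: compared as the numbers 2 and 10
 WRITE "10 apples" > "9",!    // 1: "10 apples" evaluates to 10
```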

Lettercase and String Comparisons

String equality comparisons are case-sensitive. You can use the $ZCONVERT function to convert the letters in the strings to be compared to all uppercase letters or all lowercase letters. Non-letter characters are unchanged.

A few letters only have a lowercase letter form. For example, the German eszett ($CHAR(223)) is only defined as a lowercase letter. Converting it to an uppercase letter results in the same lowercase letter. For this reason, when converting alphanumeric strings to a single letter case it is always preferable to convert to lowercase.
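A case-insensitive comparison along these lines might look like the following sketch, which converts both operands to lowercase with $ZCONVERT before comparing them. The variable names are illustrative, and the last line simply restates the eszett behavior described above.

```objectscript
 SET a = "InterSystems"
 SET b = "INTERSYSTEMS"
 WRITE a = b,!                                  // 0: case-sensitive comparison
 WRITE $ZCONVERT(a,"L") = $ZCONVERT(b,"L"),!    // 1: both converted to lowercase
 // Per the text above, the eszett has no uppercase form,
 // so converting it to uppercase is expected to return it unchanged
 WRITE $ZCONVERT($CHAR(223),"U") = $CHAR(223),!
```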

Long Strings

Caché supports two maximum string length options:

  • The traditional maximum string length of 32,767 characters.

  • Long Strings maximum string length of 3,641,144 characters.

Long strings are enabled by default, so the maximum string length is 3,641,144 characters unless long strings have been disabled, in which case it is 32,767 characters.

Attempting to exceed the current maximum string length results in a <MAXSTRING> error.

You can return the current system-wide maximum string length by invoking the MaxLocalLength() method, as follows:

   WRITE $SYSTEM.SYS.MaxLocalLength()
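As a rough sketch of how a <MAXSTRING> error can be provoked and handled, the following code builds a string of exactly the maximum length and then tries to append one more character inside a TRY block. The variable names are illustrative, and the code assumes enough process memory is available for a string of this size.

```objectscript
 TRY {
   SET max = $SYSTEM.SYS.MaxLocalLength()
   SET str = $JUSTIFY("",max)          // a string of max spaces
   WRITE "length: ",$LENGTH(str),!
   SET str = str_"x"                   // one character too many
   WRITE "this line is not reached",!
 }
 CATCH ex {
   WRITE "caught: ",ex.Name,!          // expected: <MAXSTRING>
 }
```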

You can use any of the following operations to enable or disable long strings system-wide:

  • In the Management Portal, select System, Configuration, Memory and Startup. On the System Memory and Startup Settings page, select the Enable Long Strings check box.

  • In the Caché parameter file (CPF file), specify the value of the EnableLongStrings parameter, as described in the EnableLongStrings section of the Caché Parameter File Reference.

  • In the Config.Miscellaneous class properties, specify an EnableLongStrings boolean value. This modifies the corresponding CPF file parameter. For example:

      ZNSPACE "%SYS"
      // Read the current settings into the Properties array
      SET getstat=##class(Config.Miscellaneous).Get(.Properties)
      IF getstat '= 1 {WRITE "Get config property error",! QUIT}
      // Disable long strings (0); specify 1 to enable them
      SET Properties("EnableLongStrings")=0
      SET modstat=##class(Config.Miscellaneous).Modify(.Properties)
      IF modstat '= 1 {WRITE "Modify config property error",! QUIT}

When a process actually uses a long string, the memory for the string comes from the operating system’s malloc() buffer, not from the partition memory space for the process. Thus the memory allocated for actual long string values is not subject to the limit set by the maximum memory per process (Maximum per Process Memory (KB)) parameter and does not affect the $STORAGE value for the process.

Bit Strings

A bit string represents a logical set of numbered bits with boolean values. Bits in a string are numbered starting with bit number 1. Any numbered bit that has not been explicitly set to boolean value 1 evaluates as 0. Therefore, referencing any numbered bit beyond those explicitly set returns a bit value of 0.

A bit string has a logical length, which is the highest bit position explicitly set to either 0 or 1. This logical length is accessible only through the $BITCOUNT function and usually should not be used in application logic. To the bit string functions, an undefined global or local variable is equivalent to a bit string in which every numbered bit returns a bit value of 0 and $BITCOUNT returns 0.
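The behavior described so far can be sketched as follows, using the $BIT and $BITCOUNT functions on local variables with illustrative names:

```objectscript
 KILL bits
 SET $BIT(bits,3) = 1
 WRITE $BIT(bits,1),!        // 0: never explicitly set
 WRITE $BIT(bits,3),!        // 1
 WRITE $BIT(bits,100),!      // 0: beyond the bits that were set
 WRITE $BITCOUNT(bits),!     // 3: logical length
 WRITE $BITCOUNT(bits,1),!   // 1: number of bits set to 1
 KILL undef
 WRITE $BIT(undef,5),!       // 0: undefined variable
 WRITE $BITCOUNT(undef),!    // 0
```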

A bit string is stored as a normal Caché string with an internal format. This internal string representation is not accessible with the bit string functions. Because of this internal format, the string length of a bit string is not meaningful in determining anything about the number of bits in the string.

Because of the bit string internal format, you cannot use the concatenate operator with bit strings. Attempting to do so results in an <INVALID BIT STRING> error.

Two bit strings in the same state (with the same boolean values) may have different internal string representations, and therefore string representations should not be inspected or compared in application logic. To the bit string functions, the null string and undefined global or local variables are equivalent to a bit string with all bits 0 and a length of 0.

Unlike an ordinary string comparison, the bit string functions treat the empty string and the character $CHAR(0) as equivalent to each other, and both represent a 0 bit. This is because $BIT treats any non-numeric string as 0. Therefore:

  SET $BIT(bstr1,1)=""
  SET $BIT(bstr2,1)=$CHAR(0)
  SET $BIT(bstr3,1)=0
  IF $BIT(bstr1,1)=$BIT(bstr2,1) {WRITE "bitstrings are the same",!} ELSE {WRITE "bitstrings different",!}
  WRITE $BITCOUNT(bstr1),",",$BITCOUNT(bstr2),",",$BITCOUNT(bstr3)

A bit set in a global variable during a transaction will be reverted to its previous value following transaction rollback. However, rollback does not return the global variable bit string to its previous string length or previous internal string representation. Local variables are not reverted by a rollback operation.
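The rollback behavior might be observed with a sketch such as the following. The global name ^MyBits is hypothetical, and the code assumes it runs in a scratch namespace where transactions are journaled normally.

```objectscript
 KILL ^MyBits
 SET $BIT(^MyBits,1) = 1      // value before the transaction
 TSTART
 SET $BIT(^MyBits,1) = 0      // change made inside the transaction
 WRITE $BIT(^MyBits,1),!      // 0 while the transaction is open
 TROLLBACK
 WRITE $BIT(^MyBits,1),!      // the bit is expected to be back to 1
 KILL ^MyBits                 // clean up the hypothetical global
```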

A logical bitmap structure can be represented by an array of bit strings, where each element of the array represents a "chunk" with a fixed number of bits. Since undefined is equivalent to a chunk with all 0 bits, the array can be sparse, where array elements representing a chunk of all 0 bits need not exist at all. For this reason, and due to the rollback behavior above, application logic should avoid depending on the length of a bit string or the count of 0-valued bits accessible using $BITCOUNT(str) or $BITCOUNT(str,0).
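A sparse bitmap of this kind could be sketched as follows. The chunk size of 64000, the array name bitmap, and the sample ID values are assumptions made for illustration, and a local array is used in place of a global. The code relies only on individual bit values and the count of 1 bits, not on the length of any chunk.

```objectscript
 SET chunksize = 64000                    // hypothetical chunk size
 KILL bitmap
 // Set the bits for a few sample IDs; only the chunks touched are created
 FOR id=5,64001,200000 {
   SET chunk = (id - 1) \ chunksize + 1
   SET pos = (id - 1) # chunksize + 1
   SET $BIT(bitmap(chunk),pos) = 1
 }
 // Test an ID whose chunk was never created; $GET supplies "" for it
 SET id = 130000
 SET chunk = (id - 1) \ chunksize + 1
 SET pos = (id - 1) # chunksize + 1
 WRITE $BIT($GET(bitmap(chunk)),pos),!    // 0: missing chunk means all 0 bits
 // Count the 1 bits across the chunks that exist
 SET total = 0
 SET chunk = ""
 FOR {
   SET chunk = $ORDER(bitmap(chunk))
   QUIT:chunk=""
   SET total = total + $BITCOUNT(bitmap(chunk),1)
 }
 WRITE total,!                            // 3
```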