Caché ObjectScript Reference
$EXTRACT

Returns or replaces a substring, using a character count.
Synopsis
$EXTRACT(string,from,to)
$E(string,from,to)

SET $EXTRACT(string,from,to)=value
SET $E(string,from,to)=value
Parameters
string
An expression that evaluates to the target string from which the substring is to be extracted.
from
Optional. The starting position within the target string from which to extract a character or the beginning of a range of characters. Different values are used for the two-parameter form $EXTRACT(string,from) and the three-parameter form $EXTRACT(string,from,to):
Without to: Specifies a single character. Specify either an expression that evaluates to a positive integer (counting from 1), an asterisk (which specifies the last character of the string), or an asterisk followed by an expression that evaluates to a negative integer (which specifies a start position counting backwards from the last character of the string). A zero (0) or negative number returns the empty string.
With to: Specifies the start of a range of characters. Specify either an expression that evaluates to a positive integer (counting from 1) or an asterisk followed by an expression that evaluates to a negative integer (which specifies a start position counting backwards from the last character of the string). A zero (0) or negative number evaluates as 1.
to
Optional. The end position (inclusive) for a range of characters to be extracted, counting from the beginning or end of string. Specify either a positive integer (counting from 1), an asterisk (which specifies the last character of the string), or an asterisk followed by a negative integer (which specifies an end position counting backwards from the last character of the string).
Description
$EXTRACT can be used in two ways:
As a function, $EXTRACT returns a substring from a specified position in string. The nature of this substring depends on the parameters used.
With the SET command, $EXTRACT replaces the values of specified characters within a string. For example:
   SET var1="ABCD"
   SET $EXTRACT(var1,2)="Z"
   WRITE var1
 
yields “AZCD”.
Parameters
string
The string value can be a variable name, a numeric value, a string literal, or any valid Caché ObjectScript expression.
from
The from parameter can specify a single character, or the beginning of a range of characters. The from value must be a positive integer, an asterisk (*), or an asterisk and a negative number. A from asterisk value with a negative number may include or omit blank spaces; all of the following are permissible: *-3, * -3, * - 3.
The handling of 0 and negative number values differs for $EXTRACT(string,from), and $EXTRACT(string,from,to).
If the from integer value is greater than the number of characters in the string, $EXTRACT returns a null string. If from is an asterisk followed by a negative integer whose absolute value is equal to or greater than the number of characters in the string, $EXTRACT returns a null string.
If from is used with the to parameter, from identifies the start of the range to be extracted and is normally less than the value of to. If from equals to, $EXTRACT returns the single character at the specified position. If from is greater than to, $EXTRACT returns a null string. If used with the to parameter, a from value less than 1 (zero, or a negative number) is treated as if it were the number 1.
to
The to parameter must be used with the from parameter. It must be a positive integer, an asterisk (*), or an asterisk followed by a negative integer. If the to value is an integer greater than or equal to the from value, $EXTRACT returns the specified substring. If the to value is an asterisk, $EXTRACT returns the substring beginning with the from character through the end of the string. If to is an integer greater than the length of the string, $EXTRACT also returns the substring beginning with the from character through the end of the string. If to is less than from, $EXTRACT returns a null string.
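The boundary rules described above for from and to can be checked with a short sketch. The variable name x and the bracket characters written around each result are illustrative choices only; the brackets make a null string visible as “[]”.
   SET x="ABCD"
   /* from greater than the string length: null string */
   WRITE !,"[",$EXTRACT(x,5),"]"
   /* from equal to to: the single character "C" */
   WRITE !,"[",$EXTRACT(x,3,3),"]"
   /* from greater than to: null string */
   WRITE !,"[",$EXTRACT(x,3,2),"]"
   /* to greater than the string length: "BCD" */
   WRITE !,"[",$EXTRACT(x,2,99),"]"
   /* from less than 1 with to: treated as 1, giving "AB" */
   WRITE !,"[",$EXTRACT(x,0,2),"]"
   /* from of 0 without to: null string */
   WRITE !,"[",$EXTRACT(x,0),"]"
 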
$EXTRACT Examples
The following example returns “D”, the fourth character in the string:
   SET x="ABCDEFGHIJK"
   WRITE $EXTRACT(x,4)
 
The following example returns “K”, the last character in the string:
   SET x="ABCDEFGHIJK"
   WRITE $EXTRACT(x,*)
 
In the following example, all the $EXTRACT functions return “J”, the next-to-last character in the string:
   SET n=-1
   SET m=1
   SET x="ABCDEFGHIJK"
   WRITE !,$EXTRACT(x,*-1)
   WRITE !,$EXTRACT(x,*-m)
   WRITE !,$EXTRACT(x,*+n)
   WRITE !,$EXTRACT(x,*-1,*-1)
 
Note that a minus or plus sign is needed between the asterisk and the integer variable.
The following example shows that the one-argument format is equivalent to the two-argument format when the from value is “1”. Both $EXTRACT functions return “H”.
   SET x="HELLO"
   WRITE !,$EXTRACT(x)
   WRITE !,$EXTRACT(x,1)
 
The following example returns a substring “THIS IS” which is composed of the first through seventh characters.
   SET x="THIS IS A TEST"
   WRITE $EXTRACT(x,1,7)
 
The following example also returns the substring “THIS IS”. When the from variable contains a value less than 1, $EXTRACT treats that value as 1. Thus, the following example returns a substring composed of the first through seventh characters.
   SET X="THIS IS A TEST"
   WRITE $EXTRACT(X,-1,7)
 
The following example returns the last four characters of the string:
   SET X="THIS IS A TEST"
   WRITE $EXTRACT(X,*-3,*)
 
The following example also returns the last four characters of the string:
   SET X="THIS IS A TEST"
   WRITE $EXTRACT(X,*-3,14)
 
$EXTRACT with SET
You can use $EXTRACT with the SET command to replace a specified character or range of characters with another value.
The simplest form of SET $EXTRACT is a one-for-one substitution:
   SET foo="ABZD"
   SET $EXTRACT(foo,3)="C"
   WRITE foo
 
yields “ABCD”.
You can also extract a string and replace it with a string of a different length. For example, the following command extracts the string “Rhode Island” from foo and replaces it with the string “Texas”, with no padding.
   SET foo="Deep in the heart of Rhode Island"
   SET $EXTRACT(foo,22,33)="Texas"
   WRITE foo
 
yields “Deep in the heart of Texas”.
You can extract a string and set it to the null string, removing the extracted characters from the string:
   SET foo="ABCzzzzzD"
   SET $EXTRACT(foo,4,8)=""
   WRITE foo
 
yields “ABCD”.
If you specify a from value larger than the length of the string, SET $EXTRACT pads with blank spaces:
   SET foo="ABCD"
   SET $EXTRACT(foo,8)="X"
   WRITE foo
 
yields “ABCD^^^X” (here a blank space is shown using “^”).
If you specify from larger than to, no replacement occurs:
   SET foo="ABCD"
   SET $EXTRACT(foo,4,3)="X"
   WRITE foo
 
yields “ABCD”.
SET $EXTRACT Examples
The following example changes the value of x from “ABCD” to “ZBCD”:
   SET x="ABCD"
   SET $EXTRACT(x,1)="Z"
   WRITE x
 
The following example replaces “ABCD” with “GHIJ”.
   SET x="ABCD"
   SET $EXTRACT(x,1,4)="GHIJ"
   WRITE x
 
In the following example, assume that variable x does not exist.
   KILL x
   SET $EXTRACT(x,1,4)="ABCD"
   WRITE x
 
The SET command creates variable x and assigns it the value “ABCD”.
You can also use SET $EXTRACT to add or remove character positions in a target variable.
To add character positions, specify a from value or a from,to range that exceeds the length of the target variable. For example, if x contains “ABCD”, the following SET $EXTRACT inserts the value “Z” in the tenth position:
   SET x="ABCD"
   SET $EXTRACT(x,10)="Z"
   WRITE x
 
Because 10 exceeds the number of characters in x, SET $EXTRACT fills the intervening positions (that is, positions 5 through 9) with spaces. As a result, x now contains the value “ABCD^^^^^Z”, where ^ indicates a space.
The following example inserts the value “Y” in the eleventh position, but no additional characters in positions 12 and 13.
   SET x="ABCD"
   SET $EXTRACT(x,11,13)="Y"
   WRITE x
 
As a result, the original x value (“ABCD”) is changed to “ABCD^^^^^^Y” and x now has a length of 11. (If the assigned value “Y” were three characters, instead of just one, positions 12 and 13 would be filled as well.)
To remove character positions, extract a character or range and replace it with the null string. The following results in a two-character string with the value “AD”:
   SET x="ABCD"
   SET $EXTRACT(x,2,3)=""
   WRITE x
 
The following example shortens a character string by replacing a from,to range that is larger than the number of characters in the replacement string.
   SET x="ABCDEFGH"
   SET $EXTRACT(x,3,6)="Z"
   WRITE x
 
SET $EXTRACT inserts the value “Z” in the third position and removes positions 4, 5 and 6. Variable x now contains the value “ABZGH” and has a length of 5.
Notes
$EXTRACT and Unicode
The $EXTRACT function operates on characters, not bytes. Therefore, Unicode strings are handled the same as ASCII strings, as shown in the following example using the Unicode character for “pi” ($CHAR(960)):
   SET a="QT PIE"
   SET b="QT "_$CHAR(960)
   SET a1=$EXTRACT(a,-33,4)
   SET a2=$EXTRACT(a,4,4)
   SET a3=$EXTRACT(a,4,99)
   SET b1=$EXTRACT(b,-33,4)
   SET b2=$EXTRACT(b,4,4)
   SET b3=$EXTRACT(b,4,99)
   WRITE !,"ASCII form returns ",!,a1,!,a2,!,a3
   WRITE !,"Unicode form returns ",!,b1,!,b2,!,b3
 
Surrogate Pairs
$EXTRACT does not recognize surrogate pairs. A surrogate pair is a pair of 16-bit Unicode characters that together encode a single ideographic character. Surrogate pairs are used to represent some Chinese characters and to support the Japanese JIS2004 standard. You can use the $WISWIDE function to determine if a string contains a surrogate pair. The $WEXTRACT function recognizes and correctly parses surrogate pairs. $EXTRACT and $WEXTRACT are otherwise identical. However, because $EXTRACT is generally faster than $WEXTRACT, $EXTRACT is preferable for all cases where a surrogate pair is not likely to be encountered.
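The difference can be sketched as follows, assuming a Unicode installation of Caché. The code points 55362 and 57271 (hexadecimal D842 and DFB7) are chosen here only as an illustrative high/low surrogate pair, and the variable name s is hypothetical. Given the behavior described above, $EXTRACT(s,1) would be expected to return only the first 16-bit half of the pair, while $WEXTRACT(s,1) would be expected to return both halves as a single character; the sketch writes the length of each result so the two can be compared.
   SET s=$CHAR(55362,57271)_"AB"
   /* $WISWIDE reports whether s contains a surrogate pair */
   WRITE !,$WISWIDE(s)
   /* length of the first "character" as seen by each function */
   WRITE !,$LENGTH($EXTRACT(s,1))
   WRITE !,$LENGTH($WEXTRACT(s,1))
 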
$EXTRACT in DTM Modes
In the DTM and DTM-J modes, $EXTRACT supports two additional arguments, as follows:
$EXTRACT(string,from,to,replace,pad)
The optional replace argument replaces the substring specified by from and to with the replace substring, and returns the result. The original string is not changed.
The optional pad argument specifies a padding character. This is used when the from argument specifies a position beyond the end of string. The returned string is padded to the location specified by from followed by the replace substring. The pad value may be any single character; a nonnumeric character must be enclosed in quotes. To specify a quote character as the pad character literal, double it.
You can use the LanguageMode() method of the %SYSTEM.Process class to set DTM mode (2) or DTM-J mode (7).
The following example shows the four-argument replace syntax:
   SET x="ABCDEFGH"
   DO ##class(%SYSTEM.Process).LanguageMode(2)
   WRITE $EXTRACT(x,3,6,"##")
     /* returns "AB##GH"  */
The following example uses the four-argument syntax to insert the replace string at the beginning of the string:
   SET x="ABCDEFGH"
   DO ##class(%SYSTEM.Process).LanguageMode(2)
   WRITE $EXTRACT(x,1,0,"##")
     /* returns "##ABCDEFGH"  */
The following example shows the five-argument pad and replace syntax:
   SET x="ABCDEFGH"
   DO ##class(%SYSTEM.Process).LanguageMode(2)
   WRITE $EXTRACT(x,12,16,"##","*")
     /* returns "ABCDEFGH***##"  */
Note:
When using four-argument or five-argument syntax, the $EXTRACT from and to arguments do not support asterisk syntax.
SET $EXTRACT cannot be used with four-argument or five-argument syntax.
$EXTRACT Compared with $PIECE and $LIST
$EXTRACT determines a substring by counting characters from the beginning of a string. $EXTRACT takes as input any ordinary character string. $PIECE and $LIST both work on specially prepared strings.
$PIECE determines a substring by counting user-defined delimiter characters within the string.
$LIST determines an element from an encoded list by counting elements (not characters) from the beginning of the list. $LIST cannot be used on ordinary strings, and $EXTRACT cannot be used on encoded lists.
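As a minimal sketch of the three approaches, the following selects the value “B” from each kind of input. The variable names and the comma delimiter are illustrative choices; $LISTBUILD is used here to create the encoded list.
   SET str="A,B,C"
   SET lst=$LISTBUILD("A","B","C")
   /* $EXTRACT counts characters: "B" is the third character */
   WRITE !,$EXTRACT(str,3)
   /* $PIECE counts delimiters: "B" is the second comma-delimited piece */
   WRITE !,$PIECE(str,",",2)
   /* $LIST counts list elements: "B" is the second element */
   WRITE !,$LIST(lst,2)
 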
See Also