Returns the number of characters in a string, recognizing surrogate pairs.
string: A string or expression that evaluates to a string.
$WLENGTH returns the number of characters in string. $WLENGTH is functionally identical to $LENGTH, except that $WLENGTH recognizes surrogate pairs: it counts a surrogate pair as a single character. You can use the $WISWIDE function to determine whether a string contains a surrogate pair.
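As a minimal sketch of that check, the following compares the three functions on a string without a surrogate pair and on one with a surrogate pair. It assumes a Unicode InterSystems IRIS instance; the variable names plain and wide are illustrative only, and the surrogate pair is built from the 16-bit elements D806 and DC06 using $CHAR and $ZHEX.

```objectscript
   // A string with no surrogate pair
   SET plain="ABCD"
   // The same string with one surrogate pair (D806, DC06) in the middle
   SET wide="AB"_$CHAR($ZHEX("D806"),$ZHEX("DC06"))_"CD"
   // $WISWIDE flags the presence of a surrogate pair;
   // $LENGTH counts 16-bit elements, $WLENGTH counts the pair once
   WRITE !,"plain: ",$WISWIDE(plain)," ",$LENGTH(plain)," ",$WLENGTH(plain)
   WRITE !,"wide:  ",$WISWIDE(wide)," ",$LENGTH(wide)," ",$WLENGTH(wide)
```

For the second string, $LENGTH should report 6 and $WLENGTH should report 5, because the pair occupies two 16-bit elements but counts as one character.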
A surrogate pair is a pair of 16-bit InterSystems IRIS character elements that together encode a single Unicode character. Surrogate pairs are used to represent certain ideographs that are used in Chinese, Japanese kanji, and Korean hanja. (Most commonly used Chinese, kanji, and hanja characters are represented by standard 16-bit Unicode encodings.) Surrogate pairs provide InterSystems IRIS support for the Japanese JIS X0213:2004 (JIS2004) encoding standard and the Chinese GB18030 encoding standard.
A surrogate pair consists of a high-order 16-bit character element in the hexadecimal range D800 through DBFF, followed by a low-order 16-bit character element in the hexadecimal range DC00 through DFFF.
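The two ranges above can be checked and combined as in the following sketch. The combining formula is the standard UTF-16 one from the Unicode Standard (it is not stated on this page), and the variable names hi, lo, and cp are illustrative only.

```objectscript
   // Decimal values of the two 16-bit elements
   SET hi=$ZHEX("D806"),lo=$ZHEX("DC06")
   // High element must be D800-DBFF (55296-56319),
   // low element must be DC00-DFFF (56320-57343)
   IF (hi>=55296)&&(hi<=56319)&&(lo>=56320)&&(lo<=57343) {
     // ObjectScript evaluates operators strictly left to right,
     // so the parentheses around each term are required
     SET cp=65536+((hi-55296)*1024)+(lo-56320)
     WRITE !,"code point: ",cp
   } ELSE {
     WRITE !,"not a surrogate pair"
   }
```

For D806 and DC06 the arithmetic gives 71686 (hexadecimal 11806). On a Unicode instance, $WASCII($CHAR(hi,lo)) can presumably be used to cross-check the result, since $WASCII also recognizes surrogate pairs.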
The $WLENGTH function counts a surrogate pair as a single character. The $LENGTH function counts a surrogate pair as two characters. In all other respects, $WLENGTH and $LENGTH are functionally identical. However, because $LENGTH is generally faster than $WLENGTH, $LENGTH is preferable in all cases where a surrogate pair is not likely to be encountered.
For further details on string length, refer to the $LENGTH function.
The following example shows how $WLENGTH counts a surrogate pair as a single character:
   SET spair=$CHAR($ZHEX("D806"),$ZHEX("DC06"))
   SET str="AB"_spair_"CD"
   WRITE !,$LENGTH(str)," $LENGTH characters in string"
   WRITE !,$WLENGTH(str)," $WLENGTH characters in string"