Evaluating XPath Expressions

XPath (XML Path Language) is an expression language for obtaining data from an XML document. With the %XML.XPATH.Document class, you can evaluate XPath expressions against an arbitrary XML document that you provide.

Note:

The XML declaration of any XML document that you use should indicate the character encoding of that document, and the document should be encoded as declared. If the character encoding is not declared, InterSystems IRIS uses the defaults described in “Character Encoding of Input and Output,” earlier in this book. If these defaults are not correct, modify the XML declaration so that it specifies the character set actually used.

Overview of Evaluating XPath Expressions in InterSystems IRIS

To use InterSystems IRIS XML support to evaluate XPath expressions using an arbitrary XML document, you do the following:

  1. Create an instance of %XML.XPATH.Document. To do so, use one of the following class methods: CreateFromFile(), CreateFromStream(), or CreateFromString(). With any of these methods, you specify the input XML document as the first argument and receive an instance of %XML.XPATH.Document as an output parameter.

    This step parses the XML document, using a built-in XSLT processor.

  2. Use the EvaluateExpression() method of your instance of %XML.XPATH.Document. For this method, you specify the node context and an expression to evaluate.

    • The node context specifies the context in which to evaluate the expression. This uses XPath syntax to express the path to that desired node. For example:

      "/staff/doc"
    • The expression to evaluate also uses XPath syntax. For example:

      "name[@last='Marston']"

    You receive the results as an output parameter (as the third argument).

The following sections provide details of all these methods, as well as examples.
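
As a minimal sketch of the two steps above, the following class method parses a small document from a string and evaluates one expression against it. The class name Demo.XPathSketch, the method name, and the inline XML are assumptions made for illustration; only the %XML.XPATH.Document calls come from this chapter.

```objectscript
Class Demo.XPathSketch Extends %RegisteredObject
{

/// Hypothetical example: parse a string, then evaluate one expression
ClassMethod Overview() As %Status
{
    // Illustrative input; any well-formed XML string would do
    Set tXML="<staff><doc type=""consultant""><name last=""Marston"">Mr. Marston</name></doc></staff>"

    // Step 1: parse the document; tDoc is returned as an output parameter
    Set tSC=##class(%XML.XPATH.Document).CreateFromString(tXML,.tDoc)
    If $$$ISERR(tSC) Quit tSC

    // Step 2: evaluate the expression in the given node context
    Set tSC=tDoc.EvaluateExpression("/staff/doc","name[@last='Marston']",.tRes)
    If $$$ISERR(tSC) Quit tSC

    // tRes is a list of results; show how many items it contains
    Write !,"Results returned: ",tRes.Count()
    Quit tSC
}

}
```

Returning the status to the caller, rather than displaying it inside the method, is a design choice of this sketch and not a requirement of the API.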

Note:

If you are iterating through a large set of documents and evaluating XPath expressions for each of them, InterSystems recommends that you set the OREF for a document equal to null when you are done processing it, before you open the next document. This works around a limitation in third-party software. This limitation causes slightly increased CPU usage as you process large numbers of documents in a loop.
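
The recommendation above might look as follows in practice. This is a sketch only: the class name Demo.XPathLoop, the method name, and the assumption that the file names arrive as a $LISTBUILD list are all hypothetical.

```objectscript
Class Demo.XPathLoop Extends %RegisteredObject
{

/// Hypothetical example: count <doc> elements in each file of a list.
/// pFiles is assumed to be a $LISTBUILD list of file names.
ClassMethod CountDocs(pFiles As %List) As %Status
{
    Set tSC=$$$OK
    For tI=1:1:$ListLength(pFiles) {
        Set tSC=##class(%XML.XPATH.Document).CreateFromFile($List(pFiles,tI),.tDoc)
        If $$$ISERR(tSC) Quit

        Set tSC=tDoc.EvaluateExpression("/staff","count(doc)",.tRes)
        If $$$ISERR(tSC) Quit

        Do ##class(%XML.XPATH.Document).ExampleDisplayResults(tRes)

        // Set the OREF to null before opening the next document
        Set tDoc=""
    }
    Quit tSC
}

}
```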

Argument Lists When Creating an XPATH Document

To create an instance of %XML.XPATH.Document, use the CreateFromFile(), CreateFromStream(), or CreateFromString() class method of that class. For these class methods, the complete argument list is as follows, in order:

  1. pSource, pStream, or pString — The source document.

    • For CreateFromFile(), this argument is the filename.

    • For CreateFromStream(), this argument is a binary stream.

    • For CreateFromString(), this argument is a string.

  2. pDocument — The result, which is returned as an output parameter. This is an instance of %XML.XPATH.Document.

  3. pResolver — An optional entity resolver to use when parsing the source. See “Performing Custom Entity Resolution,” in the chapter “Customizing How the SAX Parser Is Used.”

  4. pErrorHandler — An optional custom error handler. See “Customizing the Error Handling,” in the next chapter. If you do not specify a custom error handler, the method uses a new instance of %XML.XSLT.ErrorHandler.

  5. pFlags — Optional flags to control the validation and processing performed by the SAX parser. See “Setting the Parser Flags” in the chapter “Customizing How the SAX Parser Is Used.”

  6. pSchemaSpec — An optional schema specification, against which to validate the document source. This argument is a string that contains a comma-separated list of namespace/URL pairs:

    "namespace URL,namespace URL"

    Here namespace is the XML namespace used for the schema and URL is a URL that gives the location of the schema document. There is a single space character between the namespace and URL values.

  7. pPrefixMappings — An optional prefix mappings string. For details, see the subsection “Adding Prefix Mappings for Default Namespaces.”

The CreateFromFile(), CreateFromStream(), and CreateFromString() methods return a status, which should be checked. For example:

 Set tSC=##class(%XML.XPATH.Document).CreateFromFile("c:\sample.xml",.tDocument)
 If $$$ISERR(tSC) Do $System.OBJ.DisplayError(tSC)
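
The same pattern applies to the other creation methods. The following sketch, which assumes a hypothetical class Demo.XPathStream and an in-memory %Stream.GlobalBinary as the source, passes only the two leading arguments to CreateFromStream() and leaves the optional ones at their defaults.

```objectscript
Class Demo.XPathStream Extends %RegisteredObject
{

/// Hypothetical example: build a document from an in-memory binary stream
ClassMethod FromStream() As %Status
{
    // Fill a binary stream with illustrative XML, then rewind it for reading
    Set tStream=##class(%Stream.GlobalBinary).%New()
    Set tSC=tStream.Write("<?xml version=""1.0""?><staff><doc/><doc/></staff>")
    If $$$ISERR(tSC) Quit tSC
    Set tSC=tStream.Rewind()
    If $$$ISERR(tSC) Quit tSC

    // Source and output document only; arguments 3 through 7 are omitted
    Set tSC=##class(%XML.XPATH.Document).CreateFromStream(tStream,.tDoc)
    If $$$ISERR(tSC) Do $System.OBJ.DisplayError(tSC)
    Quit tSC
}

}
```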

Adding Prefix Mappings for Default Namespaces

When an XML document uses default namespaces, that poses a problem for XPath. Consider the following example:

<?xml version="1.0"?>
<staff xmlns="http://www.staff.org">
  <doc type="consultant">
    <name first="David" last="Marston">Mr. Marston</name>
    <name first="David" last="Bertoni">Mr. Bertoni</name>
    <name first="Donald" last="Leslie">Mr. Leslie</name>
    <name first="Emily" last="Farmer">Ms. Farmer</name>
  </doc>
</staff>

In this case, the <staff> element belongs to a namespace but does not have a namespace prefix. XPath does not provide an easy way to access the <doc> element.

So that you can easily access nodes that have default namespaces, the %XML.XPATH.Document class provides a prefix mappings feature, which you can use in two ways:

  • You can set the PrefixMappings property of your instance of %XML.XPATH.Document. This property is meant to provide a unique prefix for each default namespace in the source document, so that your XPath expressions can use those prefixes rather than the full namespace URIs.

    The PrefixMappings property is a string that consists of a comma-separated list; each list item is a prefix, followed by a space, followed by a namespace URI.

  • When you call CreateFromFile(), CreateFromStream(), or CreateFromString(), you can specify the pPrefixMappings argument. This string must be of the same form as previously described.

    See “Argument Lists When Creating an XPATH Document,” earlier in this chapter.

Then use these prefixes in the same way you use any namespace prefixes.

For example, suppose that when you read the preceding XML into an instance of %XML.XPATH.Document, you specified PrefixMappings as follows:

"s http://www.staff.org"

In this case you could use "/s:staff/s:doc" to access the <doc> element.
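
A sketch of this approach follows, using the PrefixMappings property rather than the pPrefixMappings argument. The class name Demo.XPathPrefix, the method name, and the shortened inline document are assumptions for illustration.

```objectscript
Class Demo.XPathPrefix Extends %RegisteredObject
{

/// Hypothetical example: query a document that uses a default namespace
ClassMethod WithPrefix() As %Status
{
    Set tXML="<staff xmlns=""http://www.staff.org""><doc type=""consultant"">"
    Set tXML=tXML_"<name first=""David"" last=""Marston"">Mr. Marston</name></doc></staff>"

    Set tSC=##class(%XML.XPATH.Document).CreateFromString(tXML,.tDoc)
    If $$$ISERR(tSC) Quit tSC

    // Map the prefix "s" to the default namespace of the document
    Set tDoc.PrefixMappings="s http://www.staff.org"

    // Each element step carries the prefix; the attribute test stays unprefixed
    // because a default namespace does not apply to attributes
    Set tSC=tDoc.EvaluateExpression("/s:staff/s:doc","s:name[@last='Marston']",.tRes)
    If $$$ISERR(tSC) Quit tSC

    Do ##class(%XML.XPATH.Document).ExampleDisplayResults(tRes)
    Quit tSC
}

}
```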

Note that you can use the instance method GetPrefix() to obtain the prefix that you previously specified for a given path in the document.

Evaluating XPath Expressions

To evaluate XPath expressions, you use the EvaluateExpression() method of your instance of %XML.XPATH.Document. For this method, you specify the following arguments, in order:

  1. pContext — The node context, which specifies the context in which to evaluate the expression. Specify a string that contains the XPath syntax for the path to that desired node. For example:

    "/staff/doc"
  2. pExpression — A predicate that selects particular results. Specify a string that contains the desired XPath syntax. For example:

    "name[@last='Marston']"
    Note:

    With other technologies, it is common practice to concatenate the predicate to the end of the node path. The %XML.XPATH.Document class does not support this syntax, because the underlying XSLT processor requires the node context and the predicate as separate arguments.

  3. pResults — The results, which are returned as an output parameter. For information on the results, see “Using the XPath Results,” later in this chapter.

The EvaluateExpression() method returns a status, which should be checked. For example:

 Set tSC=tDoc.EvaluateExpression("/staff/doc","name[@last='Smith']",.tRes)
 If $$$ISERR(tSC) {Do $System.OBJ.DisplayError(tSC)}

Using the XPath Results

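An XPath expression can return a subtree of the XML document, multiple subtrees, or a scalar result. The EvaluateExpression() method of %XML.XPATH.Document is designed to handle all these cases. Specifically, it returns a list of results. Each item in that list has a Type property that has one of the following values:

  • $$$XPATHDOM: the item is an XML subtree, represented by an instance of %XML.XPATH.DOMResult.

  • $$$XPATHVALUE: the item is a scalar result, represented by an instance of %XML.XPATH.ValueResult.

These macros are defined in the %occXSLT.inc include file.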

The following subsections provide details on these classes, as well as a summary and example of the overall approach you might take.

Examining an XML Subtree

This section describes how to navigate the XML subtree represented by %XML.XPATH.DOMResult and how to get information about your current location in that subtree.

Navigating the Subtree

To navigate an instance of %XML.XPATH.DOMResult, you can use the following methods of the instance: Read(), MoveToAttributeIndex(), MoveToAttributeName(), MoveToElement(), and Rewind().

To move to the next node in a document, use the Read() method. The Read() method returns a true value until there are no more nodes to read (that is, until the end of the document is reached).

When you navigate to an element, if that element has attributes, you can navigate to them, by using the following methods:

  • Use the MoveToAttributeIndex() method to move to a specific attribute by index (ordinal position of the attribute within the element). This method takes one argument: the index number of the attribute. Note that you can use the AttributeCount property to learn how many attributes a given element has.

  • Use the MoveToAttributeName() method to move to a specific attribute by name. This method takes two arguments: the name of the attribute and (optionally) the namespace URI.

When you are finished with the attributes for the current element, you can move to the next element in the document by invoking one of the navigation methods such as Read(). Alternatively, you can invoke the MoveToElement() method to return to the element that contains the current attribute.

All the methods described here go forward in a document, except for the Rewind() method, which navigates to the start of the document and resets all properties.
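
The navigation methods can be combined roughly as follows. This is a sketch under stated assumptions: the class Demo.XPathWalk and its method are hypothetical, attribute indexes are taken to start at 1 (as in the ExampleDisplayResults() method of %XML.XPATH.Document), and attribute values are assumed to be short enough to be returned as strings.

```objectscript
Class Demo.XPathWalk Extends %RegisteredObject
{

/// Hypothetical helper: list each element in a DOM result with its attributes.
/// pResult is assumed to be an instance of %XML.XPATH.DOMResult.
ClassMethod ShowElements(pResult As %XML.XPATH.DOMResult)
{
    // Read() advances one node at a time and returns false at the end
    While pResult.Read() {
        If pResult.NodeType'="element" { Continue }
        Write !,"element: ",pResult.Name

        // Save the count while still positioned on the element
        Set tCount=pResult.AttributeCount

        // Visit the attributes by ordinal position
        For tI=1:1:tCount {
            Do pResult.MoveToAttributeIndex(tI)
            Write !,"  ",pResult.Name,"=",pResult.Value
        }

        // Return to the element that contains the attributes
        If tCount>0 { Do pResult.MoveToElement() }
    }

    // Go back to the start so that the caller can traverse the result again
    Do pResult.Rewind()
}

}
```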

Properties of Nodes

In addition to the Type property, the following properties of %XML.XPATH.DOMResult provide information about your current location.

AttributeCount

If the current node is an element, this property indicates the number of attributes of the element.

EOF

True if the reader has reached the end of the source document; false otherwise.

HasAttributes

If the current node is an element, this property is true if that element has attributes (or false if it does not). If the current node is an attribute, this property is true.

For any other type of node, this property is false.

HasValue

True if the current node is a type of node that has a value (even if that value is null). Otherwise this property is false.

LocalName

For nodes of type attribute or element, this is the name of the current element or attribute, without the namespace prefix. For all other types of nodes, this property is null.

Name

Fully qualified name of the current node, as appropriate for the type of node.

NodeType

Type of the current node, one of the following: attribute, chars, cdata, comment, document, documentfragment, documenttype, element, entity, entityreference, notation, or processinginstruction.

Path

For nodes of type element, this is the path to the element. For all other types of nodes, this property is null.

ReadState

Indicates the overall read state, one of the following:

  • "initial" means that the Read() method has not yet been called.

  • "cursoractive" means that the Read() method has been called at least once.

  • "eof" means that the end of the file has been reached.

Uri

The URI of the current node. The value returned depends on the type of node.

Value

Value, if any, of the current node, as appropriate for the type of node. If the value is less than 32 KB in size, this is a string. Otherwise, it is a character stream. For details, see “Examining a Scalar Result.”
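
To see how these properties change during a traversal, a helper along the following lines may be useful. The class Demo.XPathProps and its method are hypothetical, and the sketch deliberately prints whatever the properties contain rather than assuming particular values.

```objectscript
Class Demo.XPathProps Extends %RegisteredObject
{

/// Hypothetical helper: print location properties for each node of a DOM result.
/// pResult is assumed to be an instance of %XML.XPATH.DOMResult.
ClassMethod ShowProperties(pResult As %XML.XPATH.DOMResult)
{
    Write !,"ReadState before first Read(): ",pResult.ReadState

    While pResult.Read() {
        Write !,"NodeType=",pResult.NodeType
        Write ", Name=",pResult.Name
        Write ", LocalName=",pResult.LocalName
        Write ", Path=",pResult.Path
        Write ", HasValue=",pResult.HasValue
    }

    Write !,"ReadState after last Read(): ",pResult.ReadState
    Write !,"EOF=",pResult.EOF
}

}
```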

Examining a Scalar Result

This section describes how to use the XPath result that is represented by the %XML.XPATH.ValueResult class. In addition to the Type property, this class provides the Value property.

Note that if the value is larger than 32 KB in length, it is automatically placed in a stream object. Unless you are certain of the kinds of results that you will receive, you should check whether Value is a stream object. To do so, you can use the $IsObject function. (That is, if this value is an object, it is a stream object, because that is the only kind of object it can be.)

The following fragment shows a possible test:

 // Value can be a stream if result is greater than 32 KB in length
 Set tValue=tResult.Value

 If $IsObject(tValue){
     Write ! Do tValue.OutputToDevice()
 } else {
     Write tValue
 }

General Approach

Unless you can be certain of the kinds of results that you will receive when you evaluate XPath expressions, you should write your code to handle the most general possible case. A possible organization of the code is as follows:

  1. Find the number of elements in the list of returned results. Iterate through this list.

  2. For each list item, check the Type property.

    • If Type is $$$XPATHDOM, use the methods of the %XML.XPATH.DOMResult class to navigate this XML subtree and examine it.

    • If Type is $$$XPATHVALUE, check whether the Value property is a stream object. If it is a stream object, use the usual stream interface to access the data. Otherwise, the Value property is a string.

For an example, see the ExampleDisplayResults() class method of the %XML.XPATH.Document. This method examines the results as described in the preceding list and then writes them to the Terminal. The following section shows what the output from this method looks like.
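
The organization described in the preceding list can also be sketched in a few lines. The class Demo.XPathResults and its method are hypothetical; the sketch only counts the nodes of each subtree instead of displaying them, and it assumes that the class includes %occXSLT.inc so that the Type macros are available.

```objectscript
Include %occXSLT

Class Demo.XPathResults Extends %RegisteredObject
{

/// Hypothetical helper: dispatch on the Type of each item in the result list.
ClassMethod Summarize(pResults As %ListOfObjects)
{
    For tI=1:1:pResults.Count() {
        Set tResult=pResults.GetAt(tI)

        If tResult.Type=$$$XPATHDOM {
            // Subtree: count the nodes reached by Read()
            Set tNodes=0
            While tResult.Read() { Set tNodes=tNodes+1 }
            Write !,"Item ",tI,": subtree, nodes read: ",tNodes
        } ElseIf tResult.Type=$$$XPATHVALUE {
            // Scalar: Value is either a string or a stream object
            Set tValue=tResult.Value
            Write !,"Item ",tI,": value "
            If $IsObject(tValue) {
                Do tValue.OutputToDevice()
            } Else {
                Write tValue
            }
        }
    }
}

}
```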

Examples

The examples in this section evaluate XPath expressions against the following XML document:

<?xml version="1.0"?>
<staff>
  <doc type="consultant">
    <name first="David" last="Marston">Mr. Marston</name>
    <name first="David" last="Bertoni">Mr. Bertoni</name>
    <name first="Donald" last="Leslie">Mr. Leslie</name>
    <name first="Emily" last="Farmer">Ms. Farmer</name>
  </doc>
  <doc type="GP">
    <name first="Myriam" last="Midy">Ms. Midy</name>
    <name first="Paul" last="Dick">Mr. Dick</name>
    <name first="Scott" last="Boag">Mr. Boag</name>
    <name first="Shane" last="Curcuru">Mr. Curcuru</name>
    <name first="Joseph" last="Kesselman">Mr. Kesselman</name>
    <name first="Stephen" last="Auriemma">Mr. Auriemma</name>
  </doc>
</staff>

These examples were adapted from more extensive examples contained in the %XML.XPATH.Document class; see the source code for a closer look at those.

Evaluating an XPath Expression That Has a Subtree Result

The following class method reads an XML file and evaluates an XPath expression that returns an XML subtree:

/// Evaluates an XPath expression that returns a DOM Result
/// filename is the path of the XML file to read
ClassMethod Example1(filename As %String)
{
    Set tSC=$$$OK
    do {

        Set tSC=##class(%XML.XPATH.Document).CreateFromFile(filename,.tDoc)
        If $$$ISERR(tSC) Quit

        Set context="/staff/doc"
        Set expr="name[@last='Marston']"
        Set tSC=tDoc.EvaluateExpression(context,expr,.tRes)
        If $$$ISERR(tSC) Quit

        Do ##class(%XML.XPATH.Document).ExampleDisplayResults(tRes)

    } while (0)
    // Report any error once, after leaving the do/while block
    If $$$ISERR(tSC) {Do $System.OBJ.DisplayError(tSC)}
    Quit
}

This example selects any <name> element that has a last attribute equal to Marston. The expression is evaluated in the context of the <doc> elements within the <staff> element.

Notice that this example uses the ExampleDisplayResults() class method of the %XML.XPATH.Document.

When you execute the Example1() method, providing the previous XML file as input, you see the following output:

XPATH DOM
element: name
         attribute: first Value: David
         attribute: last  Value: Marston
 
chars : #text Value: Mr. Marston

Evaluating an XPath Expression That Has a Scalar Result

The following class method reads an XML file and evaluates an XPath expression that returns a scalar result:

/// Evaluates an XPath expression that returns a VALUE Result
/// filename is the path of the XML file to read
ClassMethod Example2(filename As %String)
{
    Set tSC=$$$OK
    do {

        Set tSC=##class(%XML.XPATH.Document).CreateFromFile(filename,.tDoc)
        If $$$ISERR(tSC) Quit

        Set tSC=tDoc.EvaluateExpression("/staff","count(doc)",.tRes)
        If $$$ISERR(tSC) Quit

        Do ##class(%XML.XPATH.Document).ExampleDisplayResults(tRes)

    } while (0)
    // Report any error once, after leaving the do/while block
    If $$$ISERR(tSC) {Do $System.OBJ.DisplayError(tSC)}
    Quit
}

This example counts <doc> subnodes. This expression is evaluated in the <staff> element.

When you execute the Example2() method, providing the previous XML file as input, you see the following output:

XPATH VALUE
2