Using Caché XML Tools
Evaluating XPath Expressions

XPath (XML Path Language) is an expression language for obtaining data from an XML document. With the %XML.XPATH.Document class, you can evaluate XPath expressions against any XML document that you provide. This chapter gives an overview of the process, describes the argument lists used when creating an XPATH document, explains how to evaluate XPath expressions and use the results, and ends with examples.

Note:
The XML declaration of any XML document that you use should indicate the character encoding of that document, and the document should be encoded as declared. If the character encoding is not declared, Caché uses the defaults described in “Character Encoding of Input and Output,” earlier in this book. If these defaults are not correct, modify the XML declaration so that it specifies the character set actually used.
Overview of Evaluating XPath Expressions in Caché
To use Caché XML support to evaluate XPath expressions using an arbitrary XML document, you do the following:
  1. Create an instance of %XML.XPATH.Document. To do so, use one of the following class methods: CreateFromFile(), CreateFromStream(), or CreateFromString(). With any of these methods, you specify the input XML document as the first argument and receive an instance of %XML.XPATH.Document as an output parameter.
    This step parses the XML document, using a built-in XSLT processor.
  2. Use the EvaluateExpression() method of your instance of %XML.XPATH.Document. For this method, you specify the node context and an expression to evaluate.
    You receive the results as an output parameter (as the third argument).
The following sections provide details of all these methods, as well as examples.
Note:
If you are iterating through a large set of documents and evaluating XPath expressions for each of them, InterSystems recommends that you set the OREF for a document equal to null when you are done processing it, before you open the next document. This works around a limitation in third-party software. This limitation causes slightly increased CPU usage as you process large numbers of documents in a loop.
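The recommendation in the preceding note could be followed as in this minimal sketch. The method name ProcessFiles() and its fileList argument (assumed here to be a %ListOfDataTypes of file paths) are hypothetical. The XPath expression is only a placeholder.

```objectscript
/// Hypothetical helper: evaluates one expression against each file in fileList
/// (fileList is assumed to be a %ListOfDataTypes of file paths)
ClassMethod ProcessFiles(fileList As %ListOfDataTypes) As %Status
{
    Set tSC=$$$OK
    For i=1:1:fileList.Count() {
        Set tSC=##class(%XML.XPATH.Document).CreateFromFile(fileList.GetAt(i),.tDoc)
        If $$$ISERR(tSC) Quit

        // Placeholder expression; substitute whatever you need to evaluate
        Set tSC=tDoc.EvaluateExpression("/staff","count(doc)",.tRes)
        If $$$ISERR(tSC) Quit

        Do ##class(%XML.XPATH.Document).ExampleDisplayResults(tRes)

        // Set the document OREF to null before opening the next document
        Set tDoc=""
    }
    Quit tSC
}
```

The argumentless Quit inside the For block leaves the loop, so the first failing status is the one returned to the caller.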
Argument Lists When Creating an XPATH Document
To create an instance of %XML.XPATH.Document, use the CreateFromFile(), CreateFromStream(), or CreateFromString() class method of that class. For these class methods, the complete argument list is as follows, in order:
  1. pSource, pStream, or pString — The source document.
  2. pDocument — The result, which is returned as an output parameter. This is an instance of %XML.XPATH.Document.
  3. pResolver — An optional entity resolver to use when parsing the source. See “Performing Custom Entity Resolution,” in the chapter “Customizing How the SAX Parser Is Used.”
  4. pErrorHandler — An optional custom error handler. See “Customizing the Error Handling,” in the next chapter. If you do not specify a custom error handler, the method uses a new instance of %XML.XSLT.ErrorHandler.
  5. pFlags — Optional flags to control the validation and processing performed by the SAX parser. See “Setting the Parser Flags,” in the chapter “Customizing How the SAX Parser Is Used.”
  6. pSchemaSpec — An optional schema specification, against which to validate the document source. This argument is a string that contains a comma-separated list of namespace/URL pairs:
    "namespace URL,namespace URL"
    Here namespace is the XML namespace used for the schema and URL is a URL that gives the location of the schema document. There is a single space character between the namespace and URL values.
  7. pPrefixMappings — An optional prefix mappings string. For details, see the subsection “Adding Prefix Mappings for Default Namespaces.”
The CreateFromFile(), CreateFromStream(), and CreateFromString() methods return a status, which should be checked. For example:
 Set tSC=##class(%XML.XPATH.Document).CreateFromFile("c:\sample.xml",.tDocument)
 If $$$ISERR(tSC) Do $System.OBJ.DisplayError(tSC)
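The same pattern applies when the XML is already in memory. The following fragment is a minimal sketch that uses CreateFromString() with only the first two arguments. The variable name xml and the document content are made up for illustration.

```objectscript
 // Hypothetical in-memory document; quotes inside the string are doubled
 Set xml="<staff><doc type=""GP""><name first=""Scott"" last=""Boag"">Mr. Boag</name></doc></staff>"

 // pString is the first argument; the new document comes back in tDocument
 Set tSC=##class(%XML.XPATH.Document).CreateFromString(xml,.tDocument)
 If $$$ISERR(tSC) Do $System.OBJ.DisplayError(tSC)
```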
Adding Prefix Mappings for Default Namespaces
When an XML document uses default namespaces, that poses a problem for XPath. Consider the following example:
<?xml version="1.0"?>
<staff xmlns="http://www.staff.org">
  <doc type="consultant">
    <name first="David" last="Marston">Mr. Marston</name>
    <name first="David" last="Bertoni">Mr. Bertoni</name>
    <name first="Donald" last="Leslie">Mr. Leslie</name>
    <name first="Emily" last="Farmer">Ms. Farmer</name>
  </doc>
</staff>
In this case, the <staff> element belongs to a namespace but does not have a namespace prefix. XPath does not provide an easy way to access the <doc> element.
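So that you can easily access nodes that have default namespaces, the %XML.XPATH.Document class provides a prefix mappings feature, which you can use in either of two ways: set the PrefixMappings property of your instance of %XML.XPATH.Document, or specify the pPrefixMappings argument when you call CreateFromFile(), CreateFromStream(), or CreateFromString(). In both cases, the value is a string that contains a comma-separated list in which each item is a prefix, followed by a space, followed by a namespace URI.
Then use these prefixes in the same way you use any namespace prefixes.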
For example, suppose that when you read the preceding XML into an instance of %XML.XPATH.Document, you specified PrefixMappings as follows:
"s http://www.staff.org"
In this case you could use "/s:staff/s:doc" to access the <doc> element.
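As a sketch of how this might look in code, the following method passes the mapping as the seventh argument (pPrefixMappings) of CreateFromFile() and leaves arguments 3 through 6 empty. The method name PrefixSketch() is hypothetical, and filename is assumed to point to the namespaced document shown earlier in this section.

```objectscript
/// Hypothetical example: reads a document that uses a default namespace
ClassMethod PrefixSketch(filename As %String)
{
    // Argument 7 maps the prefix s to the default namespace of the document
    Set tSC=##class(%XML.XPATH.Document).CreateFromFile(filename,.tDoc,,,,,"s http://www.staff.org")
    If $$$ISERR(tSC) {Do $System.OBJ.DisplayError(tSC) Quit}

    // Use the s prefix on element names in both the context and the expression
    Set tSC=tDoc.EvaluateExpression("/s:staff/s:doc","s:name[@last='Marston']",.tRes)
    If $$$ISERR(tSC) {Do $System.OBJ.DisplayError(tSC) Quit}

    Do ##class(%XML.XPATH.Document).ExampleDisplayResults(tRes)
}
```

The attribute test @last carries no prefix, because in XML an unprefixed attribute does not belong to the default namespace.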
Evaluating XPath Expressions
To evaluate XPath expressions, you use the EvaluateExpression() method of your instance of %XML.XPATH.Document. For this method, you specify the following arguments, in order:
  1. pContext — The node context, which specifies the context in which to evaluate the expression. Specify a string that contains the XPath syntax for the path to that desired node. For example:
    "/staff/doc"
  2. pExpression — A predicate that selects particular results. Specify a string that contains the desired XPath syntax. For example:
    "name[@last='Marston']"
    Note:
    With other technologies, it is common practice to concatenate the predicate to the end of the node path. The %XML.XPATH.Document class does not support this syntax, because the underlying XSLT processor requires the node context and the predicate as separate arguments.
  3. pResults — The results, which are returned as an output parameter. For information on the results, see “Using the XPath Results,” later in this chapter.
The EvaluateExpression() method returns a status, which should be checked. For example:
 Set tSC=tDoc.EvaluateExpression("/staff/doc","name[@last='Smith']",.tRes)
 If $$$ISERR(tSC) {Do $System.OBJ.DisplayError(tSC)}
Using the XPath Results
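An XPath expression can return a subtree of the XML document, multiple subtrees, or a scalar result. The EvaluateExpression() method of %XML.XPATH.Document is designed to handle all these cases. Specifically, it returns a list of results. Each item in that list has a Type property whose value is either $$$XPATHDOM (the item is a subtree of the XML document, represented by an instance of %XML.XPATH.DOMResult) or $$$XPATHVALUE (the item is a single scalar result, represented by an instance of %XML.XPATH.ValueResult).
These macros are defined in the %occXSLT.inc include file.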
The following subsections provide details on these classes, as well as a summary and example of the overall approach you might take.
Examining an XML Subtree
This section describes how to navigate the XML subtree represented by %XML.XPATH.DOMResult and how to get information about your current location in that subtree.
Navigating the Subtree
To navigate an instance of %XML.XPATH.DOMResult, you can use the following methods of the instance: Read(), MoveToAttributeIndex(), MoveToAttributeName(), MoveToElement(), and Rewind().
To move to the next node in a document, use the Read() method. The Read() method returns a true value until there are no more nodes to read (that is, until the end of the document is reached).
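When you navigate to an element that has attributes, you can navigate to those attributes by using MoveToAttributeIndex(), which moves to an attribute by its ordinal position within the element, or MoveToAttributeName(), which moves to an attribute by name.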
When you are finished with the attributes for the current element, you can move to the next element in the document by invoking one of the navigation methods such as Read(). Alternatively, you can invoke the MoveToElement() method to return to the element that contains the current attribute.
All the methods described here go forward in a document, except for the Rewind() method, which navigates to the start of the document and resets all properties.
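The navigation methods described in this section might be combined as in the following sketch, which lists each element in a subtree together with its attributes. The method name ShowElements() is hypothetical, and the code assumes that attribute values are short enough to be returned as strings rather than streams.

```objectscript
/// Hypothetical helper: lists the elements of a subtree result and their attributes
ClassMethod ShowElements(tResult As %XML.XPATH.DOMResult)
{
    // Read() advances to the next node and returns false at the end
    While tResult.Read() {
        If tResult.NodeType'="element" Continue

        Write !,"element: ",tResult.Name

        // AttributeCount is read once, while the current node is still the element
        For i=1:1:tResult.AttributeCount {
            Do tResult.MoveToAttributeIndex(i)
            Write !,?4,tResult.Name,"=",tResult.Value
        }

        // If the loop moved to an attribute, return to the element that contains it
        If tResult.NodeType="attribute" Do tResult.MoveToElement()
    }
}
```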
Properties of Nodes
In addition to the Type property, the following properties of %XML.XPATH.DOMResult provide information about your current location.
AttributeCount
If the current node is an element, this property indicates the number of attributes of the element.
EOF
True if the reader has reached the end of the source document; false otherwise.
HasAttributes
If the current node is an element, this property is true if that element has attributes (or false if it does not). If the current node is an attribute, this property is true.
For any other type of node, this property is false.
HasValue
True if the current node is a type of node that has a value (even if that value is null). Otherwise this property is false.
LocalName
For nodes of type attribute or element, this is the name of the current element or attribute, without the namespace prefix. For all other types of nodes, this property is null.
Name
Fully qualified name of the current node, as appropriate for the type of node.
NodeType
Type of the current node, one of the following: attribute, chars, cdata, comment, document, documentfragment, documenttype, element, entity, entityreference, notation, or processinginstruction.
Path
For nodes of type element, this is the path to the element. For all other types of nodes, this property is null.
ReadState
Indicates the overall read state.
Uri
The URI of the current node. The value returned depends on the type of node.
Value
Value, if any, of the current node, as appropriate for the type of node. If the value is less than 32 KB in size, this is a string. Otherwise, it is a character stream. For details, see “Examining a Scalar Result.”
Examining a Scalar Result
This section describes how to use the XPath result that is represented by the %XML.XPATH.ValueResult class. In addition to the Type property, this class provides the Value property.
Note that if the value is larger than 32 KB in length, it is automatically placed in a stream object. Unless you are certain of the kinds of results that you will receive, you should check whether Value is a stream object. To do so, you can use the $IsObject function. (That is, if this value is an object, it is a stream object, because that is the only kind of object it can be.)
The following fragment shows a possible test:
 // Value can be a stream if result is greater than 32 KB in length
 Set tValue=tResult.Value

 If $IsObject(tValue){
     Write ! Do tValue.OutputToDevice()
 } else {
     Write tValue
 }
General Approach
Unless you can be certain of the kinds of results that you will receive when you evaluate XPath expressions, you should write your code to handle the most general possible case. A possible organization of the code is as follows:
  1. Find the number of elements in the list of returned results. Iterate through this list.
  2. For each list item, check the Type property.
For an example, see the ExampleDisplayResults() class method of the %XML.XPATH.Document class. This method examines the results as described in the preceding list and then writes them to the Terminal. The following section shows what the output from this method looks like.
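The organization outlined above can be sketched as follows. This is a simplified illustration and not the source of ExampleDisplayResults(). The class name Demo.XPathSketch and the method name HandleResults() are hypothetical. The sketch assumes that the results list supports Count() and GetAt(), and that the $$$XPATHDOM macro from %occXSLT.inc identifies subtree results.

```objectscript
Include %occXSLT

/// Hypothetical class, for illustration only
Class Demo.XPathSketch Extends %RegisteredObject
{

/// Walks a list of XPath results, branching on the Type of each item
ClassMethod HandleResults(pResults As %ListOfObjects)
{
    For i=1:1:pResults.Count() {
        Set tResult=pResults.GetAt(i)

        If tResult.Type=$$$XPATHDOM {
            // Subtree result: visit each node in turn
            While tResult.Read() {
                Write !,tResult.NodeType,": ",tResult.Name
            }
        } Else {
            // Scalar result: Value is a string or, above 32 KB, a stream
            Set tValue=tResult.Value
            If $IsObject(tValue) {
                Write !
                Do tValue.OutputToDevice()
            } Else {
                Write !,tValue
            }
        }
    }
}

}
```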
Examples
The examples in this section evaluate XPath expressions against the following XML document:
<?xml version="1.0"?>
<staff>
  <doc type="consultant">
    <name first="David" last="Marston">Mr. Marston</name>
    <name first="David" last="Bertoni">Mr. Bertoni</name>
    <name first="Donald" last="Leslie">Mr. Leslie</name>
    <name first="Emily" last="Farmer">Ms. Farmer</name>
  </doc>
  <doc type="GP">
    <name first="Myriam" last="Midy">Ms. Midy</name>
    <name first="Paul" last="Dick">Mr. Dick</name>
    <name first="Scott" last="Boag">Mr. Boag</name>
    <name first="Shane" last="Curcuru">Mr. Curcuru</name>
    <name first="Joseph" last="Kesselman">Mr. Kesselman</name>
    <name first="Stephen" last="Auriemma">Mr. Auriemma</name>
  </doc>
</staff>
These examples were adapted from more extensive examples contained in the %XML.XPATH.Document class; see the source code for a closer look at those.
Evaluating an XPath Expression That Has a Subtree Result
The following class method reads an XML file and evaluates an XPath expression that returns an XML subtree:
/// Evaluates an XPath expression that returns a DOM Result
ClassMethod Example1(filename As %String)
{
    Set tSC=$$$OK
    do {
        // Parse the XML file into an XPATH document
        Set tSC=##class(%XML.XPATH.Document).CreateFromFile(filename,.tDoc)
        If $$$ISERR(tSC) Quit

        Set context="/staff/doc"
        Set expr="name[@last='Marston']"
        Set tSC=tDoc.EvaluateExpression(context,expr,.tRes)
        If $$$ISERR(tSC) Quit

        Do ##class(%XML.XPATH.Document).ExampleDisplayResults(tRes)

    } while (0)
    // Report any error once, after leaving the block
    If $$$ISERR(tSC) {Do $System.OBJ.DisplayError(tSC)}
    Quit
}
This example selects any <name> element whose last attribute equals Marston. The expression is evaluated in the context of the <doc> nodes of the <staff> element.
Notice that this example uses the ExampleDisplayResults() class method of the %XML.XPATH.Document class.
When you execute the Example1() method, providing the previous XML file as input, you see the following output:
XPATH DOM
element: name
         attribute: first Value: David
         attribute: last  Value: Marston
 
chars : #text Value: Mr. Marston
Evaluating an XPath Expression That Has a Scalar Result
The following class method reads an XML file and evaluates an XPath expression that returns a scalar result:
/// Evaluates an XPath expression that returns a VALUE Result
ClassMethod Example2(filename As %String)
{
    Set tSC=$$$OK
    do {
        // Parse the XML file into an XPATH document
        Set tSC=##class(%XML.XPATH.Document).CreateFromFile(filename,.tDoc)
        If $$$ISERR(tSC) Quit

        Set tSC=tDoc.EvaluateExpression("/staff","count(doc)",.tRes)
        If $$$ISERR(tSC) Quit

        Do ##class(%XML.XPATH.Document).ExampleDisplayResults(tRes)

    } while (0)
    // Report any error once, after leaving the block
    If $$$ISERR(tSC) {Do $System.OBJ.DisplayError(tSC)}
    Quit
}
This example counts <doc> subnodes. This expression is evaluated in the <staff> element.
When you execute the Example2() method, providing the previous XML file as input, you see the following output:
XPATH VALUE
2