Representing an XML Document as a DOM

The %XML.Document and %XML.Node classes enable you to represent an arbitrary XML document as a DOM (document object model). You can then navigate this object and modify it. You can also create a new DOM and add to it.

Note:

The XML declaration of any XML document that you use should indicate the character encoding of that document, and the document should be encoded as declared. If the character encoding is not declared, InterSystems IRIS uses the defaults described in “Character Encoding of Input and Output,” earlier in this book. If these defaults are not correct, modify the XML declaration so that it specifies the character set actually used.

Opening an XML Document as a DOM

To open an existing XML document for use as a DOM, do the following:

  1. Create an instance of the %XML.Reader class.

  2. Optionally set the Format property of this instance to specify the format of the file that you are importing.

    By default, InterSystems IRIS assumes that XML files are in literal format. If your file is in SOAP-encoded format, you must indicate this so that the file can be read correctly.

    See “Reader Properties” in the chapter “Importing XML into Objects.”

    This property has no effect unless you use Correlate() and Next().

  3. Open the source. To do so, use one of the following methods of %XML.Reader:

    • OpenFile() — Opens a file.

    • OpenStream() — Opens a stream.

    • OpenString() — Opens a string.

    • OpenURL() — Opens a URL.

    In each case, you can optionally specify a second argument for the method to override the value of the Format property.

  4. Access the Document property, which is a DOM. This property is an instance of %XML.Document and it provides methods that you can use to find information about the document as a whole. For example, CountNamespace() returns the total number of namespaces used by the DOM.

Or, if you have a stream that contains an XML document, call the GetDocumentFromStream() method of %XML.Document. This returns an instance of %XML.Document.

Example 1: Converting a File to a DOM

For example, the following method reads an XML file and returns an instance of %XML.Document that represents that document:

ClassMethod GetXMLDocFromFile(file) As %XML.Document
{
    set reader=##class(%XML.Reader).%New()

    set status=reader.OpenFile(file)
    if $$$ISERR(status) {do $System.Status.DisplayError(status) quit $$$NULLOREF}

    set document=reader.Document
    quit document
}
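
When the XML is already in memory as a string, the same steps apply with OpenString() in place of OpenFile(). The following is a minimal sketch rather than part of the original examples; the method name GetXMLDocFromString and the argument name xmlString are illustrative only.

```objectscript
ClassMethod GetXMLDocFromString(xmlString As %String) As %XML.Document
{
    // hypothetical helper: same pattern as GetXMLDocFromFile, but for a string
    set reader=##class(%XML.Reader).%New()

    set status=reader.OpenString(xmlString)
    if $$$ISERR(status) {do $System.Status.DisplayError(status) quit $$$NULLOREF}

    // the Document property of the reader holds the DOM
    quit reader.Document
}
```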

Example 2: Converting an Object to a DOM

The following method accepts an OREF and returns an instance of %XML.Document that represents that object. The method assumes that the OREF is an instance of an XML-enabled class:

ClassMethod GetXMLDoc(object) As %XML.Document
{
    //make sure this is an instance of an XML-enabled class
    if '$IsObject(object) {
        write "Argument is not an object"
        quit $$$NULLOREF
    }
    set classname=$CLASSNAME(object)
    set isxml=$CLASSMETHOD(classname,"%Extends","%XML.Adaptor")
    if 'isxml {
        write "Argument is not an instance of an XML-enabled class"
        quit $$$NULLOREF
    }
    
    //step 1 - write object as XML to a stream
    set writer=##class(%XML.Writer).%New()
    set stream=##class(%GlobalCharacterStream).%New()
    set status=writer.OutputToStream(stream)
    if $$$ISERR(status) {do $System.Status.DisplayError(status) quit $$$NULLOREF}
    set status=writer.RootObject(object)
    if $$$ISERR(status) {do $System.Status.DisplayError(status) quit $$$NULLOREF}

    //step 2 - extract the %XML.Document from the stream
    set status=##class(%XML.Document).GetDocumentFromStream(stream,.document)
    if $$$ISERR(status) {do $System.Status.DisplayError(status) quit $$$NULLOREF}

    quit document
}

Getting the Namespaces of the DOM

When InterSystems IRIS reads an XML document and creates a DOM, it identifies all the namespaces used in the document and assigns an index number to each.

Your instance of %XML.Document provides the following methods that you can use to find information about the namespaces in the document:

CountNamespace()

Returns the number of namespaces in the document.

FindNamespace()

Returns the index that corresponds to the given namespace.

GetNamespace()

Returns the XML namespace URI for the given index.

The following example method displays a report showing the namespaces used in a document:

ClassMethod ShowNamespaces(doc As %XML.Document)
{
    Set count=doc.CountNamespace()
    Write !, "Number of namespaces in document: "_count
    For i=1:1:count {
        Write !, "Namespace "_i_" is "_doc.GetNamespace(i)
    }
}
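
FindNamespace() works in the opposite direction from GetNamespace(). The following sketch looks up the index for a URI and then maps it back. The method name ShowNamespaceIndex is hypothetical, and the code assumes that FindNamespace() returns a value that evaluates as false (0 or an empty string) when the namespace is not used in the document.

```objectscript
ClassMethod ShowNamespaceIndex(doc As %XML.Document, uri As %String)
{
    // hypothetical helper: look up the index assigned to a namespace URI
    set index=doc.FindNamespace(uri)
    if index {
        // round trip: the index should map back to the same URI
        write !, "Namespace "_doc.GetNamespace(index)_" has index "_index
    } else {
        // assumption: a false value means the namespace was not found
        write !, "Namespace "_uri_" was not found in the document"
    }
}
```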

Also see “Getting Information about the Current Node,” later in this chapter.

Navigating Nodes of the DOM

To access nodes of the document, you can use two different techniques:

  • Use the GetNode() method of your instance of %XML.Document. This method accepts an integer, which indicates the node number, starting with 1.

  • Call the GetDocumentElement() method of your instance of %XML.Document.

    This method returns an instance of %XML.Node, which provides properties and methods that you use to access information about the root node and to move to other nodes. The following subsections provide details on working with %XML.Node.

Moving to Child or Sibling Nodes

To move to child nodes or sibling nodes, use the following methods of your instance of %XML.Node:

  • MoveToFirstChild()

  • MoveToLastChild()

  • MoveToNextSibling()

  • MoveToPreviousSibling()

Each of these methods tries to move to another node (as indicated by the name of the method). If this is possible, the method returns true. If not, it returns false and the focus is the same as it was before the method was called.

Each of these methods has one optional argument, skipWhitespace. If this argument is true, the method ignores any whitespace. The default for skipWhitespace is false.
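
Because these methods return true or false, they combine naturally into a loop over the children of a node. The following is a minimal sketch; the method name ShowChildElements is hypothetical, and the containing class is assumed to include %xml.DOM so that the $$$xmlELEMENTNODE macro is available.

```objectscript
ClassMethod ShowChildElements(doc As %XML.Document)
{
    // start at the root element of the document
    set node=doc.GetDocumentElement()

    // pass 1 for skipWhitespace so that whitespace nodes are not visited
    if 'node.MoveToFirstChild(1) {
        write !, "The root element has no children"
        quit
    }

    // visit each sibling in turn; the loop ends when MoveToNextSibling() returns false
    do {
        if node.NodeType=$$$xmlELEMENTNODE {
            write !, "Child element: "_node.LocalName
        }
    } while node.MoveToNextSibling(1)
}
```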

Moving to the Parent Node

To move to the parent of the current node, use the MoveToParent() method of your instance of %XML.Node.

This method takes an optional argument, restrictDocumentNode. If this argument is true, the method does not move to the document node (the root). The default for restrictDocumentNode is false.

Moving to a Specific Node

To move to a specific node, you can set the NodeId property of your instance of %XML.Node. For example:

   set saveNode = node.NodeId
   //..... lots of processing
   //... 
   // restore position
   set node.NodeId=saveNode

Using the id Attribute

In some cases, the XML document may include an attribute named id, which is used to identify different nodes in the document. For example:

<?xml version="1.0"?>
<team>
<member id="alpha">Jack O'Neill</member>
<member id="beta">Samantha Carter</member>
<member id="gamma">Daniel Jackson</member>
</team>

If (like this example) the document uses an attribute named id, you can use it to navigate to that node. To do so, you use the GetNodeById() method of the document, which returns an instance of %XML.Node positioned at that node. (Notice that unlike most other navigation methods, this method is available from %XML.Document, rather than %XML.Node.)
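
A short sketch of this lookup follows. The method name ShowMemberById is hypothetical, and the $IsObject() check is a defensive assumption, because the behavior of GetNodeById() for an unknown id is not described here.

```objectscript
ClassMethod ShowMemberById(doc As %XML.Document, id As %String)
{
    // jump directly to the element whose id attribute matches
    set node=doc.GetNodeById(id)
    if '$IsObject(node) {
        // assumption: no usable node is returned when the id is unknown
        write !, "No node found with id "_id
        quit
    }

    // GetText() appends to its argument, so start with an empty string
    set text=""
    if node.GetText(.text) {
        write !, "Text of node "_id_": "_text
    }
}
```

With the sample document above, passing "beta" as the id is expected to report the text of the second <member> element.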

DOM Node Types

The %XML.Document and %XML.Node classes recognize the following DOM node types:

  • Element ($$$xmlELEMENTNODE)

    Note that these macros are defined in the %xml.DOM.inc include file.

  • Text ($$$xmlTEXTNODE)

  • Whitespace ($$$xmlWHITESPACENODE)

Other types of DOM nodes are simply ignored.

Consider the following XML document:

<?xml version="1.0"?>
<team>
<member id="alpha">Jack O'Neill</member>
<member id="beta">Samantha Carter</member>
<member id="gamma">Daniel Jackson</member>
</team>

When viewed as a DOM, this document consists of the following nodes:

Example of Document Nodes

NodeID  NodeType              LocalName        Notes
0,29    $$$xmlELEMENTNODE     team
1,29    $$$xmlWHITESPACENODE                   This node is a child of the <team> node
1,23    $$$xmlELEMENTNODE     member           This node is a child of the <team> node
2,45    $$$xmlTEXTNODE        Jack O'Neill     This node is a child of the first <member> node
1,37    $$$xmlWHITESPACENODE                   This node is a child of the <team> node
1,41    $$$xmlELEMENTNODE     member           This node is a child of the <team> node
3,45    $$$xmlTEXTNODE        Samantha Carter  This node is a child of the second <member> node
1,45    $$$xmlWHITESPACENODE                   This node is a child of the <team> node
1,49    $$$xmlELEMENTNODE     member           This node is a child of the <team> node
4,45    $$$xmlTEXTNODE        Daniel Jackson   This node is a child of the third <member> node
1,53    $$$xmlWHITESPACENODE                   This node is a child of the <team> node

For information on accessing the node type, localname, and other details, see the next section.
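
To see how a document breaks down into these node types, you can walk the tree depth first and report each node. The following is a sketch under stated assumptions: the method name ShowTree is hypothetical, and the containing class is assumed to include %xml.DOM for the node type macros. It uses only the navigation methods and node properties described in this chapter.

```objectscript
ClassMethod ShowTree(node As %XML.Node, depth As %Integer = 0)
{
    // report the current node, indented by its depth in the tree
    set type=node.NodeType
    if type=$$$xmlELEMENTNODE {
        write !, ?(depth*2), "element: "_node.LocalName
    } elseif type=$$$xmlTEXTNODE {
        write !, ?(depth*2), "text: "_node.NodeData
    } elseif type=$$$xmlWHITESPACENODE {
        write !, ?(depth*2), "whitespace"
    }

    // the same %XML.Node instance is repositioned as the walk proceeds
    if node.MoveToFirstChild() {
        do {
            do ..ShowTree(node, depth+1)
        } while node.MoveToNextSibling()
        // return to the node where this call started
        do node.MoveToParent()
    }
}
```

To use it, pass the result of GetDocumentElement() as the first argument.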

Getting Information about the Current Node

The following string properties of %XML.Node provide information about the current node. In all cases, an error is thrown if there is no current node.

LocalName

Local name of the current element node. An error is thrown if you try to access this property for another type of node.

Namespace

Namespace URI of the current element node. An error is thrown if you try to access this property for another type of node.

NamespaceIndex

Index of the namespace of the current element node.

When InterSystems IRIS reads an XML document and creates a DOM, it identifies all the namespaces used in the document and assigns an index number to each.

An error is thrown if you try to access this property for another type of node.

Nil

Equals true if xsi:nil or xsi:null is true or 1 for this element node. Otherwise, this property equals false.

NodeData

Value of a character node.

NodeId

ID of the current node. You can set this property in order to navigate to another node.

NodeType

Type of the current node, as discussed in the previous section.

QName

QName of the element node. Only used for output as XML when the prefix is valid for the document.

The following methods provide additional information about the current node:

GetText()
method GetText(ByRef text) as %Boolean

Gets the text contents of an element node. This method returns true if text is returned; in this case, the actual text is appended to the first argument, which is returned by reference.

HasChildNodes()
method HasChildNodes(skipWhitespace As %Boolean = 0) as %Boolean

Returns true if the current node has child nodes; otherwise it returns false.

GetNumberAttributes()
method GetNumberAttributes() as %Integer

Returns the number of attributes of the current element. Attributes are discussed further later in this chapter.

Example

The following example method writes a report that gives information about the current node:

ClassMethod ShowNode(node as %XML.Node) 
{
    // LocalName, Namespace, and NamespaceIndex are valid only for element nodes
    If node.NodeType=$$$xmlELEMENTNODE {
        Write !,"LocalName="_node.LocalName
        Write !,"Namespace="_node.Namespace
        Write !,"NamespaceIndex="_node.NamespaceIndex
    }
    Write !,"Nil="_node.Nil
    Write !,"NodeData="_node.NodeData
    Write !,"NodeId="_node.NodeId
    Write !,"NodeType="_node.NodeType
    Write !,"QName="_node.QName
    Write !,"HasChildNodes returns "_node.HasChildNodes()
    Write !,"GetNumberAttributes returns "_node.GetNumberAttributes()
    // GetText() appends to its argument, so start with an empty string
    Set text=""
    Set status=node.GetText(.text)
    If status {
        Write !, "Text of the node is "_text
    } else {
        Write !, "GetText does not return text"
    }
}

Example output might be as follows:

LocalName=staff
Namespace=
NamespaceIndex=
Nil=0
NodeData=staff
NodeId=1
NodeType=e
QName=staff
HasChildNodes returns 1
GetNumberAttributes returns 5
GetText does not return text

Basic Methods for Examining Attributes

You can use the following methods of %XML.Node to examine the attributes of the current node. Also see “Additional Methods for Examining Attributes,” later in this chapter, for additional methods.

  • AttributeDefined() — Returns nonzero (true) if the current element has an attribute with the given name.

  • FirstAttributeName() — Returns the attribute name for the first attribute of the current element.

  • GetAttributeValue() — Returns the value of the given attribute. If the element does not have the attribute, the method returns null.

  • GetNumberAttributes() — Returns the number of the attributes of the current element.

  • LastAttributeName() — Returns the attribute name of the last attribute of the current element.

  • NextAttributeName() — Given an attribute name, this method returns the name of the next attribute in collation order, whether the specified attribute is valid or not.

  • PreviousAttributeName() — Given an attribute name, this method returns the name of the previous attribute in collation order, whether the specified attribute is valid or not.

The following example walks through the attributes in a given node and writes a simple report:

ClassMethod ShowAttributes(node as %XML.Node) 
{
    Set count=node.GetNumberAttributes()
    Write !, "Number of attributes: ", count
    Set first=node.FirstAttributeName()
    Write !, "First attribute is: ", first
    Write !, "   Its value is: ",node.GetAttributeValue(first)
    Set next=node.NextAttributeName(first)

    For i=1:1:count-2 {
        Write !, "Next attribute is: ", next
        Write !, "   Its value is: ",node.GetAttributeValue(next)
        Set next=node.NextAttributeName(next)
    }
    Set last=node.LastAttributeName()
    Write !, "Last attribute is: ", last
    Write !, "   Its value is: ",node.GetAttributeValue(last)
}

Consider the following sample XML document:

<?xml version="1.0"?>
<staff attr1="first" attr2="second" attr3="third" attr4="fourth" attr5="fifth">
  <doc>
    <name>David Marston</name>
  </doc>
</staff>

If you pass the first node of this document to the example method, you see the following output:

Number of attributes: 5
First attribute is: attr1
   Its value is: first
Next attribute is: attr2
   Its value is: second
Next attribute is: attr3
   Its value is: third
Next attribute is: attr4
   Its value is: fourth
Last attribute is: attr5
   Its value is: fifth
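
If you do not need to label the first and last attributes separately, a single loop is enough and also handles elements with no attributes or only one. The following sketch uses the hypothetical method name ListAttributes. It assumes that FirstAttributeName() and NextAttributeName() return an empty string when there is no attribute to return, in the manner of $ORDER.

```objectscript
ClassMethod ListAttributes(node As %XML.Node)
{
    // assumption: an empty string signals that there are no more attributes
    set name=node.FirstAttributeName()
    while name'="" {
        write !, name_" = "_node.GetAttributeValue(name)
        // advance to the next attribute in collation order
        set name=node.NextAttributeName(name)
    }
}
```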

Additional Methods for Examining Attributes

This section discusses methods that you can use to get the name, value, namespace, QName, and value namespace for any attribute. These methods are grouped as follows:

  • Methods that use only the attribute name

  • Methods that use the attribute name and namespace

Also see “Basic Methods for Examining Attributes.”

Methods That Use Only the Attribute Name

Use the following methods to obtain information about attributes.

GetAttribute()
method GetAttribute(attributeName As %String, 
                    ByRef namespace As %String, 
                    ByRef value As %String, 
                    ByRef valueNamespace As %String)

Returns data for the given attribute. This method returns the following values by reference:

  • namespace is the namespace URI from the QName of the attribute.

  • value is the attribute value.

  • valueNamespace is the namespace URI to which the value belongs. For example, consider the following attribute:

    xsi:type="s:string"

    The value of this attribute is string, and this value is in the namespace that is declared elsewhere with the prefix s. Suppose that an earlier part of this document included the following namespace declaration:

    xmlns:s="http://www.w3.org/2001/XMLSchema" 

    In this case, valueNamespace would be "http://www.w3.org/2001/XMLSchema".

GetAttributeNamespace()
method GetAttributeNamespace(attributeName As %String) as %String

Returns the namespace URI from the QName of the attribute named attributeName for the current element.

GetAttributeQName()
method GetAttributeQName(attributeName As %String) as %String

Returns the QName of the given attribute.

GetAttributeValue()
method GetAttributeValue(attributeName As %String) as %String

Returns the value of the given attribute.

GetAttributeValueNamespace()
method GetAttributeValueNamespace(attributeName As %String) as %String

Returns the namespace of the value of the given attribute.
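
As an illustration of the by-reference arguments, the following sketch retrieves all the data for one attribute with a single GetAttribute() call and then gets the QName separately. The method name ShowAttributeDetails is hypothetical; $GET is used defensively in case an output argument is left undefined.

```objectscript
ClassMethod ShowAttributeDetails(node As %XML.Node, attributeName As %String)
{
    if 'node.AttributeDefined(attributeName) {
        write !, "Attribute "_attributeName_" is not defined for this element"
        quit
    }

    // namespace, value, and valueNamespace are returned by reference
    do node.GetAttribute(attributeName, .namespace, .value, .valueNamespace)
    write !, "Namespace: "_$get(namespace)
    write !, "Value: "_$get(value)
    write !, "Value namespace: "_$get(valueNamespace)

    // the QName is available from its own method
    write !, "QName: "_node.GetAttributeQName(attributeName)
}
```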

Methods That Use the Attribute Name and Namespace

To get information about attributes by using both their names and their namespaces, use the following methods:

GetAttributeNS()
method GetAttributeNS(attributeName As %String, 
                      namespace As %String, 
                      ByRef value As %String, 
                      ByRef valueNamespace As %String)

Returns data for the given attribute, where attributeName and namespace specify the attribute of interest. This method returns the following data by reference:

  • value is the attribute value.

  • valueNamespace is the namespace URI to which the value belongs. For example, consider the following attribute:

    xsi:type="s:string"

    The value of this attribute is string, and this value is in the namespace that is declared elsewhere with the prefix s. Suppose that an earlier part of this document included the following namespace declaration:

    xmlns:s="http://www.w3.org/2001/XMLSchema" 

    In this case, valueNamespace would be "http://www.w3.org/2001/XMLSchema".

GetAttributeQNameNS()
method GetAttributeQNameNS(attributeName As %String, 
                           namespace As %String)
                           as %String

Returns the QName of the given attribute, where attributeName and namespace specify the attribute of interest.

GetAttributeValueNS()
method GetAttributeValueNS(attributeName As %String, 
                           namespace As %String) 
                           as %String

Returns the value of the given attribute, where attributeName and namespace specify the attribute of interest.

GetAttributeValueNamespaceNS()
method GetAttributeValueNamespaceNS(attributeName As %String, 
                                    namespace As %String) 
                                    as %String

Returns the namespace of the value of the given attribute, where attributeName and namespace specify the attribute of interest.
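
The namespace-qualified methods are useful for attributes such as xsi:type, where the prefix can vary from document to document but the namespace URI cannot. The following sketch uses the hypothetical method name ShowXsiType and assumes that GetAttributeValueNS() returns an empty string when the element has no such attribute.

```objectscript
ClassMethod ShowXsiType(node As %XML.Node)
{
    // standard XML Schema instance namespace, to which the xsi prefix is conventionally bound
    set xsi="http://www.w3.org/2001/XMLSchema-instance"

    set value=node.GetAttributeValueNS("type", xsi)
    if value="" {
        // assumption: an empty value means the attribute is absent
        write !, "This element has no xsi:type attribute"
        quit
    }

    write !, "Type: "_value
    write !, "Type namespace: "_node.GetAttributeValueNamespaceNS("type", xsi)
}
```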

Creating or Editing a DOM

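To create a DOM or to modify an existing one, you use the following methods. CreateDocument() and InsertNamespace() are methods of %XML.Document; the other methods, whose descriptions refer to the current node, are methods of %XML.Node.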

CreateDocument()
classmethod CreateDocument(localName As %String, 
                           namespace As %String) 
                           as %XML.Document 

Returns a new instance of %XML.Document that consists of only a root element.

AppendCharacter()
method AppendCharacter(text As %String)

Appends a new character data node to the list of children of this element node. The current node pointer does not change; this node is still the parent of the appended child.

AppendChild()
method AppendChild(type As %String)

Appends a new node to the list of children of this node. The current node pointer does not change; this node is still the parent of the appended child.

AppendElement()
method AppendElement(localName As %String, 
                     namespace As %String, 
                     text As %String)

Appends a new element node to the list of children of this node. If the text argument is specified, then character data is added as the child of the new element. The current node pointer does not change; this node is still the parent of the appended child.

AppendNode()
method AppendNode(sourceNode As %XML.Node) as %Status

Appends a copy of the specified node as a child of this node. The node to copy may be from any document. The current node pointer does not change. This node is still the parent of the appended child.

AppendTree()
method AppendTree(sourceNode As %XML.Node) as %Status

Appends a copy of the specified node, including all its children, as a child of this node. The tree to copy may be from any document, but this node may not be a descendant of the source node. The current node pointer does not change. This node is still the parent of the appended child.

InsertNamespace()
method InsertNamespace(namespace As %String)

Adds the given namespace URI to the document.

InsertCharacter()
method InsertCharacter(text as %String, ByRef child As %String, Output sc As %Status) as %String

Inserts a new character data node as a child of this node. The new character data is inserted just before the specified child node. The child argument is the node ID of the child node; it is passed by reference so that it may be updated after the insert. The nodeId of the inserted node is returned. The current node pointer does not change.

InsertNode()
method InsertNode(sourceNode As %XML.Node, ByRef child As %String, Output sc As %Status) as %String

Inserts a copy of the specified node as a child of this node. The node to copy may be from any document. The new node is inserted just before the specified child node. The child argument is the node ID of the child node; it is passed by reference so that it may be updated after the insert. The nodeId of the inserted node is returned. The current node pointer does not change.

InsertTree()
method InsertTree(sourceNode As %XML.Node, ByRef child As %String, Output sc As %Status) as %String

Inserts a copy of the specified node, including its children, as a child of this node. The tree to copy may be from any document, but this node may not be a descendant of the source node. The new node is inserted just before the specified child node. The child argument is the node ID of the child node; it is passed by reference so that it may be updated after the insert. The nodeId of the inserted node is returned. The current node pointer does not change.

Remove()
method Remove()

Removes the current node and makes its parent the current node.

RemoveAttribute()
method RemoveAttribute(attributeName As %String)

Removes the given attribute.

RemoveAttributeNS()
method RemoveAttributeNS(attributeName As %String, 
                         namespace As %String)

Removes the given attribute, where attributeName and namespace specify the attribute of interest.

ReplaceNode()
method ReplaceNode(sourceNode As %XML.Node) as %Status

Replaces the node with a copy of the specified node. The node to copy may be from any document. The current node pointer does not change.

ReplaceTree()
method ReplaceTree(sourceNode As %XML.Node) as %Status

Replaces the node with a copy of the specified node, including all its children. The tree to copy may be from any document, but this node may not be a descendant of the source node. The current node pointer does not change.

SetAttribute()
method SetAttribute(attributeName As %String, 
                    namespace As %String = "", 
                    value As %String = "", 
                    valueNamespace As %String = "")

Sets data for an attribute of the current element. Here:

  • attributeName is the name of the attribute.

  • namespace is the namespace URI from the QName of the attribute named attributeName for this element.

  • value is the attribute value.

  • valueNamespace is the namespace URI corresponding to the prefix when the attribute value is of the form "prefix:value".
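
To show how these methods fit together, the following sketch builds part of the team document used earlier in this chapter. The method name BuildTeamDoc is hypothetical, and passing an empty string for each namespace argument is assumed to mean that the element or attribute is in no namespace.

```objectscript
ClassMethod BuildTeamDoc() As %XML.Document
{
    // create a document that consists only of the root element <team>
    set doc=##class(%XML.Document).CreateDocument("team", "")
    set node=doc.GetDocumentElement()

    // append <member>Jack O'Neill</member>; the node pointer stays on <team>
    do node.AppendElement("member", "", "Jack O'Neill")

    // move to the new child to set its id attribute, then return to <team>
    if node.MoveToLastChild() {
        do node.SetAttribute("id", "", "alpha")
        do node.MoveToParent()
    }

    // repeat for a second member
    do node.AppendElement("member", "", "Samantha Carter")
    if node.MoveToLastChild() {
        do node.SetAttribute("id", "", "beta")
        do node.MoveToParent()
    }

    quit doc
}
```

Because the Append methods leave the node pointer on the parent, each new child has to be visited explicitly before its attributes can be set.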

Writing XML Output from a DOM

You can serialize a DOM or a node of a DOM and generate XML output. To do this, you use the following methods of %XML.Writer:

Document()
method Document(document As %XML.Document) as %Status

Given an instance of %XML.Document, this method writes the document to the currently specified output destination.

DocumentNode()
method DocumentNode(document As %XML.Node) as %Status

Given an instance of %XML.Node, this method writes the node to the currently specified output destination.

Tree()
method Tree(node As %XML.Node) as %Status

Given an instance of %XML.Node, this method writes the node and its tree of descendants to the currently specified output destination.

For information on specifying the output destination and setting properties of %XML.Writer, see the chapter “Writing XML Output from Objects.”
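
As a brief illustration, the following sketch serializes a complete DOM to a string. The method name DocToString is hypothetical. It uses the OutputToString() and GetXMLString() methods and the Indent property of %XML.Writer, which are not otherwise covered in this chapter.

```objectscript
ClassMethod DocToString(doc As %XML.Document) As %String
{
    set writer=##class(%XML.Writer).%New()
    // indent the output for readability
    set writer.Indent=1

    // direct the output to a string instead of the current device
    set status=writer.OutputToString()
    if $$$ISERR(status) {do $System.Status.DisplayError(status) quit ""}

    // write the whole document
    set status=writer.Document(doc)
    if $$$ISERR(status) {do $System.Status.DisplayError(status) quit ""}

    quit writer.GetXMLString()
}
```

To write only part of a document, position an instance of %XML.Node at the element of interest and call Tree() in place of Document().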