Reading and Validating XML Documents

With InterSystems IRIS®, you can read and use XML documents in multiple ways. This page summarizes them and explains how to validate the XML documents.

Basic Techniques

InterSystems IRIS provides three basic ways to read and parse XML documents:

Both %XML.TextReader and %XML.Reader use the InterSystems IRIS SAX (Simple API for XML) Parser.

Checking That the Document Is Well-Formed

When you read an XML document, you generally perform validation at the same time. That is, the SAX parser checks that the document is well-formed, and validates the document against the declared schema or DTD as appropriate, along with performing other tasks.

If you simply need to see if an XML document is well formed, use %XML.TextReader as follows:

  1. Specify a document source, via the first argument of one of the following methods:

    Method          First Argument
    ParseFile()     A file name, with complete path. Note that the filename and path must contain only ASCII characters.
    ParseStream()   A stream
    ParseString()   A string
    ParseURL()      A URL
  2. Check the status returned by the parse method. If the status is an error, the error message indicates the location of the problem.

    Note that the document may contain multiple errors, but the parser quits when it can no longer read the document.

For example:


USER>set file="C:\0work\XMLdemo\inputfile2.xml"
 
USER>set status=##class(%XML.TextReader).ParseFile(file)
 
USER>write status=1
1
USER>set file="C:\0work\XMLdemo\inputfile2a.xml"
 
USER>set status=##class(%XML.TextReader).ParseFile(file)
 
USER>write status=1
0
USER>d $system.OBJ.DisplayError(status)
 
ERROR #6301: SAX XML Parser Error: expected end of tag 'xlistitem' while processing 
C:\0work\XMLdemo\inputfile2a.xml at line 423 offset 3
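
The same check can be applied to XML that is already in memory. The following is a minimal sketch, assuming a hypothetical class Demo.XMLCheck and a hypothetical method IsWellFormed() (neither is part of InterSystems IRIS). It relies only on ParseString() and the status handling described above.

```objectscript
/// Hypothetical helper class, shown for illustration only.
Class Demo.XMLCheck Extends %RegisteredObject
{

/// Returns 1 if the given string parses as XML, 0 otherwise.
ClassMethod IsWellFormed(xml As %String) As %Boolean
{
    // Only the first argument is supplied, as in the ParseFile() example above.
    set status=##class(%XML.TextReader).ParseString(xml)
    if $$$ISERR(status) {
        // The error text reports where the parser stopped.
        do $system.OBJ.DisplayError(status)
        return 0
    }
    return 1
}

}
```

As with ParseFile(), the parser stops at the first problem that prevents it from reading further, so a result of 0 does not tell you how many errors the string contains.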

Checking That the Document Follows Its Schema or DTD

You can also use %XML.TextReader to validate the document against the declared schema or DTD as appropriate. To do so, call ParseFile(), ParseStream(), ParseString(), or ParseURL() as described in the previous section, but also include the second argument, TextReader. This argument is returned as output and is an instance of %XML.TextReader that you can use to iterate over the document and find errors and warnings.

For example:

ClassMethod ValidateFile(file As %String = "C:\0work\XMLdemo\inputfile2.xml",schema As %String="")
{
    write !!,"Validating "_file_"..."
    
    if (schema="") {
        //in this case, use the schema that the file refers to;
        //the flags argument is omitted so that the parser uses its default flags
        Set status=##class(%XML.TextReader).ParseFile(file,.tReader)
    } else {
        //use an override schema (sixth argument); the arguments in between keep their defaults
        Set status=##class(%XML.TextReader).ParseFile(file,.tReader,,,,schema)
    }
    if $$$ISERR(status) {
        do $system.OBJ.DisplayError(status)
    }
    if '$ISOBJECT(tReader) {
        write !, ">>> Cannot read this file, because it is not valid..."
        quit
    }
    set errcount=0
    set warningcount=0
    while (tReader.Read()) {
        if (tReader.NodeType="error") {
            set errcount=errcount+1
            Write !, ">>> *ERROR* ",tReader.Value
        } elseif (tReader.NodeType="warning") {
            set warningcount=warningcount+1
            Write !, ">>> *WARNING* ",tReader.Value
        }
    }
    if (errcount=0) && (warningcount=0) {
        write !, ">>> No warnings or errors"
    }
}

You could use the following additional method to scan the files recursively in a directory:

ClassMethod ValidateFilesInDir(dirtoprocess As %String = "C:\0work\XMLdemo",schema As %String="")
{
    set stmt = ##class(%SQL.Statement).%New()
    set status = stmt.%PrepareClassQuery("%File","FileSet")
    if $$$ISERR(status) { 
        do $system.OBJ.DisplayError(status)
        quit  
    }

    set rset = stmt.%Execute(dirtoprocess,"*.xml",,1)
    while rset.%Next() {
        set filetoprocess=rset.%Get("Name")
        set type=rset.%Get("Type")
        if (type="F") {
            do ..ValidateFile(filetoprocess,schema)
        } elseif (type="D") {
            //recurse into the subdirectory, passing along the same override schema
            do ..ValidateFilesInDir(filetoprocess,schema)
        }
    }
}
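
The reader loop used in ValidateFile() can also be applied to XML held in a string, returning counts instead of writing messages. The following is a sketch only, assuming a hypothetical class Demo.XMLCounts and a hypothetical method CountProblems(); the "error" and "warning" node types and the Read() loop are taken from the example above.

```objectscript
/// Hypothetical helper class, shown for illustration only.
Class Demo.XMLCounts Extends %RegisteredObject
{

/// Parses the string and reports how many error and warning nodes the reader returns.
ClassMethod CountProblems(xml As %String, Output errcount As %Integer, Output warningcount As %Integer) As %Status
{
    set errcount=0
    set warningcount=0
    set status=##class(%XML.TextReader).ParseString(xml,.tReader)
    // If no reader was returned, the document could not be read at all.
    if '$ISOBJECT($get(tReader)) {
        return status
    }
    while (tReader.Read()) {
        if (tReader.NodeType="error") {
            set errcount=errcount+1
        } elseif (tReader.NodeType="warning") {
            set warningcount=warningcount+1
        }
    }
    return status
}

}
```

A caller would pass the two counters by reference, for example: set status=##class(Demo.XMLCounts).CountProblems(xml,.errors,.warnings)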

Also see Using %XML.TextReader.
