Reading and Validating XML Documents
With InterSystems IRIS® data platform, you can read and use XML documents in multiple ways. This page summarizes them and explains how to validate the XML documents.
Basic Techniques
InterSystems IRIS provides three basic ways to read and parse XML documents:
- For any XML document, you can use %XML.TextReader, which is described in another topic. With this API, after you read the document, you have an instance of %XML.TextReader, which you can use to examine the document node by node.
- For any XML document, you can use %XML.Reader and then access the document as an XML Document Object Model (DOM).
- If the XML document maps appropriately to an XML-enabled class definition, you can use the %XML.Reader class to import that document into an instance of that class, as described in another topic.
Both %XML.TextReader and %XML.Reader use the InterSystems IRIS SAX (Simple API for XML) Parser.
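As a rough illustration of the DOM technique listed above, the following sketch opens a file with %XML.Reader and reports the name of the root element. The method name ShowRootElement is hypothetical, and the sketch assumes that the method lives in a class of your own and that the file exists and is well-formed.

```objectscript
/// Hypothetical helper: open an XML file and show its root element via the DOM.
ClassMethod ShowRootElement(file As %String) As %Status
{
    set reader=##class(%XML.Reader).%New()
    // OpenFile() parses the document; a parse failure is returned as an error status
    set status=reader.OpenFile(file)
    if $$$ISERR(status) {
        quit status
    }
    // after a successful open, the Document property holds the DOM (%XML.Document)
    set node=reader.Document.GetDocumentElement()
    write !,"Root element: ",node.LocalName
    quit $$$OK
}
```

From the returned node, you can move through the rest of the document with the navigation methods of %XML.Node.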
Checking That the Document Is Well-Formed
When you read an XML document, you generally perform validation at the same time. That is, the SAX parser checks that the document is well-formed, and validates the document against the declared schema or DTD as appropriate, along with performing other tasks.
If you simply need to see if an XML document is well-formed, use %XML.TextReader as follows:
- Specify a document source, via the first argument of one of the following methods:

  Method          First Argument
  ParseFile()     A file name, with complete path. Note that the filename and path must contain only ASCII characters.
  ParseStream()   A stream
  ParseString()   A string
  ParseURL()      A URL

- Check the status returned by the parse method. If the status is an error, the error message indicates the location of the problem.
Note that the document may contain multiple errors, but the parser quits when it can no longer read the document.
For example:
USER>set file="C:\0work\XMLdemo\inputfile2.xml"
USER>set status=##class(%XML.TextReader).ParseFile(file)
USER>write status=1
1
USER>set file="C:\0work\XMLdemo\inputfile2a.xml"
USER>set status=##class(%XML.TextReader).ParseFile(file)
USER>write status=1
0
USER>d $system.OBJ.DisplayError(status)
ERROR #6301: SAX XML Parser Error: expected end of tag 'xlistitem' while processing
C:\0work\XMLdemo\inputfile2a.xml at line 423 offset 3
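The same check can be wrapped in a method when the XML is already in memory. The following is a minimal sketch that uses ParseString() instead of ParseFile(); the method name IsWellFormed is hypothetical and is assumed to be defined in a class of your own.

```objectscript
/// Hypothetical helper: return 1 if the given string parses as XML, 0 otherwise.
ClassMethod IsWellFormed(xml As %String) As %Boolean
{
    // only the returned status is needed here, so no reader argument is passed
    set status=##class(%XML.TextReader).ParseString(xml)
    if $$$ISERR(status) {
        // the error text reports where the parser stopped
        do $system.OBJ.DisplayError(status)
        quit 0
    }
    quit 1
}
```

As with ParseFile(), the parser stops at the first problem that prevents it from reading further, so a return value of 0 does not mean that the reported error is the only one.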
Checking That the Document Follows Its Schema or DTD
You can also use %XML.TextReader to validate the document against the declared schema or DTD as appropriate. To do so, call ParseFile(), ParseStream(), ParseString(), or ParseURL() as described in the previous section, but also include the second argument, TextReader. This argument is returned as output and is an instance of %XML.TextReader that you can use to iterate over the document and find errors and warnings.
For example:
ClassMethod ValidateFile(file As %String = "C:\0work\XMLdemo\inputfile2.xml", schema As %String = "")
{
    write !!,"Validating "_file_"..."
    if (schema="") {
        // in this case, use the schema that the file refers to;
        // the Flags argument is omitted, so the parser uses its default flags
        set status=##class(%XML.TextReader).ParseFile(file,.tReader)
    } else {
        // use an override schema (sixth argument), again with the default flags
        set status=##class(%XML.TextReader).ParseFile(file,.tReader,,,,schema)
    }
    if $$$ISERR(status) {
        do $system.OBJ.DisplayError(status)
    }
    // $get guards against the reader never having been assigned
    if '$ISOBJECT($get(tReader)) {
        write !, ">>> Cannot read this file, because it is not valid..."
        quit
    }
    set errcount=0
    set warningcount=0
    // validation problems are reported as nodes of type "error" or "warning"
    while (tReader.Read()) {
        if (tReader.NodeType="error") {
            set errcount=errcount+1
            write !, ">>> *ERROR* ",tReader.Value
        } elseif (tReader.NodeType="warning") {
            set warningcount=warningcount+1
            write !, ">>> *WARNING* ",tReader.Value
        }
    }
    if (errcount=0) && (warningcount=0) {
        write !, ">>> No warnings or errors"
    }
}
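The reader returned in the second argument is not limited to errors and warnings; it visits every node of the document. The following sketch, which assumes a hypothetical method name ListNodes and a small inline sample document, prints the path of each element and the value of each text node.

```objectscript
/// Hypothetical helper: walk a small sample document node by node.
ClassMethod ListNodes()
{
    // sample input, invented for this illustration
    set xml="<order><item>pen</item><item>ink</item></order>"
    set status=##class(%XML.TextReader).ParseString(xml,.reader)
    if $$$ISERR(status) {
        do $system.OBJ.DisplayError(status)
        quit
    }
    while reader.Read() {
        if (reader.NodeType="element") {
            // Path gives the location of the current node within the document
            write !,reader.Path
        } elseif (reader.NodeType="chars") {
            // text content is delivered as a node of type "chars"
            write " = ",reader.Value
        }
    }
}
```

Testing NodeType first, as ValidateFile() does for "error" and "warning", keeps the loop from treating one kind of node as another.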
You could use the following additional method to scan the files recursively in a directory:
ClassMethod ValidateFilesInDir(dirtoprocess As %String = "C:\0work\XMLdemo", schema As %String = "")
{
    set stmt = ##class(%SQL.Statement).%New()
    set status = stmt.%PrepareClassQuery("%File","FileSet")
    if $$$ISERR(status) {
        do $system.OBJ.DisplayError(status)
        quit
    }
    // fourth argument = 1: also return subdirectories, so that they can be scanned
    set rset = stmt.%Execute(dirtoprocess,"*.xml",,1)
    while rset.%Next() {
        set filetoprocess=rset.%Get("Name")
        set type=rset.%Get("Type")
        if (type="F") {
            do ..ValidateFile(filetoprocess,schema)
        } elseif (type="D") {
            // recurse into the subdirectory, keeping the same override schema
            set dirname=rset.%Get("Name")
            do ..ValidateFilesInDir(dirname,schema)
        }
    }
}
Also see Using %XML.TextReader.