
Parts of an XML Document

An XML document looks like this:

<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE bookinfo SYSTEM "bookinfo.dtd">
<?cursor line="122" col="48" ?>
<!-- This is a comment -->
<bookinfo id="48">
<title>Moby Duck&mdash;The Great White Mallard</title>
</bookinfo>

An XML document is plain text with embedded markup. Most markup is enclosed in angle brackets (< and >); entity references, which start with & and end with ;, are the exception.

An XML document can contain the following components:

  1. The XML declaration:

    <?xml version="1.0" encoding="utf-8" ?>
    
    

    The XML declaration is optional. If present, it must be the first thing in the document. It states the XML version and, optionally, the character encoding.

  2. A document type declaration:

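    <!DOCTYPE bookinfo SYSTEM "bookinfo.dtd">

    This optional declaration names the root element (bookinfo) and points to a DTD (document type definition), which describes the structure of the document. The SYSTEM keyword is required before the location of an external DTD file.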

  3. A processing instruction:

    <?cursor line="122" col="48" ?>
    

    An XML document may contain processing instructions. These are not part of the document's character data; the parser passes them through to the application, so XML tools (such as an XML editor) can use them as special markers.

  4. A comment:

    <!-- This is a comment -->
    

    Comments are not part of the document's content. A parser may discard them or pass them on to the application.

  5. An element:

    <bookinfo>
    

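    Every XML document has exactly one top-level element, called the root element.

    Every start tag must have a corresponding end tag:

    </bookinfo>

    An element with no content can instead be written as a single empty-element tag, such as <bookinfo/>.
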
  6. An attribute:

    <bookinfo id="48">
    

    An element may contain zero or more attributes. Attributes specify additional information about an element.

  7. An entity:

    &mdash;
    

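    An entity is the XML version of a macro: when a document is processed, each entity reference is replaced with the corresponding set of characters. XML predefines only five entities (&lt;, &gt;, &amp;, &apos; and &quot;); any other entity, such as &mdash;, must be declared in the DTD.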

  8. Text:

    Moby Duck
    

    This is the actual textual content of the document.
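
To see how a parser reports the components listed above, the following is a minimal sketch using Python's standard xml.parsers.expat module. It is an illustration only, not part of any particular XML tool. The sample is modeled on the document at the top of this page, with two assumptions added so that it runs on its own: the DOCTYPE uses the SYSTEM keyword, and an internal subset declares the mdash entity (bookinfo.dtd is never loaded). The function name list_parts and the labels it produces are hypothetical.

```python
import xml.parsers.expat

# Sample modeled on the document above. Two details are assumptions made so the
# sketch is self-contained: the SYSTEM keyword in the DOCTYPE, and the internal
# subset that declares &mdash; (bookinfo.dtd itself is never loaded here).
SAMPLE = (
    '<?xml version="1.0" encoding="utf-8" ?>\n'
    '<!DOCTYPE bookinfo SYSTEM "bookinfo.dtd" [\n'
    '  <!ENTITY mdash "&#8212;">\n'
    ']>\n'
    '<?cursor line="122" col="48" ?>\n'
    '<!-- This is a comment -->\n'
    '<bookinfo id="48">\n'
    '<title>Moby Duck&mdash;The Great White Mallard</title>\n'
    '</bookinfo>\n'
)


def list_parts(document):
    """Return (kind, detail) tuples in the order expat reports them."""
    parts = []

    def on_xml_decl(version, encoding, standalone):
        parts.append(("xml declaration", (version, encoding)))

    def on_doctype(name, system_id, public_id, has_internal_subset):
        parts.append(("doctype", (name, system_id)))

    def on_pi(target, data):
        parts.append(("processing instruction", (target, data)))

    def on_comment(data):
        parts.append(("comment", data))

    def on_start(name, attributes):
        parts.append(("start tag", (name, attributes)))

    def on_end(name):
        parts.append(("end tag", name))

    def on_text(data):
        # Text can arrive in several chunks (for example around an expanded
        # entity), so merge chunks that directly follow each other.
        if parts and parts[-1][0] == "text":
            parts[-1] = ("text", parts[-1][1] + data)
        elif data.strip():  # drop whitespace-only text between tags
            parts.append(("text", data))

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = on_xml_decl
    parser.StartDoctypeDeclHandler = on_doctype
    parser.ProcessingInstructionHandler = on_pi
    parser.CommentHandler = on_comment
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CharacterDataHandler = on_text
    parser.Parse(document, True)  # True marks this as the final chunk
    return parts


if __name__ == "__main__":
    for kind, detail in list_parts(SAMPLE):
        print(kind, detail)
```

Attributes arrive as a dictionary alongside the start tag, and the entity reference is already expanded by the time the text is reported. The low-level expat interface is used here because higher-level APIs such as xml.etree.ElementTree discard comments and processing instructions by default.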
