XML Tutorial – Introduction, Structure, and Syntax Rules

In this tutorial, we will introduce the XML data exchange format and discuss the structure of an XML document and the syntax rules for XML.


What is XML?


  • XML (Extensible Markup Language) is a markup language, like HTML, designed for storing and transmitting data.
  • XML is widely used in web services to transport data over the network.
  • XML has no predefined tags, unlike HTML.
  • XML is straightforward to parse and generate with standard libraries.
  • XML provides strong support for Unicode characters. The default character encoding for XML documents is UTF-8.
  • XML defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.
  • XML is widely used in Service-Oriented Architecture (SOA).
  • XML files have the extension .xml, and the media types for XML are application/xml and text/xml.
  • Almost all major programming languages support XML because it is a language-independent data format.
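
To make the "parse and generate" point concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. Python is chosen only for illustration, and the element names person, name, and city are made up for this example; XML itself does not define them.

```python
import xml.etree.ElementTree as ET

# Generate: build a small document in memory. The tag names are invented
# for this example, since XML has no predefined tags.
person = ET.Element("person")
ET.SubElement(person, "name").text = "John Snow"
ET.SubElement(person, "city").text = "Z\u00fcrich"  # non-ASCII (Unicode) text

# Serialize to UTF-8 bytes, including the XML declaration.
data = ET.tostring(person, encoding="utf-8", xml_declaration=True)
print(data.decode("utf-8"))

# Parse: read the bytes back and extract the values.
root = ET.fromstring(data)
print(root.tag, root.findtext("name"), root.findtext("city"))
```

The same round trip can be written in most other languages with their own XML libraries, which is what makes the format useful for exchanging data between systems.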


Structure of an XML document:


  1. An XML document contains exactly one root element, which encloses all other elements in the document.


  2. XML documents may begin with a prolog that appears before the root element. It holds metadata about the XML document, such as the character encoding, document structure, and style sheets. For example,

    <?xml version="1.0" encoding="UTF-8"?>


  3. A tag in XML is a case-sensitive markup construct that begins with < and ends with >. A tag can be:


    • A start-tag, such as <name>;
    • An end-tag, such as </name>;
    • An empty-element tag, such as <name/>.


  4. An element in XML begins with a start-tag and ends with the matching end-tag; the characters between the two tags are its content. For example, <name>John Snow</name>. It can also consist only of an empty-element tag. For example, <name/>.

  5. XML elements can have attributes, which exist within a start-tag or an empty-element tag.
    An attribute consists of a name–value pair. For example,

    <img src="screenshot.png" alt="screenshot" />

    Here the names of the attributes are src and alt, and their values are screenshot.png and screenshot respectively.
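
The building blocks described above (prolog, root element, nested elements, an empty-element tag, and attributes) can be seen together in one small document. The following is a sketch only: the gallery root element and the overall document are hypothetical, and Python's standard xml.etree.ElementTree module is used just as one convenient way to inspect the structure.

```python
import xml.etree.ElementTree as ET

# A hypothetical document: prolog, a single root element, a nested element,
# and an empty-element tag that carries two attributes.
doc = """<?xml version="1.0" encoding="UTF-8"?>
<gallery>
  <name>John Snow</name>
  <img src="screenshot.png" alt="screenshot" />
</gallery>"""

root = ET.fromstring(doc.encode("utf-8"))

print(root.tag)                      # the single root element
for child in root:                   # elements nested inside the root
    print(child.tag, child.text, child.attrib)

img = root.find("img")
print(img.get("src"), img.get("alt"))   # attribute values by name

# Tag names are case-sensitive, so <Name> is not the same as <name>.
print(root.find("Name") is None)
```

Note that the empty-element tag has no text content, so its text is reported as None while its attributes are still available.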


Syntax Rules:


  1. Each start-tag in XML must have a matching end-tag, and all tags must be properly nested, with none missing and none overlapping.
    Tag names cannot contain any of the characters !"#$%&'()*+,/;<=>?@[\]^`{|}~, nor a space character,
    and cannot begin with -, ., or a digit.
  2. The characters < and & hold special meaning in XML. They are key syntax characters and must not appear unescaped in element content outside a CDATA section.

    XML provides escape facilities to handle these special characters. For example:

    • &lt; represents <;
    • &amp; represents &.

    XML has three other predefined entities:

    • &gt; represents >;
    • &apos; represents ';
    • &quot; represents ".


  3. An XML document cannot contain any whitespace before the XML declaration; otherwise the parser treats the declaration as a processing instruction and, because the target name xml is reserved, rejects the document. XML processors preserve all whitespace in element content, while each whitespace character (such as a tab or line break) within an attribute value is reported as a single space.
  4. Similar to HTML, a comment in XML begins with <!-- and ends with -->.
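
The syntax rules above can be checked against a real parser. The sketch below uses Python's standard xml.etree.ElementTree and xml.sax.saxutils modules; the helper is_well_formed and the sample strings are illustrative names introduced here, not part of any XML API.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape


def is_well_formed(text):
    """Return True if the standard parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Rule 1: tags must match and nest properly, and names must be legal.
print(is_well_formed("<a><b>ok</b></a>"))        # True
print(is_well_formed("<a><b>overlap</a></b>"))   # False
print(is_well_formed("<1a>digit first</1a>"))    # False

# Rule 2: < and & must be escaped in element content.
raw = "1 < 2 && 3 > 2"
safe = escape(raw)                               # escapes &, < and >
print(safe)                                      # 1 &lt; 2 &amp;&amp; 3 &gt; 2
print(is_well_formed("<note>" + raw + "</note>"))              # False
print(ET.fromstring("<note>" + safe + "</note>").text == raw)  # True
print(unescape(safe) == raw)                                   # True

# Inside a CDATA section the same characters may appear literally.
cdata = ET.fromstring("<note><![CDATA[" + raw + "]]></note>")
print(cdata.text == raw)                         # True

# Rule 3: nothing, not even whitespace, may precede the XML declaration.
print(is_well_formed('<?xml version="1.0"?><a/>'))    # True
print(is_well_formed(' <?xml version="1.0"?><a/>'))   # False

# Whitespace is kept in element content but normalized in attribute values.
elem = ET.fromstring('<a x="one\ttwo\nthree">  keep\n  me  </a>')
print(repr(elem.text))      # '  keep\n  me  '
print(repr(elem.get("x")))  # 'one two three'
```

When generating XML by hand, passing text through an escaping function such as escape is usually safer than concatenating raw strings; libraries that build the document tree for you apply the same escaping automatically.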

That's all about the XML data exchange format, the structure of an XML document, and the XML syntax rules. Thanks for reading.




