XML documents may begin with a prolog that appears before the root element. The prolog contains metadata about the XML document, such as the character encoding, document structure, and style sheets. For example:
<?xml version="1.0" encoding="UTF-8"?>
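As a rough illustration of how a parser uses the declaration, the sketch below feeds a small document to Python's standard xml.etree.ElementTree module. The choice of Python, the sample document, and the ISO-8859-1 encoding are assumptions made for this example and do not come from the text above.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the declaration announces ISO-8859-1, so the single
# byte 0xE9 in the content should be decoded as the character "é".
document = b'<?xml version="1.0" encoding="ISO-8859-1"?>\n<name>Jos\xe9</name>'

root = ET.fromstring(document)
print(root.tag)   # name
print(root.text)  # José
```

The document is passed as bytes so that the parser, rather than Python, applies the encoding named in the declaration.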
A tag in XML is a case-sensitive markup construct that begins with < and ends with >. A tag can be:
A start-tag, such as <name>;
An end-tag, such as </name>;
An empty-element tag, such as <name/>.
An element in XML consists of a start-tag, a matching end-tag, and the content between them. For example, <name>John Snow</name>.
It can also consist only of an empty-element tag. For example, <name/>.
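The tag and element forms described above can be explored with a short sketch. It assumes Python's standard xml.etree.ElementTree module, which is not mentioned in the text; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# An element with a start-tag, content, and an end-tag.
full = ET.fromstring("<name>John Snow</name>")
# An element made of a single empty-element tag.
empty = ET.fromstring("<name/>")

print(full.tag, repr(full.text))    # name 'John Snow'
print(empty.tag, repr(empty.text))  # name None

# Tag names are case-sensitive, so <Name> is not closed by </name>.
case_mismatch_rejected = False
try:
    ET.fromstring("<Name>John Snow</name>")
except ET.ParseError as err:
    case_mismatch_rejected = True
    print("not well-formed:", err)
```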
XML elements can have attributes, which appear within a start-tag or empty-element tag.
An attribute consists of a name–value pair.
For example, <img src="screenshot.png" alt="screenshot" />. Here the names of the attributes are src and alt, and their values are screenshot.png and screenshot respectively.
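A minimal sketch of reading those name–value pairs, assuming Python's standard xml.etree.ElementTree module (an assumption of this example, not something the text prescribes):

```python
import xml.etree.ElementTree as ET

img = ET.fromstring('<img src="screenshot.png" alt="screenshot" />')

# Attributes are exposed as a mapping from attribute name to value.
print(img.attrib)        # {'src': 'screenshot.png', 'alt': 'screenshot'}
print(img.get("src"))    # screenshot.png
print(img.get("title"))  # None, because the attribute is absent
```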
Syntax Rules:
Each start-tag in XML must have a matching end-tag, and all tags must be properly nested, with none missing and none overlapping.
The tag names cannot contain any of the characters !"#$%&'()*+,/;<=>?@[\]^`{|}~, nor a space character,
and cannot begin with "-", ".", or a digit.
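The nesting and naming rules above can be checked by asking a parser whether a fragment is well-formed. The sketch below assumes Python's standard xml.etree.ElementTree module; the helper is_well_formed and the sample fragments are hypothetical names and inputs chosen for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts text as a well-formed document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

samples = [
    "<a><b>text</b></a>",  # properly nested
    "<a><b>text</a></b>",  # overlapping tags
    "<a><b>text</b>",      # missing end-tag for <a>
    "<a.b/>",              # "." is allowed inside a name
    "<1a/>",               # name begins with a digit
    "<-a/>",               # name begins with "-"
    "<a$b/>",              # name contains "$"
]

results = {sample: is_well_formed(sample) for sample in samples}
for sample, ok in results.items():
    print(f"{sample!r}: {'well-formed' if ok else 'rejected'}")
```

Only the first and fourth fragments should be accepted; the rest break one of the rules listed above.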
The characters < and & hold special meaning in XML. They are key syntax characters and must not appear literally in element content outside a CDATA section.
XML provides escape facilities to handle these special characters. For example:
&lt; represents <;
&amp; represents &.
XML has three other predefined entities:
&gt; represents >;
&apos; represents ';
&quot; represents ".
An XML document must not contain any whitespace before the XML declaration; otherwise the parser treats the declaration as a processing instruction that uses the reserved target name xml, which is an error. XML processors preserve all whitespace in element content, while whitespace characters such as tabs and line breaks within attribute values are normalized to spaces.
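These whitespace rules can be observed with a short sketch, assuming Python's standard xml.etree.ElementTree module (backed by the Expat parser); the sample documents are hypothetical and other processors may report errors differently.

```python
import xml.etree.ElementTree as ET

# 1. Whitespace before the XML declaration is rejected.
leading_space_rejected = False
try:
    ET.fromstring(' <?xml version="1.0"?><a/>')
except ET.ParseError as err:
    leading_space_rejected = True
    print("rejected:", err)

# 2. Whitespace in element content reaches the application unchanged.
content = ET.fromstring("<a>  two\n  lines  </a>").text
print(repr(content))  # '  two\n  lines  '

# 3. A literal tab or line break inside an attribute value becomes a space.
value = ET.fromstring('<a v="one\ttwo\nthree"/>').get("v")
print(repr(value))  # 'one two three'
```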
Similar to HTML, a comment in XML begins with <!-- and ends with -->.