An XML Schema Tutorial

This tutorial covers the basics of XML Schemas. Before reading it, you should already be familiar with XML and DTDs; you may want to read my XML and DTD tutorials first.

What is a Schema?

Much like Document Type Definitions (DTDs), Schemas define the elements that can appear in an XML document and the attributes that can be associated with those elements.

Schemas define the document's structure - which elements are children of others, the order the child elements can appear, and the number of child elements. Schemas specify if an element is empty or if it can include text. They can also specify default values for attributes.

Schemas are more powerful and flexible than DTDs and use XML syntax.

Independent developers can agree to use a common Schema for exchanging XML data. Your application can then use this agreed-upon Schema to verify the data it receives. Verifying an XML document against the schema is known as validating the document.

Schema standards are defined by the World Wide Web Consortium (W3C). The W3C site provides a comprehensive reference of XML schemas.

However, the discussion here focuses on Microsoft's implementation of schemas. All samples require Microsoft's Internet Explorer browser, version 5.0 or later, which includes the Msxml parser. All references to the Msxml parser, in text or in sample code, assume Msxml V2.5 or later. For more information, or to download Microsoft's XML products, visit their site.

Using Schemas

To use a schema in an XML document, add a schema namespace declaration:

<book xmlns="x-schema:yourschema.xml">
   <title>Presenting XML</title>
   <author>Richard Light</author>
</book>


Elements and Attributes

You define elements and attributes in a Schema by specifying <ElementType> and <AttributeType> tags. Instances of elements or attributes are declared using the <element> and <attribute> tags.


<?xml version="1.0"?>
<Schema xmlns="urn:schemas-microsoft-com:xml-data">
<ElementType name="title" />
<ElementType name="author" />
<ElementType name="pages" />
<ElementType name="book" model="closed">
   <element type="title" />
   <element type="author" />
   <element type="pages" />

   <AttributeType name="copyright" />
   <attribute type="copyright" />
</ElementType>
</Schema>

Here, there are four <ElementType> elements: "title", "author", "pages" and "book". These are the element definitions. The content of a book is declared within the "book" ElementType: each book contains "title", "author" and "pages" elements, each declared with an <element> tag whose type attribute references the corresponding ElementType.

The schema also defines an <AttributeType> for the copyright attribute and then declares its usage with an <attribute> tag whose type attribute references that definition.

The copyright attribute was defined within the "book" ElementType, so it is scoped to that element type. Thus, different element types can declare attributes with the same name but with potentially different meanings.

<AttributeType> elements can also be declared globally by placing them outside of an ElementType. Then, multiple elements can share a common attribute type without having to redeclare the AttributeType inside each ElementType.
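
As a rough sketch of a globally declared attribute type, the schema below moves the copyright AttributeType to the top level so that two element types can share it. The "magazine" element type is a hypothetical addition used only for illustration.

```xml
<?xml version="1.0"?>
<Schema xmlns="urn:schemas-microsoft-com:xml-data">
   <!-- Declared once, outside any ElementType, so it is global -->
   <AttributeType name="copyright" />

   <ElementType name="title" />

   <ElementType name="book">
      <element type="title" />
      <attribute type="copyright" />
   </ElementType>

   <!-- "magazine" is hypothetical; it reuses the same global AttributeType -->
   <ElementType name="magazine">
      <element type="title" />
      <attribute type="copyright" />
   </ElementType>
</Schema>
```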

Content Model

A content model indicates what an element can contain.

In the above example, a "book" element is defined to contain a sequence of "title", "author" and "pages" elements. Thus, a valid XML file might look like:

<book xmlns="x-schema:book-schema.xml">
   <title>Cooking 101: A Cookbook for Beginners</title>
   <author>Joseph Cook</author>
   <pages>120</pages>
</book>

If the book element contains any elements other than those specified (illustrator for instance) the XML document will not validate. The book content is a closed model due to its model="closed" attribute.

Open Content Models
Open content models enable additional elements and/or attributes to exist within an element without having to declare each and every element in the XML Schema. Content models are open by default.

With the model="closed" attribute removed from the "book" ElementType, this is now a valid XML document:

<book xmlns="x-schema:book-schema.xml" xmlns:new="urn:new-namespace">
   <title>Cooking 101: A Cookbook for Beginners</title>
   <author>Joseph Cook</author>
   <pages>120</pages>
   <new:illustrator>John Doe</new:illustrator>
</book>

A few rules apply to open content models:

  1. You can't add/remove content that will break the existing content model. For example, since <book> is defined as a sequence, the valid data must provide that exact sequence first, before adding any "open" content. Removing the <pages> element or providing two <title> elements next to each other would cause validation to fail.

  2. You can add undeclared elements as long as they are defined in a different namespace.

  3. You can add other elements declared in the same schema. For example, a second <title> element after the <pages> element will validate.
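
To illustrate rules 2 and 3 together, the instance below should validate against the book schema, assuming the "book" ElementType uses an open content model. The page count and the second title are made-up sample values.

```xml
<book xmlns="x-schema:book-schema.xml" xmlns:new="urn:new-namespace">
   <!-- Rule 1: the declared sequence comes first and stays intact -->
   <title>Cooking 101: A Cookbook for Beginners</title>
   <author>Joseph Cook</author>
   <pages>120</pages>
   <!-- Rule 2: an undeclared element from a different namespace -->
   <new:illustrator>John Doe</new:illustrator>
   <!-- Rule 3: another element declared in the same schema -->
   <title>An Alternate Title</title>
</book>
```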

Element Content
An element can contain text, other elements, a mixture of text and elements, or nothing at all. The content attribute specifies what the element can contain.

Here's an example, followed by the valid content values.

<ElementType name="title" content="textOnly"/>

textOnly - The element can contain text but no sub-elements.

eltOnly - The element can contain sub-elements only.

empty - Text and sub-elements are not allowed.

mixed - Both text and sub-elements are allowed.

Element Occurrences
The minOccurs and maxOccurs attributes specify how many times an element can appear within another element.

<element type="item" maxOccurs="*" />

maxOccurs specifies the maximum number of times a sub-element can appear. Valid values are integers and "*", which means that an unrestricted number of elements may appear. The default value is "1". However, when content="mixed", the default value is "*".

You can specify a minimum number of times a sub-element may appear with minOccurs. To make a sub-element optional, set minOccurs to "0". The default value is 1.

These attributes can be used for both element and group declarations.
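
A minimal sketch of both attributes used together, assuming the "title", "author" and "pages" element types from the earlier book schema: a book must have exactly one title, one or more authors, and may omit the page count.

```xml
<ElementType name="book" content="eltOnly">
   <!-- Defaults apply: minOccurs="1" and maxOccurs="1" -->
   <element type="title" />
   <!-- At least one author, with no upper limit -->
   <element type="author" minOccurs="1" maxOccurs="*" />
   <!-- Optional: may appear once or not at all -->
   <element type="pages" minOccurs="0" maxOccurs="1" />
</ElementType>
```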

Sub Element Order
The order attribute specifies if sub-elements must appear in a certain order, and if only one sub-element of a set can appear. Legal values are seq, one and many.

The seq value indicates that sub-elements must appear in the order listed in the schema (title, author, pages). For example:


<ElementType name="Book" order="seq">
</ElementType>

The one value specifies that only one sub-element can be used from a list of sub-elements. For example, to specify that an "Item" element may contain either a "product" element or a "backOrderedProduct" element, but not both:


<ElementType name="Item" order="one">
   <element type="product" />
   <element type="backOrderedProduct" />
</ElementType>

The many value indicates that the sub-elements may appear in any order, and in any quantity. The default value for order is "seq" when the content attribute is set to "eltOnly", and the default is "many" when content is set to "mixed".
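
As an illustration of the many value, the sketch below uses hypothetical "contact", "phone" and "email" element types. Under this declaration, a contact could list its phone and email sub-elements in either order and repeat them.

```xml
<ElementType name="phone" content="textOnly" />
<ElementType name="email" content="textOnly" />

<!-- order="many": sub-elements may appear in any order and quantity -->
<ElementType name="contact" content="eltOnly" order="many">
   <element type="phone" />
   <element type="email" />
</ElementType>
```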

Element Grouping
The group element lets you specify rules for a specific set of sub-elements. To indicate that the "Item" element has either a "product" or a "backOrderedProduct" element, and then a "quantity" and "price", you can use the following XML:


<ElementType name="Item">
   <group order="one">
      <element type="product" />
      <element type="backOrderedProduct" />
   </group>
   <element type="quantity"/>
   <element type="price"/>
</ElementType>
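
To make the grouping rule concrete, here are two "Item" instances that should satisfy the schema above. The wrapping "Items" element and all values are made up for illustration.

```xml
<Items>
   <!-- First choice from the group: product -->
   <Item>
      <product>Widget</product>
      <quantity>2</quantity>
      <price>9.95</price>
   </Item>
   <!-- Second choice from the group: backOrderedProduct -->
   <Item>
      <backOrderedProduct>Gadget</backOrderedProduct>
      <quantity>1</quantity>
      <price>24.50</price>
   </Item>
</Items>
```

An "Item" containing both a "product" and a "backOrderedProduct" would not match the group, since order="one" allows only one of the two.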

Attributes

Attributes are different from elements, and the rules that apply to elements do not apply to attributes. Also, different element types can have attributes with the same name, but those attributes are independent and unrelated.

You can specify whether an attribute is required or optional and you can limit its value to a small set of strings. You can also indicate a default value to be used if the attribute is omitted from an element.


<!-- Make the attribute required. -->
<AttributeType name="shipTo" dt:type="idref" required="yes"/>

<!-- Limit the attribute's values to high, medium and low. -->
<AttributeType name="priority" dt:type="enumeration" dt:values="high medium low" />

<!-- Provide a default value of 1. -->
<AttributeType name="quantity" dt:type="int" />
<attribute type="quantity" default="1"/>


Data Types

Unlike a DTD, XML schemas let you specify a data type for an element or attribute. To use a data type, your schema must include the datatypes namespace:


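<Schema name="myschema" xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:dt="urn:schemas-microsoft-com:datatypes">
<!-- ... -->
</Schema>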

Data types can be specified on <ElementType> and <AttributeType> tags using one of the following syntaxes:


<ElementType name="pages" dt:type="int"/>

<ElementType name="pages">
   <datatype dt:type="int"/>
</ElementType>

Although schemas allow you to specify data types, IE's XML parser does not fully support them. You can read more about data types by visiting Microsoft's web site.
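
Putting the pieces of this section together, a complete typed schema might look like the sketch below. It reuses the book element types from earlier; the choice of "string" for the title is an assumption made for illustration.

```xml
<?xml version="1.0"?>
<Schema name="myschema"
        xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:dt="urn:schemas-microsoft-com:datatypes">
   <!-- The dt prefix is bound to the datatypes namespace declared above -->
   <ElementType name="title" content="textOnly" dt:type="string" />
   <ElementType name="pages" content="textOnly" dt:type="int" />

   <ElementType name="book" content="eltOnly">
      <element type="title" />
      <element type="pages" />
   </ElementType>
</Schema>
```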


Extensibility

XML Schemas are extensible because they are built on an open content model. You are free to add your own elements and attributes to XML Schema documents.

For example, you could add additional constraints to a "pages" element. This sample declares the "pages" element. Extended tags from the "myExt" namespace augment this information with an added rule that books must have a minimum of 50 pages and a maximum of 100 pages.


<ElementType name="pages" xmlns:myExt="urn:myschema-extensions">
</ElementType>

Although the XML parser will not use the additional "myExt" constraints when validating the XML data, your application can.
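
One way such an extended declaration could look is sketched below. The extension element names "myExt:min" and "myExt:max" are hypothetical, chosen only to express the 50 and 100 page limits described above; your application would define and read whatever extension tags it needs.

```xml
<ElementType name="pages"
             xmlns:dt="urn:schemas-microsoft-com:datatypes"
             xmlns:myExt="urn:myschema-extensions">
   <datatype dt:type="int" />
   <!-- Hypothetical extension elements; the parser does not enforce them -->
   <myExt:min>50</myExt:min>
   <myExt:max>100</myExt:max>
</ElementType>
```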

Using Other Schemas

Schemas can use other schemas allowing you to build a new schema from other existing ones. Say you already have a schema that defines an "Address" element. Using namespaces, you can use that schema in your new schema by adding a namespace declaration for it.

For an example, see the new schema reference in the sample under Open Content Models. You can also read about namespaces in my XML tutorial or on Microsoft's site.
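
As a hedged sketch of the idea, the schema below declares a namespace prefix for an existing address schema and references its "Address" element type through that prefix. The file name "address-schema.xml", the "addr" prefix, and the "customer" and "name" element types are all assumptions made for illustration.

```xml
<?xml version="1.0"?>
<Schema name="customer-schema"
        xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:addr="x-schema:address-schema.xml">
   <ElementType name="name" content="textOnly" />

   <ElementType name="customer" content="eltOnly">
      <element type="name" />
      <!-- "Address" is defined in the other schema, not in this one -->
      <element type="addr:Address" />
   </ElementType>
</Schema>
```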




If you use this code, please mention "www.TheScarms.com"


© Copyright 2024 TheScarms