Chapter 19: XML Schema
1. What is XML Schema (XSD)? The clearest possible explanation
XML Schema (usually called XSD, short for XML Schema Definition) is a language for defining precise rules about what an XML document is allowed to contain.
It answers questions like:
- Which elements are allowed?
- In what order must they appear?
- Which elements are required vs optional?
- What data types are allowed (string, integer, date, decimal, boolean, etc.)?
- Which attributes must exist?
- What are the allowed values (enumerations)?
- What are the minimum/maximum occurrences?
- What are the patterns (regex)?
- Are there any uniqueness or key constraints?
Analogy everyone understands:
Imagine you are building houses (XML documents) for a city council.
- A DTD is like a very basic building code: “Every house must have a door, windows, and a roof — in that order.”
- An XML Schema (XSD) is like a modern, detailed construction law:
- The door must be 90 cm wide and made of wood
- Windows must be double-glazed
- At least 2 bedrooms, maximum 5
- Electricity connection must be 220V, 50Hz
- Roof must have slope between 30° and 45°
- No swimming pool allowed on the roof
- Each house must have a unique plot number
Key advantages over DTD:
| Feature | DTD | XML Schema (XSD) |
|---|---|---|
| Data types | Almost none | Many built-in types (string, integer, decimal, date, boolean…) plus user-defined restrictions |
| Namespaces | No support | Full support |
| Complex rules | Very limited | Very powerful (sequences, choices, groups…) |
| Attribute constraints | Basic | Very detailed |
| Patterns / regex | No | Yes |
| Uniqueness / keys | Very limited (ID/IDREF) | Full support (unique, key, keyref) |
| Typical use in 2025–2026 | Mostly legacy and document-centric formats | Widely used for data exchange and enterprise formats |
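The `unique` / `key` / `keyref` row in the table can be illustrated with a small sketch. The `catalog` and `product` elements and the constraint name `uniqueProductId` are made up for this illustration and do not come from any schema in this chapter. The sketch deliberately has no targetNamespace, so the XPath expressions in the constraint need no prefix.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical root element: a catalog holding one or more products -->
  <xs:element name="catalog">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="product" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="name" type="xs:string"/>
            </xs:sequence>
            <xs:attribute name="id" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>

    <!-- Identity constraint: no two product children may share an id value -->
    <xs:unique name="uniqueProductId">
      <xs:selector xpath="product"/>
      <xs:field xpath="@id"/>
    </xs:unique>
  </xs:element>

</xs:schema>
```

With this constraint in place, a catalog containing two `product` elements with `id="P-1"` should be rejected, which is something a DTD can only approximate with its document-wide `ID` attributes.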
2. Basic Structure of an XSD File
Every XSD document looks roughly like this:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/invoice"
           elementFormDefault="qualified">

  <!-- Here go all your definitions -->

</xs:schema>
```
Important parts:
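- xmlns:xs="http://www.w3.org/2001/XMLSchema" → the namespace for schema keywords (xs:element, xs:string, etc.)
- targetNamespace → the namespace that the elements and types declared in this schema belong to, and that your XML documents must therefore use (very important in real projects)
- elementFormDefault="qualified" → locally declared elements (those nested inside a type) must also be namespace-qualified in instance documents; globally declared elements always are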
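To make the effect of elementFormDefault concrete, here is a sketch of an instance document for a hypothetical schema with `targetNamespace="http://example.com/invoice"`, a global `invoice` element, and a locally declared `number` child. The element names and the `inv` prefix are assumptions chosen for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Shape expected when the schema sets elementFormDefault="qualified":
     the local child <number> is in the target namespace, like the root -->
<inv:invoice xmlns:inv="http://example.com/invoice">
  <inv:number>INV-001</inv:number>
</inv:invoice>
```

If the schema instead used `elementFormDefault="unqualified"` (the default when the attribute is omitted), the root would still be written as `inv:invoice`, but the child would have to appear as a plain `<number>` in no namespace. The qualified form is usually easier to work with because a single default namespace declaration (`xmlns="http://example.com/invoice"`) on the root then covers every element.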
3. Very First Realistic Example – Simple Product Schema
product.xsd
```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/product"
           elementFormDefault="qualified">

  <!-- The root element -->
  <xs:element name="product">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="sku" type="xs:string"/>
        <xs:element name="price">
          <xs:complexType>
            <xs:simpleContent>
              <xs:extension base="xs:decimal">
                <xs:attribute name="currency" type="xs:string" use="required"/>
              </xs:extension>
            </xs:simpleContent>
          </xs:complexType>
        </xs:element>
        <xs:element name="stock" type="xs:nonNegativeInteger"/>
        <xs:element name="category" type="xs:string" minOccurs="0"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```
Valid XML document
```xml
<?xml version="1.0" encoding="UTF-8"?>
<product xmlns="http://example.com/product"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://example.com/product product.xsd"
         id="P-784512">
  <name>Wireless Mouse</name>
  <sku>WM-BLK-001</sku>
  <price currency="INR">1499.00</price>
  <stock>45</stock>
  <category>electronics</category>
</product>
```
Invalid examples (will be rejected by validator):
```xml
<!-- Missing required attribute "id" -->
<product>...</product>

<!-- price missing required currency attribute -->
<price>1499.00</price>

<!-- stock cannot be negative -->
<stock>-3</stock>

<!-- name is missing -->
<product id="P-001">
  <sku>...</sku>
  ...
</product>
```
4. Most Important Building Blocks – With Examples
a) Simple Types vs Complex Types
- Simple type = only text or value (no child elements, no attributes) → xs:string, xs:integer, xs:decimal, xs:date, xs:boolean…
- Complex type = can have child elements and/or attributes
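The distinction can be sketched with one type of each kind. The names `PercentType`, `DiscountType`, and `discount` are hypothetical and exist only for this illustration. The schema has no targetNamespace, so the unprefixed type references resolve without any extra namespace declarations.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Simple type: a plain value with extra limits, no children or attributes -->
  <xs:simpleType name="PercentType">
    <xs:restriction base="xs:decimal">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="100"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Complex type: child elements plus an attribute -->
  <xs:complexType name="DiscountType">
    <xs:sequence>
      <xs:element name="reason" type="xs:string"/>
      <xs:element name="percent" type="PercentType"/>
    </xs:sequence>
    <xs:attribute name="code" type="xs:string" use="required"/>
  </xs:complexType>

  <!-- Global element that uses the complex type -->
  <xs:element name="discount" type="DiscountType"/>

</xs:schema>
```

Under this sketch, `<percent>12.5</percent>` is acceptable, while `<percent>120</percent>` falls outside the declared range and should be rejected.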
b) Common Element Declarations
```xml
<!-- Required element, any string -->
<xs:element name="name" type="xs:string"/>

<!-- Optional element -->
<xs:element name="description" type="xs:string" minOccurs="0"/>

<!-- At least 1, maximum 10 -->
<xs:element name="tag" type="xs:string" minOccurs="1" maxOccurs="10"/>

<!-- Zero or more: minOccurs="0" makes it optional, maxOccurs="unbounded" removes the upper limit -->
<xs:element name="photo" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
```
c) Attributes
```xml
<xs:attribute name="currency" type="xs:string" use="required"/>

<xs:attribute name="status" default="pending">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:enumeration value="pending"/>
      <xs:enumeration value="processing"/>
      <xs:enumeration value="shipped"/>
      <xs:enumeration value="delivered"/>
    </xs:restriction>
  </xs:simpleType>
</xs:attribute>
```
d) Patterns (regex)
```xml
<xs:element name="gstin">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z]{1}[1-9A-Z]{1}[Z]{1}[0-9A-Z]{1}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```
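When the same pattern is needed in more than one place, it can be pulled out into a named simple type. The following is a minimal sketch; the names `GstinType`, `sellerGstin`, and `buyerGstin` are hypothetical, and the schema omits a targetNamespace to keep the type references prefix-free.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Named simple type so the pattern is written only once -->
  <xs:simpleType name="GstinType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z]{1}[1-9A-Z]{1}[Z]{1}[0-9A-Z]{1}"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Two hypothetical elements sharing the same rule -->
  <xs:element name="sellerGstin" type="GstinType"/>
  <xs:element name="buyerGstin" type="GstinType"/>

</xs:schema>
```

XSD patterns always match the whole value, so no `^` or `$` anchors are needed. A placeholder value such as `27AAPFU0939F1ZV` has the shape the pattern requires, while the lowercase `27aapfu0939f1zv` does not.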
e) Choice (either-or)
```xml
<xs:choice>
  <xs:element name="individual" type="IndividualType"/>
  <xs:element name="company" type="CompanyType"/>
</xs:choice>
```
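A choice normally lives inside a complex type. The sketch below places it in context; the `customer` and `email` elements and the contents of `IndividualType` and `CompanyType` are assumptions for illustration, since the fragment above does not define them.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical definitions for the two alternatives -->
  <xs:complexType name="IndividualType">
    <xs:sequence>
      <xs:element name="firstName" type="xs:string"/>
      <xs:element name="lastName" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="CompanyType">
    <xs:sequence>
      <xs:element name="legalName" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- A customer has an email, then exactly one of individual or company -->
  <xs:element name="customer">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="email" type="xs:string"/>
        <xs:choice>
          <xs:element name="individual" type="IndividualType"/>
          <xs:element name="company" type="CompanyType"/>
        </xs:choice>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

A `customer` containing both `individual` and `company`, or neither, should fail validation against this sketch.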
5. Real-World Style Example – Invoice (very typical structure)
invoice.xsd (simplified but realistic)
```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:inv="http://example.com/invoice"
           targetNamespace="http://example.com/invoice"
           elementFormDefault="qualified">

  <!-- Named types live in the target namespace, so references need the inv: prefix -->
  <xs:element name="invoice">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="number" type="xs:string"/>
        <xs:element name="issueDate" type="xs:date"/>
        <xs:element name="dueDate" type="xs:date"/>
        <xs:element name="seller" type="inv:PartyType"/>
        <xs:element name="buyer" type="inv:PartyType"/>
        <xs:element name="items">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="item" type="inv:ItemType" maxOccurs="unbounded"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <xs:element name="total">
          <xs:complexType>
            <xs:simpleContent>
              <xs:extension base="xs:decimal">
                <xs:attribute name="currency" type="xs:string" use="required"/>
              </xs:extension>
            </xs:simpleContent>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- Reusable types -->
  <xs:complexType name="PartyType">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="gstin" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="ItemType">
    <xs:sequence>
      <xs:element name="description" type="xs:string"/>
      <xs:element name="quantity" type="xs:positiveInteger"/>
      <xs:element name="rate" type="xs:decimal"/>
    </xs:sequence>
    <xs:attribute name="line" type="xs:positiveInteger" use="required"/>
  </xs:complexType>

</xs:schema>
```
This is the kind of schema you see in enterprise systems, e-invoicing, and financial messaging.
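For reference, here is a sketch of an instance document that the invoice schema is intended to accept. All values (invoice number, party names, GSTIN-shaped strings, amounts) are made-up placeholders, and the `invoice.xsd` location hint assumes the schema file sits next to the document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<invoice xmlns="http://example.com/invoice"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://example.com/invoice invoice.xsd">
  <number>INV-2025-0001</number>
  <issueDate>2025-01-15</issueDate>
  <dueDate>2025-02-14</dueDate>
  <!-- seller and buyer share the same PartyType structure -->
  <seller>
    <name>Example Traders</name>
    <gstin>27AAPFU0939F1ZV</gstin>
  </seller>
  <buyer>
    <name>Sample Retail</name>
    <gstin>29AAPFU0939F1ZV</gstin>
  </buyer>
  <items>
    <!-- at least one item is required; line is a required attribute -->
    <item line="1">
      <description>Wireless Mouse</description>
      <quantity>2</quantity>
      <rate>1499.00</rate>
    </item>
  </items>
  <total currency="INR">2998.00</total>
</invoice>
```

Note that the schema checks structure and data types only. It does not verify that `total` equals quantity times rate, or that `dueDate` falls after `issueDate`; rules like that need XSD 1.1 assertions or application code.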
Quick Summary – XML Schema Cheat Sheet
- Root = <xs:schema>
- Define elements with <xs:element name="…" type="…"/>
- Simple types: xs:string, xs:integer, xs:decimal, xs:date, xs:boolean…
- Complex types = child elements and/or attributes
- minOccurs / maxOccurs control cardinality
- use="required" / use="optional" for attributes
- pattern, enumeration, minInclusive, maxInclusive for restrictions
- targetNamespace + elementFormDefault="qualified" → modern best practice
