Chapter 18: XML DTD

1. What is a DTD? (The clearest explanation)

DTD = Document Type Definition

A DTD is a set of rules that describes:

  • Which elements are allowed in the XML document
  • In what order they must appear
  • Which attributes each element can have
  • Whether elements/attributes are required or optional
  • What kind of content each element can contain (text, other elements, both…)

Think of a DTD as a very strict blueprint or building code that says:

“If you want to build a house (XML document) in this style, then every house must have exactly one front door (<root>), at least two windows (<child>), no swimming pool on the roof, and the door must have a number attribute…”

Key point:

  • A document that follows all DTD rules → valid
  • A document that follows syntax rules but not DTD rules → well-formed but invalid
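
As a minimal sketch of the second case, consider a hypothetical <note> document (the element names and text are invented for illustration). Every tag is opened, closed, and nested correctly, so it is well-formed. But <body> appears before <to>, which the declared content model (to, body) does not allow, so a validating parser should report it as invalid.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note [
  <!-- Rule: a note holds exactly one "to" followed by exactly one "body" -->
  <!ELEMENT note (to, body)>
  <!ELEMENT to   (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<!-- Well-formed, but invalid: the children are in the wrong order -->
<note>
  <body>Lunch at 1 pm?</body>
  <to>Priya</to>
</note>
```

Swapping the two child elements back into the declared order would make the same document valid.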

2. Two Ways to Use a DTD

Way            Syntax in XML file                                     When people use it
Internal DTD   Inside the XML file itself (inside <!DOCTYPE … >)      Quick tests, small documents, learning
External DTD   Separate .dtd file referenced with SYSTEM or PUBLIC    Real projects, reusable rules, company/government standards

Most common today: External DTD (but even this is declining)
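
The three DOCTYPE forms can be sketched side by side as below. These are alternatives, not one document: a real file carries only one DOCTYPE. The <note> element and the file name note.dtd are hypothetical; the PUBLIC line is the well-known XHTML 1.0 Strict declaration, shown only as a familiar instance of the PUBLIC form.

```xml
<!-- 1. Internal DTD: the rules sit inside the square brackets -->
<!DOCTYPE note [
  <!ELEMENT note (#PCDATA)>
]>

<!-- 2. External DTD, SYSTEM form: the rules live in a separate file -->
<!DOCTYPE note SYSTEM "note.dtd">

<!-- 3. External DTD, PUBLIC form: a public identifier plus a fallback location -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
```

In each form, the name right after <!DOCTYPE must match the root element of the document.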

3. Very First Realistic Example – Internal DTD

XML
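
A minimal sketch of an internal DTD consistent with the rules listed just below. The student data (roll number, names, scores) is invented for illustration, and the exact layout is an assumption rather than a definitive version of this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE student [
  <!-- Root: five children, exactly once each, in this order -->
  <!ELEMENT student (rollno, name, class, section, marks)>
  <!ELEMENT rollno  (#PCDATA)>
  <!-- "name" is declared once and reused inside both student and subject -->
  <!ELEMENT name    (#PCDATA)>
  <!ELEMENT class   (#PCDATA)>
  <!ELEMENT section (#PCDATA)>
  <!-- One or more subjects -->
  <!ELEMENT marks   (subject+)>
  <!ELEMENT subject (name, score)>
  <!ELEMENT score   (#PCDATA)>
]>
<student>
  <rollno>101</rollno>
  <name>Asha Rao</name>
  <class>10</class>
  <section>B</section>
  <marks>
    <subject>
      <name>Mathematics</name>
      <score>92</score>
    </subject>
    <subject>
      <name>Science</name>
      <score>88</score>
    </subject>
  </marks>
</student>
```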

What this DTD says:

  • Root element must be <student>
  • <student> must contain exactly these children, in this order: rollno, name, class, section, marks
  • rollno, name, class, and section contain only text (#PCDATA)
  • <marks> must contain one or more (+) <subject> elements
  • Each <subject> must contain exactly one <name> followed by one <score>

Invalid examples (validator would reject these):

XML
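
Two hypothetical document bodies that break the student rules from section 3 are sketched here. They are shown together for comparison; each would be a separate file with its own DOCTYPE, and the data is invented for illustration.

```xml
<!-- Invalid 1: <section> is missing and <class> comes before <name> -->
<student>
  <rollno>102</rollno>
  <class>10</class>
  <name>Vikram Shah</name>
  <marks>
    <subject><name>Science</name><score>75</score></subject>
  </marks>
</student>

<!-- Invalid 2: <marks> is empty, but at least one <subject> is required -->
<student>
  <rollno>103</rollno>
  <name>Neha Iyer</name>
  <class>10</class>
  <section>B</section>
  <marks/>
</student>
```

Both bodies are still well-formed XML; only a validating parser that reads the DTD would flag them.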

4. Most Important DTD Building Blocks (Cheat Sheet)

Declaration                                 Meaning                                            Most common usage
<!ELEMENT name (child1, child2)>            Element with exactly these children, in order      Strict structure
<!ELEMENT name (child1 | child2)>           Element can have either one or the other           Alternatives
<!ELEMENT name (child*)>                    Zero or more children                              Optional repeating items
<!ELEMENT name (child+)>                    One or more children                               Required repeating items
<!ELEMENT name (#PCDATA)>                   Only text content                                  Leaf elements
<!ELEMENT name EMPTY>                       No content allowed                                 Empty tags (<br/>)
<!ELEMENT name ANY>                         Anything allowed (very loose)                      Transitional / debugging
<!ATTLIST element attr CDATA #REQUIRED>     Attribute is mandatory                             Required fields
<!ATTLIST element attr CDATA #IMPLIED>      Attribute is optional                              Optional fields
<!ATTLIST element attr (yes | no) "yes">    Enumeration: fixed list of values, default "yes"   Controlled values
<!ATTLIST element attr ID #REQUIRED>        Unique identifier                                  Linking with IDREF
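
To see several of these building blocks working together, here is a small sketch of a hypothetical product catalog. All element names, attribute names, and values are invented for illustration; it combines a sequence, a choice, the * and + operators, an EMPTY element, and three kinds of attribute.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE catalog [
  <!-- Zero or more products -->
  <!ELEMENT catalog (product*)>
  <!-- Sequence: title, then either price or onRequest, then one or more tags -->
  <!ELEMENT product (title, (price | onRequest), tag+)>
  <!ATTLIST product
    sku      ID         #REQUIRED
    featured (yes | no) "no"
    note     CDATA      #IMPLIED
  >
  <!ELEMENT title     (#PCDATA)>
  <!ELEMENT price     (#PCDATA)>
  <!ELEMENT onRequest EMPTY>
  <!ELEMENT tag       (#PCDATA)>
]>
<catalog>
  <product sku="sku-001" featured="yes">
    <title>Desk lamp</title>
    <price>1499.00</price>
    <tag>lighting</tag>
  </product>
  <!-- featured is omitted here, so the default "no" applies -->
  <product sku="sku-002">
    <title>Custom shelf</title>
    <onRequest/>
    <tag>furniture</tag>
    <tag>made-to-order</tag>
  </product>
</catalog>
```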

5. Real-World Style Example – External DTD (what you see in serious files)

invoice.dtd

dtd
<!ELEMENT invoice (header, seller, buyer, items, totals)>
<!ATTLIST invoice
  number CDATA #REQUIRED
  date   CDATA #REQUIRED
>
<!ELEMENT header (invoiceNumber, issueDate, dueDate)>
<!ELEMENT invoiceNumber (#PCDATA)>
<!ELEMENT issueDate (#PCDATA)>
<!ELEMENT dueDate (#PCDATA)>

<!ELEMENT seller (name, gstin, address?)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT gstin (#PCDATA)>
<!ELEMENT address (#PCDATA)>

<!ELEMENT buyer (name, gstin)>
<!ELEMENT items (item+)>
<!ELEMENT item (description, quantity, rate, amount)>
<!ATTLIST item line CDATA #REQUIRED>
<!ELEMENT description (#PCDATA)>
<!ELEMENT quantity (#PCDATA)>
<!ELEMENT rate (#PCDATA)>
<!ELEMENT amount (#PCDATA)>

<!ELEMENT totals (subtotal, tax, grandTotal)>
<!ELEMENT subtotal (#PCDATA)>
<!ELEMENT tax (#PCDATA)>
<!ELEMENT grandTotal (#PCDATA)>

XML file that uses it

XML
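
A sketch of an instance document that should validate against invoice.dtd, assuming the DTD file sits in the same folder as the XML file. Every value here (party names, GSTIN placeholders, dates, amounts) is invented for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- SYSTEM points to the external DTD by relative path -->
<!DOCTYPE invoice SYSTEM "invoice.dtd">
<invoice number="INV-0001" date="2025-01-15">
  <header>
    <invoiceNumber>INV-0001</invoiceNumber>
    <issueDate>2025-01-15</issueDate>
    <dueDate>2025-02-14</dueDate>
  </header>
  <seller>
    <name>Example Traders</name>
    <gstin>SELLER-GSTIN-PLACEHOLDER</gstin>
    <!-- address is optional (address?) for the seller -->
    <address>12 Market Road, Pune</address>
  </seller>
  <buyer>
    <name>Sample Retail</name>
    <gstin>BUYER-GSTIN-PLACEHOLDER</gstin>
  </buyer>
  <items>
    <item line="1">
      <description>Office chair</description>
      <quantity>2</quantity>
      <rate>500.00</rate>
      <amount>1000.00</amount>
    </item>
    <item line="2">
      <description>Desk mat</description>
      <quantity>1</quantity>
      <rate>250.00</rate>
      <amount>250.00</amount>
    </item>
  </items>
  <totals>
    <subtotal>1250.00</subtotal>
    <tax>225.00</tax>
    <grandTotal>1475.00</grandTotal>
  </totals>
</invoice>
```

Note that the DTD only checks structure: it cannot verify that the dates are real dates or that the amounts add up, because #PCDATA and CDATA accept any text.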

6. Very Common DTD Patterns You Will See

  • ID / IDREF for internal linking
dtd
<!ATTLIST person id ID #REQUIRED>
<!ATTLIST employee manager IDREF #IMPLIED>
  • Enumeration for controlled values
dtd
<!ATTLIST order status (pending | processing | shipped | delivered) "pending">
  • Mixed content (text + elements)
dtd
<!ELEMENT description (#PCDATA | bold | italic)*>
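
The three patterns above can be sketched in one small document. The <team> wrapper, the content models for person, employee, and order, and all values are assumptions made for illustration; only the three ATTLIST/ELEMENT patterns themselves come from the list above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE team [
  <!ELEMENT team (person+, employee+, order, description)>
  <!ELEMENT person   (#PCDATA)>
  <!ATTLIST person id ID #REQUIRED>
  <!ELEMENT employee (#PCDATA)>
  <!ATTLIST employee manager IDREF #IMPLIED>
  <!ELEMENT order EMPTY>
  <!ATTLIST order status (pending | processing | shipped | delivered) "pending">
  <!ELEMENT description (#PCDATA | bold | italic)*>
  <!ELEMENT bold   (#PCDATA)>
  <!ELEMENT italic (#PCDATA)>
]>
<team>
  <person id="p1">Meera</person>
  <!-- manager must match the id of some element in this document -->
  <employee manager="p1">Ravi</employee>
  <!-- manager is #IMPLIED, so it may be left out -->
  <employee>Sana</employee>
  <!-- no status given: a validating parser supplies the default "pending" -->
  <order/>
  <!-- mixed content: text interleaved with bold and italic in any order -->
  <description>Ships in <bold>two</bold> days, <italic>usually</italic>.</description>
</team>
```

Changing manager="p1" to a value that no id attribute uses, such as "p9", would make the document invalid even though it stays well-formed.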

7. Honest Reality in 2025–2026

Situation                                  What actually happens today
New projects needing validation            Use XSD (XML Schema)
Very old systems / legacy formats          Many still use DTD
Configuration files (web.xml, etc.)        Older versions reference a DTD; newer ones use XSD or skip validation
Learning XML                               DTD is still taught first because it’s simpler
Government / enterprise standards          Mostly XSD now (ISO 20022, HL7, etc.)

Bottom line:

DTD is legacy technology: you should understand it, but almost never choose it for new work.

Would you like to continue with one of these next?

  • Writing a more realistic DTD for invoice / order / student report
  • Differences between DTD and XSD – detailed comparison
  • How to validate XML against DTD (xmllint, online tools, code examples)
  • Common errors people make when writing DTDs
  • How ID/IDREF really works in practice
  • Transitioning from DTD to XSD (what changes)

Just tell me what feels most useful or interesting for you right now! 😊
