Chapter 12: XML DOM
1. What is XML DOM really?
DOM = Document Object Model
When we talk about XML DOM, we mean:
A standard way to represent an entire XML document as a tree of objects in memory — so your program can easily read, navigate, search, modify, add, remove, and write back XML data.
Think of it like this:
- The XML file/text is like a folded book
- The DOM parser opens the book completely
- Turns every chapter, section, paragraph, word into objects connected in a family tree
- Now your program can walk around this tree like it’s a real house — go to any room, change furniture, add new rooms, etc.
Key characteristics of DOM:
- The whole document is loaded into memory
- You get a complete navigable tree
- You can go up, down, left, right (parent, child, sibling…)
- You can modify the tree and then serialize it back to XML
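The four characteristics above can be seen in a few lines. This is a minimal sketch using Python's standard `xml.dom.minidom`; the tiny `xml_text` document and the `status` attribute are made up for illustration.

```python
from xml.dom import minidom

# Hypothetical one-line document, used only for illustration.
xml_text = '<order orderId="A1"><total currency="INR">10.00</total></order>'

# 1. Load: the whole document is parsed into an in-memory tree.
doc = minidom.parseString(xml_text)
root = doc.documentElement             # the <order> element

# 2. Navigate: down to the first child, then back up to its parent.
total = root.firstChild                # the <total> element
print(total.parentNode is root)        # True

# 3. Modify: add an attribute and change the text inside <total>.
root.setAttribute("status", "paid")
total.firstChild.data = "12.50"        # firstChild of <total> is its Text node

# 4. Serialize: turn the tree back into XML text.
result = root.toxml()
print(result)
```

Nothing is written to disk here; `toxml()` only returns a string, so saving the result is a separate step.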
2. The XML DOM Tree – How it Really Looks
Take this small but realistic XML:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<order orderId="ORD-20250715-9184" date="2025-07-15">
  <customer type="guest">
    <name>Samarth Jain</name>
    <email>samarth.j@example.com</email>
  </customer>
  <items count="2">
    <item line="1" sku="HDMI-2M">
      <name>HDMI Cable 2m</name>
      <quantity>1</quantity>
      <price currency="INR">349.00</price>
    </item>
    <item line="2" sku="TS-BLK-M">
      <name>Black T-Shirt Medium</name>
      <quantity>2</quantity>
      <price currency="INR">499.00</price>
    </item>
  </items>
  <total currency="INR">1347.00</total>
</order>
```
DOM tree view (simplified):
```
Document
└── Element: order (root element)
    ├── Attribute: orderId = "ORD-20250715-9184"
    ├── Attribute: date = "2025-07-15"
    ├── Element: customer
    │   ├── Attribute: type = "guest"
    │   ├── Element: name → Text: "Samarth Jain"
    │   └── Element: email → Text: "samarth.j@example.com"
    ├── Element: items
    │   ├── Attribute: count = "2"
    │   ├── Element: item (first)
    │   │   ├── Attribute: line = "1"
    │   │   ├── Attribute: sku = "HDMI-2M"
    │   │   ├── Element: name → Text: "HDMI Cable 2m"
    │   │   ├── Element: quantity → Text: "1"
    │   │   └── Element: price → Text: "349.00"
    │   │       └── Attribute: currency = "INR"
    │   └── Element: item (second) ...
    └── Element: total → Text: "1347.00"
        └── Attribute: currency = "INR"
```
- Every tag → becomes an Element node
- Every attribute → becomes an Attr node
- Every piece of text → becomes a Text node
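Two details are easy to miss in a simplified view: the whitespace between tags is also "a piece of text", and attributes are not child nodes of their element. The sketch below uses Python's `xml.dom.minidom` on the `<customer>` fragment of the order to show both.

```python
from xml.dom import minidom

xml_text = """<customer type="guest">
  <name>Samarth Jain</name>
  <email>samarth.j@example.com</email>
</customer>"""

doc = minidom.parseString(xml_text)
customer = doc.documentElement

# Child nodes include the indentation between tags as Text nodes.
kinds = [child.nodeName for child in customer.childNodes]
print(kinds)   # ['#text', 'name', '#text', 'email', '#text']

# Attributes are not in childNodes; they are reached through the element.
type_attr = customer.getAttributeNode("type")
print(type_attr.name, type_attr.value)   # type guest
```

This is why code that walks `childNodes` or `nextSibling` usually has to skip whitespace-only Text nodes.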
3. Most Important DOM Concepts & Names
| Term | What it is | Common methods / properties |
|---|---|---|
| Document | The entire XML document (the root of everything) | document.getElementById(), document.documentElement |
| Element | Any XML tag (<order>, <customer>, <item>) | getAttribute(), setAttribute(), textContent, innerHTML (in browsers) |
| Attr | An attribute (orderId="…") | name, value |
| Text | The actual characters between tags | nodeValue, data |
| Node | General term — everything is a Node | nodeType, parentNode, childNodes, nextSibling |
| NodeList | List of nodes (array-like, but not a real array) | item(index), length |
Node types you will meet most often:
- 1 = ELEMENT_NODE
- 2 = ATTRIBUTE_NODE
- 3 = TEXT_NODE
- 9 = DOCUMENT_NODE
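To check these numbers yourself, here is a small sketch with Python's `xml.dom.minidom`, which exposes `nodeType` and the named constants on `xml.dom.Node`. The labels in the `nodes` dictionary are just illustrative names.

```python
from xml.dom import minidom, Node

doc = minidom.parseString('<price currency="INR">349.00</price>')
price = doc.documentElement

# One node of each common kind, keyed by an illustrative label.
nodes = {
    "document": doc,
    "element": price,
    "attribute": price.getAttributeNode("currency"),
    "text": price.firstChild,
}

type_codes = {label: node.nodeType for label, node in nodes.items()}
print(type_codes)
# {'document': 9, 'element': 1, 'attribute': 2, 'text': 3}

# Named constants read better than bare numbers in real code.
is_element = price.nodeType == Node.ELEMENT_NODE
print(is_element)   # True
```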
4. Real Code Examples – How People Actually Use XML DOM
Python – xml.etree.ElementTree (very popular & simple; a DOM-style in-memory tree, though not the W3C DOM interface itself)
```python
import xml.etree.ElementTree as ET

# Load from file or string
tree = ET.parse('order.xml')
root = tree.getroot()  # <order> element

# Read attributes
order_id = root.get('orderId')
print(f"Order ID: {order_id}")

# Navigate to child
customer = root.find('customer')
customer_name = customer.find('name').text
print(f"Customer: {customer_name}")

# Find all items
for item in root.findall('.//item'):
    sku = item.get('sku')
    name = item.find('name').text
    qty = item.find('quantity').text
    print(f" - {qty}x {name} (SKU: {sku})")

# Modify something
root.set('status', 'shipped')

# Add new element
new_note = ET.SubElement(root, 'note')
new_note.text = 'Customer requested fast delivery'

# Save back
tree.write('order_modified.xml', encoding='utf-8', xml_declaration=True)
```
JavaScript – Browser DOMParser (very common in web apps)
```javascript
const xmlString = `...paste the order XML here...`;

const parser = new DOMParser();
const doc = parser.parseFromString(xmlString, "application/xml");

// Check for parse errors: DOMParser does not throw, it returns a
// document containing a <parsererror> element instead.
// (A bare `return` is not allowed at the top level, so throw.)
if (doc.querySelector("parsererror")) {
  throw new Error("XML parsing failed");
}

const root = doc.documentElement;

// Get attribute
const orderId = root.getAttribute("orderId");
console.log("Order ID:", orderId);

// Find elements
const customerName = root.querySelector("customer name").textContent;
console.log("Customer:", customerName);

// All items
const items = root.querySelectorAll("item");
items.forEach(item => {
  const sku = item.getAttribute("sku");
  const name = item.querySelector("name").textContent;
  console.log(`Item: ${name} (SKU: ${sku})`);
});

// Modify
root.setAttribute("status", "processing");

// Serialize back to string
const serializer = new XMLSerializer();
const newXml = serializer.serializeToString(doc);
console.log(newXml);
```
Java – Classic DOM (javax.xml.parsers)
```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// The class name is arbitrary; the statements need a class and a method to compile.
public class OrderDom {
    public static void main(String[] args) throws Exception {
        // Parse the whole file into an in-memory Document
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse("order.xml");

        Element root = doc.getDocumentElement();
        String orderId = root.getAttribute("orderId");

        // Find customer name
        Element customer = (Element) root.getElementsByTagName("customer").item(0);
        String name = customer.getElementsByTagName("name").item(0).getTextContent();
        System.out.println(orderId + ": " + name);

        // Add attribute
        root.setAttribute("status", "confirmed");

        // Create new element
        Element note = doc.createElement("note");
        note.setTextContent("Urgent delivery requested");
        root.appendChild(note);
    }
}
```
5. Most Useful DOM Methods & Properties (Cheat Sheet)
Finding / navigating
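- getElementById() (only finds elements that have an ID attribute; outside browsers, for example in Java, the attribute must also be declared as type ID, such as in a DTD)
- getElementsByTagName("item") → NodeList
- querySelector("customer name") → first match (modern)
- querySelectorAll("item") → all matches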
- parentNode, firstChild, lastChild, nextSibling, previousSibling
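As a rough illustration of these navigation members, the sketch below uses Python's `xml.dom.minidom`, which provides `getElementsByTagName` and the sibling/parent properties but not `querySelector`. The `next_element` helper is hypothetical, written here only to skip the whitespace Text nodes that sit between elements.

```python
from xml.dom import minidom, Node

xml_text = """<items count="2">
  <item line="1" sku="HDMI-2M"><name>HDMI Cable 2m</name></item>
  <item line="2" sku="TS-BLK-M"><name>Black T-Shirt Medium</name></item>
</items>"""

doc = minidom.parseString(xml_text)

# getElementsByTagName returns a NodeList: use .length and .item(i).
items = doc.getElementsByTagName("item")
first = items.item(0)
print(items.length, first.getAttribute("sku"))   # 2 HDMI-2M


def next_element(node):
    """Hypothetical helper: follow nextSibling until an Element is found."""
    sibling = node.nextSibling
    while sibling is not None and sibling.nodeType != Node.ELEMENT_NODE:
        sibling = sibling.nextSibling
    return sibling


second = next_element(first)
print(second.getAttribute("sku"))    # TS-BLK-M
print(first.parentNode.tagName)      # items
print(next_element(second))          # None
```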
Reading & writing
- getAttribute("sku") / setAttribute("status", "shipped")
- removeAttribute("oldAttr")
- textContent / nodeValue (for text nodes)
- innerHTML (in browsers — careful with security)
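A short sketch of the attribute and text operations, again with Python's `xml.dom.minidom`. Note the assumption that matters here: minidom has no `textContent` or `innerHTML`, so the text is read and written through the Text node's `nodeValue`. The `oldAttr` and `nope` attribute names are placeholders.

```python
from xml.dom import minidom

doc = minidom.parseString(
    '<item sku="HDMI-2M" oldAttr="x"><name>HDMI Cable 2m</name></item>'
)
item = doc.documentElement

# Attributes: read, add, remove.
sku = item.getAttribute("sku")
item.setAttribute("status", "shipped")
item.removeAttribute("oldAttr")
missing = item.getAttribute("nope")    # minidom returns '' for a missing attribute

# Text: the characters live in a Text node under <name>.
name_text = item.getElementsByTagName("name").item(0).firstChild
old_name = name_text.nodeValue
name_text.nodeValue = "HDMI Cable 3m"

print(sku, repr(missing), old_name)    # HDMI-2M '' HDMI Cable 2m
print(item.toxml())
```

Because a missing attribute comes back as an empty string rather than an error, `hasAttribute()` is the safer way to tell "absent" from "present but empty".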
Creating
- createElement("note")
- createTextNode("Hello")
- appendChild(), insertBefore(), removeChild()
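These creation methods are typically used together: build a detached node, then attach it somewhere in the tree. A minimal sketch with Python's `xml.dom.minidom`, on a cut-down order document invented for the example:

```python
from xml.dom import minidom

doc = minidom.parseString("<order><total>1347.00</total></order>")
root = doc.documentElement
total = root.firstChild

# createElement / createTextNode build nodes that are not yet in the tree.
note = doc.createElement("note")
note.appendChild(doc.createTextNode("Hello"))

# insertBefore(new_node, reference_node) places the note ahead of <total>.
root.insertBefore(note, total)
after_insert = root.toxml()
print(after_insert)
# <order><note>Hello</note><total>1347.00</total></order>

# removeChild detaches a node from its parent and returns it.
removed = root.removeChild(total)
after_remove = root.toxml()
print(after_remove)
# <order><note>Hello</note></order>
```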
6. When to Use DOM vs When Not To
Use DOM when:
- The file is small to medium (as a rough rule of thumb, under 5–10 MB)
- You need to read + modify freely
- You want simple, readable code
- You’re working in a browser (DOMParser)
- The data is a configuration file, a small invoice, an Android manifest, and the like
Avoid DOM and use a streaming parser instead when:
- Very large files (hundreds of MB or GB)
- Memory is limited (mobile, server batch jobs)
- You only need to extract a few fields
- One-pass processing
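For comparison, here is what the streaming alternative can look like in Python with `xml.etree.ElementTree.iterparse`. It is a sketch under simple assumptions: the data is an in-memory byte string standing in for a large file, and the only goal is a one-pass order total.

```python
import io
import xml.etree.ElementTree as ET

# Small stand-in for a very large file.
xml_bytes = b"""<items>
  <item sku="HDMI-2M"><quantity>1</quantity><price>349.00</price></item>
  <item sku="TS-BLK-M"><quantity>2</quantity><price>499.00</price></item>
</items>"""

total = 0.0
# iterparse yields each element when its end tag has been read, so only a
# small part of the document needs to be held at any moment.
for event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
    if elem.tag == "item":
        qty = int(elem.findtext("quantity"))
        price = float(elem.findtext("price"))
        total += qty * price
        elem.clear()   # drop the item's children once they are counted

print(total)   # 1347.0
```

The trade-off is visible in the loop: there is no tree to revisit afterwards, so anything needed later has to be captured while the element passes by.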
Quick Summary – XML DOM in One Page
- DOM = whole XML loaded as navigable object tree in memory
- Everything is a Node (Element, Attr, Text, Document…)
- You can find, read, change, add, remove, write back
- Very intuitive — feels like working with folders & files
- Memory-hungry → not good for huge documents
- Still very widely used in 2025–2026 (especially in browsers, Python, Java config)
