Chapter 51: XPath Nodes

This lesson on XPath nodes is written as if I'm your patient teacher sitting next to you.

We’ll go slowly, step by step, with many drawings, analogies, real examples, common mistakes, and exercises.

Lesson 1 – What exactly is a “node” in XPath?

In XPath, everything in an XML document is a node.

A node is the smallest unit XPath can select or work with.

Real-life analogy: Imagine the XML document is a big family photo album. Each page, each photo, each caption, each sticky note, each photo corner label is a node.

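XPath 1.0 defines 7 node types: root (document), element, attribute, text, namespace, processing-instruction, and comment. The table below lists the ones you will meet most often, numbered with the nodeType values used by the XML DOM. Two differences between the models are worth knowing: the DOM gives CDATA sections their own type (4), while XPath treats them as ordinary text nodes, and XPath's namespace nodes have no counterpart among the core DOM node types.

nodeType | Node type name | What it is in plain English | Example in XML | How XPath selects it (typical)
1 | element | Any tag: <book>, <price>, <author>… | <book id="101"> | //book, //price
2 | attribute | name="value" pairs inside opening tags | id="101", currency="INR" | @id, //@currency, price/@currency
3 | text | Plain readable text between tags | Atomic Habits, 499.00 | //title/text(), text()
4 | CDATA section | Text that should NOT be parsed as markup | <![CDATA[<b>not a tag</b>]]> | //text() (XPath sees CDATA content as a text node)
7 | processing-instruction | An instruction for an application, such as <?xml-stylesheet … ?> or <?php … ?> (the <?xml version="1.0"?> declaration is not a processing instruction and is not a node) | <?php … ?> | //processing-instruction()
8 | comment | <!-- comment --> | <!-- TODO: update price --> | //comment()
9 | document | The entire document (the invisible root) | The whole XML file | / (note that /* selects the root element, not the document node)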

Most important fact for beginners: most of the time you will only care about 3 types:

  • element nodes (type 1)
  • attribute nodes (type 2)
  • text nodes (type 3)
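
To see the nodeType numbers from the table in action, here is a minimal Python sketch using the standard library's xml.dom.minidom. The one-book document, including the `<?render ...?>` instruction and the `<note>` element, is made up for this illustration. Keep in mind that this is the DOM's view: it reports CDATA as type 4, whereas XPath would simply see a text node there.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

# Hypothetical document, invented for this illustration.
XML = (
    '<?xml version="1.0"?>'
    '<?render mode="fast"?>'
    '<!-- TODO: update price -->'
    '<book id="101"><title>Atomic Habits</title>'
    '<note><![CDATA[<b>not a tag</b>]]></note></book>'
)

doc = minidom.parseString(XML)
book = doc.documentElement
title = book.getElementsByTagName("title")[0]
note = book.getElementsByTagName("note")[0]

# The XML declaration is not a node, so it does not appear among the children.
top_level_types = [child.nodeType for child in doc.childNodes]

document_type = doc.nodeType                          # Node.DOCUMENT_NODE
element_type = book.nodeType                          # Node.ELEMENT_NODE
attribute_type = book.getAttributeNode("id").nodeType  # Node.ATTRIBUTE_NODE
text_type = title.firstChild.nodeType                 # Node.TEXT_NODE
cdata_type = note.firstChild.nodeType                 # Node.CDATA_SECTION_NODE (DOM only)

print(top_level_types, document_type, element_type,
      attribute_type, text_type, cdata_type)
```

The top-level list contains a processing instruction, a comment, and the root element, in document order.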

Lesson 2 – Visualizing nodes in a real XML document

Let’s take this small but realistic XML:

XML

Here’s how XPath sees the nodes (simplified drawing):

text
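
As a stand-in, the Python sketch below builds a small hypothetical library document (its element names and values are assumptions chosen to match the queries used later in this lesson) and prints an indented outline of its nodes. Whitespace-only text nodes are skipped to keep the drawing readable, and attributes are shown next to their element because they are not children of it.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

# Hypothetical sample document, assumed for illustration only.
XML = """<library>
  <!-- TODO: update price -->
  <book id="b1">
    <title>Atomic Habits</title>
    <price currency="INR">499.00</price>
    <inStock>true</inStock>
  </book>
</library>"""

NAMES = {
    Node.DOCUMENT_NODE: "document",
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
    Node.COMMENT_NODE: "comment",
}

def outline(node, depth=0):
    """Return one indented line per node, skipping whitespace-only text."""
    if node.nodeType == Node.TEXT_NODE and not node.data.strip():
        return []
    label = NAMES.get(node.nodeType, "other")
    if node.nodeType == Node.ELEMENT_NODE:
        attrs = " ".join(f'@{k}="{v}"' for k, v in node.attributes.items())
        detail = f"<{node.tagName}> {attrs}".rstrip()
    elif node.nodeType in (Node.TEXT_NODE, Node.COMMENT_NODE):
        detail = repr(node.data.strip())
    else:
        detail = ""
    lines = ["  " * depth + f"{label} {detail}".rstrip()]
    for child in node.childNodes:
        lines.extend(outline(child, depth + 1))
    return lines

tree = outline(minidom.parseString(XML))
print("\n".join(tree))
```

Notice that the outline starts with the document node, and the `<library>` element is its child, not the root of the tree.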

Lesson 3 – How to select different node types in XPath

3.1 Selecting element nodes (most common)

xpath
//book → all <book> elements
/library/book → <book> that are direct children of <library>
book → <book> elements that are children of the current context node
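
If you want to try the difference between "anywhere" and "direct children" from Python, here is a small sketch with the standard library's xml.etree.ElementTree. It supports only a subset of XPath, so `.//book` stands in for `//book` and `book` (relative to the root element) stands in for `/library/book`; a full XPath 1.0 engine, such as the third-party lxml library, would accept the expressions as written above. The `<shelf>` element and the second title are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one book directly under <library>, one nested deeper.
XML = """<library>
  <book id="b1"><title>Atomic Habits</title></book>
  <shelf><book id="b2"><title>Deep Work</title></book></shelf>
</library>"""

root = ET.fromstring(XML)  # root is the <library> element

all_books = root.findall(".//book")  # like //book: any depth
child_books = root.findall("book")   # like /library/book: direct children only

all_ids = [b.get("id") for b in all_books]
child_ids = [b.get("id") for b in child_books]
print(all_ids)
print(child_ids)
```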

3.2 Selecting attribute nodes

xpath
//@id → every id attribute anywhere
//book/@id → id attribute of every book
//price/@currency → currency attribute of every price
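
Here is a rough Python equivalent, assuming the standard library's xml.etree.ElementTree. ElementTree cannot return attribute nodes from a path, so this sketch selects the elements that carry the attribute and then reads its value. The two-book document is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for this illustration.
XML = """<library>
  <book id="b1"><price currency="INR">499.00</price></book>
  <book id="b2"><price currency="USD">12.50</price></book>
</library>"""

root = ET.fromstring(XML)

# Roughly //@id: the value of every id attribute, on any element.
ids = [el.get("id") for el in root.iter() if "id" in el.attrib]

# Roughly //price/@currency: the currency attribute of every <price>.
currencies = [p.get("currency") for p in root.findall(".//price[@currency]")]

print(ids)
print(currencies)
```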

3.3 Selecting text nodes

xpath
//title/text() → all text nodes directly inside any <title>

Important difference:

xpath
//title → the <title> elements
//title/text() → the text nodes inside the <title> elements
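
The same distinction is visible in the DOM. This minimal Python sketch (standard library only, with a made-up one-book document) picks the `<title>` element and then its text node child, which are two different nodes with different types.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

doc = minidom.parseString(
    "<library><book><title>Atomic Habits</title></book></library>"
)

title_element = doc.getElementsByTagName("title")[0]  # what //title selects
title_text = title_element.firstChild                 # what //title/text() selects

element_markup = title_element.toxml()  # the element, tags included
text_data = title_text.data             # just the characters

print(title_element.nodeType == Node.ELEMENT_NODE, element_markup)
print(title_text.nodeType == Node.TEXT_NODE, text_data)
```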

3.4 Selecting comment nodes

xpath
//comment() → all comment nodes anywhere

3.5 Selecting processing instructions

xpath
//processing-instruction()
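
Python's ElementTree drops comments by default, so to collect comment and processing-instruction nodes with the standard library you can walk a minidom tree instead. The sketch below is one way to do that; the document, the second comment, and the stylesheet name library.xsl are all hypothetical.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

# Hypothetical document, invented for this illustration.
XML = """<?xml-stylesheet type="text/xsl" href="library.xsl"?>
<library>
  <!-- TODO: update price -->
  <book id="b1"><!-- bestseller --><title>Atomic Habits</title></book>
</library>"""

def walk(node):
    """Yield the node and all of its descendants in document order."""
    yield node
    for child in node.childNodes:
        yield from walk(child)

doc = minidom.parseString(XML)

# Roughly //comment()
comments = [n.data.strip() for n in walk(doc)
            if n.nodeType == Node.COMMENT_NODE]

# Roughly //processing-instruction()
instructions = [(n.target, n.data) for n in walk(doc)
                if n.nodeType == Node.PROCESSING_INSTRUCTION_NODE]

print(comments)
print(instructions)
```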

Lesson 4 – Very practical examples (copy-paste & try)

Example 1 – Get all book titles (two ways)

xpath
//book/title
//book/title/text()

Both give you the titles — but the second one gives you text nodes, the first gives you element nodes.

Example 2 – Get all prices in INR

xpath
//price[@currency = "INR"]
//price[@currency = "INR"]/text()

First → the <price> elements
Second → the text inside them ("499", "349", etc.)
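
Examples 1 and 2 can be tried in Python with the standard library's xml.etree.ElementTree, which understands attribute predicates like `[@currency='INR']`. This is only a sketch: the three-book document is made up, and because ElementTree has no `text()` step, the text is read from each element's `.text` property.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for this illustration.
XML = """<library>
  <book id="b1"><title>Atomic Habits</title><price currency="INR">499.00</price></book>
  <book id="b2"><title>Deep Work</title><price currency="USD">12.50</price></book>
  <book id="b3"><title>Ikigai</title><price currency="INR">349.00</price></book>
</library>"""

root = ET.fromstring(XML)

# Example 1: the <title> elements, then their text.
title_elements = root.findall(".//book/title")
titles = [t.text for t in title_elements]

# Example 2: the <price> elements in INR, then their text.
inr_prices = root.findall(".//price[@currency='INR']")
inr_values = [p.text for p in inr_prices]

print(titles)
print(inr_values)
```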

Example 3 – Get the title of the first book

xpath
/library/book[1]/title
/library/book[1]/title/text()

Example 4 – Get all attributes named “id”

xpath
//@id

Example 5 – Get books that are not in stock

xpath
//book[inStock = "false"]
//book[inStock/text() = "false"]

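Both work here. The first compares the string-value of <inStock> (all of its descendant text joined together). The second compares each text node that is a direct child of <inStock>, so it can miss a match when the text is split by a comment or a child element.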
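
Examples 3 and 5 also fit into ElementTree's XPath subset, which supports position predicates like `[1]` and child-text predicates like `[inStock='false']`. A minimal sketch, assuming a made-up two-book document:

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for this illustration.
XML = """<library>
  <book id="b1"><title>Atomic Habits</title><inStock>true</inStock></book>
  <book id="b2"><title>Deep Work</title><inStock>false</inStock></book>
</library>"""

root = ET.fromstring(XML)  # root is the <library> element

# Example 3: /library/book[1]/title, then its text.
first_title = root.find("book[1]/title").text

# Example 5: books whose <inStock> child has the text "false".
out_of_stock = [b.get("id") for b in root.findall(".//book[inStock='false']")]

print(first_title)
print(out_of_stock)
```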

Lesson 5 – Common beginner mistakes & how to fix them

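Mistake 1: Thinking //title gives you the text

xpath
//title ← selects <title> ELEMENTS, not strings

What actually happens in a comparison

xpath
//title = "Atomic Habits" ← true if any <title> has the string-value "Atomic Habits" (XPath 1.0 compares each element's string-value)
//title/text() = "Atomic Habits" ← true if any single text node directly inside a <title> equals "Atomic Habits"

If you need the text itself rather than a true/false answer, select //title/text() or convert with string(//title).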
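
In XPath 1.0, comparing an element with a string uses the element's string-value, which is all of its descendant text joined together, while text() looks at individual text nodes. The two only differ when an element has mixed content. The Python sketch below imitates both views with the standard library; the `<em>` markup inside the title is a hypothetical case chosen to show the difference.

```python
import xml.etree.ElementTree as ET

# Hypothetical title with inline markup, invented for this illustration.
title = ET.fromstring("<title>Atomic <em>Habits</em></title>")

# String-value: every descendant text node, concatenated in document order.
string_value = "".join(title.itertext())

# The only text node that is a direct child of <title>.
first_text_node = title.text

print(repr(string_value))
print(repr(first_text_node))
```

Here a string-value comparison with "Atomic Habits" would succeed, while a comparison against the direct text node would not.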

Mistake 2: Forgetting that whitespace creates text nodes

XML

→ There is a text node containing a newline and spaces before <title>

So book/text() will return that whitespace, not the title.

Fix: Use book/title/text() instead
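
You can see these whitespace text nodes for yourself with a short Python sketch (standard library only). The indented `<book>` below is a made-up example; the direct text children of `<book>` turn out to be nothing but line breaks and spaces, while the title text lives one level down.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

# Hypothetical pretty-printed fragment, invented for this illustration.
XML = """<book>
  <title>Atomic Habits</title>
</book>"""

book = minidom.parseString(XML).documentElement
title = book.getElementsByTagName("title")[0]

# Roughly book/text(): text nodes that are direct children of <book>.
book_text = [n.data for n in book.childNodes if n.nodeType == Node.TEXT_NODE]

# Roughly book/title/text(): text nodes that are direct children of <title>.
title_text = [n.data for n in title.childNodes if n.nodeType == Node.TEXT_NODE]

print(book_text)
print(title_text)
```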

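Mistake 3: Using //text() when you want element text

xpath
//text() = "Atomic Habits" ← returns true or false, not nodes, and is true if ANY text node anywhere in the document matches

Better

xpath
//title[text() = "Atomic Habits"] ← the <title> elements that have a text node child equal to "Atomic Habits"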

Lesson 6 – Try yourself exercises (do these!)

  1. Select all prices (both element and text node versions)
  2. Select all books that have price > 400
  3. Select the title of the book with id="b2"
  4. Select all attributes named “currency”
  5. Select all comments in the document
  6. Select the text inside the magazine title
  7. Select books that are not in stock

Lesson 7 – Real-world context (where you actually use node-type aware XPath)

  • Browser DevTools → $x("//title/text()")
  • Web scraping → selecting text nodes only to avoid tags
  • XSLT → matching text nodes vs elements
  • SOAP / web services → extracting values from very nested XML
  • Automated testing → checking exact text content without markup
  • Data extraction → getting clean prices, names, dates from XML feeds

Would you like to continue with one of these next?

  • XPath with namespaces (very common in real XML)
  • Advanced node tests (text(), comment(), processing-instruction())
  • XPath functions that work with nodes (name(), local-name(), lang(), normalize-space())
  • XPath axes in detail (ancestor, following-sibling, preceding…)
  • Real-world examples — RSS, SOAP envelope, e-invoice, Android manifest
  • XPath vs CSS selectors – when to use which for node selection

Just tell me which direction feels most useful or interesting for you right now! 😊
