Chapter 13: XML XPath

XML + XPath, explained as if I'm sitting next to you: step by step, like a patient teacher who wants you to really understand both the XML structure and how to find things inside it using XPath.

We will go slowly: first refresh XML structure → then understand what XPath is → learn the most useful expressions → see many real examples → practice common patterns → finish with tips & pitfalls.

1. Quick Reminder: Why we need XPath

XML documents are hierarchical trees (like folders inside folders).

Example simple XML we will use a lot:

XML
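As a stand-in for the library document, here is a minimal Python sketch that holds a sample XML string and prints its tree shape. The XML is a reconstruction, not the original: element names and values (including the stock numbers and the bare <magazine>) are assumptions chosen only to agree with the results listed in section 4.

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstruction of the chapter's library document.
# Attribute and text values follow the results quoted in section 4.
LIBRARY_XML = """
<library>
  <book category="fiction" year="2018">
    <title>Atomic Habits</title>
    <author><first>James</first><last>Clear</last></author>
    <price currency="INR">499.00</price>
    <stock>12</stock>
  </book>
  <book category="finance" year="1997">
    <title>Rich Dad Poor Dad</title>
    <author>Robert Kiyosaki</author>
    <price currency="INR">350.00</price>
    <stock>5</stock>
  </book>
  <book category="fiction" year="1988">
    <title>The Alchemist</title>
    <author>Paulo Coelho</author>
    <price currency="USD">12.99</price>
    <stock>0</stock>
  </book>
  <magazine>
    <price currency="INR">120.00</price>
  </magazine>
</library>
"""

root = ET.fromstring(LIBRARY_XML)


def outline(elem, depth=0):
    """Return the element tree as indented lines, one line per element."""
    lines = ["  " * depth + elem.tag]
    for child in elem:
        lines.extend(outline(child, depth + 1))
    return lines


print("\n".join(outline(root)))
```

The printed outline makes the "folders inside folders" picture concrete: <library> at the top, each <book> one level down, and <title>, <author>, <price>, <stock> below that.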

Now imagine you want to answer questions like:

  • Give me all book titles
  • Find the price of “Atomic Habits”
  • Get all books published after 2000
  • Find books that are out of stock
  • Get the last name of the author of the first book

XPath is a W3C standard language designed to answer exactly these kinds of questions.

2. What is XPath? (The clearest explanation)

XPath = XML Path Language

It is a query language for selecting nodes (elements, attributes, text…) from an XML document.

Think of it like:

  • File path on your computer: /home/user/documents/report.pdf
  • URL path: /products/electronics/phones/iphone-15

XPath is the same idea — but for navigating inside XML trees.

Two important styles:

  • Absolute path → starts from the document root → /library/book/title
  • Relative path → starts from the current (context) node → book/title
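To see why the starting point matters, here is a small sketch using Python's standard xml.etree.ElementTree. That module always evaluates a path relative to the element you call it on, so the absolute path /library/book/title corresponds to book/title evaluated from the <library> element. The two-book document is a shortened, assumed version of the library example.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<library>"
    "<book><title>Atomic Habits</title></book>"
    "<book><title>The Alchemist</title></book>"
    "</library>"
)

# Context node = <library>: same nodes as the absolute path /library/book/title.
from_root = [t.text for t in doc.findall("book/title")]

# Context node = the second <book>: the relative path is now just "title".
second_book = doc.findall("book")[1]
from_book = [t.text for t in second_book.findall("title")]

# The same relative path finds nothing when the context node has no <book> children.
from_wrong_context = second_book.findall("book/title")

print(from_root, from_book, from_wrong_context)
```

The same expression, book/title, gives two titles from <library> and nothing from a <book>, which is exactly the difference between the two styles.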

3. Most Important XPath Concepts & Symbols

| Symbol / Expression | Meaning | Example | What it selects |
|---|---|---|---|
| / | Child separator (direct child) | /library/book | All <book> directly under <library> |
| // | Descendant-or-self (any level) | //title | All <title> elements anywhere |
| . | Current node | ./price | <price> child of current node |
| .. | Parent node | ../author | <author> children of the parent (siblings of the current node) |
| @ | Attribute | @category or //@year | category attribute of the current node / all year attributes in the document |
| * | Any element | //book/* | All direct children of any <book> |
| [] | Predicate (condition) | //book[@category='fiction'] | Books with category="fiction" |
| position() | Position in list | (//book)[position()=1] | First <book> in document order |
| last() | Last item | (//book)[last()] | Last <book> |
| text() | Select text content | //title/text() | Text inside all <title> elements |
| starts-with() | String starts with… | //title[starts-with(., 'The ')] | Titles starting with "The " |
| contains() | String contains… | //book[contains(author, 'Coelho')] | Books where author contains "Coelho" |
| = , != , < , > | Comparison operators | //book[@year > 2000] | Books newer than 2000 |

Note: quotes inside an XPath expression must be straight quotes (' or "); curly quotes copied from a web page make the expression invalid.
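Several of these symbols can be tried straight away with Python's standard xml.etree.ElementTree, which implements a limited XPath subset (no functions such as contains() and no numeric comparisons). This is only a sketch: the two-book document is assumed, and every path starts with "." or a tag name because ElementTree evaluates paths relative to an element.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""
<library>
  <book category="fiction" year="1988">
    <title>The Alchemist</title>
    <price currency="USD">12.99</price>
  </book>
  <book category="finance" year="1997">
    <title>Rich Dad Poor Dad</title>
    <price currency="INR">350.00</price>
  </book>
</library>
""")


def tags(elems):
    """Tag names of a list of elements, for compact printing."""
    return [e.tag for e in elems]


# //title : descendants at any level
descendants = [t.text for t in root.findall(".//title")]
# /library/book[1]/* : all direct children of the first book
children = tags(root.findall("book[1]/*"))
# //book[@category='fiction'] : predicate on an attribute value
fiction = [b.findtext("title") for b in root.findall("book[@category='fiction']")]
# //*[@currency] : any element that has the attribute
has_currency = tags(root.findall(".//*[@currency]"))
# book[last()] : last book child of <library>
last_book = root.find("book[last()]").findtext("title")
# //price/.. : parent of each price
parents = tags(root.findall(".//price/.."))

print(descendants, children, fiction, has_currency, last_book, parents)
```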

4. Step-by-Step XPath Examples (using the library XML)

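| What you want to select | XPath expression | Result (what it finds) |
|---|---|---|
| All book titles | //book/title or //title | Atomic Habits, Rich Dad Poor Dad, The Alchemist |
| Title of the first book | (//book/title)[1] | Atomic Habits |
| All prices | //price | 499.00, 350.00, 12.99, 120.00 |
| Prices in INR | //price[@currency='INR'] | 499.00, 350.00, 120.00 |
| All books that are fiction | //book[@category='fiction'] | Atomic Habits + The Alchemist |
| Books published after 2000 | //book[@year > 2000] | Atomic Habits |
| Books that are out of stock | //book[stock=0] | The Alchemist |
| Author's last name of first book | //book[1]/author/last | Clear |
| All authors (simple & complex) | //author | The <author> elements for James Clear, Robert Kiyosaki, Paulo Coelho |
| Author names stored as plain text | //author/text() | Only the text directly inside <author> (the simple authors) |
| Parts of structured author names | //author/* | The child elements of <author>, such as <last> |
| All elements that have a currency attribute | //*[@currency] | All <price> elements |
| Second book's title | (//book)[2]/title | Rich Dad Poor Dad |
| Books with price less than 400 INR | //book[price[@currency='INR' and . < 400]] | Rich Dad Poor Dad |
| Magazines (any child of <library> that is not a book) | /library/*[not(self::book)] | <magazine> element |

Note: //*[not(self::book)] would select every element in the document that is not a <book> (including <library>, <title>, <price>…), so the last row restricts the search to the direct children of <library>.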
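One detail in the examples above is easy to miss: //book[1] and (//book)[1] are not the same expression. The first keeps every <book> that is the first <book> child of its own parent; the second takes the first <book> of the whole document. In the library document all books share one parent, so both agree. The sketch below uses an assumed two-shelf document (not the chapter's XML) where they differ, with Python's standard xml.etree.ElementTree. That module has no parenthesised paths, so the (//book)[1] case is emulated by indexing the Python list.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with books on two shelves.
root = ET.fromstring("""
<library>
  <shelf><book>A</book><book>B</book></shelf>
  <shelf><book>C</book><book>D</book></shelf>
</library>
""")

# //book[1] : every <book> that is the first <book> child of its own parent.
first_per_parent = [b.text for b in root.findall(".//book[1]")]

# (//book)[1] : the first <book> in document order, emulated with list indexing.
first_overall = root.findall(".//book")[0].text

print(first_per_parent, first_overall)
```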

5. Real-World Style Examples (very common patterns)

Pattern 1: Find products by category and price range

xpath
//product[@category='electronics' and price > 1000 and price < 5000]
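A hedged Python version of Pattern 1, using the standard xml.etree.ElementTree. Its XPath subset has no numeric comparisons, so this sketch swaps in a different technique for half of the expression: the attribute test stays in the path and the price range is checked in Python. The catalog, the product names and the helper products_in_range are all made up for illustration.

```python
import xml.etree.ElementTree as ET

catalog = ET.fromstring("""
<catalog>
  <product category="electronics"><name>Headphones</name><price>2500</price></product>
  <product category="electronics"><name>Laptop</name><price>60000</price></product>
  <product category="electronics"><name>Cable</name><price>300</price></product>
  <product category="books"><name>Novel</name><price>1200</price></product>
</catalog>
""")


def products_in_range(root, category, low, high):
    """Names of products in `category` whose price is strictly between low and high."""
    # The category is pasted into the expression, so it must not contain a single quote.
    matches = root.findall(f".//product[@category='{category}']")
    return [
        p.findtext("name")
        for p in matches
        if low < float(p.findtext("price")) < high
    ]


print(products_in_range(catalog, "electronics", 1000, 5000))
```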

Pattern 2: Get all items from orders of a specific customer

xpath
//order[customer/name='Samarth Jain']//item/name
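Pattern 2 has two parts: a predicate that picks the right orders, and a descendant step that collects item names below them. The sketch below mirrors that split in Python with the standard xml.etree.ElementTree, whose predicates cannot contain a nested path like customer/name, so the predicate becomes an ordinary if. The orders document and the helper item_names_for are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

orders = ET.fromstring("""
<orders>
  <order>
    <customer><name>Samarth Jain</name></customer>
    <items>
      <item><name>Keyboard</name></item>
      <item><name>Mouse</name></item>
    </items>
  </order>
  <order>
    <customer><name>Another Customer</name></customer>
    <items><item><name>Monitor</name></item></items>
  </order>
</orders>
""")


def item_names_for(root, customer_name):
    """Item names from every order placed by `customer_name`."""
    names = []
    for order in root.iter("order"):
        # Predicate part: [customer/name='...']
        if any(n.text == customer_name for n in order.findall("customer/name")):
            # Path part: //item/name, searched below the matching order only
            names.extend(n.text for n in order.findall(".//item/name"))
    return names


print(item_names_for(orders, "Samarth Jain"))
```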

Pattern 3: Find elements with specific text (case-sensitive)

xpath
//title[. = 'Atomic Habits']

Pattern 4: Find elements containing certain text (partial match)

xpath
//title[contains(., 'Dad')]
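Patterns 3 and 4 side by side, as a small sketch with Python's standard xml.etree.ElementTree. The exact match uses the [.='text'] predicate (available from Python 3.7); contains() is outside ElementTree's subset, so the partial match is swapped for Python's in operator. The three titles are taken from the results in section 4; the document shape is assumed.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<library>"
    "<book><title>Atomic Habits</title></book>"
    "<book><title>Rich Dad Poor Dad</title></book>"
    "<book><title>The Alchemist</title></book>"
    "</library>"
)

# Pattern 3: exact match on the whole text, and it is case-sensitive.
exact = [t.text for t in root.findall(".//title[.='Atomic Habits']")]
wrong_case = root.findall(".//title[.='atomic habits']")

# Pattern 4: partial match, done in Python instead of contains().
partial = [t.text for t in root.iter("title") if "Dad" in (t.text or "")]

print(exact, wrong_case, partial)
```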

Pattern 5: Select attribute values only

xpath
//book/@category
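In code, the result of //book/@category is usually wanted as a plain list of strings. Python's standard xml.etree.ElementTree cannot select attribute nodes directly, so a rough equivalent is to select the elements that carry the attribute and read it with get(). The document below is an assumed, stripped-down library.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<library>"
    '<book category="fiction"/><book category="finance"/><book/>'
    "</library>"
)

# [@category] keeps only books that have the attribute; get() reads its value.
categories = [b.get("category") for b in root.findall(".//book[@category]")]

print(categories)
```

The third <book> has no category attribute, so it contributes nothing, just as it would with the XPath expression.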

Pattern 6: Count something

xpath
count(//book[@year > 2000])

(returns 1 in our example)
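The same count can be sketched in Python with the standard xml.etree.ElementTree, which supports neither count() nor numeric comparisons, so both are done in Python here. The year attributes are assumed values consistent with the result above.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<library>"
    '<book year="2018"/><book year="1997"/><book year="1988"/>'
    "</library>"
)

# count(//book[@year > 2000]) : compare the attribute as a number, then count.
newer_than_2000 = sum(1 for b in root.iter("book") if int(b.get("year")) > 2000)

print(newer_than_2000)
```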

6. How XPath is used in real code (very quick examples)

JavaScript (browser)

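In the browser, document.evaluate(expression, contextNode, namespaceResolver, resultType, result) evaluates an XPath expression and returns an XPathResult that you read or iterate over.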

Python (lxml – very popular)

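With lxml, you parse the document and call xpath() on the tree or on any element. It returns a list of matching elements or strings, or a single number, string or boolean when the expression is a function such as count().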
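A minimal sketch of the lxml approach, assuming lxml is installed (it is a third-party package). So that the example still runs without it, the helper falls back to the standard library and does the comparison in Python. The helper name titles_after and the two-book document are made up for illustration.

```python
import xml.etree.ElementTree as ET

try:
    from lxml import etree as LXML  # third-party: pip install lxml
except ImportError:
    LXML = None

XML = b"""<library>
  <book year="2018"><title>Atomic Habits</title></book>
  <book year="1997"><title>Rich Dad Poor Dad</title></book>
</library>"""


def titles_after(xml_bytes, year):
    """Titles of books whose year attribute is greater than `year`."""
    if LXML is not None:
        root = LXML.fromstring(xml_bytes)
        # Full XPath 1.0; $year is bound from the keyword argument.
        found = root.xpath("//book[@year > $year]/title/text()", year=year)
        return [str(t) for t in found]
    # Fallback: the standard library has no numeric comparisons in paths.
    root = ET.fromstring(xml_bytes)
    return [
        b.findtext("title")
        for b in root.iter("book")
        if int(b.get("year")) > year
    ]


print(titles_after(XML, 2000))
```

Passing the year as an XPath variable ($year) rather than formatting it into the string keeps the expression fixed and avoids quoting problems.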

Java (very common in enterprise)

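In Java, the standard javax.xml.xpath package does the job: create an XPath object with XPathFactory.newInstance().newXPath(), then call evaluate(expression, source, returnType), for example with XPathConstants.NODESET to get a list of nodes.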

Quick Summary – XPath Cheat Sheet (keep this handy)

| Goal | Typical XPath |
|---|---|
| All something | //something |
| Something inside specific parent | /root/parent/something |
| By attribute | //tag[@attr='value'] |
| By attribute value comparison | //tag[@price > 500] |
| Text equals | //tag[. = 'exact text'] |
| Text contains | //tag[contains(., 'part')] |
| First / last item | (//tag)[1] or (//tag)[last()] |
| Position | (//tag)[position() = 2] |
| Has attribute | //*[@attr] |
| Children of current | ./child (direct children) or .//child (descendants at any level) |

