Chapter 13: Set Operations
Set operations let you combine the results of two or more SELECT queries as if they were mathematical sets. Think of each SELECT as producing a “bag of rows”. With set operations, you can:
- Merge them together
- Find what’s common
- Find what’s in one but not the other
The four main set operations we’ll cover today are:
- UNION and UNION ALL
- INTERSECT
- EXCEPT (called MINUS in Oracle)
These are very powerful for reports, data cleaning, finding duplicates, comparing datasets, etc.
Important Rules That Apply to ALL Set Operations
- All SELECTs must have the same number of columns
- Corresponding columns must have compatible data types (e.g., INT with INT, VARCHAR with VARCHAR)
- Column names are taken from the FIRST SELECT (others are ignored)
- Order of columns matters — they must match positionally
- Set operations remove duplicates by default (except UNION ALL)
Let’s use our bookshop database and create two more tables for realistic examples.
Setup – Extra Tables for This Chapter
|
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
-- Indian authors table (subset) CREATE TABLE indian_authors AS SELECT * FROM authors WHERE country = 'India'; -- US authors table (subset) CREATE TABLE us_authors AS SELECT * FROM authors WHERE country = 'USA'; -- Old books (published before 2000) CREATE TABLE old_books AS SELECT * FROM books WHERE published_year < 2000; -- Recent books (published 2000 or later) CREATE TABLE recent_books AS SELECT * FROM books WHERE published_year >= 2000; |
1. UNION – Combine Two Result Sets (Removes Duplicates)
UNION takes all rows from both queries and removes duplicates.
Syntax:
|
0 1 2 3 4 5 6 7 8 |
SELECT column1, column2, ... FROM table1 UNION SELECT column1, column2, ... FROM table2; |
Example – All authors from India AND USA (no duplicates):
|
0 1 2 3 4 5 6 7 8 9 10 11 12 13 |
SELECT name, country FROM indian_authors UNION SELECT name, country FROM us_authors ORDER BY name; |
Result (only unique authors):
| name | country |
|---|---|
| Amish Tripathi | India |
| Chetan Bhagat | India |
| Chitra Banerjee | USA |
| Jhumpa Lahiri | USA |
| Khushwant Singh | India |
| R.K. Narayan | India |
| Ruskin Bond | India |
| Sudha Murty | India |
| Vikram Seth | India |
If an author was in both tables → appears only once.
2. UNION ALL – Combine Two Result Sets (Keeps All Duplicates)
UNION ALL is like UNION but does NOT remove duplicates → faster and keeps everything.
Example – All books from old AND recent (even if somehow duplicated):
|
0 1 2 3 4 5 6 7 8 9 10 11 12 13 |
SELECT title, price, published_year FROM old_books UNION ALL SELECT title, price, published_year FROM recent_books ORDER BY published_year; |
Result: All 10 books appear, even if there was overlap (in our case, no overlap, but if there was → duplicates kept).
When to use UNION ALL instead of UNION?
- When you know there are no duplicates (or you want to keep them)
- When performance matters (UNION ALL is faster because no deduplication)
3. INTERSECT – Find Common Rows (What’s in Both)
INTERSECT returns only rows that appear in both result sets.
Syntax:
|
0 1 2 3 4 5 6 7 8 |
SELECT ... FROM table1 INTERSECT SELECT ... FROM table2; |
Example – Authors who appear in both indian_authors and us_authors (in our data: none, but imagine):
|
0 1 2 3 4 5 6 7 8 9 10 |
SELECT name, country FROM indian_authors INTERSECT SELECT name, country FROM us_authors; |
Result: Empty (no common authors)
Realistic example – Find books that are both old and recent? (none in our case):
|
0 1 2 3 4 5 6 7 8 9 10 |
SELECT title, price FROM old_books INTERSECT SELECT title, price FROM recent_books; |
Note: MySQL does not support INTERSECT natively until version 8.0.31+. Workaround: Use INNER JOIN or EXISTS.
Alternative in MySQL:
|
0 1 2 3 4 5 6 7 8 9 10 |
SELECT title, price FROM old_books WHERE (title, price) IN ( SELECT title, price FROM recent_books ); |
4. EXCEPT / MINUS – Rows in First But Not in Second
EXCEPT (SQL standard) or MINUS (Oracle) returns rows that are in the first result set but not in the second.
Syntax:
|
0 1 2 3 4 5 6 7 8 |
SELECT ... FROM table1 EXCEPT -- or MINUS in Oracle SELECT ... FROM table2; |
Example – Authors who are Indian but NOT in the US list:
|
0 1 2 3 4 5 6 7 8 9 10 11 |
SELECT name, country FROM indian_authors EXCEPT SELECT name, country FROM us_authors ORDER BY name; |
Result: All Indian authors (since no overlap)
Another example – Old books that are NOT in recent books:
|
0 1 2 3 4 5 6 7 8 9 10 |
SELECT title, published_year FROM old_books EXCEPT SELECT title, published_year FROM recent_books; |
Result: All old books (The Room on the Roof, The Blue Umbrella, Malgudi Days, Train to Pakistan)
Note: MySQL does not support EXCEPT/MINUS natively. Workaround: Use LEFT JOIN + IS NULL or NOT EXISTS.
MySQL workaround for EXCEPT:
|
0 1 2 3 4 5 6 7 8 9 10 11 |
SELECT b1.title, b1.published_year FROM old_books b1 LEFT JOIN recent_books b2 ON b1.title = b2.title AND b1.published_year = b2.published_year WHERE b2.title IS NULL; |
Full Realistic Business Example – Combining Everything
|
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 |
-- Report: All unique authors from both Indian and US lists, -- plus a flag if they have books published after 2000 SELECT name AS author_name, country, 'Indian List' AS source, CASE WHEN EXISTS (SELECT 1 FROM books b WHERE b.author_id = a.author_id AND b.published_year >= 2000) THEN 'Has recent books' ELSE 'No recent books' END AS recent_activity FROM indian_authors a UNION ALL SELECT name, country, 'US List' AS source, CASE WHEN EXISTS (SELECT 1 FROM books b WHERE b.author_id = a.author_id AND b.published_year >= 2000) THEN 'Has recent books' ELSE 'No recent books' END FROM us_authors a ORDER BY author_name; |
