Chapter 48: R Mode
1. What the Mode Actually Is (Simple & Honest Intuition)
The mode is:
the value (or values) that appears most frequently in the data set.
That’s it. There is no summing and no sorting to find a middle value; you only count how often each value occurs.
Real-life feeling: imagine you ask 20 friends in Hyderabad, “Which biryani do you prefer most?”
Answers:
- Hyderabadi: 9 people
- Ambur: 4 people
- Lucknowi: 3 people
- Kolkata: 2 people
- Vijayawada: 2 people
→ The mode = Hyderabadi (appears 9 times)
Even if someone says “I ate a ₹15,000 Hyderabadi family pack once”, it doesn’t change the mode — frequency is all that matters.
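The same survey can be written down in R as a named vector of counts. This is only a minimal sketch; the variable name `biryani_votes` is made up for illustration.

```r
# Survey counts from the biryani example (variable name is illustrative)
biryani_votes <- c(Hyderabadi = 9, Ambur = 4, Lucknowi = 3,
                   Kolkata = 2, Vijayawada = 2)

sum(biryani_votes)               # 20 friends in total
names(which.max(biryani_votes))  # "Hyderabadi"
```

Only the counts matter here; nothing about price or portion size enters the calculation.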
2. Important Facts About Mode (Many Beginners Miss These)
- A data set can have no mode (every value appears exactly once)
- It can have one mode → unimodal
- It can have two modes → bimodal
- It can have several modes → multimodal
- Mode is the only measure of central tendency that works well with categorical / nominal data (colors, brands, cities, food types, gender, blood group, etc.)
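The four cases above (no mode, one mode, two modes, several modes) can be sketched with a small helper. `find_modes()` is a hypothetical function written for this illustration, not part of base R, and it assumes that "all values equally frequent" counts as "no mode".

```r
# find_modes() is a hypothetical helper, not a base R function.
# It returns the labels of every value tied for the highest count,
# or character(0) when all distinct values occur equally often.
find_modes <- function(x) {
  counts <- table(x)                      # frequency of each distinct value
  if (length(counts) == 0) return(character(0))
  top <- max(counts)                      # highest frequency
  # Several distinct values, all with the same count: treat as "no mode"
  if (length(counts) > 1 && all(counts == top)) return(character(0))
  names(counts)[counts == top]            # labels (as character strings)
}

find_modes(c(4, 7, 9, 12))               # character(0): no mode
find_modes(c(7, 8, 8, 9))                # "8": unimodal
find_modes(c("A", "A", "B", "B", "C"))   # "A" "B": bimodal
find_modes(c(1, 1, 2, 2, 3, 3, 4))       # "1" "2" "3": multimodal
```

Because `table()` stores values as labels, the result is always a character vector, which is also why the helper works unchanged on categorical data.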
3. Real Hyderabad Examples – When Mode Wins
Example A – Favorite food delivery cuisine
```r
cuisine <- c("Biryani", "Pizza", "Biryani", "Chinese", "Biryani", "Pizza",
             "Biryani", "South Indian", "Biryani", "Biryani", "Pizza",
             "Chinese", "Biryani", "Biryani")

table(cuisine)
#      Biryani      Chinese        Pizza South Indian
#            8            2            3            1

# Mode = Biryani (8 times)
```
→ If Swiggy/Zomato asks “what cuisine do people order most in Hyderabad?”, they report the mode, not the mean price or median time.
Example B – Most common shoe size in a college class
```r
shoe_sizes <- c(7, 8, 9, 8, 7, 10, 8, 9, 7, 8, 8, 9, 7, 8, 11)

table(shoe_sizes)
#  7  8  9 10 11
#  4  6  3  1  1

# Mode = 8 (appears 6 times)
```
→ Shoe companies care about mode when deciding which size to produce the most of.
Example C – Bimodal / multimodal data (very important real case)
```r
ages_at_party <- c(22, 23, 24, 22, 35, 36, 37, 38, 22, 23, 24, 35, 36)

table(ages_at_party)
# 22 23 24 35 36 37 38
#  3  2  2  2  2  1  1

# Strict mode = 22 (3 times); 23, 24, 35 and 36 tie at 2 each
```
→ The frequency table shows two clusters: college students (~22–24) and young working professionals (~35–38), with nobody in between. Strictly speaking, 22 is the only mode, but the second peak around 35–36 is what gives the data its bimodal shape. The mean and median each collapse the data into a single number and would hide this completely.
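To see how much a single summary number loses here, the mean and median can be computed on the same vector. This is a small sketch reusing the `ages_at_party` data from Example C.

```r
ages_at_party <- c(22, 23, 24, 22, 35, 36, 37, 38, 22, 23, 24, 35, 36)

mean(ages_at_party)     # 29
median(ages_at_party)   # 24

# How many guests are actually between 25 and 34?
sum(ages_at_party >= 25 & ages_at_party <= 34)   # 0
```

The mean of 29 falls in the gap between the two groups, at an age that no guest has.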
4. How R Actually Computes Mode (Base R vs Modern Way)
Base R problem: base R has no built-in function that returns the statistical mode. A function called mode() does exist, but it does something completely different: it reports the storage mode of an object ("numeric", "character", etc.).
So people write workarounds:
```r
# Classic ugly base R way
names(which.max(table(ages_at_party)))
# "22"  (returned as a character string)
```
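Two details of the base R route are easy to trip over, and a short sketch makes them visible. The vector `x` below is made-up data with a deliberate tie between 3 and 5.

```r
x <- c(3, 3, 5, 5, 7)   # illustrative data: 3 and 5 both appear twice

mode(x)                       # "numeric": the storage mode, not the statistical mode

# which.max() reports only the FIRST maximum, so a tie is silently dropped
names(which.max(table(x)))    # "3"

# Keeping every value that reaches the top count shows both
counts <- table(x)
names(counts)[counts == max(counts)]   # "3" "5"
```

If ties matter for your data, the last form is the safer base R idiom.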
Modern / recommended way 2026 (much cleaner)
```r
# Option 1: DescTools package
# install.packages("DescTools")   # run once
library(DescTools)

Mode(ages_at_party)                 # 22
Mode(ages_at_party, na.rm = TRUE)   # drops NA values before counting

# Option 2: dplyr + count (very tidyverse style)
library(dplyr)

ages_at_party_df <- data.frame(age = ages_at_party)

ages_at_party_df |>
  count(age) |>
  slice_max(n, n = 1)   # keeps every value tied for the top count
```
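Missing values deserve a quick check in base R too. The sketch below uses a made-up vector, `ages_with_na`, to show that `table()` leaves out `NA` by default and only counts it when asked via its `useNA` argument.

```r
ages_with_na <- c(22, 23, NA, 22, NA, NA, NA)   # illustrative data

# By default table() ignores NA, so NA can never "win"
table(ages_with_na)
names(which.max(table(ages_with_na)))   # "22"

# With useNA = "ifany", NA becomes its own category (here the most frequent one)
table(ages_with_na, useNA = "ifany")
sum(is.na(ages_with_na))                # 4 missing values
```

Whether missing answers should count as a category depends on the question being asked, so it is worth making that choice explicitly.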
5. When to Use Mode (Quick 2026 Decision Guide)
Use mode when:
- Data is categorical / nominal (colors, brands, cities, food types, gender, blood group, preferred language, most common complaint)
- You want the most frequent / most popular answer
- You suspect multiple clusters (bimodal / multimodal data)
- You are analyzing survey data where people choose from fixed options
- You want to know “what do most people do / prefer / buy”
Do NOT use mode when:
- Data is continuous numeric with few repeats (exam marks out of 100, temperatures, heights) → almost always no mode or meaningless
- All values appear roughly equally often → no clear mode
- You need a measure that uses every data point (mean does this)
- You want to do further math (variance, standard deviation need mean)
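The first point in the list above is easy to check directly. The heights below are made-up measurements, used only to show what happens when no value repeats.

```r
# Illustrative continuous data: heights in cm, measured to one decimal place
heights_cm <- c(161.2, 158.7, 172.4, 165.9, 170.1, 168.3, 175.6, 163.8)

max(table(heights_cm))      # 1: every value occurs exactly once
anyDuplicated(heights_cm)   # 0: no repeated value, so there is no mode
```

For data like this, the mean or median is usually the more useful summary.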
6. Your Mini Practice Right Now (Copy → Run & Play)
```r
# Your own small survey – favorite Hyderabad chaat item
chaat <- c("Pani Puri", "Bhel Puri", "Pani Puri", "Samosa Chat", "Pani Puri",
           "Sev Puri", "Pani Puri", "Bhel Puri", "Pani Puri", "Pani Puri",
           "Samosa Chat", "Pani Puri", "Bhel Puri")

# Classic way
table(chaat)

# Modern clean way
library(dplyr)

data.frame(chaat = chaat) |>
  count(chaat) |>
  arrange(desc(n)) |>
  slice_head(n = 3)   # top 3 most popular

# Using DescTools
library(DescTools)
Mode(chaat)   # "Pani Puri"
```
Now try these experiments:
- Add ten more “Sev Puri” → see mode change
- Make every item appear exactly twice → see “no mode” situation
- Add votes to another item until it ties with the leader → see bimodal / multimodal result
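If you want to check your results, here is one possible way to run the three experiments in base R. `top_items()` is a hypothetical helper defined only for this sketch, and the tie in the third experiment is created by giving "Bhel Puri" four extra votes.

```r
chaat <- c("Pani Puri", "Bhel Puri", "Pani Puri", "Samosa Chat", "Pani Puri",
           "Sev Puri", "Pani Puri", "Bhel Puri", "Pani Puri", "Pani Puri",
           "Samosa Chat", "Pani Puri", "Bhel Puri")

# Hypothetical helper: labels of every item tied for the highest count
top_items <- function(x) {
  counts <- table(x)
  names(counts)[counts == max(counts)]
}

top_items(chaat)                             # "Pani Puri" (7 votes)

# Experiment 1: ten more "Sev Puri" (1 + 10 = 11 votes beats 7)
top_items(c(chaat, rep("Sev Puri", 10)))     # "Sev Puri"

# Experiment 2: every item exactly twice, so all four tie and none stands out
balanced <- rep(c("Pani Puri", "Bhel Puri", "Samosa Chat", "Sev Puri"), each = 2)
table(balanced)

# Experiment 3: four more "Bhel Puri" votes (3 + 4 = 7) create a tie with "Pani Puri"
top_items(c(chaat, rep("Bhel Puri", 4)))     # "Bhel Puri" "Pani Puri"
```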
You just saw how mode behaves with your own data!
Clearer now?
Next logical questions?
- Want to see bimodal data visualized (histogram / density plot)?
- Learn quartiles, percentiles, IQR next?
- Compare mean / median / mode side-by-side on real datasets (iris, mtcars)?
- Or jump to first measure of spread (range, variance, standard deviation)?
Just tell me — whiteboard is ready! 📊🧮🚀
