Chapter 50: R Examples
“R Examples”, which I understand as:
“Show me lots of real, practical, copy-paste-ready examples of how people actually use R in daily work — not just isolated functions, but small meaningful tasks.”
That’s a great question, because R is not learned by memorizing 300 functions but by seeing the repeated patterns that solve most real problems.
I’m going to give you a carefully chosen set of 12 mini real-world examples — the ones that appear again and again in data analyst / data scientist / researcher / student workflows in 2026.
Each example:
- is short enough to copy-paste
- uses only common packages (mostly base + tidyverse, plus a few small helpers such as janitor and skimr)
- has comments explaining the “why”
- represents a task you will do very often
Let’s go — like we’re sitting together and I’m showing you my screen.
1. Load a CSV file and get instant overview
```r
# Most common first step: reading your own data
library(tidyverse)  # includes readr, dplyr, ggplot2, etc.

df <- read_csv("monthly_sales_hyd_2026.csv")
# For .xlsx files, use readxl::read_excel() instead

# Instant, detailed overview (needs the skimr package)
skimr::skim(df)

# Alternative quick look
glimpse(df)
summary(df)
```
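If you don't have a file like `monthly_sales_hyd_2026.csv` on disk, the rest of the examples won't run as-is. Here is a minimal sketch that builds a stand-in `df` in base R. The column names are taken from the later examples; every value, city name, and category label is randomly generated and purely hypothetical. Example 6 also refers to `order_amount`, which the examples never define, so it is left out here (use `revenue` there instead).

```r
# Hypothetical stand-in for "monthly_sales_hyd_2026.csv".
# All values are randomly generated; nothing here is real sales data.
set.seed(42)  # reproducible random numbers
n <- 200

df <- data.frame(
  order_date       = as.Date("2026-01-01") + sample(0:180, n, replace = TRUE),
  customer_id      = sample(1001:1040, n, replace = TRUE),
  city             = sample(c("Hyderabad", "Bengaluru", "Chennai"), n, replace = TRUE),
  product_category = sample(c("Electronics", "Grocery", "Apparel"), n, replace = TRUE),
  revenue          = round(runif(n, min = 2000, max = 120000)),
  delivery_days    = sample(1:7, n, replace = TRUE)
)

# One name per customer_id, and a cost between 60% and 90% of revenue
df$customer_name <- paste("Customer", df$customer_id)
df$cost          <- round(df$revenue * runif(n, min = 0.6, max = 0.9))

str(df)  # quick structural check
```

Because the column names are already clean, `clean_names()` in example 2 will leave them unchanged, which is fine.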
2. Clean column names & fix types
```r
# Very frequent: messy Excel/CSV column names
library(janitor)

df <- df |>
  clean_names() |>   # snake_case, no spaces/special chars
  mutate(
    order_date  = as.Date(order_date),        # fix date columns
    customer_id = as.character(customer_id),
    revenue     = as.numeric(revenue)
  )
```
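One catch with `as.numeric()`: if the revenue column arrived as text with currency symbols or thousands separators, it returns `NA` for those values. A small sketch of one way around that, using `readr::parse_number()`. The sample strings below are made up for illustration.

```r
library(readr)

# Hypothetical raw values, as they might come out of a messy export
revenue_raw <- c("₹50,000", "INR 1,200.50")

# as.numeric() cannot read these strings: it returns NA (with a warning)
revenue_bad <- suppressWarnings(as.numeric(revenue_raw))

# parse_number() drops the surrounding text and ignores the "," grouping mark
revenue_clean <- parse_number(revenue_raw)

revenue_bad    # both NA
revenue_clean  # 50000 and 1200.5
```

Inside a pipeline this would be `mutate(revenue = parse_number(revenue))`, but only when the column is character; on a column that is already numeric, leave it alone.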
3. Filter rows + select columns + arrange
```r
# Classic “give me only Hyderabad sales above ₹50,000, biggest first”
top_sales <- df |>
  filter(city == "Hyderabad", revenue > 50000) |>
  select(order_date, customer_name, revenue, product_category) |>
  arrange(desc(revenue))   # highest revenue on top; use arrange(order_date) to sort by date

print(top_sales)
```
4. Group by + summarise (the heart of aggregation)
```r
# Monthly revenue by category
monthly_summary <- df |>
  mutate(month = lubridate::floor_date(order_date, "month")) |>  # first day of each month
  group_by(month, product_category) |>
  summarise(
    total_revenue = sum(revenue, na.rm = TRUE),
    n_orders      = n(),
    avg_order     = mean(revenue, na.rm = TRUE),
    .groups       = "drop"   # return an ungrouped tibble
  )

print(monthly_summary)
```
5. Add calculated columns (mutate)
```r
df <- df |>
  mutate(
    profit         = revenue - cost,
    profit_margin  = profit / revenue * 100,   # in percent
    is_high_value  = revenue >= 10000,         # TRUE/FALSE flag
    delivery_delay = delivery_days > 3         # TRUE if delivery took more than 3 days
  )
```
6. Quick scatter plot with trend line
```r
# Assumes df has an order_amount column; swap in revenue if yours does not
ggplot(df, aes(x = order_amount, y = profit, color = product_category)) +
  geom_point(alpha = 0.6, size = 2.5) +
  geom_smooth(method = "lm", se = FALSE) +   # one straight trend line per category
  theme_minimal() +
  labs(title = "Profit vs Order Amount by Category")
```
7. Bar chart — top 10 customers by revenue
```r
df |>
  group_by(customer_name) |>
  summarise(total_spent = sum(revenue, na.rm = TRUE)) |>
  slice_max(total_spent, n = 10) |>
  ggplot(aes(x = reorder(customer_name, total_spent), y = total_spent)) +
  geom_col(fill = "#00A087") +
  coord_flip() +   # horizontal bars keep long names readable
  theme_minimal() +
  labs(title = "Top 10 Customers by Total Spend",
       x = NULL, y = "Revenue (₹)")
```
8. Find outliers (simple percentile method)
```r
# Flag orders in top 1% or bottom 1%
df <- df |>
  mutate(
    is_outlier = revenue > quantile(revenue, 0.99, na.rm = TRUE) |
                 revenue < quantile(revenue, 0.01, na.rm = TRUE)
  )

# How many outliers?
sum(df$is_outlier, na.rm = TRUE)
```
9. Quick correlation matrix + visualization
```r
df |>
  select(where(is.numeric)) |>
  cor(use = "pairwise.complete.obs") |>
  corrplot::corrplot(method = "color", type = "lower", tl.cex = 0.8)
```
10. Simple t-test (compare two groups)
```r
# Do high-value customers have higher profit margin?
t.test(profit_margin ~ is_high_value, data = df)
```
Modern tidy version:
```r
library(rstatix)

df |>
  t_test(profit_margin ~ is_high_value)
```
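Printing the test is fine on screen, but in a script you usually want the numbers themselves. A minimal sketch of pulling them out of the object that base `t.test()` returns. It uses the built-in `mtcars` data as a stand-in, with `am` playing the role of `is_high_value`; that substitution is mine, not part of the sales example.

```r
# mtcars ships with R; am (0 = automatic, 1 = manual) is the two-group variable
res <- t.test(mpg ~ am, data = mtcars)

res$estimate   # the two group means
res$conf.int   # 95% confidence interval for the difference in means
res$p.value    # p-value of the (Welch) two-sample test

# A common reporting pattern
if (res$p.value < 0.05) {
  message("Group means differ at the 5% level")
}
```

The same `$estimate`, `$conf.int`, and `$p.value` fields are available when you run the test on `profit_margin ~ is_high_value`.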
11. Linear regression + nice table
```r
model <- lm(profit ~ revenue + delivery_days + factor(product_category), data = df)

library(modelsummary)
modelsummary(model, stars = TRUE)
```
12. Save plot + table for report
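```r
# Save plot (ggsave() writes the most recently displayed ggplot by default)
ggsave("top_customers_bar.png", width = 8, height = 5, dpi = 300)

# Save table as CSV
write_csv(monthly_summary, "monthly_summary_2026.csv")

# Or as Excel (needs the writexl package)
# writexl::write_xlsx(monthly_summary, "monthly_summary_2026.xlsx")
```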
Final Teacher Summary – The Patterns You Will Repeat Forever
Almost every real R script follows this skeleton:
- Load packages + read data
- Clean names & types (clean_names, mutate)
- Filter / select / arrange
- Group + summarise (the heart)
- Create new columns (mutate)
- Visualize (ggplot)
- Statistical test or model (t.test, lm, glm)
- Tidy output (broom, modelsummary)
- Save results (ggsave, write_csv)
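To see the whole skeleton in one place, here is a compact sketch that walks through every step on the built-in `mtcars` data. The dataset, the `hp > 60` cutoff, the `power_to_weight` column, and the output file names are my own stand-ins for illustration, not part of the sales example. It assumes dplyr, ggplot2, broom, and readr are installed.

```r
library(dplyr)
library(ggplot2)

# 1. Load data (mtcars is built in, standing in for read_csv())
cars <- tibble::rownames_to_column(mtcars, "model")

# 2-3. Fix types, then filter / select / arrange
cars <- cars |>
  mutate(cyl = factor(cyl), am = as.integer(am)) |>
  filter(hp > 60) |>
  select(model, mpg, cyl, hp, wt, am) |>
  arrange(desc(mpg))

# 4. Group + summarise
by_cyl <- cars |>
  group_by(cyl) |>
  summarise(avg_mpg = mean(mpg), n_cars = n(), .groups = "drop")

# 5. Create a new column
cars <- cars |> mutate(power_to_weight = hp / wt)

# 6. Visualize (stored in p so it can be saved explicitly later)
p <- ggplot(cars, aes(x = wt, y = mpg, color = cyl)) +
  geom_point() +
  theme_minimal()

# 7. Model: logistic regression with glm() (manual vs automatic gearbox)
fit <- glm(am ~ wt, data = cars, family = binomial)

# 8. Tidy output: one row per coefficient
coef_table <- broom::tidy(fit)

# 9. Save results (a temp folder keeps this sketch from cluttering your project)
out_dir <- tempdir()
ggsave(file.path(out_dir, "mpg_vs_wt.png"), plot = p, width = 8, height = 5, dpi = 300)
readr::write_csv(by_cyl, file.path(out_dir, "mpg_by_cyl.csv"))
```

Passing `plot = p` to `ggsave()` is a small design choice: it saves exactly the plot you name rather than whichever one was displayed last.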
You now have 12 concrete, reusable mini-blocks — each one solves a task you will do hundreds of times.
Which of these examples felt most useful / closest to what you actually want to do?
Want to:
- Take any one of them and expand it into a full 30–50 line script?
- Do a complete small project together (e.g. analyze sales data from CSV)?
- Learn one specific pattern deeper (e.g. more mutate tricks, more ggplot customizations)?
- Or move to the next big topic (loops, functions, R Markdown / Quarto reports)?
Just tell me — I’m right here with the whiteboard ready! 🚀📊
