Chapter 10: R Variable Names
This chapter covers R variable names (also called object names or identifiers).
Naming variables well is one of the most important skills in R, arguably more important than knowing fancy functions. Bad names make your own code hard to read two weeks later; good names make it read like self-documenting English.
I’ll explain everything as if we were sitting together in RStudio, going line by line: rules first, then 2026 best practices, plenty of good and bad examples, and real mini-scripts you can copy and paste.
1. The Strict Syntax Rules (What R Actually Allows)
R has very clear legal rules for variable names. If you break them, R either refuses to parse the line or reads it as something you did not intend.
| Rule | Allowed (Good) | Not Allowed (Error) | Explanation |
|---|---|---|---|
| Must start with | letter (a–z, A–Z), or . not followed by a digit | 1st_place, 2026_sales, -temp, _total, .5x | Cannot start with a number, _ or most symbols; a leading . followed by a digit is read as a number |
| After first character | letters, numbers, _, . | student-name, price@tax, total space | No -, @, #, $, spaces, %, &, etc. |
| Case-sensitive | Age ≠ age ≠ AGE | — | student and Student are completely different |
| Length | practically unlimited (but be reasonable) | — | Very long names are legal but painful to type |
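| Reserved words | — | if, else, repeat, while, function, for, next, break, TRUE, FALSE, NULL, Inf, NaN, NA | Language keywords cannot be used as names. return, T and F are ordinary objects rather than reserved words, but you should still not overwrite them |
| Special cases | . alone, or a leading . as in .hyd_temp | — | A leading . is legal, but such objects are hidden from ls() by default, so it is rarely used for normal variables. ..., ..1, ..2 etc. are reserved for function arguments |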
Quick test you can run right now:
```r
# Legal
student_1 <- 85
.hyd_temp <- 29.5
AgeGroup2026 <- "Young"
total.revenue <- 45000

# Illegal: each line below fails, so they are commented out.
# Uncomment one at a time to see the error.
# 1st_place <- 1       # Error: unexpected symbol
# student-name <- 92   # Error: R reads this as student minus name
# if <- 10             # Error: unexpected assignment
# TRUE <- FALSE        # Error: invalid left-hand side to assignment
```
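You do not have to trigger an error to find out whether a name is legal. The sketch below leans on base R's make.names(), which rewrites any string into a syntactically valid name, so a string that comes back unchanged was already valid. The helper name is_valid_name is made up for this chapter and is not a built-in.

```r
# Hypothetical helper (not part of base R): TRUE when `x` is already a
# syntactically valid R name, i.e. make.names() leaves it unchanged.
is_valid_name <- function(x) {
  make.names(x) == x
}

candidates <- c("student_1", ".hyd_temp", "1st_place", "student-name", "if", "return")

validity <- is_valid_name(candidates)
names(validity) <- candidates

# Named logical vector: TRUE for the first two and for "return"
print(validity)
```

Note that "return" passes the check: it is an ordinary function name, not a reserved word, which is a good reason to avoid reusing it rather than proof that you should.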
2. Real-World Best Practices & Conventions (2026 Reality)
The legal rules are loose — but the community has very strong opinions about what makes code readable.
Today (2026) the dominant convention, especially in data science, the tidyverse world, most universities, journals, and new packages, is the one set out in the tidyverse style guide (written by Hadley Wickham, still the gold standard):
- Use only lowercase letters, numbers, and _ (underscore)
- Separate words with _ → called snake_case
- Variable names → nouns (what it contains)
- Function names → verbs (what it does)
```r
# Tidyverse style – overwhelmingly recommended in 2026
student_name <- "Webliance"
exam_marks <- c(92, 85, 78)
monthly_revenue <- 145000
temperature <- 32                  # defined here so the next line runs
is_hot_day <- temperature > 30
customer_id_list <- c("C001", "C145", "C289")
```
Why snake_case won in modern R:
- Very readable (words clearly separated)
- No confusion with S3 methods (which use . like print.myclass)
- Matches column names in tidy data (dplyr, tidyr, etc.)
- Easy to remember (everything is lowercase, so you never have to recall which letters were capitalized)
- Used in: tidyverse packages, r4ds book, most recent textbooks, Posit training
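The S3 point deserves a concrete look. In the sketch below, summary.sales is meant as an ordinary helper, but because of its dot.case name R also treats it as the summary() method for any object of class "sales". The class name and the numbers are invented for illustration.

```r
# Intended as a plain helper: "give me a summary of sales".
# Because of the name, R also sees it as the summary() method for class "sales".
summary.sales <- function(object, ...) {
  "summary.sales() was called"
}

# An unrelated object that happens to carry the (made-up) class "sales"
q1_sales <- structure(list(total = 45000), class = "sales")

# summary() dispatches to our helper instead of the default method
dispatch_result <- summary(q1_sales)
print(dispatch_result)

# The snake_case spelling cannot be picked up by S3 dispatch
summary_sales <- function(sales) {
  sales$total
}
print(summary_sales(q1_sales))
```

With the underscore version, calling summary(q1_sales) would fall back to R's default behaviour, and your helper only runs when you call it by name.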
3. Other Styles You Will Still See (and When)
| Style | Example | Where You See It | Pros | Cons / Why Avoid in 2026 |
|---|---|---|---|---|
| snake_case | total_revenue, student_marks | tidyverse, dplyr pipelines, most new code | Most readable, modern standard | None really |
| dot.case | total.revenue, student.marks | base R, old code, some legacy packages | Matches base R names such as data.frame and read.csv | Confusing with S3 methods (print.data.frame) |
| camelCase | totalRevenue, studentMarks | Bioconductor, some Shiny apps, older teaching | Familiar from Java/JS | Harder to read long names |
| UpperCamelCase (PascalCase) | TotalRevenue, StudentMarks | Class names, some function names (Google style) | Clear for classes | Not for variables |
Bottom line 2026: If you’re learning new R, doing data analysis, using tidyverse/ggplot2/dplyr → snake_case is the way. Consistency in one project matters more than which style — but snake_case is the safest bet right now.
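If you inherit data whose column names mix these styles, you can normalise them in one pass. The function below is a minimal base R sketch, not a complete solution: to_snake_case is a name chosen for this chapter, and it only handles the simple cases from the table above (it makes no attempt to split acronyms such as "GDPGrowth" sensibly).

```r
# Hypothetical helper: convert dot.case, camelCase, PascalCase or spaced names
# to snake_case using only base R regular expressions.
to_snake_case <- function(x) {
  x <- gsub("([a-z0-9])([A-Z])", "\\1_\\2", x)  # totalRevenue -> total_Revenue
  x <- gsub("[^A-Za-z0-9]+", "_", x)            # dots, spaces, dashes -> _
  x <- gsub("^_+|_+$", "", x)                   # trim stray underscores at the ends
  tolower(x)
}

mixed_names <- c("total.revenue", "totalRevenue", "TotalRevenue", "Student Marks")
snake_names <- to_snake_case(mixed_names)
print(snake_names)
```

On a data frame you would typically apply it as names(my_data) <- to_snake_case(names(my_data)), where my_data stands for whatever object you are cleaning.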
4. Good vs Bad Naming – Real Examples
Bad (hard to understand later):
```r
library(dplyr)                 # filter() below is dplyr::filter()

x <- read.csv("data.csv")      # what is x???
df2 <- filter(x, age > 18)     # df2? why 2?
m <- mean(df2$score)           # m = mean? magic?
```
Good (self-explaining):
```r
library(dplyr)

raw_survey_data <- read.csv("survey_responses_2026.csv")

adults_only <- raw_survey_data |>
  filter(age >= 18)

average_satisfaction <- mean(adults_only$satisfaction_score, na.rm = TRUE)

cat("Average satisfaction among adults:", round(average_satisfaction, 1), "\n")
```
Even better (with context):
```r
# Hyderabad customer feedback survey – Feb 2026
library(dplyr)

survey_raw <- readxl::read_excel("hyd_feedback.xlsx", sheet = "Responses")

survey_clean <- survey_raw |>
  filter(!is.na(rating), rating >= 1, rating <= 5) |>
  mutate(
    feedback_type = case_when(
      rating >= 4 ~ "positive",
      rating == 3 ~ "neutral",
      TRUE ~ "negative"
    )
  )

avg_rating_by_age_group <- survey_clean |>
  group_by(age_group) |>
  summarise(avg_rating = mean(rating, na.rm = TRUE))
```
5. Quick Tips from Experienced R Users
- Be descriptive but not absurdly long → customer_lifetime_value beats clv, and it also beats customer_lifetime_monetary_value_calculated_on_2026_data
- Avoid encoding data such as years in names where possible → instead of separate objects sales_2025 and sales_2026, keep one sales object with a year column and filter on it
- Use plural for collections → student_marks (vector), students (data frame)
- Logical variables → start with is_, has_, can_ → is_weekend, has_missing_values
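- Constants → ALL_CAPS → MAX_TEMPERATURE, MAX_RETRIES (a convention only: R will not stop you reassigning them, and the built-in pi is already there in lowercase)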
- Never use . for normal variables (save for S3 classes/methods)
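Several of these tips are easier to judge in code. The sketch below uses made-up sales figures: the year lives in a column instead of in object names, the logical value carries a has_ prefix, and the constant is written in ALL_CAPS. All object names here are illustrative.

```r
# Illustrative numbers only, not real data
sales <- data.frame(
  year = c(2025, 2025, 2026, 2026),
  revenue = c(120000, 135000, 145000, NA)
)

# Constant by convention (R will not stop you from reassigning it)
TARGET_YEAR <- 2026

# One `sales` object plus a filter, instead of sales_2025 and sales_2026
target_year_sales <- sales[sales$year == TARGET_YEAR, ]

# Logical variables read like yes/no questions
has_missing_values <- anyNA(target_year_sales$revenue)

total_revenue <- sum(target_year_sales$revenue, na.rm = TRUE)

print(has_missing_values)
print(total_revenue)
```

Switching the analysis to another year now means changing TARGET_YEAR in one place rather than renaming objects throughout the script.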
Your Mini Practice (Copy → Run & Rename!)
```r
# Bad names – try to understand quickly
a <- 29.5
b <- c("Biryani", "Haleem", "Irani Chai")
c <- data.frame(x = 1:3, y = c(28, 30, 29))

# Rename them properly (snake_case style)
current_temp_hyd <- 29.5
favorite_hyd_foods <- c("Biryani", "Haleem", "Irani Chai")
daily_temperatures <- data.frame(
  day_number = 1:3,
  temp_c = c(28, 30, 29)
)

print(daily_temperatures)
```
Now the code tells a story without comments!
Questions?
- Want to fix naming in one of your old scripts together?
- Go deeper into why . is dangerous in names?
- Next topic (vectors, subsetting, data frames…)?
Just tell me — whiteboard is ready! ☕🚀
