Chapter 41: File Compression
File Compression in Bash/Linux!
Imagine you have a big folder full of photos, code files, documents, or logs from your Hyderabad project. It takes up a lot of space on your laptop or server, and sending it over the internet (email, WhatsApp, cloud upload) is slow and expensive on data.
File compression is like packing your suitcase very smartly:
- Squeeze everything smaller so it takes less space
- You can still open it later and get everything back exactly as it was (as long as it's lossless compression, which is what we use for code, documents, and text)
Never use lossy compression (like MP3 for music or JPEG for photos) for files that must come back byte for byte, because it throws away data forever.
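To see what "exactly as it was" means in practice, here is a small round-trip sketch. It assumes gzip, gunzip, and cmp are installed; the scratch directory and file names are made up for this demo.

```shell
# Round-trip check: compress, decompress, then compare byte for byte.
set -eu

workdir=$(mktemp -d)            # hypothetical scratch directory for the demo
original="$workdir/original.txt"
restored="$workdir/restored.txt"

# Build a small text file with some repetition, so gzip has something to squeeze.
i=1
while [ "$i" -le 500 ]; do
    echo "Line $i: Hello from Hyderabad!"
    i=$((i + 1))
done > "$original"

gzip -c "$original" > "$original.gz"      # -c writes to stdout and keeps the original
gunzip -c "$original.gz" > "$restored"    # decompress into a separate file

# cmp -s is silent and succeeds only if both files are byte-identical.
if cmp -s "$original" "$restored"; then
    roundtrip="identical"
else
    roundtrip="different"
fi
echo "Round trip result: $roundtrip"
ls -l "$original" "$original.gz"          # compare the two sizes
```

If the result ever says "different", the compression was not lossless (or the file was damaged on the way).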
In Linux/Bash, compression usually comes in two steps (most people do both together):
- Archiving: bundle many files/folders into one single file (like putting everything in one big box). Tool: tar (short for "tape archive", a name from the tape-backup days).
- Compressing: squeeze that single file smaller. Tools: gzip, bzip2, xz, zstd, etc.
Most common formats you see every day:
- .tar.gz or .tgz → tar archive + gzip compression (most popular)
- .tar.bz2 or .tbz2 → tar + bzip2
- .tar.xz → tar + xz (best compression nowadays)
- .zip → all-in-one (archive + compress, Windows favorite but works on Linux)
- .tar.zst → tar + zstd (new fast modern one)
Let's learn step by step with real examples.
Part 1: Why Two Steps? (tar + compression)
tar alone = bundling (no size reduction, sometimes even bigger because of metadata)

    tar -cvf myfiles.tar folder1 file2.txt photo.jpg

→ Creates myfiles.tar (big, no compression)
Then compress it:

    gzip myfiles.tar    # becomes myfiles.tar.gz

But the smart way is to do both in one command (most common):

    tar -czvf myfiles.tar.gz folder1 file2.txt photo.jpg

- -c = create
- -z = use gzip
- -v = verbose (list each file as it is processed)
- -f = archive filename follows (so keep f as the last letter in the group)
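The two routes described above (tar then gzip, or tar -z in one go) can be compared with a short sketch like this one. The scratch directory and file names are invented for the demo; it only assumes tar, gzip, and diff are available.

```shell
# Two steps vs one step: both routes should end up with the same members.
set -eu

demo=$(mktemp -d)               # hypothetical scratch directory
cd "$demo"
mkdir folder1
i=1
while [ "$i" -le 200 ]; do
    echo "log line $i: Bash is fun"
    i=$((i + 1))
done > folder1/notes.txt
echo "Hello from Hyderabad!" > file2.txt

# Route 1: archive first, then compress (gzip -c keeps the .tar for comparison).
tar -cf two_step.tar folder1 file2.txt
gzip -c two_step.tar > two_step.tar.gz

# Route 2: archive and compress in one command.
tar -czf one_step.tar.gz folder1 file2.txt

# Sizes in bytes: the plain .tar is the largest, the two .tar.gz files are close.
wc -c two_step.tar two_step.tar.gz one_step.tar.gz

# List the members of each archive and compare the listings.
tar -tzf two_step.tar.gz > list_two_step.txt
tar -tzf one_step.tar.gz > list_one_step.txt
if diff list_two_step.txt list_one_step.txt > /dev/null; then
    echo "Both archives contain the same files"
fi
```

The one-command form saves a step and never leaves an uncompressed .tar lying around, which is why most people prefer it.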
Part 2: Main Compression Tools Comparison (2026 Reality)
| Tool/Command | File Extension | Speed (Compress) | Speed (Decompress) | Compression Ratio | Best For | Memory Use |
|---|---|---|---|---|---|---|
| gzip | .gz / .tar.gz | Fast ★★★★★ | Very fast ★★★★★ | Good (medium) | Everyday use, logs, quick backups | Low |
| bzip2 | .bz2 / .tar.bz2 | Medium ★★★ | Medium ★★★ | Better than gzip | Older standard, text files | Medium |
| xz | .xz / .tar.xz | Slow ★★ | Medium ★★★ | Excellent (best classic) | Software packages, long-term storage | High |
| zstd | .zst / .tar.zst | Very fast ★★★★★★ | Extremely fast ★★★★★★ | Very good (beats gzip, close to xz at high speed) | Modern default (Arch, Fedora, many in 2026) | Low-Medium |
| zip | .zip | Medium-Fast | Fast | Okay (worse than above) | Sharing with Windows users | Low |
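2026 trend: zstd keeps gaining ground because it's fast and compresses well. xz also had a famous security incident in 2024 (a backdoor planted in xz Utils releases 5.6.0 and 5.6.1, tracked as CVE-2024-3094). The xz format itself was not the problem, but the incident pushed some projects to look harder at zstd.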
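The table is easier to trust once you try it on your own data. Here is a rough sketch that compresses one sample file with whichever of the four tools are installed and prints the resulting sizes. The sample file and the compressed_size helper are invented for this demo, and every tool runs at its default level.

```shell
# Compare output sizes of whichever compressors are installed.
set -eu

sample=$(mktemp)                # hypothetical sample file for the comparison
# Generate about 20,000 log-like lines so the tools have text to work on.
awk 'BEGIN { for (i = 1; i <= 20000; i++) print "entry", i, "status=ok user=" (i % 97) }' > "$sample"

# Print the compressed size in bytes for one tool (all four accept -c for stdout).
compressed_size() {
    "$1" -c "$sample" 2>/dev/null | wc -c | tr -d ' '
}

echo "original: $(wc -c < "$sample" | tr -d ' ') bytes"
for tool in gzip bzip2 xz zstd; do
    if command -v "$tool" > /dev/null 2>&1; then
        echo "$tool: $(compressed_size "$tool") bytes"
    else
        echo "$tool: not installed, skipped"
    fi
done
```

This only measures size. To get a feel for speed, use a much larger file and put the time command in front of each compressor.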
Part 3: Real Examples (Let's Practice!)
Create a playground first:

    mkdir compression_test
    cd compression_test
    echo "Hello from Hyderabad!" > file1.txt
    echo "Bash is fun" > file2.txt
    mkdir photos
    touch photos/pic{1..3}.jpg    # empty files for demo

1. gzip (Fast & Simple)
Single file:

    gzip file1.txt
    ls -lh file1.txt.gz    # check the size (a tiny file like this one can even grow a little)
    gunzip file1.txt.gz    # back to original

Multiple files + tar (best way):

    tar -czvf backup.tar.gz .
    # or highest compression: pass the level to gzip itself (GNU tar)
    tar -I 'gzip -9' -cvf backup.tar.gz .

Extract:

    tar -xzvf backup.tar.gz

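Before extracting, it often helps to peek inside the archive and to unpack it somewhere other than the current directory. A minimal sketch, assuming a scratch directory with made-up src/ and restore/ folders:

```shell
# Inspect an archive, then extract it into a separate directory.
set -eu

work=$(mktemp -d)               # hypothetical scratch directory
mkdir -p "$work/src/photos" "$work/restore"
echo "Hello from Hyderabad!" > "$work/src/file1.txt"
echo "Bash is fun" > "$work/src/file2.txt"
touch "$work/src/photos/pic1.jpg"

# -C changes directory before adding files, so the archive itself stays outside src/.
tar -czf "$work/backup.tar.gz" -C "$work/src" .

# -t lists the members without writing anything to disk.
tar -tzf "$work/backup.tar.gz"

# Extract into restore/ instead of the current directory.
tar -xzf "$work/backup.tar.gz" -C "$work/restore"
ls -R "$work/restore"
```

Listing first is a cheap way to avoid unpacking hundreds of files into the wrong place.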
2. xz (Best compression: slow but tiny files)

    tar -cJvf super_small.tar.xz .    # -J = xz

Extract:

    tar -xJvf super_small.tar.xz

3. zstd (Modern favorite: fast & good)
First install if needed (Ubuntu 22.04+ has it):

    sudo apt install zstd

Compress:

    tar -I zstd -cvf fast_good.tar.zst .
    # or set the level explicitly (3 is zstd's default, a good balance):
    tar --use-compress-program="zstd -3" -cvf ...

Even better one-liner:

    tar -caf backup.tar.zst .    # -a = pick the compressor from the file extension (.zst → zstd; needs a tar with zstd support)

Extract:

    tar -xf backup.tar.zst    # GNU tar detects the compression by itself when extracting

4. zip (Cross-platform: Windows friendly)

    zip -r myfiles.zip .    # -r = recursive (folders)

Extract:

    unzip myfiles.zip

With a password (basic protection only; classic zip encryption is weak, so don't rely on it for sensitive data):

    zip -r -e secret.zip .    # asks for password

Part 4: Quick Reference Table (Commands You'll Use Daily)
| What you want | Command Example | Notes |
|---|---|---|
| Compress folder → .tar.gz | tar -czvf archive.tar.gz my_folder/ | Fast & compatible |
| Highest gzip | tar -I 'gzip -9' -cf archive.tar.gz my_folder/ | -9 = max compression (GNU tar) |
| Compress → .tar.xz | tar -cJvf archive.tar.xz my_folder/ | Best ratio, slow |
| Compress → .tar.zst (modern) | tar -caf archive.tar.zst my_folder/ | Fast + good (2026 default many places) |
| Extract any tar.* | tar -xvf archive.tar.gz (same for .tar.xz, .tar.zst) | GNU tar detects the compression itself when extracting |
| Only extract one file | tar -xzvf archive.tar.gz file_inside.txt | Give the path exactly as tar -tzf lists it |
| Zip folder | zip -r docs.zip Documents/ | Windows friendly |
| Check size before/after | du -sh my_folder/ vs ls -lh archive.tar.gz | See savings |
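The "check size before/after" row can be turned into a number. The sketch below is one way to do it, assuming a made-up my_folder with a generated log file; it counts file contents in bytes rather than using du, because du reports disk blocks and looks odd for small files.

```shell
# Report how much space an archive saves compared with the folder it came from.
set -eu

base=$(mktemp -d)               # hypothetical scratch directory
mkdir "$base/my_folder"
awk 'BEGIN { for (i = 1; i <= 5000; i++) print "INFO request", i, "served" }' > "$base/my_folder/app.log"
echo "Bash is fun" > "$base/my_folder/notes.txt"

tar -czf "$base/archive.tar.gz" -C "$base" my_folder

# Total bytes of file content in the folder.
original_bytes=$(find "$base/my_folder" -type f -exec cat {} + | wc -c | tr -d ' ')
archive_bytes=$(wc -c < "$base/archive.tar.gz" | tr -d ' ')

# Integer percentage saved; fine for a rough feel, not for precise reporting.
saved_percent=$(( 100 - $archive_bytes * 100 / $original_bytes ))
echo "original: $original_bytes bytes, archive: $archive_bytes bytes, saved: ${saved_percent}%"
```

Repetitive text like logs saves a lot; a folder of JPEGs or MP4s would show a number close to zero.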
Part 5: Pro Tips from Teacher
- Always use -v the first time, so you can see which files are processed
- Use --dry-run with rsync if syncing compressed archives
- For backups, add the date: tar -czvf backup_$(date +%Y-%m-%d).tar.gz ~/Documents/
- Don't compress already compressed files (jpg, mp4, zip): little gain, wasted time
- Server packages: .deb/.rpm often use xz or zstd now
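The dated-backup tip can be combined with a quick integrity check, so you know the archive is readable before you rely on it. A minimal sketch: source_dir stands in for ~/Documents/ and backup_dir is a made-up destination, both created as temporary directories here.

```shell
# Dated backup with a quick integrity check afterwards.
set -eu

source_dir=$(mktemp -d)         # stand-in for ~/Documents/ in this demo
backup_dir=$(mktemp -d)         # hypothetical place where backups are kept
echo "Hello from Hyderabad!" > "$source_dir/report.txt"

archive="$backup_dir/backup_$(date +%Y-%m-%d).tar.gz"
tar -czf "$archive" -C "$source_dir" .

# gzip -t tests the compressed stream; tar -t walks the archive structure.
if gzip -t "$archive" && tar -tzf "$archive" > /dev/null; then
    echo "Backup verified: $archive"
else
    echo "Backup looks damaged: $archive" >&2
    exit 1
fi
```

Keeping the archive outside the folder being backed up also avoids tar trying to pack the archive into itself.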
Got it, boss? File compression = save space + faster transfer + easy backups. Practice these 3-4 times today: create a folder, compress it with gzip/xz/zstd, extract it, and compare sizes with ls -lh.
Any part confusing? Want next: “Teacher, how to password protect zip/tar” or “compress & upload to server with rsync” or “zstd vs xz benchmark”?
Just say so, teacher is ready in Hyderabad! Keep compressing smartly!
