Caching is one of the most important performance techniques in modern system design, yet it’s often misunderstood or explained only partially. This guide walks you through caching from the ground up — starting with what caching is, why systems rely on it, and how different caching strategies shape data flow between your application and the database. We then explore cache invalidation, the rules that keep cached data fresh, followed by eviction policies like LRU and LFU that decide what stays in memory.
With clear explanations and practical JavaScript examples, this is the perfect starting point for mastering caching from both a system-design and developer perspective.
To make these ideas easier to follow, here’s the structured flow we’ll use throughout the guide — starting from the fundamentals and gradually moving toward hands-on implementations:
- What is caching?
- Why do systems use caching?
- Types of caches
- Caching strategies
  - Cache-aside
  - Read-through
  - Write-through
  - Write-back
  - Write-around
- Cache invalidation
  - TTL / time-based
  - Event-based
  - Write-based invalidation
  - Versioned invalidation
  - Stampede prevention
- Cache eviction policies
  - LRU / LFU / FIFO / TTL
- Implement caching in JavaScript
  - Build a cache-aside service
- Redis introduction & integration
What is Caching?
Caching is the technique of storing frequently accessed data in a fast storage layer so that future requests are served faster.

Think of it like keeping important files on your desk instead of walking to the storeroom every time you need them.
Why Do Systems Use Caches?
1. Reduce Latency
Fetching data from a cache is much faster than:
- Disk
- Database
- Network calls
- Expensive computations
Example speeds:
- CPU register: ~1 ns
- RAM: ~100 ns
- Redis over the network: ~0.5–1 ms
- DB query: ~5–50 ms
- Disk I/O: ~10–100 ms
Caches move you from tens of milliseconds down to microseconds (in-process) or about a millisecond (Redis).
2. Reduce Load on Backend Systems
Without caching:
- Every request hits the DB
- DB gets overloaded
- Latency increases → system fails
With caching:
- Most reads are served from memory
- Only a few requests reach the database → DB survives traffic spikes
This is why large-scale systems cannot exist without caching.
3. Handle High Read Traffic Efficiently
Most real systems are read-heavy:
- Product listings
- User profiles
- Comments
- Feed items
- Notifications count
Caching makes these reads extremely fast.
4. Save Cost
DB queries are expensive → both in performance and in dollars.
Serving from cache is cheap and fast → RAM or an in-memory store like Redis.
5. Improve User Experience
Faster responses → better UX
E.g., Instagram feed loads instantly because most of the data is cached.
🔥 What should be cached?
Not everything. Only data that is:
- Frequently read
- Rarely changing → or you can tolerate some staleness
- Slow/expensive to compute
- Needed at low latency
Examples:
- Product details
- User profile
- Weather data
- Stock prices
- Computation-heavy analytics
- Database query results
🧩 When NOT to use cache?
Caching is not useful when:
- Data changes very frequently
- Data must always be strictly up-to-date
- The workload is write-heavy
- Memory is tightly limited
- Inconsistency cannot be tolerated
Example: Bank account balance — must always reflect real value → caching risky.
🔥 Problem caching solves in system design
Caching fixes two key system bottlenecks:
1. Latency bottleneck → response is too slow
DB takes 40ms → Cache gives 1ms → System feels 40x faster.
2. Throughput bottleneck → DB can’t handle traffic
A DB that handles 50k qps normally might crash at 200k qps.
A cache can handle millions of reads per second.
💡 High-level flow of how caching works (simple mental model)
Client → API → Cache → (Hit? Return)
→ (Miss? Fetch DB → Store in cache → Return)
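The flow above can be sketched in a few lines of JavaScript. The `db` object here is a hypothetical stand-in for a real database call, just to make the hit/miss paths concrete:

```javascript
// Hypothetical stand-in for a slow database lookup.
const db = new Map([["user:1", { id: 1, name: "Ada" }]]);

const cache = new Map();

function getUser(key) {
  // 1. Check the cache first (hit → return immediately).
  if (cache.has(key)) {
    return { value: cache.get(key), source: "cache" };
  }
  // 2. Miss → fetch from the database.
  const value = db.get(key) ?? null;
  // 3. Store in the cache for next time, then return.
  if (value !== null) cache.set(key, value);
  return { value, source: "db" };
}

console.log(getUser("user:1").source); // "db" (first read is a miss)
console.log(getUser("user:1").source); // "cache" (second read is a hit)
```

This is exactly the cache-aside pattern we'll cover in the strategies section: the application, not the cache, decides when to read from and write to the database.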
Types of Caches
Types of caches describe where the cache lives in the system and what kind of problems it solves. Understanding this helps you decide what to cache and where to cache it.
1. Client-side cache
This cache lives on the user’s device.
Examples:
- Browser HTTP cache
- LocalStorage / SessionStorage
- Service Worker cache
- Cookies
What it’s used for:
- Static assets (JS, CSS, images)
- API responses that don’t change often
- Reducing repeated network calls
Why it matters:
- Eliminates network latency entirely
- Reduces server load
Trade-offs:
- Limited storage
- Harder to invalidate
- Data can be stale or user-specific
2. CDN cache
A CDN (Content Delivery Network) caches content at edge locations close to users.
Examples:
- Cloudflare
- AWS CloudFront
- Akamai
What it’s used for:
- Images, videos, static pages
- Public API responses
Why it matters:
- Very low latency globally
- Absorbs massive traffic spikes
Trade-offs:
- Mostly read-only
- Invalidation can be slow or costly
- Not suitable for highly dynamic data
3. Application-level (In-memory) cache
This cache lives inside your application process.
Examples:
- JavaScript Map
- LRU cache in Node.js
- In-memory objects
What it’s used for:
- Hot data
- Computation results
- Short-lived caching
Why it matters:
- Extremely fast
- Simple to implement
- No network overhead
Trade-offs:
- Lost on app restart
- Not shared across instances
- Memory-limited
This is where we’ll start coding.
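As a first taste, a minimal in-process cache with a time-to-live can be built on a plain Map. The `MemoryCache` class below is illustrative, not from any library:

```javascript
// Minimal in-memory cache with per-entry TTL, built on a Map.
class MemoryCache {
  constructor() {
    this.store = new Map(); // key → { value, expiresAt }
  }
  set(key, value, ttlMs) {
    this.store.set(key, { value, expiresAt: Date.now() + ttlMs });
  }
  get(key) {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.store.delete(key); // lazy expiry: evict stale entries on read
      return undefined;
    }
    return entry.value;
  }
}

const cache = new MemoryCache();
cache.set("greeting", "hello", 1000); // cached for 1 second
console.log(cache.get("greeting")); // "hello"
```

Note the trade-offs listed above in action: this cache vanishes on restart, is private to one process, and grows without bound unless you add an eviction policy (which we'll build later with LRU).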
4. Distributed cache
This cache lives outside the application, shared by multiple services.
Examples:
- Redis
- Memcached
What it’s used for:
- Shared user sessions
- Database query results
- Frequently accessed objects
Why it matters:
- Shared across instances
- Survives app restarts
- Scales horizontally
Trade-offs:
- Network latency (still fast)
- Operational complexity
- Consistency challenges
This is the most common production cache.
5. Database-level cache
This cache is handled inside the database layer.
Examples:
- Query cache
- Buffer pool (Postgres, MySQL)
- Materialized views
What it’s used for:
- Repeated queries
- Index-heavy lookups
Why it matters:
- Transparent to applications
- Improves DB performance
Trade-offs:
- Less control
- Still slower than in-memory or Redis
- Cannot replace application caching
🧠 How to think about cache layers (important mental model)
Most real systems use multiple caches together:
Browser cache
↓
CDN
↓
Application in-memory cache
↓
Distributed cache (Redis)
↓
Database
Each layer reduces load on the one below it.
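One way to picture that layering in code: each lookup falls through to the next layer on a miss and back-fills the faster layers on the way up. The three Maps below are stand-ins for a real in-process cache, a real Redis, and a real database:

```javascript
// Layered lookup: L1 (in-process) → L2 (shared, e.g. Redis) → database.
const l1 = new Map();                          // app in-memory cache
const l2 = new Map();                          // distributed cache stand-in
const database = new Map([["p:1", "Laptop"]]); // source of truth

function layeredGet(key) {
  if (l1.has(key)) return { value: l1.get(key), layer: "L1" };
  if (l2.has(key)) {
    const value = l2.get(key);
    l1.set(key, value); // back-fill the faster layer
    return { value, layer: "L2" };
  }
  const value = database.get(key) ?? null;
  if (value !== null) {
    l2.set(key, value); // back-fill both cache layers
    l1.set(key, value);
  }
  return { value, layer: "db" };
}

console.log(layeredGet("p:1").layer); // "db" on the first read
console.log(layeredGet("p:1").layer); // "L1" afterwards
```

If an app instance restarts, its L1 is empty, but the next read is still served by L2 rather than the database, which is exactly the "each layer protects the one below" effect.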
💡 Key takeaway
- Cache location matters as much as cache logic
- In-memory cache = fastest, simplest
- Distributed cache = scalable, production-ready
- CDN cache = global performance
- DB cache = safety net, not a strategy
Summary & Next Steps
Caching shifts repeated work to faster layers, cutting latency and easing database load. Use it where reads are frequent, changes are infrequent, and low latency matters. Think in layers—browser → CDN → app memory → Redis → database—and keep data fresh with sound invalidation and eviction.
Next, we’ll dive into practical caching strategies, invalidation approaches, eviction policies, JavaScript implementations, and Redis integration to make caching work end-to-end.