"There are only two hard things in computer science: cache invalidation and naming things." This famous joke by Phil Karlton sums up the paradox of caching: it is the most effective technique for speeding up a system and, at the same time, one of the subtlest sources of bugs. A cache stores copies of expensive-to-obtain data in a fast-access location, so as not to recompute them or request them from the database every time. Used well, it reduces latency from milliseconds to microseconds and drastically offloads the main storage. Used poorly, it serves stale data, hides errors, and causes cascading failures. In this lesson we will study cache levels, read and write patterns, invalidation and TTL strategies, classic problems such as the cache stampede, and we will look at concrete examples with Redis.

Contents

  1. What a cache is and why it works
  2. Cache levels
  3. Read patterns: cache-aside and read-through
  4. Write patterns: write-through and write-behind
  5. Invalidation, TTL, and eviction policies
  6. Cache problems and how to mitigate them
  7. Practical example with Redis

  1. What a cache is and why it works

A cache is an intermediate store, fast and of limited capacity, that keeps copies of data to serve them without repeating the work of obtaining them from the source. It works thanks to two principles:

  • Temporal locality: a piece of data queried now is likely to be queried again soon.
  • Pareto principle: a small percentage of the data concentrates the majority of the accesses (the best-selling products, the most active users).

The two fundamental metrics are the hit ratio (percentage of requests served from the cache) and the latency. A hit avoids going to the source; a miss involves the full cost plus that of storing in the cache.
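To make the two metrics concrete, here is a minimal sketch that counts hits and misses and estimates the average latency per request as a weighted mean. The class name `CacheStats` and the latency figures used below are illustrative assumptions, not measurements.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical helper: counts cache outcomes and derives the two basic metrics.
final class CacheStats {
    private final LongAdder hits = new LongAdder();
    private final LongAdder misses = new LongAdder();

    void recordHit()  { hits.increment(); }
    void recordMiss() { misses.increment(); }

    // Fraction of requests served from the cache (0.0 when nothing was recorded).
    double hitRatio() {
        long h = hits.sum();
        long total = h + misses.sum();
        return total == 0 ? 0.0 : (double) h / total;
    }

    // Expected latency per request: hits pay hitCost, misses pay missCost
    // (source access plus the cost of storing the result in the cache).
    double averageLatency(double hitCost, double missCost) {
        double ratio = hitRatio();
        return ratio * hitCost + (1.0 - ratio) * missCost;
    }
}
```

With 90 hits and 10 misses, `hitRatio()` is 0.9 and `averageLatency(1.0, 50.0)` comes out at roughly 5.9 ms. The 10% of misses account for about 5 ms of that figure, which is why even small gains in hit ratio matter.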

  2. Cache levels

The cache appears at many layers of an architecture. From closest to the user to closest to the data:

| Level | Where it lives | Example | Scope |
|---|---|---|---|
| Browser / client | User's device | Cache-Control headers | One user |
| CDN | Edge network | Cloudflare, CloudFront | Global, static content |
| Gateway / reverse proxy | Front end of the system | Nginx, Varnish | All requests |
| Application cache (local) | Process memory | Caffeine, Guava | One instance |
| Distributed cache | Shared external service | Redis, Memcached | All instances |
| Database cache | DB engine | Buffer pool | Internal |

The key distinction for the architect is between the local cache (in-process: extremely fast, but not shared and lost when the process restarts) and the distributed cache (Redis: slightly slower because of the network hop, but shared across all instances and unaffected by an application restart). In systems with several instances, a distributed cache gives all of them a single copy to read and invalidate, instead of one potentially divergent copy per instance.
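The difference can be illustrated with a small single-threaded sketch. Everything here is an assumption made for illustration: `AppInstance` is a hypothetical class, and a plain `Map` stands in for the shared Redis cache.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical application instance with an in-process cache in front of a shared store.
final class AppInstance {
    private final Map<String, String> localCache = new HashMap<>();  // private to this instance
    private final Map<String, String> sharedCache;                   // stands in for Redis

    AppInstance(Map<String, String> sharedCache) {
        this.sharedCache = sharedCache;
    }

    // Reads the local copy first and falls back to the shared cache on a local miss.
    String readLocalFirst(String key) {
        return localCache.computeIfAbsent(key, sharedCache::get);
    }

    // Reads only from the shared cache, so every instance sees the same value.
    String readShared(String key) {
        return sharedCache.get(key);
    }

    // Updates the shared cache and this instance's local copy only.
    void write(String key, String value) {
        sharedCache.put(key, value);
        localCache.put(key, value);
    }
}
```

If instance A writes `"10"`, instance B reads it with `readLocalFirst`, and A then writes `"12"`, B's `readLocalFirst` still returns `"10"` while `readShared` returns `"12"` on both instances. Nothing tells B's local copy that the value changed.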

  3. Read patterns: cache-aside and read-through

3.1 Cache-Aside (Lazy Loading)

This is the most common pattern. The application manages the cache explicitly: it looks in the cache first and, if it isn't there, goes to the database and stores the result.

public Product getProduct(long id) {
    String key = "product:" + id;

    Product cached = cache.get(key);     // 1. is it in the cache?
    if (cached != null) {
        return cached;                    // HIT: we return without touching the DB
    }

    Product product = repository.findById(id);  // 2. MISS: we go to the DB
    if (product != null) {
        cache.set(key, product, Duration.ofMinutes(10)); // 3. we store with a TTL
    }
    return product;
}

Step by step:

  • Step 1: the cache is queried. If there is a hit, it is returned immediately.
  • Step 2: in case of a miss, we go to the data source.
  • Step 3: it is stored in the cache with a TTL (time to live) of 10 minutes for future requests.

Advantage: only what is actually requested is cached (lazy). Drawback: the cache logic gets mixed with the business logic and the first access is always slow (a mandatory cache miss).

3.2 Read-Through

The cache is responsible for loading the data from the source when it is missing; the application only talks to the cache. The difference from cache-aside is one of responsibility: here the loading code lives inside the cache layer (configured with a "cache loader"), not in the service.

// The cache knows how to load what it doesn't have; the service just asks
LoadingCache<Long, Product> cache = Caffeine.newBuilder()
    .expireAfterWrite(Duration.ofMinutes(10))
    .build(id -> repository.findById(id));   // loader: invoked only on a miss

Product p = cache.get(42L);   // if it's not there, Caffeine calls the loader for us

| Aspect | Cache-Aside | Read-Through |
|---|---|---|
| Who loads from the source | The application | The cache (loader) |
| Coupling | Cache logic in the service | Encapsulated in the cache |
| Control | Maximum | Less, but cleaner |

  4. Write patterns: write-through and write-behind

When data changes, you have to decide how the cache is updated.

4.1 Write-Through

Each write goes to the cache and to the database synchronously, in the same operation.

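public void updatePrice(long id, BigDecimal newPrice) {
    Product p = repository.findById(id);
    if (p == null) {                             // avoid a NullPointerException on unknown ids
        throw new IllegalArgumentException("Product not found: " + id);
    }
    p.setPrice(newPrice);
    repository.save(p);                          // 1. writes to the DB
    cache.set("product:" + id, p, Duration.ofMinutes(10)); // 2. and to the cache
}

  • Advantage: as long as both writes succeed, the cache holds the same value as the database, so the next read is consistent. If the cache write fails after the database commit, the entry stays stale until it expires or is invalidated.
  • Drawback: each write pays the cost of updating both stores, increasing write latency.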

4.2 Write-Behind (Write-Back)

The write goes first to the cache and is persisted to the database asynchronously, deferred (in batches, after a delay).

  • Advantage: very fast writes; it allows grouping and absorbing spikes.
  • Serious drawback: if the cache goes down before flushing to the database, data is lost. Only suitable when that loss is tolerated or the durability of the cache is ensured.

| Pattern | Write latency | Loss risk | Cache-DB consistency |
|---|---|---|---|
| Write-through | High (two synchronous writes) | Low | Strong |
| Write-behind | Low (asynchronous) | High if the cache goes down | Eventual |
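Since write-behind has no code above, here is a minimal in-memory sketch of the idea. The class `WriteBehindCache` and its `batchWriter` callback are hypothetical names; a real system would call `flush()` from a scheduler and would need retry and ordering guarantees that this sketch does not attempt.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical write-behind cache: writes are acknowledged after updating memory only.
final class WriteBehindCache<K, V> {
    private final Map<K, V> cache = new LinkedHashMap<>();
    private final Map<K, V> dirty = new LinkedHashMap<>();   // pending writes, latest value per key
    private final Consumer<Map<K, V>> batchWriter;           // persists one batch to the database

    WriteBehindCache(Consumer<Map<K, V>> batchWriter) {
        this.batchWriter = batchWriter;
    }

    synchronized void put(K key, V value) {
        cache.put(key, value);
        dirty.put(key, value);      // repeated writes to the same key collapse into one
    }

    synchronized V get(K key) {
        return cache.get(key);
    }

    synchronized int pendingWrites() {
        return dirty.size();
    }

    // Meant to be called periodically. Anything still in 'dirty' when the
    // process dies has never reached the database.
    synchronized void flush() {
        if (dirty.isEmpty()) {
            return;
        }
        batchWriter.accept(new LinkedHashMap<>(dirty));  // if this throws, entries stay pending
        dirty.clear();
    }
}
```

Three calls such as `put("a", 1)`, `put("a", 2)`, `put("b", 3)` leave two pending writes and an untouched database; only `flush()` persists them, in a single batch. That window between `put` and `flush` is exactly where the loss risk in the table comes from.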

  5. Invalidation, TTL, and eviction policies

The central challenge: when do cached data stop being valid? There are two complementary approaches.

5.1 TTL expiration

Each entry is assigned a Time To Live: after that time, the cache considers it expired and reloads it on the next access. It is simple and self-cleaning.

  • Short TTL: fresher data, lower hit ratio.
  • Long TTL: better hit ratio, greater risk of serving stale data.

The choice depends on how much staleness the business tolerates. A catalog can tolerate minutes; a balance, seconds or none.
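The expiration mechanism itself is small. The following sketch, with the hypothetical class name `TtlCache`, stores an expiration timestamp next to each value and checks it lazily on read; the clock is injected as an assumption of this example so that expiration can be exercised without sleeping.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical in-memory cache with per-entry expiration checked lazily on read.
final class TtlCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;

        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final LongSupplier clock;   // e.g. System::currentTimeMillis in real use

    TtlCache(LongSupplier clock) {
        this.clock = clock;
    }

    synchronized void put(K key, V value, long ttlMillis) {
        entries.put(key, new Entry<>(value, clock.getAsLong() + ttlMillis));
    }

    // Returns the value, or null if the key is absent or its TTL has elapsed.
    synchronized V get(K key) {
        Entry<V> entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        if (clock.getAsLong() >= entry.expiresAtMillis) {
            entries.remove(key);    // expired: cleaned up on access
            return null;
        }
        return entry.value;
    }
}
```

A `null` from `get` is what sends the caller back to the source, so the TTL is in effect an upper bound on how long a stale value can be served.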

5.2 Explicit invalidation

When a piece of data changes, we delete or update its cache entry immediately:

public void updateProduct(Product p) {
    repository.save(p);
    cache.delete("product:" + p.getId());  // invalidates; the next access will reload
}

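Deleting (instead of updating) is often safer: if two writers update the same entry concurrently, their cache writes can land in a different order than their database writes and leave an old value cached, whereas after a delete the next read simply reloads the current value from the database.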
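One way an in-place update can go wrong is when two writers interleave. The sketch below replays such an interleaving as sequential steps so that it is deterministic; `InvalidationRace` is a hypothetical class and plain maps stand in for the database and the cache.

```java
import java.util.HashMap;
import java.util.Map;

// Deterministic replay of two concurrent price updates, written as sequential
// steps so the interleaving is explicit.
final class InvalidationRace {
    private static final String KEY = "product:1";

    // Both writers save to the DB in order, but their cache updates land in reverse order.
    static String updateInPlace() {
        Map<String, String> db = new HashMap<>();
        Map<String, String> cache = new HashMap<>();
        db.put(KEY, "10");       // writer A saves price 10
        db.put(KEY, "12");       // writer B saves price 12 (final DB state)
        cache.put(KEY, "12");    // writer B updates the cache first
        cache.put(KEY, "10");    // writer A's delayed update overwrites it
        return read(db, cache);
    }

    // Same interleaving, but each writer deletes the entry instead of setting it.
    static String deleteOnWrite() {
        Map<String, String> db = new HashMap<>();
        Map<String, String> cache = new HashMap<>();
        db.put(KEY, "10");
        db.put(KEY, "12");
        cache.remove(KEY);       // writer B invalidates
        cache.remove(KEY);       // writer A invalidates (order no longer matters)
        return read(db, cache);
    }

    // Cache-aside read: use the cached value, otherwise load from the DB and store it.
    private static String read(Map<String, String> db, Map<String, String> cache) {
        return cache.computeIfAbsent(KEY, db::get);
    }
}
```

`updateInPlace()` returns the stale `"10"` even though the database holds `"12"`, while `deleteOnWrite()` returns `"12"`. Deletion is not free of races either (a slow reader can still repopulate an old value after the delete), which is one more reason to keep a TTL as a safety net.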

5.3 Eviction policies

Since the cache has limited capacity, when it fills up it must evict entries:

| Policy | Criterion | Ideal when |
|---|---|---|
| LRU (Least Recently Used) | Evicts the least recently used | There is temporal locality (the usual case) |
| LFU (Least Frequently Used) | Evicts the least frequent | There is stable "hot" data |
| FIFO | Evicts the oldest | Simple cases |
| TTL/Random | By expiration or at random | When the pattern is uniform |
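As an illustration of the most common policy, LRU can be sketched in a few lines on top of the JDK's `LinkedHashMap` in access-order mode. The class name `LruCache` is an assumption of this example, and the sketch is not thread-safe; production caches such as Caffeine use more elaborate policies.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU cache built on LinkedHashMap's access-order mode.
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true);   // true = order entries from least to most recently accessed
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

With a capacity of 2, inserting `a` and `b`, reading `a`, and then inserting `c` evicts `b`: the read made `a` the most recently used entry, so `b` became the eldest.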

  6. Cache problems and how to mitigate them

  • Cache Stampede (thundering herd): when a very popular entry expires, thousands of simultaneous requests suffer a miss at the same time and hit the database in unison, potentially bringing it down. Mitigations: (a) a lock or single-flight so that only one request recomputes while the others wait; (b) early recomputation (refresh before it expires); (c) TTL with jitter (add randomness so they don't all expire at the same time).
  • Cache Penetration: queries for keys that do not exist in the database; they are never cached and always hit the source. Mitigation: cache the "does not exist" (a null value with a short TTL) or use a Bloom filter.
  • Cache Avalanche: many entries expire simultaneously (e.g., all with the same TTL set at startup). Mitigation: staggered TTL with jitter.
  • Stale data: the data changed in the database but the cache still serves the old value. It is the inherent cost; it is managed with the combination of an appropriate TTL and explicit invalidation.
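Three of these mitigations (single-flight loading, TTL with jitter, and caching "does not exist") can be combined in a short in-process sketch. `GuardedCache` and its parameters are hypothetical names, and the single-flight behavior relies on `ConcurrentHashMap.compute` running atomically per key, which is a simplification: the JDK documentation recommends keeping such computations short, so a slow loader would call for a dedicated mechanism.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical in-process cache combining three mitigations from the list above.
final class GuardedCache<K, V> {
    private static final class Entry<V> {
        final Optional<V> value;        // Optional.empty() records "does not exist"
        final long expiresAtMillis;

        Entry(Optional<V> value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final ConcurrentHashMap<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final LongSupplier clock;
    private final long ttlMillis;          // base TTL for existing values
    private final long maxJitterMillis;    // random extra TTL in [0, maxJitterMillis]
    private final long negativeTtlMillis;  // short TTL for "does not exist"

    GuardedCache(LongSupplier clock, long ttlMillis, long maxJitterMillis, long negativeTtlMillis) {
        this.clock = clock;
        this.ttlMillis = ttlMillis;
        this.maxJitterMillis = maxJitterMillis;
        this.negativeTtlMillis = negativeTtlMillis;
    }

    Optional<V> get(K key, Function<K, Optional<V>> loader) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.compute(key, (k, current) -> {
            if (current != null && now < current.expiresAtMillis) {
                return current;                       // still fresh: no load
            }
            // compute() is atomic per key, so concurrent callers for the same
            // key wait here instead of all invoking the loader (single-flight).
            Optional<V> loaded = loader.apply(k);
            long ttl = loaded.isPresent()
                    ? ttlMillis + ThreadLocalRandom.current().nextLong(maxJitterMillis + 1)
                    : negativeTtlMillis;              // negative caching with a short TTL
            return new Entry<>(loaded, now + ttl);
        });
        return entry.value;
    }
}
```

Callers that were blocked during a load find the fresh entry when their turn comes and return it without touching the source. A missing key costs one source lookup per `negativeTtlMillis` instead of one per request.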

  7. Practical example with Redis

Redis is the most widely used distributed cache: an in-memory key-value store, blazing fast and shared across instances. Let's look at cache-aside against Redis with anti-stampede protection.

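// Needs java.util.concurrent.ThreadLocalRandom and redis.clients.jedis.params.SetParams
public Product get(long id) throws InterruptedException {
    String key = "product:" + id;
    String lockKey = "lock:" + key;

    for (int attempt = 0; attempt < 100; attempt++) {   // bounded retries instead of unbounded recursion
        String json = redis.get(key);                // 1. query to Redis
        if (json != null) {
            return deserialize(json);                // HIT
        }

        // 2. MISS: we try to acquire a lock to avoid the stampede.
        // Jedis returns "OK" when the key was set and null when NX prevented it.
        String reply = redis.set(lockKey, "1", SetParams.setParams().nx().px(3000));
        if (!"OK".equals(reply)) {
            Thread.sleep(50);                        // another thread is recomputing: we wait
            continue;                                // we retry: it's probably there already
        }

        try {
            Product p = repository.findById(id);     // 3. only ONE thread goes to the DB
            if (p != null) {                         // a missing product is not serialized
                long ttl = 600 + ThreadLocalRandom.current().nextInt(61); // 4. TTL with jitter (600-660s)
                redis.setex(key, ttl, serialize(p)); // stores with expiration
            }
            return p;
        } finally {
            redis.del(lockKey);                      // 5. we release the lock
        }
    }
    return repository.findById(id);                  // fallback after about 5 s of waiting
}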

Analysis of the code:

  • Step 1: direct read from Redis with GET. If there is a value, it's a hit and we deserialize.
  • Step 2: on a miss, we try SET ... NX PX 3000. NX means "only if it does not exist," so only one thread obtains the lock; PX 3000 gives it a 3 s expiration so the lock doesn't hang if the thread dies.
  • If we don't get the lock, we wait a bit and retry: by then, the "winning" thread has probably already populated the cache.
  • Step 3: only the thread with the lock queries the database, avoiding the stampede.
  • Step 4: we store with SETEX and a TTL with jitter (600 to 660 s) so that not all keys expire at the same time (anti-avalanche).
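  • Step 5: we release the lock in the finally block, whether or not the load succeeded. One caveat: if the load takes longer than the 3 s expiration, the lock may already belong to another thread, and a plain DEL would remove that thread's lock. A more robust variant stores a unique token as the lock value and deletes the key only if the token still matches.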

And the invalidation on update:

public void update(Product p) {
    repository.save(p);
    redis.del("product:" + p.getId());   // invalidates; the next GET will reload from the DB
}

Common Mistakes and Tips

  • Caching data that changes constantly. If a piece of data changes faster than it is read, the cache almost never hits and adds complexity without benefit. Cache what is read a lot and changes little.
  • Not setting a TTL. A cache without expiration accumulates stale data indefinitely. Always set a TTL, even a long one, as a safety net.
  • Identical TTLs for all keys. This causes avalanches. Add jitter.
  • Local cache in multi-instance systems without coordination. Each instance has its own copy; when you invalidate on one, the others keep serving the old value. Use a distributed cache or an invalidation channel (pub/sub).
  • Treating the cache as the source of truth. The cache is a disposable copy; the database is the authority. Your system must work (more slowly) even if the cache is emptied entirely.
  • Tip: measure the hit ratio in production. A cache with a low hit ratio is not helping; review what you cache and the TTLs.

Exercises

Exercise 1. Explain the difference between cache-aside and read-through in terms of "who is responsible for loading the data from the source."

Exercise 2. A very popular product has a TTL of 600 s. Right when it expires, 5,000 requests arrive in the same second. Describe what problem occurs and propose two concrete mitigations.

Exercise 3. You want to cache the balance of a bank account that must always appear up to date after a transfer. Which write and invalidation pattern would you use and why? Which pattern would you avoid?

Solutions

Solution 1. In cache-aside, the responsibility for loading the data from the source falls on the application: the code checks the cache and, in case of a miss, queries the database and repopulates it manually. In read-through, that responsibility is assumed by the cache itself through a configured loader; the application only asks the cache for the data, which internally loads it if missing.

Solution 2. The problem is a cache stampede (thundering herd): when the entry expires, the 5,000 requests suffer a simultaneous miss and hit the database at once, potentially saturating it. Two mitigations: (a) a single-flight lock with Redis SET NX so that only one request recomputes while the others wait and reuse the result; (b) TTL with jitter and/or early refresh of the value before it expires, so that there is never a window of massive miss.

Solution 3. I would use write-through (write to the cache and the database synchronously) or, simpler and safer, explicit invalidation: after persisting the transfer, delete the balance key so that the next read reloads it up to date from the database. I would avoid write-behind, because its asynchronous persistence can lose data if the cache goes down, something unacceptable for a bank balance. In addition, a very short TTL would be advisable as a safety net.

Conclusion

You have completed the tour of caching: you know what it is and why it works, at which levels it appears (from the CDN to Redis), how to choose between read patterns (cache-aside, read-through) and write patterns (write-through, write-behind), and how to manage invalidation by combining TTL, explicit deletion, and eviction policies. You also know the classic dangers (stampede, penetration, avalanche, stale data) and how to mitigate them with locks, jitter, and negative caching, illustrated with a real example in Redis. With this lesson you close Module 7, in which you have learned to decide where to store data (SQL vs NoSQL), how to access it cleanly (Repository, Unit of Work, DAO), how to manage it in distributed systems (database per service, Sagas, CQRS), and how to speed up its reading without sacrificing consistency (caching). The next module takes us outside our own data center to explore cloud architecture and deployment.

© Copyright 2026. All rights reserved