Service Workers + Offline Strategy: Cache First, Network First, Update Lifecycle

A service worker runs separately from the page and can intercept requests, populate caches, serve fallback responses, and control how the app behaves when the network is unavailable or slow.

The most common mistake is treating service workers as “cache everything and go offline.” That approach often creates stale UI, broken deploys, or users stuck on incompatible versions. Strong candidates explain both the upside and the operational risks.

A strong answer usually separates three layers:

  • app shell: static UI and assets that can often be cached aggressively
  • dynamic data: responses whose freshness depends on product semantics
  • update flow: how new service worker code, new assets, and old tabs interact during deployment

Quick Decision Guide

  • Use cache-first for versioned static assets like JS, CSS, fonts, and logos.
  • Use network-first for critical data where freshness matters more than speed.
  • Use stale-while-revalidate when immediate response matters and slight staleness is acceptable.
  • Treat update lifecycle as a first-class design concern: stale workers and stale app shells cause real production bugs.
  • Separate offline shell from network-dependent data.
  • Remember that service workers are a programmable request layer, not a reason to ignore HTTP caching semantics.

Interview framing: Service workers are a programmable network layer for resilience and caching, not just a PWA checkbox.

Service Worker Mental Model

A service worker is not part of the page's normal JavaScript runtime.

It runs in its own worker context and can listen for lifecycle and network-related events.

High-level architecture

Page
  |
  | register()
  v
Service Worker
  |
  | intercept fetch
  +--> Cache Storage
  |
  +--> Network

Key properties

runs off the main thread
has no direct DOM access
can intercept matching requests
can read/write Cache Storage
can support offline fallbacks and background features such as push notifications and background sync

Practical mental model

Think of the service worker as a programmable proxy sitting between the page and the network.

Lifecycle: register, install, activate, fetch

A service worker usually goes through these major phases:

1) Register

The page registers a service worker script.

2) Install

The browser installs the new worker. This is often where apps pre-cache shell assets.

3) Waiting

If an older worker is still controlling open pages, the new worker may wait rather than taking over immediately.

4) Activate

The new worker becomes active. This is typically where old caches are cleaned up.

5) Fetch

The active worker can intercept matching requests and decide whether to serve cache, network, fallback, or a combination.

Why lifecycle matters

The lifecycle is not just API trivia. It determines:

when new code becomes active
whether users stay on old cached assets
when old caches are deleted
whether multiple versions of the app can coexist briefly

Important lifecycle nuance

By default, a newly installed service worker does not immediately take over already-open pages. That is why skipWaiting() and clients.claim() are common interview talking points.
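
The waiting rule can be made concrete with a toy model. This is a sketch for reasoning only: `RegistrationModel`, `finishInstall`, and `closeAllPages` are hypothetical names, not browser APIs, and the model ignores details such as install failures.

```typescript
// Toy model of a single registration. These types and functions are
// hypothetical and only illustrate the activation rule; they are not browser APIs.
type RegistrationModel = {
  active: string | null;   // version currently controlling pages
  waiting: string | null;  // installed version that has not taken over
  controlledPages: number; // open pages controlled by `active`
};

// A new version finishes installing.
function finishInstall(
  reg: RegistrationModel,
  version: string,
  skipWaiting = false,
): RegistrationModel {
  // The new version waits only while an older version still controls open pages
  // and nobody asked to skip the waiting phase.
  const mustWait = reg.active !== null && reg.controlledPages > 0 && !skipWaiting;
  return mustWait
    ? { ...reg, waiting: version }
    : { ...reg, active: version, waiting: null };
}

// Every controlled page closes, so a waiting version (if any) can activate.
function closeAllPages(reg: RegistrationModel): RegistrationModel {
  return {
    active: reg.waiting ?? reg.active,
    waiting: null,
    controlledPages: 0,
  };
}
```

In this model, installing "v2" while two pages are open under "v1" leaves "v2" waiting until the pages close, unless `skipWaiting` is set. In a real worker the equivalent switch is a call to self.skipWaiting() during install.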

Cache Storage vs HTTP Cache

A common confusion is treating service worker cache and browser HTTP cache as the same thing.

They are related, but different.

HTTP cache

built into the browser
controlled by HTTP headers like Cache-Control
automatic

Cache Storage API

explicitly managed by JavaScript
available to service workers, pages, and other workers
gives you programmable control over what to store and return

Mental model

Request
  |
Service Worker logic
  |
  +--> Cache Storage API
  |
  +--> fetch()
          |
          +--> browser HTTP cache may still apply
          +--> network

Interview framing

A strong answer makes clear that service workers add a programmable caching layer, but HTTP caching semantics still matter.

Core Cache Strategies

Cache First

Return cached response if available; otherwise fetch from network and store it.

Best for: hashed static assets, fonts, icons, logos, stable shell files.

Risk: stale resources if URLs are not versioned properly.
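
A minimal sketch of cache-first, written against small stand-in interfaces so it can run outside a browser. `CacheLike`, `Fetcher`, and `cacheFirst` are illustrative names; in a real worker the cache would come from caches.open() and the fetcher would be fetch().

```typescript
// Minimal stand-ins for the parts of Cache and fetch() this sketch needs.
interface CacheLike<R> {
  match(url: string): Promise<R | undefined>;
  put(url: string, response: R): Promise<void>;
}
type Fetcher<R> = (url: string) => Promise<R>;

// Cache first: serve from cache when possible, otherwise fetch and store.
async function cacheFirst<R>(
  url: string,
  cache: CacheLike<R>,
  fetcher: Fetcher<R>,
): Promise<R> {
  const cached = await cache.match(url);
  if (cached !== undefined) return cached;

  // Cache miss: go to the network and keep a copy for next time.
  // With real Response objects, store response.clone() and skip non-OK responses.
  const fresh = await fetcher(url);
  await cache.put(url, fresh);
  return fresh;
}
```

Because a hit never touches the network, this only stays correct when the URL changes whenever the content changes, which is why it pairs with hashed filenames.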

Network First

Try network first; fall back to cache if network fails or times out.

Best for: dynamic data like feed content, dashboards, or inventory-like data.

Risk: on a slow or flaky connection the user waits on the network before the cached copy is used, so define a timeout and a sensible fallback.
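
A minimal sketch of network-first under the same kind of stand-in interfaces (`CacheLike`, `Fetcher`, and `networkFirst` are illustrative names). It covers only the failure path; a production version would usually add a timeout as well.

```typescript
// Minimal stand-ins for the parts of Cache and fetch() this sketch needs.
interface CacheLike<R> {
  match(url: string): Promise<R | undefined>;
  put(url: string, response: R): Promise<void>;
}
type Fetcher<R> = (url: string) => Promise<R>;

// Network first: prefer a fresh response, fall back to the last cached copy.
async function networkFirst<R>(
  url: string,
  cache: CacheLike<R>,
  fetcher: Fetcher<R>,
): Promise<R> {
  try {
    const fresh = await fetcher(url);
    // Remember the latest good response so there is something to fall back to.
    await cache.put(url, fresh);
    return fresh;
  } catch (networkError) {
    const cached = await cache.match(url);
    if (cached !== undefined) return cached;
    // Nothing cached either: surface the failure so the UI can show an offline state.
    throw networkError;
  }
}
```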

Stale While Revalidate

Return cached response immediately, then refresh in background.

Best for: content where perceived speed matters and small staleness is acceptable.

Risk: users may briefly see stale content.
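
A minimal sketch of stale-while-revalidate, again with stand-in interfaces and illustrative names. Returning the `settled` promise is a choice made here so the background refresh can be awaited; in a real worker it would typically be passed to event.waitUntil().

```typescript
// Minimal stand-ins for the parts of Cache and fetch() this sketch needs.
interface CacheLike<R> {
  match(url: string): Promise<R | undefined>;
  put(url: string, response: R): Promise<void>;
}
type Fetcher<R> = (url: string) => Promise<R>;

// Stale while revalidate: answer from cache right away, refresh in the background.
function staleWhileRevalidate<R>(
  url: string,
  cache: CacheLike<R>,
  fetcher: Fetcher<R>,
): { response: Promise<R>; settled: Promise<void> } {
  const cachedLookup = cache.match(url);

  // Always start a refresh and store the result for the next request.
  const refresh = fetcher(url).then(async (fresh) => {
    await cache.put(url, fresh);
    return fresh;
  });

  // Resolves when the refresh finishes, whether it succeeded or failed.
  const settled = refresh.then(() => undefined, () => undefined);

  // Use the cached copy if there is one; otherwise wait for the network.
  const response = cachedLookup.then((cached) =>
    cached !== undefined ? cached : refresh,
  );

  return { response, settled };
}
```

The first request after a change still returns the old copy; the update only becomes visible on the following request.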

Cache Only / Network Only

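Cache only never touches the network (for example, precached shell files); network only never touches the cache (for example, requests that must always reach the server). Both are niche and used when behavior needs to be very explicit.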

Strategy mental model

Cache First:
  cache -> network fallback

Network First:
  network -> cache fallback

Stale While Revalidate:
  cache immediately + refresh in background
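
One way to apply these strategies is a small routing function that maps a request path to a strategy. The sketch below is illustrative only: the hashed-filename pattern and the /api/ prefix are assumed conventions, not something every project uses.

```typescript
type Strategy = "cache-first" | "network-first" | "stale-while-revalidate";

// Hypothetical convention: build output carries a content hash,
// for example /assets/app.3f9a1c2b.js
const HASHED_ASSET = /\.[0-9a-f]{8,}\.(js|css|woff2?|png|svg)$/;

function pickStrategy(pathname: string): Strategy {
  // Hashed assets never change under the same URL, so the cache is always valid.
  if (HASHED_ASSET.test(pathname)) return "cache-first";
  // Hypothetical convention: dynamic data lives under /api/ and must be fresh.
  if (pathname.startsWith("/api/")) return "network-first";
  // Everything else: fast response now, refresh in the background.
  return "stale-while-revalidate";
}
```

Keeping the decision in one function makes it easy to review which resources are allowed to be stale.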

Offline UX Is a Product Decision

Good offline design is not just technical caching. It requires product decisions about what remains usable without network.

Examples:

a docs app can keep reading experience available offline
a chat app may show cached history but disable new sends until reconnection
a banking app may keep the shell available but should be cautious about caching sensitive account state

A strong architecture usually separates:

app shell / static UI → cacheable and offline-friendly
dynamic data → freshness rules and fallbacks depend on business semantics

Interview framing

Offline-first does not mean every feature must work fully offline. It means the experience degrades intentionally instead of failing chaotically.

Offline Fallbacks and App Shell Thinking

A common pattern is to pre-cache a minimal app shell and optionally an offline fallback page.

Example idea

shell assets: JS, CSS, logo, base HTML
offline fallback: /offline.html

Why this helps

When navigation fails because the network is unavailable, the app can still return something meaningful instead of a browser error page.

Navigation fallback flow

User navigates
   |
Service worker tries network/cache strategy
   |
   |-- success -> normal response
   |
   |-- failure -> offline fallback page
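
The flow above can be sketched as follows. The interfaces and `navigateWithFallback` are illustrative names, and the sketch assumes /offline.html was precached during install. In a real worker this logic would apply only to requests whose mode is "navigate".

```typescript
// Minimal stand-ins for the parts of Cache and fetch() this sketch needs.
interface CacheLike<R> {
  match(url: string): Promise<R | undefined>;
}
type Fetcher<R> = (url: string) => Promise<R>;

// Assumed to have been added to the cache during install.
const OFFLINE_URL = "/offline.html";

async function navigateWithFallback<R>(
  url: string,
  fetcher: Fetcher<R>,
  cache: CacheLike<R>,
): Promise<R> {
  try {
    // Success path: the normal response.
    return await fetcher(url);
  } catch (networkError) {
    // Failure path: serve the offline page instead of a browser error.
    const fallback = await cache.match(OFFLINE_URL);
    if (fallback !== undefined) return fallback;
    throw networkError;
  }
}
```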

Senior insight

An offline fallback is not just a technical trick. It is part of UX design and should clearly communicate what still works and what does not.

Update Lifecycle and Stale Worker Pitfalls

The hardest production issue is often not fetch strategy, but update handling.

Common failures:

old HTML points to deleted assets
stale worker keeps serving outdated shell
users do not see new version until all tabs close
mixed versions make debugging extremely hard

Why this happens

A new service worker may install, but the old one may still control open tabs until activation conditions are met.

Common tools

version cache names explicitly
clean old caches during activate
use hashed asset filenames
decide deliberately whether to call skipWaiting()
decide whether to call clients.claim()
consider prompting the user when a new version is available
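
Versioned cache names and activate-time cleanup might look like the sketch below. `CacheStorageLike` is a stand-in for the two CacheStorage methods used (keys and delete), and the "app-shell-" prefix is a hypothetical naming scheme.

```typescript
// Stand-in for the parts of CacheStorage (the global `caches`) used here.
interface CacheStorageLike {
  keys(): Promise<string[]>;
  delete(name: string): Promise<boolean>;
}

// Hypothetical naming scheme: one cache per deployed shell version.
const CACHE_PREFIX = "app-shell-";

// Intended to run during activate: remove shell caches from older versions.
async function deleteOldCaches(
  storage: CacheStorageLike,
  currentName: string,
): Promise<string[]> {
  const names = await storage.keys();
  // Only touch caches this scheme owns, and never the current one.
  const stale = names.filter(
    (name) => name.startsWith(CACHE_PREFIX) && name !== currentName,
  );
  await Promise.all(stale.map((name) => storage.delete(name)));
  return stale;
}
```

Filtering by prefix keeps the cleanup from deleting unrelated caches, such as a runtime image cache that is meant to survive deploys.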

Trade-off

Immediate takeover can reduce stale-version problems, but forcing activation mid-session can break a page that was loaded with the old assets, because the new worker may start serving responses the already-running page code does not expect.

Precache vs Runtime Caching

A strong system design answer distinguishes between assets you know ahead of time and resources discovered during usage.

Precache

Assets are added intentionally during install.

Good for:

app shell
offline fallback page
critical icons and fonts

Runtime caching

Requests are cached as users actually encounter them.

Good for:

images
infrequently visited routes
API responses
large optional assets

Why this matters

Pre-caching too much increases install size and update pain.

Runtime caching gives flexibility, but can be less predictable if not designed carefully.
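
A precache step during install could be sketched like this. The interfaces are stand-ins for caches.open() and Cache.addAll(), and the URL list and cache name are hypothetical. With the real addAll(), one failed request rejects the whole call, which is one reason to keep the list short.

```typescript
// Stand-ins for the parts of CacheStorage and Cache used here.
interface PrecacheTarget {
  addAll(urls: string[]): Promise<void>;
}
interface CacheOpener {
  open(name: string): Promise<PrecacheTarget>;
}

// Hypothetical shell: only what is needed to render something offline.
const PRECACHE_URLS = ["/", "/app.css", "/app.js", "/logo.svg", "/offline.html"];

// Intended to run during install, wrapped in event.waitUntil() in a real worker.
async function precache(
  storage: CacheOpener,
  cacheName: string,
  urls: string[],
): Promise<void> {
  const cache = await storage.open(cacheName);
  // Install succeeds only if every listed asset was stored.
  await cache.addAll(urls);
}
```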

Common Pitfalls

1) Caching HTML too aggressively

This can trap users on an old app shell after deploy.

2) Caching API data without freshness policy

Users may see stale prices, stale dashboards, or misleading account state.

3) Forgetting cache cleanup

Old caches accumulate and waste storage or serve outdated resources.

4) Mixed-version deploy bugs

Old HTML + new JS or old worker + new assets can create hard-to-debug production failures.

5) Using one strategy for everything

Different resources need different strategies.

6) Ignoring service worker scope and control timing

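A worker only controls pages under its scope, and the page that registers it is not controlled until the next navigation unless the worker calls clients.claim(). This confuses many developers.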

Workbox and Practical Abstractions

In real projects, many teams use Workbox rather than hand-writing every route and strategy.

Why:

common strategies are already implemented
route matching becomes easier
offline fallback patterns are easier to express
precaching and runtime caching become more maintainable

Interview framing

You should understand the underlying concepts even if a library like Workbox handles the implementation details.

Interview Scenarios

Scenario 1: News or content app

Strong answer:

cache-first for shell assets
stale-while-revalidate for article/content pages when slight staleness is acceptable
offline fallback for navigation failures

Scenario 2: Admin dashboard

Strong answer:

cache shell assets aggressively
network-first for API data
clear offline state messaging
avoid misleading stale business-critical values

Scenario 3: E-commerce PWA

Strong answer:

cache shell and media carefully
network-first for price and inventory
be careful about stale checkout state

Scenario 4: Users stuck on an old version after deploy

Likely causes:

old service worker still controls tabs
old HTML or shell cached too aggressively
activation/update flow was not designed deliberately

Scenario 5: “Should we make the whole app offline-first?”

Strong answer:

depends on product semantics
separate shell from data
define what must stay fresh and what can degrade gracefully

Key Takeaways

1) Service workers act as a programmable request interception layer.
2) They run outside the page in a worker context and do not have direct DOM access.
3) Cache-first, network-first, and stale-while-revalidate each fit different resource types.
4) Offline UX should separate app shell from dynamic data.
5) Cache Storage and the browser HTTP cache are related but not the same thing.
6) Update lifecycle is one of the hardest and most important parts of service worker architecture.
7) Versioning, cache cleanup, and deploy-safe asset hashing matter as much as fetch strategy.
8) A bad update strategy can leave users trapped on stale or mixed app versions.