Streaming SSR: Progressive HTML Streaming
Streaming SSR lets the server progressively send HTML as parts of the render become ready. The main advantage is not that total work disappears, but that the user no longer waits for the slowest region before seeing anything useful.
Quick Decision Guide
Streaming SSR sends HTML in chunks instead of waiting for the entire page to finish rendering.
- Improves how quickly useful HTML reaches the browser
- Works best when the page has independent slow sections
- Relies on Suspense boundaries to separate shell content from delayed content
- Often improves perceived performance more than raw backend speed
Interview signal: streaming helps the browser see content sooner, but hydration and JavaScript cost still determine when the UI actually feels interactive.
What is Streaming SSR?
Streaming SSR sends HTML to the browser progressively as it is produced, instead of waiting for the full page render to complete first.
🔥 Insight
Streaming changes the shape of waiting. The work may still exist, but the browser can start rendering useful UI earlier.
Traditional SSR vs Streaming SSR
Traditional SSR
1. Server receives request
2. Server waits for all blocking data
3. Server renders full page
4. Browser receives one completed HTML response
Problem: the user is effectively gated by the slowest dependency.
Streaming SSR
1. Server receives request
2. Server renders the shell early
3. Browser receives the first chunk quickly
4. Slow regions stream later as their data resolves
5. Browser progressively updates visible content
Benefit: users can see useful structure and content before every slow query completes.
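The difference between the two flows can be sketched without any framework. In this minimal sketch, `delay`, `fetchShellData`, `fetchReviews`, and the `write` callback are hypothetical stand-ins (for real data sources and for something like a response's write method); only the ordering of the writes matters.

```javascript
// Hypothetical helper: resolves with `value` after `ms` milliseconds.
const delay = (ms, value) =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));

// Hypothetical data sources: one fast, one slow.
const fetchShellData = () => delay(5, 'Store');
const fetchReviews = () => delay(40, '3 reviews');

// Traditional SSR: nothing is written until every dependency has resolved.
async function renderTraditional(write) {
  const [title, reviews] = await Promise.all([fetchShellData(), fetchReviews()]);
  write(`<h1>${title}</h1><section>${reviews}</section>`);
}

// Streaming SSR: the shell is written as soon as it is ready,
// and the slow region follows in a later chunk.
async function renderStreaming(write) {
  const reviewsPromise = fetchReviews(); // start the slow work immediately
  const title = await fetchShellData();
  write(`<h1>${title}</h1>`);
  write(`<section>${await reviewsPromise}</section>`);
}
```

Both functions produce the same final HTML. The streaming version simply hands the first part over earlier, which is the whole point of the technique.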
What Streaming Does Not Do
Streaming does not eliminate slow data fetching, hydration work, or client-side JavaScript cost.
It mainly improves when HTML arrives, not the total amount of work in the system.
How Streaming Works
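React 18 streaming is built around the streaming server rendering APIs in react-dom/server (renderToPipeableStream for Node.js streams and renderToReadableStream for Web Streams) together with Suspense boundaries.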
Mental Model
> The page is split into a shell plus slower islands of content.
The shell can be sent early, while delayed regions continue rendering in parallel or resume later.
Suspense Boundaries
Suspense boundaries mark places where React can pause a portion of rendering and continue streaming the rest.
import { Suspense } from 'react';

export default function Page() {
  return (
    <div>
      <Header />
      <Suspense fallback={<ProductSkeleton />}>
        <ProductList />
      </Suspense>
      <Suspense fallback={<ReviewSkeleton />}>
        <ProductReviews />
      </Suspense>
    </div>
  );
}

// Async components are a Server Components feature (for example, in the Next.js App Router).
async function ProductList() {
  const products = await fetchProducts();
  return <div>{/* render products */}</div>;
}

Streaming Sequence
1. Shell content like Header renders first
2. Server sends the first HTML chunk
3. Slower boundaries suspend
4. Fallback UI is sent for those boundaries
5. When data resolves, React streams the completed HTML for that boundary
6. Browser updates the page progressively
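The sequence above can be imitated with an async generator. This is a simplified sketch of the idea, not React's actual wire format: the `data-boundary` attributes, the `renderPage` and `collect` functions, and the `fetchProducts` stub are all assumptions made for illustration.

```javascript
// Hypothetical slow data source.
const fetchProducts = () =>
  new Promise((resolve) => setTimeout(() => resolve(['Lamp', 'Desk']), 30));

// Yields HTML chunks in the order a streaming renderer would send them.
async function* renderPage() {
  const productsPromise = fetchProducts(); // the boundary starts its work, then suspends

  // Steps 1-4: the shell and the fallback go out in the first chunk.
  yield '<header>Shop</header>' +
    '<div data-boundary="products">Loading products...</div>';

  // Step 5: once the data resolves, send the completed HTML for that boundary.
  const products = await productsPromise;
  const items = products.map((name) => `<li>${name}</li>`).join('');
  yield `<template data-boundary-for="products"><ul>${items}</ul></template>`;
}

// Collects chunks in the order a client would receive them.
async function collect(chunks) {
  const received = [];
  for await (const chunk of chunks) received.push(chunk);
  return received;
}
```

Calling `collect(renderPage())` resolves to two chunks: the shell with its fallback, then the completed product list. In React's real stream, a small inline script also swaps the fallback for the completed content (step 6); this sketch models only the chunk ordering.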
Why This Helps
Interview Takeaway
Streaming SSR is most valuable when the page has different latency zones. If everything is equally fast or equally tiny, the benefit is smaller.
Suspense and Parallelism
Streaming works best when slow regions are separated into meaningful boundaries.
🔥 Insight
Suspense is not only a loading-state API. In streaming SSR, it is also a render scheduling boundary.
Example
import { Suspense } from 'react';

export default function Page() {
  return (
    <div>
      <Suspense fallback={<Skeleton1 />}>
        <Component1 />
      </Suspense>
      <Suspense fallback={<Skeleton2 />}>
        <Component2 />
      </Suspense>
    </div>
  );
}

If Component1 and Component2 fetch independently, each boundary can complete on its own timeline.
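To make "its own timeline" concrete, here is a framework-free sketch in which two independent boundaries start together and report back in completion order. The `delay` helper, the latencies, and the `streamBoundaries` function are hypothetical; a real renderer does considerably more than this.

```javascript
// Hypothetical helper: resolves with `value` after `ms` milliseconds.
const delay = (ms, value) =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));

// Two independent boundaries with different (assumed) latencies.
const boundaries = {
  Component1: () => delay(40, '<div>slow content</div>'),
  Component2: () => delay(10, '<div>fast content</div>'),
};

// Starts every boundary at once and reports each one as it finishes.
async function streamBoundaries(tasks, onReady) {
  await Promise.all(
    Object.entries(tasks).map(async ([name, run]) => {
      const html = await run();
      onReady(name, html); // called in completion order, not source order
    })
  );
}
```

With these delays, `onReady` fires for Component2 before Component1 even though Component1 appears first in the source. Neither boundary holds the other back.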
What Good Boundaries Look Like
Good boundaries usually separate the shell from delayed content, and independent slow sections from one another.
Bad Boundary Design
Too many tiny boundaries can create noisy loading states and complexity.
Too few boundaries can recreate the original problem: one slow region delays too much of the page.
Mental Model
> Boundaries should match user value, not just component file boundaries.
React Server Components and Next.js Streaming
In modern Next.js App Router architecture, streaming works naturally with Server Components and Suspense.
Server Components
Server Components render on the server and can await data directly.
// app/products/page.jsx
export default async function ProductsPage() {
  const products = await fetchProducts();
  return (
    <div>
      <h1>Products</h1>
      <ProductList products={products} />
    </div>
  );
}

Streaming with Suspense
import { Suspense } from 'react';

export default function ProductsPage() {
  return (
    <div>
      <Header />
      <Suspense fallback={<ProductSkeleton />}>
        <ProductList />
      </Suspense>
      <Suspense fallback={<ReviewSkeleton />}>
        <ProductReviews />
      </Suspense>
    </div>
  );
}

Why This Combination Works Well
Important Clarification
Streaming HTML earlier is not the same as making that HTML interactive earlier. Hydration still happens later on the client, and Client Components still carry JavaScript cost.
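A toy model makes the distinction visible. Nothing here is a real DOM or React API: `receiveHtml`, `click`, and `hydrate` are hypothetical functions that only mimic the order of events on the client.

```javascript
// A toy "page": markup the user can see, plus any attached event handlers.
function receiveHtml() {
  return { html: '<button id="buy">Buy</button>', handlers: new Map() };
}

// Simulates a click: without a handler, nothing happens.
function click(page, id) {
  const handler = page.handlers.get(id);
  return handler ? handler() : 'no response';
}

// Simulates hydration: client JavaScript attaches behavior to existing markup.
function hydrate(page) {
  page.handlers.set('buy', () => 'added to cart');
}

const page = receiveHtml();        // HTML has streamed in and is visible
const before = click(page, 'buy'); // 'no response'
hydrate(page);                     // the client bundle has loaded and run
const after = click(page, 'buy');  // 'added to cart'
```

The button is visible from the first line onward, but it only responds after `hydrate` runs. Streaming moves the first step earlier; it does not move the second.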
Edge Streaming
Streaming can also run in edge runtimes, placing server work closer to users geographically.
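import { Suspense } from 'react';

// Next.js route segment config: run this route in the edge runtime.
export const runtime = 'edge';

export default function ProductsPage() {
  return (
    <Suspense fallback={<Skeleton />}>
      <ProductList />
    </Suspense>
  );
}

Benefits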
Trade-offs
Interview Takeaway
Edge streaming helps most when latency to the server is a meaningful part of the user’s wait. It does not replace good boundary design or good data-fetching strategy.
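Edge runtimes generally expose the Web-standard ReadableStream and Response objects rather than Node.js streams. The sketch below shows the underlying pattern with those globals, which are also available in Node 18 and later. The `handleRequest` function and the `fetchProducts` stub are hypothetical, and a framework would normally build this stream for you.

```javascript
// Hypothetical slow data source.
const fetchProducts = () =>
  new Promise((resolve) => setTimeout(() => resolve(['Lamp', 'Desk']), 30));

// Returns a Response whose body is produced incrementally.
function handleRequest() {
  const encoder = new TextEncoder();
  const body = new ReadableStream({
    async start(controller) {
      // The shell goes out first.
      controller.enqueue(encoder.encode('<header>Shop</header>'));

      // The slow region follows once its data resolves.
      const products = await fetchProducts();
      const items = products.map((name) => `<li>${name}</li>`).join('');
      controller.enqueue(encoder.encode(`<ul>${items}</ul>`));

      controller.close();
    },
  });

  return new Response(body, {
    headers: { 'Content-Type': 'text/html; charset=utf-8' },
  });
}
```

A consumer that reads the body chunk by chunk sees the header before the list exists. Reading the whole body with `await handleRequest().text()` yields the two pieces concatenated in order.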
Traditional SSR vs Streaming SSR
| Aspect | Traditional SSR | Streaming SSR |
|---|---|---|
| HTML delivery | One complete response | Multiple progressive chunks |
| Waiting behavior | Blocked by slowest dependency | Shell can appear before slow regions finish |
| Perceived performance | Often worse on complex pages | Usually better when boundaries are well chosen |
| Complexity | Lower | Higher |
| Suspense usefulness | Limited for server response timing | Central to progressive delivery |
| Best fit | Smaller/simple pages | Pages with independent slow regions |
Best Practices
1. Keep Critical Shell Outside Slow Boundaries
Header, nav, hero, and key layout structure should usually render early.
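import { Suspense } from 'react';

function Page() {
  return (
    <div>
      {/* Critical shell renders immediately */}
      <Header />
      <Hero />
      {/* Slower, less critical content streams in later */}
      <Suspense fallback={<Skeleton />}>
        <RelatedContent />
      </Suspense>
    </div>
  );
}

2. Use Meaningful Fallbacks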
Fallbacks should preserve layout and reduce visual jumpiness.
<Suspense fallback={<ProductCardSkeleton />}>
  <ProductCard />
</Suspense>

3. Avoid Waterfalls
Do not make one slow fetch unnecessarily depend on another when the data can be fetched independently.
// ❌ Sequential waterfall
async function Page() {
  const user = await fetchUser();
  // This request cannot start until fetchUser has resolved.
  const posts = await fetchPosts(user.id);
  return <div>{/* render */}</div>;
}

Prefer boundaries and parallelizable work where possible.
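When two pieces of data do not depend on each other, starting both requests before awaiting either removes the waterfall. A minimal sketch, assuming hypothetical `fetchUser` and `fetchTrendingPosts` sources with illustrative delays:

```javascript
// Hypothetical helper and data sources; the delays are illustrative.
const delay = (ms, value) =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));
const fetchUser = () => delay(30, { id: 1, name: 'Ada' });
const fetchTrendingPosts = () => delay(30, ['Post A', 'Post B']);

// Waterfall: the second request does not start until the first finishes.
async function loadSequential() {
  const user = await fetchUser();
  const posts = await fetchTrendingPosts();
  return { user, posts };
}

// Parallel: both requests start immediately, then are awaited together.
async function loadParallel() {
  const [user, posts] = await Promise.all([fetchUser(), fetchTrendingPosts()]);
  return { user, posts };
}
```

Both functions return the same data, but the parallel version takes roughly as long as the slower request instead of the sum of the two. When the second request needs a value from the first, the dependency is real and cannot be parallelized this way. In that case, placing the dependent region in its own Suspense boundary at least keeps it from blocking the rest of the page.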
4. Do Not Oversplit the Page
Too many tiny Suspense boundaries can make the UI noisy and harder to reason about.
5. Measure the Right Things
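Streaming improves HTML delivery, but also watch hydration time and client-side JavaScript cost, since those determine when the page actually becomes interactive.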
6. Use Streaming Where It Changes User Experience
If the entire page is fast already, streaming may add complexity with limited payoff. Use it where slow independent regions would otherwise block useful content.