Streaming SSR: Progressive HTML Streaming

Medium

Streaming SSR lets the server progressively send HTML as parts of the render become ready. The main advantage is not that total work disappears, but that the user no longer waits for the slowest region before seeing anything useful.

Quick Decision Guide

Streaming SSR sends HTML in chunks instead of waiting for the entire page to finish rendering.

- Improves how quickly useful HTML reaches the browser
- Works best when the page has independent slow sections
- Relies on Suspense boundaries to separate shell content from delayed content
- Often improves perceived performance more than raw backend speed

Interview signal: streaming helps the browser see content sooner, but hydration and JavaScript cost still determine when the UI actually feels interactive.

What is Streaming SSR?

Streaming SSR sends HTML to the browser progressively as it is produced, instead of waiting for the full page render to complete first.

🔥 Insight

Streaming changes the shape of waiting. The work may still exist, but the browser can start rendering useful UI earlier.

Traditional SSR vs Streaming SSR

Traditional SSR

1. Server receives request

2. Server waits for all blocking data

3. Server renders full page

4. Browser receives one completed HTML response

Problem: the user is effectively gated by the slowest dependency.

Streaming SSR

1. Server receives request

2. Server renders the shell early

3. Browser receives the first chunk quickly

4. Slow regions stream later as their data resolves

5. Browser progressively updates visible content

Benefit: users can see useful structure and content before every slow query completes.
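The two flows above can be compared with a small simulation. This is a minimal sketch, not React code: `fetchReviews`, `renderTraditional`, `renderStreaming`, and `collect` are hypothetical names, and the 50 ms delay is an arbitrary stand-in for a slow dependency. Each renderer is an async generator whose yielded strings represent HTML chunks written to the response.

```javascript
// Hypothetical helpers: names and delays are illustrative only.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Simulated slow data source (stands in for a database or API call).
async function fetchReviews() {
  await sleep(50);
  return ['Great product', 'Works as described'];
}

const renderList = (items) =>
  `<ul>${items.map((item) => `<li>${item}</li>`).join('')}</ul>`;

// Traditional SSR: nothing is emitted until every dependency has resolved.
async function* renderTraditional() {
  const reviews = await fetchReviews();
  yield `<header>Shop</header>${renderList(reviews)}`;
}

// Streaming SSR: the shell is emitted immediately, the slow region follows.
async function* renderStreaming() {
  const pending = fetchReviews(); // start the slow work right away
  yield '<header>Shop</header>'; // first chunk does not wait for reviews
  yield renderList(await pending);
}

// Collects chunks and records how long the first one took to arrive.
async function collect(chunks) {
  const start = Date.now();
  let firstChunkMs = null;
  const out = [];
  for await (const chunk of chunks) {
    if (firstChunkMs === null) firstChunkMs = Date.now() - start;
    out.push(chunk);
  }
  return { out, firstChunkMs, totalMs: Date.now() - start };
}
```

Running `collect(renderTraditional())` and `collect(renderStreaming())` side by side should show a similar total time for both, with a much smaller `firstChunkMs` for the streaming version. That is the point of the comparison: the work is the same, but the first useful bytes arrive earlier.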

What Streaming Does Not Do

Streaming does not eliminate:

- slow data fetching
- hydration cost
- JavaScript bundle cost
- client-side interactivity delays

It mainly improves when HTML arrives, not the total amount of work in the system.

How Streaming Works

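React 18 streaming is built around the streaming server rendering APIs in react-dom/server (renderToPipeableStream for Node.js streams, renderToReadableStream for Web Streams) and Suspense boundaries.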

Mental Model

> The page is split into a shell plus slower islands of content.

The shell can be sent early, while delayed regions continue rendering in parallel or resume later.

Suspense Boundaries

Suspense boundaries mark places where React can pause a portion of rendering and continue streaming the rest.

import { Suspense } from 'react';

export default function Page() {
  return (
    <div>
      <Header />

      <Suspense fallback={<ProductSkeleton />}>
        <ProductList />
      </Suspense>

      <Suspense fallback={<ReviewSkeleton />}>
        <ProductReviews />
      </Suspense>
    </div>
  );
}

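// Async components like this one are supported as Server Components
// (for example, in the Next.js App Router). fetchProducts is a placeholder data call.
async function ProductList() {
  const products = await fetchProducts();
  return <div>{/* render products */}</div>;
}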

Streaming Sequence

1. Shell content like Header renders first

2. Server sends the first HTML chunk

3. Slower boundaries suspend

4. Fallback UI is sent for those boundaries

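5. When data resolves, React streams the completed HTML for that boundary in the same response, along with a small inline script that swaps it in for the fallback

6. Browser replaces each fallback in place as its content arrives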
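The sequence above can be sketched as a pair of chunks: a shell containing a fallback, followed by the finished content and a script that moves it into place. This is a simplified illustration under stated assumptions. React's actual wire format is different, and `fetchProducts`, `streamPage`, and the `slot-products` / `done-products` ids are hypothetical names chosen for the example.

```javascript
// Simplified illustration of fallback-then-replace; not React's real output format.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical slow data source.
async function fetchProducts() {
  await sleep(30);
  return ['Keyboard', 'Mouse'];
}

// Yields HTML chunks in the order a streaming server might send them.
async function* streamPage() {
  const products = fetchProducts(); // kick off slow work before sending the shell

  // Chunk 1: shell plus fallback, inside a placeholder the later chunk can target.
  yield '<header>Shop</header><div id="slot-products">Loading products...</div>';

  // Chunk 2: completed HTML sent inside an inert <template>,
  // plus an inline script that swaps it in for the fallback.
  const items = (await products).map((p) => `<li>${p}</li>`).join('');
  yield (
    `<template id="done-products"><ul>${items}</ul></template>` +
    '<script>document.getElementById("slot-products")' +
    '.replaceChildren(document.getElementById("done-products").content)</script>'
  );
}
```

Because the second chunk targets the placeholder by id, it can be appended at the end of the document and still update the right region. That is what allows slow regions to be delivered out of document order.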

Why This Helps

- important layout appears earlier
- user sees progress instead of a blank wait
- slow regions no longer block the entire response

Interview Takeaway

Streaming SSR is most valuable when the page has different latency zones. If everything is equally fast or equally tiny, the benefit is smaller.

Suspense and Parallelism

Streaming works best when slow regions are separated into meaningful boundaries.

🔥 Insight

Suspense is not only a loading-state API. In streaming SSR, it is also a render scheduling boundary.

Example

import { Suspense } from 'react';

export default function Page() {
  return (
    <div>
      <Suspense fallback={<Skeleton1 />}>
        <Component1 />
      </Suspense>

      <Suspense fallback={<Skeleton2 />}>
        <Component2 />
      </Suspense>
    </div>
  );
}

If Component1 and Component2 fetch independently, each boundary can complete on its own timeline.
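To make "its own timeline" concrete, here is a minimal sketch that starts every boundary's work at once and emits each result as soon as it is ready, regardless of document order. The `boundaries` list, the latencies, and `streamInCompletionOrder` are hypothetical and only model the scheduling idea, not React's internals.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical boundaries with different latencies (values are illustrative).
// "reviews" comes first in the document but resolves last.
const boundaries = [
  { id: 'reviews', load: async () => { await sleep(60); return '<p>Reviews</p>'; } },
  { id: 'products', load: async () => { await sleep(20); return '<p>Products</p>'; } },
];

// Starts every boundary at once and yields each result as soon as it is ready.
async function* streamInCompletionOrder(list) {
  const pending = new Map(
    list.map((b) => [b.id, b.load().then((html) => ({ id: b.id, html }))])
  );
  while (pending.size > 0) {
    // Wait for whichever remaining boundary finishes first.
    const done = await Promise.race(pending.values());
    pending.delete(done.id);
    yield done;
  }
}
```

Iterating `streamInCompletionOrder(boundaries)` yields `products` before `reviews`, because the faster boundary is not held back by the slower one that precedes it in the markup.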

What Good Boundaries Look Like

Good boundaries usually separate:

- critical above-the-fold content
- slow but non-blocking content
- large secondary regions like reviews, recommendations, sidebars, comments

Bad Boundary Design

Too many tiny boundaries can create noisy loading states and complexity.

Too few boundaries can recreate the original problem: one slow region delays too much of the page.

Mental Model

> Boundaries should match user value, not just component file boundaries.

React Server Components and Next.js Streaming

In modern Next.js App Router architecture, streaming works naturally with Server Components and Suspense.

Server Components

Server Components render on the server and can await data directly.

// app/products/page.jsx
export default async function ProductsPage() {
  const products = await fetchProducts();

  return (
    <div>
      <h1>Products</h1>
      <ProductList products={products} />
    </div>
  );
}

Streaming with Suspense

import { Suspense } from 'react';

export default function ProductsPage() {
  return (
    <div>
      <Header />

      <Suspense fallback={<ProductSkeleton />}>
        <ProductList />
      </Suspense>

      <Suspense fallback={<ReviewSkeleton />}>
        <ProductReviews />
      </Suspense>
    </div>
  );
}

Why This Combination Works Well

- Server Components keep more work off the client
- Suspense defines streamable boundaries
- different data dependencies can resolve independently
- client JavaScript can stay smaller when large sections remain server-rendered

Important Clarification

Streaming HTML earlier is not the same as making that HTML interactive earlier. Hydration still happens later on the client, and Client Components still carry JavaScript cost.

Edge Streaming

Streaming can also run in edge runtimes, placing server work closer to users geographically.

import { Suspense } from 'react';

export const runtime = 'edge';

export default function ProductsPage() {
  return (
    <Suspense fallback={<Skeleton />}>
      <ProductList />
    </Suspense>
  );
}

Benefits

- lower network latency for globally distributed users
- faster shell delivery in many regions
- good fit for personalized or request-time responses near the user

Trade-offs

- runtime limitations compared with full Node environments
- stricter API surface in many platforms
- memory and CPU limits may be tighter
- architecture must account for platform-specific constraints
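In practice, the stricter API surface typically means Web-standard APIs such as `ReadableStream` and `Response` rather than Node.js streams. The sketch below shows what a hand-written streaming response might look like with those primitives. It assumes a runtime that exposes them as globals (recent Node.js versions do as well); `handleRequest` and the 30 ms delay are hypothetical, and a framework like Next.js would normally build this stream for you.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical request handler written against Web-standard APIs only.
function handleRequest() {
  const encoder = new TextEncoder();

  const stream = new ReadableStream({
    async start(controller) {
      // Shell goes out first so the browser can start rendering.
      controller.enqueue(encoder.encode('<header>Shop</header>'));

      await sleep(30); // stands in for a slow data fetch

      // Slow region follows in a later chunk, then the response ends.
      controller.enqueue(encoder.encode('<ul><li>Keyboard</li></ul>'));
      controller.close();
    },
  });

  return new Response(stream, {
    headers: { 'Content-Type': 'text/html; charset=utf-8' },
  });
}
```

The handler returns immediately with a `Response` whose body is still being produced, which is the same shell-first behavior described earlier, expressed without any Node-specific modules.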

Interview Takeaway

Edge streaming helps most when latency to the server is a meaningful part of the user’s wait. It does not replace good boundary design or good data-fetching strategy.

Traditional SSR vs Streaming SSR

| Aspect | Traditional SSR | Streaming SSR |
| --- | --- | --- |
| HTML delivery | One complete response | Multiple progressive chunks |
| Waiting behavior | Blocked by slowest dependency | Shell can appear before slow regions finish |
| Perceived performance | Often worse on complex pages | Usually better when boundaries are well chosen |
| Complexity | Lower | Higher |
| Suspense usefulness | Limited for server response timing | Central to progressive delivery |
| Best fit | Smaller/simple pages | Pages with independent slow regions |

Best Practices

1. Keep Critical Shell Outside Slow Boundaries

Header, nav, hero, and key layout structure should usually render early.

import { Suspense } from 'react';

function Page() {
  return (
    <div>
      <Header />
      <Hero />

      <Suspense fallback={<Skeleton />}>
        <RelatedContent />
      </Suspense>
    </div>
  );
}

2. Use Meaningful Fallbacks

Fallbacks should preserve layout and reduce visual jumpiness.

<Suspense fallback={<ProductCardSkeleton />}>
  <ProductCard />
</Suspense>

3. Avoid Waterfalls

Do not make one slow fetch unnecessarily depend on another when the data can be fetched independently.

// ❌ Sequential waterfall
async function Page() {
  const user = await fetchUser();
  const posts = await fetchPosts(user.id);
  return <div>{/* render */}</div>;
}

Prefer boundaries and parallelizable work where possible.
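When two requests do not depend on each other, starting them together removes the waterfall. The following is a minimal sketch with hypothetical data sources (`fetchUser` and `fetchTrending`, 40 ms each); it contrasts awaiting them one after another with starting both and awaiting them via `Promise.all`.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical independent data sources; neither needs the other's result.
async function fetchUser() {
  await sleep(40);
  return { id: 1, name: 'Ada' };
}

async function fetchTrending() {
  await sleep(40);
  return ['Post A', 'Post B'];
}

// Sequential: total latency is roughly the sum of both calls.
async function loadSequential() {
  const user = await fetchUser();
  const trending = await fetchTrending();
  return { user, trending };
}

// Parallel: both requests start together, so total latency is
// roughly that of the slower call.
async function loadParallel() {
  const [user, trending] = await Promise.all([fetchUser(), fetchTrending()]);
  return { user, trending };
}
```

Both functions return the same data; only the waiting differs. When a fetch truly depends on an earlier result (as `fetchPosts(user.id)` does above), it cannot be parallelized this way, and moving the dependent part behind its own Suspense boundary is usually the better tool.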

4. Do Not Oversplit the Page

Too many tiny Suspense boundaries can make the UI noisy and harder to reason about.

5. Measure the Right Things

Streaming improves HTML delivery, but also watch:

- hydration cost
- JavaScript execution time
- input responsiveness
- loading-state stability

6. Use Streaming Where It Changes User Experience

If the entire page is fast already, streaming may add complexity with limited payoff. Use it where slow independent regions would otherwise block useful content.

Key Takeaways

1. Streaming SSR sends HTML progressively instead of waiting for the full page render to complete
2. Its main benefit is earlier delivery of useful HTML, not elimination of total work
3. Suspense boundaries define where React can pause and later resume streamed content
4. Streaming works best when the page has independent slow regions rather than one monolithic render
5. Server Components and Suspense work well together for streamed server rendering
6. Edge streaming can reduce latency, but it does not replace good data-fetching and boundary design
7. Streaming improves perceived performance, while hydration still determines when content becomes interactive