Next.js App Router Changed How Your App Renders. Did Your Tests Notice?

Co-Founder and CEO, DevAssure

The test still passes. The alert never renders.

Your team migrated a patient medication dashboard to Next.js App Router. You refactored MedicationList as a Server Component to cut Time to First Byte. You wrapped DrugInteractionAlert in a <Suspense> boundary so the medication list appears immediately while the interaction check runs in parallel. Perceived performance improved.

CI stayed green. Your Playwright suite passed every run.

Two weeks later, a nurse filed a bug report: the drug interaction banner wasn't showing on slow hospital wifi connections. It would flash briefly and disappear, or sometimes never appear at all. The medications rendered. The alert — the one that warns about a contraindicated combination — didn't.

The test didn't catch it because it relied on waitForLoadState('networkidle'). By the time Playwright declared the page idle, the medication list had streamed in but the interaction check was still in flight. The test ran its assertions against that page, saw no alert, and logged a pass.

This is not a hypothetical failure mode. It is a specific class of bug introduced by migrating to App Router without updating your test assumptions — and it is consequential in proportion to how much your UI surfaces clinically relevant information.

This post covers what App Router actually changed in the rendering model, why it breaks existing test suites in ways that are especially dangerous for health-adjacent applications, and how O2 — DevAssure's PR-native testing agent — handles these rendering patterns natively without requiring you to write or maintain a single test script.

What App Router actually changed, and why it breaks your test assumptions

React Server Components don't hydrate — and clinical UI depends on that distinction

In the Pages Router, every component you render gets bundled and sent to the browser. The server renders HTML for the initial paint, and React takes over client-side via hydration. Your Playwright selectors worked because by the time they fired, the component had hydrated and its event handlers were live.

In the App Router, React Server Components (RSCs) never hydrate. They execute on the server, their output is serialized to HTML and the RSC payload, and no corresponding component JavaScript ships to the browser. A Server Component has no onClick handlers, no useState calls, and no client-side lifecycle.

In a patient medication dashboard, this pattern is common:

// app/patients/[patientId]/medications/MedicationList.tsx
// Server Component — no 'use client', executes only on the server

import { getMedications } from '@/lib/ehr-client';

export default async function MedicationList({ patientId }: { patientId: string }) {
  // This fetch runs on the server. The result is never sent to the browser as JS.
  const medications = await getMedications(patientId);

  return (
    <ul data-testid="medication-list">
      {medications.map((med) => (
        <li key={med.rxcui} data-testid={`med-${med.rxcui}`}>
          <span data-testid="med-name">{med.name}</span>
          <span data-testid="med-dose">{med.dose} {med.unit}</span>
          <span data-testid="med-route">{med.route}</span>
        </li>
      ))}
    </ul>
  );
}

An old Pages Router test for this component might have waited for hydration events or React re-renders to confirm the list populated. Neither of those happens anymore. The component renders once, server-side. If the EHR API returned stale data at render time, the component itself has no client-side mechanism to retry or correct it.

Practical rule: any assertion that depends on a component re-rendering in response to state changes is incorrect for Server Components. Validate the rendered HTML output, not a hydration state transition.

Streaming creates a window where your critical UI hasn't arrived yet

The Pages Router rendered a complete page and sent it. The App Router uses React's Suspense-based streaming: the server sends an initial HTML shell immediately, with a <template> placeholder for each pending boundary, then streams each boundary's content as a hidden <div> followed by a small inline script that swaps it into place once its upstream data resolves.

Open the Network tab in DevTools on a streaming App Router page. The response Content-Type is text/html, but the body doesn't complete immediately. It arrives in chunks over a single open HTTP connection:

<!-- Initial shell — arrives first, fast -->
<div id="root">
  <main>
    <!-- MedicationList Suspense boundary -->
    <!--$?--><template id="B:0"></template><!--/$-->
    <!-- DrugInteractionAlert Suspense boundary -->
    <!--$?--><template id="B:1"></template><!--/$-->
  </main>
</div>

<!-- Streamed chunk 1: MedicationList resolves (~200ms) -->
<div hidden id="S:0">
  <ul data-testid="medication-list">
    <li data-testid="med-11289">...</li> <!-- Warfarin -->
    <li data-testid="med-703">...</li>   <!-- Amiodarone -->
  </ul>
</div>
<script>$RC("B:0","S:0")</script>

<!-- Streamed chunk 2: DrugInteractionAlert resolves (~800ms) -->
<div hidden id="S:1">
  <div data-testid="drug-interaction-alert" role="alert" aria-live="assertive">
    ⚠ Severe interaction: Amiodarone inhibits CYP2C9 — monitor INR closely
  </div>
</div>
<script>$RC("B:1","S:1")</script>

MedicationList resolves in ~200ms. DrugInteractionAlert calls out to a drug interaction checking service — in real clinical stacks this is commonly a third-party API like First Databank (FDB) or Multum (Wolters Kluwer) — and resolves in ~800ms.
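
To make the sequence above concrete, here is a simplified model of what the browser ends up with as each chunk arrives. It is a sketch, not React's actual implementation: the Slot type and the applyChunk and isVisible functions are hypothetical names, and the real $RC script operates on DOM nodes rather than strings.

```typescript
// Simplified model of Suspense streaming: each boundary starts as a
// placeholder and is filled in when its streamed chunk arrives.
// Slot, applyChunk and isVisible are illustrative names, not React APIs.
type Slot =
  | { boundaryId: string; status: 'pending' }
  | { boundaryId: string; status: 'resolved'; html: string };

function applyChunk(page: Slot[], boundaryId: string, html: string): Slot[] {
  // Mirrors the effect of $RC(boundaryId, segmentId): swap the streamed
  // content into the matching placeholder and leave other boundaries alone.
  return page.map((slot): Slot =>
    slot.boundaryId === boundaryId ? { boundaryId, status: 'resolved', html } : slot
  );
}

function isVisible(page: Slot[], boundaryId: string): boolean {
  return page.some((s) => s.boundaryId === boundaryId && s.status === 'resolved');
}

// Initial shell: both boundaries are still placeholders.
const shell: Slot[] = [
  { boundaryId: 'B:0', status: 'pending' }, // MedicationList
  { boundaryId: 'B:1', status: 'pending' }, // DrugInteractionAlert
];

// ~200ms: chunk 1 arrives and fills the medication list.
const afterChunk1 = applyChunk(shell, 'B:0', '<ul data-testid="medication-list">...</ul>');

// ~800ms: chunk 2 arrives and fills the interaction alert.
const afterChunk2 = applyChunk(
  afterChunk1,
  'B:1',
  '<div data-testid="drug-interaction-alert">...</div>'
);
```

Between afterChunk1 and afterChunk2 the page looks complete to a casual check: the list is there and nothing is visibly broken. Any assertion that runs in that window sees a valid page with no alert.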

Playwright's waitForLoadState('networkidle') resolves after 500ms with no network activity. In the timeline above, the streaming HTML response is a single open connection and is not counted as a new pending request. Playwright sees zero pending requests at ~250ms (after the initial shell lands), waits 500ms, and declares the page idle at ~750ms. The medication list has streamed in. The drug interaction alert has not. Playwright's own documentation discourages networkidle for testing and recommends web assertions to assess readiness instead.
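
The arithmetic in that timeline can be checked with a small model. This is a sketch of the timeline as described here, not of Playwright's internals: networkIdleAt and the Interval type are hypothetical, and the only inputs are the figures already given (activity ending at ~250ms, a 500ms quiet window, the alert arriving at ~800ms).

```typescript
// Hypothetical model: one interval per period of observed network activity.
type Interval = { startMs: number; endMs: number };

// Returns the time at which a "quiet for quietWindowMs" rule would first be met.
function networkIdleAt(activity: Interval[], quietWindowMs = 500): number {
  const sorted = [...activity].sort((a, b) => a.startMs - b.startMs);
  let busyUntil = 0;
  for (const interval of sorted) {
    // If the gap before this interval is already long enough, idle was
    // declared before the interval began, so later activity is irrelevant.
    if (interval.startMs - busyUntil >= quietWindowMs) break;
    busyUntil = Math.max(busyUntil, interval.endMs);
  }
  return busyUntil + quietWindowMs;
}

// Observed activity ends once the initial shell has landed (~250ms).
const idleAtMs = networkIdleAt([{ startMs: 0, endMs: 250 }]);

// The DrugInteractionAlert chunk arrives at ~800ms.
const alertArrivesAtMs = 800;

// True when assertions start before the alert exists in the DOM.
const alertMissed = idleAtMs < alertArrivesAtMs;
```

With these numbers idleAtMs is 750 and alertMissed is true: the assertions begin roughly 50ms before the alert can exist. The margin is small, which is why the failure depends on how slow the interaction service is on a given run.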

Your test runs assertions against a page where the clinically significant UI is still in-flight:

// This test will consistently produce a false positive in App Router
await page.goto(`/patients/${patientId}/medications`);
await page.waitForLoadState('networkidle'); // fires at ~750ms — alert not yet streamed

// MedicationList streamed in ✓ — this passes correctly
await expect(page.locator('[data-testid="medication-list"]')).toBeVisible();

// DrugInteractionAlert is still in-flight at this point
// The alert isn't absent because there's no interaction — it just hasn't rendered yet
await expect(page.locator('[data-testid="drug-interaction-alert"]')).not.toBeVisible(); // FALSE PASS

The correct Playwright approach is to wait explicitly for the interaction alert boundary — not just the medication list — before asserting its state:

// Wait for either the alert OR the confirmed-safe state to appear
// Never treat silence as a confirmed negative
await page.waitForSelector(
  '[data-testid="drug-interaction-alert"], [data-testid="no-interactions-confirmed"]',
  { state: 'attached', timeout: 5000 }
);

Note the key pattern: you need the component to emit an explicit signal for the safe state (no-interactions-confirmed), not rely on the absence of the alert element. Absence before the Suspense boundary resolves looks identical to absence after it. This is a component contract issue as much as a test issue.
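
One way to express that contract is as a three-state type in which only the two settled states produce a test id. The sketch below is an assumption about how such a component could be structured, not code from the dashboard described here: InteractionCheck, settledTestId and canAssertNoAlert are hypothetical names, and the test ids are the ones used in the selector above.

```typescript
// Hypothetical component contract: the interaction check is always in exactly
// one of three states, and only the two settled states render a test id.
type InteractionCheck =
  | { status: 'pending' }
  | { status: 'interaction-found'; message: string }
  | { status: 'no-interactions' };

function settledTestId(check: InteractionCheck): string | null {
  switch (check.status) {
    case 'interaction-found':
      return 'drug-interaction-alert';
    case 'no-interactions':
      return 'no-interactions-confirmed';
    case 'pending':
      return null; // skeleton only: a test must keep waiting
  }
}

// A test may conclude "no alert" only when the explicit safe marker is present.
// A missing alert element on its own proves nothing.
function canAssertNoAlert(renderedTestIds: string[]): boolean {
  return renderedTestIds.includes('no-interactions-confirmed');
}
```

With this shape, the pending state and the safe state are no longer indistinguishable in the DOM, so a test that waits for either settled test id cannot pass while the check is still running.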

But here's the maintenance problem: to write this correctly, you need to know which components are Suspense-wrapped, what their upstream dependencies are, and how long they take. That knowledge lives in the code. Every time a developer introduces a new boundary, a human has to update the test.

The TOCTOU race condition that Playwright can't see at all

There is a subtler problem that no amount of correct waitForSelector usage will catch: independent Suspense boundaries make independent server-side fetches.

Consider this layout:

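// app/patients/[patientId]/medications/layout.tsx
import { Suspense } from 'react';
// AlertSkeleton and MedListSkeleton are assumed to come from your own components

export default function MedicationsLayout({ children, alert }) {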
  return (
    <>
      <Suspense fallback={<AlertSkeleton />}>
        {alert} {/* @alert parallel slot */}
      </Suspense>
      <Suspense fallback={<MedListSkeleton />}>
        {children}
      </Suspense>
    </>
  );
}

// app/patients/[patientId]/medications/@alert/page.tsx
export default async function DrugInteractionAlert({ params }) {
  const medications = await getMedications(params.patientId); // fetch #1
  const interactions = await checkInteractions(medications);
  // ...
}

// app/patients/[patientId]/medications/page.tsx
export default async function MedicationList({ params }) {
  const medications = await getMedications(params.patientId); // fetch #2 — independent call
  // ...
}

getMedications is called twice: once in the alert slot, once in the medication list. These are independent requests to the EHR API. If your EHR vendor uses read replicas — common under high load — these two calls can read from different database snapshots if a write commits between them.

Scenario: A pharmacist adds Amiodarone to a patient's active medications at t=0.

t=50ms: MedicationList fetches — hits the primary, returns Warfarin + Metoprolol + Amiodarone
t=80ms: DrugInteractionAlert fetches — hits a replica with replication lag, returns Warfarin + Metoprolol only
t=100ms: checkInteractions([Warfarin, Metoprolol]) — no interaction found, returns clean
Page renders: Amiodarone appears in the list. No interaction alert renders.

The Warfarin–Amiodarone interaction is a well-documented severe drug-drug interaction: Amiodarone inhibits CYP2C9 and P-glycoprotein, significantly increasing Warfarin plasma concentration and bleeding risk. INR requires close monitoring when the combination is used. The clinical data is sound — the rendering data was not.

This is a TOCTOU (Time-of-Check to Time-of-Use) race condition at the infrastructure level, surfaced by the architectural decision to use separate Suspense boundaries. No E2E test framework observes this — it happens on the server, between two fetches, within the lifetime of a single page render. But a PR-level agent that understands what changed in the component tree can flag that a developer introduced the split that makes it possible.
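
One way to close this window is to read the medication list once per request and hand the same snapshot to both boundaries. React's cache function is intended for this kind of per-request memoization in Server Components; the sketch below hand-rolls the idea so that it runs standalone. Everything in it is hypothetical: the alternating primary/replica reader, memoizePerRequest, and the simplified hasWarfarinAmiodarone check stand in for a real EHR client and interaction service.

```typescript
type Medication = { name: string };

// Hypothetical EHR reader: odd-numbered calls hit the primary, even-numbered
// calls hit a replica that has not yet received the Amiodarone write.
function makeLaggingReader(): (patientId: string) => Medication[] {
  const primary = [{ name: 'Warfarin' }, { name: 'Metoprolol' }, { name: 'Amiodarone' }];
  const replica = [{ name: 'Warfarin' }, { name: 'Metoprolol' }];
  let calls = 0;
  return () => (++calls % 2 === 1 ? primary : replica);
}

// Memoize a loader for the lifetime of one request so every caller sees the
// same snapshot. T can be a Promise when the loader is async.
function memoizePerRequest<T>(load: (key: string) => T): (key: string) => T {
  const results = new Map<string, T>();
  return (key) => {
    if (!results.has(key)) results.set(key, load(key));
    return results.get(key) as T;
  };
}

// Simplified stand-in for the interaction service.
function hasWarfarinAmiodarone(meds: Medication[]): boolean {
  const names = new Set(meds.map((m) => m.name));
  return names.has('Warfarin') && names.has('Amiodarone');
}

// Two independent reads: the list shows Amiodarone, the check never sees it.
const independentRead = makeLaggingReader();
const listFromIndependentRead = independentRead('42');
const alertFromIndependentRead = hasWarfarinAmiodarone(independentRead('42'));

// One shared read: the list and the check agree by construction.
const sharedRead = memoizePerRequest(makeLaggingReader());
const listFromSharedRead = sharedRead('42');
const alertFromSharedRead = hasWarfarinAmiodarone(sharedRead('42'));
```

Sharing the snapshot guarantees consistency, not freshness. If the single read happens to hit the lagging replica, the list and the alert are both stale, but they no longer contradict each other on screen.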

Parallel routes and patient detail modals

App Router's parallel routes (@folder), combined with intercepting routes ((.)folder), are common in patient dashboards: the patient list at /patients with a detail modal at /patients/[id] that overlays the list rather than fully navigating.

app/
  patients/
    @modal/
      (.)[patientId]/
        page.tsx  ← modal when navigating client-side from /patients
    [patientId]/
      page.tsx    ← full page on direct URL access
    page.tsx

The URL /patients/42 renders differently depending entirely on how the user arrived. Direct URL access shows the full page. Client-side navigation from /patients shows the modal overlay. A Playwright test that navigates directly to /patients/42 will always test the wrong layout — the one clinical staff never see during a shift, because they always arrive from the patient list.

// This tests the full-page layout that staff never actually use
await page.goto('/patients/42');

// This tests the modal that staff actually see
await page.goto('/patients');
await page.click('[data-testid="patient-row-42"]');
await expect(page.locator('[role="dialog"]')).toBeVisible();

Most migrated suites don't make this distinction because nobody mapped which routes converted to parallel routing during the App Router migration.
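
That mapping does not have to be manual. Intercepting routes are recognizable by folder names that start with (.), (..), or (...), so a short script can list them from the file paths under app/. This is a sketch under that assumption; findInterceptedRoutes and the sample paths are hypothetical.

```typescript
// Folder-name prefixes Next.js uses for intercepting routes: (.), (..), (...).
// (..)(..) also starts with (..), so the same pattern covers it.
// Route groups such as (marketing) do not match because they lack the dots.
const INTERCEPT_PREFIX = /^\(\.{1,3}\)/;

// Returns the file paths that contain at least one intercepting segment.
function findInterceptedRoutes(filePaths: string[]): string[] {
  return filePaths.filter((path) =>
    path.split('/').some((segment) => INTERCEPT_PREFIX.test(segment))
  );
}

// Hypothetical listing of files under app/.
const samplePaths = [
  'app/patients/page.tsx',
  'app/patients/[patientId]/page.tsx',
  'app/patients/@modal/(.)[patientId]/page.tsx',
  'app/(admin)/settings/page.tsx',
];

const needsBothNavigationPaths = findInterceptedRoutes(samplePaths);
```

Each path in needsBothNavigationPaths marks a URL that deserves two tests: one that loads it directly and one that reaches it by client-side navigation from the page it overlays.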

Why Playwright handles this better than Cypress — but still needs maintenance

Playwright has a meaningful architectural advantage for App Router apps. It auto-waits for elements to be actionable before interacting, which catches some streaming delays implicitly. page.waitForResponse() and page.waitForURL() give you precise control over async events. Playwright drives real browser engines (Chromium, Firefox, and WebKit), so streaming HTML arrives exactly as a real browser receives it.

Cypress's traditional architecture, where tests and app code share the same browser window, creates specific friction with App Router's server-driven model. The RSC payload (Content-Type: text/x-component, used during client-side RSC navigation) is React's serialized component format as implemented by Next.js, not a documented public protocol. Cypress's network interception layer wasn't designed around it. Its component testing story is solid, but the E2E layer assumes a more conventional hydration lifecycle.

That said, Playwright does not solve the maintenance problem. For every Suspense boundary introduced in a PR, someone has to update the test to wait for it. For every component moved from Client to Server, someone has to remove the hydration assertion. For every parallel route added, someone has to rewrite the navigation path.

In a codebase that ships multiple PRs a day, that debt compounds quickly.

How O2 handles App Router rendering — without a single test script

DevAssure O2 is a PR-native testing agent. It doesn't run a pre-written test suite and it doesn't generate TypeScript or Playwright scripts. It reads your PR diff, maps the blast radius of the change across your user flows, and generates tests in plain English — which it then executes against your app. When the rendering model changes, O2 adapts. You don't.

O2 maps impact before generating anything

When a PR opens, O2 traces the code change to the user flows it affects. For a medication dashboard, that means identifying which patient journeys render the changed component — the nurse medication review flow, the prescriber pre-prescription check, the pharmacist verification flow. O2 builds this map from the code, not from a manually maintained flow registry.

This matters for App Router apps specifically because the blast radius of a Suspense refactor isn't obvious from the file diff alone. Wrapping DrugInteractionAlert in a <Suspense> boundary changes how every flow that renders it behaves — the test implications ripple beyond the changed file.

Tests are generated in plain English, not scripts

O2 generates test cases as plain-English descriptions of user behaviour. For a PR that changes how the medication page renders, O2 might generate:

Flow: Nurse reviews active medications for patient with polypharmacy

1. Log in as a nurse user
2. Navigate to the patient list
3. Select patient with ID 42
4. Verify the patient detail modal opens (not the full-page layout)
5. Navigate to the Medications tab
6. Verify the medication list loads and contains the patient's active medications
7. Wait for the drug interaction check to complete; do not assert on the alert before it resolves
8. If a severe interaction is flagged, verify the alert banner is visible and contains the interaction details
9. If no interactions are found, verify the confirmed-safe state is explicitly shown

Flow: Prescriber reviews medications before adding a new prescription

1. Log in as a prescriber
2. Open patient 42's record from the worklist
3. Navigate to active medications
4. Verify each medication renders with name, dose, and route
5. Verify the drug interaction alert resolves; do not treat its absence as confirmation of safety
6. Verify the interaction state is consistent with the displayed medication list

These tests encode the rendering awareness — step 7 in the first flow above is the step that an existing Playwright suite was missing entirely. And O2 generated it by understanding that DrugInteractionAlert is now a separate Suspense boundary that resolves later than MedicationList.

O2 runs the tests, not you

O2 executes the generated tests against your PR's deployed preview environment in a real browser. You don't wire up Playwright, manage browser contexts, or handle async waiting logic. O2 handles the execution layer — including waiting for the correct Suspense boundaries to settle — and posts results directly to the PR.

Existing tests are updated, not replaced

When O2 detects that a change affects a flow that already has test coverage, it updates the existing test rather than duplicating it. If a previous run generated a medication review test that assumed DrugInteractionAlert was synchronous, and the new PR wraps it in Suspense, O2 revises the test to reflect the new rendering behavior. The test stays current with the code automatically.

Real example: the PR that introduced the Warfarin race condition

Here is a concrete walkthrough of a PR that introduced the drug interaction race condition described earlier, and how O2 responds to it.

The PR: A developer refactors the patient medication page to improve Time to First Byte. Previously, MedicationList and DrugInteractionAlert were rendered sequentially in the same Server Component, blocked on both fetches completing. The PR splits them into separate Suspense boundaries so the medication list renders immediately, without waiting for the slower interaction check.

app/patients/[patientId]/medications/page.tsx

@@ -1,11 +1,16 @@
+import { Suspense } from 'react';
 import MedicationList from './MedicationList';
 import DrugInteractionAlert from './DrugInteractionAlert';
 
 export default function MedicationsPage({ params }) {
   return (
     <main>
-      <DrugInteractionAlert patientId={params.patientId} />
-      <MedicationList patientId={params.patientId} />
+      <Suspense fallback={<AlertSkeleton />}>
+        <DrugInteractionAlert patientId={params.patientId} />
+      </Suspense>
+      <Suspense fallback={<MedListSkeleton />}>
+        <MedicationList patientId={params.patientId} />
+      </Suspense>
     </main>
   );
 }

What O2 does, automatically, on PR open:

1. Reads the diff and maps blast radius. O2 identifies that DrugInteractionAlert has been moved into a new <Suspense> boundary. Previously, the interaction alert and the medication list resolved together — any flow that rendered this page got both or neither. Now they resolve independently. O2 flags this as a rendering model change with blast radius across every flow that surfaces the medication page.

2. Identifies the affected flows. Three flows touch MedicationsPage: nurse medication review, prescriber pre-prescription check, pharmacist verification. All three now have a window where the medication list is visible but the interaction check is still running.

3. Generates plain-English tests for each flow. For the nurse medication review flow, O2 generates a test that explicitly validates that the interaction check reaches a settled state before asserting on the alert. It generates a separate test for the case where a known interaction exists in the test fixture (Warfarin + Amiodarone) to confirm the alert renders correctly, and a test for a patient with no interactions to confirm the explicit safe state appears.

4. Executes and generates a bug report. O2 runs the generated flows against the PR preview. If DrugInteractionAlert doesn't reach a settled state within the expected window, the flow fails. O2 produces a bug report — accessible from the CI run linked on the PR — documenting which flows failed, what the expected behaviour was, and what the agent observed instead. The report is available before any reviewer merges the code.

The developer didn't write a test. The QA engineer didn't update selectors. The structural risk was surfaced at PR time, not after a nurse filed a bug report.

Running O2 in your CI pipeline

O2 integrates into GitHub Actions, GitLab CI, and CircleCI. The CLI command targets a specific branch comparison or commit so execution stays tied to the actual code delta:

# Test a feature branch against main
devassure test --path ./app --head feature/medication-page-streaming --base main

# Or target a specific commit in CI
devassure test --commit ${{ github.sha }} --environment staging

O2 reads your app/ directory to build the component and routing map. No configuration file. No selector inventory. No Playwright setup. The GitHub Action from the marketplace handles CI integration end-to-end.

Closing

App Router changed more than your folder structure. It changed when components run, what reaches the browser, and how the DOM assembles over time. Suspense boundaries that improve perceived performance create testing windows that most suites never account for. Server Components that eliminate client-side JavaScript remove the hydration lifecycle your assertions depended on.

In a general application, a test that misses a streamed component is a UX bug. In a clinical application, it can be a patient safety issue — the kind that passes CI on Friday and gets filed as a support ticket on Monday morning.

The fix isn't to slow down development or maintain a growing library of Playwright scripts. It's to move test generation to where the code change actually happens: the PR. O2 reads the diff, understands the rendering model of what changed, generates tests in plain English, and executes them — without a script in sight.

Test it like it changed.

DevAssure O2 is a PR-native autonomous testing agent. It connects to GitHub, reads your PR diff, maps the blast radius across your user flows, generates plain-English test cases, and executes them in a real browser. No scripts to write. No selectors to maintain. One command in your CI pipeline.

Get started with O2 →
