Does this work with private GitHub repositories?

Yes. Add DEVASSURE_TOKEN as a repository secret and run devassure-ai/devassure-action@v1 in your private repo the same way as a public one. The agent runs inside your GitHub Actions runner; source stays in your environment per your secret and network policies.

Can I auto-test Next.js, React, or Vue pull requests?

Yes. The agent validates a running app (staging preview, Vercel preview URL, or local URL exposed to CI). Framework does not matter — React, Next.js, Vue, Svelte, and Rails SPAs all work as long as the PR deploy or preview is reachable from the runner.

Do I need existing Playwright or Cypress tests?

No. That is the point of autonomous PR testing: tests are generated from the diff for that PR. You can keep legacy suites in parallel during migration; O2 does not require them to exist first.

How is this different from only adding Playwright to GitHub Actions?

Playwright in Actions runs tests you already wrote and must maintain. An autonomous agent generates and executes tests from the change itself — no checked-in spec files, no selector updates when a designer moves a button.

What if my team already has a full E2E suite?

Many teams run both: legacy suite on a schedule or nightly, and an autonomous agent on every PR for change-scoped validation. That catches gaps where nobody wrote a test for the feature you just shipped.

How to Automatically Test Every Pull Request in 2026 (Without Writing a Single Test)

Divya Manohar

Co-Founder and CEO, DevAssure

Short answer

To automatically test every pull request on GitHub in 2026, add a pull_request workflow in GitHub Actions. The usual path is Playwright or Cypress plus tests you write and maintain. The alternative is an autonomous testing agent that reads the PR diff, generates E2E tests for that change, runs them, and posts a check — with zero test files in your repo.

Every team wants E2E tests on every PR. Almost none actually has them — because someone has to write those tests, fix them when the UI changes, and defend a thirty-minute CI job that still flakes.

If you are the developer opening the PR, that someone is often you, after hours, clicking re-run on a red check you do not trust.

This guide is for engineers who want automatic testing before merge without turning into the team's unpaid test maintainer. We will walk through the standard GitHub Actions setup for running tests on pull requests (Playwright and Cypress), name the real cost honestly, then cover the path where you do not write a single test and validation still runs on every PR.

Why PR testing breaks down in practice

The policy is simple: nothing merges without green checks. The reality is messier.

Flaky selectors and false reds

Playwright and Cypress are solid tools. Your tests are not flaky because the framework is bad. They are flaky because they encode today's DOM — and tomorrow's PR changes the DOM.

A renamed data-testid, a loading spinner that sometimes takes 200ms longer, a modal that moved behind a feature flag — the suite goes red. You did not break checkout. The test did.

Developers learn to ignore the check or hit Re-run jobs until it passes. That is worse than no gate.

Maintenance is a second job

E2E tests on every PR only work if the suite stays green. When product ships three UI tweaks a week, QA (or you) spend more time updating locators than writing features.

The backlog looks like this:

14 tests skipped with fixme
6 tests quarantined in a non-blocking job
1 required check everyone hates but leadership will not remove

Slow CI blocks merges

Full regression on every PR does not scale. Teams slice the problem:

Run smoke on PR, full suite nightly
Run E2E only on main
Run tests only when someone touches /e2e
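
As a rough sketch, the three slices above map onto workflow triggers like the following. Teams usually pick one; they are combined here only to show the syntax. The @smoke tag, the e2e/** path glob, and the cron time are illustrative assumptions, not values from any particular repo.

```yaml
# Sketch only: tag name, path glob, and schedule are placeholders.
name: E2E slices

on:
  pull_request:
    branches: [main]
    paths:
      - 'e2e/**'          # PR runs only when files under e2e/ change
  push:
    branches: [main]      # E2E on main after merge
  schedule:
    - cron: '0 3 * * *'   # full suite nightly at 03:00 UTC

jobs:
  e2e:
    runs-on: ubuntu-latest
    env:
      BASE_URL: ${{ secrets.STAGING_URL }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npx playwright install --with-deps

      # Smoke subset on PRs: only tests whose title or tag matches @smoke
      - if: github.event_name == 'pull_request'
        run: npx playwright test --grep @smoke

      # Everything on pushes to main and on the nightly schedule
      - if: github.event_name != 'pull_request'
        run: npx playwright test
```

Note that the paths filter applies only to the pull_request trigger it is nested under; the push and schedule triggers run regardless of which files changed.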

Each compromise means the PR you are merging was not fully validated.

That is the gap. Not missing GitHub Actions — missing sustainable coverage at PR time.

The traditional approach: GitHub Actions + Playwright or Cypress

Tool-agnostic setup guides (including strong posts from teams like Shiplight and Oneuptime) show how to wire browsers into Actions. That is worth doing if you are committed to owning a script suite.

Below is a minimal, production-shaped pattern you can paste today.

Prerequisites

A GitHub repo with pull requests enabled
A deploy preview or staging URL the runner can reach (or a service container running your app in CI)
Node.js project with Playwright or Cypress already scaffolded

Playwright: test every PR on GitHub Actions

Create .github/workflows/e2e-playwright.yml:

name: E2E on pull request

on:
  pull_request:
    branches: [main]

concurrency:
  group: e2e-${{ github.head_ref }}
  cancel-in-progress: true

jobs:
  playwright:
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Install Playwright browsers
        run: npx playwright install --with-deps

      - name: Run Playwright tests
        run: npx playwright test
        env:
          BASE_URL: ${{ secrets.STAGING_URL }}
          CI: true

      - uses: actions/upload-artifact@v4
        if: failure()
        with:
          name: playwright-report
          path: playwright-report/
          retention-days: 7

What this gives you: real browsers, artifacts on failure, PR-triggered runs. What it does not give you: tests. You still author *.spec.ts files, maintain selectors, and decide what belongs in the PR slice versus nightly.

Cypress: run tests on pull request (GitHub Actions)

name: Cypress E2E on PR

on:
  pull_request:
    branches: [main]

jobs:
  cypress:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: cypress-io/github-action@v6
        with:
          build: npm run build
          start: npm run start
          wait-on: 'http://localhost:3000'
          browser: chrome
        env:
          CYPRESS_baseUrl: http://localhost:3000

      - uses: actions/upload-artifact@v4
        if: failure()
        with:
          name: cypress-screenshots
          path: cypress/screenshots

Cypress can boot your app inside the job (start + wait-on) — useful for Next.js / React / Vue SPAs when you do not have an external preview URL. You still own cypress/e2e/** forever.
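
The same boot-in-job idea can be approximated for Playwright with plain steps. This is a sketch, assuming npm run build and npm run start exist in your package.json, that the app listens on port 3000, and that fetching the wait-on package through npx is acceptable in your CI; none of these come from the workflow above.

```yaml
# Sketch: these steps would replace the single "Run Playwright tests" step.
      - name: Build app
        run: npm run build

      - name: Start app in the background
        run: npm run start &

      - name: Wait until the app responds (fail after 60s)
        run: npx --yes wait-on http://localhost:3000 --timeout 60000

      - name: Run Playwright tests against the local app
        run: npx playwright test
        env:
          BASE_URL: http://localhost:3000
          CI: true
```

The background process ends when the job finishes, so no explicit teardown step is needed on a hosted runner.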

Making the check required

In Settings → Branches → Branch protection for main, enable Require status checks to pass before merging and select playwright or cypress (your job name). That is how testing before merge becomes policy.
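
The name you select in branch protection comes from the job: GitHub uses the job's name if one is set and falls back to the job ID otherwise. A small sketch, where the display name E2E (Playwright) is an arbitrary example rather than anything used elsewhere in this guide:

```yaml
jobs:
  playwright:                 # job ID; used as the check name when `name` is absent
    name: E2E (Playwright)    # optional display name; if set, this is what you select
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
```

Renaming the job later changes the check name, so the branch protection rule has to be updated to match. A check generally shows up in the selection list only after it has run at least once.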

For architecture background, see GitHub Actions for faster releases.

The real cost: who writes these tests?

YAML is the easy part. The expensive part is organizational.

Question	What usually happens
Who writes the first 50 E2E tests?	Whoever had time — often a senior dev or a QA hire
Who updates them when Settings moves to a drawer?	Same person, between feature work
Who triages flaky reds at 4 p.m. Friday?	Whoever merged last
Who adds coverage for the PR shipping Monday?	Frequently nobody — merge with fingers crossed

Automatically testing pull requests sounds like a CI problem. It is a headcount and attention problem.

Teams hit one of three walls:

No suite — PRs merge with unit tests only; production catches regressions.
Stale suite — tests exist but nobody trusts them; required checks get disabled.
Growing suite — coverage improves until maintenance eats the team; velocity drops.

Posts that stop at “add this workflow” skip the wall every staff engineer has already hit.

If your interview answer for “how do you test PRs?” is still “we should add Playwright,” you are describing wall #1. If your answer is “we have Playwright but it is always red,” you are on wall #2.

The useful question for 2026: what if the tests were generated from the PR itself — and nothing lived in the repo to rot?

The autonomous alternative: tests from the change, not from the repo

Autonomous testing means an agent reads the diff, decides what user-visible behavior could break, generates tests for that scope, runs them in a real browser, and posts results on the PR. No tests/e2e/checkout.spec.ts to update when marketing changes a headline.

That is different from:

Running existing tests in Actions (Playwright/Cypress guides)
Recording tests in a low-code tool (still maintenance)
Asking Copilot to write a spec (same blind spots as the code it helped write)

The agent is independent of the author. It does not share your session context or assumptions. It validates the diff the way a careful reviewer would — by exercising behavior, not by re-reading your TypeScript.

This model is what we call shift smart: keep validation at the PR, but remove the maintenance tax that shift left accidentally dumped on developers.

For a deeper product overview, see the O2 testing agent. Below is the implementation path on GitHub.

Step-by-step: automatically test PRs with DevAssure O2

DevAssure O2 is an autonomous testing agent built for GitHub workflows that test every PR. You add one Actions file and a secret; you do not add a test framework to the repo.

1. Create a DevAssure token

Once you have a DevAssure token, store it in the GitHub repository: Settings → Secrets and variables → Actions → New repository secret

Name: DEVASSURE_TOKEN
Value: your token

2. Add the workflow (real YAML)

Create .github/workflows/devassure-o2.yml:

name: DevAssure O2

on:
  pull_request:
    branches: [main]

concurrency:
  group: devassure-o2-${{ github.head_ref }}
  cancel-in-progress: true

jobs:
  test:
    runs-on: ubuntu-latest
    timeout-minutes: 45
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - uses: devassure-ai/devassure-action@v1
        env:
          DEVASSURE_TOKEN: ${{ secrets.DEVASSURE_TOKEN }}

Three lines matter for correctness:

fetch-depth: 0 — full git history so the agent can diff against main
pull_request — fires when you push to an open PR, not only on open
DEVASSURE_TOKEN — never hard-code; use the secret
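
For reference, the bare pull_request trigger behaves the same as listing its default activity types explicitly. The fragment below is only an expanded spelling of the trigger already shown; synchronize is the type that fires when new commits are pushed to the PR branch.

```yaml
on:
  pull_request:
    branches: [main]
    # These three are the defaults when `types` is omitted:
    #   opened      - the PR is created
    #   synchronize - new commits are pushed to the PR branch
    #   reopened    - a closed PR is reopened
    types: [opened, synchronize, reopened]
```

Spelling the types out is optional. It mainly helps when you later want to add another type, such as ready_for_review for draft PRs, without losing the defaults.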

Point the agent at a reachable app via repository configuration or environment variables your team already uses for preview deploys (Vercel, Netlify, custom staging). The agent exercises a running URL — same requirement as Playwright with BASE_URL.

Marketplace listing: devassure-ai/devassure-action.

3. What runs when you push

On each PR update, the agent typically:

Reads the diff — files, functions, downstream impact
Maps affected flows — checkout touched, settings probably safe
Generates tests — plain-English scenarios scoped to the change (optional YAML in .devassure/tests/ if you want explicit control)
Executes in headless Chrome — semantic element resolution, not a brittle selector file in your repo
Posts a GitHub check — pass/fail, failure detail, session replay links

Coverage is change-scoped, which is why PR feedback often lands faster than “run all 400 tests.”

For a longer walkthrough with screenshots, see How to set up vibe testing on every pull request.

4. Require the check (optional but recommended)

In branch protection, require the DevAssure O2 status check. Testing before merge is now enforced the same way you would enforce Playwright, except nobody is on the hook to fix getByRole('button', { name: /Submit/i }) when the label changes to Continue.

5. Optional: test before push

Install the Invisible (QA) Agent in VS Code or Cursor, or run npm i -g @devassure/cli. Same agent, local feedback, same logic as CI.

What happens on a failing PR

When the agent finds a real regression, the developer sees a failed required check on the PR — same mental model as a broken unit test or Playwright run.

Typical experience in GitHub:

Checks tab: DevAssure O2 — failed, with a link to logs
Summary: how many scenarios ran, which flow failed, plain-language step that did not pass
Artifacts / session: screenshot or replay of the browser state at failure (so you debug behavior, not a stack trace from line 42 of a spec you did not write)

Example failure you might see annotated on the PR:

DevAssure O2 — 26 passed, 1 failed

✗ checkout.apply_promo_on_retry
  Expected: discount visible after payment retry
  Observed: total unchanged after retry path

You fix the product bug, push again, the agent re-runs on the new diff. You do not “update the test” because there is no permanent test file for that promo edge case — the agent regenerates from the new code.

That loop is what makes E2E tests on every PR viable for small teams: the cost of adding coverage for a new edge case is not “create a Jira for QA.”

Contrast this with the typical Playwright failure mode: a selector timeout at line 88 of tests/checkout.spec.ts, with no clear signal whether the app or the test is wrong.

Playwright/Cypress vs autonomous PR testing

	Playwright/Cypress on PR	Autonomous agent (O2)
Tests in repo	Yes — you own every file	No — generated per PR
Maintenance on UI change	High — locators break	Low — semantic execution
Who authors coverage	Your team	Agent from diff
Setup complexity	Framework + browsers + specs	Workflow + secret
Best when	You need pixel-perfect custom specs	You need reliable PR gates fast

Many teams run both during migration: agent on every PR, legacy suite nightly.
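
One way to set up the nightly half of that split is a scheduled workflow for the legacy suite. This is a sketch: the workflow name, the cron time, and the reuse of the STAGING_URL secret are assumptions to adapt.

```yaml
# Sketch: legacy Playwright suite moved off the PR path. Cron time is a placeholder.
name: Legacy E2E (nightly)

on:
  schedule:
    - cron: '0 2 * * *'   # 02:00 UTC every day
  workflow_dispatch:      # also allow manual runs from the Actions tab

jobs:
  playwright:
    runs-on: ubuntu-latest
    timeout-minutes: 60
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci
      - run: npx playwright install --with-deps
      - run: npx playwright test
        env:
          BASE_URL: ${{ secrets.STAGING_URL }}
          CI: true
```

Scheduled workflows run against the default branch, so this job validates what has already merged rather than any open PR.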

Related guides

Why vibe-coded PRs break production — PR-stage testing when AI wrote the diff
Quiet death of the test script — why suites stop scaling
How to test Cursor-generated code — independent agent on AI-written diffs
Selenium alternatives in 2026 — escaping script maintenance
DevAssure O2 on GitHub Marketplace

Frequently asked questions

Does autonomous PR testing work with monorepos?

Yes. The agent reads the PR diff, so only packages and paths touched in that change are in scope. In a Turborepo or Nx monorepo, a change limited to apps/web still maps to web flows, so you are not forced to run the entire company-wide suite on every PR.