Introducing DevAssure O2 | The Autonomous Testing Agent for Every Pull Request

Divya Manohar
Co-Founder and CEO, DevAssure

Software testing has changed. Your tools should too.

Two years ago, the best we could offer developers was a faster way to write test scripts. Record your actions, generate a script, maintain it when the UI changes, debug it when it breaks.

That model is reaching its limits.

In 2026, AI coding tools write 20% or more of new code at companies like Google and Microsoft. Developers ship multiple PRs per day. Release cycles have compressed from weeks to days. And the old approach — write scripts, maintain scripts, fix flaky scripts — can't keep pace with how fast code moves.

DevAssure started as a low-code test automation platform. We built a recorder, a visual test builder, self-healing locators, and test data management. Hundreds of teams used it to automate faster than they could with Selenium or Playwright alone.

But we kept hearing the same thing from our users:

"The automation is faster, but we're still spending 30–40% of our time maintaining tests."

That feedback led us to rethink the problem entirely. The result is O2 Agent — an autonomous testing agent that reads your code changes, generates targeted tests, and executes them in a real browser. No scripts. No selectors. No maintenance.

This post introduces what DevAssure O2 is, how it works, and why we built it.

What is DevAssure O2?

DevAssure O2 is an autonomous testing agent that tests your web application through the browser — the same way a human tester would — but driven by AI instead of scripts.

You write test cases in plain English. O2 reads them, navigates your application, interacts with UI elements, and validates outcomes. It finds functional bugs (features that don't work correctly) and usability bugs (UI issues that confuse or block users).

O2 is available as:

  • A CLI tool (@devassure/cli on npm) — run tests from your terminal or CI/CD pipeline
  • A VS Code extension — write and execute tests directly from your editor
  • A GitHub Action (devassure-ai/devassure-action@v1) — test every PR automatically before merge

Why we moved beyond scripts

Traditional test automation — whether Selenium, Playwright, Cypress, or our own original platform — follows a common pattern:

  1. A human writes a test script with specific selectors (CSS, XPath, test IDs)
  2. The script runs against the application
  3. The UI changes → selectors break → a human fixes the script
  4. Repeat forever

This creates a second codebase — a body of test scripts that mirrors your application but doesn't ship features, doesn't generate revenue, and requires constant maintenance.

Our data from early DevAssure customers showed:

  • 30–40% of QA team time went to maintaining existing test scripts, not writing new ones
  • ~15% of E2E tests were flaky — failing not because of real bugs, but because of stale selectors, slow page loads, or environment inconsistencies
  • Teams learned to ignore red pipelines because flaky tests eroded trust in the CI process
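
A back-of-the-envelope calculation shows why even a modest share of flaky tests erodes trust so quickly. Only the 15% flaky share comes from the figures above. The suite size (200 tests) and the 5% spurious failure rate per flaky test are illustrative assumptions, not DevAssure data, and the function name is made up for this sketch.

```python
def spurious_red_probability(total_tests: int, flaky_share: float, flake_rate: float) -> float:
    """Probability that at least one flaky test fails spuriously in a single run.

    Assumes flaky tests fail independently of each other, which is a simplification.
    """
    flaky_tests = round(total_tests * flaky_share)
    return 1 - (1 - flake_rate) ** flaky_tests


# Assumed suite: 200 E2E tests, 15% of them flaky, each flaky test
# failing spuriously in 5% of runs (hypothetical numbers).
p = spurious_red_probability(total_tests=200, flaky_share=0.15, flake_rate=0.05)
print(f"Chance of a red pipeline with no real bug: {p:.0%}")
```

Under these assumptions, close to four out of five runs go red without any real regression, which is roughly the point at which a team stops reading failures.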

The scripts themselves became the problem. Not because they were poorly written, but because scripts are the wrong abstraction for a codebase that changes continuously.

O2 takes a different approach. Instead of scripts that encode how to interact with the UI (click this selector, wait for this element, assert this text), O2 uses intent-driven testing: you describe what the user does, and the agent figures out how to execute it.

How O2 works

Step 1: Install and initialize

npm install -g @devassure/cli
devassure login
devassure init

This creates a .devassure/ folder in your project with configuration files.

Step 2: Configure your application

Tell O2 where your app lives and how to authenticate:

# .devassure/test_data.yaml
default:
  url: 'https://staging.yourapp.com'
  users:
    default:
      user_name: 'testuser@yourapp.com'
      password: 'test-password'

Describe your application so O2 understands the context:

# .devassure/app.yaml
description: >
  A project management SaaS app. Users can create projects,
  add tasks, assign team members, track progress, and generate
  reports. Key flows include signup, project creation, task
  management, team invitations, and billing.
rules:
- All projects must have a name and at least one member
- Tasks require a title and an assigned user
- Billing page should display the current plan and usage

Step 3: Write tests in plain English

# .devassure/tests/task-management/create_task.yaml
summary: Create a new task and verify it appears in the project board
steps:
- Log in with default credentials
- Navigate to the "Marketing Campaign" project
- Click "Add Task"
- Enter "Design landing page mockup" as the task title
- Set the priority to "High"
- Assign the task to "Priya Sharma"
- Set the due date to next Friday
- Click "Create"
- Verify that the task "Design landing page mockup" appears on the board
- Verify that the task shows "High" priority and is assigned to "Priya Sharma"
priority: P0
tags:
- task-management
- smoke
- regression

No selectors. No locators. No page.click('#btn-create-task-v2'). Just plain English that describes what a user does.

Step 4: Run

# Run all tests
devassure run-tests

# Run by tag
devassure run-tests --tag=smoke

# Run by priority
devassure run-tests --priority=P0

# Run a specific folder
devassure run-tests --folder=task-management
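
These filters select tests by the metadata in each test file. As a rough mental model, selection can be pictured as in the sketch below. The `CaseMeta` class and `select` function are hypothetical names for illustration and say nothing about how the CLI is implemented. The rule that combined filters must all match is an assumption of this sketch, and the second test entry is invented.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CaseMeta:
    """Hypothetical in-memory view of one test file's metadata."""
    path: str                      # e.g. "task-management/create_task.yaml"
    priority: str                  # e.g. "P0"
    tags: list[str] = field(default_factory=list)


def select(tests, tag: Optional[str] = None, priority: Optional[str] = None,
           folder: Optional[str] = None):
    """Return tests matching every filter that was given (assumed AND semantics)."""
    selected = []
    for t in tests:
        if tag is not None and tag not in t.tags:
            continue
        if priority is not None and t.priority != priority:
            continue
        if folder is not None and not t.path.startswith(folder + "/"):
            continue
        selected.append(t)
    return selected


suite = [
    CaseMeta("task-management/create_task.yaml", "P0",
             ["task-management", "smoke", "regression"]),
    CaseMeta("billing/view_plan.yaml", "P1", ["billing", "regression"]),  # invented example
]

print([t.path for t in select(suite, tag="smoke")])
print([t.path for t in select(suite, priority="P1")])
```

Tagging a test with both smoke and regression, as in the example test above, lets the same file serve a fast pre-merge run and a fuller nightly run.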

Step 5: Review results

devassure open-report --last

O2 opens a detailed report showing what passed, what failed, and session recordings of each test execution.

What makes O2 different from scripted automation

1. No selectors, no locators, no maintenance

Scripted tools require you to tell them exactly which DOM element to interact with: #checkout-btn, .cart-item >> nth=0, [data-testid="submit"]. When the UI changes — a CSS class is renamed, a component is refactored, a design system is updated — these selectors break.

O2 uses visual reasoning to identify elements. When your test says "Click the Submit button," O2 finds the Submit button the same way a human would: by looking at the rendered page. If the button moves, changes color, or gets a new CSS class, O2 still finds it, because the intent hasn't changed.

This is why O2 works on platforms that are notoriously hard to automate with traditional tools:

  • Salesforce Lightning — Shadow DOM, dynamic IDs, Aura/LWC hybrid rendering
  • Flutter Web — the entire UI is pixels inside a <canvas>, no DOM elements to select
  • Canvas-heavy applications — dashboards, charts, drag-and-drop interfaces

2. Intent-driven, not instruction-driven

A Playwright test says:

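import { test, expect } from '@playwright/test';

test('applies coupon SAVE20', async ({ page }) => {
  await page.locator('#coupon-input').fill('SAVE20');
  await page.locator('#apply-btn').click();
  // Web-first assertion: retries until the total matches or the timeout expires
  await expect(page.locator('#total')).toHaveText('$80.00');
});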

An O2 test says:

- Apply coupon code "SAVE20"
- Verify the discount is applied
- Verify the total shows the correct amount after discount

Both test the same thing. But when the coupon input's selector changes from #coupon-input to .promo-field, the Playwright test breaks. The O2 test doesn't, because "Apply coupon code" is still "Apply coupon code" regardless of the DOM structure.
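
The difference can be made concrete with a toy model. This is only an illustration of why selector-bound lookups are brittle. The `Element` class and both lookup functions are hypothetical, and matching on a visible label is a simplified stand-in for O2's visual reasoning, not a description of it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    """Toy stand-in for a rendered form control."""
    selector: str   # implementation detail, e.g. "#coupon-input"
    label: str      # what the user sees on screen


def find_by_selector(page: list[Element], selector: str) -> Optional[Element]:
    """Script-style lookup: succeeds only on an exact selector match."""
    return next((el for el in page if el.selector == selector), None)


def find_by_label(page: list[Element], words: str) -> Optional[Element]:
    """Intent-style lookup (toy): match on the text a user would read."""
    return next((el for el in page if words.lower() in el.label.lower()), None)


before = [Element("#coupon-input", "Coupon code"), Element("#apply-btn", "Apply")]
# After a refactor the selector changes, but the visible label does not.
after = [Element(".promo-field", "Coupon code"), Element("#apply-btn", "Apply")]

print(find_by_selector(before, "#coupon-input") is not None)  # True
print(find_by_selector(after, "#coupon-input") is not None)   # False
print(find_by_label(after, "coupon code") is not None)        # True
```

The selector lookup depends on a name the user never sees, so a rename breaks it. The label lookup depends only on what is rendered, which is the property intent-driven testing relies on.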

3. Finds functional and usability bugs

O2 interacts with your application the way a real user does — clicking buttons, filling forms, scrolling, navigating, and resizing viewports. This means it catches bugs that scripted tests miss:

  • A button that renders but doesn't respond to clicks (event handler not wired up)
  • A form that submits without required field validation (missing error message)
  • A layout that breaks on mobile viewports (text overflow, hidden buttons)
  • A loading state that never resolves (infinite spinner after form submission)
  • A success message that shows when the action actually failed (UI not reflecting server state)

These are the bugs your users hit on day one. O2 finds them before merge.

4. Works with TestRail

If your team manages test cases in TestRail, O2 integrates directly:

devassure run-tests --provider testrail --post-results --add-defects --attach-videos

O2 fetches test cases from your TestRail run, executes them in a real browser, posts results back, creates defects for failures with full context, and attaches session recordings. Your workflow in TestRail doesn't change — it just gets an execution engine.
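
For readers curious what "posting results back" amounts to at the API level, here is a minimal sketch of the request TestRail's public API expects for a single result. It is not DevAssure's implementation. The helper name, host, and IDs are made up. The `add_result_for_case` endpoint and the status IDs (1 for Passed, 5 for Failed) are TestRail's defaults and are worth checking against your instance, since statuses can be customized.

```python
import json

# TestRail's default status IDs (custom statuses may differ per instance)
STATUS_PASSED = 1
STATUS_FAILED = 5


def build_result_request(base_url: str, run_id: int, case_id: int,
                         passed: bool, comment: str) -> tuple[str, str]:
    """Return (url, json_body) for TestRail's add_result_for_case endpoint."""
    url = f"{base_url.rstrip('/')}/index.php?/api/v2/add_result_for_case/{run_id}/{case_id}"
    body = json.dumps({
        "status_id": STATUS_PASSED if passed else STATUS_FAILED,
        "comment": comment,
    })
    return url, body


# Hypothetical host and IDs, for illustration only.
url, body = build_result_request(
    "https://example.testrail.io", run_id=12, case_id=345,
    passed=False, comment="Task did not appear on the board after clicking Create",
)
print(url)
print(body)
```

Sending this would be an authenticated POST with a JSON content type. With the --post-results flag, O2 takes care of that step, so nothing like this needs to live in your repository.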

5. CI/CD native

For teams using GitHub Actions:

# .github/workflows/ci.yml
name: CI
on: [pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: devassure-ai/devassure-action@v1

One step. Every PR is tested automatically before merge. Results appear as a GitHub status check. O2 is also available on the GitHub Marketplace.

For other CI systems (Jenkins, CircleCI, GitLab CI, TeamCity):

devassure add-token "$DEVASSURE_TOKEN"
devassure run-tests --tag=regression --archive=./reports
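
On CI systems that prefer a script entry point, the two commands can be wrapped in a small runner. This is a sketch under stated assumptions: the token is exposed as the `DEVASSURE_TOKEN` environment variable, and the CLI signals failure with a non-zero exit code, which you should confirm for your version. The function names are illustrative.

```python
import os
import subprocess
import sys


def build_commands(token: str, tag: str = "regression", archive: str = "./reports"):
    """Build the two CLI invocations shown above as argument lists."""
    return [
        ["devassure", "add-token", token],
        ["devassure", "run-tests", f"--tag={tag}", f"--archive={archive}"],
    ]


def main() -> int:
    """Run both commands in order and return an exit code for the CI job."""
    token = os.environ.get("DEVASSURE_TOKEN")
    if not token:
        print("DEVASSURE_TOKEN is not set", file=sys.stderr)
        return 2
    for cmd in build_commands(token):
        # Assumption: a non-zero exit code means the step failed.
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return result.returncode
    return 0

# In the CI job's entry point: sys.exit(main())
```

Passing the commands as argument lists rather than a shell string avoids quoting problems if the token contains special characters.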

Who is O2 for?

For developers

You shouldn't have to write and maintain Playwright scripts alongside your production code. O2 lets you write test intent in plain English. When your code changes, the tests don't break — because there are no selectors to go stale.

Add O2 to your GitHub Actions workflow and every PR gets tested before merge. You review O2's results instead of manually clicking through features.

For QA engineers

O2 doesn't replace QA; it replaces the repetitive parts of QA. Instead of spending 30–40% of your time maintaining test scripts and debugging flaky selectors, you spend that time on:

  • Quality strategy — deciding what matters most to test and why
  • Exploratory testing — the creative, curiosity-driven testing that finds bugs no automated tool would think to look for
  • Domain expertise — understanding that a rounding error in a fintech app isn't just a bug, it's a compliance violation

Your role shifts from script maintenance to quality leadership.

For engineering leaders

O2 gives you a testing layer that scales with your team's output. As your developers adopt AI coding tools and ship more PRs, O2 keeps up — testing each PR without requiring additional QA headcount or a growing test maintenance backlog.

Key metrics from early O2 deployments:

  • Test creation time: Days → automated per PR
  • Maintenance overhead: Down 80%+
  • Production incidents from missed regressions: Reduced significantly within the first month
  • QA time reallocation: From 40% maintenance to 95% strategic quality work

Supported platforms and integrations

| Platform | How O2 works with it |
| --- | --- |
| Any web application | Browser-based testing via CLI or VS Code extension |
| GitHub Actions | One-line YAML integration, PR-native testing |
| TestRail | Fetch cases, execute, post results, log defects, attach videos |
| Salesforce | Flows and LWC tested through the UI; handles Shadow DOM natively |
| Flutter Web | Visual reasoning on canvas-rendered UI; no DOM needed |
| Jenkins / CircleCI / GitLab CI | CLI integration via devassure run-tests |
| VS Code | Write and run tests from your editor |

Security and compliance

DevAssure is SOC 2 Type II certified. We take data security seriously:

  • Test credentials are stored encrypted and never leave your environment
  • Session recordings are stored in your configured archive location
  • O2 runs in an isolated browser context per test session
  • No application data is sent to DevAssure's servers beyond what's required for AI reasoning

Read our full security policy →

Getting started

Option 1: CLI (developers)

npm install -g @devassure/cli
devassure login
devassure init
# Write your first test in .devassure/tests/
devassure run-tests

Option 2: GitHub Action (teams)

- uses: devassure-ai/devassure-action@v1

Option 3: VS Code Extension (visual workflow)

Install the DevAssure extension from the VS Code marketplace. Write tests in YAML, run them from the editor, view results inline.

Try O2 free

We're offering $50 in free credits for 30 days — no credit card required. That's enough to test your critical flows across multiple runs.

DevAssure is built in Chennai, India, by a team that spent a decade fighting the same testing problems we now solve. We're backed by Eximius Ventures and trusted by engineering teams across fintech, SaaS, and enterprise.

Questions? Email me at divya@devassure.io — I read every message.

Frequently asked questions

What is DevAssure O2?

DevAssure O2 is an autonomous testing agent that tests your web application through the browser, the same way a human tester would, but driven by AI instead of scripts. You write test cases in plain English; O2 navigates your application, interacts with UI elements, and validates outcomes.