You build the app, tap through it, find a bug, fix it, tap through it again. How many times have you done that loop?
Manual QA is slow, but that's not the real problem. People naturally test the happy path. Does login work? Does the button respond? Does the screen load? You spend all your time confirming what already works.
But where do bugs actually happen? When the network drops. When someone enters an empty string. When they double-tap a button. Bugs live in the failure cases. There are too many scenarios to reproduce manually.
This post covers how to automate mobile QA with ADB — and the key ingredient that makes it actually work: Spec-First.
| | Manual QA | Spec-First + ADB |
|---|---|---|
| Loop | Slow: build → install → tap → check | Instant: run the script once, verify everything |
| Scope | Happy path only; confirms what works, misses what doesn't | Full error case coverage; every defined scenario, no exceptions |
| Blind spots | Network, permissions, concurrency all missed | Code never forgets: no lazy skips, no missed edge cases |
| Error coverage | ~30% | 90%+ |
ADB (Android Debug Bridge) lets you control an Android device with code. Connect via USB or Wi-Fi to an emulator, and code can operate the app instead of human hands.
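One practical note before the commands: with more than one device or emulator attached, bare `adb` commands fail with "more than one device/emulator". A tiny wrapper that pins a serial keeps scripts deterministic (a sketch; `emulator-5554` is a placeholder, find yours with `adb devices`):

```shell
# Pin every adb call to one device so scripts never hit the wrong target.
# SERIAL is a placeholder; list attached devices with `adb devices`.
SERIAL="emulator-5554"

# adb_t -- run adb against the pinned device
adb_t() {
  adb -s "$SERIAL" "$@"
}

# Usage:
#   adb_t shell input tap 540 960
#   adb_t exec-out screencap -p > shot.png
```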
The essential commands:
```shell
# Launch app
adb shell am start -n com.example.app/.MainActivity

# Force quit
adb shell am force-stop com.example.app

# Clear app data (back to logged-out state)
adb shell pm clear com.example.app

# Tap (x=540, y=960)
adb shell input tap 540 960

# Swipe left
adb shell input swipe 800 500 300 500

# Type text
adb shell input text "hello@test.com"

# Back button (keycode 4 = KEYCODE_BACK)
adb shell input keyevent 4

# Take screenshot
adb exec-out screencap -p > screenshot.png

# Dump UI structure of current screen
adb shell uiautomator dump /sdcard/ui.xml
adb pull /sdcard/ui.xml

# Check app logs
adb logcat -v time | grep "com.example.app"

# Turn off Wi-Fi (test network error)
adb shell svc wifi disable

# Turn Wi-Fi back on
adb shell svc wifi enable

# Revoke permission (test permission denial)
adb shell pm revoke com.example.app android.permission.CAMERA
```
Combine these and you can replicate any manual QA action in a script. Taps, text input, swipes, screenshots, network drops — all of it.
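When you chain these into scripts, you need to wait for the UI between steps, and fixed sleeps are flaky: too short and the check races the screen, too long and the suite crawls. A polling helper that greps the UI dump until the expected text appears is more robust (a sketch; the dump path and default timeout are arbitrary choices):

```shell
# wait_for_text <text> <timeout-seconds>
# Polls the current screen's UI dump until <text> appears or the timeout hits.
# Returns 0 when found, 1 on timeout.
wait_for_text() {
  local text="$1" timeout="${2:-10}" elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    adb shell uiautomator dump /sdcard/ui.xml >/dev/null 2>&1
    adb pull /sdcard/ui.xml . >/dev/null 2>&1
    if grep -q "$text" ui.xml 2>/dev/null; then
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 1
}

# Usage: instead of a blind `sleep 3` after tapping login
#   wait_for_text "home_screen" 5 || echo "FAIL: home screen not reached"
```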
But here's the thing.
Knowing ADB gets you automation. But without knowing what to test, it's useless.
The common mistake: you write ADB scripts and only run happy path tests. Login success, screen transition success, data load success. You confirm what works and call it "QA automation done."
That's just manual QA written in code.
The real value of QA automation is exhaustively verifying failure cases. The stuff people skip because it's tedious. The edge cases they forget. The scenarios too numerous to try manually. Code needs to do those. That's what makes automation worth it.
The bottleneck isn't the tool. It's test design. If you don't define what to test first, no tool will save you.
*QA coverage comparison: the tool doesn't determine QA quality. Design does.*
Spec-First is simple. Write the feature spec before the code. And in that spec, define not just the happy path — define every error case too.
Start with the scenario where everything works as expected.
```markdown
## Feature: Login

### Happy Path
1. Launch the app
2. Enter a valid email in the email field
3. Enter the correct password in the password field
4. Tap the "Login" button
5. Home screen appears (within 3 seconds)
```
Everyone does this part. The difference is what comes next.
Categorize error cases systematically and you won't miss anything. Mobile app errors fall into 7 buckets.
| Category | Description | Examples |
|---|---|---|
| Input | Invalid format, empty value, extreme value | Empty email, special chars only, 256+ chars |
| Network | Disconnected, timeout, slow response | Wi-Fi off, 3G conditions, dropped mid-request |
| Permissions | System permission denied | Camera denied, location denied |
| State | App lifecycle related | Background → foreground, screen rotation |
| Concurrency | Duplicate actions, timing issues | Double-tap login, back button during loading |
| Auth | Session/token related | Token expired, logged out from another device |
| Data | Unexpected server response | Empty array, null field, unexpected format |
Apply this taxonomy to each feature and you get a table like this.
| ID | Category | Scenario | Input/Condition | Expected behavior |
|---|---|---|---|---|
| E1 | Input | Empty email | "" | "Please enter your email" inline error |
| E2 | Input | Invalid email format | "abc" | "Please enter a valid email address" |
| E3 | Input | Empty password | "" | "Please enter your password" |
| E4 | Input | Wrong password | Valid email + wrong password | "Incorrect password. Please try again." |
| E5 | Network | Wi-Fi off | Valid input | "Check your internet connection" + retry button |
| E6 | Network | Server timeout | Valid input | "Server not responding" + retry button |
| E7 | Concurrency | Double-tap login | Tap twice fast | Only 1 request fires, button disabled |
| E8 | Auth | Account locked | 5 failed attempts | "Account locked. Reset your password." |
| E9 | State | Background during loading | Home button while login request is in-flight | Result applied correctly on return |
| E10 | State | Screen rotation during loading | Rotate during login request | No crash, state preserved |
One spec row = one automated test. Define 10 error cases, get 10 automated tests. Anything not defined doesn't get tested. That's why this table is the heart of QA.
With an error definition table, writing ADB scripts is mechanical. And mechanical work is exactly what AI is good at.
```shell
#!/bin/bash
# happy_path_login.sh

echo "=== Login Happy Path ==="

# Clear app data & launch
adb shell pm clear com.example.app
adb shell am start -n com.example.app/.LoginActivity
sleep 2

# Enter email
adb shell input tap 540 400   # tap email field
adb shell input text "user@test.com"

# Enter password
adb shell input tap 540 550   # tap password field
adb shell input text "Password123"

# Tap login button
adb shell input tap 540 700
sleep 3

# Verify: check if home screen is showing
adb shell uiautomator dump /sdcard/ui.xml
adb pull /sdcard/ui.xml
if grep -q "home_screen" ui.xml; then
  echo "PASS: Home screen reached"
else
  echo "FAIL: Home screen not reached"
  adb exec-out screencap -p > fail_happy_path.png
fi
```
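One gotcha in scripts like this: `input text` rejects spaces and some shell metacharacters. It does decode `%s` back to a space, so substituting before sending sidesteps the most common failure (a sketch; other special characters may still need escaping):

```shell
# type_text <string>
# `adb shell input text` fails on spaces; it decodes %s back to a space,
# so substitute spaces before sending.
type_text() {
  local encoded="${1// /%s}"
  adb shell input text "$encoded"
}

# Usage:
#   type_text "John Smith"   # sends John%sSmith, device types "John Smith"
```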
E1 (empty email) and E5 (network off) from the spec table, converted to ADB:
```shell
#!/bin/bash
# error_cases_login.sh

echo "=== E1: Empty email ==="
adb shell pm clear com.example.app
adb shell am start -n com.example.app/.LoginActivity
sleep 2

# Enter password only, skip email
adb shell input tap 540 550
adb shell input text "Password123"
adb shell input tap 540 700   # login button
sleep 1

# Verify: check inline error message
adb shell uiautomator dump /sdcard/ui.xml
adb pull /sdcard/ui.xml
if grep -q "Please enter your email" ui.xml; then
  echo "E1 PASS"
else
  echo "E1 FAIL"
  adb exec-out screencap -p > fail_E1.png
fi

echo ""
echo "=== E5: Login with Wi-Fi off ==="
adb shell pm clear com.example.app
adb shell am start -n com.example.app/.LoginActivity
sleep 2

# Turn off Wi-Fi
adb shell svc wifi disable
sleep 1

# Enter valid credentials and try login
adb shell input tap 540 400
adb shell input text "user@test.com"
adb shell input tap 540 550
adb shell input text "Password123"
adb shell input tap 540 700
sleep 2

# Verify: check network error message
adb shell uiautomator dump /sdcard/ui.xml
adb pull /sdcard/ui.xml
if grep -q "internet connection" ui.xml; then
  echo "E5 PASS"
else
  echo "E5 FAIL"
  adb exec-out screencap -p > fail_E5.png
fi

# Restore Wi-Fi
adb shell svc wifi enable
```
See the pattern? Each row in the spec table becomes one test block. The Input/Condition column maps to ADB commands. The Expected behavior column maps to verification logic.
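The blocks also repeat the same reset-dump-grep-screenshot boilerplate. Factoring it into helpers keeps a 10-case suite readable (a sketch; the package name, activity, and coordinates are the same hypothetical values used above):

```shell
#!/bin/bash
# login_suite_helpers.sh -- shared plumbing for the per-case test blocks

PASS=0
FAIL=0

# fresh_start -- wipe app data and relaunch at the login screen
fresh_start() {
  adb shell pm clear com.example.app
  adb shell am start -n com.example.app/.LoginActivity
  sleep 2
}

# verify <case-id> <expected-text>
# Dump the UI, grep for the expected string, tally the result,
# and save a screenshot on failure.
verify() {
  local id="$1" expected="$2"
  adb shell uiautomator dump /sdcard/ui.xml >/dev/null 2>&1
  adb pull /sdcard/ui.xml . >/dev/null 2>&1
  if grep -q "$expected" ui.xml 2>/dev/null; then
    echo "$id PASS"
    PASS=$((PASS + 1))
  else
    echo "$id FAIL"
    FAIL=$((FAIL + 1))
    adb exec-out screencap -p > "fail_${id}.png"
  fi
}

# Usage -- E1 expressed with the helpers:
#   fresh_start
#   adb shell input tap 540 550
#   adb shell input text "Password123"
#   adb shell input tap 540 700
#   sleep 1
#   verify E1 "Please enter your email"
#   echo "Result: $PASS passed, $FAIL failed"
```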
Hand this to AI. Give it the spec table and say "convert these error cases to ADB scripts." Whether it's 10 cases or 50, it generates them mechanically.
Spec → ADB conversion: feature spec table (error ID, scenario, expected behavior) → AI conversion (mechanical 1:1 mapping) → ADB script (input simulation + verification). One spec row = one test.
The important thing is how you ask.
```
Convert these error cases from the feature spec into ADB scripts.
- Each error case gets its own independent test block
- Clear app data before each test (adb shell pm clear)
- Verify using uiautomator dump to check UI text
- Save screenshot on failure
- Always restore Wi-Fi after network tests
```
Precise spec → precise scripts. Spec quality determines test quality.
No physical device needed. Running on an emulator has extra benefits.
```shell
# Run emulator headless (no screen)
emulator -avd Pixel_8 -no-window -no-audio &

# Wait until the emulator is ready
adb wait-for-device

# Boot directly into a saved snapshot instead of a cold boot
emulator -avd Pixel_8 -snapshot logged_in_state
```
The snapshot feature is especially powerful. Save a "logged in" state as a snapshot once, and there's no need to repeat the login flow before every test: restore to exactly where you need to be in seconds and start testing immediately.
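Snapshots can also be saved from a script. `adb emu` forwards emulator console commands, so something like this should work on a recent emulator (a sketch; the snapshot name is arbitrary):

```shell
# One-time setup: boot the emulator, drive the app into the desired state
# (e.g. complete the login flow), then save it under a name
adb emu avd snapshot save logged_in_state

# Every later run: boot straight into that state, headless
emulator -avd Pixel_8 -snapshot logged_in_state -no-window -no-audio &
adb wait-for-device
```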
The bottom line:
Design determines QA quality, not tools.
ADB is a remote control. It doesn't matter how skillfully you use the remote if you don't know which buttons to press. The feature spec is the document that defines what to press.
The Spec-First QA pipeline, end to end:

1. **Write the feature spec.** Define the happy path.
2. **Categorize errors.** Break them down systematically across the 7 categories.
3. **Build the error definition table.** Scenario + input + expected behavior.
4. **Generate ADB scripts with AI.** The spec table converts mechanically, one row per test.
5. **Run on the emulator.** Headless mode + snapshots.

The result: 90%+ error coverage.
The key is exhaustively defining error cases and not missing any. Input boundaries, network failures, permission denials, state changes, concurrency issues — list them systematically, and 90% of what breaks in production is somewhere in that list.
The other 10% is rare timing bugs or device-specific issues. Hard to catch with any methodology upfront. But when you can automatically catch 90%, you finally have the headroom to focus on that 10%.
Define failures first. Verify with ADB. The tools are already there.
If you want to keep exploring this kind of thing together, sign up. I'll keep sharing what I find by actually trying it. 😊