Testing Reference

Diátaxis: How-to + Reference

Sources: docs/archive/2026-Q2/plans/e2e-test-plan.md, docs/specs/2026-04-20-phase-e-multi-role-e2e-spec.md, docs/research/2026-04-19-multi-role-e2e-deep-dive.md, CLAUDE.md

Test Pyramid

Layer	Tool	Count (current)	Scope
Mobile unit	Jest	354	Components, hooks, utils
API unit	Jest	195	Services, repositories, helpers
API integration E2E	Jest + Supertest	45	HTTP routes, DB, auth
Browser E2E	Playwright	122	Web app, critical user flows. `webServer` auto-boots `@ideony/api` for local runs (TEST_ENV=local); staging/prod skip and point `API_BASE_URL` at the deployed host.
Mobile E2E	Maestro	Phase E	iOS/Android orchestrated flows
Multi-role	Maestro + Playwright + Node.js orchestrator	Phase E	2-actor concurrent scenarios
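
The local-versus-deployed split in the Browser E2E row can be pictured as a small decision function. This is a sketch, not the project's actual Playwright config: `resolveTarget`, the boot command, the local URL, and the choice to treat an unset `TEST_ENV` as local are all assumptions. Only `TEST_ENV`, `API_BASE_URL`, and the `webServer` option names (`command`, `url`, `reuseExistingServer`) come from the table or from Playwright itself.

```typescript
// Shape mirrors a subset of Playwright's `webServer` option.
interface WebServerConfig {
  command: string;
  url: string;
  reuseExistingServer: boolean;
}

interface ResolvedTarget {
  webServer?: WebServerConfig; // present only when Playwright boots the API itself
  apiBaseUrl: string;
}

// Hypothetical helper: local runs boot the API, deployed runs only point at it.
function resolveTarget(env: Record<string, string | undefined>): ResolvedTarget {
  const testEnv = env.TEST_ENV ?? "local"; // assumption: unset means local
  if (testEnv === "local") {
    const localUrl = "http://localhost:3000"; // assumed local API address
    return {
      webServer: {
        command: "pnpm --filter @ideony/api start", // assumed script name
        url: localUrl,
        reuseExistingServer: true,
      },
      apiBaseUrl: localUrl,
    };
  }
  const apiBaseUrl = env.API_BASE_URL;
  if (!apiBaseUrl) {
    throw new Error(`API_BASE_URL is required when TEST_ENV=${testEnv}`);
  }
  return { apiBaseUrl }; // no webServer: staging/prod skip the auto-boot
}
```

Failing loudly when `API_BASE_URL` is missing keeps a staging run from silently falling back to localhost.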

Commands

# All tests
pnpm test

# API unit (with coverage)
pnpm --filter @ideony/api test:cov

# API integration E2E
pnpm --filter @ideony/api test:e2e

# Smoke tests against deployed env
SMOKE_TEST_URL=https://api.ideony.is-a.dev pnpm --filter @ideony/api smoke

# Mobile unit
pnpm --filter @ideony/mobile test

# Demo walkthroughs (M1, track C — 3 specs, <2 min total)
pnpm --filter @ideony/e2e test:demo

# Full browser E2E (Playwright, scenario-01..06 + smoke)
pnpm --filter @ideony/e2e test:web

# Multi-role harness (M4 track B — Phase E §4 scenarios driven by Node.js orchestrator)
# Requires ENABLE_TEST_ENDPOINTS=true + TEST_HARNESS_SECRET on the API.
pnpm --filter @ideony/e2e test:multi-role

# Just the S1 scenario (consumer booking ↔ pro API transitions)
pnpm --filter @ideony/e2e test:s1

# API-level E2E (Supertest-style Playwright against local API)
pnpm --filter @ideony/e2e test:api

# Post-CD smoke against deployed URL (M2, track A — wired into scripts/deploy.sh)
PROD_FE_URL=https://app.ideony.is-a.dev pnpm --filter @ideony/e2e test:prod-smoke

Why @ideony/e2e has no default test script: e2e runners are explicit by design. With a default script, pre-push and turbo test would boot Expo + Postgres + Redis just to reach the first assertion, doubling push time. Explicit scripts (test:demo, test:api, test:web) keep intent clear and CI cadence predictable (see E2E strategy spec §4).

The test:demo script runs e2e/web/demo/*.spec.ts against the local Expo web build. It auto-boots Expo on port 8081 and requires CLERK_SECRET_KEY + E2E_CONSUMER_EMAIL in e2e/.env.local (see setup/global.setup.ts for the full variable list).
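
A prerequisite check for the two variables named above could look like the following. This is a minimal sketch: `missingEnv` and `DEMO_REQUIRED` are hypothetical names, and the real list in setup/global.setup.ts is longer than the two entries shown here.

```typescript
// Only the two variables named in this section; the full list lives elsewhere.
const DEMO_REQUIRED = ["CLERK_SECRET_KEY", "E2E_CONSUMER_EMAIL"] as const;

// Returns the names of required variables that are unset, empty, or whitespace-only.
function missingEnv(
  required: readonly string[],
  env: Record<string, string | undefined>,
): string[] {
  return required.filter((name) => !env[name]?.trim());
}
```

A setup file might call `missingEnv(DEMO_REQUIRED, process.env)` and fail or skip with the returned names in the message, so a missing key is reported before Expo boots.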

Lint scope

Biome lints source and config under apps/** and packages/**. Generated code under packages/api-client/src/generated/ is excluded: it is produced by the @hey-api/openapi-ts pipeline and regenerated on every pnpm generate:sdk, so linting it would only add noise on each SDK refresh. Non-generated code in packages/api-client (e.g. the barrel file or custom helpers) is still linted.

Recent test-shape changes

2026-04-21 (B20) — test/app/(consumer)/home.test.tsx CategoryChips suite rewritten: chips now navigate directly on tap (router.push('/results?category=<slug>')) instead of acting as a toggle-filter against the prompt+send flow. Old 3-case toggle/deselect matrix replaced with 2-case direct-nav matrix.
2026-04-21 (B20) — test/app/results.test.tsx empty-state assertion updated. useLocalSearchParams mock sets category: "plumbing", so the EmptyState subtitle resolves to common.no_category_results (the per-category copy A6 introduced in apps/mobile/app/results.tsx). Non-category navigations still fall back to common.try_different_search — no production behavior change, test-only alignment.
2026-04-21 (B23) — apps/api/test/unit/job-parser.graph.spec.ts two LLM-mock tests now use keyword-free inputs ("request xyz" instead of "Ho un tubo che perde..."). The new keyword pre-classifier short-circuits at confidence ≥ 0.9 before the LLM node runs, so tests that assert on LLM-mock output must avoid trigger words. Fallback path behavior is still covered in the third test.
2026-04-21 (M4 S1 dry-run) — five consecutive runs landed the multi-role harness setup contract. Fixes on main: e2e/global-setup.ts (clerkSetup() bootstrap), globalSetup + dotenv pre-load in playwright.multi-role.config.ts, +clerk_test subaddress emails + EXPO_PUBLIC_API_URL=http://localhost:3000 in e2e/.env.local, and new e2e/scripts/seed-clerk-users.ts (autonomous Clerk BAPI user seed — no dashboard). Open delta: first-login /tour redirect still blocks S1 final assertion — tracked in docs/plans/2026-04-21-m4-s1-diagnostic.md.
2026-04-21 (M4 S5) — e2e/web/multi-role/s5-credentials.spec.ts refactored: multi-role scenario covering pro credential submission (P_IVA) + admin approval + trust score update. Requires ENABLE_TEST_ENDPOINTS=true, TEST_HARNESS_SECRET, CLERK_SECRET_KEY, E2E_CONSUMER_EMAIL, E2E_PRO_EMAIL; skips otherwise. Admin seed helper absent → test.fail(true, reason) documents gap without blocking spec parse. e2e/multi-role/helpers/credentials.ts adds waitForCredentialStatus polling helper (500ms interval, 15s ceiling). e2e/package.json gains @types/node + @types/pg devDependencies.
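
The polling behavior described for waitForCredentialStatus (500ms interval, 15s ceiling) can be sketched generically. This is not the helper from e2e/multi-role/helpers/credentials.ts: `waitForStatus` is a hypothetical name, and the status source is injected as a function so the sketch assumes no real endpoint or status values.

```typescript
// Polls `fetchStatus` until it returns `expected` or the ceiling is reached.
async function waitForStatus<T>(
  fetchStatus: () => Promise<T>,
  expected: T,
  { intervalMs = 500, timeoutMs = 15_000 } = {},
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const current = await fetchStatus();
    if (current === expected) return current;
    // Stop if the next poll would start after the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(
        `Timed out after ${timeoutMs}ms; last status: ${String(current)}`,
      );
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Including the last observed status in the timeout error makes a stuck scenario easier to diagnose than a bare timeout.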

Test Conventions

Behavior, not implementation. Tests verify what the code does, not how it does it. Refactoring should not break passing tests.
Edges and errors, not just happy path. Empty inputs, boundaries, malformed data, missing files, network failures.
Mock boundaries only. Mock network, filesystem, time, uncontrolled external services. Never mock internal business logic.
TDD order. RED (write failing test) → GREEN (implement) → REFACTOR. No impl before test.
Coverage target. 80%+ per package.
Fail fast. --bail=1 default for multi-role orchestrator runs.
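
The "mock boundaries only" and "edges and errors" conventions can be illustrated with time as the boundary. The names below (`Clock`, `isBookingExpired`, `fixedClock`) are hypothetical and not taken from the codebase; the point is that the clock is injected and replaced in the test, while the business rule itself runs unmocked.

```typescript
// The boundary: anything that reads real time goes through this type.
type Clock = () => Date;

// Business logic under test. It is never mocked, only handed a clock.
function isBookingExpired(expiresAt: Date, now: Clock = () => new Date()): boolean {
  return now().getTime() >= expiresAt.getTime();
}

// Test side: a fixed clock stands in for real time.
const fixedClock = (iso: string): Clock => () => new Date(iso);

const expiresAt = new Date("2026-04-21T12:00:00Z");
// One second before expiry: not expired.
console.log(isBookingExpired(expiresAt, fixedClock("2026-04-21T11:59:59Z"))); // false
// Exactly at expiry: the boundary case counts as expired.
console.log(isBookingExpired(expiresAt, fixedClock("2026-04-21T12:00:00Z"))); // true
```

Because the assertion targets the returned boolean rather than how the comparison is implemented, the function can be refactored without touching the test.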

E2E strategy + coverage

Canonical coverage map + milestones + env matrix → docs/specs/2026-04-21-e2e-strategy.md. Rationale / alternatives → ADR 0026. Multi-role harness architecture (TestModule, routes, 6 canonical scenarios S1–S6) → Phase E spec (superseded-by-scope; §3 + §4 remain authoritative). SOTA prior art (Uber/Airbnb/DoorDash/Bolt/Glovo patterns) → docs/research/2026-04-19-multi-role-e2e-deep-dive.md.

Clerk BAPI — Test User Creation

Use the Clerk Backend API (BAPI), not the Frontend API (FAPI), to create E2E test users; this avoids FAPI rate limits:

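// Create test user via Clerk BAPI
import { createClerkClient } from '@clerk/backend'

const clerkClient = createClerkClient({ secretKey: process.env.CLERK_SECRET_KEY })

const user = await clerkClient.users.createUser({
  emailAddress: ['test+<uuid>@ideony.test'], // substitute a fresh UUID per run
  password: process.env.E2E_TEST_PASSWORD,
  firstName: 'Test',
  lastName: 'User',
})
// The original snippet never defined `session`. Assumption: it is created for the new
// user via sessions.createSession, which needs an SDK version and instance that support it.
const session = await clerkClient.sessions.createSession({ userId: user.id })
// Create session JWT ('jwt' is the name of a JWT template configured in Clerk)
const token = await clerkClient.sessions.getToken(session.id, 'jwt')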
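
The `test+<uuid>@ideony.test` pattern above needs a fresh UUID per user so parallel runs do not collide. A minimal sketch of a generator follows; `uniqueTestEmail` is a hypothetical helper, not part of the repository, and it relies only on Node's built-in `randomUUID`.

```typescript
import { randomUUID } from "node:crypto";

// Builds an address of the form <localPart>+<uuid>@<domain>.
function uniqueTestEmail(localPart = "test", domain = "ideony.test"): string {
  return `${localPart}+${randomUUID()}@${domain}`;
}
```

The result can be passed straight into the `emailAddress` array of `createUser`.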

Selector Conventions (updated R9 #175)

Preferred locator priority for e2e/web/ specs:

1. getByTestId(id): maps to data-testid on Expo web
2. getByRole(role, { name }): ARIA semantics
3. getByPlaceholder / getByText: last resort

Home screen testID inventory (added during R9):

Component	`testID`	Purpose
`HomeTopBar`	`home-top-bar`	Root bar container
`PromptCard`	`prompt-input`	Prompt `TextInput`
`CategoryChips`	`category-chips`	Scroll container
`CategoryChips` chip	`category-chip-{slug}`	Individual chip (e.g. `category-chip-plumbing`)
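
To keep specs from hard-coding these strings, the inventory can be mirrored in a small constants module. This is a sketch under assumptions: `HOME_TEST_IDS` and `categoryChipTestId` are hypothetical names, and the slug check assumes lowercase kebab-case slugs, which the table's `plumbing` example suggests but does not confirm.

```typescript
// Static testIDs copied from the inventory table.
const HOME_TEST_IDS = {
  topBar: "home-top-bar",
  promptInput: "prompt-input",
  categoryChips: "category-chips",
} as const;

// Builds the per-chip testID, e.g. "category-chip-plumbing".
function categoryChipTestId(slug: string): string {
  // Assumption: slugs are lowercase kebab-case; reject anything else early.
  if (!/^[a-z0-9-]+$/.test(slug)) {
    throw new Error(`Unexpected category slug: ${slug}`);
  }
  return `category-chip-${slug}`;
}
```

In a Playwright spec the result would be passed to `page.getByTestId(categoryChipTestId("plumbing"))`, in line with the first locator priority above.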

CI gating

Cadences + scopes live in the E2E strategy spec §4. Pre-merge runs on every branch (lint, typecheck, unit, API integration, Playwright smoke); post-CD runs a subset against the deployed URL; nightly runs the full suite (Playwright + Maestro iOS + Maestro Android + multi-role) on Hetzner cron post-MVP0.

GitHub Actions provides postgres + redis service containers for CI unit + E2E runs when billing is enabled; while billing is off, the local gate scripts/verify-local.sh stands in for it.