Testing Reference
Diátaxis: How-to + Reference
Sources: docs/archive/2026-Q2/plans/e2e-test-plan.md, docs/specs/2026-04-20-phase-e-multi-role-e2e-spec.md, docs/research/2026-04-19-multi-role-e2e-deep-dive.md, CLAUDE.md
Test Pyramid
| Layer | Tool | Count (current) | Scope |
|---|---|---|---|
| Mobile unit | Jest | 354 | Components, hooks, utils |
| API unit | Jest | 195 | Services, repositories, helpers |
| API integration E2E | Jest + Supertest | 45 | HTTP routes, DB, auth |
| Browser E2E | Playwright | 122 | Web app, critical user flows. webServer auto-boots @ideony/api for local runs (TEST_ENV=local); staging/prod skip and point API_BASE_URL at the deployed host. |
| Mobile E2E | Maestro | Phase E | iOS/Android orchestrated flows |
| Multi-role | Maestro + Playwright + Node.js orchestrator | Phase E | 2-actor concurrent scenarios |
Commands
    # All tests
    pnpm test

    # API unit (with coverage)
    pnpm --filter @ideony/api test:cov

    # API integration E2E
    pnpm --filter @ideony/api test:e2e

    # Smoke tests against deployed env
    SMOKE_TEST_URL=https://api.ideony.is-a.dev pnpm --filter @ideony/api smoke

    # Mobile unit
    pnpm --filter @ideony/mobile test

    # Demo walkthroughs (M1, track C; 3 specs, <2 min total)
    pnpm --filter @ideony/e2e test:demo

    # Full browser E2E (Playwright, scenario-01..06 + smoke)
    pnpm --filter @ideony/e2e test:web

    # Multi-role harness (M4 track B: Phase E §4 scenarios driven by Node.js orchestrator)
    # Requires ENABLE_TEST_ENDPOINTS=true + TEST_HARNESS_SECRET on the API.
    pnpm --filter @ideony/e2e test:multi-role

    # Just the S1 scenario (consumer booking ↔ pro API transitions)
    pnpm --filter @ideony/e2e test:s1

    # API-level E2E (Supertest-style Playwright against local API)
    pnpm --filter @ideony/e2e test:api

    # Post-CD smoke against deployed URL (M2, track A; wired into scripts/deploy.sh)
    PROD_FE_URL=https://app.ideony.is-a.dev pnpm --filter @ideony/e2e test:prod-smoke

Why @ideony/e2e has no default test script: e2e runners are explicit by design. Otherwise, pre-push / turbo test would boot Expo + Postgres + Redis just to reach the first assertion, doubling push time. Explicit scripts (test:demo, test:api, test:web) keep intent clear and CI cadence predictable (see E2E strategy spec §4).
The test:demo script runs e2e/web/demo/*.spec.ts against the local Expo web build. It auto-boots Expo on port 8081 and requires CLERK_SECRET_KEY + E2E_CONSUMER_EMAIL in e2e/.env.local (see setup/global.setup.ts for the full variable list).
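A check for the required variables could be sketched as a small preflight helper. This is only an illustration, not the contents of setup/global.setup.ts: `missingEnv` and `DEMO_REQUIRED` are hypothetical names, and only the two variables named above are listed.

```typescript
// Hypothetical preflight check; not taken from setup/global.setup.ts.
// Only the two variables named in this section are listed here.
const DEMO_REQUIRED = ["CLERK_SECRET_KEY", "E2E_CONSUMER_EMAIL"] as const;

type Env = Record<string, string | undefined>;

// Returns the names of required variables that are unset or blank.
function missingEnv(required: readonly string[], env: Env): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Example with an in-memory env object; a real setup file would pass process.env.
const missing = missingEnv(DEMO_REQUIRED, { CLERK_SECRET_KEY: "sk_test_example" });
// `missing` now holds one entry: "E2E_CONSUMER_EMAIL"
```

Failing early with the list of missing names tends to be easier to debug than an auth error several steps into a spec.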
Lint scope
Biome lints apps/** + packages/** source + config. Generated code under packages/api-client/src/generated/ is excluded: it comes from the @hey-api/openapi-ts pipeline and is regenerated on every pnpm generate:sdk, so linting it would create noise on every SDK refresh. Non-generated code in packages/api-client (e.g. the barrel file or custom helpers) is still linted.
Recent test-shape changes
- 2026-04-21 (B20): `test/app/(consumer)/home.test.tsx` CategoryChips suite rewritten. Chips now navigate directly on tap (`router.push('/results?category=<slug>')`) instead of acting as a toggle-filter against the prompt+send flow. The old 3-case toggle/deselect matrix is replaced with a 2-case direct-nav matrix.
- 2026-04-21 (B20): `test/app/results.test.tsx` empty-state assertion updated. The `useLocalSearchParams` mock sets `category: "plumbing"`, so the EmptyState subtitle resolves to `common.no_category_results` (the per-category copy A6 introduced in `apps/mobile/app/results.tsx`). Non-category navigations still fall back to `common.try_different_search`. No production behavior change; test-only alignment.
- 2026-04-21 (B23): `apps/api/test/unit/job-parser.graph.spec.ts` two LLM-mock tests now use keyword-free inputs (`"request xyz"` instead of `"Ho un tubo che perde..."`). The new keyword pre-classifier short-circuits at confidence ≥ 0.9 before the LLM node runs, so tests that assert on LLM-mock output must avoid trigger words. Fallback path behavior is still covered in the third test.
- 2026-04-21 (M4 S1 dry-run): five consecutive runs landed the multi-role harness setup contract. Fixes on main: `e2e/global-setup.ts` (`clerkSetup()` bootstrap), `globalSetup` + dotenv pre-load in `playwright.multi-role.config.ts`, `+clerk_test` subaddress emails + `EXPO_PUBLIC_API_URL=http://localhost:3000` in `e2e/.env.local`, and the new `e2e/scripts/seed-clerk-users.ts` (autonomous Clerk BAPI user seed, no dashboard). Open delta: the first-login `/tour` redirect still blocks the S1 final assertion; tracked in `docs/plans/2026-04-21-m4-s1-diagnostic.md`.
- 2026-04-21 (M4 S5): `e2e/web/multi-role/s5-credentials.spec.ts` refactored: multi-role scenario covering pro credential submission (P_IVA) + admin approval + trust score update. Requires `ENABLE_TEST_ENDPOINTS=true`, `TEST_HARNESS_SECRET`, `CLERK_SECRET_KEY`, `E2E_CONSUMER_EMAIL`, `E2E_PRO_EMAIL`; skips otherwise. The admin seed helper is absent, so `test.fail(true, reason)` documents the gap without blocking spec parse. `e2e/multi-role/helpers/credentials.ts` adds the `waitForCredentialStatus` polling helper (500 ms interval, 15 s ceiling). `e2e/package.json` gains `@types/node` + `@types/pg` devDependencies.
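The polling behaviour described for the S5 helper (500 ms interval, 15 s ceiling) could look roughly like the generic poller below. This is a sketch under assumptions: `pollUntil`, `PollOptions`, and `sleep` are hypothetical names, and the actual signature of `waitForCredentialStatus` is not shown in this document.

```typescript
// Hypothetical generic poller; the real waitForCredentialStatus may differ.
interface PollOptions {
  intervalMs?: number; // delay between attempts (500 ms in the S5 helper)
  timeoutMs?: number; // overall ceiling (15 s in the S5 helper)
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Calls `read` repeatedly until `done(value)` is true or the ceiling is reached.
async function pollUntil<T>(
  read: () => Promise<T>,
  done: (value: T) => boolean,
  { intervalMs = 500, timeoutMs = 15000 }: PollOptions = {},
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = await read();
    if (done(value)) return value;
    // Stop if the next attempt would start after the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`pollUntil: condition not met within ${timeoutMs} ms`);
    }
    await sleep(intervalMs);
  }
}
```

A credential check might call it as `await pollUntil(() => fetchStatus(id), (s) => s === expected)`, where `fetchStatus`, `id`, and `expected` are placeholders. Throwing on timeout, rather than returning the last value, keeps a stuck approval from passing silently.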
Test Conventions
- Behavior, not implementation. Tests verify what code does, not how. Refactor should not break passing tests.
- Edges and errors, not just happy path. Empty inputs, boundaries, malformed data, missing files, network failures.
- Mock boundaries only. Mock network, filesystem, time, uncontrolled external services. Never mock internal business logic.
- TDD order. RED (write failing test) → GREEN (implement) → REFACTOR. No impl before test.
- Coverage target. 80%+ per package.
- Fail fast. `--bail=1` is the default for multi-role orchestrator runs.
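The "mock boundaries only" and "edges and errors" rules can be illustrated with time as the boundary. The sketch below is hypothetical: `Clock`, `Booking`, `PENDING_TTL_MS`, and `isExpired` are not Ideony code, and the TTL value is illustrative. Only the clock is substituted; the business rule itself runs unmocked.

```typescript
// Hypothetical example; none of these names come from the Ideony codebase.
// The clock is the boundary: tests replace it, the business rule runs for real.
type Clock = () => number; // returns epoch milliseconds

interface Booking {
  createdAtMs: number;
}

const PENDING_TTL_MS = 15 * 60 * 1000; // illustrative value only

// Business rule under test: a pending booking expires once the TTL has elapsed.
function isExpired(booking: Booking, now: Clock = Date.now): boolean {
  return now() - booking.createdAtMs >= PENDING_TTL_MS;
}

// In a test, pass a fixed clock instead of mocking isExpired itself.
const fixedClock: Clock = () => 1000000;
const fresh = isExpired({ createdAtMs: 1000000 - 1000 }, fixedClock); // false
const stale = isExpired({ createdAtMs: 1000000 - PENDING_TTL_MS }, fixedClock); // true (exact boundary)
```

Injecting the clock as a parameter keeps the test deterministic and lets it cover the exact boundary, without touching internal logic.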
E2E strategy + coverage
Canonical coverage map + milestones + env matrix → docs/specs/2026-04-21-e2e-strategy.md.
Rationale / alternatives → ADR 0026.
Multi-role harness architecture (TestModule, routes, 6 canonical scenarios S1–S6) → Phase E spec (superseded-by-scope; §3 + §4 remain authoritative).
SOTA prior art (Uber/Airbnb/DoorDash/Bolt/Glovo patterns) → docs/research/2026-04-19-multi-role-e2e-deep-dive.md.
Clerk BAPI — Test User Creation
Use the Backend API (BAPI), not the Frontend API (FAPI), for E2E test user creation; this avoids FAPI rate limits:

    // Create test user via Clerk BAPI
    const user = await clerkClient.users.createUser({
      emailAddress: ['test+<uuid>@ideony.test'],
      password: process.env.E2E_TEST_PASSWORD,
      firstName: 'Test',
      lastName: 'User',
    })

    // Create session JWT.
    // `session` is not defined in this snippet: it must be an existing session for `user`.
    const token = await clerkClient.sessions.getToken(session.id, 'jwt')

Selector Conventions (updated R9 #175)
Preferred locator priority for e2e/web/ specs:
1. `getByTestId(id)`: maps to `data-testid` on Expo web
2. `getByRole(role, { name })`: ARIA semantics
3. `getByPlaceholder` / `getByText`: last resort
Home screen testID inventory (added during R9):
| Component | testID | Purpose |
|---|---|---|
| `HomeTopBar` | `home-top-bar` | Root bar container |
| `PromptCard` | `prompt-input` | Prompt TextInput |
| `CategoryChips` | `category-chips` | Scroll container |
| `CategoryChips` chip | `category-chip-{slug}` | Individual chip (e.g. `category-chip-plumbing`) |
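The `category-chip-{slug}` pattern from the table could be wrapped in a small helper so specs do not hand-build IDs. This is a sketch: `HOME_TEST_IDS` and `categoryChipTestId` are hypothetical names, and the slug format check (lowercase letters, digits, hyphens) is an assumption rather than a documented rule.

```typescript
// Hypothetical helper; the testID values come from the inventory table,
// while the names and the slug format check are assumptions.
const HOME_TEST_IDS = {
  topBar: "home-top-bar",
  promptInput: "prompt-input",
  categoryChips: "category-chips",
} as const;

// Builds the per-chip testID, e.g. "category-chip-plumbing".
function categoryChipTestId(slug: string): string {
  if (!/^[a-z0-9]+(?:-[a-z0-9]+)*$/.test(slug)) {
    throw new Error(`Unexpected category slug: "${slug}"`);
  }
  return `category-chip-${slug}`;
}

const plumbingChip = categoryChipTestId("plumbing"); // "category-chip-plumbing"
```

In a Playwright spec the result would typically be passed to `page.getByTestId(...)`, for example `page.getByTestId(categoryChipTestId("plumbing"))`.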
CI gating
Cadences + scopes live in the E2E strategy spec §4. Pre-merge runs on every branch (lint, typecheck, unit, API-integ, Playwright smoke); post-CD runs a subset against the deployed URL; nightly runs the full suite (Playwright + Maestro iOS + Maestro Android + multi-role) on Hetzner cron post-MVP0.
GH Actions provides postgres + redis service containers for CI unit + E2E runs when billing is enabled; the local gate scripts/verify-local.sh substitutes while billing is off.