E2E Strategy — Full-Stack Test Framework
Date: 2026-04-21
Status: Locked
Supersedes: docs/specs/2026-04-20-phase-e-multi-role-e2e-spec.md (multi-role harness carried forward as M4 track; Phase E spec’s §3 TestModule + §4 scenarios remain authoritative).
What this spec is: the WHAT — platforms, envs, cadences, coverage map, milestones.
What this spec is NOT: the WHY (see ADR 0026) and the HOW-TO (see docs/testing.md).
Contents
- Locked choices
- Environment matrix
- Test layers + shipped-surface coverage
- CI cadences
- Tooling
- Milestones M1–M7
- Risks + mitigations
- Out of scope
1. Locked choices
Rationale + alternatives live in ADR 0026. Summary only here.
| # | Choice |
|---|---|
| Q1 — env matrix | local + dev + staging + prod. Real test users on prod via test_tenant UUID (Phase E §3). |
| Q2 — CI cadences | Pre-merge smoke / post-CD smoke / nightly full. Details §4. |
| Q3 — build order | M1 → M2 → M3 → M4 → M5 → M6 → M7 (demo → deploy → gaps → harness → mobile → nightly → device cloud). |
| Q4 — mobile | DEFERRED post-MVP0. Web-only coverage for MVP 0 (user decision 2026-04-21) — test the Expo web build from every angle first; iOS + Android Maestro authoring resumes when native platform testing moves to the critical path. |
| Q5 — device cloud | DEFERRED post-MVP0 alongside Q4 — mobile device cloud only matters once mobile test authoring starts. Re-activate with BrowserStack App Live ($99/mo) post-revenue. |
Framing (user, 2026-04-21): cover everything shipped; expect failures where impl is partial (chat, static maps); red tests = bug backlog.
2. Environment matrix
Same spec files, different targets. Switched via TEST_ENV=<local|dev|staging|prod>.
| Env | API_BASE_URL | DB | User pool | Clerk | Stripe |
|---|---|---|---|---|---|
| local | http://localhost:3000 | local Postgres (5433) | ephemeral, cleaned per run | dev | test |
| dev | https://api.ideony.is-a.dev (currently same host as prod) | dev Postgres | tenant-scoped, persistent until cleanup | dev | test |
| staging | (future) https://staging-api.ideony.is-a.dev | staging DB | tenant-scoped | staging | test |
| prod | https://api.ideony.is-a.dev | prod DB | tenant-scoped test users, isolated via test_tenant UUID; smoke subset only | prod | live (test-mode accounts only) |
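As a minimal sketch of how one set of spec files could pick its target from TEST_ENV, the lookup below mirrors the API_BASE_URL column of the table. The names `TestEnv`, `TARGETS`, and `resolveTarget` are hypothetical, and falling back to `local` when the variable is unset is an assumption, not something this spec states.

```typescript
// Illustrative only: these names are not taken from the codebase.
type TestEnv = "local" | "dev" | "staging" | "prod";

interface Target {
  apiBaseUrl: string;
  smokeOnly: boolean; // prod runs the smoke subset only
}

// URLs copied from the environment matrix; staging is marked "(future)" there.
const TARGETS: Record<TestEnv, Target> = {
  local: { apiBaseUrl: "http://localhost:3000", smokeOnly: false },
  dev: { apiBaseUrl: "https://api.ideony.is-a.dev", smokeOnly: false },
  staging: { apiBaseUrl: "https://staging-api.ideony.is-a.dev", smokeOnly: false },
  prod: { apiBaseUrl: "https://api.ideony.is-a.dev", smokeOnly: true },
};

function isTestEnv(value: string): value is TestEnv {
  // Own-property check so inherited keys such as "toString" are rejected.
  return Object.prototype.hasOwnProperty.call(TARGETS, value);
}

// Assumption: an unset TEST_ENV means "local".
function resolveTarget(raw: string | undefined): Target {
  const env = raw ?? "local";
  if (!isTestEnv(env)) {
    throw new Error(`Unknown TEST_ENV "${env}"; expected local|dev|staging|prod`);
  }
  return TARGETS[env];
}
```

A runner config would typically call `resolveTarget(process.env.TEST_ENV)` once and pass `apiBaseUrl` to every spec, so that an unknown value fails fast instead of silently hitting the wrong host.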
Prod safety: ENABLE_TEST_ENDPOINTS=true in Dokploy env, toggled on only during smoke windows. TestGuard double-gates (env flag + signed X-Test-Tenant header). /test/cleanup is idempotent + tenant-scoped — never touches rows without test_tenant. Prod suite = subset: auth + health + one happy-path booking.
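The double gate can be sketched as a pure decision function. This is not the real TestGuard (that is defined in Phase E spec §3.1): `allowTestRoute`, `TestRequest`, and the injected `verifySignature` callback are hypothetical, and the signature scheme is left to the caller on purpose because this spec does not define it.

```typescript
// Hypothetical shapes for illustration; the real guard lives in TestModule.
interface TestRequest {
  // Assumption: header names arrive lower-cased, as Node's HTTP layer does.
  headers: Record<string, string | undefined>;
}

type SignatureVerifier = (headerValue: string) => boolean;

function allowTestRoute(
  enableFlag: string | undefined,
  req: TestRequest,
  verifySignature: SignatureVerifier,
): boolean {
  // Gate 1: the env flag must be exactly "true".
  if (enableFlag !== "true") return false;

  // Gate 2: the tenant header must be present and carry a valid signature.
  const header = req.headers["x-test-tenant"];
  if (header === undefined || header.length === 0) return false;
  return verifySignature(header);
}
```

Both gates fail closed: a missing flag, a missing header, or a rejected signature each deny the request on their own, so a leaked header is useless outside a smoke window and an open flag is useless without a signed header.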
Test-route contract (POST /test/tenant/create, /test/cleanup, /test/geo-feed, /test/advance-time, /test/seed/:scenario, GET /test/state/..., /test/ws-tap/...) defined in Phase E spec §3.1 — unchanged.
3. Test layers + shipped-surface coverage
Pyramid
| Layer | Tool | Count (current) |
|---|---|---|
| Mobile unit | Jest + @testing-library/react-native | 354 |
| API unit | Jest + @nestjs/testing | 195 |
| API integration | Jest + Supertest | 45 |
| Browser E2E | Playwright | 122 |
| Mobile E2E | Maestro (iOS + Android) | — M5 |
| Multi-role E2E | Orchestrator + Playwright/Maestro | — M4 |
Shipped-feature coverage
❓ = implementation partial → test expected to fail (diagnostic).
| Feature | Unit | API-integ | Browser E2E | Mobile E2E | Multi-role |
|---|---|---|---|---|---|
| Auth (Clerk email + social) | ✅ | ✅ | ✅ | ✅ | — |
| Consumer onboarding | ✅ | — | ✅ | ✅ | — |
| Professional onboarding + credentials | ✅ | ✅ | ✅ | ✅ | S5 |
| Home + search + category chips | ✅ | — | ✅ | ✅ | — |
| Booking create | ✅ | ✅ | ✅ | ✅ | S1 |
| Booking manage | ✅ | ✅ | ✅ | ✅ | S1, S3 |
| Reviews | ✅ | ✅ | ✅ | ✅ | S6 |
| Chat ❓ | ✅ | ✅ | ❓ | ❓ | S4 ❓ |
| SOS cascade | ✅ | ✅ | ✅ | ✅ | S2 |
| Payments (Stripe Connect) | ✅ | ✅ | ✅ | — (WebView) | S3 |
| Webhooks (Clerk svix + Stripe) | ✅ | ✅ (apps/api/test/e2e/{clerk,stripe}-webhook.e2e-spec.ts + rejection surface in e2e/tests/11-webhooks/) | — | — | — |
| Rate-limit | — | M3c — blocked on throttler impl | — | — | — |
| i18n (IT + EN) | ✅ | ✅ (e2e/tests/12-i18n/) | ✅ (e2e/web/i18n/language-switch.spec.ts) | deferred post-MVP0 (mobile native) | — |
| Map (static) ❓ | ✅ | — | ✅ | ✅ | — |
| Notifications (Novu) | ✅ | ✅ | — | — | S1, S4 |
| Admin (credentials) | ✅ | ✅ | ✅ | — | S5 |
P0 gaps status:
- Webhook E2E — ✅ closed (API-integ coverage existed; was a stale marker)
- i18n API-integ — ✅ closed (existed; stale marker)
- i18n browser — ✅ closed in M3b (e2e/web/i18n/language-switch.spec.ts; testID added to LanguagePicker + key nav elements). Mobile native i18n deferred post-MVP0.
- Rate-limit E2E — M3c: blocked on @nestjs/throttler impl (DEFERRED post-MVP0)
Multi-role scenarios
S1–S6 defined in Phase E spec §4. Unchanged. Each runs across web + iOS + Android once M4/M5 land.
4. CI cadences
| Cadence | When | Scope | Budget |
|---|---|---|---|
| Pre-merge | every PR | lint + typecheck + unit (549) + API-integ (45) + 10 Playwright smoke flows | ~4 min |
| Post-CD | after scripts/deploy.sh succeeds | 5 Playwright flows vs prod URL; fail → rollback | <1 min |
| Nightly | 03:00 UTC, Hetzner cron | full Playwright (122) + Maestro iOS + Maestro Android + S1–S6 × 3 platforms | ~25 min |
Status:
- Pre-merge — ⚠️ GH Actions billing off; local scripts/verify-local.sh substitutes. Options to resume: self-hosted Hetzner runner (€0) OR restore GH billing.
- Post-CD — ✅ partial (health + rollback live); expand to 5-flow Playwright smoke in M2.
- Nightly — ❌ blocked on M5 (Maestro flows) + M6 (cron setup). Post-MVP0.
MVP minimum shipped before anything else: local on-demand (pnpm e2e:all) + post-CD 5-flow smoke. Nightly is not a launch blocker.
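The post-CD rule from the cadence table (fail → rollback) can be sketched as a small decision step that a deploy script would run after the smoke flows finish. `FlowResult`, `DeployDecision`, and `decideAfterSmoke` are hypothetical names, and treating an empty run as a failure is an assumption rather than documented behaviour.

```typescript
interface FlowResult {
  name: string;
  passed: boolean;
}

type DeployDecision =
  | { action: "keep" }
  | { action: "rollback"; failed: string[] };

function decideAfterSmoke(results: FlowResult[]): DeployDecision {
  // Assumption: a run with zero flows proves nothing, so it fails closed.
  if (results.length === 0) {
    return { action: "rollback", failed: ["<no flows ran>"] };
  }
  // Any single failing flow is enough to roll the deploy back.
  const failed = results.filter((r) => !r.passed).map((r) => r.name);
  return failed.length === 0
    ? { action: "keep" }
    : { action: "rollback", failed };
}
```

Returning the failed flow names alongside the action keeps the rollback log self-explanatory without re-reading the test report.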
5. Tooling
| Layer | Tool | Notes |
|---|---|---|
| Browser E2E | Playwright | webServer auto-boots API on local; multi-context for 2-actor web |
| Mobile E2E | Maestro | YAML flows; iOS sim + Android emu via local toolchains |
| Orchestrator | Node.js TS in test/e2e/multi-role/orchestrator/ | polls /test/state/... + /test/ws-tap/...; drives N actors |
| BE test routes | TestModule (Phase E §3.1) | gated by ENABLE_TEST_ENDPOINTS=true + signed X-Test-Tenant header |
| Tenancy isolation | test_tenant UUID + Prisma extension | DoorDash pattern; scopes reads + writes on tagged tables |
| Clock | ClockService + DelayService + POST /test/advance-time | SOS 30s countdown testable |
| Device cloud | LambdaTest free → BrowserStack App Live | see ADR 0026 Q5 |
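To illustrate the tenancy-isolation row, the sketch below shows the kind of argument rewriting a query extension could apply to tagged tables: every read gets a `test_tenant` filter and every write gets stamped with it. It does not use the Prisma API. `TAGGED_MODELS`, `scopeWhere`, and `stampData` are hypothetical, and the policy that real traffic (`tenantId === null`) sees only untagged rows is an assumption.

```typescript
type Args = Record<string, unknown>;

// Hypothetical list of tagged tables; the real list is not given in this spec.
const TAGGED_MODELS: ReadonlySet<string> = new Set(["User", "Booking"]);

// Reads: force the tenant filter, overriding any caller-supplied value.
function scopeWhere(model: string, where: Args | undefined, tenantId: string | null): Args {
  if (!TAGGED_MODELS.has(model)) return { ...(where ?? {}) };
  return { ...(where ?? {}), test_tenant: tenantId };
}

// Writes: stamp the row with the current tenant (null for real traffic).
function stampData(model: string, data: Args, tenantId: string | null): Args {
  if (!TAGGED_MODELS.has(model)) return { ...data };
  return { ...data, test_tenant: tenantId };
}
```

Because the scoped key is spread last, a test that passes its own `test_tenant` in a query cannot read or write outside the tenant bound to its request.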
Per ecosystem-plugin rule: @nestjs/* first in BE test code (e.g., @nestjs/testing), expo-* first in mobile test helpers.
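The clock row in the tooling table is what makes the SOS 30s countdown testable without waiting in real time. A minimal sketch of the idea follows, with an injectable clock that a route like POST /test/advance-time could move forward. `Clock`, `FakeClock`, and `Countdown` are illustrative names and are not the actual ClockService or DelayService interfaces.

```typescript
interface Clock {
  now(): number; // milliseconds
}

// Test double: time only moves when advance() is called.
class FakeClock implements Clock {
  private current: number;

  constructor(startMs: number = 0) {
    this.current = startMs;
  }

  now(): number {
    return this.current;
  }

  advance(ms: number): void {
    if (ms < 0) throw new Error("cannot move time backwards");
    this.current += ms;
  }
}

// Countdown reads time only through the injected clock.
class Countdown {
  private readonly clock: Clock;
  private readonly deadline: number;

  constructor(clock: Clock, durationMs: number) {
    this.clock = clock;
    this.deadline = clock.now() + durationMs;
  }

  remainingMs(): number {
    return Math.max(0, this.deadline - this.clock.now());
  }

  expired(): boolean {
    return this.remainingMs() === 0;
  }
}
```

A scenario can then start the countdown, advance the fake clock by 30 000 ms in a single call, and assert on the cascade immediately.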
6. Milestones M1–M7
Each row blocks the next. M1–M3 = MVP minimum.
| ID | Milestone | Deliverable | Est. days |
|---|---|---|---|
| M1 | Demo walkthroughs (track C) | 3 Playwright specs for cofounder/investor paths | 0.5 |
| M2 | Deploy smoke (track A) | 5 Playwright flows in scripts/deploy.sh post-CD | 0.5 |
| M3 | Gap closure (audit) | Confirmed existing coverage: webhook signed-path + rejection E2E + i18n API-integ. Remaining gaps carved out: M3b browser i18n (needs testID instrumentation), M3c rate-limit E2E (blocked on throttler). | 0.25 |
| M3b | Browser i18n smoke | ✅ e2e/web/i18n/language-switch.spec.ts — consumer profile → picker → IT↔EN swap, asserted on profile.language string. testID added to LanguagePicker modal + options + profile menu row. | shipped |
| M3c | Rate-limit E2E | Land after @nestjs/throttler impl (deferred post-MVP0). | 0.25 |
| M4 | Multi-role harness (track B) | TestModule + Phase E M-E1…M-E7 + S1–S6 on Playwright web (web-only per Q4) | 2.5 |
| M5 | Maestro mobile flows | DEFERRED post-MVP0 (per Q4 — web-only MVP). Brief + YAMLs stay in Phase E spec for when native testing resumes. | — |
| M6 | Nightly cron | Hetzner runner + Slack webhook + report emitter (web suite only in MVP0) | 0.5 |
| M7 | Free-tier device cloud | DEFERRED post-MVP0 (per Q5 — mobile-specific). | — |
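Total: ~4.5 dev days across the rows that still carry an estimate (M1 0.5 + M2 0.5 + M3 0.25 + M3c 0.25 + M4 2.5 + M6 0.5); M3b is shipped and M5/M7 are deferred. M1–M3 pre-launch (~1.25 days); rest post-launch.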
7. Risks + mitigations
| Risk | Severity | Mitigation |
|---|---|---|
| Prod test-user collision with real data | HIGH | test_tenant UUID + Prisma extension + double-gated TestGuard + scoped /test/cleanup |
| Flaky chat tests block CI | MEDIUM | test.fail(reason) expected-fail flag; surface in report, don’t block merge |
| Maestro iOS/Android flow drift | MEDIUM | Run both in nightly; treat divergence as first-class bug |
| GH Actions billing stays off | MEDIUM | Self-hosted Hetzner runner (€0) — reuses CAX11 |
| test_tenant retrofit late | LOW | Add in M4 pre-scenario work, before specs depend on it |
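The expected-fail mitigation for chat tests implies a report step that separates known failures from real ones. The sketch below shows one way to compute that split from a list of results. `TestReport`, `GateSummary`, and `summarize` are hypothetical, and surfacing an unexpected pass without blocking the merge is an assumed policy for this sketch, not a description of the runner's default behaviour.

```typescript
type Outcome = "passed" | "failed";

interface TestReport {
  title: string;
  outcome: Outcome;
  expectedToFail: boolean; // set for tests flagged as expected-fail
}

interface GateSummary {
  blockMerge: boolean;
  knownFailures: string[];    // flagged and failing: bug backlog, non-blocking
  unexpectedPasses: string[]; // flagged but passing: candidate to drop the flag
}

function summarize(reports: TestReport[]): GateSummary {
  const titles = (keep: (r: TestReport) => boolean): string[] =>
    reports.filter(keep).map((r) => r.title);

  const realFailures = titles((r) => !r.expectedToFail && r.outcome === "failed");
  return {
    // Only failures in unflagged tests block the merge.
    blockMerge: realFailures.length > 0,
    knownFailures: titles((r) => r.expectedToFail && r.outcome === "failed"),
    unexpectedPasses: titles((r) => r.expectedToFail && r.outcome === "passed"),
  };
}
```

Listing unexpected passes separately gives a cheap signal for when a partial feature has been finished and its flag can be removed.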
8. Out of scope
- Visual regression (separate track, post-UI-freeze)
- 3-actor scenarios (admin surface too small for MVP0)
- Load testing (k6 / artillery, post-MVP0)
- Accessibility E2E (axe-playwright, post-MVP0)
- Chaos engineering (post-Series-A)
- AI-authored tests (TestRigor, Mabl — see ADR 0026 §Alternatives)
Change log
| Date | Change |
|---|---|
| 2026-04-21 | Initial lock; supersedes Phase E spec (scope widened); rationale moved to ADR 0026. |