Phase E — Multi-Role E2E Test Harness Spec
Date: 2026-04-20
Status: Superseded by docs/specs/2026-04-21-e2e-strategy.md — the multi-role harness (Q1–Q10 locked decisions, 6 canonical scenarios, TestModule design, milestones M-E1–M-E7) is carried forward verbatim as the M4 track of the new spec. Read this doc for deep architectural rationale; read the 2026-04-21 spec for the full-stack strategy (web + iOS + Android + env matrix + CI cadences).
Superseded-by: docs/specs/2026-04-21-e2e-strategy.md
Scope: Build the automated multi-role (consumer ↔ professional) E2E harness for Ideony MVP 0 — deterministic tenancy, clock control, GPS injection, 6 canonical scenarios on Playwright web + a direct Socket.IO integration layer.
Phase mapping: Phase E of the Ideony MVP 0 blueprint — unblocks confident pre-demo regression + post-demo nightly gate.
Related docs:
- /Users/acidrums7/.claude/projects/-Users-acidrums7-Documents-Coding-Lavoro-Projects-Ideony/memory/project_multi_role_e2e_decisions.md (locked Q1–Q10)
- plans/research/2026-04-19-multi-role-e2e-deep-dive.md (5565-word SOTA deep-dive; rationale only)
- plans/specs/2026-04-20-ux-phase-c-design.md, plans/specs/2026-04-20-ux-phase-d-design.md (style reference)
- /Users/acidrums7/Documents/Coding/Lavoro/Projects/Ideony/CLAUDE.md (monorepo + conventions)
Table of Contents
- Executive Summary
- Locked Architecture (Q1–Q10)
- Test Harness Components
- Six Canonical Scenarios
- Milestones M-E1 … M-E7
- Directory Structure
- CI + Local Integration
- Risk List + Mitigations
- NOT In Scope
- Change Log
1. Executive Summary
Phase E delivers an automated multi-role E2E harness that catches consumer ↔ pro interaction bugs (booking state races, SOS cascade dispatch ordering, chat delivery across WebSocket, credential-trust ripple) that single-actor tests structurally miss. The harness uses Playwright TypeScript driving two browser contexts against a shared NestJS backend on the Hetzner dev instance, coordinated by a test-only module that exposes deterministic seeding, test_tenant isolation, FakeClock time control, and PostGIS geo injection. Six canonical 2-actor scenarios cover the marketplace’s critical interaction surface. Local-only for MVP 0 (pre-demo); nightly cron activates post-demo on the same Hetzner dev box. Zero ongoing cost (Playwright OSS, self-hosted runner, dev Clerk/Stripe/Novu already configured).
2. Locked Architecture (Q1–Q10)
Every decision below is locked. Full rationale lives in the memory file; do not reopen during implementation.
| # | Question | Choice | One-line rationale |
|---|---|---|---|
| Q1 | Tenancy model | test_tenant UUID column + TenantMiddleware on shared dev DB | Avoids ephemeral-DB 30s startup penalty; Hetzner prod is repurposed as dev → no extra infra |
| Q2 | GPS injection | Hybrid — BE POST /test/geo-feed default + 3 device-level smoke tests (web/iOS/Android) | Fast + deterministic for dispatch matching; 3 tests preserve client GPS upload coverage |
| Q3 | Clock control | ClockService DI + DelayService BullMQ wrapper + POST /test/advance-time | SOS 30s countdown runs in BullMQ; per-request header alone misses cron/WS emitters |
| Q4 | Test DSL | Maestro (mobile, deferred) + Playwright (web) — no TestRigor | €300/mo pre-revenue unjustified; lock-in risk; Maestro YAML already ~70% English-readable |
| Q5 | Execution location | Local-only MVP 0; cron nightly post-demo on dev instance | Fast-dev philosophy; Mac M-series handles 4 pairs parallel; €0 |
| Q6 | Visual regression | Dedicated single-actor suite, deferred post-MVP 0 | Separation of signal — multi-role snapshots flake; 44 screens baseline cost high |
| Q7 | Mobile E2E orchestration | Playwright web only for MVP 0; Maestro added on mobile traction | Expo web renders 95% of mobile logic; Maestro Cloud €39–449/mo unjustified |
| Q8 | Fail-fast vs run-all | Hybrid — --bail=1 default, --no-bail via :full script | Dev iteration wants fast feedback; nightly + pre-demo want full triage |
| Q9 | External services | All real dev envs (Stripe test, Clerk dev, Novu dev) | Mock drift is real maintenance tax; real integrations catch webhook-signature/JWT-claim/template drift |
| Q10 | 3-actor scenarios | Defer — 2-actor + single-actor admin integration tests | MVP 0 admin surface small (credentials only); 3-actor orchestration complexity > value |
Infra side-effect from Q1: Hetzner 178.104.154.74 CAX11 is now the dev environment. Rename Dokploy project env production → development. Prod spins up fresh when revenue exists.
3. Test Harness Components
3.1 Backend test module — apps/api/src/modules/test/
Gated entirely by the env flag `ENABLE_TEST_ENDPOINTS=true`. The module is conditionally imported in `AppModule`, so in production the entire module tree is absent from the bundle. Access is double-gated: a `TestGuard` re-checks the env flag, and an `X-Test-Tenant` header must be signed with a shared secret (`TEST_HARNESS_SECRET`, rotated per env).
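The signing scheme for the tenant header is not specified here. As a minimal sketch, assuming an HMAC-SHA256 over the tenant UUID keyed by `TEST_HARNESS_SECRET`, the check might look like the following. Both function names are hypothetical, and the real `TestGuard` may use a different scheme.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper: signs a tenant UUID with the shared harness secret.
// HMAC-SHA256 is an assumption; the spec only says the header is "signed".
function signTenant(tenant: string, secret: string): string {
  return createHmac("sha256", secret).update(tenant).digest("hex");
}

// Hypothetical check a guard could run before admitting a /test/* request.
function verifyTenantSignature(tenant: string, signature: string, secret: string): boolean {
  const expected = Buffer.from(signTenant(tenant, secret), "hex");
  const given = Buffer.from(signature, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return given.length === expected.length && timingSafeEqual(given, expected);
}
```

Comparing with `timingSafeEqual` rather than `===` avoids leaking how many leading characters of a guessed signature were correct.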
Routes (all prefix /test):
| Route | Method | Body / Query | Response | Purpose |
|---|---|---|---|---|
| `/test/tenant/create` | POST | `{}` | `{ test_tenant: string }` (UUID v4) | Allocate a fresh tenant namespace for a scenario run |
| `/test/cleanup` | POST | `?tenant=<uuid>` | `{ deleted: { users, bookings, quotes, reviews, credentials, stripe_accounts, clerk_users, novu_subscribers } }` | Sweep all rows + external-service artefacts tagged with the tenant |
| `/test/geo-feed` | POST | `{ professionalId, lat, lng }` | `{ ok: true, updatedAt }` | Write PostGIS point into `ProfessionalProfile.location` for dispatch/matching |
| `/test/advance-time` | POST | `?ms=<int>` | `{ now: ISO8601, jobsFired: int }` | Advance FakeClock by `ms`, trigger every BullMQ job whose `runAt ≤ now` |
| `/test/seed/:scenario` | POST | `?tenant=<uuid>` | `{ seeded: object }` | Materialise a named scenario fixture (6 built-ins, see §4) |
| `/test/state/booking/:id` | GET | none | `{ status, … }` | Polling endpoint for state convergence (sync primitive B from research §4.4) |
| `/test/ws-tap/:channel` | GET | none | SSE stream | Proxy Redis pub/sub events for a tenant so Playwright can assert on events without WS client setup |
Request chain: TenantMiddleware → TestGuard → handler (Nest runs middleware before guards, so the middleware executes first). TenantMiddleware rejects any request without an `X-Test-Tenant` header on `/test/*` routes. On non-test routes, if the header is present, the middleware scopes all Prisma queries via a ClsModule-stashed tenant ID; Prisma client extensions inject `WHERE test_tenant = $1 OR test_tenant IS NULL` into reads, and set `test_tenant = $1` on writes for tagged tables.
Files to add:
- `apps/api/src/modules/test/test.module.ts`
- `apps/api/src/modules/test/test.controller.ts`
- `apps/api/src/modules/test/test.service.ts`
- `apps/api/src/modules/test/test.guard.ts`
- `apps/api/src/modules/test/fixtures/*.ts` (6 scenario seeders)
- `apps/api/src/common/tenant/tenant.middleware.ts`
- `apps/api/src/common/tenant/tenant.cls-store.ts` (ClsModule-backed)
- `apps/api/src/common/prisma/tenant-extension.ts` (Prisma client extension scoping reads/writes)
3.2 Clock abstraction — apps/api/src/common/clock/
ClockService wraps every `new Date()` / `Date.now()` in business logic. Two implementations:
- `SystemClockService` (prod, default): returns `new Date()` verbatim; `advance()` throws `ForbiddenException`.
- `FakeClockService` (test, activated by `ENABLE_TEST_ENDPOINTS=true`): holds an internal offset + mutex; `now()` returns `new Date(Date.now() + offsetMs)`; `advance(ms)` bumps the offset and emits an internal `ClockAdvanced` event.
Refactor surface (~40 call sites): BookingService, QuoteService, SOSService, AvailabilityService, ReviewService, CredentialsService, BullMQ processors. Replace every new Date() + Date.now() with this.clock.now() / this.clock.ms().
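The two implementations can be sketched roughly as below. This is a simplified illustration, not the final service code: it omits the NestJS DI decorators, the mutex, and the `ClockAdvanced` event, and a plain `Error` stands in for `ForbiddenException`. Rejecting negative advances is an assumption made to keep the clock monotonic.

```typescript
// Hypothetical minimal clock pair mirroring the SystemClockService / FakeClockService split.
interface Clock {
  now(): Date;
  ms(): number;
  advance(ms: number): void;
}

class SystemClock implements Clock {
  now(): Date {
    return new Date();
  }
  ms(): number {
    return Date.now();
  }
  advance(_ms: number): void {
    // The spec throws ForbiddenException here; a plain Error stands in for it.
    throw new Error("advance() is not allowed on the system clock");
  }
}

class FakeClock implements Clock {
  private offsetMs = 0;

  now(): Date {
    return new Date(this.ms());
  }
  ms(): number {
    // Real wall time plus the accumulated test offset.
    return Date.now() + this.offsetMs;
  }
  advance(ms: number): void {
    // Assumption: only forward jumps are allowed, so time never goes backwards.
    if (!Number.isInteger(ms) || ms < 0) {
      throw new RangeError("ms must be a non-negative integer");
    }
    this.offsetMs += ms;
  }
}
```

Business code would depend only on the `Clock` interface, so swapping implementations never touches call sites.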
3.3 Delay abstraction — apps/api/src/common/delay/
`DelayService.schedule(queue, jobData, delayMs)` wraps `queue.add(…, { delay })` calls. In test mode, it registers each pending delay against FakeClockService. When `/test/advance-time` fires, DelayService promotes every delay whose `dueAt ≤ clock.now()` into an immediate `queue.add(…, { delay: 0 })`. This covers:
- SOS 30s countdown (`sos-countdown` queue)
- Booking reminders (`booking-reminders` queue, 24h + 1h before)
- Quote expiry (`quote-expiry` queue, 48h TTL)
- Review prompts (`review-prompts` queue, post-completion)
Key invariant: production SystemClockService + direct queue.add path is unchanged; DelayService becomes a pure pass-through in prod (no test hooks loaded).
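The test-mode promotion logic can be illustrated with an in-memory stand-in. In this sketch `DelayRegistry` and `enqueueNow` are hypothetical names; `enqueueNow` represents the immediate `queue.add(…, { delay: 0 })` call, and BullMQ itself is not imported.

```typescript
// One pending delay: which queue, what payload, and when it becomes due (epoch ms).
interface PendingDelay<T> {
  queue: string;
  jobData: T;
  dueAt: number;
}

// Hypothetical in-memory stand-in for the test-mode DelayService.
class DelayRegistry<T> {
  private pending: PendingDelay<T>[] = [];
  private readonly nowMs: () => number;
  private readonly enqueueNow: (queue: string, jobData: T) => void;

  constructor(nowMs: () => number, enqueueNow: (queue: string, jobData: T) => void) {
    this.nowMs = nowMs;
    this.enqueueNow = enqueueNow;
  }

  // Records the delay against the injected clock instead of a real timer.
  schedule(queue: string, jobData: T, delayMs: number): void {
    this.pending.push({ queue, jobData, dueAt: this.nowMs() + delayMs });
  }

  // Promotes every delay with dueAt <= now, oldest first, and returns how many fired.
  flushDueBy(now: number): number {
    const due = this.pending
      .filter((p) => p.dueAt <= now)
      .sort((a, b) => a.dueAt - b.dueAt);
    this.pending = this.pending.filter((p) => p.dueAt > now);
    for (const p of due) this.enqueueNow(p.queue, p.jobData);
    return due.length;
  }
}
```

Because due entries are removed before they are enqueued, a second flush at the same instant fires nothing, which keeps repeated `/test/advance-time` calls safe.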
3.4 Tenancy column migration
Prisma migration `20260420_add_test_tenant` adds:
- `test_tenant UUID NULL` column on: `User`, `ProfessionalProfile`, `Booking`, `Quote`, `Review`, `Credential`, `ChatMessage`, `ChatThread`, `Dispatch`, `Notification`, `PushToken`.
- Index `CREATE INDEX idx_<table>_test_tenant ON <table>(test_tenant) WHERE test_tenant IS NOT NULL;` (partial index: prod rows have NULL, zero overhead).
- NOT added to: `Category`, `Service`, `I18nKey`, other reference tables that are shared across tenants by design.
Prod rows = test_tenant IS NULL → invisible to tenant-scoped reads + safe from tenant-scoped deletes.
3.5 Tenant middleware
TenantMiddleware (Nest `NestMiddleware`) runs on every request. Behavior:
- If the `X-Test-Tenant` header is missing → pass through (prod path).
- If present → validate as UUID, validate the `TEST_HARNESS_SECRET` signature header, push the tenant into the ClsModule store for the request lifecycle.
- The Prisma client extension reads from the CLS store and injects `test_tenant` into create/update payloads, `WHERE` clauses on find/findMany, and `WHERE` clauses on delete/deleteMany, for every tagged table.
- External service calls (Stripe, Clerk, Novu) prefix resources with `test_tenant:<uuid>:` in metadata/subscriber IDs so `/test/cleanup` can sweep them.
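The write-tagging and delete-scoping rules can be sketched as two pure functions. The names `tagWrite` and `scopeDelete` are hypothetical, and the sketch leaves out the Prisma extension wiring and the CLS lookup; it only shows how the query arguments might be rewritten.

```typescript
type Row = Record<string, unknown>;

// Adds test_tenant to a create/update payload.
// Without a tenant (prod path) the payload is returned untouched.
function tagWrite(data: Row, tenant: string | undefined): Row {
  return tenant === undefined ? data : { ...data, test_tenant: tenant };
}

// Narrows a delete filter to the tenant's rows, so NULL-tenant (prod) rows can never match.
// `AND` follows the Prisma where-clause shape.
function scopeDelete(where: Row, tenant: string | undefined): Row {
  return tenant === undefined ? where : { AND: [where, { test_tenant: tenant }] };
}
```

Wrapping the caller's filter in `AND` rather than merging keys avoids clobbering a `test_tenant` condition the caller may already have supplied.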
3.6 Test-harness HTTP client — apps/mobile/test/e2e-web/lib/test-api.ts
Thin wrapper around the BE test endpoints used by Playwright specs + shared fixtures:

```typescript
// Signatures only. ScenarioName, SeedResult, BookingState, and BookingStatus are not defined in this spec.
declare class TestApi {
  createTenant(): Promise<string>;
  cleanupTenant(id: string): Promise<void>;
  seedScenario(name: ScenarioName, tenant: string): Promise<SeedResult>;
  advanceClock(ms: number): Promise<void>;
  feedGeo(proId: string, lat: number, lng: number): Promise<void>;
  getBookingState(id: string): Promise<BookingState>;
  waitForBookingStatus(id: string, status: BookingStatus, opts?: { timeoutMs: number }): Promise<void>;
  waitForWsEvent(channel: string, predicate: (ev: Event) => boolean, opts?: { timeoutMs: number }): Promise<Event>;
}
```

Exposes higher-level sync helpers (`waitForBookingStatus`, `waitForWsEvent`) implementing the research-doc §4.4 primitive split: polling for state convergence, SSE tap for event-fired assertions.
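The polling half of that split could be built on a small generic helper. The sketch below is an assumption about shape only: `pollUntil` and its `intervalMs` option are hypothetical names, and the real `waitForBookingStatus` would pass a reader that calls `GET /test/state/booking/:id`.

```typescript
// Hypothetical polling primitive: re-reads a value until the predicate holds or the deadline passes.
async function pollUntil<T>(
  read: () => Promise<T>,
  done: (value: T) => boolean,
  opts: { timeoutMs: number; intervalMs: number },
): Promise<T> {
  const deadline = Date.now() + opts.timeoutMs;
  for (;;) {
    const value = await read();
    if (done(value)) return value;
    if (Date.now() >= deadline) {
      throw new Error(`pollUntil timed out after ${opts.timeoutMs} ms`);
    }
    // Sleep between reads so the backend is not hammered.
    await new Promise<void>((resolve) => setTimeout(resolve, opts.intervalMs));
  }
}
```

The helper always reads at least once, so a status that has already converged resolves without waiting for an interval.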
3.7 Cleanup contract
Every scenario calls `cleanupTenant(id)` in `afterAll`. The endpoint performs:
- `DELETE FROM <tagged_table> WHERE test_tenant = $1` (11 tables, FK-safe order via Prisma cascade rules).
- `stripe.accounts.del(…)` for every Connect account whose `metadata.test_tenant === id`.
- `clerk.users.deleteUser(…)` for every user whose `privateMetadata.test_tenant === id`.
- `novu.subscribers.delete(…)` for every subscriber ID prefixed `test_tenant:<id>:`.
- BullMQ: `queue.clean(0, 1000, 'delayed' | 'waiting' | 'active')` filtered by `jobData.test_tenant === id`.
Sweep is idempotent — safe to call twice (e.g. after a crash).
4. Six Canonical Scenarios
Each scenario = 1 Playwright spec file + 1 named seed fixture. All 2-actor (consumer + pro). Runtime target ≤ 2 min each on M-series laptop.
4.1 Scenario 01 — Booking full cycle
File: apps/mobile/test/e2e-web/scenarios/01-booking-full-cycle.spec.ts
Seed: booking_full_cycle_rome — 1 consumer, 1 pro (verified trust tier), 1 category (plumbing), pro at 41.9028, 12.4964 (Rome Colosseum).
Actors:
- Consumer: logs in, searches “idraulico”, picks pro, books instant slot.
- Pro: receives booking, accepts via dashboard, marks arrived, marks completed.
Happy path:
- Consumer opens `/`, searches category → ProCard for seeded pro visible.
- Consumer taps pro → `/professional/[id]` → “Prenota” → `/book/[professionalId]` → picks slot (next hour) → pays via Stripe test card `4242…` → booking created in `PENDING_ACCEPTANCE`.
- `waitForWsEvent('booking:new', { bookingId })` fires on pro’s WS channel within 5s.
- Pro dashboard → Requests tab → card shows → taps “Accetta” → confirmation bottom sheet → booking → `ACCEPTED`.
- `advanceClock(3600_000)` → clock is now scheduled start time.
- Pro taps “Sono arrivato” → booking `IN_PROGRESS`.
- Pro taps “Completa” → booking `COMPLETED`, Stripe PaymentIntent captured.
- Consumer sees receipt + review prompt within 3s.
Assertions:
- Booking status transitions: `CREATED → PENDING_ACCEPTANCE → ACCEPTED → IN_PROGRESS → COMPLETED` in DB (verify via `/test/state/booking/:id`).
- Stripe PaymentIntent state `requires_capture → succeeded`.
- Novu event log contains `booking.accepted` + `booking.completed` for consumer subscriber.
- Pro earnings tab shows the amount in “pending” bucket.
- Cleanup sweeps all of the above.
Timing: 1 clock advance (+1h for scheduled start). No geo feed needed.
4.2 Scenario 02 — SOS cascade dispatch
File: scenarios/02-sos-cascade.spec.ts
Seed: sos_burst_pipe_rome — 1 consumer at 41.9028, 12.4964, 3 pros (p1, p2, p3) at concentric distances (1 km, 3 km, 8 km), all plumbing-qualified, all online.
Actors:
- Consumer: opens SOS flow, describes “tubo scoppiato”, confirms dispatch.
- Pro1 (nearest): receives first offer, ignores (times out).
- Pro2 (middle): receives cascaded offer, accepts.
- Pro3 (farthest): never sees the offer.
Happy path:
- Consumer: `/(welcome)` → `/` → holds SOS tab → `/sos` → describes problem → confirm.
- `feedGeo('p1', 41.9040, 12.4960)` etc.: positions set via BE, dispatch matching picks p1 first.
- `waitForWsEvent('sos:offer', { proId: 'p1' })`.
- `advanceClock(30_000)`: p1 countdown expires; BullMQ `sos-countdown` job fires; dispatch cascades to p2.
- `waitForWsEvent('sos:offer', { proId: 'p2' })`.
- Pro2 (2nd Playwright context) taps “Accetta” on SOS screen.
- `waitForBookingStatus(booking, 'ACCEPTED')`.
- Pro2 enters live-tracking flow: `feedGeo` called 5× over simulated 10min (clock advanced 2min per tick) to simulate travel.
- Pro2 taps “Sono arrivato” → `IN_PROGRESS`.
Assertions:
- `Dispatch` rows: p1 status `OFFERED → EXPIRED`, p2 status `OFFERED → ACCEPTED`, no row ever created for p3.
- Booking status: `CREATED → DISPATCHING → ACCEPTED → IN_PROGRESS`.
- Pro3 WebSocket never received `sos:offer` for this booking.
- Consumer receives exactly one `booking.accepted` Novu notification (not two from cascade race).
Timing: 1 geo seed (3 points), 6 clock advances (30s expiry + 5× 2min travel). Heaviest scenario.
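A seed-side sanity check for the expected cascade order could compute great-circle distances in the test harness. The sketch below uses the haversine formula as a stand-in; production matching runs in PostGIS, and the names `haversineKm` and `cascadeOrder` are hypothetical.

```typescript
interface GeoPoint {
  lat: number;
  lng: number;
}

interface Pro extends GeoPoint {
  id: string;
}

const EARTH_RADIUS_KM = 6371;

// Great-circle distance between two WGS84 points (haversine formula).
function haversineKm(a: GeoPoint, b: GeoPoint): number {
  const toRad = (deg: number): number => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLng = toRad(b.lng - a.lng);
  const sinLat = Math.sin(dLat / 2);
  const sinLng = Math.sin(dLng / 2);
  const h = sinLat * sinLat + Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * sinLng * sinLng;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// Expected offer order: nearest pro first.
function cascadeOrder(consumer: GeoPoint, pros: Pro[]): string[] {
  return pros
    .map((pro) => ({ id: pro.id, km: haversineKm(consumer, pro) }))
    .sort((a, b) => a.km - b.km)
    .map((entry) => entry.id);
}
```

A cascade-ordering assertion helper could compare this list against the order in which `sos:offer` events arrive on the tap.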
4.3 Scenario 03 — Consumer cancel with refund
File: scenarios/03-cancel-refund.spec.ts
Seed: Same as scenario 01 but booking pre-seeded in ACCEPTED state, scheduled 3h from test clock start.
Actors:
- Consumer: opens booking detail, cancels.
- Pro: receives cancellation notice, sees updated calendar.
Happy path:
- Consumer `/booking/[id]` → “Annulla prenotazione” → confirmation sheet → confirm.
- BE applies cancellation policy (> 2h notice = full refund).
- Stripe refund created.
- Pro’s WS `booking:cancelled` event fires.
- Pro’s calendar slot freed.
Assertions:
- Booking status: `ACCEPTED → CANCELLED_BY_CONSUMER`.
- Stripe refund exists, amount = full booking price.
- Pro `Availability` table shows the slot no longer blocked.
- Both actors see identical cancellation reason + amount refunded.
Timing: No clock advance needed (policy check uses clock.now() which test harness sets relative to scheduled time via seed).
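The policy check this scenario exercises reduces to a comparison against the injected clock. A minimal sketch follows, with the hypothetical name `qualifiesForFullRefund`; treating exactly 2h of notice as not qualifying is an assumption read from the strict ">" above, and partial-refund tiers are not covered.

```typescript
const FULL_REFUND_NOTICE_MS = 2 * 60 * 60 * 1000; // 2 hours

// True when the cancellation arrives more than 2h before the scheduled start.
// `now` should come from the ClockService, never from Date.now() directly.
function qualifiesForFullRefund(scheduledStart: Date, now: Date): boolean {
  return scheduledStart.getTime() - now.getTime() > FULL_REFUND_NOTICE_MS;
}
```

With the seed placing the booking 3h ahead of the test clock, the check passes without any clock advance.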
4.4 Scenario 04 — Chat delivery under disconnection
File: scenarios/04-chat-delivery.spec.ts
Seed: Booking in ACCEPTED state; chat thread auto-created.
Actors:
- Consumer: sends messages, briefly disconnects, reconnects.
- Pro: receives all messages in order.
Happy path:
- Consumer sends M1 “A che ora arrivi?” → pro receives within 1s.
- Pro sends M2 “Entro 15 min” → consumer receives within 1s.
- Consumer closes browser tab (Playwright `context.close()`), opens fresh tab, re-auths.
- While offline, pro sends M3 + M4.
- Consumer reconnects → `chat:sync` loads M3 + M4 in chronological order.
- Consumer sends M5 with optimistic UI → confirmed delivered within 1s.
Assertions:
- DB `ChatMessage` rows: 5 total, chronological `createdAt`.
- No duplicate messages (test the idempotency key).
- Read receipts fire bidirectionally.
- Both actors’ UI shows same last-message + unread counts.
Timing: No clock advance. Connection-level test.
4.5 Scenario 05 — Credential submission → approval → trust tier ripple
File: scenarios/05-credential-approval.spec.ts
Seed: Pro with trustTier = BASIC (score 10), no credentials yet. One consumer browsing.
Actors:
- Pro (actor 1): submits P_IVA + INSURANCE credentials via upload flow.
- Single-actor admin integration test runs as part of this scenario via BE direct API call (per Q10 — admin is NOT a 2nd orchestrated browser).
- Consumer (actor 2): searches, sees pro, observes trust badge before + after approval.
Happy path:
- Pro: `/credentials` → upload P_IVA → `/credentials/me/:id/upload-url` → S3 presigned PUT → status `PENDING`.
- Pro uploads INSURANCE similarly.
- Admin: BE direct call `POST /admin/credentials/:id/approve` for both, via test harness with admin service token.
- Trust engine re-computes: P_IVA (30) + INSURANCE (25) + base (10) = 65 → tier `VERIFIED`.
- Consumer (fresh tab, same tenant) searches category → pro card now shows `VerifiedBadge`.
- Novu notification `credential.approved` delivered to pro’s subscriber.
Assertions:
- `Credential.status = APPROVED` for both rows.
- `ProfessionalProfile.trustScore = 65`, `trustTier = VERIFIED`.
- Consumer search response payload includes `trustTier: 'VERIFIED'`.
- Rate limit sanity: 3rd upload attempt within same day → 429 (Redis counter at `rate:credential-upload:<proId>:<YYYY-MM-DD>`).
Timing: No clock advance. Tests the trust-ripple path Q10 flagged as needing coverage.
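The score arithmetic and the rate-limit key format from this scenario can be captured in two small helpers that the spec could use to derive expected values. Both function names are hypothetical; the point values are the ones stated above, tier thresholds are deliberately left out because they are not given here, and using the UTC calendar day for the key is an assumption.

```typescript
const BASE_SCORE = 10;
const CREDENTIAL_POINTS: Record<string, number> = { P_IVA: 30, INSURANCE: 25 };

// Base score plus points for each distinct approved credential type.
// Unknown types contribute 0; duplicates are counted once.
function trustScore(approvedTypes: string[]): number {
  const unique = approvedTypes.filter((type, i) => approvedTypes.indexOf(type) === i);
  return unique.reduce((sum, type) => sum + (CREDENTIAL_POINTS[type] ?? 0), BASE_SCORE);
}

// Redis counter key in the rate:credential-upload:<proId>:<YYYY-MM-DD> format.
// Assumption: the day component is the UTC date.
function uploadRateKey(proId: string, day: Date): string {
  return `rate:credential-upload:${proId}:${day.toISOString().slice(0, 10)}`;
}
```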
4.6 Scenario 06 — Rating + review round-trip
File: scenarios/06-review-round-trip.spec.ts
Seed: Completed booking from scenario 01’s end state (can chain or re-seed).
Actors:
- Consumer: submits 5-star review.
- Pro: sees rating reflected on profile + dashboard.
Happy path:
- Consumer: `/review/[bookingId]` → 5 stars → “Ottimo lavoro, puntuale.” → submit.
- DB review row created; pro’s aggregate rating recomputed.
- Pro dashboard polls → rating badge updates.
- Consumer profile shows the review in their history.
- `advanceClock(604_800_000)` (+7 days) → review “editable window” closes.
- Consumer attempts to edit review → 403 forbidden.
Assertions:
- `Review.rating = 5`, `Review.comment` matches.
- `ProfessionalProfile.ratingAvg` + `ratingCount` updated atomically.
- Review appears in public `/professional/[id]` page for a 3rd unauthenticated browser context (verifies cache invalidation).
- Post-clock-advance edit returns 403 with i18n-keyed error message.
Timing: 1 clock advance (+7 days).
5. Milestones M-E1 … M-E7
All independently PR-mergeable. Dependency order strict. Each milestone targets ≤ 2 working days.
M-E1 — Test module scaffold + TenantMiddleware + migration (2 days)
Scope:
- Prisma migration `20260420_add_test_tenant`: column + partial indexes on 11 tables.
- `apps/api/src/modules/test/`: module, controller, service, guard (env + secret gate).
- `TestController` routes: `POST /test/tenant/create`, `POST /test/cleanup` (rows only; external sweeps deferred to M-E6).
- `TenantMiddleware` + ClsModule wiring + Prisma client extension for tenant scoping.
- Env vars `ENABLE_TEST_ENDPOINTS` + `TEST_HARNESS_SECRET` added to `.env.example` + Dokploy dev config.
- Unit tests for middleware (tenant-scoped reads, tenant-tagged writes, prod-path bypass when header absent).
Exit criteria:
- Migration applies cleanly on dev DB (verify partial-index presence).
- With `ENABLE_TEST_ENDPOINTS=false` → module tree absent from the Nest registry; routes return 404.
- With flag on → `POST /test/tenant/create` returns UUID; `POST /test/cleanup` deletes tenant-tagged rows only.
- Tests green, no prod-path regression.
Deps: none (foundational).
M-E2 — ClockService + DelayService refactor (2 days)
Scope:
- `apps/api/src/common/clock/`: `ClockService` abstract, `SystemClockService`, `FakeClockService`.
- `apps/api/src/common/delay/`: `DelayService.schedule()` wrapper.
- Refactor ~40 call sites across Booking, Quote, SOS, Availability, Review, Credentials + 4 BullMQ processors to use `clock.now()` + `delay.schedule()`.
- `POST /test/advance-time?ms=N` endpoint wired to `FakeClockService.advance()` + `DelayService.flushDueBy(clock.now())`.
- Unit tests: FakeClock monotonicity, DelayService job promotion, production throw-on-advance.
Exit criteria:
- Zero `new Date()` / `Date.now()` calls in business-logic paths (enforced via Biome custom rule OR ripgrep CI gate).
- `/test/advance-time` fires due BullMQ jobs within same request cycle.
- All 240+ existing BE tests still green.
Deps: M-E1 (test module for endpoint).
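The zero-direct-clock-call exit criterion could be backed by a very small source scan while the Biome rule or ripgrep gate is being set up. The sketch below is illustrative only: `findBannedClockCalls` is a hypothetical name, the match is line-based and would also hit comments and strings, and leaving `new Date(value)` conversions unflagged is an assumption.

```typescript
// Matches argument-less `new Date()` and `Date.now()`.
// `new Date(value)` conversions are intentionally not flagged (assumption).
const BANNED_CLOCK_CALL = /\bnew\s+Date\(\s*\)|\bDate\.now\(\s*\)/;

// Returns the 1-based line numbers that contain a banned call.
function findBannedClockCalls(source: string): number[] {
  const hits: number[] = [];
  source.split("\n").forEach((line, index) => {
    if (BANNED_CLOCK_CALL.test(line)) hits.push(index + 1);
  });
  return hits;
}
```

A CI step could run this over files under `apps/api/src/modules/` and fail when any list is non-empty.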
M-E3 — Geo feed + SSE event tap + seed fixtures (1.5 days)
Scope:
- `POST /test/geo-feed` writes PostGIS point via `ProfessionalService.updateLocation()`.
- `GET /test/ws-tap/:channel` SSE endpoint: subscribes to Redis `events:*` pattern scoped by tenant, streams filtered events to Playwright.
- 6 scenario seed fixtures under `apps/api/src/modules/test/fixtures/`, each idempotent and tenant-scoped.
- `POST /test/seed/:scenario?tenant=<uuid>` dispatcher.
Exit criteria:
- Manually: `curl POST /test/geo-feed` → `SELECT ST_AsText(location) FROM professional_profile` returns updated point.
- `curl` SSE stream against tap receives Redis-published events in real time.
- All 6 scenarios seed without FK violations; rerun on same tenant is idempotent.
Deps: M-E1, M-E2.
M-E4 — Playwright harness + test-api.ts + Scenario 01 (2 days)
Scope:
- `apps/mobile/test/e2e-web/` workspace: `playwright.config.ts`, `package.json` (new scripts `test:e2e`, `test:e2e:full`), `tsconfig.json`.
- `lib/test-api.ts` HTTP client (§3.6).
- `lib/sync.ts`: `waitForBookingStatus` (polling) + `waitForWsEvent` (SSE consumer).
- `lib/fixtures.ts`: Playwright fixtures `testTenant`, `testApi`, `consumerPage`, `proPage`.
- `lib/artifacts.ts`: failure artefact bundler (screenshots + WS transcript + BE log snapshot).
- `scenarios/01-booking-full-cycle.spec.ts`.
- Husky pre-push does not run E2E (too slow; unit tests only per Q5).
Exit criteria:
- `pnpm test:e2e -- --grep "Scenario 01"` passes on local Mac against dev BE.
- Two browser contexts coordinate, test cleans up after itself (verify DB row count = 0 for tenant post-run).
- Failure run bundles artefacts to `apps/mobile/test/e2e-web/artifacts/<tenant-id>/`.
Deps: M-E1, M-E2, M-E3.
M-E5 — Scenarios 02 (SOS) + 03 (cancel) + 04 (chat) (2 days)
Scope:
- 3 scenario specs + seeds.
- `lib/stripe-test-cards.ts` helper.
- Disconnection/reconnection utilities in `lib/sync.ts`.
- SOS-specific: cascade-ordering assertion helper.
Exit criteria:
- All 3 scenarios pass 10 consecutive runs locally (flake < 1%).
- SOS cascade: p3 never sees offer (negative assertion).
Deps: M-E4.
M-E6 — Scenarios 05 + 06 + external service cleanup (2 days)
Scope:
- Scenarios 05 + 06 specs + seeds.
- Extend `/test/cleanup` to sweep Stripe (accounts + customers), Clerk (users), Novu (subscribers) scoped by tenant metadata prefix.
- Admin direct-API helper in `lib/admin-api.ts` (service-token auth for credential approve/reject).
Exit criteria:
- Scenarios 05 + 06 green.
- Post-cleanup verification: Stripe account list filtered by `metadata.test_tenant=<id>` returns zero; same for Clerk users; same for Novu subscribers.
- Twilio spend-alert webhook configured (Q9 budget guard, €20/mo threshold).
Deps: M-E5.
M-E7 — Scripts + docs + pre-demo smoke (1 day)
Scope:
- Root `pnpm test:e2e` → `pnpm --filter @ideony/mobile-e2e-web test:e2e` (bail on first fail per Q8).
- Root `pnpm test:e2e:full` → same without bail.
- `apps/mobile/test/e2e-web/README.md`: how to run, how to add a scenario, how to read failure artefacts.
- `scripts/e2e-smoke.sh`: pre-demo invoker using `:full` variant + Slack webhook on completion.
- CLAUDE.md update (Testing section) documenting the new commands.
Exit criteria:
- `pnpm test:e2e` end-to-end green, wall time < 15 min for all 6 scenarios on laptop.
- Smoke script integrates with demo-prep runbook.
- Post-demo TODO ticket filed: “Add cron `0 3 * * *` on Hetzner dev instance running `/opt/ideony/scripts/e2e-smoke.sh` w/ Slack alerts.”
Deps: M-E6.
Total: 7 milestones, ~12.5 working days.
6. Directory Structure
```
apps/api/
├── prisma/
│   └── migrations/
│       └── 20260420_add_test_tenant/migration.sql
├── src/
│   ├── common/
│   │   ├── clock/
│   │   │   ├── clock.service.ts          # abstract
│   │   │   ├── system-clock.service.ts   # prod impl
│   │   │   └── fake-clock.service.ts     # test impl
│   │   ├── delay/
│   │   │   └── delay.service.ts          # BullMQ wrapper
│   │   ├── prisma/
│   │   │   └── tenant-extension.ts       # client extension
│   │   └── tenant/
│   │       ├── tenant.cls-store.ts
│   │       └── tenant.middleware.ts
│   └── modules/
│       └── test/
│           ├── test.module.ts            # conditional import
│           ├── test.controller.ts
│           ├── test.service.ts
│           ├── test.guard.ts
│           └── fixtures/
│               ├── booking-full-cycle-rome.ts
│               ├── sos-burst-pipe-rome.ts
│               ├── cancel-with-refund.ts
│               ├── chat-thread.ts
│               ├── credential-trust-ripple.ts
│               └── review-round-trip.ts
└── test/
    └── integration/
        ├── admin/                        # single-actor admin tests (Q10)
        │   └── credential-approval.spec.ts
        └── multi-role/
            └── ws-multiclient.spec.ts    # research-doc §4.2 fast layer (optional mini-add)

apps/mobile/
└── test/
    └── e2e-web/                          # Playwright workspace (Q7, web only)
        ├── package.json                  # name: @ideony/mobile-e2e-web
        ├── playwright.config.ts
        ├── tsconfig.json
        ├── README.md
        ├── lib/
        │   ├── test-api.ts               # HTTP client to /test/*
        │   ├── admin-api.ts              # service-token admin calls
        │   ├── sync.ts                   # waitForBookingStatus / waitForWsEvent
        │   ├── fixtures.ts               # Playwright fixtures
        │   ├── artifacts.ts              # failure artefact bundler
        │   └── stripe-test-cards.ts
        ├── scenarios/
        │   ├── 01-booking-full-cycle.spec.ts
        │   ├── 02-sos-cascade.spec.ts
        │   ├── 03-cancel-refund.spec.ts
        │   ├── 04-chat-delivery.spec.ts
        │   ├── 05-credential-approval.spec.ts
        │   └── 06-review-round-trip.spec.ts
        └── artifacts/                    # .gitignored; failure bundles

scripts/
└── e2e-smoke.sh                          # pre-demo invoker
```

Workspace packaging: apps/mobile/test/e2e-web is a distinct pnpm workspace package (not nested in apps/mobile’s package.json) so Playwright deps don’t inflate the mobile app bundle. Root pnpm-workspace.yaml adds apps/mobile/test/e2e-web.
7. CI + Local Integration
7.1 Local run (MVP 0 primary path per Q5)

```bash
# One-time setup
pnpm install
pnpm --filter @ideony/mobile-e2e-web exec playwright install chromium

# Run all 6 scenarios, bail on first fail
pnpm test:e2e

# Run all, report all (for pre-demo smoke)
pnpm test:e2e:full

# Run single scenario
pnpm test:e2e -- --grep "Scenario 01"
```

Target env: apps/mobile/test/e2e-web/.env.test points at dev BE:
- `TEST_API_URL=https://api.ideony.is-a.dev` (post-named-tunnel) or Quick Tunnel URL
- `TEST_HARNESS_SECRET=<shared secret, rotated>`
- `STRIPE_TEST_CARD=4242424242424242`
- `CLERK_FRONTEND_API=humble-garfish-77.clerk.accounts.dev`
Husky pre-push: does NOT run E2E (unit tests only). Multi-role suite is manually triggered pre-demo / pre-merge.
7.2 Post-demo cron (Q5 activation)
Target: 2026-04-21+ (after demo). Add on Hetzner dev instance (178.104.154.74):

```
0 3 * * * /opt/ideony/scripts/e2e-smoke.sh >> /var/log/ideony-e2e.log 2>&1
```

Script body:
- `cd /opt/ideony && git pull --rebase`
- `pnpm install --frozen-lockfile`
- `pnpm test:e2e:full --reporter=json > /tmp/e2e-report.json`
- On fail: POST Slack webhook with failed scenarios + link to artefact tarball uploaded to R2.
7.3 GitHub Actions (deferred — not MVP 0)
Structure sketched but not implemented in Phase E. When activated post-revenue:
- Self-hosted ARM64 runner (already exists for build/deploy).
- Matrix over 6 scenarios, `fail-fast: false`, 15-min timeout, artefact upload.
- Trigger: `workflow_dispatch` + `pull_request` for paths `apps/api/src/modules/{booking,sos,credentials,chat,reviews,dispatch}/**` + `apps/mobile/app/**`.
8. Risk List + Mitigations
| # | Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|---|
| R1 | test_tenant column forgotten on a new table → cross-tenant leakage | Medium | High | Biome custom rule flagging Prisma models missing test_tenant; PR checklist item; M-E1 includes snapshot test of tagged-table list |
| R2 | ClockService refactor misses a call site → flaky timing test | Medium | Medium | Biome rule banning new Date() + Date.now() in apps/api/src/modules/** w/ exceptions list; M-E2 exit criteria enforces zero |
| R3 | External service (Stripe/Clerk/Novu) rate limit hit during test run | Low | Medium | Tenant isolation spreads creations; Twilio €20/mo budget alert (Q9); retry-with-backoff in test-api.ts |
| R4 | Dev DB schema drift between migrations and E2E scenario seeds | Medium | Medium | Seed fixtures import Prisma client types directly (compile-time guarantee); CI job runs pnpm prisma migrate deploy before E2E |
| R5 | Playwright browser context auth races (both actors using same Clerk session) | Medium | High | lib/fixtures.ts creates two Clerk users per test, uses BAPI session creation pattern (reference_clerk_e2e memory) for independent JWTs |
| R6 | SSE tap drops events on reconnection → waitForWsEvent hangs | Low | Medium | Server-side buffer last 50 events per tenant in Redis; tap replays on connect |
| R7 | FakeClock.advance() + BullMQ job firing race condition | Medium | High | DelayService.flushDueBy() awaits all promoted jobs’ completed event before resolving; scenario tests add assertion after advance |
| R8 | Cleanup doesn’t sweep new external resources added in future modules | Medium | Medium | Cleanup service uses reflect-metadata-driven registry — any module that adds external resources must register a sweep callback; lint rule enforces |
| R9 | Test run on local machine blocks dev (port conflicts, Clerk rate) | Low | Low | Playwright uses dev BE (not local); no local BE needed; rate-limit risk covered in R3 |
| R10 | Scenarios depend on seed data that conflicts with each other in parallel runs | Medium | Medium | Every scenario allocates its own test_tenant UUID — no shared seeds; parallel safe by design |
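The R6 mitigation (buffer the last 50 events per tenant and replay on connect) can be illustrated with a bounded per-tenant buffer. This sketch keeps the buffer in process memory purely for illustration; the mitigation above stores it in Redis, and the class name `EventReplayBuffer` is hypothetical.

```typescript
// Keeps the most recent `capacity` events per tenant and hands back a copy on replay.
class EventReplayBuffer<T> {
  private readonly byTenant = new Map<string, T[]>();
  private readonly capacity: number;

  constructor(capacity: number = 50) {
    this.capacity = capacity;
  }

  push(tenant: string, event: T): void {
    const events = this.byTenant.get(tenant) ?? [];
    events.push(event);
    // Drop the oldest event once the buffer is over capacity.
    if (events.length > this.capacity) events.shift();
    this.byTenant.set(tenant, events);
  }

  // Returns buffered events oldest-first; callers get a copy, not the live array.
  replay(tenant: string): T[] {
    return [...(this.byTenant.get(tenant) ?? [])];
  }
}
```

On SSE reconnect the tap would send `replay(tenant)` before resuming the live stream, so `waitForWsEvent` can still match an event that fired during the gap.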
9. NOT In Scope
Explicit exclusions. Do not implement in Phase E; add to the post-MVP 0 backlog if/when justified:
- 3-actor scenarios (consumer + pro + admin-live). Q10 deferred: admin flows covered via single-actor integration tests in `apps/api/test/integration/admin/`. Revisit when admin surface grows (disputes, moderation, SOS override).
- Mobile-native E2E (iOS/Android Maestro). Q7 deferred: Expo web covers 95% of mobile UI logic; native-specific bugs (push tokens, deep links, file-picker) caught manually via EAS preview builds + TestFlight. Add Maestro when mobile traction + revenue justify.
- Visual regression inside multi-role flows. Q6 deferred: separation of signal mandate; multi-role is async + flake-prone, visual snapshots amplify flake. Dedicated single-actor visual suite post-MVP 0 once 10+ visual bugs surface.
- TestRigor / other AI-authored DSL. Q4 rejected: €300+/mo pre-revenue + lock-in. Revisit in 6mo if cofounders explicitly blocked from authoring PR reviews on Playwright TS.
- City simulator / algorithm validation (Bolt/Glovo style). Research §1.4: premature; no historical data yet. Phase F+ concern.
- Synthetic canaries in prod (Checkly style). Research §1.6: post-v1 only; needs prod env first.
- GitHub Actions CI integration. Local-only per Q5; add in post-revenue infra hardening phase.
- WebSocket multi-client integration tests (research §4.2 fast layer). Optional: M-E1 directory structure reserves `apps/api/test/integration/multi-role/` for future adds, but not shipped in Phase E.
10. Change Log
- 2026-04-20: Spec created. All 10 locked decisions consolidated. 6 canonical scenarios detailed. 7 milestones scoped (~12.5 days). 10 risks catalogued. Ready for M-E1 kickoff.