0026 — E2E strategy: full-stack web + iOS + Android, 4 envs, 3 cadences

Status: Accepted
Date: 2026-04-21
Supersedes: parts of the locked decisions in docs/specs/2026-04-20-phase-e-multi-role-e2e-spec.md (Q5, Q7). Q1–Q4, Q6, Q8–Q10 from that spec remain Accepted; this ADR widens scope without re-opening them.
Related spec: docs/specs/2026-04-21-e2e-strategy.md (the WHAT; this ADR is the WHY).

The cofounder demo went well on 2026-04-20. The Phase E spec scoped multi-role E2E to web-only, local-only, 2-actor runs on the Hetzner dev box. Since then the product has grown: Maestro-ready screens on iOS + Android, a prod deploy surface that already serves traffic, several endpoints with no rate-limit coverage, and a chat surface whose implementation is known to be incomplete. The Phase E scope is now too narrow. We need a single strategy that covers:

  1. What platforms to test — web, iOS, Android — and whether to stagger or launch together.
  2. Which environments the suite runs against — local only, or also dev / staging / prod.
  3. When tests run in CI — per-commit, per-deploy, or nightly.
  4. How much to spend on device cloud infrastructure pre-revenue.
  5. How to order the implementation milestones given the cofounder-demo pressure has just lifted.

The answers need to be locked so we stop re-brainstorming every time a new surface ships.

Five locked decisions:

Run the same spec files against local + dev + staging + prod. Switch via the TEST_ENV env var, which maps to API_BASE_URL. Seed real test users on prod too, isolated via a test_tenant UUID column (Phase E spec §3 Prisma extension, already designed).

  • Chosen: 4-env matrix with prod smoke subset
  • Rejected: local-only (Phase E Q5 original choice) — missed that the single Hetzner instance IS our current prod, and that a post-CD smoke against the deployed URL catches deploy-time regressions local runs cannot
  • Rejected: separate staging-only suite — double maintenance, drift risk
  • Safety: ENABLE_TEST_ENDPOINTS=true gated per env in Dokploy; /test/cleanup idempotent + tenant-scoped (never touches rows without test_tenant set); prod smoke = subset (auth + health + one happy-path booking), not the 6 multi-role scenarios
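The env switch above could be sketched roughly as follows. This is a minimal illustration, not the project's actual config: the four env names follow the decision, but every URL, the `TestTarget` shape, and the `resolveTarget` helper are hypothetical. Only `TEST_ENV` and `API_BASE_URL` come from this ADR.

```typescript
// Sketch: map TEST_ENV to an API base URL and flag prod as smoke-only.
// All URLs below are placeholders (assumptions), not real endpoints.
type TestEnv = "local" | "dev" | "staging" | "prod";

interface TestTarget {
  env: TestEnv;
  apiBaseUrl: string; // the value that would be exported as API_BASE_URL
  smokeOnly: boolean; // prod runs only the smoke subset
}

const BASE_URLS: Record<TestEnv, string> = {
  local: "http://localhost:3000",
  dev: "https://dev.example.invalid",
  staging: "https://staging.example.invalid",
  prod: "https://app.example.invalid",
};

// Type guard: true only for the four known env names.
function isTestEnv(value: string): value is TestEnv {
  return Object.prototype.hasOwnProperty.call(BASE_URLS, value);
}

// Resolve the target for a run; an unset TEST_ENV falls back to local,
// and an unknown value fails loudly instead of silently hitting prod.
function resolveTarget(testEnv: string | undefined): TestTarget {
  const env = testEnv ?? "local";
  if (!isTestEnv(env)) {
    throw new Error(`Unknown TEST_ENV "${env}"`);
  }
  return { env, apiBaseUrl: BASE_URLS[env], smokeOnly: env === "prod" };
}
```

A test runner config would call something like `resolveTarget(process.env.TEST_ENV)` once and pass `apiBaseUrl` to the specs, so the spec files themselves stay identical across environments.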

Three stacked cadences:

| Cadence | When | Scope | Budget |
| --- | --- | --- | --- |
| Pre-merge smoke | every PR | unit + API-integ + 10 Playwright flows | <5 min |
| Post-CD smoke | after scripts/deploy.sh succeeds | 5 Playwright flows against prod URL | <1 min |
| Nightly full | 03:00 UTC cron on Hetzner | full Playwright + Maestro iOS + Maestro Android + 6 multi-role | <30 min |

  • Chosen: three cadences, each triggered by a different signal
  • Rejected: single “run everything on every push”, because of GH Actions billing pain and because the slow suite would block fast feedback
  • Rejected: pre-merge only — misses deploy regressions and env drift
  • MVP minimum (while GH Actions billing off): local on-demand via pnpm e2e:all + post-CD smoke already in scripts/deploy.sh. Pre-merge returns via self-hosted Hetzner runner (€0) when billing blocks persist, or via GH Actions when they clear. Nightly activates post-MVP0.
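One way to keep the three cadences in a single place is a small lookup keyed by trigger. The sketch below assumes hypothetical trigger and suite names; only the cadence names and time budgets are taken from the table in this ADR.

```typescript
// Sketch: one record per cadence, keyed by the signal that triggers it.
// Trigger keys and suite identifiers are illustrative assumptions.
type Trigger = "pull_request" | "post_deploy" | "nightly";

interface Cadence {
  name: string;
  suites: string[];
  budgetMinutes: number; // upper bound from the cadence table
}

const CADENCES: Record<Trigger, Cadence> = {
  pull_request: {
    name: "Pre-merge smoke",
    suites: ["unit", "api-integration", "playwright-smoke"],
    budgetMinutes: 5,
  },
  post_deploy: {
    name: "Post-CD smoke",
    suites: ["playwright-prod-smoke"],
    budgetMinutes: 1,
  },
  nightly: {
    name: "Nightly full",
    suites: ["playwright-full", "maestro-ios", "maestro-android", "multi-role"],
    budgetMinutes: 30,
  },
};

// True when a finished run stayed strictly under its cadence budget.
function withinBudget(trigger: Trigger, elapsedMs: number): boolean {
  return elapsedMs < CADENCES[trigger].budgetMinutes * 60_000;
}
```

A runner script could report a budget overrun as a warning rather than a failure at first, then tighten it once the suites stabilize.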

Implement milestones C → A → B:

  1. C (M1): demo walkthroughs — 3 Playwright specs for cofounder/investor paths
  2. A (M2): deploy smoke — 5 Playwright flows integrated into scripts/deploy.sh post-CD step
  3. B (M4): multi-role harness — carries Phase E M-E1..M-E7 as-is
  • Chosen: demo value first, deploy confidence second, deep harness third
  • Rejected: harness first — highest engineering effort but lowest demo value until the walkthrough tests exist to prove the harness works
  • Note: post-demo the priority shifts: M2 (deploy smoke) + M3 (webhook + i18n + rate-limit gap) become most urgent because they close MVP0 launch risks

Test iOS + Android both from day 1 via Maestro against local simulators + emulators.

  • Chosen: both platforms in every Maestro milestone
  • Rejected: iOS-first-then-Android — the 20% of bugs that are platform-divergent (push notifs in background, biometrics, deep-links, keyboard behavior, back-button semantics) only surface when both run. Deferring Android means finding these bugs right before launch, when the cost is highest.
  • Rejected: web-only until mobile traction (Phase E Q7 original choice). Overridden because Expo renders both mobile platforms from our single codebase and Maestro flows are declarative YAML that is cheap to author per platform
  • Cost: €0 — local Xcode simulator + Android Studio emulator
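Running each flow on both platforms can be as simple as generating one CLI invocation per device. The sketch below assumes the Maestro CLI's `--device` option followed by the `test` subcommand; the device identifiers and the flow path are hypothetical placeholders.

```typescript
// Sketch: build one Maestro invocation per platform for the same flow file.
// Device ids are placeholders (assumptions); substitute the local
// simulator UDID and emulator serial.
type Platform = "ios" | "android";

const DEVICES: Record<Platform, string> = {
  ios: "IOS-SIMULATOR-UDID",
  android: "emulator-5554",
};

interface MaestroCommand {
  platform: Platform;
  argv: string[];
}

// Returns the argv for each platform, iOS first, then Android.
function maestroCommands(flowPath: string): MaestroCommand[] {
  return (Object.keys(DEVICES) as Platform[]).map((platform) => ({
    platform,
    argv: ["maestro", "--device", DEVICES[platform], "test", flowPath],
  }));
}
```

Keeping the platform list in one record means a flow cannot be added for iOS only by accident: every flow path fans out to both devices.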

Free tier now, SOTA post-revenue.

  • Chosen (now): local simulators + LambdaTest free tier (5 min/day real devices) for weekly sanity
  • Chosen (growth, post-cofounder): BrowserStack App Automate ($29/mo) — 2 parallel CI runs
  • Chosen (SOTA, post-revenue): BrowserStack App Live ($99/mo) — unlimited manual + automation, 100+ devices
  • Rejected: Sauce Labs Enterprise ($500+/mo) — over-provisioned pre-revenue
  • Rejected: BrowserStack App Live at MVP 0 — $99/mo unjustified until we have paying customers

Positive:

  • One strategy, four envs, three cadences — no more per-surface re-brainstorming
  • Real test users on prod catch webhook-signature drift, Clerk-claim changes, Stripe endpoint-ID mismatches that dev envs miss
  • Both mobile platforms in tests from day 1 → divergence bugs caught in development, not launch week
  • MVP0 path unblocked even with GH Actions billing off (local + post-CD is enough)

Negative:

  • Prod test-user pool requires operational discipline — TEST_HARNESS_SECRET rotation, Dokploy flag toggling, test_tenant enforcement via Prisma extension. Mitigated by double-gated TestGuard (env + signed header).
  • Nightly cron on Hetzner is another thing to monitor. Mitigated by making the Slack webhook report the first deliverable of M6, so failures are never silent.
  • 20% more test authoring time (Android alongside iOS) — accepted because the alternative is finding Android bugs at launch.
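The double gate behind TestGuard (env flag plus signed header) might look roughly like this. It is a sketch under assumptions: HMAC-SHA256 over the raw request body and a hex-encoded signature are illustrative choices, and `GuardInput`, `sign`, and `isTestRequestAllowed` are hypothetical names. Only `ENABLE_TEST_ENDPOINTS` and `TEST_HARNESS_SECRET` come from this ADR.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Sketch: both gates must pass before a /test/* endpoint is served.
interface GuardInput {
  enableTestEndpoints: string | undefined; // value of ENABLE_TEST_ENDPOINTS
  secret: string | undefined; // value of TEST_HARNESS_SECRET
  body: string; // raw request body
  signatureHex: string | undefined; // signature header value (header name is an assumption)
}

// HMAC-SHA256 of the body, hex-encoded. The scheme is an assumption.
function sign(secret: string, body: string): string {
  return createHmac("sha256", secret).update(body).digest("hex");
}

function isTestRequestAllowed(input: GuardInput): boolean {
  // Gate 1: the env flag must be exactly "true".
  if (input.enableTestEndpoints !== "true") return false;
  // Gate 2: a secret and a signature must both be present and must match.
  if (!input.secret || !input.signatureHex) return false;
  const expected = Buffer.from(sign(input.secret, input.body), "hex");
  const given = Buffer.from(input.signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === given.length && timingSafeEqual(expected, given);
}
```

Rotating `TEST_HARNESS_SECRET` then invalidates every previously signed request, and turning the Dokploy flag off disables the endpoints even if the secret leaks.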

References:

  • docs/specs/2026-04-21-e2e-strategy.md (canonical spec, the WHAT)
  • docs/specs/2026-04-20-phase-e-multi-role-e2e-spec.md — Phase E architecture spec (Superseded-by footer; multi-role harness carried forward as M4 of new spec)
  • docs/plans/2026-04-21-rate-limit-throttler-audit.md — rate-limit impl plan (M3 dependency; deferred post-MVP0)
  • docs/testing.md — commands + conventions (how-to)
  • docs/roadmap.md — E2E M1–M7 milestones
  • ADR 0025 — LTS deps only (affects Maestro + Playwright version pins)