@hainenber correctly pointed out that we should match
docker/entrypoints/run-server.sh. The 4-worker config also exposed timing
races in two SQL Lab → Explore Playwright specs ("creates a dataset from
query results", "should navigate to Explore from SQL Lab query results")
that don't reproduce on master or with 1 worker.
1 worker × 20 gthread threads still handles the concurrent test load fine
and is what production runs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Per @Copilot's PR review:
1. cypress-run-all now polls /health for up to 60s before launching
   Cypress, mirroring playwright-run. This avoids a race where the first
   spec hits the server before gunicorn finishes binding.
2. cypress-run-all now installs an EXIT trap that emits the gunicorn
   log and kills the process, so a test-runner failure under set -e
   still leaves us with the backend log and no orphan gunicorn.
3. Renamed flasklog / flaskProcessId to serverlog / serverPid and
   updated log group titles ("Flask log" -> "gunicorn log") in both
   functions. The descriptive comments still mention `flask run` for
   historical context.
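
Items 1 and 2 above can be sketched roughly as follows. This is a minimal illustration, not the code in bashlib.sh: the function names `wait_for_health` and `cleanup` are hypothetical, and the `::group::` markers assume the log groups are GitHub Actions log groups. Only `serverlog`, `serverPid`, the `/health` endpoint and the 60s budget come from the description above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: poll a URL once per second until it answers with a
# success status or the time budget (default 60s) runs out.
# Returns 0 when the server is up, 1 on timeout.
wait_for_health() {
  local url="$1" timeout_secs="${2:-60}" i
  for ((i = 0; i < timeout_secs; i++)); do
    if curl --silent --fail --output /dev/null "$url"; then
      return 0
    fi
    sleep 1
  done
  return 1
}

# Hypothetical EXIT handler: print the backend log and stop the server.
# Both steps tolerate failure so the handler itself never aborts under set -e.
cleanup() {
  echo "::group::gunicorn log"
  cat "$serverlog" 2>/dev/null || true
  echo "::endgroup::"
  kill "$serverPid" 2>/dev/null || true
}

# Usage sketch (not executed here; the gunicorn arguments are elided):
#   gunicorn ... > "$serverlog" 2>&1 &
#   serverPid=$!
#   trap cleanup EXIT
#   wait_for_health "http://localhost:${port}/health" 60
```

Installing the trap before the health poll means a timeout during startup still prints the log and stops the server.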
Both `cypress-run-all` and `playwright-run` started the Superset backend
with `flask run --no-debugger -p $port`. The Flask development server is a
single process with no crash recovery, so heavy tests, most notably
`playwright/tests/dashboard/export.spec.ts:61` (Export YAML) and
`dashboard-list.spec.ts:266` (Import zip), can knock the backend offline
for the rest of the run. Subsequent tests then cascade-fail with
`ECONNREFUSED`, `socket hang up`, `Missing CSRF token`, and
`page.goto: net::ERR_ABORTED; maybe frame was detached`.
Across the last 50 master runs of the E2E workflow, 6 failed (12%),
every single one with this signature.
Switch both runners to gunicorn with the same shape used in
`docker/entrypoints/run-server.sh`:
- `--workers 4 --worker-class gthread --threads 20`: concurrency that
  matches what the real product runs.
- `--timeout 120`: kill stuck workers instead of letting them hang the
  entire suite.
- `--max-requests 500 --max-requests-jitter 50`: recycle workers
  periodically so memory accumulation from long suites doesn't OOM the
  process.
- `--access-logfile - --error-logfile -`: keep the same per-run log
  capture pattern.
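
Put together, the flags above amount to an invocation along these lines. This is a sketch only: the helper name `build_gunicorn_cmd`, the bind address, the example port and the app target `superset.app:create_app()` are assumptions for illustration, not taken from the change itself.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: assemble the gunicorn argument list into an array so
# each flag stays a separate word and can be inspected before launching.
build_gunicorn_cmd() {
  local port="$1"
  GUNICORN_CMD=(
    gunicorn
    --bind "127.0.0.1:${port}"    # assumed bind address
    --workers 4
    --worker-class gthread
    --threads 20
    --timeout 120                 # restart workers that stop responding
    --max-requests 500            # recycle a worker after ~500 requests
    --max-requests-jitter 50      # stagger restarts across workers
    --access-logfile -            # access log to stdout
    --error-logfile -             # error log to stderr
    "superset.app:create_app()"   # assumed application target
  )
}

build_gunicorn_cmd 8081
# Print the command instead of running it; a runner would launch it in the
# background with: "${GUNICORN_CMD[@]}" > "$serverlog" 2>&1 &
printf '%s\n' "${GUNICORN_CMD[*]}"
```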
Only frontend (JS) coverage is captured in E2E (verified — bashlib.sh
only instruments the JS assets), so multi-worker gunicorn doesn't break
the existing coverage path.
* chore: Adding pip-compile-multi et al
* Specify requirements.txt path for fossa
* [ci] Fixing CI
Co-authored-by: John Bodley <john.bodley@airbnb.com>
Co-authored-by: Jesse Yang <jesse.yang@airbnb.com>
* build: collect code coverage from Cypress
Collect frontend code coverage reports from Cypress tests and add
proper tagging for all tests.
* Fix bash script lint error from shellcheck
* Revert Cypress to 4.3.0 to see if it fixes a failing test