Compare commits

..

39 Commits

Author SHA1 Message Date
Mike Bridge
a0193dbab6 fix(versioning): replace session-state guard with InvalidRequestError catch
The previous attempt (d0520f6766) was too aggressive: skipping when
parent is in session.dirty/new/deleted bypassed the
persistent-and-clean case the hook EXISTS for. Some upstream code
paths put the dataset in session.dirty *before* this listener fires
(API controllers touching audit fields, etc.), so the
session-membership pre-check made us silently no-op on the very
scenario the hook needs to handle. CI symptom:
test_dataset_column_edit_creates_parent_version showed before=317,
after=317 (parent shadow not written).

Restore the unconditional flag_modified and catch the specific
InvalidRequestError that fires only for the session.new case
(uuid default callable hasn't populated state yet). Other states
fall through to the original behavior:
- persistent + clean → flag_modified succeeds, parent goes dirty,
  Continuum picks it up, SkipUnmodifiedPlugin keeps the row via
  _has_dirty_versioned_children. ✓
- persistent + dirty → flag_modified is harmless (already dirty).
- session.new → InvalidRequestError, skip (parent INSERTs anyway).
- session.deleted → flag_modified may or may not raise; if it does,
  we skip; if not, the delete dominates.

Should unblock test_dataset_column_edit_creates_parent_version,
test_get_version_returns_historical_snapshot_with_children, and
test_restore_with_column_edits_reverts_columns.
2026-05-20 16:47:12 -06:00
Mike Bridge
4c99cd68b6 chore(versioning): apply pre-commit autofixes (ruff + auto-walrus)
- ruff: import sort + E501 reflow on the parent-state guard in
  baseline.py
- ruff format: function-signature collapse and join-chain reflow in
  queries.py
- auto-walrus: two ``entity_kind = …; if … is not None:`` patterns
  in queries.py converted to assignment-expressions
2026-05-20 16:18:55 -06:00
Mike Bridge
0c79581ee9 fix(versioning): retention task FK violation on cross-entity transactions
When one ORM flush touches multiple versioned entities (dashboard +
slice + dataset all save at tx=X), each gets a shadow row sharing
that tx. If only the dashboard is later edited at tx=Y, the
dashboard row at tx=X is closed (end_tx=Y) while slice/dataset rows
stay live at tx=X. Retention then preserves tx=X (slice/dataset are
live there) and prunes tx=Y. The dashboard's closed row at tx=X
survives step 1, then its end_transaction_id=Y trips the FK when
step 2 deletes version_transaction row Y.

Fix: extend the shadow-row delete to also match end_transaction_id
IN tx_ids. Live rows have end_tx=NULL so they're never matched by
either predicate. Closed rows that touch a pruned tx at either
endpoint are pruned together — consistent with retention semantics
(any tx in the row's lifespan is gone, so the row's chain is broken
anyway).

Unblocks test_retention_prunes_old_rows on sqlite, mysql, postgres.
2026-05-20 16:17:21 -06:00
Mike Bridge
d0520f6766 fix(versioning): skip non-clean parents in force-parent-dirty hook
The force-parent-dirty listener was calling attributes.flag_modified
on every parent reachable from a dirty child — including parents
themselves in session.new (e.g. brand-new SqlaTable + brand-new
TableColumns from POST /api/v1/dataset/). flag_modified rejects
unloaded attributes, and a session.new SqlaTable's uuid (default=uuid4
fires at flush time) is unloaded until then. CI caught this with
InvalidRequestError cascading into 422s across dataset creation /
upload / Playwright dataset specs.

The hook is only needed for the persistent-and-clean case (child
edited, parent's own scalars untouched, dropdown otherwise empty).
Anything in session.new will flush anyway; anything in session.dirty
is already flagged; session.deleted shouldn't be touched. Short-
circuit before the flag_modified call.

Unblocks test-sqlite, test-mysql, test-postgres (previous), and
playwright dataset specs.
2026-05-20 16:13:03 -06:00
Mike Bridge
77236afa14 refactor(versioning): apply cross-PR review feedback (#39977 H1/M3/M5)
Three small follow-ups surfaced by aminghadersohi's review of the
SoftDeleteMixin PR (#39977) that apply equally here:

- H1: cache _child_to_parent_registry() with functools.cache. Called
  twice per save flush; mapping depends only on import-time model
  classes, so unbounded cache is the right shape (no invalidation).
- M5: tighten _CHILD_BASELINE_HANDLERS type from dict[str, Any] to
  dict[str, Callable[[Session, Any, int], None]] via a named alias.
  Mypy now catches a future broken handler signature.
- M3/M4: explain the inline-import pattern once in the module
  docstrings of baseline.py and changes.py. Both modules use
  pylint disable=import-outside-toplevel uniformly because they
  load during init_versioning() before mappers are configured;
  the per-callsite "why" comments would just repeat the same
  reason. Module-level explanation + a hint to comment unusual
  cases is the cleaner shape.

M6 (listener placement) doesn't apply — init_versioning() already
runs inside init_app_in_ctx(). M8 (loose OpenAPI schema in
*/api.py docstrings) is real but its own change.
2026-05-20 14:12:02 -06:00
Mike Bridge
9d5a459840 docs(versioning): record why SkipUnmodifiedPlugin doesn't clean up orphan version_transaction rows inline
Extends the existing docstring note ("the orphan is swept by retention")
with the reasoning behind not cleaning it up in the same flush. The
inline-delete is appealing in principle but would couple this plugin
to the change-records listener's buffer state via the ON DELETE
CASCADE on ``version_changes.transaction_id``: both listeners would
have to agree that the flush produced nothing before the version_transaction
row could be dropped safely. The orphan's ~40-byte storage cost +
retention's correct-by-construction handling (orphans have no parent
shadow, so they're never in the "preserve" set) make the coordination
overhead not worth it.

Captures the design decision in the file where the next reader will
look for it.
2026-05-19 18:42:06 -06:00
Mike Bridge
1ac9e50836 tidy(versioning): reading-order shuffle in baseline.py (newspaper-article order)
Pure file shuffle, zero behaviour change. Reorders ``baseline.py`` so it
reads top-down by level of abstraction (newspaper-article rule): the
public entry point at the top, supporting helpers descending below.

Before: 14 private helpers, then ``register_baseline_listener`` at the
bottom. A reader opening the file met the leaf builders first and had
to accumulate context before finding the call site.

After (top-down):

  - Entry point: ``register_baseline_listener`` + inner ``capture_baseline``
  - High-level helpers used by ``capture_baseline``:
      ``_force_parent_dirty_on_child_change``,
      ``_collect_parents_to_baseline``,
      ``_child_to_parent_registry``,
      ``_version_table_for``,
      ``_shadow_row_count``,
      ``_insert_baseline_and_children``
  - Mid-level builders:
      ``_insert_baseline_row``,
      ``_baseline_children_for_parent``
  - Per-entity child handlers + their dispatch table:
      ``_baseline_dataset_children``,
      ``_baseline_dashboard_children``,
      ``_CHILD_BASELINE_HANDLERS``
  - Leaf builders:
      ``_insert_child_baseline_rows``,
      ``_baseline_attached_slices``,
      ``_insert_synthetic_slice_baseline``

Three section-divider comments mark the abstraction levels. The
``_CHILD_BASELINE_HANDLERS`` dict literal stays after its referenced
handlers (module-level literals evaluate at import time and need names
already bound); a comment now flags this constraint.

Function bodies are byte-for-byte unchanged; ``git log -L`` on any
function shows only its relocation. 96 unit tests pass.
2026-05-19 18:42:06 -06:00
Mike Bridge
80b8891e39 tidy(versioning): extract read_row_outside_flush helper
baseline.py:_insert_baseline_row and changes.py:_read_pre_state both
issued the same "read a single row through ``session.connection()``
inside ``with session.no_autoflush:``" pattern. Same five-line block,
same intent ("read the pre-flush state without triggering the in-flight
edit's flush").

Promoted to ``superset.versioning.utils.read_row_outside_flush(session,
table, entity_id)``. Companion to ``single_flush_scope`` — they sit
next to each other in utils.py and frame the two directions of the
"don't autoflush mid-listener" pattern.

Returns ``dict[str, Any]`` (or ``None``) so callers can't accidentally
hold a cursor-bound ``RowMapping`` past the listener boundary. Both
call sites get shorter by ~5 lines.

Also picks up Decimal stringification in the changes.py docstring
update (was listed in the W4 commit but the docstring still said
"(datetime, UUID, bytes)" — now matches the implementation).

Behaviour unchanged. 96 unit tests pass.
2026-05-19 18:42:06 -06:00
Mike Bridge
77c373616e tidy(versioning): extract shared helpers between list_versions and get_version
After the SRP split (8c9cf36) put both functions in the same module
~150 lines apart, their overlap became visible: same JOIN of
version_table → version_transaction → ab_user, same baseline-first
ordering, same user-row → ``changed_by`` projection, same lookup
``_ENTITY_KIND_BY_CLASS_NAME.get(model_cls.__name__)``. About 30 lines
of duplication.

Five small helpers extracted at the module top:

- ``_resolve_version_tables(model_cls)`` returns ``(ver_tbl, tx_tbl, user_tbl)``
- ``_version_with_tx_user_join(ver_tbl, tx_tbl, user_tbl)`` builds the join
- ``_baseline_first_ordering(ver_tbl)`` returns the order-by tuple
- ``_user_select_cols(user_tbl)`` returns the user-column list with
  ``user_id`` as the stable label (normalises the prior asymmetry
  where ``list_versions`` labelled it ``user_id`` and ``get_version``
  labelled it ``_user_id`` to dodge a column-name collision — the
  ``user_id`` label collides with neither)
- ``_changed_by_from_row(row)`` projects user columns onto the API shape
- ``_entity_kind_for(model_cls)`` resolves the change-records taxonomy lookup

Both call sites get shorter and read what they do (build query / project
user / build row) rather than how. Behavior unchanged; no test changes.

Also two small inline tidyings while in the file:

- Replace the ternary
  ``changes_by_tx = list_change_records_batch(...) if entity_kind else {}``
  with an explicit two-line if-statement in both functions. The ternary
  buries the decision; the if-statement reads as one thought.
- Inline the one-shot ``meta_cols`` set declaration in ``get_version``
  into the ``if col.name in {...}`` check that uses it three lines later.

Net: about 110 lines → about 80 lines across the two functions, plus
a small helper section at the top.
2026-05-19 18:42:06 -06:00
Mike Bridge
40653d52da refactor(versioning): sqlalchemy-review follow-ups (W1–W8)
Cleanup pass from the SQLAlchemy + migration code review. Eight items,
all in the "warnings / suggestions" tier — no behaviour change visible
to the API, but each closes a real correctness, perf, or maintainability
concern surfaced in review.

baseline.py
- Delete unused ``_get_user_id`` (W1). The function wrapped a broad
  ``except Exception:  # noqa: S110`` swallow that hid bugs; grep
  confirmed no callers anywhere. The legitimate audit-field paths
  (``row.get("changed_by_fk")`` etc.) already drive the
  ``version_transaction.user_id`` write.
- Batch ``_baseline_attached_slices`` from O(N) round-trips to
  three queries (W2): one membership SELECT, one existing-shadow
  SELECT, one bulk live-row SELECT for the missing ids. The previous
  per-slice ``COUNT(*)`` + ``SELECT`` was a measurable first-save
  hotspot on dashboards with many charts. Drops the now-unused
  ``_slice_has_shadow`` helper.
- Pick a stable column name for ``flag_modified`` in
  ``_force_parent_dirty_on_child_change`` (W3). ``uuid`` is on all
  three versioned parent classes and excluded by none, so the
  flagged attribute is deterministic across SQLAlchemy versions /
  mapper-config orders instead of depending on
  ``versioned_column_properties(parent)[0]``. Falls back to the
  first available column for forks that exclude ``uuid``.

changes.py
- Add ``Decimal`` handling to ``_jsonable`` (W4) — ``json.dumps``
  rejects ``Decimal``, so any numeric column (e.g. ``SqlMetric.currency``
  contents, or fork/plugin Decimal columns) would crash the bulk
  insert. Stringify rather than ``float()`` to preserve precision;
  the diff engine compares ``from_value`` / ``to_value`` by string
  equality after this coercion so both sides round-trip identically.

queries.py
- Promote the inline ``{0: "baseline", 1: "update", 2: "delete"}``
  dict to module-level ``_OP_TYPE_LABELS`` (W7). The literal was
  duplicated across ``list_versions`` and ``get_version``; the third
  caller is one bug fix away.
- Comment on ``resolve_version_uuid``'s Python-side ``derive_version_uuid``
  loop (W8) — no portable SQL form for UUIDv5 across PostgreSQL /
  MySQL / SQLite, iteration count is bounded by the retention
  window. Flags the place to revisit if retention is ever disabled
  (``=0``) on a heavily-edited entity.

migrations/2026-05-01_23-36 (composite-PK)
- Belt-and-braces guard in ``_downgrade_mysql_table`` (W6): asserts
  ``t.name in AFFECTED_TABLES`` before interpolating into the
  backtick-quoted ALTER statements. The invariant was already
  structurally implied (callers iterate ``AFFECTED_TABLES``), but
  making it load-bearing means a future refactor can't slip an
  arbitrary table name through.

(W5 was verified-no-change: grepped ``tests/`` for ``metadata.create_all``
callers that exercise versioning tables; none. The cascade-FK
gap on ``version_changes.transaction_id`` is already documented
in ``tests/integration_tests/versioning/change_records_tests.py:27-32``.)

62 versioning unit tests pass.
2026-05-19 18:42:06 -06:00
Mike Bridge
59045f8cfe refactor(versioning): split VersionDAO into queries + restore modules
VersionDAO carried five distinct concerns under one class — UUID
derivation, version metadata queries, change-record loading,
single-version snapshot retrieval, and restore orchestration. Bob's
"and" test (the clean-code review flagged this as the next structural
fix after the dead-code purge) gives ~600 lines of "queries about
versioned state of one entity AND the workflow that mutates it."

Splits the read and write sides into purpose-built modules:

- ``superset/versioning/queries.py`` — UUID derivation
  (``VERSION_UUID_NAMESPACE``, ``derive_version_uuid``) + read-side
  helpers (``find_active_by_uuid``, ``current_version_number``,
  ``current_live_transaction_id``, ``current_live_version_uuid``,
  ``list_versions``, ``resolve_version_uuid``, ``get_version``,
  ``list_change_records_batch``). ~475 lines.

- ``superset/versioning/restore.py`` — write-side (``restore_version``,
  ``_stamp_audit_fields_for_restore``, ``_RESTORE_RELATIONS``).
  ~140 lines. Depends only on ``queries.find_active_by_uuid`` and
  ``utils.single_flush_scope``.

- ``superset/daos/version.py`` — collapsed to an ~85-line backward-compat
  façade that re-exports both modules under a single ``VersionDAO``
  class via ``staticmethod`` aliases. The module also re-exports
  ``VERSION_UUID_NAMESPACE`` and ``derive_version_uuid`` at module level
  so the ~10 existing callers (api.py handlers, command classes, the
  ETag emitter, integration tests) don't have to change their imports.
  New code is encouraged to import from the sub-modules directly.

The functions themselves are unchanged byte-for-byte aside from
internal call sites being rewritten from ``VersionDAO.foo`` to the bare
function name (since they now live as module-level functions, not
class methods).

One unit-test mock target moved: ``test_restore_version_returns_none_for_unknown_entity``
now patches ``superset.versioning.restore.find_active_by_uuid`` (the
actual call site) instead of ``VersionDAO.find_active_by_uuid`` (which
is now just an alias).

Each of the three modules now has one reason to change. When the
sc-103157 soft-delete pass adds the ``deleted_at IS NULL`` filter to
``find_active_by_uuid``, it touches only ``queries.py``. When a
per-entity-type restore Strategy replaces the string-keyed
``_RESTORE_RELATIONS`` dispatch, it touches only ``restore.py``.
2026-05-19 18:42:06 -06:00
Mike Bridge
76bbb18fdb temp(versioning): strip URL params from dashboard restore navigation; regen lockfile
DashboardList demo dropdown previously instructed the user to "Reload
the page to see the change" after a restore. The URL the user
returns to may still carry ``?native_filters_key=…`` /
``permalink_key`` / ``form_data_key`` from a prior session — those
point at server-cached snapshots (in ``key_value`` and the
filter-state cache) captured before the restore. On rehydration the
cached state is merged on top of the restored ``json_metadata``,
masking the rollback (e.g. dashboard-level colour-scheme restore
appears not to take effect).

Replaces the alert + manual reload with a direct ``window.location.href``
navigation to ``/superset/dashboard/<uuid>/`` — drops all URL params,
forcing hydration from the freshly restored DB state.

Also regenerates ``package-lock.json`` to pick up the ``zod 4.4.1 →
4.4.3`` bump that master's ``package.json`` already reflects.

(``temp(versioning)`` prefix per the demo dropdown's status — this
file is not part of V1 scope per ADR-005; the V2 UI SIP owns the
actual restore UI surface.)
2026-05-19 18:42:06 -06:00
Mike Bridge
f4a18cfe98 refactor(versioning): rename find_active_by_uuid public + collapse restore commands onto BaseRestoreVersionCommand
Two coupled clean-code review fixes:

(1) Rename ``VersionDAO._find_active_entity_by_uuid`` →
``find_active_by_uuid``. The leading-underscore + three
``# pylint: disable=protected-access`` suppressions in the restore
commands were the smell of a wrongly-private API. The method is a
perfectly reasonable public DAO operation; dropping the underscore
removes the suppressions.

(2) Collapse ``RestoreChartVersionCommand``, ``RestoreDashboardVersionCommand``,
``RestoreDatasetVersionCommand`` onto a shared
``BaseRestoreVersionCommand`` (``superset/commands/version_restore.py``).
The three classes were textbook copy-paste — identical except for
the model class and three exception types. Each subclass now declares
``model_cls`` + ``not_found_exc`` + ``forbidden_exc`` and overrides
``run()`` with one ``@transaction(reraise=<failed_exc>)``-decorated
line delegating to ``self._do_restore()``. ~80 lines per file →
~45 lines per file; one shared workflow instead of three drift sources.

The api.py imports of ``RestoreChartVersionCommand`` /
``RestoreDashboardVersionCommand`` / ``RestoreDatasetVersionCommand`` are
unchanged — public class names preserved.
2026-05-19 18:42:06 -06:00
Mike Bridge
18abb81fe7 refactor(versioning): purge dataset_snapshots dead code + fix get_version bug
The full-Continuum spike (ADR-004 revised) replaced the JSON-snapshot
restore path with Continuum's native Reverter and removed the
``dataset_snapshots`` / ``dashboard_snapshots`` tables from the
migration chain. Seven VersionDAO methods and two module-level
helpers that read/wrote those tables stayed in the code anyway and
went unused — dead code that looked live.

Worse, ``VersionDAO.get_version`` still read from
``dataset_snapshots`` in its SqlaTable branch. On any environment
where the snapshot tables don't exist (current production behavior),
``GET /api/v1/dataset/<uuid>/versions/<version_uuid>/`` raised
``OperationalError``. The branch is rewritten to read column and
metric state from Continuum's child shadow tables
(``table_columns_version`` / ``sql_metrics_version``) via the
existing ``_shadow_rows_valid_at`` helper.

Deleted:
- ``_deserialize_snapshot_value`` (module helper)
- ``_coerce_snapshot_list`` (module helper)
- ``RESTORE_EXCLUDE_FIELDS`` (constant — only referenced by deleted code
  and a docstring)
- ``VersionDAO._restore_dataset_children``
- ``VersionDAO._parse_slice_ids_json``
- ``VersionDAO._apply_dashboard_slices``
- ``VersionDAO._restore_dashboard_children``
- ``VersionDAO._apply_snapshot_children``

The corresponding ~17 unit tests in
``tests/unit_tests/daos/test_version_dao.py`` are removed alongside.

Stale docstring references in ``versioning/changes.py`` and
``versioning/diff.py`` that pointed at the retired snapshot tables are
also cleaned up.

Also strips an 8-line comment block in ``restore_version`` that
duplicated the docstring of ``_stamp_audit_fields_for_restore``.

Net: −290 lines from ``daos/version.py``; a production-shape bug
fixed; dead code that looked live is gone.
2026-05-19 18:42:06 -06:00
Mike Bridge
9e580c699d refactor(versioning): single_flush_scope context manager + single-revert restore
VersionDAO.restore_version previously called Continuum's Reverter
once per relation in a split-revert loop with flush + expire between
calls. That closed an autoflush race in the Reverter when multiple
relations were reverted at once, but split one logical restore across
multiple Continuum transactions — and once the change-records listener
was wired up, the listener's tx-dedup guard skipped the second pass,
silently dropping child-addition records from version_changes. A
restore that re-added a calculated column would render as an empty
"Baseline" entry in the dropdown.

Replaces the split-revert with a single ``target_version.revert(relations=relations)``
call wrapped in a new ``single_flush_scope(db.session)`` context
manager (``superset/versioning/utils.py``). The context manager
suppresses autoflush inside the block and issues one trailing flush
on clean exit; on exception, the trailing flush is skipped so the
session's normal rollback path handles cleanup. Same autoflush window
closed, one Continuum transaction instead of N, the change-records
listener sees the complete shadow state in one after_flush pass.

The wrapper carries the full autoflush-race / cascade-add rationale
in its docstring so the restore_version call site can be a short
6-line block referencing it.

Integration coverage: ``test_restore_emits_full_child_diff_in_one_transaction``.
2026-05-19 18:42:06 -06:00
Mike Bridge
a62d85d798 feat(versioning): force-parent-dirty on versioned-child change
SQLAlchemy doesn't mark a parent as dirty when only its children
(``TableColumn`` / ``SqlMetric`` on ``SqlaTable``) are modified.
Continuum's UnitOfWork only creates operations for entities in
``session.dirty``, so a column-only edit produces shadow rows in
``table_columns_version`` but no parent shadow row in
``tables_version``. ``VersionDAO.list_versions`` queries the parent
shadow, so the version dropdown is empty for child-only saves —
exactly the failure mode reported when "I edited a column description
but no version appeared."

Extends ``register_baseline_listener`` with a new before-flush hook
``_force_parent_dirty_on_child_change`` that walks the existing
``_child_to_parent_registry`` and ``attributes.flag_modified(parent,
<first non-excluded versioned column>)`` whenever a versioned child
is dirty / new / deleted but the parent's own scalars haven't been
touched. The flag puts the parent in ``session.dirty`` so Continuum's
UoW creates a parent UPDATE operation; the resulting shadow row's
scalar columns mirror the previous version (only the children
actually changed), and the row exists to anchor the transaction in
the parent's version chain.

``SkipUnmodifiedPlugin._is_no_op_update`` is updated in this commit's
predecessor to recognize the "scalars match but children dirty" case
via ``_has_dirty_versioned_children`` so the forced parent UPDATE
isn't skipped.

Integration coverage: ``test_dataset_column_edit_creates_parent_version``.
2026-05-19 18:42:06 -06:00
Mike Bridge
8a46573018 feat(versioning): SkipUnmodifiedPlugin audit-key normalize for Dashboard.json_metadata
Continuum's no-op suppression compared post-flush column values
byte-for-byte against the previous live shadow row. For
``Dashboard.json_metadata`` that produced false-positive version rows
on saves where the user authored nothing — the frontend re-stamps
``map_label_colors`` (regenerated from the ``LabelsColorMap``
singleton) on every save, plus ``chart_configuration`` /
``global_chart_configuration`` / ``show_chart_timestamps`` /
``color_namespace`` (derived from the current chart set), so two
consecutive identical saves produce different bytes for the column.
The diff engine already excluded those keys via
``DASHBOARD_JSON_METADATA_AUDIT_KEYS`` when computing change records;
the skip-plugin diverged.

Adds a ``_COLUMN_NORMALIZERS`` registry keyed on
``(class_name, column_name)`` that maps to a per-column normalizer
applied to both pre- and post-image before equating. The first
entry parses ``Dashboard.json_metadata`` as JSON and drops the
audit-key set before comparing. The same registry is the extension
point for analogous transient fields on charts and datasets.

Promotes ``_DASHBOARD_JSON_METADATA_AUDIT_KEYS`` to a public name
(``DASHBOARD_JSON_METADATA_AUDIT_KEYS``) so the skip-plugin can import
it from ``superset.versioning.diff`` without reaching across a
leading-underscore boundary.

Integration coverage: ``test_map_label_colors_only_change_does_not_create_version``.
2026-05-19 18:42:06 -06:00
Mike Bridge
a0546b8a43 fix(importer): use ORM relationship assignment for dashboard_slices
The v1 import pipeline previously wrote dashboard ↔ chart membership
via raw Core DML (``db.session.execute(delete(dashboard_slices)…)`` +
``db.session.execute(insert(dashboard_slices)…)``). With Continuum's
M2M tracker enabled by the versioning feature, those Core writes
emit malformed shadow INSERTs into ``dashboard_slices_version`` —
the tracker can't see the composite-PK columns through the Core
layer and produces rows with only ``(transaction_id, operation_type)``
populated, triggering a ``NOT NULL`` violation on
``(dashboard_id, slice_id)``.

Rewrites both import paths (``ImportAssetsCommand._import`` in
``commands/importers/v1/assets.py`` and ``ImportDashboardsCommand._import``
in ``commands/dashboard/importers/v1/__init__.py``) to use ORM-level
``dashboard.slices = [...]`` reassignment followed by an explicit
``db.session.flush()``. The explicit flush is necessary to land the
M2M rows before any subsequent autoflush fires an inner-flush event
handler that would reset the relationship change (cf. the SAWarning
``Attribute history events accumulated on N previously clean instances
within inner-flush event handlers have been reset``).

The unit tests previously called ``_import`` directly twice in the same
session — production wraps ``run()`` in ``@transaction`` so each invocation
gets its own DB+Continuum transaction. Added ``db.session.commit()`` between
calls in ``test_import_adds_dashboard_charts``,
``test_import_removes_dashboard_charts``, and
``test_dashboard_import_with_overwrite_replaces_charts`` so the tests
mirror production semantics; otherwise the second call's M2M shadow
inserts conflict with the first call's on
``UNIQUE (dashboard_id, slice_id, transaction_id)``.
2026-05-19 18:42:06 -06:00
Mike Bridge
0afeda46a0 temp(versioning): demo version-history dropdowns + French i18n
Adds debug-only ``VersionHistoryDropdown`` widgets to the chart,
dashboard, and dataset list pages so the version surface can be
exercised from the UI during the spike. Each row's actions column
gets a clock-icon dropdown that fetches ``/api/v1/{resource}/<uuid>/
versions/`` on click, lists the ten most recent versions with a
formatted change-log summary, and offers per-version restore via
``POST .../versions/<uuid>/restore``.

Strings are wrapped in ``t('...')`` with placeholder formatting
(e.g. ``t('Added %(kind)s "%(name)s"', { kind, name })``) so
translators can reorder verbs and nouns rather than concatenating
fragments. ``KIND_LABELS`` is a static map keying English layout
kinds (``chart``, ``row``, ``column``, ``tab``, ``markdown``, etc.)
to ``t(...)``-extractable labels. Empty change lists render as
"Baseline" rather than "No changes recorded" since the empty case
is overwhelmingly the ``operation_type=0`` baseline row.

Locale-aware date rendering: ``new Date(iso).toLocaleString(lang)``
where ``lang`` comes from ``document.documentElement.lang`` (set
by ``src/views/App.tsx`` from the bootstrap ``locale``), so dates
follow the user's chosen Superset locale rather than the browser's.

French translations for the new strings are appended to
``superset/translations/fr/LC_MESSAGES/messages.po`` (Ajouté,
Supprimé, Modifié, Version initiale, kind labels, …). Run
``npm run build-translation`` and ``pybabel compile -l fr`` to
regenerate the JSON / MO packs.

This commit is **demo-only** per ADR-005 (V1 is backend-only). It
is intentionally marked ``temp`` so it can be reverted before the
PR splits — the production V1 ships without UI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 18:42:06 -06:00
Mike Bridge
7ce5f1d0e7 test(versioning): integration tests for SkipUnmodifiedPlugin (FR-026)
Locks in the no-op-suppression behavior implemented by
``SkipUnmodifiedPlugin`` (which lives in ``superset/versioning/factory.py``
shipping with the foundation commit). Five integration tests:

1. Owners-only edit doesn't mint a version row — exercises the
   case where every dirty column is an excluded relationship.
2. Re-save with identical scalar values doesn't mint a row —
   exercises the json_metadata re-serialise path where
   ``set_dash_metadata`` rewrites the column to a different byte
   sequence with identical parsed content; the plugin must compare
   post-flush values against the prior shadow row to detect this.
3. Real scalar change DOES mint a row — guards against the plugin
   over-suppressing.
4. Same assertion on a Slice (covers the ``String`` column path on
   a different entity type).
5. ``json_metadata`` sub-key edit DOES mint a row — covers the
   ``MediumText`` column path past the plugin's content-equality
   check.

Tests are designed so a column-type change in the parent entities
(e.g. flipping ``json_metadata`` from ``MediumText`` to ``JSON``)
will fail one of these if the plugin's Python ``!=`` comparison
breaks for the new type.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 18:42:06 -06:00
Mike Bridge
9bc95ef819 feat(versioning): ETag helper module + integration tests (T055)
Helper module that derives the strong-validator ``ETag`` value from
an entity's current live ``version_uuid`` and attaches it to a
Flask response. Two functions:

- ``set_version_etag(response, version_uuid)`` — direct path used by
  PUT handlers that already compute ``new_version_uuid`` (see the
  REST API commit two prior). Cheap; no extra query.
- ``set_version_etag_by_uuid(response, model_cls, entity_uuid)`` —
  used by version endpoints that operate on ``entity_uuid``; looks
  up ``entity_id`` then derives ``version_uuid`` via ``VersionDAO``.
  Costs one extra ``SELECT id WHERE uuid = ?``; documented in the
  docstring so callers prefer the cheap variant when they have the
  id already.

Integration tests cover all three entity types and four endpoint
shapes (entity GET, save PUT, version-list GET, single-version GET)
plus the entity-with-no-versions edge case (header is correctly
absent).

The ETag is wired into the API endpoints in the REST-API commit
(group 3) and the CORS ``expose_headers: ["ETag"]`` ships with the
retention commit (group 4) since both touch ``superset/config.py``.
Locking enforcement (``If-Match`` → 412) is explicitly NOT in this
change — deferred to the follow-up UI SIP per Open Question §7.
``ETag`` is informational in v1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 18:42:06 -06:00
Mike Bridge
801d58687b feat(versioning): time-based retention via Celery beat (FR-007)
Adds a scheduled Celery task that prunes version history older than
``SUPERSET_VERSION_HISTORY_RETENTION_DAYS`` (default 30; settable
via env var; ``0`` disables retention entirely).

**Task** — ``superset.tasks.version_history_retention.prune_old_versions``:

1. Computes ``cutoff = utcnow() - timedelta(days=N)``.
2. Selects ``version_transaction.id`` rows with ``issued_at <
   cutoff`` and filters out any tx whose parent shadow includes a
   live row (``end_transaction_id IS NULL``). The live row is the
   only preservation rule — closed historical rows including the
   baseline (``operation_type=0``) age out. Per-entity minimum-history
   floor is an open question tracked in ``future-work.md``.
3. Deletes rows owned by surviving txs in each parent shadow
   table (``dashboards_version`` / ``slices_version`` /
   ``tables_version``).
4. Deletes child-shadow rows for the same transactions
   (``table_columns_version`` / ``sql_metrics_version`` /
   ``dashboard_slices_version``).
5. Drops the surviving ``version_transaction`` rows. The
   ``version_changes`` rows cascade via the FK from the previous
   commit.

Idempotent and safely retried on partial failure.

**Schedule** — ``superset/config.py`` adds the task to the default
``CeleryConfig.beat_schedule`` (nightly at 03:00). Operators who
override ``CeleryConfig`` in their ``superset_config.py`` need to
merge this entry — see UPDATING.md.

Also adds ``"expose_headers": ["ETag"]`` to the default
``CORS_OPTIONS`` so cross-origin browser clients can read the
``ETag`` header introduced in the next commit. (Co-located here
because both touch ``superset/config.py``; the ETag mechanism
itself ships in the next commit.)

**Auto-discovery** — ``superset/tasks/celery_app.py`` adds
``version_history_retention`` to its late-imports so Celery's
auto-discovery picks up the task.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 18:42:06 -06:00
Mike Bridge
8fe9a8ce4e feat(versioning): REST API endpoints + restore commands
Exposes the version surface as three new endpoints per entity type
(chart, dashboard, dataset), each carrying the standard Superset
decorator stack (``@protect()``, ``@safe``, ``@statsd_metrics``,
``@event_logger.log_this_with_context``) so they appear in FAB's
``action_log`` alongside other audited operations.

| Method | Path | Purpose |
|---|---|---|
| GET  | ``/api/v1/{resource}/<uuid>/versions/`` | List version history (oldest-first; per entry: ``version_uuid``, ``version_number``, ``transaction_id``, ``operation_type``, ``issued_at``, ``changed_by``, ``changes`` array) |
| GET  | ``/api/v1/{resource}/<uuid>/versions/<version_uuid>/`` | Read-only snapshot of the entity at the requested version (scalar fields plus ``columns`` / ``metrics`` for datasets) |
| POST | ``/api/v1/{resource}/<uuid>/versions/<version_uuid>/restore`` | Replay the snapshot onto the live entity via Continuum's ``Reverter`` (non-destructive — produces a new version row stamping the restoring user via the standard save path) |

``<version_uuid>`` is a deterministic ``UUIDv5(entity_uuid,
transaction_id)`` so it's stable across replicas and retention
pruning. Authorisation reuses the resource's existing ``can_write``
permission; workspace admins can list / restore any entity.

**Restore commands** — ``superset/commands/{chart,dashboard,dataset}/
restore_version.py`` wrap ``VersionDAO.restore_version`` in the
standard ``@transaction()`` boundary. The command resolves the
``Reverter`` once per related collection (split-revert pattern, with
``flush + expire`` between calls) so a multi-relation restore
doesn't trip Continuum's autoflush race that would otherwise mark
half the collection as ``state.deleted=True`` mid-revert.

**Save responses** — ``PUT /api/v1/{resource}/<pk>`` is updated to
include ``old_version`` / ``new_version`` (0-based numbers),
``old_transaction_id`` / ``new_transaction_id`` (stable across
pruning), and ``old_version_uuid`` / ``new_version_uuid`` body
fields so callers can correlate a save with its resulting version
row. The ``ETag`` response header in the next commit is built on
top of this, but the body fields stay — they predate the header
and remain useful for clients that don't read response headers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 18:42:06 -06:00
Mike Bridge
f7d73e2e1b feat(versioning): change records + diff engine
Adds a structured per-field change log alongside the foundational
shadow tables. Each save flush emits zero or more ``version_changes``
rows describing what changed relative to the previous version, with
shape ``[{kind, path, from_value, to_value, sequence}]`` keyed to
``version_transaction.id`` (FR-016 .. FR-021).

**Schema** — ``version_changes`` table, FK to ``version_transaction``
with ``ON DELETE CASCADE`` so retention drops dependent records
without explicit cleanup. Composite unique index on
``(transaction_id, entity_kind, entity_id, sequence)`` so the
listener can write monotonically and downstream readers see a
deterministic order.

**Diff engine** (``superset/versioning/diff.py``) — pure-function
diffing of pre-/post-state pairs:

- ``diff_scalar_fields`` for ordinary columns; emits one record per
  changed field with JSON-safe ``from_value`` / ``to_value``.
- ``diff_json_field`` for ``json_metadata`` and ``params``, walking
  the parsed structure and emitting per-sub-key records. Honours
  an ``exclude_keys`` set
  (``_DASHBOARD_JSON_METADATA_AUDIT_KEYS``: ``chart_configuration``,
  ``global_chart_configuration``, ``map_label_colors``,
  ``show_chart_timestamps``, ``color_namespace``;
  ``_CHART_PARAMS_AUDIT_KEYS``) so frontend-stamped sub-keys that
  mutate on every save don't dominate the change log (FR-022).
- ``diff_dashboard_layout`` walks ``position_json`` structurally
  and emits ``[verb, kind, id]`` records (verbs ``add``, ``remove``,
  ``move``, ``edit``; kinds from a ``CHART``/``ROW``/``COLUMN``/etc.
  → english map) so a UI can render "Added chart 'Foo'" without
  re-parsing JSON. ``HEADER_ID`` is suppressed because it duplicates
  the ``dashboard_title`` scalar record.
- ``fold_dashboard_layout_with_chart_changes`` deduplicates layout
  records against M2M / chart-membership records by UUID so an
  add-and-attach doesn't appear twice.
- ``_values_equivalent`` treats ``None`` and ``""`` as equal; this
  matches the save path's habit of normalising nullable strings to
  the empty string.

**Listener** — ``superset/versioning/changes.py`` registers a
``before_flush`` listener that captures pre-state for each dirty
entity and an ``after_flush`` listener that runs the diff engine
against the post-state and writes ``version_changes`` rows under
the resolved ``transaction_id``. Tracks processed transaction ids
on ``session.info`` so re-firings within a single transaction
(autoflush triggered by mid-commit queries) don't double-insert and
trip the unique constraint. Reads child rows via raw SELECT against
``table_columns`` / ``sql_metrics`` rather than ``dataset.columns``
because the live collection is stale during the restore path's raw
DELETE+INSERT cycle.

**Endpoint surface** — ``VersionDAO.list_change_records_batch``
batches the lookup across multiple transactions with a single
``WHERE transaction_id IN (...)`` query so the version-list
endpoint avoids N+1 round-trips. ``list_versions`` / ``get_version``
return entries with a populated ``changes`` array (empty for
``operation_type=0`` baseline rows).

**Tests** — ``test_diff.py`` covers the diff engine shape (39
unit cases across scalar, JSON, layout, child-collection, and
fold paths). ``change_records_tests.py`` exercises the listener
end-to-end with realistic save flows. ``perf_validation_tests.py``
is the T044 harness for SC-002/3/4 (list endpoint p95 < 1s,
restore < 3s, save overhead < 50ms).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 18:42:06 -06:00
Mike Bridge
be01e4552c feat(versioning): foundation — Continuum capture + parent/child shadow tables + VersionDAO
Adds SQLAlchemy-Continuum as a dependency and wires it as the
canonical capture mechanism for chart, dashboard, and dataset edits.

**Schema** — three Alembic migrations, leaving the chain at one
foundation revision plus one child-shadow revision:

- ``version_transaction`` (renamed from Continuum's default
  ``transaction``; SQL-reserved-word workaround) carries the per-save
  ``user_id`` / ``issued_at`` and is the join target for all shadow
  rows. Auto-incrementing PK; user_id has no FK so import / Celery /
  CLI saves can write rows without an active Flask user.
- Parent shadow tables for the three entity types:
  ``dashboards_version``, ``slices_version``, ``tables_version``.
- Child shadow tables for dataset children + dashboard M2M:
  ``table_columns_version``, ``sql_metrics_version``,
  ``dashboard_slices_version`` (composite PK on the M2M shadow,
  matching the live ``dashboard_slices`` reshape from
  sc-105349-composite-association-pks).

**Models** — ``Dashboard``, ``Slice``, ``SqlaTable`` (and dataset
children ``TableColumn`` / ``SqlMetric``) gain ``__versioned__``
class attributes. The exclude lists carry both M2M relationships
(``owners``, ``roles``, ``dashboards``) and the ``AuditMixin``
columns (``changed_on`` / ``created_on`` / ``changed_by_fk`` /
``created_by_fk`` plus ``last_saved_at`` / ``last_saved_by_fk``
on ``Slice``) so auto-bumped audit fields cannot trigger a
version row on their own (FR-025).

**Plugins** — ``superset/versioning/factory.py`` ships three
Continuum plugins:

- ``VersionTransactionFactory`` renames the transaction table and
  appends the unconditional ``user_id`` column.
- ``VersioningFlaskPlugin`` sources the acting user from Superset's
  ``g.user`` rather than ``flask_login.current_user`` (Superset's
  JWT auth populates ``g.user`` but leaves ``current_user``
  anonymous on API routes).
- ``SkipUnmodifiedPlugin`` filters Continuum's UPDATE operations,
  marking content-equivalent re-saves as ``processed=True`` so they
  don't mint no-op shadow rows (FR-026; see follow-up commits for
  the test). Lives in this commit because it shares the file with
  the other plugins.

**Save-path glue** — a ``before_flush`` baseline listener
(``superset/versioning/baseline.py``) inserts an ``operation_type=0``
shadow row the first time a pre-existing entity is saved, including
the slice-baseline-under-dashboard pattern that gives the dashboard
M2M shadow a row to join against. ``UpdateDashboardCommand`` wraps
its body in ``no_autoflush`` so ``process_tab_diff`` /
``process_native_filter_diff`` don't fire intermediate flushes that
would mint extra version rows. ``DatasetDAO.update_columns`` is
rewritten as a natural-key upsert keyed on ``column_name`` so child
edits flow through ORM events Continuum sees.

**DAO** — ``superset/daos/version.py`` exposes the read API used by
the version endpoints in the next commits:
``current_version_number`` (0-based index, unstable under retention
pruning), ``current_live_transaction_id`` (stable across pruning),
``current_live_version_uuid`` (deterministic UUIDv5), plus
``list_versions`` / ``get_version`` / ``restore_version`` and a
batch ``list_change_records_batch`` for N+1 avoidance.

**Initialization** — ``superset/initialization/__init__.py`` wires
``init_versioning()`` after ``make_versioned()`` runs and the
versioned mappers are configured. Registers the baseline listener
plus the change-record listener (the latter's body lives in the
next commit but the registration site is here because it shares
the init function).

**Tests** — version-capture and version-list integration tests for
each entity type, plus a ``VersionDAO`` unit test suite. Retention
test uses a backdated ``issued_at`` so it can drive
``_prune_old_versions_impl`` synchronously.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 18:42:06 -06:00
Mike Bridge
0a9fa1ac85 feat(scripts): add --dirty-duplicates-pct to seed_junction_load.py
Extends the stress-test seed script with an optional duplicate-row
injection step, used to measure the empirical cost of the migration's
``_dedupe_by_min_id`` phase.

Usage: after running the normal seed at a given scale, add
``--dirty-duplicates-pct 5`` (or any non-zero value) to inject that
percentage of duplicate ``(fk1, fk2)`` rows into each non-UNIQUE
junction (slice_user, dashboard_user, dashboard_roles —
dashboard_slices is skipped because its UNIQUE constraint, present
both pre- and post-migration, rejects duplicates).

Pre-condition: requires the DB to be at the pre-migration revision
(33d7e0e21daa). The post-migration composite PK rejects duplicates,
so attempting to inject on the upgraded schema errors out.

Empirical result on MySQL @ 10M dashboard_slices + ~2.1M other
junction rows + 105K injected duplicates (5% on the 3 non-UNIQUE
tables):
  Upgrade time: 1m 36s vs clean baseline 1m 37s
  → dedupe cost is within measurement noise; the table-scan that
    the migration already performs dominates whether or not
    duplicates exist.

This empirically confirms what the cost-model predicted: the
``_dedupe_by_min_id`` GROUP BY scan is the dominant cost of that
phase, and the actual per-duplicate DELETE is negligible.

NULL-FK injection deliberately skipped — would require altering the
six non-UNIQUE FK columns from NOT NULL back to nullable (the
migration's downgrade keeps them NOT NULL by design), which adds
per-backend ALTER complexity for a code path that's structurally
identical in cost shape (DELETE WHERE col IS NULL is the same scan
shape as the dedupe scan).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 18:42:05 -06:00
Mike Bridge
58a1a1a8d1 build(scripts): add stress-test data generator for migration timing
Add ``scripts/seed_junction_load.py``, a backend-agnostic script that
bulk-inserts synthetic parent rows (dashboards, slices, users, roles,
tables, dbs) and many-to-many junction rows for the four largest
association tables targeted by the composite-PK migration:
``dashboard_slices``, ``slice_user``, ``dashboard_user``,
``dashboard_roles``.

Designed for measuring migration runtime at varying scales — run with
a series of size flags (100K / 1M / 5M / 10M for the target table)
and time the migration at each scale to verify the predicted
``O(N log N)`` extrapolation against real numbers.

Properties:
- **Reproducible**: deterministic cross-product walk through parent IDs
  produces a stable pair sequence; re-running is replayable.
- **Idempotent**: re-running with the same target is a no-op; with a
  higher target, only new rows are added.
- **Backend-agnostic**: connects via Superset's standard ``DATABASE_*``
  env vars (or ``SUPERSET__SQLALCHEMY_DATABASE_URI``). Branches on
  dialect for ``BINARY(16)`` vs ``UUID`` vs TEXT/BLOB UUID columns.
- **Batched**: bulk INSERT 10K rows per statement.
- **Per-phase timing**: logs elapsed wall time for the parents phase,
  the junctions phase as a whole, and per junction-table.
- **Avoidance set**: loads existing junction pairs into a Python set
  so re-runs on top of pre-existing data don't collide on the
  uniqueness constraint.

Usage (inside the Superset container):

    docker exec superset-superset-1 \\
        /app/.venv/bin/python /app/scripts/seed_junction_load.py \\
        --dashboard-slices 1000000

Defaults target a "large multi-team install" shape: 1M
``dashboard_slices``, 100K each ``slice_user`` / ``dashboard_user``,
10K ``dashboard_roles``. Override per-table via flags.

Tested locally on MySQL (the user's current eval stack):
- 200/100/100/50 row mini-run produced expected counts.
- Re-running at the same target is a no-op (idempotent).
- ``--dry-run`` plans without writing.

Junction tables not yet covered (``sqlatable_user``, ``rls_filter_*``,
``report_schedule_user``) are typically small in production and
require additional parent seeding (RLS filters, report schedules)
that wasn't worth the scope here. Adding them is straightforward by
extending ``JUNCTIONS`` and writing the corresponding parent seeder.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 18:42:05 -06:00
Mike Bridge
fef0a64b21 fix(docker): MySQL examples DB + EXAMPLES_PORT override (sc-105349)
Fix two follow-on issues reported when starting the dev stack with
docker-compose-mysql.yml:

1. ``superset-init`` step 4 (load-examples) fails with
   ``MySQLdb.OperationalError: (2002, "Can't connect to server on 'db'")``
   because the analytics-examples DB connection inherits ``EXAMPLES_PORT=5432``
   (Postgres port) from ``docker/.env``. The override flipped
   ``DATABASE_DIALECT`` to ``mysql+mysqldb`` but left the EXAMPLES_*
   group on Postgres defaults, so the URI became
   ``mysql+mysqldb://examples:examples@db:5432/examples`` — MySQL
   container has no listener on 5432.

   Fix: add ``EXAMPLES_HOST/PORT/DB/USER/PASSWORD`` and a complete
   ``SUPERSET__SQLALCHEMY_EXAMPLES_URI`` to the ``mysql-env`` anchor.

2. The Postgres init scripts under
   ``docker/docker-entrypoint-initdb.d/`` (``cypress-init.sh``,
   ``examples-init.sh``) get mounted on the MySQL container too —
   compose merges volume lists. They invoke ``psql`` which doesn't
   exist in the MySQL image, abort with ``psql: command not found``,
   and prevent the ``examples`` DB from being created.

   Fix: add a MySQL-specific init script
   ``docker/mysql-init/examples-init.sql`` that creates the
   ``examples`` database and user, and mount it at
   ``/docker-entrypoint-initdb.d`` in the override. Compose's
   later-takes-precedence rule on duplicate volume targets displaces
   the Postgres init dir, so the MySQL container only sees the
   MySQL-compatible script.

   (Used a plain duplicate-target mount rather than the ``!override``
   tag because pre-commit's ``check-yaml`` doesn't recognize Compose's
   custom YAML tags.)

Recovery for an existing failed MySQL stack: ``docker compose -f
docker-compose.yml -f docker-compose-mysql.yml down``, then
``docker volume rm superset_db_home_mysql`` (so the new init script
runs on the next fresh boot), then ``up`` again.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 18:42:05 -06:00
Mike Bridge
7867f30a23 build(docker): add MySQL compose override for dialect-swap evaluation
Adds ``docker-compose-mysql.yml``, a compose-override file that swaps
the default Postgres metadata DB for MySQL 8 with one extra ``-f``
flag:

  docker compose -f docker-compose.yml -f docker-compose-mysql.yml up

Useful for evaluating dialect-specific behaviour (e.g., the runtime
cost of DDL migrations on a deployment whose production metadata DB
is MySQL — the question raised by review feedback on this PR).

Mirrors the connection settings used by CI's ``test-mysql`` shard:
``mysql+mysqldb`` dialect, charset ``utf8mb4`` with binary_prefix.
Host port defaults to 13306 (configurable via ``DATABASE_PORT_MYSQL``)
to avoid colliding with a native MySQL install on 3306.

A separate volume (``db_home_mysql``) keeps MySQL data isolated from
the Postgres ``db_home`` volume, so switching between the two with
``-f`` flag toggles doesn't corrupt either side.

The Postgres-specific init scripts under
``docker/docker-entrypoint-initdb.d/`` are not mounted on the MySQL
service (they are postgres-only). Examples / cypress fixtures still
load via ``superset-init``'s post-startup steps, which run
``superset load-examples`` against whichever metadata DB is in use.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 18:42:05 -06:00
Mike Bridge
118161b0a0 docs(UPDATING): add MySQL-targeted maintenance-window queries (sc-105349)
Mirror of the PostgreSQL diagnostic queries added in 11148779ed,
adapted for MySQL/InnoDB. One important difference: InnoDB rebuilds
the clustered index on every PK change, so all eight tables undergo
a full table rebuild on MySQL — not just the two that go through
the explicit ``recreate="always"`` path. The lock-window estimate
query is updated to cover all eight rather than just two, and the
"migration_path" column makes the rebuild expectation explicit
("direct ALTER (still rebuilds InnoDB clustered index)").

Other notes:
- ``information_schema.TABLES.TABLE_ROWS`` is an InnoDB estimate,
  analogous to PostgreSQL's ``reltuples``; documented inline.
- ``KEY_COLUMN_USAGE`` carries both sides of the FK in a single
  row on MySQL, so the external-FK pre-flight check is simpler
  than the PostgreSQL version (no joins between three views).
- The aggregated dedupe query is portable standard SQL; included
  verbatim for copy-paste convenience.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 18:42:05 -06:00
Mike Bridge
3408a6f6c0 docs(UPDATING): add Postgres-targeted maintenance-window queries (sc-105349)
Add a "Sizing the maintenance window on PostgreSQL" sub-section to the
operator runbook. The simple per-table COUNT/duplicate/NULL queries
that were already there are dialect-portable but only count rows;
operators on PostgreSQL with large deployments need to characterize
the migration's runtime cost before scheduling it.

Adds four diagnostic queries:

- Per-table size, row count (from pg_class.reltuples), and which
  migration path each table will take (recreate-rewrite vs direct
  ALTER). Sizes the work concretely.
- Aggregated duplicate-row roll-up: dup_groups + total rows_dropped
  per table. Replaces eight separate per-table queries with one
  consolidated result for audit/dump-before-apply decisions.
- External-FK pre-flight check (the same one the migration runs at
  upgrade time and aborts on). Lets operators surface any blocking
  external reference ahead of the maintenance window. Should be
  empty on a stock install.
- Lock-window estimate for the two full-rewrite tables, using
  pg_relation_size and a conservative 100 MB/s rewrite throughput
  assumption. The other six use direct ALTER and are dominated by
  composite-index build time (seconds for low-millions-of-rows
  tables).

Prompted by reviewer feedback on apache/superset#39859 from a large
deployment asking how to size the maintenance window. The original
pre-flight queries are kept for cross-dialect operators (MySQL,
SQLite) since the new queries use PostgreSQL-specific catalog views.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 18:42:05 -06:00
Mike Bridge
254e826307 fix(migration): rebase down_revision onto 33d7e0e21daa (sc-105349)
CI cypress + playwright shards were red with:

  ERROR [flask_migrate] Error: Multiple head revisions are present
  for given argument 'head'

The recent rebase onto master pulled in
``33d7e0e21daa_add_semantic_layers_and_views.py`` (from PR #37815,
"semantic layer extension"), which had been authored against
``ce6bd21901ab`` as its parent — the same parent our migration
referenced. After the rebase both migrations point at
``ce6bd21901ab``, producing two heads and breaking ``flask db
upgrade head`` for any downstream consumer (CI's Cypress / Playwright
shards spin up a real Superset instance via ``superset db upgrade``,
which is why those shards failed first; the integration shards run
against a precomputed schema and didn't surface this).

Fix: chain our migration after the semantic-layer migration by
pointing ``down_revision`` at ``33d7e0e21daa``. The chain is now
linear:

    ... → ce6bd21901ab → 33d7e0e21daa (semantic layers)
                          → 2bee73611e32 (composite PK, this PR)

Verified with ``superset db heads`` (returns single head
``2bee73611e32``) and the local migration test suite (44 passed,
1 skipped).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 18:42:05 -06:00
Mike Bridge
9465e3b675 fix(migration): explicit NOT NULL on FK columns for SQLite (sc-105349)
Found by running fresh-install + round-trip against a real SQLite DB:
6 of the 8 affected tables had FK columns that were originally
declared nullable. PostgreSQL and MySQL implicitly promote the
constituent columns of an ``ALTER TABLE ... ADD PRIMARY KEY`` to
``NOT NULL``; SQLite does not (it's a long-standing SQLite quirk —
only ``INTEGER PRIMARY KEY`` enforces NOT NULL on a composite-PK
column). Result: a fresh SQLite install would accept
``INSERT INTO dashboard_slices (NULL, 5)`` despite both columns
being part of the composite PK.

Our integration tests previously masked this: the test fixture seeds
columns with ``nullable=False``, so the post-upgrade NOT NULL
assertion passed regardless of whether the migration enforced it.

Fix: add explicit ``batch_op.alter_column(fk, nullable=False)`` for
both FK columns inside the per-table batch_alter_table block. On
PostgreSQL and MySQL this is a no-op (PK already implies NOT NULL);
on SQLite it adds the missing NOT NULL declaration so a fresh
install matches the data-model.md "After" contract.

Verified end-to-end:
- Postgres + MySQL: column shape unchanged (still NOT NULL)
- SQLite fresh install + round-trip: all 8 tables now have NOT NULL
  on FK columns, ``INSERT (NULL, 5)`` correctly rejected with
  IntegrityError on dashboard_slices, dashboard_user, sqlatable_user

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 18:42:05 -06:00
Mike Bridge
65a3491861 fix(migration): MySQL downgrade FK + AUTO_INCREMENT (sc-105349)
Two MySQL-only failures in the downgrade path, found by running the
full migration history against a fresh MySQL 8 container:

1. ``MySQLdb.OperationalError: (1553, "Cannot drop index 'PRIMARY':
   needed in a foreign key constraint")``. InnoDB uses the composite
   PK index to back the FK on the leftmost column. The downgrade
   tried to drop the composite PK before dropping the FKs, orphaning
   the FK's backing index. PostgreSQL and SQLite create separate
   indexes for FK columns and don't trip on this.

2. ``Field 'id' doesn't have a default value`` on subsequent INSERT.
   ``sa.Identity(always=False)`` only emits ``AUTO_INCREMENT`` on
   MySQL when the column is created with ``primary_key=True`` — our
   portable path adds the column first then creates the PK separately,
   so MySQL leaves the column without auto-generation. Existing rows
   would all collide on id=0; future inserts fail because no default.
   Postgres' ``GENERATED BY DEFAULT AS IDENTITY`` and SQLite's
   ``INTEGER PRIMARY KEY`` rowid alias don't have this gap.

Fix: extract ``_downgrade_mysql_table()`` that emits the canonical
MySQL idiom — drop FKs, then a single ALTER combining
``DROP PRIMARY KEY, ADD COLUMN id INT NOT NULL AUTO_INCREMENT,
ADD PRIMARY KEY (id)`` (which backfills existing rows with sequential
ids and preserves AUTO_INCREMENT), restore the redundant UNIQUE on
the 2 tables that originally had it, and re-add the FKs with their
original names. Postgres and SQLite keep the existing portable
``batch_alter_table`` path.

Raw SQL is unavoidable for the combined-ALTER form; per the
constitution it's allowed for dialect-specific DDL with no SQLA
equivalent, with triple-quoted strings for legibility.

Verified end-to-end: upgrade → downgrade → upgrade against a fresh
MySQL 8 container with INSERT-without-id sanity check showing the
restored ``id`` column auto-increments correctly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 18:42:05 -06:00
Mike Bridge
56c36fde54 fix(migration): drop FKs before recreate on MySQL (sc-105349)
CI test-mysql failed with:

  MySQLdb.OperationalError: (1826, "Duplicate foreign key constraint
  name 'fk_dashboard_slices_slice_id_slices'")

Root cause: MySQL scopes foreign-key constraint names per-database,
not per-table (PostgreSQL and SQLite scope per-table). The
``batch_alter_table(... recreate="always", copy_from=...)`` path
used for ``dashboard_slices`` and ``report_schedule_user`` builds
``_alembic_tmp_<table>`` carrying the original FK names from
``copy_from`` while the original table still holds those names — MySQL
rejects the temp-table creation with ERROR 1826.

Fix: on MySQL only, drop the original FK constraints by name before
the ``batch_alter_table`` runs. The ``copy_from`` re-creates them on
the rebuilt table with their original names, so the post-migration
shape is unchanged. On PostgreSQL and SQLite the original code path
still runs unchanged.

Local SQLite tests (44 passed, 1 skipped) still pass; CI will validate
on MySQL.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 18:42:05 -06:00
Mike Bridge
0d95b41aed refactor(migration): build pre-flight SQL via SQLAlchemy core (review)
Address Beto's review comments on apache/superset#39859: replace
``sa.text(f"...")`` SQL construction in the three pre-flight helpers
(``_delete_null_fk_rows``, ``_dedupe_by_min_id``, ``_assert_no_duplicates``)
with SQLAlchemy core constructs (``sa.delete``, ``sa.select``,
``sa.func``, ``.subquery()``, ``.notin_()``).

A small ``_table_clause()`` helper builds a lightweight ``TableClause``
exposing the columns the queries reference; the three helpers consume
it. Removes all ``# noqa: S608`` comments — they are no longer needed
because there is no string-interpolated SQL.

Verified the compiled SQL is identical on Postgres, MySQL, and SQLite,
including the MySQL ERROR 1093 workaround (the inner aggregation is
wrapped in a derived table via ``.subquery()``, producing
``... NOT IN (SELECT keep_id FROM (SELECT min(id) ...) AS keep_min)``).

Also drops the redundant ``f`` prefix on the two non-interpolating
lines of the ``_check_no_external_fks_to_id`` error message.

44 migration tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 18:42:05 -06:00
Mike Bridge
6086d9c52a docs(migration): address SQLAlchemy review follow-ups
Four operator-experience improvements from the second review pass:

1. ``TABLES_WITH_NULLABLE_FKS`` is now explicitly documented as an
   informational set that is not consulted at runtime; the comment
   explains the previous ``dashboard_roles`` omission was the bug
   that motivated the always-run cleanup.
2. ``_delete_null_fk_rows`` docstring updated to match the
   "always run" semantics (was still claiming "called only on tables
   in TABLES_WITH_NULLABLE_FKS").
3. ``_check_no_external_fks_to_id`` now documents its scope
   limitation: ``Inspector.get_table_names()`` returns the default
   schema only, so cross-schema FKs in non-standard multi-schema
   PostgreSQL deployments would not be caught. The single-schema
   case (Superset's documented deployment) is fully covered.
4. ``_dedupe_by_min_id`` now logs a sample of up to 10 discarded
   ``(fk1, fk2, id)`` tuples at WARN before deletion, so operators
   can audit which rows the ``MIN(id)`` policy drops. The keep-
   original policy is correct in practice but discards later
   re-grants on ownership tables; the sample makes that visible.
5. ``UPDATING.md`` documents the upgrade/downgrade primary-key
   name divergence (``pk_<table>`` vs ``<table>_pkey``) so
   operators using schema-comparison tools don't mistake it for
   migration drift.

No schema or runtime-behaviour changes. All 44 migration tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 18:42:05 -06:00
Mike Bridge
cc20fe7cae fix(migration): always run NULL-FK cleanup; correct RLS test parent name
Two cleanups from PR review:

1. ``dashboard_roles.dashboard_id`` was created nullable in revision
   e11ccdd12658 but was missing from ``TABLES_WITH_NULLABLE_FKS``. A
   production database with a stray NULL ``dashboard_id`` row would have
   failed the PK-add with a cryptic constraint violation. Fix by running
   the NULL-FK cleanup on every affected table — it is a no-op DELETE on
   tables whose FK columns are already NOT NULL, and it eliminates the
   risk of further drift in the hardcoded set. ``dashboard_roles`` is
   added to the documentation set; the runtime now does not consult it.

2. The unit-test parent-table name for ``rls_filter_roles`` and
   ``rls_filter_tables`` was ``rls_filter`` (does not exist) instead of
   the real parent ``row_level_security_filters``. Test passes either
   way (the in-memory FK is self-consistent), but the parameter is now
   accurate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 18:42:05 -06:00
Mike Bridge
5958e12fc0 refactor(db): composite PK on M2M association tables (sc-105349)
Replace synthetic id INTEGER PRIMARY KEY with composite PRIMARY KEY (fk1, fk2)
on the eight pure-junction tables: dashboard_roles, dashboard_slices,
dashboard_user, report_schedule_user, rls_filter_roles, rls_filter_tables,
slice_user, sqlatable_user. The redundant UNIQUE(fk1, fk2) on dashboard_slices
and report_schedule_user is dropped (subsumed by the new PK).

Migration handles dialect quirks: copy_from for tables with pre-existing
UNIQUE (so SQLite's anonymous-constraint reflection doesn't matter), wrapped-
subquery dedupe for MySQL (ERROR 1093), sa.Identity(always=False) on downgrade
to backfill the restored id column without NOT NULL violations, and distinct
PK names per direction (pk_<table> on upgrade, <table>_pkey on downgrade) to
avoid round-trip index-name collisions on Postgres.

ORM Table() definitions updated to match. UPDATING.md entry added with
operator runbook (BI-tool impact, pre-flight inventory queries, dedupe-row-
loss notice, pg_dump workaround, FK-NOT-NULL downgrade asymmetry note).

Tests: 8 schema-shape assertions (post-upgrade), 8 duplicate-rejection unit
tests, 8 distinct-pair sanity tests, 1 round-trip + idempotency test
(in-memory SQLite via Alembic MigrationContext).

Continuum-restore verification against the new shape is out of scope for this
PR; it is the responsibility of the versioning epic (sc-103156).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 18:42:05 -06:00
1219 changed files with 50554 additions and 208337 deletions

View File

@@ -77,17 +77,23 @@ github:
# combination here.
contexts:
- lint-check
- cypress-matrix-required
- cypress-matrix (0, chrome)
- cypress-matrix (1, chrome)
- cypress-matrix (2, chrome)
- cypress-matrix (3, chrome)
- cypress-matrix (4, chrome)
- cypress-matrix (5, chrome)
- dependency-review
- frontend-build
- playwright-tests-required
- playwright-tests (chromium)
- pre-commit (current)
- pre-commit (previous)
- test-mysql
- test-postgres-required
- test-postgres (current)
- test-postgres-hive
- test-postgres-presto
- test-sqlite
- unit-tests-required
- unit-tests (current)
required_pull_request_reviews:
dismiss_stale_reviews: false

2
.github/CODEOWNERS vendored
View File

@@ -38,7 +38,7 @@
# Notify translation maintainers of changes to translations
/superset/translations/ @sfirke @rusackas
/superset/translations/ @sfirke
# Notify PMC members of changes to extension-related files

View File

@@ -41,8 +41,8 @@ body:
label: Superset version
options:
- master / latest-dev
- "6.1.0"
- "6.0.0"
- "5.0.0"
validations:
required: true
- type: dropdown

67
.github/SECURITY.md vendored Normal file
View File

@@ -0,0 +1,67 @@
# Security Policy
This is a project of the [Apache Software Foundation](https://apache.org) and follows the
ASF [vulnerability handling process](https://apache.org/security/#vulnerability-handling).
## Reporting Vulnerabilities
**⚠️ Please do not file GitHub issues for security vulnerabilities as they are public! ⚠️**
Apache Software Foundation takes a rigorous standpoint in annihilating the security issues
in its software projects. Apache Superset is highly sensitive and forthcoming to issues
pertaining to its features and functionality.
If you have any concern or believe you have found a vulnerability in Apache Superset,
please get in touch with the Apache Superset Security Team privately at
e-mail address [security@superset.apache.org](mailto:security@superset.apache.org).
More details can be found on the ASF website at
[ASF vulnerability reporting process](https://apache.org/security/#reporting-a-vulnerability)
**Submission Standards & AI Policy**
To ensure engineering focus remains on verified risks and to manage high reporting volumes, all reports must meet the following criteria:
- Plain Text Format: In accordance with Apache guidelines, please provide all details in plain text within the email body. Avoid sending PDFs, Word documents, or password-protected archives.
- Mandatory AI Disclosure: If you utilized Large Language Models (LLMs) or AI tools to identify a flaw or assist in writing a report, you must disclose this in your submission so our triage team can contextualize the findings.
- Human-Verified PoC: All submissions must include a manual, step-by-step Proof of Concept (PoC) performed on a supported release. Raw AI outputs, hypothetical chat transcripts, or unverified scanner logs will be closed as Invalid.
We kindly ask you to include the following information in your report to assist our developers in triaging and remediating issues efficiently:
- Version/Commit: The specific version of Apache Superset or the Git commit hash you are using.
- Configuration: A sanitized copy of your `superset_config.py` file or any config overrides.
- Environment: Your deployment method (e.g., Docker Compose, Helm, or source) and relevant OS/Browser details.
- Impacted Component: Identification of the affected area (e.g., Python backend, React frontend, or a specific database connector).
- Expected vs. Actual Behavior: A clear description of the intended system behavior versus the observed vulnerability.
- Detailed Reproduction Steps: Clear, manual steps to reproduce the vulnerability.
**Out of Scope Vulnerabilities**
To prioritize engineering efforts on genuine architectural risks, the following scenarios are explicitly out of scope and will not be issued a CVE:
- Attacks requiring Admin privileges: (e.g., CSS injection, template manipulation, dashboard ownership overrides, or modifying global system settings). Per the CVE vulnerability definition in CNA Operational Rules 4.1, a qualifying vulnerability must allow violation of a security policy. The Admin role is a fully trusted operational boundary defined by Apache Superset's security policy; actions within this boundary do not violate that policy and are therefore considered intended capabilities 'by design,' not vulnerabilities.
- Brute Force and Rate Limiting: Reports targeting a lack of resource exhaustion protections, generic rate-limiting, or volumetric Denial of Service (DoS) attempts.
- Theoretical attack vectors: Issues without a demonstrable, reproducible exploit path.
- Non-Exploitable Findings: Missing security headers, generic banner disclosures, or descriptive error messages that do not lead to a direct, documented exploit.
**Outcome of Reports**
Reports that are deemed out-of-scope for a CVE but represent valid security best practices or hardening opportunities may be converted into public GitHub issues. This allows the community to contribute to the general hardening of the platform even when a specific vulnerability threshold is not met.
Note that Apache Superset is not responsible for any third-party dependencies that may
have security issues. Any vulnerabilities found in third-party dependencies should be
reported to the maintainers of those projects. Results from security scans of Apache
Superset dependencies found on its official Docker image can be remediated at release time
by extending the image itself.
**Vulnerability Aggregation & CVE Attribution**
In accordance with MITRE CNA Operational Rules (4.1.10, 4.1.11, and 4.2.13), Apache Superset issues CVEs based on the underlying architectural root cause rather than the number of affected endpoints or exploit payloads.
- Aggregation: If multiple exploit vectors stem from the same programmatic failure or shared vulnerable code, they must be aggregated into a single, comprehensive report.
- Independent Fixes: Separate CVEs will only be assigned if the vulnerabilities reside in decoupled architectural modules and can be fixed independently of one another.
Reports that fail to aggregate related findings will be merged during triage to ensure an accurate and defensible CVE record.
**Your responsible disclosure and collaboration are invaluable.**
## Extra Information
- [Apache Superset documentation](https://superset.apache.org/docs/security)
- [Common Vulnerabilities and Exposures by release](https://superset.apache.org/docs/security/cves)
- [How Security Vulnerabilities are Reported & Handled in Apache Superset (Blog)](https://preset.io/blog/how-security-vulnerabilities-are-reported-and-handled-in-apache-superset/)

View File

@@ -24,41 +24,32 @@ runs:
- name: Interpret Python Version
id: set-python-version
shell: bash
env:
INPUT_PYTHON_VERSION: ${{ inputs.python-version }}
run: |
if [ "$INPUT_PYTHON_VERSION" = "current" ]; then
RESOLVED_VERSION="3.11"
elif [ "$INPUT_PYTHON_VERSION" = "next" ]; then
if [ "${{ inputs.python-version }}" = "current" ]; then
echo "PYTHON_VERSION=3.11" >> $GITHUB_ENV
elif [ "${{ inputs.python-version }}" = "next" ]; then
# currently disabled in GHA matrixes because of library compatibility issues
RESOLVED_VERSION="3.12"
elif [ "$INPUT_PYTHON_VERSION" = "previous" ]; then
RESOLVED_VERSION="3.10"
elif printf '%s' "$INPUT_PYTHON_VERSION" | grep -Eq '^[0-9]+\.[0-9]+(\.[0-9]+)?$'; then
RESOLVED_VERSION="$INPUT_PYTHON_VERSION"
echo "PYTHON_VERSION=3.12" >> $GITHUB_ENV
elif [ "${{ inputs.python-version }}" = "previous" ]; then
echo "PYTHON_VERSION=3.10" >> $GITHUB_ENV
else
echo "Invalid python-version: '$INPUT_PYTHON_VERSION'" >&2
exit 1
echo "PYTHON_VERSION=${{ inputs.python-version }}" >> $GITHUB_ENV
fi
echo "python-version=$RESOLVED_VERSION" >> "$GITHUB_OUTPUT"
- name: Set up Python ${{ steps.set-python-version.outputs.python-version }}
uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
- name: Set up Python ${{ env.PYTHON_VERSION }}
uses: actions/setup-python@v5
with:
python-version: ${{ steps.set-python-version.outputs.python-version }}
python-version: ${{ env.PYTHON_VERSION }}
cache: ${{ inputs.cache }}
- name: Install dependencies
env:
INPUT_INSTALL_SUPERSET: ${{ inputs.install-superset }}
INPUT_REQUIREMENTS_TYPE: ${{ inputs.requirements-type }}
run: |
if [ "$INPUT_INSTALL_SUPERSET" = "true" ]; then
if [ "${{ inputs.install-superset }}" = "true" ]; then
sudo apt-get update && sudo apt-get -y install libldap2-dev libsasl2-dev
pip install --upgrade pip setuptools wheel uv
if [ "$INPUT_REQUIREMENTS_TYPE" = "dev" ]; then
if [ "${{ inputs.requirements-type }}" = "dev" ]; then
uv pip install --system -r requirements/development.txt
elif [ "$INPUT_REQUIREMENTS_TYPE" = "base" ]; then
elif [ "${{ inputs.requirements-type }}" = "base" ]; then
uv pip install --system -r requirements/base.txt
fi

View File

@@ -26,25 +26,16 @@ runs:
- name: Set up QEMU
if: ${{ inputs.build == 'true' }}
uses: docker/setup-qemu-action@06116385d9baf250c9f4dcb4858b16962ea869c3 # v4.1.0
with:
# Pin the binfmt image to a specific QEMU release. The default
# (`tonistiigi/binfmt:latest`) is a moving target, and drift across
# QEMU's x86_64→aarch64 translator has been the proximate cause of
# intermittent `exit code: 132` (SIGILL) failures during the arm64
# leg of the multi-platform docker build — newer Node native modules
# emit instructions QEMU's user-mode emulation occasionally drops on
# the floor. Pinning a known-good release stabilises that path.
image: tonistiigi/binfmt:qemu-v8.1.5
uses: docker/setup-qemu-action@29109295f81e9208d7d86ff1c6c12d2833863392 # v3.6.0
- name: Set up Docker Buildx
if: ${{ inputs.build == 'true' }}
uses: docker/setup-buildx-action@d7f5e7f509e45cec5c76c4d5afdd7de93d0b3df5 # v4.1.0
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3.12.0
- name: Try to login to DockerHub
if: ${{ inputs.login-to-dockerhub == 'true' }}
continue-on-error: true
uses: docker/login-action@650006c6eb7dba73a995cc03b0b2d7f5ca915bee # v4.2.0
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # v3.7.0
with:
username: ${{ inputs.dockerhub-user }}
password: ${{ inputs.dockerhub-token }}

View File

@@ -10,7 +10,7 @@ runs:
steps:
- name: Setup Node Env
uses: actions/setup-node@48b55a011bda9f5d6aeb4c2d9c7362e8dae4041e # v6.4.0
uses: actions/setup-node@v4
with:
node-version: '20'
@@ -21,9 +21,8 @@ runs:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
if: ${{ inputs.from-npm == 'false' }}
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
uses: actions/checkout@v4
with:
persist-credentials: false
repository: apache-superset/supersetbot
path: supersetbot

View File

@@ -1,6 +1,7 @@
version: 2
enable-beta-ecosystems: true
updates:
- package-ecosystem: "github-actions"
directory: "/"
ignore:
@@ -9,11 +10,15 @@ updates:
- dependency-name: anthropics/claude-code-action
schedule:
interval: "daily"
cooldown:
default-days: 7
- package-ecosystem: "npm"
ignore:
# TODO: remove below entries until React >= 18.0.0
- dependency-name: "storybook"
update-types: ["version-update:semver-major", "version-update:semver-minor"]
- dependency-name: "@storybook*"
update-types: ["version-update:semver-major", "version-update:semver-minor"]
- dependency-name: "eslint-plugin-storybook"
- dependency-name: "react-error-boundary"
- dependency-name: "@rjsf/*"
# remark-gfm v4+ requires react-markdown v9+, which needs React 18
@@ -36,6 +41,14 @@ updates:
# and confirm the issue https://github.com/apache/superset/issues/39600 is fixed
- dependency-name: "react-checkbox-tree"
update-types: ["version-update:semver-major"]
groups:
storybook:
applies-to: version-updates
patterns:
- "@storybook*"
- "storybook"
update-types:
- "patch"
directory: "/superset-frontend/"
schedule:
interval: "daily"
@@ -44,25 +57,16 @@ updates:
- dependabot
open-pull-requests-limit: 30
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "pip"
directory: "/"
open-pull-requests-limit: 10
# Bump the lower bound to the new version, not just widen the upper
# bound. Without this, a `sqlglot>=28.10.0, <29` constraint upgraded
# to `<30` would keep the stale lower bound forever, dragging
# transitively-resolved versions with it. See #40186 (review thread).
versioning-strategy: increase
schedule:
interval: "weekly"
labels:
- pip
- dependabot
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: ".github/actions"
@@ -70,19 +74,29 @@ updates:
interval: "daily"
open-pull-requests-limit: 10
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: "/docs/"
ignore:
# TODO: remove below entries until React >= 18.0.0 in superset-frontend
- dependency-name: "storybook"
update-types: ["version-update:semver-major", "version-update:semver-minor"]
- dependency-name: "@storybook*"
update-types: ["version-update:semver-major", "version-update:semver-minor"]
- dependency-name: "eslint-plugin-storybook"
- dependency-name: "react-error-boundary"
groups:
storybook:
applies-to: version-updates
patterns:
- "@storybook*"
- "storybook"
update-types:
- "patch"
schedule:
interval: "daily"
open-pull-requests-limit: 10
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: "/superset-websocket/"
@@ -92,8 +106,6 @@ updates:
- npm
- dependabot
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: "/superset-websocket/utils/client-ws-app/"
@@ -104,8 +116,6 @@ updates:
- dependabot
open-pull-requests-limit: 10
versioning-strategy: increase
cooldown:
default-days: 7
# Now for all of our plugins and packages!
@@ -118,8 +128,6 @@ updates:
- dependabot
open-pull-requests-limit: 5
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: "/superset-frontend/plugins/legacy-plugin-chart-partition/"
@@ -130,8 +138,6 @@ updates:
- dependabot
open-pull-requests-limit: 5
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: "/superset-frontend/plugins/legacy-plugin-chart-world-map/"
@@ -142,8 +148,6 @@ updates:
- dependabot
open-pull-requests-limit: 5
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: "/superset-frontend/plugins/plugin-chart-pivot-table/"
@@ -157,8 +161,6 @@ updates:
- dependabot
open-pull-requests-limit: 5
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: "/superset-frontend/plugins/legacy-plugin-chart-chord/"
@@ -169,8 +171,6 @@ updates:
- dependabot
open-pull-requests-limit: 5
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: "/superset-frontend/plugins/legacy-plugin-chart-horizon/"
@@ -181,8 +181,6 @@ updates:
- dependabot
open-pull-requests-limit: 5
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: "/superset-frontend/plugins/legacy-plugin-chart-rose/"
@@ -193,8 +191,6 @@ updates:
- dependabot
open-pull-requests-limit: 5
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: "/superset-frontend/plugins/legacy-preset-chart-deckgl/"
@@ -205,8 +201,6 @@ updates:
- dependabot
open-pull-requests-limit: 5
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: "/superset-frontend/plugins/plugin-chart-table/"
@@ -220,8 +214,6 @@ updates:
- dependabot
open-pull-requests-limit: 5
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: "/superset-frontend/plugins/legacy-plugin-chart-country-map/"
@@ -232,8 +224,6 @@ updates:
- dependabot
open-pull-requests-limit: 5
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: "/superset-frontend/plugins/legacy-plugin-chart-map-box/"
@@ -244,8 +234,6 @@ updates:
- dependabot
open-pull-requests-limit: 5
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: "/superset-frontend/plugins/legacy-preset-chart-nvd3/"
@@ -256,8 +244,6 @@ updates:
- dependabot
open-pull-requests-limit: 5
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: "/superset-frontend/plugins/plugin-chart-word-cloud/"
@@ -268,8 +254,6 @@ updates:
- dependabot
open-pull-requests-limit: 5
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: "/superset-frontend/plugins/legacy-plugin-chart-paired-t-test/"
@@ -280,8 +264,6 @@ updates:
- dependabot
open-pull-requests-limit: 5
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: "/superset-frontend/plugins/plugin-chart-echarts/"
@@ -292,8 +274,6 @@ updates:
- dependabot
open-pull-requests-limit: 5
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: "/superset-frontend/plugins/plugin-chart-ag-grid-table/"
@@ -304,8 +284,6 @@ updates:
- dependabot
open-pull-requests-limit: 5
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: "/superset-frontend/plugins/plugin-chart-cartodiagram/"
@@ -316,8 +294,6 @@ updates:
- dependabot
open-pull-requests-limit: 5
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: "/superset-frontend/plugins/legacy-plugin-chart-parallel-coordinates/"
@@ -328,8 +304,6 @@ updates:
- dependabot
open-pull-requests-limit: 5
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: "/superset-frontend/plugins/plugin-chart-handlebars/"
@@ -344,8 +318,6 @@ updates:
- dependabot
open-pull-requests-limit: 5
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: "/superset-frontend/packages/generator-superset/"
@@ -356,8 +328,6 @@ updates:
- dependabot
open-pull-requests-limit: 5
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: "/superset-frontend/packages/superset-ui-chart-controls/"
@@ -368,8 +338,6 @@ updates:
- dependabot
open-pull-requests-limit: 5
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: "/superset-frontend/packages/superset-ui-core/"
@@ -385,8 +353,6 @@ updates:
- dependabot
open-pull-requests-limit: 5
versioning-strategy: increase
cooldown:
default-days: 7
- package-ecosystem: "npm"
directory: "/superset-frontend/packages/superset-ui-switchboard/"
@@ -397,5 +363,3 @@ updates:
- dependabot
open-pull-requests-limit: 5
versioning-strategy: increase
cooldown:
default-days: 7

View File

@@ -20,6 +20,10 @@ set -e
GITHUB_WORKSPACE=${GITHUB_WORKSPACE:-.}
ASSETS_MANIFEST="$GITHUB_WORKSPACE/superset/static/assets/manifest.json"
# Rounded job start time, used to create a unique Cypress build id for
# parallelization so we can manually rerun a job after 20 minutes
NONCE=$(echo "$(date "+%Y%m%d%H%M") - ($(date +%M)%20)" | bc)
# Echo only when not in parallel mode
say() {
if [[ $(echo "$INPUT_PARALLEL" | tr '[:lower:]' '[:upper:]') != 'TRUE' ]]; then
@@ -55,15 +59,6 @@ build-assets() {
say "::endgroup::"
}
build-embedded-sdk() {
cd "$GITHUB_WORKSPACE/superset-embedded-sdk"
say "::group::Build embedded SDK bundle for E2E tests"
npm ci
npm run build
say "::endgroup::"
}
build-instrumented-assets() {
cd "$GITHUB_WORKSPACE/superset-frontend"
@@ -180,13 +175,10 @@ cypress-run-all() {
local APP_ROOT=$2
cd "$GITHUB_WORKSPACE/superset-frontend/cypress-base"
# Start the Superset backend via gunicorn (not `flask run`). The Flask
# development server is single-threaded and has no crash-recovery, so
# heavy tests (dashboard import/export, SQL Lab) can knock it offline
# for the rest of the run — surfacing as `ECONNREFUSED` / `socket hang up`
# / `Missing CSRF token` cascades. Gunicorn gives us multiple workers,
# a request timeout, and worker-recycling under load.
local serverlog="${HOME}/superset-cypress.log"
# Start Flask and run it in background
# --no-debugger means disable the interactive debugger on the 500 page
# so errors can print to stderr.
local flasklog="${HOME}/flask.log"
local port=8081
CYPRESS_BASE_URL="http://localhost:${port}"
if [ -n "$APP_ROOT" ]; then
@@ -195,58 +187,8 @@ cypress-run-all() {
fi
export CYPRESS_BASE_URL
# Mirrors the args in docker/entrypoints/run-server.sh (1 worker × 20
# gthread threads) to keep parity with production. Multi-worker
# configurations expose timing-sensitive races in the SQL Lab → Explore
# navigation flow under E2E. We diverge from the entrypoint on:
# --timeout 120: heavy dashboard import/export specs exceed the 60s
# default
# --max-requests / --max-requests-jitter: recycle the worker under
# test load to avoid leaks accumulating across the run
# superset.app:create_app(): explicit factory so we don't depend on
# FLASK_APP being exported
nohup gunicorn \
--bind "127.0.0.1:$port" \
--workers 1 \
--worker-class gthread \
--threads 20 \
--timeout 120 \
--max-requests 500 \
--max-requests-jitter 50 \
--access-logfile - \
--error-logfile - \
"superset.app:create_app()" \
>"$serverlog" 2>&1 </dev/null &
local serverPid=$!
# Ensure the backend is cleaned up and its log is emitted even when the
# test runner fails under `set -e`.
trap '
echo "::group::gunicorn log for Cypress run"
cat "'"$serverlog"'" || true
echo "::endgroup::"
kill '"$serverPid"' 2>/dev/null || true
' EXIT
# Wait for the backend to be ready before launching Cypress; otherwise
# the first spec can race the server bind and see connection errors.
local timeout=60
say "Waiting for gunicorn server to start on port $port..."
while [ $timeout -gt 0 ]; do
if curl -f "http://localhost:${port}${APP_ROOT}/health" >/dev/null 2>&1; then
say "gunicorn server is ready"
break
fi
sleep 1
timeout=$((timeout - 1))
done
if [ $timeout -eq 0 ]; then
echo "::error::gunicorn server failed to start within 60 seconds"
echo "::group::Server startup log"
cat "$serverlog"
echo "::endgroup::"
return 1
fi
nohup flask run --no-debugger -p $port >"$flasklog" 2>&1 </dev/null &
local flaskProcessId=$!
USE_DASHBOARD_FLAG=''
if [ "$USE_DASHBOARD" = "true" ]; then
@@ -258,6 +200,13 @@ cypress-run-all() {
# memoryMonitorPid=$!
python ../../scripts/cypress_run.py --parallelism $PARALLELISM --parallelism-id $PARALLEL_ID --group $PARALLEL_ID --retries 5 $USE_DASHBOARD_FLAG
# kill $memoryMonitorPid
# After job is done, print out Flask log for debugging
echo "::group::Flask log for default run"
cat "$flasklog"
echo "::endgroup::"
# make sure the program exits
kill $flaskProcessId
}
playwright-install() {
@@ -275,55 +224,29 @@ playwright-run() {
local APP_ROOT=$1
local TEST_PATH=$2
# Start the Superset backend via gunicorn from the project root.
# See cypress-run-all() above for the rationale — the Flask dev server
# cannot survive the dashboard import/export tests under load.
# Start Flask from the project root (same as Cypress)
cd "$GITHUB_WORKSPACE"
local serverlog="${HOME}/superset-playwright.log"
local flasklog="${HOME}/flask-playwright.log"
local port=8081
# Use 127.0.0.1 explicitly: `flask run` binds IPv4 only, and Node's DNS
# resolution for `localhost` can return `::1` first (IPv6), which then
# refuses against the IPv4 listener and surfaces as
# `connect ECONNREFUSED ::1:<port>` in API helpers driven from Node
# (e.g., the embedded test app's exposed token fetcher).
PLAYWRIGHT_BASE_URL="http://127.0.0.1:${port}"
PLAYWRIGHT_BASE_URL="http://localhost:${port}"
if [ -n "$APP_ROOT" ]; then
export SUPERSET_APP_ROOT=$APP_ROOT
PLAYWRIGHT_BASE_URL=${PLAYWRIGHT_BASE_URL}${APP_ROOT}/
fi
export PLAYWRIGHT_BASE_URL
# See cypress-run-all() above for the args rationale (1 worker × 20
# gthread threads matching docker/entrypoints/run-server.sh, plus a
# 120s timeout and request-recycling for heavy E2E load).
nohup gunicorn \
--bind "127.0.0.1:$port" \
--workers 1 \
--worker-class gthread \
--threads 20 \
--timeout 120 \
--max-requests 500 \
--max-requests-jitter 50 \
--access-logfile - \
--error-logfile - \
"superset.app:create_app()" \
>"$serverlog" 2>&1 </dev/null &
local serverPid=$!
nohup flask run --no-debugger -p $port >"$flasklog" 2>&1 </dev/null &
local flaskProcessId=$!
# Ensure cleanup on exit (and emit the server log on failure)
trap '
echo "::group::gunicorn log for Playwright run"
cat "'"$serverlog"'" || true
echo "::endgroup::"
kill '"$serverPid"' 2>/dev/null || true
' EXIT
# Ensure cleanup on exit
trap "kill $flaskProcessId 2>/dev/null || true" EXIT
# Wait for server to be ready with health check
local timeout=60
say "Waiting for gunicorn server to start on port $port..."
say "Waiting for Flask server to start on port $port..."
while [ $timeout -gt 0 ]; do
if curl -f ${PLAYWRIGHT_BASE_URL}/health >/dev/null 2>&1; then
say "gunicorn server is ready"
say "Flask server is ready"
break
fi
sleep 1
@@ -331,9 +254,9 @@ playwright-run() {
done
if [ $timeout -eq 0 ]; then
echo "::error::gunicorn server failed to start within 60 seconds"
echo "::group::Server startup log"
cat "$serverlog"
echo "::error::Flask server failed to start within 60 seconds"
echo "::group::Flask startup log"
cat "$flasklog"
echo "::endgroup::"
return 1
fi
@@ -348,6 +271,7 @@ playwright-run() {
if ! find "playwright/tests/${TEST_PATH}" -name "*.spec.ts" -type f 2>/dev/null | grep -q .; then
echo "No test files found in ${TEST_PATH} - skipping test run"
say "::endgroup::"
kill $flaskProcessId
return 0
fi
echo "Running tests: ${TEST_PATH}"
@@ -364,6 +288,13 @@ playwright-run() {
fi
say "::endgroup::"
# After job is done, print out Flask log for debugging
echo "::group::Flask log for Playwright run"
cat "$flasklog"
echo "::endgroup::"
# make sure the program exits
kill $flaskProcessId
return $status
}

View File

@@ -30,8 +30,9 @@ jobs:
pull-requests: write
checks: write
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: true
ref: master

43
.github/workflows/cancel_duplicates.yml vendored Normal file
View File

@@ -0,0 +1,43 @@
name: Cancel Duplicates
on:
workflow_run:
workflows:
- "Miscellaneous"
types:
- requested
jobs:
cancel-duplicate-runs:
name: Cancel duplicate workflow runs
runs-on: ubuntu-24.04
permissions:
actions: write
contents: read
steps:
- name: Check number of queued tasks
id: check_queued
env:
GITHUB_TOKEN: ${{ github.token }}
GITHUB_REPO: ${{ github.repository }}
run: |
get_count() {
echo $(curl -s -H "Authorization: token $GITHUB_TOKEN" \
"https://api.github.com/repos/$GITHUB_REPO/actions/runs?status=$1" | \
jq ".total_count")
}
count=$(( `get_count queued` + `get_count in_progress` ))
echo "Found $count unfinished jobs."
echo "count=$count" >> $GITHUB_OUTPUT
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
if: steps.check_queued.outputs.count >= 20
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- name: Cancel duplicate workflow runs
if: steps.check_queued.outputs.count >= 20
env:
GITHUB_TOKEN: ${{ github.token }}
GITHUB_REPOSITORY: ${{ github.repository }}
run: |
pip install click requests typing_extensions python-dateutil
python ./scripts/cancel_github_workflows.py

View File

@@ -22,7 +22,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
submodules: recursive
@@ -38,19 +38,6 @@ jobs:
if: steps.check.outputs.python
uses: ./.github/actions/setup-backend/
# Authenticate the Docker daemon so the python:slim pull in
# uv-pip-compile.sh uses our (much higher) authenticated rate limit
# instead of the shared-runner anonymous one. Best-effort: on fork PRs the
# secrets are unavailable, so this no-ops and the pull falls back to
# anonymous (covered by the retry loop in the script).
- name: Login to Docker Hub
if: steps.check.outputs.python
continue-on-error: true
uses: docker/login-action@650006c6eb7dba73a995cc03b0b2d7f5ca915bee # v4.2.0
with:
username: ${{ secrets.DOCKERHUB_USER }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Run uv
if: steps.check.outputs.python
run: ./scripts/uv-pip-compile.sh

View File

@@ -25,9 +25,7 @@ jobs:
pull-requests: write
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
with:
persist-credentials: false
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- name: Check and notify
uses: actions/github-script@3a2844b7e9c422d3c10d287c895573f7108da1b3 # v9.0.0
with:

View File

@@ -6,9 +6,6 @@ on:
pull_request_review_comment:
types: [created]
permissions:
contents: read
jobs:
check-permissions:
if: |
@@ -75,14 +72,13 @@ jobs:
issues: write
id-token: write
steps:
- name: Checkout repository
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
with:
persist-credentials: false
fetch-depth: 1
- name: Checkout repository
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
fetch-depth: 1
- name: Run Claude PR Action
uses: anthropics/claude-code-action@5fb899572b81d2bb648d4d187173a2f423a9677c # beta
with:
anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
timeout_minutes: "60"
- name: Run Claude PR Action
uses: anthropics/claude-code-action@5fb899572b81d2bb648d4d187173a2f423a9677c # beta
with:
anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
timeout_minutes: "60"

View File

@@ -15,35 +15,9 @@ concurrency:
cancel-in-progress: true
jobs:
changes:
runs-on: ubuntu-24.04
timeout-minutes: 10
permissions:
contents: read
pull-requests: read
outputs:
python: ${{ steps.check.outputs.python }}
frontend: ${{ steps.check.outputs.frontend }}
steps:
- name: Checkout
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
with:
persist-credentials: false
- name: Check for file changes
id: check
uses: ./.github/actions/change-detector/
with:
token: ${{ secrets.GITHUB_TOKEN }}
analyze:
name: Analyze
needs: changes
# Skip on PRs that touch neither code group (e.g. docs-only) so the
# analysis runners don't spin up. push/schedule runs always proceed:
# the change-detector returns "all changed" for non-PR events.
if: needs.changes.outputs.python == 'true' || needs.changes.outputs.frontend == 'true'
runs-on: ubuntu-24.04
timeout-minutes: 30
permissions:
actions: read
contents: read
@@ -57,13 +31,17 @@ jobs:
steps:
- name: Checkout repository
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- name: Check for file changes
id: check
uses: ./.github/actions/change-detector/
with:
persist-credentials: false
token: ${{ secrets.GITHUB_TOKEN }}
# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@8aad20d150bbac5944a9f9d289da16a4b0d87c1e # v4
uses: github/codeql-action/init@9e0d7b8d25671d64c341c19c0152d693099fb5ba # v4
with:
languages: ${{ matrix.language }}
# If you wish to specify custom queries, you can do so here or in a config file.
@@ -74,6 +52,7 @@ jobs:
# queries: security-extended,security-and-quality
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@8aad20d150bbac5944a9f9d289da16a4b0d87c1e # v4
if: steps.check.outputs.python || steps.check.outputs.frontend
uses: github/codeql-action/analyze@9e0d7b8d25671d64c341c19c0152d693099fb5ba # v4
with:
category: "/language:${{matrix.language}}"

View File

@@ -27,9 +27,7 @@ jobs:
runs-on: ubuntu-24.04
steps:
- name: "Checkout Repository"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
with:
persist-credentials: false
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- name: "Dependency Review"
uses: actions/dependency-review-action@a1d282b36b6f3519aa1f3fc636f609c47dddb294 # v5.0.0
continue-on-error: true
@@ -51,9 +49,7 @@ jobs:
runs-on: ubuntu-22.04
steps:
- name: "Checkout Repository"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
with:
persist-credentials: false
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- name: Setup Python
uses: ./.github/actions/setup-backend/

View File

@@ -18,30 +18,9 @@ concurrency:
cancel-in-progress: true
jobs:
changes:
runs-on: ubuntu-24.04
timeout-minutes: 10
permissions:
contents: read
pull-requests: read
outputs:
python: ${{ steps.check.outputs.python }}
frontend: ${{ steps.check.outputs.frontend }}
docker: ${{ steps.check.outputs.docker }}
steps:
- name: Checkout
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
with:
persist-credentials: false
- name: Check for file changes
id: check
uses: ./.github/actions/change-detector/
with:
token: ${{ secrets.GITHUB_TOKEN }}
setup_matrix:
runs-on: ubuntu-24.04
timeout-minutes: 5
outputs:
matrix_config: ${{ steps.set_matrix.outputs.matrix_config }}
steps:
@@ -53,13 +32,8 @@ jobs:
docker-build:
name: docker-build
needs: [setup_matrix, changes]
if: >-
needs.changes.outputs.python == 'true' ||
needs.changes.outputs.frontend == 'true' ||
needs.changes.outputs.docker == 'true'
needs: setup_matrix
runs-on: ubuntu-24.04
timeout-minutes: 60
strategy:
matrix:
build_preset: ${{fromJson(needs.setup_matrix.outputs.matrix_config)}}
@@ -70,12 +44,20 @@ jobs:
IMAGE_TAG: apache/superset:GHA-${{ matrix.build_preset }}-${{ github.run_id }}
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
- name: Check for file changes
id: check
uses: ./.github/actions/change-detector/
with:
token: ${{ secrets.GITHUB_TOKEN }}
- name: Setup Docker Environment
if: steps.check.outputs.python || steps.check.outputs.frontend || steps.check.outputs.docker
uses: ./.github/actions/setup-docker
with:
dockerhub-user: ${{ secrets.DOCKERHUB_USER }}
@@ -83,27 +65,28 @@ jobs:
build: "true"
- name: Setup supersetbot
if: steps.check.outputs.python || steps.check.outputs.frontend || steps.check.outputs.docker
uses: ./.github/actions/setup-supersetbot/
- name: Build Docker Image
if: steps.check.outputs.python || steps.check.outputs.frontend || steps.check.outputs.docker
shell: bash
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
BUILD_PRESET: ${{ matrix.build_preset }}
run: |
# Single platform builds in pull_request context to speed things up
if [ "$GITHUB_EVENT_NAME" = "push" ]; then
if [ "${{ github.event_name }}" = "push" ]; then
PLATFORM_ARG="--platform linux/arm64 --platform linux/amd64"
# can only --load images in single-platform builds
PUSH_OR_LOAD="--push"
elif [ "$GITHUB_EVENT_NAME" = "pull_request" ]; then
elif [ "${{ github.event_name }}" = "pull_request" ]; then
PLATFORM_ARG="--platform linux/amd64"
PUSH_OR_LOAD="--load"
fi
supersetbot docker \
$PUSH_OR_LOAD \
--preset "$BUILD_PRESET" \
--preset ${{ matrix.build_preset }} \
--context "$EVENT" \
--context-ref "$RELEASE" $FORCE_LATEST \
--extra-flags "--build-arg INCLUDE_CHROMIUM=false --tag $IMAGE_TAG" \
@@ -111,14 +94,11 @@ jobs:
# in the context of push (using multi-platform build), we need to pull the image locally
- name: Docker pull
if: github.event_name == 'push'
run: |
for i in 1 2 3; do
docker pull $IMAGE_TAG && break
[ $i -lt 3 ] && sleep 30
done
if: github.event_name == 'push' && (steps.check.outputs.python || steps.check.outputs.frontend || steps.check.outputs.docker)
run: docker pull $IMAGE_TAG
- name: Print docker stats
if: steps.check.outputs.python || steps.check.outputs.frontend || steps.check.outputs.docker
run: |
echo "SHA: ${{ github.sha }}"
echo "IMAGE: $IMAGE_TAG"
@@ -126,12 +106,10 @@ jobs:
docker history $IMAGE_TAG
- name: docker-compose sanity check
if: matrix.build_preset == 'dev'
if: (steps.check.outputs.python || steps.check.outputs.frontend || steps.check.outputs.docker) && matrix.build_preset == 'dev'
shell: bash
env:
BUILD_PRESET: ${{ matrix.build_preset }}
run: |
export SUPERSET_BUILD_TARGET=$BUILD_PRESET
export SUPERSET_BUILD_TARGET=${{ matrix.build_preset }}
# This should reuse the CACHED image built in the previous steps
docker compose build superset-init --build-arg DEV_MODE=false --build-arg INCLUDE_CHROMIUM=false
docker compose up superset-init --exit-code-from superset-init
@@ -139,16 +117,20 @@ jobs:
docker-compose-image-tag:
# Run this job only on pushes to master (not for PRs)
# goal is to check that building the latest image works, not required for all PR pushes
needs: changes
if: github.event_name == 'push' && github.ref == 'refs/heads/master' && needs.changes.outputs.docker == 'true'
if: github.event_name == 'push' && github.ref == 'refs/heads/master'
runs-on: ubuntu-24.04
timeout-minutes: 30
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
- name: Check for file changes
id: check
uses: ./.github/actions/change-detector/
with:
token: ${{ secrets.GITHUB_TOKEN }}
- name: Setup Docker Environment
if: steps.check.outputs.docker
uses: ./.github/actions/setup-docker
with:
dockerhub-user: ${{ secrets.DOCKERHUB_USER }}
@@ -156,6 +138,7 @@ jobs:
build: "false"
install-docker-compose: "true"
- name: docker-compose sanity check
if: steps.check.outputs.docker
shell: bash
run: |
docker compose -f docker-compose-image-tag.yml up superset-init --exit-code-from superset-init

View File

@@ -33,13 +33,11 @@ jobs:
run:
working-directory: superset-embedded-sdk
steps:
- uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
with:
persist-credentials: false
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: actions/setup-node@48b55a011bda9f5d6aeb4c2d9c7362e8dae4041e # v6
with:
node-version-file: "./superset-embedded-sdk/.nvmrc"
registry-url: "https://registry.npmjs.org"
node-version-file: './superset-embedded-sdk/.nvmrc'
registry-url: 'https://registry.npmjs.org'
- run: npm ci
- run: npm run ci:release
env:

View File

@@ -21,13 +21,11 @@ jobs:
run:
working-directory: superset-embedded-sdk
steps:
- uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
with:
persist-credentials: false
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- uses: actions/setup-node@48b55a011bda9f5d6aeb4c2d9c7362e8dae4041e # v6
with:
node-version-file: "./superset-embedded-sdk/.nvmrc"
registry-url: "https://registry.npmjs.org"
node-version-file: './superset-embedded-sdk/.nvmrc'
registry-url: 'https://registry.npmjs.org'
- run: npm ci
- run: npm test
- run: npm run build

View File

@@ -0,0 +1,83 @@
name: Cleanup ephemeral envs (PR close) [DEPRECATED]
# ⚠️ DEPRECATION NOTICE ⚠️
# This workflow is deprecated and will be removed in a future version.
# The new Superset Showtime workflow handles cleanup automatically.
# See .github/workflows/showtime.yml and showtime-cleanup.yml for replacements.
# Migration guide: https://github.com/mistercrunch/superset-showtime
on:
pull_request_target:
types: [closed]
jobs:
config:
runs-on: ubuntu-24.04
outputs:
has-secrets: ${{ steps.check.outputs.has-secrets }}
steps:
- name: "Check for secrets"
id: check
shell: bash
run: |
if [ -n "${AWS_ACCESS_KEY_ID}" ]; then
echo "has-secrets=1" >> "$GITHUB_OUTPUT"
fi
env:
AWS_ACCESS_KEY_ID: ${{ (secrets.AWS_ACCESS_KEY_ID != '' && secrets.AWS_SECRET_ACCESS_KEY != '') || '' }}
ephemeral-env-cleanup:
needs: config
if: needs.config.outputs.has-secrets
name: Cleanup ephemeral envs
runs-on: ubuntu-24.04
permissions:
pull-requests: write
steps:
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@8df5847569e6427dd6c4fb1cf565c83acfa8afa7 # v6
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: us-west-2
- name: Describe ECS service
id: describe-services
run: |
echo "active=$(aws ecs describe-services --cluster superset-ci --services pr-${{ github.event.number }}-service | jq '.services[] | select(.status == "ACTIVE") | any')" >> $GITHUB_OUTPUT
- name: Delete ECS service
if: steps.describe-services.outputs.active == 'true'
id: delete-service
run: |
aws ecs delete-service \
--cluster superset-ci \
--service pr-${{ github.event.number }}-service \
--force
- name: Login to Amazon ECR
if: steps.describe-services.outputs.active == 'true'
id: login-ecr
uses: aws-actions/amazon-ecr-login@fa648b43de3d4d023bcb3f89ed6940096949c419 # v2
- name: Delete ECR image tag
if: steps.describe-services.outputs.active == 'true'
id: delete-image-tag
run: |
aws ecr batch-delete-image \
--registry-id $(echo "${{ steps.login-ecr.outputs.registry }}" | grep -Eo "^[0-9]+") \
--repository-name superset-ci \
--image-ids imageTag=pr-${{ github.event.number }}
- name: Comment (success)
if: steps.describe-services.outputs.active == 'true'
uses: actions/github-script@3a2844b7e9c422d3c10d287c895573f7108da1b3 # v9.0.0
with:
github-token: ${{github.token}}
script: |
github.rest.issues.createComment({
issue_number: ${{ github.event.number }},
owner: context.repo.owner,
repo: context.repo.repo,
body: '⚠️ **DEPRECATED WORKFLOW** - Ephemeral environment shutdown and build artifacts deleted. Please migrate to the new Superset Showtime system for future PRs.'
})

350
.github/workflows/ephemeral-env.yml vendored Normal file
View File

@@ -0,0 +1,350 @@
name: Ephemeral env workflow [DEPRECATED]
# ⚠️ DEPRECATION NOTICE ⚠️
# This workflow is deprecated and will be removed in a future version.
# Please use the new Superset Showtime workflow instead:
# - Use label "🎪 trigger-start" instead of "testenv-up"
# - Showtime provides better reliability and easier management
# - See .github/workflows/showtime.yml for the replacement
# - Migration guide: https://github.com/mistercrunch/superset-showtime
# Example manual trigger:
# gh workflow run ephemeral-env.yml --ref fix_ephemerals --field label_name="testenv-up" --field issue_number=666
on:
pull_request_target:
types:
- labeled
workflow_dispatch:
inputs:
label_name:
description: 'Label name to simulate label-based /testenv trigger'
required: true
default: 'testenv-up'
issue_number:
description: 'Issue or PR number'
required: true
jobs:
ephemeral-env-label:
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.run_id }}-label
cancel-in-progress: true
name: Evaluate ephemeral env label trigger
runs-on: ubuntu-24.04
permissions:
pull-requests: write
outputs:
slash-command: ${{ steps.eval-label.outputs.result }}
feature-flags: ${{ steps.eval-feature-flags.outputs.result }}
sha: ${{ steps.get-sha.outputs.sha }}
env:
DOCKERHUB_USER: ${{ secrets.DOCKERHUB_USER }}
DOCKERHUB_TOKEN: ${{ secrets.DOCKERHUB_TOKEN }}
steps:
- name: Check for the "testenv-up" label
id: eval-label
run: |
if [[ "${{ github.event_name }}" == "workflow_dispatch" ]]; then
LABEL_NAME="${INPUT_LABEL_NAME}"
else
LABEL_NAME="${{ github.event.label.name }}"
fi
echo "Evaluating label: $LABEL_NAME"
if [[ "$LABEL_NAME" == "testenv-up" ]]; then
echo "result=up" >> $GITHUB_OUTPUT
else
echo "result=noop" >> $GITHUB_OUTPUT
fi
env:
INPUT_LABEL_NAME: ${{ github.event.inputs.label_name }}
- name: Get event SHA
id: get-sha
if: steps.eval-label.outputs.result == 'up'
uses: actions/github-script@3a2844b7e9c422d3c10d287c895573f7108da1b3 # v9.0.0
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
script: |
let prSha;
// If event is workflow_dispatch, use the issue_number from inputs
if (context.eventName === "workflow_dispatch") {
const prNumber = "${{ github.event.inputs.issue_number }}";
if (!prNumber) {
console.log("No PR number found.");
return;
}
// Fetch PR details using the provided issue_number
const { data: pr } = await github.rest.pulls.get({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: prNumber
});
prSha = pr.head.sha;
} else {
// If it's not workflow_dispatch, use the PR head sha from the event
prSha = context.payload.pull_request.head.sha;
}
console.log(`PR SHA: ${prSha}`);
core.setOutput("sha", prSha);
- name: Looking for feature flags in PR description
uses: actions/github-script@3a2844b7e9c422d3c10d287c895573f7108da1b3 # v9.0.0
id: eval-feature-flags
if: steps.eval-label.outputs.result == 'up'
with:
script: |
const description = context.payload.pull_request
? context.payload.pull_request.body || ''
: context.payload.inputs.pr_description || '';
const pattern = /FEATURE_(\w+)=(\w+)/g;
let results = [];
[...description.matchAll(pattern)].forEach(match => {
const config = {
name: `SUPERSET_FEATURE_${match[1]}`,
value: match[2],
};
results.push(config);
});
return results;
- name: Reply with confirmation comment
uses: actions/github-script@3a2844b7e9c422d3c10d287c895573f7108da1b3 # v9.0.0
if: steps.eval-label.outputs.result == 'up'
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
script: |
const action = '${{ steps.eval-label.outputs.result }}';
const user = context.actor;
const runId = context.runId;
const workflowUrl = `${context.serverUrl}/${context.repo.owner}/${context.repo.repo}/actions/runs/${runId}`;
const issueNumber = context.payload.pull_request
? context.payload.pull_request.number
: context.payload.inputs.issue_number;
if (!issueNumber) {
throw new Error("Issue number is not available.");
}
const body = `⚠️ **DEPRECATED WORKFLOW** ⚠️\n\n@${user} This workflow is deprecated! Please use the new **Superset Showtime** system instead:\n\n` +
`- Replace "testenv-up" label with "🎪 trigger-start"\n` +
`- Better reliability and easier management\n` +
`- See https://github.com/mistercrunch/superset-showtime for details\n\n` +
`Processing your ephemeral environment request [here](${workflowUrl}). Action: **${action}**.` +
` More information on [how to use or configure ephemeral environments]` +
`(https://superset.apache.org/docs/contributing/howtos/#github-ephemeral-environments)`;
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: issueNumber,
body,
});
ephemeral-docker-build:
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.run_id }}-build
cancel-in-progress: true
needs: ephemeral-env-label
if: needs.ephemeral-env-label.outputs.slash-command == 'up'
name: ephemeral-docker-build
runs-on: ubuntu-24.04
steps:
- name: "Checkout ${{ github.ref }} ( ${{ needs.ephemeral-env-label.outputs.sha }} : ${{steps.get-sha.outputs.sha}} )"
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
ref: ${{ needs.ephemeral-env-label.outputs.sha }}
persist-credentials: false
- name: Setup Docker Environment
uses: ./.github/actions/setup-docker
with:
dockerhub-user: ${{ secrets.DOCKERHUB_USER }}
dockerhub-token: ${{ secrets.DOCKERHUB_TOKEN }}
build: "true"
install-docker-compose: "false"
- name: Setup supersetbot
uses: ./.github/actions/setup-supersetbot/
- name: Build ephemeral env image
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
supersetbot docker \
--push \
--load \
--preset ci \
--platform linux/amd64 \
--context-ref "$RELEASE" \
--extra-flags "--build-arg INCLUDE_CHROMIUM=false"
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@8df5847569e6427dd6c4fb1cf565c83acfa8afa7 # v6
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: us-west-2
- name: Login to Amazon ECR
id: login-ecr
uses: aws-actions/amazon-ecr-login@fa648b43de3d4d023bcb3f89ed6940096949c419 # v2
- name: Load, tag and push image to ECR
id: push-image
env:
ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
ECR_REPOSITORY: superset-ci
IMAGE_TAG: apache/superset:${{ needs.ephemeral-env-label.outputs.sha }}-ci
PR_NUMBER: ${{ github.event.inputs.issue_number || github.event.pull_request.number }}
run: |
docker tag $IMAGE_TAG $ECR_REGISTRY/$ECR_REPOSITORY:pr-$PR_NUMBER-ci
docker push -a $ECR_REGISTRY/$ECR_REPOSITORY
ephemeral-env-up:
needs: [ephemeral-env-label, ephemeral-docker-build]
if: needs.ephemeral-env-label.outputs.slash-command == 'up'
name: Spin up an ephemeral environment
runs-on: ubuntu-24.04
permissions:
contents: read
pull-requests: write
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@8df5847569e6427dd6c4fb1cf565c83acfa8afa7 # v6
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: us-west-2
- name: Login to Amazon ECR
id: login-ecr
uses: aws-actions/amazon-ecr-login@fa648b43de3d4d023bcb3f89ed6940096949c419 # v2
- name: Check target image exists in ECR
id: check-image
continue-on-error: true
env:
PR_NUMBER: ${{ github.event.inputs.issue_number || github.event.pull_request.number }}
run: |
aws ecr describe-images \
--registry-id $(echo "${{ steps.login-ecr.outputs.registry }}" | grep -Eo "^[0-9]+") \
--repository-name superset-ci \
--image-ids imageTag=pr-$PR_NUMBER-ci
- name: Fail on missing container image
if: steps.check-image.outcome == 'failure'
uses: actions/github-script@3a2844b7e9c422d3c10d287c895573f7108da1b3 # v9.0.0
with:
github-token: ${{ github.token }}
script: |
const errMsg = '@${{ github.event.comment.user.login }} Container image not yet published for this PR. Please try again when build is complete.';
github.rest.issues.createComment({
issue_number: ${{ github.event.inputs.issue_number || github.event.pull_request.number }},
owner: context.repo.owner,
repo: context.repo.repo,
body: errMsg
});
core.setFailed(errMsg);
- name: Fill in the new image ID in the Amazon ECS task definition
id: task-def
uses: aws-actions/amazon-ecs-render-task-definition@6853cfae8c3a7d978fbf68b5a55453395541dfbb # v1
with:
task-definition: .github/workflows/ecs-task-definition.json
container-name: superset-ci
image: ${{ steps.login-ecr.outputs.registry }}/superset-ci:pr-${{ github.event.inputs.issue_number || github.event.pull_request.number }}-ci
- name: Update env vars in the Amazon ECS task definition
run: |
cat <<< "$(jq '.containerDefinitions[0].environment += ${{ needs.ephemeral-env-label.outputs.feature-flags }}' < ${{ steps.task-def.outputs.task-definition }})" > ${{ steps.task-def.outputs.task-definition }}
- name: Describe ECS service
id: describe-services
run: |
echo "active=$(aws ecs describe-services --cluster superset-ci --services pr-${INPUT_ISSUE_NUMBER}-service | jq '.services[] | select(.status == "ACTIVE") | any')" >> $GITHUB_OUTPUT
env:
INPUT_ISSUE_NUMBER: ${{ github.event.inputs.issue_number || github.event.pull_request.number }}
- name: Create ECS service
id: create-service
if: steps.describe-services.outputs.active != 'true'
env:
ECR_SUBNETS: subnet-0e15a5034b4121710,subnet-0e8efef4a72224974
ECR_SECURITY_GROUP: sg-092ff3a6ae0574d91
PR_NUMBER: ${{ github.event.inputs.issue_number || github.event.pull_request.number }}
run: |
aws ecs create-service \
--cluster superset-ci \
--service-name pr-$PR_NUMBER-service \
--task-definition superset-ci \
--launch-type FARGATE \
--desired-count 1 \
--platform-version LATEST \
--network-configuration "awsvpcConfiguration={subnets=[$ECR_SUBNETS],securityGroups=[$ECR_SECURITY_GROUP],assignPublicIp=ENABLED}" \
--tags key=pr,value=$PR_NUMBER key=github_user,value=${{ github.actor }}
- name: Deploy Amazon ECS task definition
id: deploy-task
uses: aws-actions/amazon-ecs-deploy-task-definition@a310a830f5c14e583e35d84e4e1ec7dd177c3c9c # v2
with:
task-definition: ${{ steps.task-def.outputs.task-definition }}
service: pr-${{ github.event.inputs.issue_number || github.event.pull_request.number }}-service
cluster: superset-ci
wait-for-service-stability: true
wait-for-minutes: 10
- name: List tasks
id: list-tasks
run: |
echo "task=$(aws ecs list-tasks --cluster superset-ci --service-name pr-${INPUT_ISSUE_NUMBER}-service | jq '.taskArns | first')" >> $GITHUB_OUTPUT
env:
INPUT_ISSUE_NUMBER: ${{ github.event.inputs.issue_number || github.event.pull_request.number }}
- name: Get network interface
id: get-eni
run: |
echo "eni=$(aws ecs describe-tasks --cluster superset-ci --tasks ${{ steps.list-tasks.outputs.task }} | jq '.tasks[0].attachments[0].details | map(select(.name=="networkInterfaceId"))[0].value')" >> $GITHUB_OUTPUT
- name: Get public IP
id: get-ip
run: |
echo "ip=$(aws ec2 describe-network-interfaces --network-interface-ids ${{ steps.get-eni.outputs.eni }} | jq -r '.NetworkInterfaces | first | .Association.PublicIp')" >> $GITHUB_OUTPUT
- name: Comment (success)
if: ${{ success() }}
uses: actions/github-script@3a2844b7e9c422d3c10d287c895573f7108da1b3 # v9.0.0
with:
github-token: ${{github.token}}
script: |
const issue_number = context.payload.inputs?.issue_number || context.issue.number;
github.rest.issues.createComment({
issue_number: issue_number,
owner: context.repo.owner,
repo: context.repo.repo,
body: `@${{ github.actor }} Ephemeral environment spinning up at http://${{ steps.get-ip.outputs.ip }}:8080. Credentials are 'admin'/'admin'. Please allow several minutes for bootstrapping and startup.`
});
- name: Comment (failure)
if: ${{ failure() }}
uses: actions/github-script@3a2844b7e9c422d3c10d287c895573f7108da1b3 # v9.0.0
with:
github-token: ${{github.token}}
script: |
const issue_number = context.payload.inputs?.issue_number || context.issue.number;
github.rest.issues.createComment({
issue_number: issue_number,
owner: context.repo.owner,
repo: context.repo.repo,
body: '@${{ github.event.inputs.user_login || github.event.comment.user.login }} Ephemeral environment creation failed. Please check the Actions logs for details.'
})

View File

@@ -32,7 +32,7 @@ jobs:
runs-on: ubuntu-24.04
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
submodules: recursive

View File

@@ -6,41 +6,26 @@ on:
- "master"
- "[0-9].[0-9]*"
pull_request:
branches:
- "**"
types: [synchronize, opened, reopened, ready_for_review]
permissions:
contents: read
# cancel previous workflow jobs for PRs
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.run_id }}
cancel-in-progress: true
jobs:
validate-all-ghas:
runs-on: ubuntu-24.04
permissions:
contents: read
# Required for the zizmor action to upload its SARIF results to
# GitHub code scanning (advanced-security is enabled by default).
security-events: write
steps:
- name: Checkout Repository
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
with:
persist-credentials: false
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- name: Set up Node.js
uses: actions/setup-node@48b55a011bda9f5d6aeb4c2d9c7362e8dae4041e # v6.4.0
uses: actions/setup-node@48b55a011bda9f5d6aeb4c2d9c7362e8dae4041e # v6
with:
node-version: "20"
node-version: '20'
- name: Install Dependencies
run: npm install -g @action-validator/core @action-validator/cli --save-dev
- name: Run Script
run: bash .github/workflows/github-action-validator.sh
- name: Check for security issues on GHA workflows
uses: zizmorcore/zizmor-action@5f14fd08f7cf1cb1609c1e344975f152c7ee938d # v0.5.6

View File

@@ -15,8 +15,9 @@ jobs:
pull-requests: write
issues: write
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false

View File

@@ -2,11 +2,6 @@ name: "Pull Request Labeler"
on:
- pull_request_target
# cancel previous workflow jobs for PRs
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.run_id }}
cancel-in-progress: true
jobs:
labeler:
permissions:
@@ -14,7 +9,7 @@ jobs:
pull-requests: write
runs-on: ubuntu-24.04
steps:
- uses: actions/labeler@f27b608878404679385c85cfa523b85ccb86e213 # v6.1.0
- uses: actions/labeler@v6
with:
sync-labels: true

View File

@@ -11,29 +11,27 @@ jobs:
contents: write
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
with:
persist-credentials: false
submodules: recursive
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
submodules: recursive
- name: Check for latest tag
id: latest-tag
env:
RELEASE_TAG_NAME: ${{ github.event.release.tag_name }}
run: |
source ./scripts/tag_latest_release.sh "$RELEASE_TAG_NAME" --dry-run
- name: Check for latest tag
id: latest-tag
run: |
source ./scripts/tag_latest_release.sh $(echo ${{ github.event.release.tag_name }}) --dry-run
- name: Configure Git
run: |
git config user.name "$GITHUB_ACTOR"
git config user.email "$GITHUB_ACTOR@users.noreply.github.com"
- name: Configure Git
run: |
git config user.name "$GITHUB_ACTOR"
git config user.email "$GITHUB_ACTOR@users.noreply.github.com"
- name: Run latest-tag
uses: ./.github/actions/latest-tag
if: steps.latest-tag.outputs.SKIP_TAG != 'true'
with:
description: Superset latest release
tag-name: latest
env:
GITHUB_TOKEN: ${{ github.token }}
- name: Run latest-tag
uses: ./.github/actions/latest-tag
if: steps.latest-tag.outputs.SKIP_TAG != 'true'
with:
description: Superset latest release
tag-name: latest
env:
GITHUB_TOKEN: ${{ github.token }}

View File

@@ -18,14 +18,14 @@ jobs:
runs-on: ubuntu-24.04
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
submodules: recursive
- name: Setup Java
uses: actions/setup-java@be666c2fcd27ec809703dec50e508c2fdc7f6654 # v5
with:
distribution: "temurin"
java-version: "11"
distribution: 'temurin'
java-version: '11'
- name: Run license check
run: ./scripts/check_license.sh

View File

@@ -7,13 +7,10 @@ on:
permissions:
pull-requests: read
# Let each label event run to completion. Cancelling in-progress runs leaves
# CANCELLED entries in the PR's check-suite rollup, which poisons GitHub's
# `status:success` search filter even though all real CI passed. The job is
# a tiny no-op github-script call, so the wasted compute is negligible.
# cancel previous workflow jobs for PRs
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.run_id }}
cancel-in-progress: false
cancel-in-progress: true
jobs:
check-hold-label:

View File

@@ -8,11 +8,6 @@ on:
# Possible values: https://help.github.com/en/actions/reference/events-that-trigger-workflows#pull-request-event-pull_request
types: [opened, edited, reopened, synchronize]
# cancel previous workflow jobs for PRs
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.run_id }}
cancel-in-progress: true
jobs:
lint-check:
runs-on: ubuntu-24.04
@@ -21,7 +16,7 @@ jobs:
pull-requests: write
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
submodules: recursive
@@ -31,5 +26,6 @@ jobs:
on-failed-regex-fail-action: true
on-failed-regex-request-changes: false
on-failed-regex-create-review: false
on-failed-regex-comment: "Please format your PR title to match: `%regex%`!"
on-failed-regex-comment:
"Please format your PR title to match: `%regex%`!"
repo-token: "${{ github.token }}"

View File

@@ -19,16 +19,12 @@ concurrency:
jobs:
pre-commit:
runs-on: ubuntu-24.04
timeout-minutes: 20
strategy:
matrix:
# Run the full version spread on push (master/release) and nightly,
# but only the current version on PRs — lint/format/type results
# rarely differ across patch versions, so 3x per PR is wasteful.
python-version: ${{ github.event_name == 'pull_request' && fromJSON('["current"]') || fromJSON('["current", "previous", "next"]') }}
python-version: ["current", "previous", "next"]
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
submodules: recursive
@@ -48,9 +44,7 @@ jobs:
- name: Setup Node.js
uses: actions/setup-node@48b55a011bda9f5d6aeb4c2d9c7362e8dae4041e # v6
with:
node-version-file: "superset-frontend/.nvmrc"
cache: "npm"
cache-dependency-path: "superset-frontend/package-lock.json"
node-version: '20'
- name: Install Frontend Dependencies
run: |
@@ -74,15 +68,13 @@ jobs:
id: changed_files
uses: ./.github/actions/file-changes-action
with:
output: " "
output: ' '
- name: pre-commit
env:
CHANGED_FILES: ${{ steps.changed_files.outputs.files }}
run: |
set +e # Don't exit immediately on failure
export SKIP=type-checking-frontend
pre-commit run --files $CHANGED_FILES
pre-commit run --files ${{ steps.changed_files.outputs.files }}
PRE_COMMIT_EXIT_CODE=$?
git diff --quiet --exit-code
GIT_DIFF_EXIT_CODE=$?

View File

@@ -6,9 +6,6 @@ on:
- "master"
- "[0-9].[0-9]*"
permissions:
contents: read
jobs:
config:
runs-on: ubuntu-24.04
@@ -30,12 +27,9 @@ jobs:
if: needs.config.outputs.has-secrets
name: Bump version and publish package(s)
runs-on: ubuntu-24.04
permissions:
contents: write
steps:
- uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
# pulls all commits (needed for lerna / semantic release to correctly version)
fetch-depth: 0
- name: Get tags and filter trigger tags
@@ -52,7 +46,7 @@ jobs:
if: env.HAS_TAGS
uses: actions/setup-node@48b55a011bda9f5d6aeb4c2d9c7362e8dae4041e # v6
with:
node-version-file: "./superset-frontend/.nvmrc"
node-version-file: './superset-frontend/.nvmrc'
- name: Cache npm
if: env.HAS_TAGS

View File

@@ -2,7 +2,6 @@ name: 🎪 Superset Showtime
# Ultra-simple: just sync on any PR state change
on:
# zizmor: ignore[dangerous-triggers] - required to react to PR label changes; this workflow does not check out or execute PR-provided code
pull_request_target:
types: [labeled, unlabeled, synchronize, closed]
@@ -10,11 +9,11 @@ on:
workflow_dispatch:
inputs:
pr_number:
description: "PR number to sync"
description: 'PR number to sync'
required: true
type: number
sha:
description: "Specific SHA to deploy (optional, defaults to latest)"
description: 'Specific SHA to deploy (optional, defaults to latest)'
required: false
type: string
@@ -103,7 +102,7 @@ jobs:
- name: Install Superset Showtime
if: steps.auth.outputs.authorized == 'true'
run: |
echo "::notice::Maintainer $GITHUB_ACTOR triggered deploy for PR ${PULL_REQUEST_NUMBER}"
echo "::notice::Maintainer ${{ github.actor }} triggered deploy for PR ${PULL_REQUEST_NUMBER}"
pip install --upgrade superset-showtime
showtime version
@@ -152,7 +151,7 @@ jobs:
- name: Checkout PR code (only if build needed)
if: steps.auth.outputs.authorized == 'true' && steps.check.outputs.build_needed == 'true'
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
ref: ${{ steps.check.outputs.target_sha }}
persist-credentials: false
@@ -174,11 +173,9 @@ jobs:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
DOCKERHUB_USER: ${{ secrets.DOCKERHUB_USER }}
DOCKERHUB_TOKEN: ${{ secrets.DOCKERHUB_TOKEN }}
CHECK_PR_NUMBER: ${{ steps.check.outputs.pr_number }}
CHECK_TARGET_SHA: ${{ steps.check.outputs.target_sha }}
run: |
PR_NUM="$CHECK_PR_NUMBER"
TARGET_SHA="$CHECK_TARGET_SHA"
PR_NUM="${{ steps.check.outputs.pr_number }}"
TARGET_SHA="${{ steps.check.outputs.target_sha }}"
if [[ -n "$TARGET_SHA" ]]; then
python -m showtime sync $PR_NUM --sha "$TARGET_SHA"
else

View File

@@ -41,7 +41,7 @@ jobs:
- 16379:6379
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
submodules: recursive

View File

@@ -2,7 +2,6 @@ name: Docs Deployment
on:
# Deploy after integration tests complete on master
# zizmor: ignore[dangerous-triggers] - runs in base-branch context after a trusted upstream workflow; scoped to master
workflow_run:
workflows: ["Python-Integration"]
types: [completed]
@@ -28,9 +27,6 @@ concurrency:
group: docs-deploy-asf-site
cancel-in-progress: true
permissions:
contents: read
jobs:
config:
runs-on: ubuntu-24.04
@@ -49,18 +45,12 @@ jobs:
SUPERSET_SITE_BUILD: ${{ (secrets.SUPERSET_SITE_BUILD != '' && secrets.SUPERSET_SITE_BUILD != '') || '' }}
build-deploy:
needs: config
# For workflow_run triggers, only deploy when the triggering run originated
# from this repository (not a fork), ensuring the checked-out code and any
# local actions executed with deploy credentials are trusted.
if: >-
needs.config.outputs.has-secrets &&
(github.event_name != 'workflow_run' ||
github.event.workflow_run.head_repository.full_name == github.repository)
if: needs.config.outputs.has-secrets
name: Build & Deploy
runs-on: ubuntu-24.04
steps:
- name: "Checkout ${{ github.event.workflow_run.head_sha || github.sha }}"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
ref: ${{ github.event.workflow_run.head_sha || github.sha }}
persist-credentials: false
@@ -68,13 +58,13 @@ jobs:
- name: Set up Node.js
uses: actions/setup-node@48b55a011bda9f5d6aeb4c2d9c7362e8dae4041e # v6
with:
node-version-file: "./docs/.nvmrc"
node-version-file: './docs/.nvmrc'
- name: Setup Python
uses: ./.github/actions/setup-backend/
- uses: actions/setup-java@be666c2fcd27ec809703dec50e508c2fdc7f6654 # v5
with:
distribution: "zulu"
java-version: "21"
distribution: 'zulu'
java-version: '21'
- name: Install Graphviz
run: sudo apt-get install -y graphviz
- name: Compute Entity Relationship diagram (ERD)

View File

@@ -7,7 +7,6 @@ on:
- "superset/db_engine_specs/**"
- ".github/workflows/superset-docs-verify.yml"
types: [synchronize, opened, reopened, ready_for_review]
# zizmor: ignore[dangerous-triggers] - runs in base-branch context and only consumes artifacts from the trusted upstream workflow
workflow_run:
workflows: ["Python-Integration"]
types: [completed]
@@ -17,9 +16,6 @@ concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.event.workflow_run.head_sha || github.run_id }}
cancel-in-progress: true
permissions:
contents: read
jobs:
linkinator:
# See docs here: https://github.com/marketplace/actions/linkinator
@@ -28,12 +24,10 @@ jobs:
name: Link Checking
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
with:
persist-credentials: false
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
# Do not bump this linkinator-action version without opening
# an ASF Infra ticket to allow the new version first!
- uses: JustinBeckwith/linkinator-action@af984b9f30f63e796ae2ea5be5e07cb587f1bbd9 # v2.3
- uses: JustinBeckwith/linkinator-action@af984b9f30f63e796ae2ea5be5e07cb587f1bbd9 # v2.3
continue-on-error: true # This will make the job advisory (non-blocking, no red X)
with:
paths: "**/*.md, **/*.mdx"
@@ -73,14 +67,14 @@ jobs:
working-directory: docs
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
submodules: recursive
- name: Set up Node.js
uses: actions/setup-node@48b55a011bda9f5d6aeb4c2d9c7362e8dae4041e # v6
with:
node-version-file: "./docs/.nvmrc"
node-version-file: './docs/.nvmrc'
- name: yarn install
run: |
yarn install --check-cache
@@ -103,8 +97,7 @@ jobs:
# Only runs if integration tests succeeded
if: >
github.event_name == 'workflow_run' &&
github.event.workflow_run.conclusion == 'success' &&
github.event.workflow_run.head_repository.full_name == github.repository
github.event.workflow_run.conclusion == 'success'
name: Build (after integration tests)
runs-on: ubuntu-24.04
defaults:
@@ -112,7 +105,7 @@ jobs:
working-directory: docs
steps:
- name: "Checkout PR head: ${{ github.event.workflow_run.head_sha }}"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
ref: ${{ github.event.workflow_run.head_sha }}
persist-credentials: false
@@ -120,7 +113,7 @@ jobs:
- name: Set up Node.js
uses: actions/setup-node@48b55a011bda9f5d6aeb4c2d9c7362e8dae4041e # v6
with:
node-version-file: "./docs/.nvmrc"
node-version-file: './docs/.nvmrc'
- name: yarn install
run: |
yarn install --check-cache
@@ -131,7 +124,7 @@ jobs:
run_id: ${{ github.event.workflow_run.id }}
name: database-diagnostics
path: docs/src/data/
if_no_artifact_found: "warning"
if_no_artifact_found: 'warning'
- name: Use fresh diagnostics
run: |
if [ -f "src/data/databases-diagnostics.json" ]; then

View File

@@ -10,49 +10,26 @@ on:
workflow_dispatch:
inputs:
use_dashboard:
description: "Use Cypress Dashboard (true/false) [paid service - trigger manually when needed]. You MUST provide a branch and/or PR number below for this to work."
description: 'Use Cypress Dashboard (true/false) [paid service - trigger manually when needed]. You MUST provide a branch and/or PR number below for this to work.'
required: false
default: "false"
default: 'false'
ref:
description: "The branch or tag to checkout"
description: 'The branch or tag to checkout'
required: false
default: ""
default: ''
pr_id:
description: "The pull request ID to checkout"
description: 'The pull request ID to checkout'
required: false
default: ""
default: ''
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.run_id }}
cancel-in-progress: true
jobs:
changes:
runs-on: ubuntu-24.04
timeout-minutes: 10
permissions:
contents: read
pull-requests: read
outputs:
python: ${{ steps.check.outputs.python }}
frontend: ${{ steps.check.outputs.frontend }}
steps:
- name: Checkout
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
with:
persist-credentials: false
- name: Check for file changes
id: check
uses: ./.github/actions/change-detector/
with:
token: ${{ secrets.GITHUB_TOKEN }}
cypress-matrix:
needs: changes
if: needs.changes.outputs.python == 'true' || needs.changes.outputs.frontend == 'true'
# Somehow one test flakes on 24.04 for unknown reasons, this is the only GHA left on 22.04
runs-on: ubuntu-22.04
timeout-minutes: 30
permissions:
contents: read
pull-requests: read
@@ -63,14 +40,9 @@ jobs:
# https://github.com/cypress-io/github-action/issues/48
fail-fast: false
matrix:
parallel_id: [0, 1]
parallel_id: [0, 1, 2, 3, 4, 5]
browser: ["chrome"]
app_root: ${{ github.event_name == 'push' && fromJSON('["", "/app/prefix"]') || fromJSON('[""]') }}
# The /app/prefix variant (push events only) is smoke-tested on a single
# shard rather than the full matrix, so exclude it from the other shards.
exclude:
- parallel_id: 1
app_root: "/app/prefix"
app_root: ["", "/app/prefix"]
env:
SUPERSET_ENV: development
SUPERSET_CONFIG: tests.integration_tests.superset_test_config
@@ -97,60 +69,71 @@ jobs:
# Conditional checkout based on context
- name: Checkout for push or pull_request event
if: github.event_name == 'push' || github.event_name == 'pull_request'
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
submodules: recursive
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
- name: Checkout using ref (workflow_dispatch)
if: github.event_name == 'workflow_dispatch' && github.event.inputs.ref != ''
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
ref: ${{ github.event.inputs.ref }}
submodules: recursive
- name: Checkout using PR ID (workflow_dispatch)
if: github.event_name == 'workflow_dispatch' && github.event.inputs.pr_id != ''
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
ref: refs/pull/${{ github.event.inputs.pr_id }}/merge
submodules: recursive
# -------------------------------------------------------
- name: Check for file changes
id: check
uses: ./.github/actions/change-detector/
with:
token: ${{ secrets.GITHUB_TOKEN }}
- name: Setup Python
uses: ./.github/actions/setup-backend/
if: steps.check.outputs.python || steps.check.outputs.frontend
- name: Setup postgres
if: steps.check.outputs.python || steps.check.outputs.frontend
uses: ./.github/actions/cached-dependencies
with:
run: setup-postgres
- name: Import test data
if: steps.check.outputs.python || steps.check.outputs.frontend
uses: ./.github/actions/cached-dependencies
with:
run: testdata
- name: Setup Node.js
if: steps.check.outputs.python || steps.check.outputs.frontend
uses: actions/setup-node@48b55a011bda9f5d6aeb4c2d9c7362e8dae4041e # v6
with:
node-version-file: "./superset-frontend/.nvmrc"
cache: "npm"
cache-dependency-path: "superset-frontend/package-lock.json"
node-version-file: './superset-frontend/.nvmrc'
- name: Install npm dependencies
if: steps.check.outputs.python || steps.check.outputs.frontend
uses: ./.github/actions/cached-dependencies
with:
run: npm-install
- name: Build javascript packages
if: steps.check.outputs.python || steps.check.outputs.frontend
uses: ./.github/actions/cached-dependencies
with:
run: build-instrumented-assets
- name: Install cypress
if: steps.check.outputs.python || steps.check.outputs.frontend
uses: ./.github/actions/cached-dependencies
with:
run: cypress-install
- name: Run Cypress
if: steps.check.outputs.python || steps.check.outputs.frontend
uses: ./.github/actions/cached-dependencies
env:
CYPRESS_BROWSER: ${{ matrix.browser }}
PARALLEL_ID: ${{ matrix.parallel_id }}
PARALLELISM: 2
PARALLELISM: 6
CYPRESS_RECORD_KEY: ${{ secrets.CYPRESS_RECORD_KEY }}
NODE_OPTIONS: "--max-old-space-size=4096"
with:
@@ -158,9 +141,8 @@ jobs:
- name: Set safe app root
if: failure()
id: set-safe-app-root
env:
APP_ROOT: ${{ matrix.app_root }}
run: |
APP_ROOT="${{ matrix.app_root }}"
SAFE_APP_ROOT=${APP_ROOT//\//_}
echo "safe_app_root=$SAFE_APP_ROOT" >> $GITHUB_OUTPUT
- name: Upload Artifacts
@@ -171,10 +153,7 @@ jobs:
name: cypress-artifact-${{ github.run_id }}-${{ github.job }}-${{ matrix.browser }}-${{ matrix.parallel_id }}--${{ steps.set-safe-app-root.outputs.safe_app_root }}
playwright-tests:
needs: changes
if: needs.changes.outputs.python == 'true' || needs.changes.outputs.frontend == 'true'
runs-on: ubuntu-22.04
timeout-minutes: 30
permissions:
contents: read
pull-requests: read
@@ -182,7 +161,7 @@ jobs:
fail-fast: false
matrix:
browser: ["chromium"]
app_root: ${{ github.event_name == 'push' && fromJSON('["", "/app/prefix"]') || fromJSON('[""]') }}
app_root: ["", "/app/prefix"]
env:
SUPERSET_ENV: development
SUPERSET_CONFIG: tests.integration_tests.superset_test_config
@@ -207,59 +186,66 @@ jobs:
# Conditional checkout based on context (same as Cypress workflow)
- name: Checkout for push or pull_request event
if: github.event_name == 'push' || github.event_name == 'pull_request'
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
submodules: recursive
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
- name: Checkout using ref (workflow_dispatch)
if: github.event_name == 'workflow_dispatch' && github.event.inputs.ref != ''
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
ref: ${{ github.event.inputs.ref }}
submodules: recursive
- name: Checkout using PR ID (workflow_dispatch)
if: github.event_name == 'workflow_dispatch' && github.event.inputs.pr_id != ''
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
ref: refs/pull/${{ github.event.inputs.pr_id }}/merge
submodules: recursive
# -------------------------------------------------------
- name: Check for file changes
id: check
uses: ./.github/actions/change-detector/
with:
token: ${{ secrets.GITHUB_TOKEN }}
- name: Setup Python
uses: ./.github/actions/setup-backend/
if: steps.check.outputs.python || steps.check.outputs.frontend
- name: Setup postgres
if: steps.check.outputs.python || steps.check.outputs.frontend
uses: ./.github/actions/cached-dependencies
with:
run: setup-postgres
- name: Import test data
if: steps.check.outputs.python || steps.check.outputs.frontend
uses: ./.github/actions/cached-dependencies
with:
run: playwright_testdata
- name: Setup Node.js
if: steps.check.outputs.python || steps.check.outputs.frontend
uses: actions/setup-node@48b55a011bda9f5d6aeb4c2d9c7362e8dae4041e # v6
with:
node-version-file: "./superset-frontend/.nvmrc"
cache: "npm"
cache-dependency-path: "superset-frontend/package-lock.json"
node-version-file: './superset-frontend/.nvmrc'
- name: Install npm dependencies
if: steps.check.outputs.python || steps.check.outputs.frontend
uses: ./.github/actions/cached-dependencies
with:
run: npm-install
- name: Build javascript packages
if: steps.check.outputs.python || steps.check.outputs.frontend
uses: ./.github/actions/cached-dependencies
with:
run: build-instrumented-assets
- name: Build embedded SDK
uses: ./.github/actions/cached-dependencies
with:
run: build-embedded-sdk
- name: Install Playwright
if: steps.check.outputs.python || steps.check.outputs.frontend
uses: ./.github/actions/cached-dependencies
with:
run: playwright-install
- name: Run Playwright (Required Tests)
if: steps.check.outputs.python || steps.check.outputs.frontend
uses: ./.github/actions/cached-dependencies
env:
NODE_OPTIONS: "--max-old-space-size=4096"
@@ -268,9 +254,8 @@ jobs:
- name: Set safe app root
if: failure()
id: set-safe-app-root
env:
APP_ROOT: ${{ matrix.app_root }}
run: |
APP_ROOT="${{ matrix.app_root }}"
SAFE_APP_ROOT=${APP_ROOT//\//_}
echo "safe_app_root=$SAFE_APP_ROOT" >> $GITHUB_OUTPUT
- name: Upload Playwright Artifacts
@@ -281,63 +266,3 @@ jobs:
${{ github.workspace }}/superset-frontend/playwright-results/
${{ github.workspace }}/superset-frontend/test-results/
name: playwright-artifact-${{ github.run_id }}-${{ github.job }}-${{ matrix.browser }}--${{ steps.set-safe-app-root.outputs.safe_app_root }}
# Stable required-status-check anchors. cypress-matrix and playwright-tests
# are matrix jobs gated on change detection (python || frontend). On a PR
# that touches neither — e.g. a docs-only PR — they are skipped at the job
# level, which happens before matrix expansion, so the per-combination
# contexts (`cypress-matrix (0, chrome)`, `playwright-tests (chromium)`) are
# never produced and branch protection waits on them forever. These
# always-running jobs report a single stable context that passes when the
# underlying matrix job succeeded or was skipped, and fails only on a real
# failure. Require these in .asf.yaml instead of the matrix-expanded names.
#
# A matrix job reads as "skipped" in two distinct cases, and only the first
# is a legitimate pass: (a) change detection succeeded and gated the job off
# (docs-only PR); (b) the `changes` job itself failed or was cancelled, in
# which case GHA skips its dependents too. Accepting (b) would let a broken
# change-detector report a false green, so each anchor first requires
# `changes` to have succeeded before honouring a skip.
cypress-matrix-required:
needs: [changes, cypress-matrix]
if: always()
runs-on: ubuntu-24.04
timeout-minutes: 5
permissions: {}
steps:
- name: Check cypress-matrix result
env:
CHANGES: ${{ needs.changes.result }}
RESULT: ${{ needs.cypress-matrix.result }}
run: |
if [ "$CHANGES" != "success" ]; then
echo "change detection did not succeed (result: $CHANGES); refusing to pass on a skipped matrix"
exit 1
fi
if [ "$RESULT" != "success" ] && [ "$RESULT" != "skipped" ]; then
echo "cypress-matrix did not pass (result: $RESULT)"
exit 1
fi
echo "cypress-matrix result: $RESULT (changes: $CHANGES)"
playwright-tests-required:
needs: [changes, playwright-tests]
if: always()
runs-on: ubuntu-24.04
timeout-minutes: 5
permissions: {}
steps:
- name: Check playwright-tests result
env:
CHANGES: ${{ needs.changes.result }}
RESULT: ${{ needs.playwright-tests.result }}
run: |
if [ "$CHANGES" != "success" ]; then
echo "change detection did not succeed (result: $CHANGES); refusing to pass on a skipped matrix"
exit 1
fi
if [ "$RESULT" != "success" ] && [ "$RESULT" != "skipped" ]; then
echo "playwright-tests did not pass (result: $RESULT)"
exit 1
fi
echo "playwright-tests result: $RESULT (changes: $CHANGES)"

View File

@@ -20,18 +20,15 @@ concurrency:
jobs:
test-superset-extensions-cli-package:
runs-on: ubuntu-24.04
timeout-minutes: 30
strategy:
matrix:
# Full version spread on push (master/release) + nightly; current only
# on PRs to cut runner cost (cross-version breaks are caught at merge).
python-version: ${{ github.event_name == 'pull_request' && fromJSON('["current"]') || fromJSON('["previous", "current", "next"]') }}
python-version: ["previous", "current", "next"]
defaults:
run:
working-directory: superset-extensions-cli
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
submodules: recursive
@@ -56,7 +53,7 @@ jobs:
- name: Upload coverage reports to Codecov
if: steps.check.outputs.superset-extensions-cli
uses: codecov/codecov-action@fb8b3582c8e4def4969c97caa2f19720cb33a72f # v7.0.0
uses: codecov/codecov-action@57e3a136b779b570ffcdbf80b3bdc90e7fab3de2 # v5
with:
file: ./coverage.xml
flags: superset-extensions-cli

View File

@@ -16,18 +16,14 @@ concurrency:
env:
TAG: apache/superset:GHA-${{ github.run_id }}
permissions:
contents: read
jobs:
frontend-build:
runs-on: ubuntu-24.04
timeout-minutes: 30
outputs:
should-run: ${{ steps.check.outputs.frontend }}
steps:
- name: Checkout Code
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
fetch-depth: 0
@@ -75,7 +71,6 @@ jobs:
shard: [1, 2, 3, 4, 5, 6, 7, 8]
fail-fast: false
runs-on: ubuntu-24.04
timeout-minutes: 20
steps:
- name: Download Docker Image Artifact
uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8
@@ -105,12 +100,11 @@ jobs:
needs: [sharded-jest-tests]
if: needs.frontend-build.outputs.should-run == 'true'
runs-on: ubuntu-24.04
timeout-minutes: 15
permissions:
id-token: write
steps:
- name: Checkout Code
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
fetch-depth: 0
@@ -134,7 +128,7 @@ jobs:
run: npx nyc merge coverage/ merged-output/coverage-summary.json
- name: Upload Code Coverage
uses: codecov/codecov-action@fb8b3582c8e4def4969c97caa2f19720cb33a72f # v7.0.0
uses: codecov/codecov-action@57e3a136b779b570ffcdbf80b3bdc90e7fab3de2 # v5
with:
flags: javascript
use_oidc: true
@@ -147,7 +141,6 @@ jobs:
needs: frontend-build
if: needs.frontend-build.outputs.should-run == 'true'
runs-on: ubuntu-24.04
timeout-minutes: 20
steps:
- name: Download Docker Image Artifact
uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8
@@ -172,7 +165,6 @@ jobs:
needs: frontend-build
if: needs.frontend-build.outputs.should-run == 'true'
runs-on: ubuntu-24.04
timeout-minutes: 20
steps:
- name: Download Docker Image Artifact
uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8
@@ -192,7 +184,6 @@ jobs:
needs: frontend-build
if: needs.frontend-build.outputs.should-run == 'true'
runs-on: ubuntu-24.04
timeout-minutes: 25
steps:
- name: Download Docker Image Artifact
uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8

View File

@@ -19,7 +19,7 @@ jobs:
runs-on: ubuntu-24.04
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
submodules: recursive
@@ -33,7 +33,7 @@ jobs:
- name: Setup Python
uses: ./.github/actions/setup-backend/
with:
install-superset: "false"
install-superset: 'false'
- name: Set up chart-testing
uses: ./.github/actions/chart-testing-action

View File

@@ -29,7 +29,7 @@ jobs:
steps:
- name: Checkout code
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
ref: ${{ inputs.ref || github.ref_name }}
persist-credentials: true
@@ -62,8 +62,6 @@ jobs:
run: echo "branch_name=helm-publish-${GITHUB_SHA:0:7}" >> $GITHUB_ENV
- name: Force recreate branch from gh-pages
env:
BRANCH_NAME: ${{ env.branch_name }}
run: |
# Ensure a clean working directory
git reset --hard
@@ -75,13 +73,13 @@ jobs:
git fetch origin gh-pages
# Check out and reset the target branch based on gh-pages
git checkout -B "$BRANCH_NAME" origin/gh-pages
git checkout -B ${{ env.branch_name }} origin/gh-pages
# Remove submodules from the branch
git submodule deinit -f --all
# Force push to the remote branch
git push origin "$BRANCH_NAME" --force
git push origin ${{ env.branch_name }} --force
# Return to the original branch
git checkout local_gha_temp
@@ -106,7 +104,7 @@ jobs:
uses: actions/github-script@3a2844b7e9c422d3c10d287c895573f7108da1b3 # v9.0.0
with:
script: |
const branchName = process.env.BRANCH_NAME;
const branchName = '${{ env.branch_name }}';
const [owner, repo] = process.env.GITHUB_REPOSITORY.split('/');
if (!branchName) {

View File

@@ -10,46 +10,23 @@ on:
workflow_dispatch:
inputs:
ref:
description: "The branch or tag to checkout"
description: 'The branch or tag to checkout'
required: false
default: ""
default: ''
pr_id:
description: "The pull request ID to checkout"
description: 'The pull request ID to checkout'
required: false
default: ""
default: ''
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.run_id }}
cancel-in-progress: true
jobs:
changes:
runs-on: ubuntu-24.04
timeout-minutes: 10
permissions:
contents: read
pull-requests: read
outputs:
python: ${{ steps.check.outputs.python }}
frontend: ${{ steps.check.outputs.frontend }}
steps:
- name: Checkout
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
with:
persist-credentials: false
- name: Check for file changes
id: check
uses: ./.github/actions/change-detector/
with:
token: ${{ secrets.GITHUB_TOKEN }}
# NOTE: Required Playwright tests are in superset-e2e.yml (E2E / playwright-tests)
# This workflow contains only experimental tests that run in shadow mode
playwright-tests-experimental:
needs: changes
if: needs.changes.outputs.python == 'true' || needs.changes.outputs.frontend == 'true'
runs-on: ubuntu-22.04
timeout-minutes: 30
continue-on-error: true
permissions:
contents: read
@@ -83,78 +60,71 @@ jobs:
# Conditional checkout based on context (same as Cypress workflow)
- name: Checkout for push or pull_request event
if: github.event_name == 'push' || github.event_name == 'pull_request'
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
submodules: recursive
ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
- name: Checkout using ref (workflow_dispatch)
if: github.event_name == 'workflow_dispatch' && github.event.inputs.ref != ''
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
ref: ${{ github.event.inputs.ref }}
submodules: recursive
- name: Checkout using PR ID (workflow_dispatch)
if: github.event_name == 'workflow_dispatch' && github.event.inputs.pr_id != ''
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
ref: refs/pull/${{ github.event.inputs.pr_id }}/merge
submodules: recursive
# -------------------------------------------------------
- name: Check for file changes
id: check
uses: ./.github/actions/change-detector/
with:
token: ${{ secrets.GITHUB_TOKEN }}
- name: Setup Python
uses: ./.github/actions/setup-backend/
if: steps.check.outputs.python || steps.check.outputs.frontend
- name: Setup postgres
if: steps.check.outputs.python || steps.check.outputs.frontend
uses: ./.github/actions/cached-dependencies
with:
run: setup-postgres
- name: Import test data
if: steps.check.outputs.python || steps.check.outputs.frontend
uses: ./.github/actions/cached-dependencies
with:
run: playwright_testdata
- name: Setup Node.js
if: steps.check.outputs.python || steps.check.outputs.frontend
uses: actions/setup-node@48b55a011bda9f5d6aeb4c2d9c7362e8dae4041e # v6
with:
node-version-file: "./superset-frontend/.nvmrc"
cache: "npm"
cache-dependency-path: "superset-frontend/package-lock.json"
node-version-file: './superset-frontend/.nvmrc'
- name: Install npm dependencies
if: steps.check.outputs.python || steps.check.outputs.frontend
uses: ./.github/actions/cached-dependencies
with:
run: npm-install
- name: Build javascript packages
if: steps.check.outputs.python || steps.check.outputs.frontend
uses: ./.github/actions/cached-dependencies
with:
run: build-instrumented-assets
- name: Build embedded SDK
uses: ./.github/actions/cached-dependencies
with:
run: build-embedded-sdk
- name: Install Playwright
if: steps.check.outputs.python || steps.check.outputs.frontend
uses: ./.github/actions/cached-dependencies
with:
run: playwright-install
- name: Run Playwright (Experimental Tests)
if: steps.check.outputs.python || steps.check.outputs.frontend
uses: ./.github/actions/cached-dependencies
env:
NODE_OPTIONS: "--max-old-space-size=4096"
with:
run: playwright-run "${{ matrix.app_root }}" experimental/
- name: Run Playwright (Embedded Tests)
uses: ./.github/actions/cached-dependencies
env:
NODE_OPTIONS: "--max-old-space-size=4096"
# Scope embedded-only env vars to this step. Setting them at the job
# level enabled the EMBEDDED_SUPERSET feature flag inside Flask for
# the preceding "Required Tests" and "Experimental Tests" steps too,
# which loads extra handlers and destabilizes the werkzeug dev
# server under the 2-worker Playwright load. Required Tests should
# match master's Flask configuration.
SUPERSET_FEATURE_EMBEDDED_SUPERSET: "true"
INCLUDE_EMBEDDED: "true"
with:
run: playwright-run "${{ matrix.app_root }}" embedded
- name: Set safe app root
if: failure()
id: set-safe-app-root

View File

@@ -14,30 +14,8 @@ concurrency:
cancel-in-progress: true
jobs:
changes:
runs-on: ubuntu-24.04
timeout-minutes: 10
permissions:
contents: read
pull-requests: read
outputs:
python: ${{ steps.check.outputs.python }}
steps:
- name: Checkout
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
with:
persist-credentials: false
- name: Check for file changes
id: check
uses: ./.github/actions/change-detector/
with:
token: ${{ secrets.GITHUB_TOKEN }}
test-mysql:
needs: changes
if: needs.changes.outputs.python == 'true'
runs-on: ubuntu-24.04
timeout-minutes: 45
permissions:
id-token: write
env:
@@ -49,8 +27,6 @@ jobs:
services:
mysql:
image: mysql:8.0
# Authenticated pulls use our higher Docker Hub rate limit. Empty on
# fork PRs (secrets unavailable) -> runner falls back to anonymous.
env:
MYSQL_ROOT_PASSWORD: root
ports:
@@ -67,31 +43,41 @@ jobs:
- 16379:6379
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
submodules: recursive
- name: Check for file changes
id: check
uses: ./.github/actions/change-detector/
with:
token: ${{ secrets.GITHUB_TOKEN }}
- name: Setup Python
uses: ./.github/actions/setup-backend/
if: steps.check.outputs.python
- name: Setup MySQL
if: steps.check.outputs.python
uses: ./.github/actions/cached-dependencies
with:
run: setup-mysql
- name: Start Celery worker
if: steps.check.outputs.python
uses: ./.github/actions/cached-dependencies
with:
run: celery-worker
- name: Python integration tests (MySQL)
if: steps.check.outputs.python
run: |
./scripts/python_tests.sh
- name: Upload code coverage
uses: codecov/codecov-action@fb8b3582c8e4def4969c97caa2f19720cb33a72f # v7.0.0
uses: codecov/codecov-action@57e3a136b779b570ffcdbf80b3bdc90e7fab3de2 # v5
with:
flags: python,mysql
verbose: true
use_oidc: true
slug: apache/superset
- name: Generate database diagnostics for docs
if: steps.check.outputs.python
env:
SUPERSET_CONFIG: tests.integration_tests.superset_test_config
SUPERSET__SQLALCHEMY_DATABASE_URI: |
@@ -114,23 +100,19 @@ jobs:
print(f'Generated diagnostics for {len(docs)} databases')
"
- name: Upload database diagnostics artifact
if: steps.check.outputs.python
uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7
with:
name: database-diagnostics
path: databases-diagnostics.json
retention-days: 7
test-postgres:
needs: changes
if: needs.changes.outputs.python == 'true'
runs-on: ubuntu-24.04
timeout-minutes: 45
permissions:
id-token: write
strategy:
matrix:
# Full version spread on push (master/release) + nightly; current only
# on PRs to cut runner cost (cross-version breaks are caught at merge).
python-version: ${{ github.event_name == 'pull_request' && fromJSON('["current"]') || fromJSON('["current", "previous", "next"]') }}
python-version: ["current", "previous", "next"]
env:
PYTHONPATH: ${{ github.workspace }}
SUPERSET_CONFIG: tests.integration_tests.superset_test_config
@@ -152,28 +134,37 @@ jobs:
- 16379:6379
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
submodules: recursive
- name: Check for file changes
id: check
uses: ./.github/actions/change-detector/
with:
token: ${{ secrets.GITHUB_TOKEN }}
- name: Setup Python
uses: ./.github/actions/setup-backend/
if: steps.check.outputs.python
with:
python-version: ${{ matrix.python-version }}
- name: Setup Postgres
if: steps.check.outputs.python
uses: ./.github/actions/cached-dependencies
with:
run: |
setup-postgres
- name: Start Celery worker
if: steps.check.outputs.python
uses: ./.github/actions/cached-dependencies
with:
run: celery-worker
- name: Python integration tests (PostgreSQL)
if: steps.check.outputs.python
run: |
./scripts/python_tests.sh
- name: Upload code coverage
uses: codecov/codecov-action@fb8b3582c8e4def4969c97caa2f19720cb33a72f # v7.0.0
uses: codecov/codecov-action@57e3a136b779b570ffcdbf80b3bdc90e7fab3de2 # v5
with:
flags: python,postgres
verbose: true
@@ -181,10 +172,7 @@ jobs:
slug: apache/superset
test-sqlite:
needs: changes
if: needs.changes.outputs.python == 'true'
runs-on: ubuntu-24.04
timeout-minutes: 45
permissions:
id-token: write
env:
@@ -202,51 +190,38 @@ jobs:
- 16379:6379
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
submodules: recursive
- name: Check for file changes
id: check
uses: ./.github/actions/change-detector/
with:
token: ${{ secrets.GITHUB_TOKEN }}
- name: Setup Python
uses: ./.github/actions/setup-backend/
if: steps.check.outputs.python
- name: Install dependencies
if: steps.check.outputs.python
uses: ./.github/actions/cached-dependencies
with:
run: |
# sqlite needs this working directory
mkdir ${{ github.workspace }}/.temp
- name: Start Celery worker
if: steps.check.outputs.python
uses: ./.github/actions/cached-dependencies
with:
run: celery-worker
- name: Python integration tests (SQLite)
if: steps.check.outputs.python
run: |
./scripts/python_tests.sh
- name: Upload code coverage
uses: codecov/codecov-action@fb8b3582c8e4def4969c97caa2f19720cb33a72f # v7.0.0
uses: codecov/codecov-action@57e3a136b779b570ffcdbf80b3bdc90e7fab3de2 # v5
with:
flags: python,sqlite
verbose: true
use_oidc: true
slug: apache/superset
# Stable required-status-check anchor for the matrix-based test-postgres job.
# It is gated on change detection, so on non-Python PRs it is skipped and
# never produces its `test-postgres (current)` context (a job-level skip
# happens before matrix expansion). This always-running job reports a single
# context branch protection can require: it passes when test-postgres
# succeeded or was skipped, and fails only on a real failure.
test-postgres-required:
needs: [changes, test-postgres]
if: always()
runs-on: ubuntu-24.04
timeout-minutes: 5
steps:
- name: Check test-postgres result
env:
RESULT: ${{ needs.test-postgres.result }}
run: |
if [ "$RESULT" != "success" ] && [ "$RESULT" != "skipped" ]; then
echo "test-postgres did not pass (result: $RESULT)"
exit 1
fi
echo "test-postgres result: $RESULT"

View File

@@ -15,30 +15,8 @@ concurrency:
cancel-in-progress: true
jobs:
changes:
runs-on: ubuntu-24.04
timeout-minutes: 10
permissions:
contents: read
pull-requests: read
outputs:
python: ${{ steps.check.outputs.python }}
steps:
- name: Checkout
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
with:
persist-credentials: false
- name: Check for file changes
id: check
uses: ./.github/actions/change-detector/
with:
token: ${{ secrets.GITHUB_TOKEN }}
test-postgres-presto:
needs: changes
if: needs.changes.outputs.python == 'true'
runs-on: ubuntu-24.04
timeout-minutes: 45
permissions:
id-token: write
env:
@@ -72,25 +50,36 @@ jobs:
- 16379:6379
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
submodules: recursive
- name: Check for file changes
id: check
uses: ./.github/actions/change-detector/
with:
token: ${{ secrets.GITHUB_TOKEN }}
- name: Setup Python
uses: ./.github/actions/setup-backend/
if: steps.check.outputs.python == 'true'
- name: Setup Postgres
if: steps.check.outputs.python
uses: ./.github/actions/cached-dependencies
with:
run: setup-postgres
run: |
echo "${{ steps.check.outputs.python }}"
setup-postgres
- name: Start Celery worker
if: steps.check.outputs.python
uses: ./.github/actions/cached-dependencies
with:
run: celery-worker
- name: Python unit tests (PostgreSQL)
if: steps.check.outputs.python
run: |
./scripts/python_tests.sh -m 'chart_data_flow or sql_json_flow'
- name: Upload code coverage
uses: codecov/codecov-action@fb8b3582c8e4def4969c97caa2f19720cb33a72f # v7.0.0
uses: codecov/codecov-action@57e3a136b779b570ffcdbf80b3bdc90e7fab3de2 # v5
with:
flags: python,presto
verbose: true
@@ -98,10 +87,7 @@ jobs:
slug: apache/superset
test-postgres-hive:
needs: changes
if: needs.changes.outputs.python == 'true'
runs-on: ubuntu-24.04
timeout-minutes: 45
permissions:
id-token: write
env:
@@ -127,32 +113,44 @@ jobs:
- 16379:6379
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
submodules: recursive
- name: Check for file changes
id: check
uses: ./.github/actions/change-detector/
with:
token: ${{ secrets.GITHUB_TOKEN }}
- name: Create csv upload directory
if: steps.check.outputs.python
run: sudo mkdir -p /tmp/.superset/uploads
- name: Give write access to the csv upload directory
if: steps.check.outputs.python
run: sudo chown -R $USER:$USER /tmp/.superset
- name: Start hadoop and hive
if: steps.check.outputs.python
run: docker compose -f scripts/databases/hive/docker-compose.yml up -d
- name: Setup Python
uses: ./.github/actions/setup-backend/
if: steps.check.outputs.python
- name: Setup Postgres
if: steps.check.outputs.python
uses: ./.github/actions/cached-dependencies
with:
run: setup-postgres
- name: Start Celery worker
if: steps.check.outputs.python
uses: ./.github/actions/cached-dependencies
with:
run: celery-worker
- name: Python unit tests (PostgreSQL)
if: steps.check.outputs.python
run: |
pip install -e .[hive]
./scripts/python_tests.sh -m 'chart_data_flow or sql_json_flow'
- name: Upload code coverage
uses: codecov/codecov-action@fb8b3582c8e4def4969c97caa2f19720cb33a72f # v7.0.0
uses: codecov/codecov-action@57e3a136b779b570ffcdbf80b3bdc90e7fab3de2 # v5
with:
flags: python,hive
verbose: true

View File

@@ -15,56 +15,40 @@ concurrency:
cancel-in-progress: true
jobs:
changes:
unit-tests:
runs-on: ubuntu-24.04
timeout-minutes: 10
permissions:
contents: read
pull-requests: read
outputs:
python: ${{ steps.check.outputs.python }}
id-token: write
strategy:
matrix:
python-version: ["previous", "current", "next"]
env:
PYTHONPATH: ${{ github.workspace }}
steps:
- name: Checkout
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
submodules: recursive
- name: Check for file changes
id: check
uses: ./.github/actions/change-detector/
with:
token: ${{ secrets.GITHUB_TOKEN }}
unit-tests:
needs: changes
if: needs.changes.outputs.python == 'true'
runs-on: ubuntu-24.04
timeout-minutes: 30
permissions:
id-token: write
strategy:
matrix:
# Full version spread on push (master/release) + nightly; current only
# on PRs to cut runner cost (cross-version breaks are caught at merge).
python-version: ${{ github.event_name == 'pull_request' && fromJSON('["current"]') || fromJSON('["previous", "current", "next"]') }}
env:
PYTHONPATH: ${{ github.workspace }}
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
with:
persist-credentials: false
submodules: recursive
- name: Setup Python
uses: ./.github/actions/setup-backend/
if: steps.check.outputs.python
with:
python-version: ${{ matrix.python-version }}
- name: Python unit tests
if: steps.check.outputs.python
env:
SUPERSET_TESTENV: true
SUPERSET_SECRET_KEY: not-a-secret
run: |
pytest --durations-min=0.5 --cov-report= --cov=superset ./tests/common ./tests/unit_tests --cache-clear --maxfail=50
- name: Python 100% coverage unit tests
if: steps.check.outputs.python
env:
SUPERSET_TESTENV: true
SUPERSET_SECRET_KEY: not-a-secret
@@ -72,31 +56,9 @@ jobs:
pytest --durations-min=0.5 --cov=superset/sql/ ./tests/unit_tests/sql/ --cache-clear --cov-fail-under=100
pytest --durations-min=0.5 --cov=superset/semantic_layers/ ./tests/unit_tests/semantic_layers/ --cache-clear --cov-fail-under=100
- name: Upload code coverage
uses: codecov/codecov-action@fb8b3582c8e4def4969c97caa2f19720cb33a72f # v7.0.0
uses: codecov/codecov-action@57e3a136b779b570ffcdbf80b3bdc90e7fab3de2 # v5
with:
flags: python,unit
verbose: true
use_oidc: true
slug: apache/superset
# Stable required-status-check anchor. `unit-tests` is a matrix job gated on
# change detection, so on non-Python PRs it is skipped and never produces its
# `unit-tests (current)` context (a job-level skip happens before matrix
# expansion). This always-running job reports a single context that branch
# protection can require: it passes when unit-tests succeeded or was skipped,
# and fails only on a real failure.
unit-tests-required:
needs: [changes, unit-tests]
if: always()
runs-on: ubuntu-24.04
timeout-minutes: 5
steps:
- name: Check unit-tests result
env:
RESULT: ${{ needs.unit-tests.result }}
run: |
if [ "$RESULT" != "success" ] && [ "$RESULT" != "skipped" ]; then
echo "unit-tests did not pass (result: $RESULT)"
exit 1
fi
echo "unit-tests result: $RESULT"

View File

@@ -1,88 +0,0 @@
name: Translation Regression Comment
on:
# zizmor: ignore[dangerous-triggers] - runs in base-branch context and only consumes the uploaded artifact; never checks out PR code (see note below)
workflow_run:
workflows: ["Translations"]
types: [completed]
# This workflow posts a PR comment when the Translations workflow detects a
# regression. It uses the workflow_run trigger so that it always runs in the
# base-branch context and can safely be granted write permissions, even for
# PRs from forks.
#
# IMPORTANT: This workflow must NEVER check out code from the PR branch.
# All data comes from the artifact uploaded by the Translations workflow.
permissions:
pull-requests: write
actions: read
jobs:
post-comment:
runs-on: ubuntu-24.04
# Only act when the Translations workflow failed (which means a regression
# was detected — the workflow exits 1 on regression).
if: github.event.workflow_run.conclusion == 'failure'
steps:
- name: Download regression artifact
id: download
continue-on-error: true
uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8
with:
name: translation-regression
run-id: ${{ github.event.workflow_run.id }}
github-token: ${{ secrets.GITHUB_TOKEN }}
path: /tmp/translation-regression
- name: Post or update PR comment
if: steps.download.outcome == 'success'
uses: actions/github-script@3a2844b7e9c422d3c10d287c895573f7108da1b3 # v9.0.0
with:
script: |
const fs = require('fs');
const prNumberFile = '/tmp/translation-regression/pr-number.txt';
const reportFile = '/tmp/translation-regression/regression-report.md';
if (!fs.existsSync(prNumberFile) || !fs.existsSync(reportFile)) {
console.log('Artifact files not found, skipping comment.');
return;
}
const prNumber = parseInt(fs.readFileSync(prNumberFile, 'utf8').trim(), 10);
if (!prNumber) {
console.log('Could not parse PR number, skipping comment.');
return;
}
const report = fs.readFileSync(reportFile, 'utf8');
const marker = '<!-- translation-regression-bot -->';
const body = `${marker}\n${report}`;
const { data: comments } = await github.rest.issues.listComments({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
});
const existing = comments.find(c => c.body && c.body.includes(marker));
if (existing) {
await github.rest.issues.updateComment({
owner: context.repo.owner,
repo: context.repo.repo,
comment_id: existing.id,
body,
});
console.log(`Updated existing comment ${existing.id} on PR #${prNumber}`);
} else {
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
body,
});
console.log(`Created new comment on PR #${prNumber}`);
}

View File

@@ -20,12 +20,9 @@ concurrency:
jobs:
frontend-check-translations:
runs-on: ubuntu-24.04
permissions:
contents: read
pull-requests: read
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
submodules: recursive
@@ -40,9 +37,7 @@ jobs:
if: steps.check.outputs.frontend
uses: actions/setup-node@48b55a011bda9f5d6aeb4c2d9c7362e8dae4041e # v6
with:
node-version-file: "./superset-frontend/.nvmrc"
cache: "npm"
cache-dependency-path: "superset-frontend/package-lock.json"
node-version-file: './superset-frontend/.nvmrc'
- name: Install dependencies
if: steps.check.outputs.frontend
uses: ./.github/actions/cached-dependencies
@@ -56,16 +51,12 @@ jobs:
babel-extract:
runs-on: ubuntu-24.04
permissions:
contents: read
pull-requests: read
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
submodules: recursive
- name: Check for file changes
id: check
uses: ./.github/actions/change-detector/
@@ -73,83 +64,12 @@ jobs:
token: ${{ secrets.GITHUB_TOKEN }}
- name: Setup Python
if: steps.check.outputs.python == 'true' || steps.check.outputs.frontend == 'true'
if: steps.check.outputs.python
uses: ./.github/actions/setup-backend/
- name: Install gettext tools
if: steps.check.outputs.python == 'true' || steps.check.outputs.frontend == 'true'
run: sudo apt-get update && sudo apt-get install -y gettext
- name: Install msgcat
run: sudo apt update && sudo apt install gettext
# Fetch the base ref so we can compare PR-introduced regressions
# against a fair baseline (also runs babel_update against the base
# source) — this isolates the PR's contribution from any pre-existing
# drift on the base branch.
- name: Fetch base ref and create comparison worktree
if: steps.check.outputs.python == 'true' || steps.check.outputs.frontend == 'true'
env:
PR_BASE_REF: ${{ github.event.pull_request.base.ref }}
run: |
# For PRs use the base branch; for direct pushes compare against the previous commit.
BASE_REF="$PR_BASE_REF"
if [ -n "$BASE_REF" ]; then
git fetch --depth=1 origin "$BASE_REF"
else
git fetch --depth=2 origin "$GITHUB_REF"
fi
git worktree add /tmp/base-worktree FETCH_HEAD
# Run babel_update against BASE source + BASE translations. Any drift
# already present on the base branch (source strings that have changed
# without .po updates) shows up here as fuzzies — and will also show
# up in the PR run, so it cancels out in the comparison.
- name: Baseline — run babel_update against BASE source
if: steps.check.outputs.python == 'true' || steps.check.outputs.frontend == 'true'
working-directory: /tmp/base-worktree
- name: Test babel extraction
if: steps.check.outputs.python
run: ./scripts/translations/babel_update.sh
- name: Record baseline translation counts
if: steps.check.outputs.python == 'true' || steps.check.outputs.frontend == 'true'
run: |
python scripts/translations/check_translation_regression.py \
--count \
--translations-dir /tmp/base-worktree/superset/translations \
> /tmp/before.json
# Run babel_update against the PR source and PR translations. This keeps
# committed .po fixes in play while the base babel_update above still
# cancels out translation drift already present on the base branch.
- name: Run babel_update against PR source
if: steps.check.outputs.python == 'true' || steps.check.outputs.frontend == 'true'
run: ./scripts/translations/babel_update.sh
- name: Check for translation regression
id: regression
if: steps.check.outputs.python == 'true' || steps.check.outputs.frontend == 'true'
continue-on-error: true
run: |
python scripts/translations/check_translation_regression.py \
--compare /tmp/before.json \
--report /tmp/regression-report.md
# Save the PR number so the comment workflow can post the report without
# needing write permissions on this pull_request-triggered job.
- name: Save PR number for comment workflow
if: >-
github.event_name == 'pull_request' &&
steps.regression.outcome == 'failure'
run: echo "${{ github.event.pull_request.number }}" > /tmp/pr-number.txt
- name: Upload regression artifact
if: >-
github.event_name == 'pull_request' &&
steps.regression.outcome == 'failure'
uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7
with:
name: translation-regression
path: |
/tmp/regression-report.md
/tmp/pr-number.txt
- name: Fail if regression detected
if: steps.regression.outcome == 'failure'
run: exit 1

View File

@@ -22,10 +22,9 @@ concurrency:
jobs:
app-checks:
runs-on: ubuntu-24.04
timeout-minutes: 20
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
- name: Install dependencies

View File

@@ -9,7 +9,7 @@ on:
workflow_dispatch:
inputs:
comment_body:
description: "Comment Body"
description: 'Comment Body'
required: true
type: string
@@ -38,7 +38,7 @@ jobs:
});
- name: "Checkout ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false

View File

@@ -16,14 +16,11 @@ on:
force-latest:
required: true
type: choice
default: "false"
default: 'false'
description: Whether to force a latest tag on the release
options:
- "true"
- "false"
permissions:
contents: read
- 'true'
- 'false'
jobs:
config:
runs-on: ubuntu-24.04
@@ -45,18 +42,15 @@ jobs:
if: needs.config.outputs.has-secrets
name: docker-release
runs-on: ubuntu-24.04
permissions:
contents: write
strategy:
matrix:
build_preset:
["dev", "lean", "py310", "websocket", "dockerize", "py311", "py312"]
build_preset: ["dev", "lean", "py310", "websocket", "dockerize", "py311", "py312"]
fail-fast: false
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
fetch-depth: 0
- name: Setup Docker Environment
@@ -68,11 +62,9 @@ jobs:
build: "true"
- name: Use Node.js 20
# zizmor: ignore[cache-poisoning] - node only runs the supersetbot CLI; no dependency cache is enabled
uses: actions/setup-node@48b55a011bda9f5d6aeb4c2d9c7362e8dae4041e # v6
with:
node-version: 20
package-manager-cache: false
- name: Setup supersetbot
uses: ./.github/actions/setup-supersetbot/
@@ -85,9 +77,8 @@ jobs:
INPUT_RELEASE: ${{ github.event.inputs.release }}
INPUT_FORCE_LATEST: ${{ github.event.inputs.force-latest }}
INPUT_GIT_REF: ${{ github.event.inputs.git-ref }}
GITHUB_EVENT_RELEASE_TAG_NAME: ${{ github.event.release.tag_name }}
run: |
RELEASE="${GITHUB_EVENT_RELEASE_TAG_NAME}"
RELEASE="${{ github.event.release.tag_name }}"
FORCE_LATEST=""
EVENT="${{github.event_name}}"
if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
@@ -119,18 +110,16 @@ jobs:
contents: read
pull-requests: write
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
with:
persist-credentials: false
fetch-depth: 0
- name: Use Node.js 20
# zizmor: ignore[cache-poisoning] - node only runs the supersetbot CLI; no dependency cache is enabled
uses: actions/setup-node@48b55a011bda9f5d6aeb4c2d9c7362e8dae4041e # v6
with:
node-version: 20
package-manager-cache: false
- name: Setup supersetbot
uses: ./.github/actions/setup-supersetbot/
@@ -139,12 +128,11 @@ jobs:
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
INPUT_RELEASE: ${{ github.event.inputs.release }}
GITHUB_EVENT_RELEASE_TAG_NAME: ${{ github.event.release.tag_name }}
run: |
export GITHUB_ACTOR=""
git fetch --all --tags
git checkout master
RELEASE="${GITHUB_EVENT_RELEASE_TAG_NAME}"
RELEASE="${{ github.event.release.tag_name }}"
if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
# in the case of a manually-triggered run, read release from input
RELEASE="${INPUT_RELEASE}"

View File

@@ -32,14 +32,12 @@ jobs:
name: Generate Reports
steps:
- name: Checkout Repository
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
with:
persist-credentials: false
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6
- name: Set up Node.js
uses: actions/setup-node@48b55a011bda9f5d6aeb4c2d9c7362e8dae4041e # v6
with:
node-version-file: "./superset-frontend/.nvmrc"
node-version-file: './superset-frontend/.nvmrc'
- name: Install Dependencies
run: npm ci

View File

@@ -1,29 +1,22 @@
name: Welcome New Contributor
on:
# zizmor: ignore[dangerous-triggers] - posts a welcome comment only; does not check out or execute PR-provided code
pull_request_target:
types: [opened]
jobs:
welcome:
runs-on: ubuntu-24.04
if: github.event.pull_request.author_association == 'FIRST_TIME_CONTRIBUTOR'
permissions:
pull-requests: write
steps:
- name: Welcome Message
uses: actions/first-interaction@1c4688942c71f71d4f5502a26ea67c331730fa4d # v3.1.0
uses: actions/first-interaction@v3
continue-on-error: true
with:
repo_token: ${{ github.token }}
issue_message: |-
Congrats on opening your first issue and thank you for contributing to Superset! :tada: :heart:
Please read our [New Contributor Welcome & Expectations](https://github.com/apache/superset/wiki/New-Contributor-Welcome-&-Expectations) guide.
pr_message: |-
repo-token: ${{ github.token }}
pr-message: |-
Congrats on making your first PR and thank you for contributing to Superset! :tada: :heart:
Please read our [New Contributor Welcome & Expectations](https://github.com/apache/superset/wiki/New-Contributor-Welcome-&-Expectations) guide.
We hope to see you in our [Slack](https://apache-superset.slack.com/) community too! Not signed up? Use our [Slack App](http://bit.ly/join-superset-slack) to self-register.

2
.gitignore vendored
View File

@@ -115,8 +115,6 @@ release.json
superset/translations/**/messages.json
# these mo binary files are generated by `pybabel compile`
superset/translations/**/messages.mo
# cross-language index generated by scripts/translations/build_translation_index.py
superset/translations/translation_index.json
docker/requirements-local.txt

View File

@@ -50,7 +50,7 @@ repos:
hooks:
- id: check-docstring-first
- id: check-added-large-files
exclude: ^.*\.(geojson)$|^docs/static/img/screenshots/.*|^superset-frontend/CHANGELOG\.md$|^superset/examples/.*/data\.parquet$|^superset/translations/.*\.po$
exclude: ^.*\.(geojson)$|^docs/static/img/screenshots/.*|^superset-frontend/CHANGELOG\.md$|^superset/examples/.*/data\.parquet$
- id: check-yaml
exclude: ^helm/superset/templates/
- id: debug-statements
@@ -158,14 +158,3 @@ repos:
language: system
files: ^superset/config\.py$
pass_filenames: false
- id: zizmor
name: zizmor (GHA security audit)
entry: zizmor
language: python
additional_dependencies: [zizmor==1.25.2]
files: ^\.github/
types: [yaml]
pass_filenames: false
# Advisory until pre-existing findings are resolved; remove
# --no-exit-codes to make this hook blocking.
args: [--no-exit-codes, .github/]

View File

@@ -53,7 +53,7 @@ extension-pkg-whitelist=pyarrow
[MESSAGES CONTROL]
disable=all
enable=disallowed-sql-import,consider-using-transaction
enable=json-import,disallowed-sql-import,consider-using-transaction
[REPORTS]

View File

@@ -52,43 +52,6 @@ Common pre-commit failures:
- **External API exposure** - Use UUIDs in public APIs instead of internal integer IDs
- **Existing models** - Add UUID fields alongside integer IDs for gradual migration
## Security and Threat Model
Before evaluating any code path for security issues, read [`SECURITY.md`](SECURITY.md). It is the canonical, authoritative source for Apache Superset's security model and is referenced by both human reporters and automated scanners.
In short, the test for whether a finding is in scope is one question:
> *Does it let a principal perform an action the role and capability matrix in `SECURITY.md` does not entitle them to?*
If yes, it is in scope. If no, it is not.
The three trust boundaries are:
1. **The Admin role** is a fully trusted operational principal. Anything an Admin can do through documented configuration, API, or UI is an intended capability, not a vulnerability.
2. **The operator** owns deployment-time decisions (secrets, network exposure, feature-flag selection, connector and codec choices, notification destinations, third-party plugins). Misconfiguration at this layer is a deployment defect, not a Superset vulnerability.
3. **The codebase** is responsible for enforcing the role and capability matrix wherever it exposes functionality to a principal: API routes, command and DAO layers, UI handlers, background jobs, and any other entry point. A missing or incorrect enforcement check is in scope no matter where it lives.
The security model assumes that operator-controlled infrastructure, including the metadata database, cache backends, message brokers, secret stores, and deployment environment, remains within the operator's trust boundary. Vulnerabilities must demonstrate a security boundary violation by an attacker who does not already control those systems.
Route-level authorization in this codebase uses one of three Flask-AppBuilder decorators depending on the route type:
- `@protect()` for REST API routes (`ModelRestApi` / `BaseApi`)
- `@has_access_api` for legacy view routes
- `@has_access` for legacy HTML view routes
Object-level authorization via `security_manager.raise_for_access(...)` applies to data-bearing resources: dashboards, charts, datasets and datasources, queries, database and table access, and query contexts. Other resources (annotations, tags, CSS templates, reports, RLS rules, and similar) rely on the route-level decorator plus DAO `base_filters` for ownership scoping; the absence of `raise_for_access` on these resources is by design, not a finding. Code that omits the per-object gate on a route that returns or mutates a specific data-bearing object is in scope; code that follows the correct pattern for its resource class can still contain injection, SSRF, XSS, or other classes of finding unrelated to authorization, which are evaluated separately.
The full role and capability matrix, in-scope and out-of-scope class lists, and CVE aggregation rules are in [`SECURITY.md`](SECURITY.md). Defer to that document for any specifics.
**Requirements for findings filed by automated tooling**
Automated scanners (LLM-based code scanners, static analyzers, dependency tools) that file findings against this codebase must, in each finding, name:
1. The specific role and capability matrix row in [`SECURITY.md`](SECURITY.md) the finding believes is violated.
2. The principal the finding assumes the attacker holds (Public, Gamma, sql_lab, Alpha, Admin, Embedded guest token, or a custom role with explicit capability grants).
Findings that cannot identify both should be filed as questions, not vulnerabilities. This requirement exists to ensure every reported issue is testable against the published security model and to keep speculative or pattern-match-only reports out of the triage queue.
## Key Directories
```
@@ -165,7 +128,6 @@ The Developer Portal auto-generates MDX documentation from Storybook stories. **
## Architecture Patterns
### Security & Features
- **Security model**: see the top-level [Security and Threat Model](#security-and-threat-model) section and [`SECURITY.md`](SECURITY.md)
- **RBAC**: Role-based access via Flask-AppBuilder
- **Feature flags**: Control feature rollouts
- **Row-level security**: SQL-based data access control

View File

@@ -29,7 +29,7 @@ ARG BUILD_TRANSLATIONS="false"
######################################################################
# superset-node-ci used as a base for building frontend assets and CI
######################################################################
FROM --platform=${BUILDPLATFORM} node:24-trixie-slim AS superset-node-ci
FROM --platform=${BUILDPLATFORM} node:22-trixie-slim AS superset-node-ci
ARG BUILD_TRANSLATIONS
ENV BUILD_TRANSLATIONS=${BUILD_TRANSLATIONS}
ARG DEV_MODE="false" # Skip frontend build in dev mode
@@ -55,13 +55,6 @@ WORKDIR /app/superset-frontend
RUN mkdir -p /app/superset/static/assets \
/app/superset/translations
# Harden `npm ci` against transient npm-registry network blips (e.g. ECONNRESET),
# which otherwise fail the entire multi-platform image build with no retry.
ENV npm_config_fetch_retries=5 \
npm_config_fetch_retry_mintimeout=20000 \
npm_config_fetch_retry_maxtimeout=120000 \
npm_config_fetch_timeout=600000
# Mount package files and install dependencies if not in dev mode
# NOTE: we mount packages and plugins as they are referenced in package.json as workspaces
# ideally we'd COPY only their package.json. Here npm ci will be cached as long
@@ -120,7 +113,7 @@ RUN useradd --user-group -d ${SUPERSET_HOME} -m --no-log-init --shell /bin/bash
# Some bash scripts needed throughout the layers
COPY --chmod=755 docker/*.sh /app/docker/
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
RUN pip install --no-cache-dir --upgrade uv
# Using uv as it's faster/simpler than pip
RUN uv venv /app/.venv
@@ -209,8 +202,6 @@ RUN mkdir -p /app/data && chown -R superset:superset /app/data
# Copy compiled things from previous stages
COPY --from=superset-node /app/superset/static/assets superset/static/assets
# Copy service.worker.js optionall as it doesn't exist when DEV_MODE=true
COPY --from=superset-node /app/superset/static/service-worker.j[s] superset/static/service-worker.js
# TODO, when the next version comes out, use --exclude superset/translations
COPY superset superset

View File

@@ -189,11 +189,6 @@ Try out Superset's [quickstart](https://superset.apache.org/docs/quickstart/) gu
- [Join our community's Slack](http://bit.ly/join-superset-slack)
and please read our [Slack Community Guidelines](https://github.com/apache/superset/blob/master/CODE_OF_CONDUCT.md#slack-community-guidelines)
- [Join our dev@superset.apache.org Mailing list](https://lists.apache.org/list.html?dev@superset.apache.org). To join, simply send an email to [dev-subscribe@superset.apache.org](mailto:dev-subscribe@superset.apache.org)
- Follow us on social media:
[X](https://x.com/apachesuperset) |
[LinkedIn](https://www.linkedin.com/company/apache-superset) |
[Bluesky](https://bsky.app/profile/apachesuperset.bsky.social) |
[Reddit](https://reddit.com/r/apache-superset)
- If you want to help troubleshoot GitHub Issues involving the numerous database drivers that Superset supports, please consider adding your name and the databases you have access to on the [Superset Database Familiarity Rolodex](https://docs.google.com/spreadsheets/d/1U1qxiLvOX0kBTUGME1AHHi6Ywel6ECF8xk_Qy-V9R8c/edit#gid=0)
- Join Superset's Town Hall and [Operational Model](https://preset.io/blog/the-superset-operational-model-wants-you/) recurring meetings. Meeting info is available on the [Superset Community Calendar](https://superset.apache.org/community)

View File

@@ -1,145 +0,0 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
# Security Policy
This is a project of the [Apache Software Foundation](https://apache.org) and follows the
ASF [vulnerability handling process](https://apache.org/security/#vulnerability-handling).
## Reporting Vulnerabilities
**⚠️ Please do not file GitHub issues for security vulnerabilities as they are public! ⚠️**
Apache Software Foundation takes a rigorous standpoint in resolving the security issues
in its software projects. Apache Superset is highly sensitive and forthcoming to issues
pertaining to its features and functionality.
If you have any concern or believe you have found a vulnerability in Apache Superset,
please get in touch with the Apache Superset Security Team privately at
e-mail address [security@superset.apache.org](mailto:security@superset.apache.org).
More details can be found on the ASF website at
[ASF vulnerability reporting process](https://apache.org/security/#reporting-a-vulnerability)
**Submission Standards & AI Policy**
To ensure engineering focus remains on verified risks and to manage high reporting volumes, all reports must meet the following criteria:
- Plain Text Format: In accordance with Apache guidelines, please provide all details in plain text within the email body. Avoid sending PDFs, Word documents, or password-protected archives.
- Mandatory AI Disclosure: If you utilized Large Language Models (LLMs) or AI tools to identify a flaw or assist in writing a report, you must disclose this in your submission so our triage team can contextualize the findings.
- Human-Verified PoC: All submissions must include a manual, step-by-step Proof of Concept (PoC) performed on a supported release. Raw AI outputs, hypothetical chat transcripts, or unverified scanner logs will be closed as Invalid.
We kindly ask you to include the following information in your report to assist our developers in triaging and remediating issues efficiently:
- Version/Commit: The specific version of Apache Superset or the Git commit hash you are using.
- Configuration: A sanitized copy of your `superset_config.py` file or any config overrides.
- Environment: Your deployment method (e.g., Docker Compose, Helm, or source) and relevant OS/Browser details.
- Impacted Component: Identification of the affected area (e.g., Python backend, React frontend, or a specific database connector).
- Expected vs. Actual Behavior: A clear description of the intended system behavior versus the observed vulnerability.
- Detailed Reproduction Steps: Clear, manual steps to reproduce the vulnerability.
## Security Model
This section defines what Apache Superset considers a security issue and what it does not. It is the canonical reference for reporters, the Apache Superset Security Team, and any automated tool (LLM-based scanner, static analyzer, dependency tool) that needs to constrain its hypotheses to behaviors that genuinely violate the project's security policy.
The model is intentionally written in terms of principals, trust boundaries, and capability surface rather than in terms of specific files, functions, or libraries. New code paths inherit the model automatically.
### Trust Boundaries
Apache Superset's threat model assumes three trust boundaries.
1. *The Admin role* is a fully trusted operational principal. Anything an Admin can do through the documented user interface, REST API, or configuration system is an intended capability, not a vulnerability, even if individually powerful or destructive. The Admin role is, by policy, equivalent to operating-system-level trust over the Apache Superset application. This is unavoidable rather than aspirational: an Admin can, for example, register new database connections of arbitrary type, execute arbitrary SQL through SQL Lab, render Jinja templates that resolve to SQL or rendered HTML, and override application configuration. Granting Admin is functionally equivalent to granting shell access on the host, which is the reasoning behind treating it as a trust boundary in the sense of MITRE CNA Operational Rules 4.1.
2. *The operator* is whoever deploys, configures, and runs Apache Superset. Behaviors that depend on deployment-time decisions are the operator's responsibility, not Apache Superset's. This includes the values of secrets, the network reachability of the application and its data sources, the choice of database connectors and cache backends, the selection of feature flags, the destinations of notifications, and the trust placed in third-party plugins. Defaults that fail closed are the responsibility of the Apache Superset codebase. Defaults that fail open must be accompanied by a documented hardening requirement; applying that hardening is the operator's responsibility, while shipping an undocumented or unflagged fail-open default is a codebase issue.
3. *The Apache Superset codebase* is responsible for enforcing the role and capability matrix below across its product surface. A failure to enforce, anywhere in that surface, is in scope. The codebase's commitments are limited to the role and capability matrix and to controls Apache Superset's own documentation (this file and the linked Security documentation) explicitly positions as security boundaries; configurable hardening that operators can layer on top is treated separately under *Vulnerability Scope* below.
### Roles and Capabilities
Apache Superset ships with the following first-class principals. Detailed permission definitions live in the [Security documentation](https://superset.apache.org/docs/security).
| Principal | Read data | Write objects | Execute SQL | Manage databases | Manage users, roles, RLS |
|---|---|---|---|---|---|
| Public (anonymous) | none by default | no | no | no | no |
| Gamma | only granted datasets | own charts and dashboards on granted datasets | no by default (requires the `sql_lab` role) | no | no |
| Alpha | all data sources | own charts, dashboards, and datasets | no by default (requires the `sql_lab` role) | data upload to existing databases only | no |
| Admin | all | all | yes | yes | yes |
| Embedded guest token | data sources reachable through the embedded dashboards the token authorizes | no | no | no | no |
The `sql_lab` role is *additive*: it grants the SQL Lab permission set on top of the base role above, and is the only path by which Gamma or Alpha gain SQL execution capability. Database access is still scoped per the base role's grants. Admin includes SQL Lab access by default.
Deployments may grant or revoke individual view-menu permissions, which shifts the boundary for that deployment but does not redefine the model. Any custom role created by an operator inherits the same principle: its capabilities are whatever the operator has explicitly granted it. The Public principal follows the same rule: operators may grant the Public role read access to specific datasets or dashboards (typically for anonymous reporting use cases), which shifts the boundary for that deployment without redefining the model.
### Vulnerability Scope
The test for whether a finding is in scope is a single question:
> *Does this finding let a principal perform an action the role and capability matrix above does not entitle them to?*
If yes, it is in scope. If no, it is out of scope. The lists below apply that test to the classes Apache Superset most commonly receives reports about; they are illustrative, not exhaustive.
*In Scope*
- A user, embedded guest, or anonymous visitor reads, modifies, or deletes data outside their granted set. Includes object-level access bypass on charts, dashboards, datasets, saved queries, tags, annotations, and similar per-object endpoints, and row-level-security rule bypass.
- A user supplies input that the codebase should sanitize or parameterize but does not, causing arbitrary SQL, template code, or scripts to execute. Includes injection through Jinja templates, SQL-construction paths, and any field the codebase passes to a query engine or template engine.
- A user bypasses authentication, fixates or reuses another user's session, or reaches an authenticated endpoint without logging in.
- An embedded guest token authorizes actions outside the dashboard it was issued for, or can be forged, replayed, or escalated to a higher principal.
- Apache Superset, acting on behalf of an unprivileged user, fetches an outbound URL the user controls in a feature where Apache Superset itself, not the operator, controls the outbound destination set (server-side request forgery).
- An Apache Superset default fails open without an accompanying documented hardening requirement. The codebase is responsible for shipping fail-closed defaults or for documenting the hardening required when a default fails open; failures of that responsibility are in scope (see *Trust Boundaries*).
- A user bypasses a control Apache Superset documents specifically as a security boundary. This includes row-level security, the access checks tied to the role and capability matrix above, and any feature whose documentation positions it as security-relevant. The codebase commits to enforcing those controls; bypasses are in scope regardless of which principal triggers them.
- A user causes a script to execute in another user's browser through a field the codebase renders to that other user (cross-site scripting), or causes cross-origin leakage of authenticated session state or data.
- A user reaches a route, page, or API endpoint that requires a role they do not have.
*Out of Scope*
- Any action an Admin role can perform through documented configuration, API, or UI. The Admin role is a trusted operational principal by policy. Per MITRE CNA Operational Rules 4.1, a qualifying vulnerability must violate a security policy; behavior within a documented trust boundary does not.
- Deployment or operator decisions: the values of secrets and tokens, whether internal networks are reachable from the server, which database connectors or cache backends are enabled, which feature flags are set, where notifications are delivered, and which third-party plugins are loaded.
- Compromise, modification, or malicious control of trusted backend infrastructure. Apache Superset assumes the integrity of its metastore, cache backends (for example Redis or Memcached), message brokers, secret stores, and other operator-managed infrastructure. Findings that require an attacker to read from, write to, or otherwise tamper with these systems, including injecting malicious state, serialized objects, cache entries, task metadata, configuration, or database records, are post-compromise scenarios and do not constitute vulnerabilities in Apache Superset itself. A finding remains in scope only if an unprivileged user can cause such modification through a vulnerability in Apache Superset.
- The continued presence of expired key-value or metastore-cache entries that have not yet been deleted from the metadata database. Such entries are excluded from reads once expired, are purged opportunistically on write, and are removed in bulk by the scheduled `prune_key_value` maintenance task; their lingering until purged is an eventual-cleanup property, not a security boundary, and does not constitute a vulnerability.
- How a downstream application (spreadsheet program, email client, browser handling user-downloaded files) interprets output Apache Superset produced for it.
- Findings without a reproducible proof of concept against a supported release. The burden of demonstrating exploitability rests with the reporter; findings closed for lack of a proof of concept may be refiled if one is later produced.
- Brute force, rate limiting, denial of service, or resource exhaustion that does not bypass a documented control.
- Missing security headers, banner or version disclosure, user or object enumeration through error messages or timing, and similar low-impact information disclosure that does not enable a further concrete exploit.
- Bypasses of configurable defense-in-depth hardening that Apache Superset does not document as a security boundary. Apache Superset is not a SQL or database firewall: operator-deployable filters such as SQL function or table denylists, URI restrictions on already-authorized database connectors, and similar belt-and-braces controls are provided to let operators layer hardening on top of the role and capability matrix, not as firewall-grade guarantees the codebase commits to. Findings against such hardening are improvements, not vulnerabilities, unless the documentation positions the specific control as security-relevant.
- Hardening suggestions that improve defense in depth but do not violate the security model.
Findings in third-party dependencies fall into two cases. A finding in a transitive dependency, or in an operator-selected dependency that Apache Superset does not ship, is out of scope and should be reported to the dependency's maintainers. A finding caused by Apache Superset pinning a known-vulnerable version of a direct dependency it ships, or using a dependency in a way that creates a vulnerability the dependency itself does not have, remains in scope. Dependency findings in the official Apache Superset Docker image that fall into the first case can be remediated by extending the image at release time.
When uncertain whether a finding falls in scope, please file it through the reporting process above. The triage team will classify it and explain the reasoning if it is closed as out of scope.
**Outcome of Reports**
Reports that are deemed out of scope for a CVE but represent valid security best practices or hardening opportunities are typically converted into public GitHub issues, where the community can contribute fixes alongside the maintainers. The triage decision and reasoning are communicated back to the reporter in either case.
**Vulnerability Aggregation & CVE Attribution**
In accordance with MITRE CNA Operational Rules (4.1.10, 4.1.11, and 4.2.13), Apache Superset issues CVEs based on the underlying architectural root cause rather than the number of affected endpoints or exploit payloads.
- Aggregation: If multiple exploit vectors stem from the same programmatic failure or shared vulnerable code, they must be aggregated into a single, comprehensive report.
- Independent Fixes: Separate CVEs will only be assigned if the vulnerabilities reside in decoupled architectural modules and can be fixed independently of one another.
Reports that fail to aggregate related findings will be merged during triage to ensure an accurate and defensible CVE record.
**Your responsible disclosure and collaboration are invaluable.**
## Extra Information
- [Apache Superset documentation](https://superset.apache.org/docs/security)
- [Common Vulnerabilities and Exposures by release](https://superset.apache.org/docs/security/cves)
- [How Security Vulnerabilities are Reported & Handled in Apache Superset (Blog)](https://preset.io/blog/how-security-vulnerabilities-are-reported-and-handled-in-apache-superset/)
**For LLM agents and automated scanners**
This file is the canonical source of truth for what Apache Superset considers a security vulnerability. Automated tooling (LLM-based code scanners, static analyzers, dependency tools) should treat the Security Model section as authoritative when classifying findings. The repository's [AGENTS.md](AGENTS.md) file contains a short pointer to this document for agents that read AGENTS.md as their entry point.

View File

@@ -24,165 +24,55 @@ assists people when migrating to a new version.
## Next
### Webhook alerts/reports block private/internal hosts by default
### Entity version history for charts, dashboards, and datasets
Webhook alert/report dispatch (`WebhookNotification.send`) now validates the target URL's host against the same private/internal-IP block applied to dataset import URLs. If the resolved host is in a loopback, link-local, private (RFC-1918), shared-CGNAT, or multicast range, the webhook is rejected with `NotificationParamException`.
Saves of charts, dashboards, and datasets now automatically produce a version history — browsable and restorable via new API endpoints. No frontend UI in this release; the backend plumbing is the deliverable.
Deployments that intentionally point webhooks at internal targets (chatops bridges, internal automation servers, on-premises Mattermost/Rocket.Chat, etc.) can opt out by setting `ALERT_REPORTS_WEBHOOK_ALLOW_INTERNAL_HOSTS = True` in `superset_config.py`. This mirrors the existing `DATASET_IMPORT_ALLOW_INTERNAL_DATA_URLS` opt-out for dataset imports.
**New endpoints** (per entity type — same pattern for `chart`, `dashboard`, and `dataset`):
### Impala cancel_query blocks private/internal hosts by default
| Method | Path | Purpose |
|---|---|---|
| `GET` | `/api/v1/{resource}/<uuid>/versions/` | List the entity's version history (0-based `version_number`, `version_uuid`, `issued_at`, `changed_by`) |
| `GET` | `/api/v1/{resource}/<uuid>/versions/<version_uuid>/` | Get a single version snapshot (scalar fields at that version; plus `columns` / `metrics` for datasets) |
| `POST` | `/api/v1/{resource}/<uuid>/versions/<version_uuid>/restore` | Restore the entity to the state captured by that version |
The Impala engine spec's `cancel_query` issues an HTTP request from the Superset backend to the host configured on the Impala database connection. That host is now validated before the request: if it resolves to a private/internal IP range, the cancel call is refused and a warning is logged. Operators whose Impala cluster runs on an internal network can opt out by setting `IMPALA_CANCEL_QUERY_ALLOW_INTERNAL_HOSTS = True` in `superset_config.py`. This mirrors the dataset-import and webhook opt-out flags.
### Map chart renderer and OpenStreetMap migration behavior
`<version_uuid>` is a deterministic `UUIDv5` derived from the entity's UUID and the Continuum transaction id — stable across replicas and retention pruning. Authorisation reuses the resource's existing `can_write` permission; workspace admins can list/restore any entity.
The MapLibre migration for deck.gl charts preserves saved non-Mapbox styles on
the MapLibre-compatible path. Saved styles such as OpenStreetMap, `tile://`
tile templates, generic HTTPS style URLs, and charts without a saved style are
not reclassified as Mapbox during migration and do not require
`MAPBOX_API_KEY` only because of the migration.
**Version response shape — `changes` array:**
Saved true Mapbox styles whose value starts with `mapbox://` remain
Mapbox-backed. If a Superset deployment does not configure `MAPBOX_API_KEY`,
those saved Mapbox charts keep the existing missing-key message instead of
silently falling back to MapLibre or another provider. In Explore, deck.gl and
point-cluster renderer controls preserve saved Mapbox state, but the Mapbox
choice is not available as a new working renderer without a configured key.
Each entry returned by `GET /api/v1/{resource}/<uuid>/versions/` and `GET .../versions/<version_uuid>/` includes a `changes` array describing what changed relative to the previous version:
The MapLibre style choices include `Streets (OSM)`, backed by
`https://tile.openstreetmap.org/{z}/{x}/{y}.png`. This OpenStreetMap tile
service requires visible `© OpenStreetMap contributors` attribution and should
be used through normal browser map tile requests and caching; it is not intended
for bulk prefetch or offline tile downloads.
### Password complexity policy enabled by default
Superset now ships a default password-complexity policy, enforced (via Flask-AppBuilder) across self-registration, the user create/edit/reset forms, and the User REST API. The policy requires a minimum password length of 8 characters and rejects a built-in blocklist of common/guessable passwords.
This is enabled by default (`FAB_PASSWORD_COMPLEXITY_ENABLED = True`), so new or reset passwords that are too short or appear in the blocklist will be rejected where they were previously accepted. Existing stored passwords are unaffected until they are next changed.
Operators can tune or disable the policy via config:
- `AUTH_PASSWORD_MIN_LENGTH` — minimum length (default `8`).
- `AUTH_PASSWORD_COMMON_BLOCKLIST` — extra passwords to reject, in addition to the built-in list.
- `FAB_PASSWORD_COMPLEXITY_VALIDATOR` — replace with your own callable for custom rules.
- `FAB_PASSWORD_COMPLEXITY_ENABLED = False` — disable enforcement entirely.
### Data uploads bounded by UPLOAD_MAX_FILE_SIZE_BYTES
Single data-file uploads (CSV, Excel, columnar) are now bounded by the `UPLOAD_MAX_FILE_SIZE_BYTES` config option, which defaults to `100 * 1024 * 1024` (100 MB). Files larger than this are rejected with a `413` before their contents are buffered into memory. Set `UPLOAD_MAX_FILE_SIZE_BYTES = None` to disable the check and restore unbounded uploads.
### Duration formatter precision
The `DURATION` number formatter now uses `Intl.DurationFormat` for locale-aware output. By default, sub-second fields are omitted, so values that previously displayed fractional seconds with `pretty-ms`, such as `10500` milliseconds rendering as `10.5s`, now render as `10s`.
To preserve sub-second precision in custom duration formatters, enable `formatSubMilliseconds`.
### Cache warmup authenticates via SUPERSET_CACHE_WARMUP_USER
The `cache-warmup` Celery task now drives a real WebDriver session for reliable authentication and reads the user to authenticate as from the new `SUPERSET_CACHE_WARMUP_USER` config option. It no longer consults `CACHE_WARMUP_EXECUTORS` for the warmup path. `SUPERSET_CACHE_WARMUP_USER` defaults to `None`, so the task fails fast with a clear message until you set it. Operators who previously relied on `CACHE_WARMUP_EXECUTORS` for cache warmup must set `SUPERSET_CACHE_WARMUP_USER` to a dedicated least-privilege user with access to the dashboards they want warmed up before the next warmup run.
### YDB now uses a native sqlglot dialect
YDB SQL parsing now relies on the dedicated [`ydb-sqlglot-plugin`](https://pypi.org/project/ydb-sqlglot-plugin/) dialect, which registers itself with sqlglot automatically. YDB users must install this plugin (e.g., via `pip install "apache-superset[ydb]"`) to avoid a `ValueError` when Superset parses YDB queries.
### Embedded dashboards enforce configured Allowed Domains for postMessage
The embedded dashboard page now validates the origin of incoming `postMessage` events against the dashboard's configured **Allowed Domains**. The server-rendered embedded page exposes the configured domains in its bootstrap payload, and the frontend rejects message events whose origin is not in that list.
Enforcement only applies when the Allowed Domains list is non-empty. If the list is empty (the default), any origin is accepted, so there is no behavior change for embeds that did not configure Allowed Domains.
### Default guest/async JWT secrets are rejected at startup
Superset already refuses to start in production (non-debug, non-testing) when `SECRET_KEY` is left at its built-in default, and when `GUEST_TOKEN_JWT_SECRET` is left at its default while `EMBEDDED_SUPERSET` is enabled. This behavior is extended to `GLOBAL_ASYNC_QUERIES_JWT_SECRET`: if the `GLOBAL_ASYNC_QUERIES` feature flag is enabled and the secret is still the publicly known default (`test-secret-change-me`), Superset logs a clear error and refuses to start.
As with the existing `SECRET_KEY` check, this only fails in production. In debug mode, testing mode, or under the test runner, a warning is logged instead of exiting, so local development is unaffected.
To resolve the error, set a strong random value in `superset_config.py`:
```python
GLOBAL_ASYNC_QUERIES_JWT_SECRET = "<output of: openssl rand -base64 42>"
```json
"changes": [
{"kind": "field", "path": "slice_name", "from_value": "Old", "to_value": "New"}
]
```
The check is only active when the relevant feature is enabled, so deployments that do not use global async queries (or embedding) are not affected.
The array is empty for baseline (`operation_type=0`) transactions. `kind` enumerates structured record types (`field`, layout-walker records for dashboards, dataset child diffs for `TableColumn` / `SqlMetric`); `path` is a dotted JSON-pointer-style locator; `from_value` / `to_value` are JSON-safe scalars or compact records.
### Guest token revocation (opt-in)
**Save-response and ETag headers:**
Embedded guest tokens can be coarsely revoked at runtime via a new opt-in mechanism. A new config flag `GUEST_TOKEN_REVOCATION_ENABLED` (default `False`) gates the feature. When enabled, every minted guest token carries a revocation version, and tokens whose version is below the current expected version (stored in the metadata database) are rejected at validation time.
- Save responses (`PUT /api/v1/{resource}/<pk>`) include `old_version_uuid` and `new_version_uuid` body fields so the client can correlate a save with its resulting version row.
- All entity GETs (`GET /api/v1/{chart,dashboard,dataset}/<pk>`), version-list GETs, single-version GETs, and save responses emit an `ETag: "<version_uuid>"` header reflecting the entity's current live version. The default `CORS_OPTIONS` now sets `expose_headers: ["ETag"]` so cross-origin browser clients can read the header. **No `If-Match` enforcement in v1**`ETag` is informational; concurrent-edit detection is deferred to a follow-up SIP.
- **Operators overriding `CORS_OPTIONS` in `superset_config.py` MUST include `"expose_headers": ["ETag"]`** (or merge with the default) for cross-origin clients to read the ETag. A bare `CORS_OPTIONS = {"origins": [...]}` will silently drop the expose-headers default.
Bump the expected version with the new CLI command to invalidate all outstanding guest tokens:
**Behaviour changes on save:**
```bash
superset revoke-guest-tokens
```
- Every save of a chart, dashboard, or dataset produces one new version row. Rows preserve the full post-save state (scalar fields for all three entity types; `TableColumn` / `SqlMetric` children for datasets; `dashboard_slices` chart membership for dashboards — children versioned via SQLAlchemy-Continuum shadow tables `table_columns_version`, `sql_metrics_version`, and `dashboard_slices_version`).
- First save after an entity already exists in the DB creates a retroactive baseline version so the UI can show "what this looked like before I edited it."
- Tags, owners, and roles are **not** versioned in v1 (ADR-005). A restore leaves those at their live values.
This change is backward compatible. The feature is off by default, and even when enabled nothing is revoked until an admin explicitly bumps the version: the expected version starts at `0`, and tokens minted before this change (which carry no version claim) are treated as version `0`. No database migration is required.
**New config key:**
### Sessions are terminated when an account is disabled
| Key | Default | Purpose |
|---|---|---|
| `SUPERSET_VERSION_HISTORY_RETENTION_DAYS` | `30` | Versions older than this many days are pruned by a nightly Celery beat task (`superset.tasks.version_history_retention.prune_old_versions`). Each entity's live row (`end_transaction_id IS NULL`) is always preserved; closed historical rows including the baseline age out with the rest. Set to `0` to disable retention entirely. |
Disabling a user account (setting `active` to `False`, via the admin UI, REST API, or CLI) now terminates that user's outstanding sessions on their next request, instead of relying on a passive check. This works for both client-side cookie sessions and server-side session stores via a per-user invalidation epoch (`user_attribute.sessions_invalidated_at`, added by a migration). The mechanism is inert for users that were never disabled (NULL epoch), so there is no behavior change for active users. Re-enabling an account and logging in again starts a fresh, valid session. The migration backfills the epoch for accounts that are already disabled at upgrade time, so re-enabling such an account does not revive a session that predates this feature.
**Impact on external integrations:**
### Opt-in SSH tunnel server host key verification
SSH tunnels can now optionally pin the expected SSH server host key as a defense-in-depth measure against man-in-the-middle attacks. paramiko's transport performs no known-hosts checking by default, so previously the SSH server's identity was not verified. This feature is opt-in and off by default; existing tunnels are unaffected.
- A new nullable `server_host_key` column on the `ssh_tunnels` table stores the expected host key in authorized-key form (e.g. `ssh-ed25519 AAAA...`). It is a public key and is stored in plaintext. It can be set via the SSH tunnel POST/PUT payloads (`ssh_tunnel.server_host_key`).
- When a tunnel has `server_host_key` set, Superset connects to the SSH server, reads the host key it presents, and rejects the tunnel if it does not match.
- A new config flag `SSH_TUNNEL_STRICT_HOST_KEY_CHECKING` (default `False`) controls fail-closed behavior. When `True`, every tunnel must declare a `server_host_key`; a tunnel without one is rejected.
Runbook to adopt:
1. Capture the SSH server's host key, e.g. `ssh-keyscan -t ed25519 ssh.example.com` (verify it out-of-band).
2. Set that value on the tunnel's `server_host_key` (via the database/SSH tunnel API or UI payload).
3. Optionally set `SSH_TUNNEL_STRICT_HOST_KEY_CHECKING = True` in `superset_config.py` to require host-key verification on all tunnels.
### Dataset import validates catalog against the target connection
Importing a dataset now validates the `catalog` field against the target database connection. When the connection has multi-catalog disabled (`allow_multi_catalog` off) and the dataset's catalog is not the connection's default catalog, the import fails instead of silently persisting the non-default catalog. This matches the validation already enforced on the dataset update path and prevents imported datasets from querying an unintended database.
If you relied on importing datasets with a non-default catalog, enable "Allow changing catalogs" on the target connection, or set the dataset's catalog to the connection's default before importing.
### Extension supply-chain controls (denylist + version policy)
Two opt-in static gates control which extensions are allowed to load:
- `EXTENSION_DENYLIST` refuses extensions matching an id (every version) or `id@version` (a single version), e.g. `["compromised-extension", "other-ext@1.2.3"]`.
- `EXTENSION_VERSION_POLICY` enforces a minimum version per extension id, e.g. `{"acme.widget": "1.2.0"}` (PEP 440 comparison); a release below the minimum is refused.
Both default to empty (no behavior change). They apply to both the `LOCAL_EXTENSIONS` and `EXTENSIONS_PATH` load paths.
### Dynamic Group By respects the sort toggle for display values
The Dynamic Group By chart customization now orders its display values according to the "Sort display control values" toggle: ascending (AZ), descending (ZA), or the dataset's source order when the toggle is unset. Previously the dropdown always sorted alphabetically. Existing dashboards where the toggle was never set will show options in source order instead of AZ; open the customization and enable the toggle to restore alphabetical ordering.
### Selectable encryption engine for app-encrypted fields (AES-GCM)
App-encrypted fields (database passwords, SSH tunnel credentials, OAuth tokens, etc.) can now use authenticated **AES-GCM** encryption instead of the historical unauthenticated **AES-CBC**. A new config selects the engine for the default adapter:
```python
# "aes" (AES-CBC, historical default) | "aes-gcm" (authenticated, recommended for new installs)
SQLALCHEMY_ENCRYPTED_FIELD_ENGINE = "aes"
```
**No action required / no behavior change:** the default remains `"aes"`, so existing installs are unaffected.
**Opting in on an existing install:** flipping the engine on a populated database without re-encrypting first will make stored secrets undecryptable, because the two ciphertext formats are not compatible. A migrator is provided. Recommended runbook:
1. Take a metadata-DB backup.
2. Re-encrypt existing secrets into the new engine (the `SECRET_KEY` is unchanged):
```bash
superset re-encrypt-secrets --engine aes-gcm
```
3. Set `SQLALCHEMY_ENCRYPTED_FIELD_ENGINE = "aes-gcm"` in your config.
4. Restart Superset.
5. Re-run the migrator once more after the restart:
```bash
superset re-encrypt-secrets --engine aes-gcm
```
A live instance keeps writing *new* secrets as AES-CBC during the window between step 2 and the restart in step 4; this second pass sweeps those up (it is idempotent, so already-migrated values are skipped).
Schedule the cutover in a quiet window. Runtime reads use only the single configured engine, so in a multi-worker deployment there is an unavoidable brief decrypt-outage between the migration commit and the last worker restarting with the new config — each migrator run is transactional, but the fleet-wide cutover is not zero-downtime.
The migration is transactional (all-or-nothing) and idempotent — it can be safely re-run or resumed. Note that AES-GCM, unlike AES-CBC, does not support querying directly over encrypted columns; audit any code that filters on an encrypted column before switching. See the SIP at `docs/sip/authenticated-encryption-at-rest.md` for details.
- New tables populated on every save — `dashboards_version`, `slices_version`, `tables_version` (parent shadow tables for the three entity types), `table_columns_version`, `sql_metrics_version`, `dashboard_slices_version` (child shadow tables), plus the shared `version_transaction` and `version_changes` tables. External tooling that queries Superset's DB directly will see writes to these tables proportional to save traffic.
- Existing entity endpoints (`GET`/`PUT /api/v1/{chart,dashboard,dataset}/<pk>`) gain an `ETag` response header and the save response gains `old_version_uuid` / `new_version_uuid` body fields. No existing fields are removed or repurposed.
- Version capture is always active — no feature flag.
### Granular Export Controls
@@ -470,6 +360,246 @@ See `superset/mcp_service/PRODUCTION.md` for deployment guides.
}
```
### Composite primary keys on many-to-many association tables
The eight M:N association tables listed below have been changed from a synthetic surrogate `id INTEGER PRIMARY KEY` to a composite `PRIMARY KEY (fk1, fk2)` on the two foreign-key columns. The `id` column is dropped, and the two tables that previously carried a redundant `UNIQUE (fk1, fk2)` constraint have that constraint removed (it is now subsumed by the composite primary key).
**Affected tables and their composite-PK column pairs:**
| Table | Composite PK |
|---|---|
| `dashboard_roles` | `(dashboard_id, role_id)` |
| `dashboard_slices` | `(dashboard_id, slice_id)` |
| `dashboard_user` | `(user_id, dashboard_id)` |
| `report_schedule_user` | `(user_id, report_schedule_id)` |
| `rls_filter_roles` | `(role_id, rls_filter_id)` |
| `rls_filter_tables` | `(table_id, rls_filter_id)` |
| `slice_user` | `(user_id, slice_id)` |
| `sqlatable_user` | `(user_id, table_id)` |
**Impact on external readers:** Any BI tool, custom report, backup script, or external integration that references these tables by their old surrogate `id` column (e.g., `SELECT id FROM dashboard_slices WHERE …`, `WHERE dashboard_slices.id IN (…)`) will break. Update such queries to project or filter on the FK pair (`dashboard_id, slice_id`) instead. The FK columns themselves are unchanged.
**Pre-flight inventory queries.** Before applying the upgrade, operators are encouraged to run the queries below against their database to assess what the migration will change. Two classes of pre-existing data are not preserved by the migration: duplicate `(fk1, fk2)` rows (the migration keeps `MIN(id)` and deletes the rest) and rows with `NULL` in either FK column (the migration deletes them, since FK columns are promoted to `NOT NULL` for the composite PK). Compliance- or audit-sensitive operators should also `\copy` (Postgres) or `SELECT … INTO OUTFILE` (MySQL) the affected rows for their own records before upgrading.
```sql
-- Duplicate (fk1, fk2) pairs (the migration will keep MIN(id) per group, delete the rest)
SELECT dashboard_id, role_id, COUNT(*) FROM dashboard_roles GROUP BY dashboard_id, role_id HAVING COUNT(*) > 1;
SELECT dashboard_id, slice_id, COUNT(*) FROM dashboard_slices GROUP BY dashboard_id, slice_id HAVING COUNT(*) > 1;
SELECT user_id, dashboard_id, COUNT(*) FROM dashboard_user GROUP BY user_id, dashboard_id HAVING COUNT(*) > 1;
SELECT user_id, report_schedule_id, COUNT(*) FROM report_schedule_user GROUP BY user_id, report_schedule_id HAVING COUNT(*) > 1;
SELECT role_id, rls_filter_id, COUNT(*) FROM rls_filter_roles GROUP BY role_id, rls_filter_id HAVING COUNT(*) > 1;
SELECT table_id, rls_filter_id, COUNT(*) FROM rls_filter_tables GROUP BY table_id, rls_filter_id HAVING COUNT(*) > 1;
SELECT user_id, slice_id, COUNT(*) FROM slice_user GROUP BY user_id, slice_id HAVING COUNT(*) > 1;
SELECT user_id, table_id, COUNT(*) FROM sqlatable_user GROUP BY user_id, table_id HAVING COUNT(*) > 1;
-- Rows with a NULL in either FK (the migration will delete these)
SELECT COUNT(*) FROM dashboard_roles WHERE dashboard_id IS NULL OR role_id IS NULL;
SELECT COUNT(*) FROM dashboard_slices WHERE dashboard_id IS NULL OR slice_id IS NULL;
SELECT COUNT(*) FROM dashboard_user WHERE user_id IS NULL OR dashboard_id IS NULL;
SELECT COUNT(*) FROM report_schedule_user WHERE user_id IS NULL OR report_schedule_id IS NULL;
SELECT COUNT(*) FROM rls_filter_roles WHERE role_id IS NULL OR rls_filter_id IS NULL;
SELECT COUNT(*) FROM rls_filter_tables WHERE table_id IS NULL OR rls_filter_id IS NULL;
SELECT COUNT(*) FROM slice_user WHERE user_id IS NULL OR slice_id IS NULL;
SELECT COUNT(*) FROM sqlatable_user WHERE user_id IS NULL OR table_id IS NULL;
```
**Sizing the maintenance window on PostgreSQL.** The queries above are dialect-portable but only count rows. Operators on PostgreSQL can run the diagnostic queries below to characterize the migration's runtime cost ahead of time: per-table row count and on-disk size, an aggregated duplicate roll-up, the external-FK pre-flight check (the migration runs the same check and aborts if it returns rows), and a rewrite-time estimate for the two tables that go through the slower full-table-rebuild path.
```sql
-- Per-table size, row count, and which migration path each will take.
-- Two tables ("dashboard_slices", "report_schedule_user") have a
-- redundant UNIQUE constraint that the migration drops via a full
-- table rewrite (op.batch_alter_table(recreate="always")). The other
-- six use direct ALTER TABLE, which is much cheaper.
WITH affected(name, has_unique) AS (
VALUES
('dashboard_roles', false),
('dashboard_slices', true),
('dashboard_user', false),
('report_schedule_user', true),
('rls_filter_roles', false),
('rls_filter_tables', false),
('slice_user', false),
('sqlatable_user', false)
)
SELECT
a.name AS table_name,
CASE WHEN a.has_unique THEN 'recreate (full rewrite)'
ELSE 'direct ALTER' END AS migration_path,
c.reltuples::bigint AS estimated_rows,
pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size,
pg_size_pretty(pg_relation_size(c.oid)) AS heap_size,
pg_size_pretty(pg_indexes_size(c.oid)) AS index_size
FROM affected a
JOIN pg_class c ON c.relname = a.name AND c.relkind = 'r'
ORDER BY pg_total_relation_size(c.oid) DESC;
```
```sql
-- Aggregated duplicate-row roll-up.
-- "dup_groups" is the number of (fk1, fk2) pairs that appear more
-- than once; "rows_dropped" is the total number of rows the
-- migration will delete during the dedupe pass (it keeps MIN(id) per
-- group and discards the rest).
SELECT 'dashboard_roles' AS t, COUNT(*) AS dup_groups, SUM(c) - COUNT(*) AS rows_dropped
FROM (SELECT COUNT(*) c FROM dashboard_roles GROUP BY dashboard_id, role_id HAVING COUNT(*) > 1) g
UNION ALL SELECT 'dashboard_slices', COUNT(*), SUM(c) - COUNT(*)
FROM (SELECT COUNT(*) c FROM dashboard_slices GROUP BY dashboard_id, slice_id HAVING COUNT(*) > 1) g
UNION ALL SELECT 'dashboard_user', COUNT(*), SUM(c) - COUNT(*)
FROM (SELECT COUNT(*) c FROM dashboard_user GROUP BY user_id, dashboard_id HAVING COUNT(*) > 1) g
UNION ALL SELECT 'report_schedule_user',COUNT(*), SUM(c) - COUNT(*)
FROM (SELECT COUNT(*) c FROM report_schedule_user GROUP BY user_id, report_schedule_id HAVING COUNT(*) > 1) g
UNION ALL SELECT 'rls_filter_roles', COUNT(*), SUM(c) - COUNT(*)
FROM (SELECT COUNT(*) c FROM rls_filter_roles GROUP BY role_id, rls_filter_id HAVING COUNT(*) > 1) g
UNION ALL SELECT 'rls_filter_tables', COUNT(*), SUM(c) - COUNT(*)
FROM (SELECT COUNT(*) c FROM rls_filter_tables GROUP BY table_id, rls_filter_id HAVING COUNT(*) > 1) g
UNION ALL SELECT 'slice_user', COUNT(*), SUM(c) - COUNT(*)
FROM (SELECT COUNT(*) c FROM slice_user GROUP BY user_id, slice_id HAVING COUNT(*) > 1) g
UNION ALL SELECT 'sqlatable_user', COUNT(*), SUM(c) - COUNT(*)
FROM (SELECT COUNT(*) c FROM sqlatable_user GROUP BY user_id, table_id HAVING COUNT(*) > 1) g
ORDER BY rows_dropped DESC NULLS LAST;
```
```sql
-- External-FK pre-flight check.
-- The migration runs the equivalent check at upgrade time and aborts
-- if any external FK references one of the soon-to-be-removed `id`
-- columns. Running it ahead of time lets you discover (and migrate)
-- any such reference before the maintenance window. On a stock
-- Superset install this should return zero rows. (Default schema
-- only; multi-schema deployments need to broaden the lookup.)
SELECT
rc.constraint_name,
kcu.table_schema || '.' || kcu.table_name AS referencing_table,
kcu.column_name AS referencing_column,
ccu.table_name AS referenced_table,
ccu.column_name AS referenced_column
FROM information_schema.referential_constraints rc
JOIN information_schema.key_column_usage kcu
ON kcu.constraint_name = rc.constraint_name
AND kcu.constraint_schema = rc.constraint_schema
JOIN information_schema.constraint_column_usage ccu
ON ccu.constraint_name = rc.constraint_name
AND ccu.constraint_schema = rc.constraint_schema
WHERE ccu.table_name IN (
'dashboard_roles','dashboard_slices','dashboard_user',
'report_schedule_user','rls_filter_roles','rls_filter_tables',
'slice_user','sqlatable_user')
AND ccu.column_name = 'id';
```
```sql
-- Lock-window estimate for the two full-rewrite tables.
-- recreate="always" takes ACCESS EXCLUSIVE on the table for the full
-- rewrite. Use heap size combined with your hardware's effective
-- write throughput (~100-200 MB/s on commodity SSD; faster on NVMe)
-- to size the maintenance window. The other six tables use direct
-- ALTER and are dominated by composite-index build time, typically
-- seconds for tables in the low millions of rows.
SELECT
c.relname AS table_name,
pg_size_pretty(pg_relation_size(c.oid)) AS heap_size,
pg_relation_size(c.oid) / 1024 / 1024 AS heap_size_mb,
ROUND(pg_relation_size(c.oid) / 1024 / 1024 / 100.0, 1) AS est_rewrite_seconds_at_100mbs
FROM pg_class c
WHERE c.relname IN ('dashboard_slices', 'report_schedule_user');
```
**Sizing the maintenance window on MySQL.** Equivalent diagnostic queries for MySQL/InnoDB. One important difference from PostgreSQL: InnoDB rebuilds the clustered index on every PK change, so *all eight* tables undergo a full table rebuild on MySQL — not just the two that go through the explicit `recreate="always"` path. The lock-window estimate query below therefore covers all eight tables.
```sql
-- Per-table size, row count, and which migration path each will take.
-- TABLE_ROWS is an InnoDB estimate (analogous to PostgreSQL's reltuples);
-- run SELECT COUNT(*) per table for an exact count if needed.
SELECT
TABLE_NAME AS table_name,
CASE WHEN TABLE_NAME IN ('dashboard_slices', 'report_schedule_user')
THEN 'recreate (explicit, drops UNIQUE)'
ELSE 'direct ALTER (still rebuilds InnoDB clustered index)'
END AS migration_path,
TABLE_ROWS AS estimated_rows,
CONCAT(ROUND((DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024, 1), ' MB') AS total_size,
CONCAT(ROUND(DATA_LENGTH / 1024 / 1024, 1), ' MB') AS heap_size,
CONCAT(ROUND(INDEX_LENGTH / 1024 / 1024, 1), ' MB') AS index_size
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
AND TABLE_NAME IN (
'dashboard_roles', 'dashboard_slices', 'dashboard_user',
'report_schedule_user', 'rls_filter_roles', 'rls_filter_tables',
'slice_user', 'sqlatable_user'
)
ORDER BY (DATA_LENGTH + INDEX_LENGTH) DESC;
```
```sql
-- Aggregated duplicate-row roll-up. Same SQL as the PostgreSQL version
-- (standard SQL); included here for copy-paste convenience.
SELECT 'dashboard_roles' AS t, COUNT(*) AS dup_groups, SUM(c) - COUNT(*) AS rows_dropped
FROM (SELECT COUNT(*) c FROM dashboard_roles GROUP BY dashboard_id, role_id HAVING COUNT(*) > 1) g
UNION ALL SELECT 'dashboard_slices', COUNT(*), SUM(c) - COUNT(*)
FROM (SELECT COUNT(*) c FROM dashboard_slices GROUP BY dashboard_id, slice_id HAVING COUNT(*) > 1) g
UNION ALL SELECT 'dashboard_user', COUNT(*), SUM(c) - COUNT(*)
FROM (SELECT COUNT(*) c FROM dashboard_user GROUP BY user_id, dashboard_id HAVING COUNT(*) > 1) g
UNION ALL SELECT 'report_schedule_user',COUNT(*), SUM(c) - COUNT(*)
FROM (SELECT COUNT(*) c FROM report_schedule_user GROUP BY user_id, report_schedule_id HAVING COUNT(*) > 1) g
UNION ALL SELECT 'rls_filter_roles', COUNT(*), SUM(c) - COUNT(*)
FROM (SELECT COUNT(*) c FROM rls_filter_roles GROUP BY role_id, rls_filter_id HAVING COUNT(*) > 1) g
UNION ALL SELECT 'rls_filter_tables', COUNT(*), SUM(c) - COUNT(*)
FROM (SELECT COUNT(*) c FROM rls_filter_tables GROUP BY table_id, rls_filter_id HAVING COUNT(*) > 1) g
UNION ALL SELECT 'slice_user', COUNT(*), SUM(c) - COUNT(*)
FROM (SELECT COUNT(*) c FROM slice_user GROUP BY user_id, slice_id HAVING COUNT(*) > 1) g
UNION ALL SELECT 'sqlatable_user', COUNT(*), SUM(c) - COUNT(*)
FROM (SELECT COUNT(*) c FROM sqlatable_user GROUP BY user_id, table_id HAVING COUNT(*) > 1) g
ORDER BY rows_dropped DESC;
```
```sql
-- External-FK pre-flight check. KEY_COLUMN_USAGE on MySQL carries
-- both sides of the FK in a single row, so this is simpler than the
-- PostgreSQL version. Should return zero rows on a stock install.
SELECT
CONSTRAINT_NAME,
CONCAT(TABLE_SCHEMA, '.', TABLE_NAME) AS referencing_table,
COLUMN_NAME AS referencing_column,
REFERENCED_TABLE_NAME AS referenced_table,
REFERENCED_COLUMN_NAME AS referenced_column
FROM information_schema.KEY_COLUMN_USAGE
WHERE TABLE_SCHEMA = DATABASE()
AND REFERENCED_TABLE_NAME IN (
'dashboard_roles', 'dashboard_slices', 'dashboard_user',
'report_schedule_user', 'rls_filter_roles', 'rls_filter_tables',
'slice_user', 'sqlatable_user'
)
AND REFERENCED_COLUMN_NAME = 'id';
```
```sql
-- Lock-window estimate for ALL EIGHT tables (InnoDB rebuilds the
-- clustered index on PK change, so even "direct ALTER" is a rewrite).
-- ADD PRIMARY KEY is INPLACE but not LOCK=NONE — it allows concurrent
-- reads but blocks writes. Use heap size combined with your effective
-- rebuild throughput (~100-200 MB/s on commodity SSD; higher on NVMe).
SELECT
TABLE_NAME AS table_name,
CONCAT(ROUND(DATA_LENGTH / 1024 / 1024, 1), ' MB') AS heap_size,
ROUND(DATA_LENGTH / 1024 / 1024, 1) AS heap_size_mb,
ROUND(DATA_LENGTH / 1024 / 1024 / 100.0, 1) AS est_rewrite_seconds_at_100mbs
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
AND TABLE_NAME IN (
'dashboard_roles', 'dashboard_slices', 'dashboard_user',
'report_schedule_user', 'rls_filter_roles', 'rls_filter_tables',
'slice_user', 'sqlatable_user'
)
ORDER BY DATA_LENGTH DESC;
```
**Restoring an old `pg_dump` (or equivalent) against the new schema.** A dump taken before the migration includes `INSERT` statements that populate the now-removed `id` column. Restoring such a dump against the post-migration schema will fail. The supported workaround is to dump only the schema and reference data, then re-create the M:N associations from application data after restore — for example with `pg_dump --exclude-table-data` (or per-table `--exclude-table-data=dashboard_slices` etc.) for the eight junction tables, restore the rest, then run a one-shot script that re-INSERTs `(fk1, fk2)` pairs derived from your application export. Operators who need to restore an old dump verbatim should restore against a pre-migration Superset and then re-run the upgrade.
**Intentional downgrade asymmetry.** The migration's `downgrade()` restores the surrogate `id` column and (for `dashboard_slices` and `report_schedule_user`) the original `UNIQUE (fk1, fk2)` constraint, but it does **not** restore the original `NULL`-allowed state on the FK columns — they remain `NOT NULL`. This is intentional: under SQLAlchemy's `secondary=` semantics, a `NULL` in either FK column of a junction table is meaningless (it cannot participate in either side of the relationship). Operators downgrading are not expected to need this restored. The asymmetry is documented for completeness so that round-trip schema diffs are not mistaken for migration bugs.
**Constraint-name divergence between upgrade and downgrade.** The composite primary key created on upgrade is named `pk_<table>` (Alembic's default for `op.create_primary_key("pk_<table>", ...)`), while the surrogate `id` primary key restored on downgrade is named `<table>_pkey` (PostgreSQL's default convention for `PrimaryKeyConstraint("id")`). The two names alternate so that a round-trip (upgrade → downgrade → upgrade) does not collide on a pre-existing constraint name. Operators using schema-comparison tools (e.g. `pg_diff`, `migra`) against a downgraded database may see this as drift versus a fresh-install schema. It is cosmetic — no application code references either constraint name.
## 6.0.0
- [33055](https://github.com/apache/superset/pull/33055): Upgrades Flask-AppBuilder to 5.0.0. The AUTH_OID authentication type has been deprecated and is no longer available as an option in Flask-AppBuilder. OpenID (OID) is considered a deprecated authentication protocol - if you are using AUTH_OID, you will need to migrate to an alternative authentication method such as OAuth, LDAP, or database authentication before upgrading.
- [34871](https://github.com/apache/superset/pull/34871): Fixed Jest test hanging issue from Ant Design v5 upgrade. MessageChannel is now mocked in test environment to prevent rc-overflow from causing Jest to hang. Test environment only - no production impact.

117
docker-compose-mysql.yml Normal file
View File

@@ -0,0 +1,117 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#
# Compose override that swaps the default Postgres metadata DB for MySQL 8.
# Useful for evaluating dialect-specific behaviour (e.g., DDL-migration
# cost on a deployment whose production metadata DB is MySQL).
#
# Usage:
# docker compose -f docker-compose.yml -f docker-compose-mysql.yml up
# docker compose -f docker-compose.yml -f docker-compose-mysql.yml down
#
# To switch back to Postgres, just drop the second `-f` flag — the MySQL
# data lives in a separate volume (`db_home_mysql`) so neither side is
# corrupted by switching dialects.
#
# Notes:
# - Mirrors the connection settings used by CI's `test-mysql` shard:
# dialect ``mysql+mysqldb``, charset utf8mb4 with binary_prefix.
# - Host port 13306 (configurable via DATABASE_PORT_MYSQL) to avoid
# colliding with a native MySQL install on 3306.
# - The Postgres-specific init scripts under
# docker/docker-entrypoint-initdb.d/ are not mounted (they are
# postgres-only); examples / cypress fixtures still load via
# `superset-init`'s post-startup steps.
# Shared environment override applied to every Superset-side service that
# connects to the metadata DB. ``environment:`` takes precedence over the
# values inherited from the env_file in docker-compose.yml.
x-mysql-env: &mysql-env
DATABASE_DIALECT: mysql+mysqldb
DATABASE_HOST: db
DATABASE_PORT: "3306"
DATABASE_DB: superset
DATABASE_USER: superset
DATABASE_PASSWORD: superset
SQLALCHEMY_DATABASE_URI: "mysql+mysqldb://superset:superset@db:3306/superset?charset=utf8mb4&binary_prefix=true"
# Override the analytics-examples DB connection too. ``EXAMPLES_PORT``
# in docker/.env is hardcoded to 5432 (the Postgres port); without
# this override the examples connection would try MySQL on 5432 and
# fail. The examples user/DB are created by docker/mysql-init/
# examples-init.sql on first MySQL boot.
EXAMPLES_HOST: db
EXAMPLES_PORT: "3306"
EXAMPLES_DB: examples
EXAMPLES_USER: examples
EXAMPLES_PASSWORD: examples
SUPERSET__SQLALCHEMY_EXAMPLES_URI: "mysql+mysqldb://examples:examples@db:3306/examples?charset=utf8mb4&binary_prefix=true"
services:
db:
image: mysql:8.0
environment:
MYSQL_DATABASE: superset
MYSQL_USER: superset
MYSQL_PASSWORD: superset
MYSQL_ROOT_PASSWORD: root
# The original 5432 port mapping is harmless on a MySQL container
# (nothing listens on 5432 inside it) but we add 13306->3306 so the
# MySQL port is reachable from the host without colliding with a
# native MySQL on 3306. Compose merges port lists.
ports:
- "127.0.0.1:${DATABASE_PORT_MYSQL:-13306}:3306"
# Override the init-scripts mount by re-binding the same target path
# to a MySQL-compatible directory. Compose merges volume lists by
# target path; later definitions win on conflict, so this displaces
# the Postgres-specific ``./docker/docker-entrypoint-initdb.d`` mount
# from docker-compose.yml. Without this, MySQL would try to run
# ``cypress-init.sh`` (which invokes ``psql``, not in the MySQL
# image), abort the init phase, and never create the ``examples``
# database. Add the MySQL data volume separately.
volumes:
- db_home_mysql:/var/lib/mysql
- ./docker/mysql-init:/docker-entrypoint-initdb.d
command:
- --default-authentication-plugin=caching_sha2_password
- --character-set-server=utf8mb4
- --collation-server=utf8mb4_0900_ai_ci
healthcheck:
test: ["CMD-SHELL", "mysqladmin ping -h localhost -uroot -proot --silent"]
interval: 5s
timeout: 5s
retries: 20
superset:
environment: *mysql-env
superset-init:
environment: *mysql-env
superset-worker:
environment: *mysql-env
superset-worker-beat:
environment: *mysql-env
superset-node:
environment: *mysql-env
superset-tests-worker:
environment: *mysql-env
volumes:
db_home_mysql:

View File

@@ -61,31 +61,6 @@ services:
volumes:
- ./docker/nginx/nginx.conf:/etc/nginx/nginx.conf:ro
- ./docker/nginx/templates:/etc/nginx/templates:ro
# Wait for the webpack dev server's manifest.json to be served before
# starting nginx. This prevents 404s on static assets at startup. The
# probe targets host.docker.internal so it works regardless of whether
# the dev server runs in the superset-node container
# (BUILD_SUPERSET_FRONTEND_IN_DOCKER=true, the default) or directly on
# the host (BUILD_SUPERSET_FRONTEND_IN_DOCKER=false).
command:
- /bin/bash
- -c
- |
url="http://host.docker.internal:9000/static/assets/manifest.json"
max_attempts=150 # ~5 minutes at 2s intervals
echo "Waiting for webpack dev server at $url..."
attempt=0
until curl -sf --max-time 5 -o /dev/null "$url"; do
attempt=$((attempt + 1))
if [ "$attempt" -ge "$max_attempts" ]; then
echo "ERROR: webpack dev server did not serve $url after $max_attempts attempts (~5 minutes)." >&2
echo "Is the dev server running? With BUILD_SUPERSET_FRONTEND_IN_DOCKER=false you must start it on the host (e.g. 'npm run dev' in superset-frontend)." >&2
exit 1
fi
sleep 2
done
echo "Webpack dev server is ready; starting nginx."
exec nginx -g 'daemon off;'
redis:
image: redis:7

View File

@@ -80,23 +80,7 @@ case "${1}" in
;;
app)
echo "Starting web app (using development server)..."
# Environment-based debugger control for security
# Only enable Werkzeug interactive debugger when explicitly requested
# Modern Werkzeug (3.0+) includes PIN protection, but defense-in-depth approach
# Override FLASK_DEBUG so the effective state matches SUPERSET_DEBUG_ENABLED even
# when FLASK_DEBUG=true is inherited from docker/.env or .flaskenv
if [[ "${SUPERSET_DEBUG_ENABLED:-}" == "true" ]]; then
export FLASK_DEBUG=1
DEBUGGER_FLAG="--debugger"
echo " ⚠️ Werkzeug debugger enabled (requires PIN for /console access)"
else
export FLASK_DEBUG=0
DEBUGGER_FLAG="--no-debugger"
echo " 🔒 Werkzeug debugger disabled (set SUPERSET_DEBUG_ENABLED=true to enable)"
fi
flask run -p $PORT --reload $DEBUGGER_FLAG --host=0.0.0.0 --exclude-patterns "*/node_modules/*:*/.venv/*:*/build/*:*/__pycache__/*:*/superset-frontend/*"
flask run -p $PORT --reload --debugger --host=0.0.0.0 --exclude-patterns "*/node_modules/*:*/.venv/*:*/build/*:*/__pycache__/*:*/superset-frontend/*"
;;
app-gunicorn)
echo "Starting web app..."

View File

@@ -0,0 +1,32 @@
-- Licensed to the Apache Software Foundation (ASF) under one
-- or more contributor license agreements. See the NOTICE file
-- distributed with this work for additional information
-- regarding copyright ownership. The ASF licenses this file
-- to you under the Apache License, Version 2.0 (the
-- "License"); you may not use this file except in compliance
-- with the License. You may obtain a copy of the License at
--
-- http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing,
-- software distributed under the License is distributed on an
-- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-- KIND, either express or implied. See the License for the
-- specific language governing permissions and limitations
-- under the License.
-- MySQL counterpart to docker/docker-entrypoint-initdb.d/examples-init.sh.
-- Creates the analytics-examples database and user that Superset's
-- ``load-examples`` command writes to. Mounted by docker-compose-mysql.yml
-- at /docker-entrypoint-initdb.d/ so the MySQL image's first-boot
-- entrypoint runs it automatically. (The Postgres init scripts under
-- docker/docker-entrypoint-initdb.d/ are NOT mounted on the MySQL
-- service — they invoke psql, which doesn't exist in the MySQL image.)
CREATE DATABASE IF NOT EXISTS examples
CHARACTER SET utf8mb4
COLLATE utf8mb4_0900_ai_ci;
CREATE USER IF NOT EXISTS 'examples'@'%' IDENTIFIED BY 'examples';
GRANT ALL PRIVILEGES ON examples.* TO 'examples'@'%';
FLUSH PRIVILEGES;

View File

@@ -18,8 +18,6 @@
# Configuration for docker-compose-light.yml - disables Redis and uses minimal services
# Import all settings from the main config first
import os
from flask_caching.backends.filesystemcache import FileSystemCache
from superset_config import * # noqa: F403
@@ -38,32 +36,3 @@ THUMBNAIL_CACHE_CONFIG = CACHE_CONFIG
# Disable Celery entirely for lightweight mode
CELERY_CONFIG = None # type: ignore[assignment,misc]
# Honor SUPERSET_FEATURE_<NAME> env vars on top of any flags inherited from
# superset_config. Lets local dev/e2e enable features (e.g. EMBEDDED_SUPERSET)
# without editing shipped config files. Only the literal string "true"
# (case-insensitive) is treated as enabled — "1"/"yes"/"on" are not, matching
# the strict-string convention used elsewhere in Superset's env parsing.
FEATURE_FLAGS = {
**FEATURE_FLAGS, # noqa: F405
**{
name[len("SUPERSET_FEATURE_") :]: value.strip().lower() == "true"
for name, value in os.environ.items()
if name.startswith("SUPERSET_FEATURE_")
},
}
if os.environ.get("SUPERSET_FEATURE_EMBEDDED_SUPERSET", "").strip().lower() == "true":
# Disable Talisman so /embedded/<uuid> doesn't return X-Frame-Options:SAMEORIGIN.
# Without this, browsers refuse to render Superset inside an iframe from a
# different origin (i.e. the embedded SDK use case). Production/CI configures
# Talisman with explicit `frame-ancestors`; for the lightweight local stack we
# just turn it off.
TALISMAN_ENABLED = False
# Guest tokens (used by the embedded SDK) inherit the "Public" role's perms.
# Out of the box Public has zero perms, so embedded dashboards immediately fail
# their first call (`/api/v1/me/roles/`) with 403. Mirror Public to Gamma —
# the standard read-only viewer role — so the embedded flow can authenticate
# and load dashboard data in local dev.
PUBLIC_ROLE_LIKE = "Gamma"

View File

@@ -1 +1 @@
v24.16.0
v22.22.0

View File

@@ -36,9 +36,9 @@ Screenshots will be taken but no messages actually sent as long as `ALERT_REPORT
#### In your `Dockerfile`
You'll need to extend the Superset image to include a headless browser. Your options include:
- Use Playwright with Chromium: this is the recommended approach as of version 4.1.x or greater. Playwright always uses Chromium — the `WEBDRIVER_TYPE` config setting has no effect when Playwright is active. A working example of a Dockerfile that installs these tools is provided under "Building your own production Docker image" on the [Docker Builds](/admin-docs/installation/docker-builds#building-your-own-production-docker-image) page. Enable the `PLAYWRIGHT_REPORTS_AND_THUMBNAILS` feature flag in your config to activate it.
- Use Firefox (Selenium): you'll need to install geckodriver and Firefox. Set `WEBDRIVER_TYPE` to `"firefox"` in your `superset_config.py`.
- Use Chrome (Selenium): you'll need to install Chrome. Set `WEBDRIVER_TYPE` to `"chrome"` in your `superset_config.py`.
- Use Playwright with Chrome: this is the recommended approach as of version 4.1.x or greater. A working example of a Dockerfile that installs these tools is provided under "Building your own production Docker image" on the [Docker Builds](/admin-docs/installation/docker-builds#building-your-own-production-docker-image) page. Read the code comments there as you'll also need to change a feature flag in your config.
- Use Firefox: you'll need to install geckodriver and Firefox.
- Use Chrome without Playwright: you'll need to install Chrome and set the value of `WEBDRIVER_TYPE` to `"chrome"` in your `superset_config.py`.
In Superset versions &lt;=4.0x, users installed Firefox or Chrome and that was documented here.

View File

@@ -86,39 +86,6 @@ instead requires a cachelib object.
See [Async Queries via Celery](/admin-docs/configuration/async-queries-celery) for details.
## Celery beat
Superset has a Celery task that will periodically warm up the cache based on different strategies.
To use it, add the following to your `superset_config.py`:
```python
from celery.schedules import crontab
from superset.config import CeleryConfig
# User that will be used to authenticate and render dashboards for cache warmup
SUPERSET_CACHE_WARMUP_USER = "user_with_permission_to_dashboards"
# Extend the default CeleryConfig to add cache warmup schedule
class CustomCeleryConfig(CeleryConfig):
beat_schedule = {
**CeleryConfig.beat_schedule,
'cache-warmup-hourly': {
'task': 'cache-warmup',
'schedule': crontab(minute=0, hour='*'), # hourly
'kwargs': {
'strategy_name': 'top_n_dashboards',
'top_n': 5,
'since': '7 days ago',
},
},
}
CELERY_CONFIG = CustomCeleryConfig
```
This will cache the top 5 most popular dashboards every hour. For other
strategies, check the `superset/tasks/cache.py` file.
## Caching Thumbnails
This is an optional feature that can be turned on by activating its [feature flag](/admin-docs/configuration/configuring-superset#feature-flags) on config:

View File

@@ -157,15 +157,8 @@ superset load_examples
superset init
# To start a development web server on port 8088, use -p to bind to another port
superset run -p 8088 --with-threads --reload
# For debugging with interactive console (⚠️ localhost only)
# superset run -p 8088 --with-threads --reload --debugger
superset run -p 8088 --with-threads --reload --debugger
```
:::warning Security Note
The `--debugger` flag enables Werkzeug's interactive console at `/console`. Only use this for local development and never bind to `0.0.0.0` or expose the server to networks when debugging is enabled.
:::
If everything worked, you should be able to navigate to `hostname:port` in your browser (e.g.
locally by default at `localhost:8088`) and login using the username and password you created.

View File

@@ -157,15 +157,8 @@ superset load_examples
superset init
# To start a development web server on port 8088, use -p to bind to another port
superset run -p 8088 --with-threads --reload
# For debugging with interactive console (⚠️ localhost only)
# superset run -p 8088 --with-threads --reload --debugger
superset run -p 8088 --with-threads --reload --debugger
```
:::warning Security Note
The `--debugger` flag enables Werkzeug's interactive console at `/console`. Only use this for local development and never bind to `0.0.0.0` or expose the server to networks when debugging is enabled.
:::
If everything worked, you should be able to navigate to `hostname:port` in your browser (e.g.
locally by default at `localhost:8088`) and login using the username and password you created.

View File

@@ -88,6 +88,7 @@ using our `docker compose` constructs to support production-type use-cases. For
environments, we recommend using [minikube](https://minikube.sigs.k8s.io/docs/start/) along
our [installing on k8s](https://superset.apache.org/docs/installation/running-on-kubernetes)
documentation.
configured to be secure.
:::
### Supported environment variables
@@ -102,8 +103,6 @@ Affecting the Docker build process:
save some precious time on startup by `SUPERSET_LOAD_EXAMPLES=no docker compose up`
- **SUPERSET_LOG_LEVEL (default=info)**: Can be set to debug, info, warning, error, critical
for more verbose logging
- **SUPERSET_DEBUG_ENABLED (default=false)**: Enable Werkzeug debugger with interactive console.
Set to `true` for debugging: `SUPERSET_DEBUG_ENABLED=true docker compose up`
For more env vars that affect your configuration, see this
[superset_config.py](https://github.com/apache/superset/blob/master/docker/pythonpath_dev/superset_config.py)

View File

@@ -335,92 +335,6 @@ npm run build-translation
pybabel compile -d superset/translations
```
### Backfilling missing translations with AI
For languages with many untranslated strings, the repo includes a script that
uses Claude AI to generate draft translations for any missing entries. All
AI-generated strings are marked `#, fuzzy` and tagged with an attribution
comment so that human reviewers know they need to be checked before merging.
#### Prerequisites
```bash
pip install -r superset/translations/requirements.txt
```
Claude Code must be installed and authenticated (`claude --version` should
work). The script calls `claude -p` internally — no separate API key is needed.
#### Step 1 — Build the translation index
The index captures every already-translated string in every language and
serves as cross-language context for the AI. Rebuild it whenever `.po` files
change significantly:
```bash
python scripts/translations/build_translation_index.py
# Writes: superset/translations/translation_index.json
```
#### Step 2 — Preview with a dry run
Check what would be translated without writing anything:
```bash
python scripts/translations/backfill_po.py --lang fr --limit 20 --dry-run
```
Output shows each string, its translation, and a context tag:
- No tag — 3+ reference languages available (high confidence)
- `[ctx:N]` — only N other languages have this string (lower confidence)
- `[ctx:0]` — no other language has this string yet; English alone used
#### Step 3 — Run the backfill
```bash
python scripts/translations/backfill_po.py --lang fr
```
Options:
| Flag | Default | Description |
|------|---------|-------------|
| `--lang LANG` | required | ISO language code (`fr`, `de`, `ja`, …) |
| `--batch-size N` | 50 | Strings per Claude request |
| `--limit N` | unlimited | Stop after N entries |
| `--min-context N` | 0 | Skip entries with fewer than N reference translations |
| `--model MODEL` | `claude-sonnet-4-6` | Claude model to use |
| `--dry-run` | off | Print without writing |
| `--no-fuzzy` | off | Don't mark entries as fuzzy |
Use `--min-context 2` to skip strings that have fewer than 2 reference
translations in other languages. Those strings are more likely to be ambiguous
(short labels, UI fragments) where the correct meaning can't be inferred
without additional context.
#### Step 4 — Review and commit
Open the target `.po` file and search for `fuzzy`. For each generated entry:
1. Verify the translation is correct for the UI context.
2. Remove the `# Machine-translated via backfill_po.py` comment and the
`#, fuzzy` flag line once you are satisfied.
3. If the translation is wrong, correct the `msgstr` before removing the flag.
4. Commit the `.po` file — do **not** commit `translation_index.json` (it is
gitignored and regenerated locally).
#### Running via npm
From `superset-frontend/`:
```bash
# Rebuild index
npm run translations:build-index
# Backfill (pass arguments after --)
npm run translations:backfill -- --lang fr --dry-run
```
## Linting
### Python

View File

@@ -917,23 +917,6 @@ const config: Config = {
footer: {
links: [],
copyright: `
<div class="footer__social-links">
<a href="https://bit.ly/join-superset-slack" target="_blank" rel="noopener noreferrer" title="Join us on Slack" aria-label="Slack">
<img src="/img/community/slack-symbol.svg" alt="Slack" />
</a>
<a href="https://x.com/apachesuperset" target="_blank" rel="noopener noreferrer" title="Follow us on X" aria-label="X">
<img src="/img/community/x-symbol.svg" alt="X" />
</a>
<a href="https://www.linkedin.com/company/apache-superset" target="_blank" rel="noopener noreferrer" title="Follow us on LinkedIn" aria-label="LinkedIn">
<img src="/img/community/linkedin-symbol.svg" alt="LinkedIn" />
</a>
<a href="https://bsky.app/profile/apachesuperset.bsky.social" target="_blank" rel="noopener noreferrer" title="Follow us on Bluesky" aria-label="Bluesky">
<img src="/img/community/bluesky-symbol.svg" alt="Bluesky" />
</a>
<a href="https://reddit.com/r/apache-superset" target="_blank" rel="noopener noreferrer" title="Follow us on Reddit" aria-label="Reddit">
<img src="/img/community/reddit-symbol.svg" alt="Reddit" />
</a>
</div>
<div class="footer__ci-services">
<span>CI powered by</span>
<a href="https://www.netlify.com/" target="_blank" rel="nofollow noopener noreferrer"><img src="/img/netlify.png" alt="Netlify" title="Netlify - Deploy Previews" /></a>

View File

@@ -1,6 +1,6 @@
{
"copyright": {
"message": "\n <div class=\"footer__social-links\">\n <a href=\"https://bit.ly/join-superset-slack\" target=\"_blank\" rel=\"noopener noreferrer\" title=\"Join us on Slack\" aria-label=\"Slack\">\n <img src=\"/img/community/slack-symbol.svg\" alt=\"Slack\" />\n </a>\n <a href=\"https://x.com/apachesuperset\" target=\"_blank\" rel=\"noopener noreferrer\" title=\"Follow us on X\" aria-label=\"X\">\n <img src=\"/img/community/x-symbol.svg\" alt=\"X\" />\n </a>\n <a href=\"https://www.linkedin.com/company/apache-superset\" target=\"_blank\" rel=\"noopener noreferrer\" title=\"Follow us on LinkedIn\" aria-label=\"LinkedIn\">\n <img src=\"/img/community/linkedin-symbol.svg\" alt=\"LinkedIn\" />\n </a>\n <a href=\"https://bsky.app/profile/apachesuperset.bsky.social\" target=\"_blank\" rel=\"noopener noreferrer\" title=\"Follow us on Bluesky\" aria-label=\"Bluesky\">\n <img src=\"/img/community/bluesky-symbol.svg\" alt=\"Bluesky\" />\n </a>\n <a href=\"https://reddit.com/r/apache-superset\" target=\"_blank\" rel=\"noopener noreferrer\" title=\"Follow us on Reddit\" aria-label=\"Reddit\">\n <img src=\"/img/community/reddit-symbol.svg\" alt=\"Reddit\" />\n </a>\n </div>\n <div class=\"footer__ci-services\">\n <span>CI powered by</span>\n <a href=\"https://www.netlify.com/\" target=\"_blank\" rel=\"nofollow noopener noreferrer\"><img src=\"/img/netlify.png\" alt=\"Netlify\" title=\"Netlify - Deploy Previews\" /></a>\n </div>\n <p>Copyright © 2026,\n The <a href=\"https://www.apache.org/\" target=\"_blank\" rel=\"noreferrer\">Apache Software Foundation</a>,\n Licensed under the Apache <a href=\"https://apache.org/licenses/LICENSE-2.0\" target=\"_blank\" rel=\"noreferrer\">License</a>.</p>\n <p><small>Apache Superset, Apache, Superset, the Superset logo, and the Apache feather logo are either registered trademarks or trademarks of The Apache Software Foundation. All other products or name brands are trademarks of their respective holders, including The Apache Software Foundation.\n <a href=\"https://www.apache.org/\" target=\"_blank\">Apache Software Foundation</a> resources</small></p>\n <img class=\"footer__divider\" src=\"/img/community/line.png\" alt=\"Divider\" />\n <p>\n <small>\n <a href=\"/admin-docs/security/\" target=\"_blank\" rel=\"noreferrer\">Security</a>&nbsp;|&nbsp;\n <a href=\"https://www.apache.org/foundation/sponsorship.html\" target=\"_blank\" rel=\"noreferrer\">Donate</a>&nbsp;|&nbsp;\n <a href=\"https://www.apache.org/foundation/thanks.html\" target=\"_blank\" rel=\"noreferrer\">Thanks</a>&nbsp;|&nbsp;\n <a href=\"https://apache.org/events/current-event\" target=\"_blank\" rel=\"noreferrer\">Events</a>&nbsp;|&nbsp;\n <a href=\"https://apache.org/licenses/\" target=\"_blank\" rel=\"noreferrer\">License</a>&nbsp;|&nbsp;\n <a href=\"https://privacy.apache.org/policies/privacy-policy-public.html\" target=\"_blank\" rel=\"noreferrer\">Privacy</a>\n </small>\n </p>\n <!-- telemetry/analytics pixel: -->\n <img referrerPolicy=\"no-referrer-when-downgrade\" src=\"https://static.scarf.sh/a.png?x-pxid=39ae6855-95fc-4566-86e5-360d542b0a68\" />\n ",
"message": "\n <div class=\"footer__ci-services\">\n <span>CI powered by</span>\n <a href=\"https://www.netlify.com/\" target=\"_blank\" rel=\"nofollow noopener noreferrer\"><img src=\"/img/netlify.png\" alt=\"Netlify\" title=\"Netlify - Deploy Previews\" /></a>\n </div>\n <p>Copyright © 2026,\n The <a href=\"https://www.apache.org/\" target=\"_blank\" rel=\"noreferrer\">Apache Software Foundation</a>,\n Licensed under the Apache <a href=\"https://apache.org/licenses/LICENSE-2.0\" target=\"_blank\" rel=\"noreferrer\">License</a>.</p>\n <p><small>Apache Superset, Apache, Superset, the Superset logo, and the Apache feather logo are either registered trademarks or trademarks of The Apache Software Foundation. All other products or name brands are trademarks of their respective holders, including The Apache Software Foundation.\n <a href=\"https://www.apache.org/\" target=\"_blank\">Apache Software Foundation</a> resources</small></p>\n <img class=\"footer__divider\" src=\"/img/community/line.png\" alt=\"Divider\" />\n <p>\n <small>\n <a href=\"/docs/security/\" target=\"_blank\" rel=\"noreferrer\">Security</a>&nbsp;|&nbsp;\n <a href=\"https://www.apache.org/foundation/sponsorship.html\" target=\"_blank\" rel=\"noreferrer\">Donate</a>&nbsp;|&nbsp;\n <a href=\"https://www.apache.org/foundation/thanks.html\" target=\"_blank\" rel=\"noreferrer\">Thanks</a>&nbsp;|&nbsp;\n <a href=\"https://apache.org/events/current-event\" target=\"_blank\" rel=\"noreferrer\">Events</a>&nbsp;|&nbsp;\n <a href=\"https://apache.org/licenses/\" target=\"_blank\" rel=\"noreferrer\">License</a>&nbsp;|&nbsp;\n <a href=\"https://privacy.apache.org/policies/privacy-policy-public.html\" target=\"_blank\" rel=\"noreferrer\">Privacy</a>\n </small>\n </p>\n <!-- telemetry/analytics pixel: -->\n <img referrerPolicy=\"no-referrer-when-downgrade\" src=\"https://static.scarf.sh/a.png?x-pxid=39ae6855-95fc-4566-86e5-360d542b0a68\" />\n ",
"description": "The footer copyright"
}
}

View File

@@ -25,17 +25,9 @@
command = "yarn install && yarn build"
# Output directory (relative to base)
publish = "build"
# Skip builds when no docs changes (exit 0 = skip, non-zero = build).
# Checks for changes in docs/ and README.md (which gets pulled into docs).
#
# $CACHED_COMMIT_REF is the last *deployed* commit. On a PR's first build it
# is empty, so the original `git diff` errored and Netlify fell back to
# building -- which is why every PR built a docs preview once even with no
# docs changes. When it is empty we instead diff the whole branch against its
# merge-base with master, so non-docs PRs are skipped from the very first
# build. Subsequent builds (and the master production build) keep the cheaper
# incremental $CACHED_COMMIT_REF diff. Any failure exits non-zero -> build.
ignore = 'if [ -n "$CACHED_COMMIT_REF" ]; then git diff --quiet "$CACHED_COMMIT_REF" "$COMMIT_REF" -- . ../README.md; else git fetch origin master --depth=100 >/dev/null 2>&1; git diff --quiet "$(git merge-base origin/master "$COMMIT_REF" 2>/dev/null || echo origin/master)" "$COMMIT_REF" -- . ../README.md; fi'
# Skip builds when no docs changes (exit 0 = skip, exit 1 = build)
# Checks for changes in docs/ and README.md (which gets pulled into docs)
ignore = "git diff --quiet $CACHED_COMMIT_REF $COMMIT_REF -- . ../README.md"
[build.environment]
# Node version matching docs/.nvmrc

View File

@@ -43,7 +43,7 @@
"version:remove:components": "node scripts/manage-versions.mjs remove components"
},
"dependencies": {
"@ant-design/icons": "^6.2.5",
"@ant-design/icons": "^6.2.3",
"@docusaurus/core": "^3.10.1",
"@docusaurus/faster": "^3.10.1",
"@docusaurus/plugin-client-redirects": "^3.10.1",
@@ -70,13 +70,13 @@
"@storybook/preview-api": "^8.6.18",
"@storybook/theming": "^8.6.15",
"@superset-ui/core": "^0.20.4",
"@swc/core": "^1.15.40",
"@swc/core": "^1.15.33",
"antd": "^6.4.3",
"baseline-browser-mapping": "^2.10.34",
"caniuse-lite": "^1.0.30001797",
"baseline-browser-mapping": "^2.10.30",
"caniuse-lite": "^1.0.30001793",
"docusaurus-plugin-openapi-docs": "^5.0.2",
"docusaurus-theme-openapi-docs": "^5.0.2",
"js-yaml": "^4.2.0",
"js-yaml": "^4.1.1",
"js-yaml-loader": "^1.2.2",
"json-bigint": "^1.0.0",
"prism-react-renderer": "^2.4.1",
@@ -101,16 +101,16 @@
"@types/js-yaml": "^4.0.9",
"@types/react": "^19.1.8",
"@typescript-eslint/eslint-plugin": "^8.59.3",
"@typescript-eslint/parser": "^8.60.1",
"@typescript-eslint/parser": "^8.59.3",
"eslint": "^9.39.2",
"eslint-config-prettier": "^10.1.8",
"eslint-plugin-prettier": "^5.5.6",
"eslint-plugin-prettier": "^5.5.5",
"eslint-plugin-react": "^7.37.5",
"globals": "^17.6.0",
"prettier": "^3.8.3",
"typescript": "~6.0.3",
"typescript-eslint": "^8.60.1",
"webpack": "^5.107.2"
"typescript-eslint": "^8.59.3",
"webpack": "^5.106.2"
},
"browserslist": {
"production": [
@@ -128,13 +128,7 @@
"react-redux": "^9.2.0",
"@reduxjs/toolkit": "^2.5.0",
"baseline-browser-mapping": "^2.9.19",
"swagger-client": "3.37.3",
"lodash": "4.18.1",
"lodash-es": "4.18.1",
"yaml": "1.10.3",
"uuid": "11.1.1",
"serialize-javascript": "7.0.5",
"d3-color": "3.1.0"
"swagger-client": "3.37.3"
},
"packageManager": "yarn@1.22.22+sha1.ac34549e6aa8e7ead463a7407e1c7390f61a6610"
}

View File

@@ -206,26 +206,12 @@ async function downloadBadge(url, staticDir) {
badgeCache.set(url, webPath);
return webPath;
} catch (error) {
// Soft fallback: keep the original remote URL in the rendered output
// so the badge still appears for readers, and the docs build continues.
// External badge services (notably img.shields.io) rate-limit CI IPs
// aggressively, and a transient fetch failure shouldn't take the whole
// docs build down with it. Set REMARK_BADGES_STRICT=true to opt back
// into hard-fail-the-build behavior (e.g. for release builds where you
// want to catch genuinely broken badge URLs).
if (process.env.REMARK_BADGES_STRICT === 'true') {
throw new Error(
`[remark-localize-badges] Failed to download badge: ${url}\n` +
`Error: ${error.message}\n` +
`Build cannot continue with broken badges (REMARK_BADGES_STRICT=true).`,
);
}
console.warn(
`[remark-localize-badges] Could not localize ${url} ` +
`(${error.message}); falling back to remote URL.`,
// Fail the build on badge download failure
throw new Error(
`[remark-localize-badges] Failed to download badge: ${url}\n` +
`Error: ${error.message}\n` +
`Build cannot continue with broken badges. Please fix the badge URL or remove it.`,
);
badgeCache.set(url, url);
return url;
} finally {
// Clean up the in-flight tracker
inFlightDownloads.delete(url);

View File

@@ -1,136 +0,0 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
# SIP: Authenticated encryption (AES-GCM) for app-encrypted fields
## [DRAFT — proposal for discussion]
This document is a draft proposal accompanying the code in this PR. It is
intended to seed the formal SIP discussion. The code here ships the
backward-compatible engine selection **and** the re-encryption migrator
(Phases 12 below); both are opt-in and change nothing for existing installs by
default. Flipping the default for fresh installs (Phase 3) remains future work.
## Motivation
Superset app-encrypts a number of sensitive fields before persisting them to
the metadata database, including:
- database connection passwords and `encrypted_extra` (`superset/models/core.py`),
- SSH tunnel credentials — password, private key, private-key password
(`superset/databases/ssh_tunnel/models.py`),
- OAuth2 tokens and other secrets stored via `EncryptedType`.
These fields are encrypted with `sqlalchemy_utils.EncryptedType`, which
**defaults to `AesEngine` (AES-CBC)**. AES-CBC provides confidentiality but is
**unauthenticated**: it has no integrity tag. An attacker with write access to
the ciphertext (e.g. direct metadata-DB access, a backup, or a compromised
replica) can perform **bit-flipping / chosen-ciphertext manipulation** to
silently alter the decrypted plaintext of a secret without detection.
`AesGcmEngine` (AES-GCM) is authenticated encryption: tampering causes
decryption to fail loudly rather than yielding attacker-influenced plaintext.
Using authenticated encryption for secrets at rest is an ASVS L1 expectation
(11.3.2 / cryptography best practice).
`config.py` already documents that operators *can* switch to GCM by writing a
custom `AbstractEncryptedFieldAdapter`, but:
1. it is opt-in, undocumented as a security recommendation, and easy to miss;
2. there is **no migration path** — flipping the engine on a populated database
makes every existing secret undecryptable, because GCM ciphertext is not
format-compatible with CBC.
## Proposed change
A three-part change, delivered incrementally so existing deployments are never
broken:
### Phase 1 — engine selection (this PR)
- Add a `SQLALCHEMY_ENCRYPTED_FIELD_ENGINE` config (`"aes"` | `"aes-gcm"`),
**defaulting to `"aes"`** (no behavior change for existing installs).
- Teach the default `SQLAlchemyUtilsAdapter` to honor it (an explicit `engine`
kwarg still wins, so the migrator can pin an engine).
- This lets **new** deployments choose AES-GCM from day one with a one-line
config, instead of writing a custom adapter.
### Phase 2 — CBC→GCM re-encryption migrator (this PR)
The existing `SecretsMigrator` (previously only used for `SECRET_KEY` rotation)
gains an **engine migration** mode that:
1. discovers every `EncryptedType` column (via `discover_encrypted_fields()`),
2. decrypts each value with the **source** engine (AES-CBC) under the current
`SECRET_KEY`,
3. re-encrypts with the **target** engine (AES-GCM),
4. runs transactionally per the existing all-or-nothing semantics, and is
idempotent per column (already-migrated values are skipped), so a run can be
safely repeated or resumed.
Exposed via a new `--engine` option on the existing CLI command:
`superset re-encrypt-secrets --engine aes-gcm`, runnable by operators with a DB
backup in hand. The `SECRET_KEY` is unchanged; an engine change and a key
rotation can also be combined (pass `--previous_secret_key` as well).
### Phase 3 — flip the default for new installs
Once the migrator and docs are in place, change the default to `"aes-gcm"` for
**fresh** installs only (e.g. keyed off an empty metadata DB / documented in
`UPDATING.md`), keeping existing installs on `"aes"` until they run Phase 2.
## New or changed public interfaces
- New config: `SQLALCHEMY_ENCRYPTED_FIELD_ENGINE: Literal["aes", "aes-gcm"]`.
- New (Phase 2) CLI: `superset re-encrypt-secrets --engine <name>`.
- No schema changes; ciphertext format changes per migrated column.
## Migration plan and compatibility
- **Backward compatible by default.** Phase 1 changes nothing unless the
operator opts in.
- Switching an existing deployment to `"aes-gcm"` **without** running the Phase
2 migrator will make existing secrets undecryptable — this is called out in
the config comment and must be in `UPDATING.md`.
- Recommended operator runbook: take a metadata-DB backup → run
`re-encrypt-secrets --engine aes-gcm` → set
`SQLALCHEMY_ENCRYPTED_FIELD_ENGINE = "aes-gcm"` → restart → re-run
`re-encrypt-secrets --engine aes-gcm` once more to sweep up any secrets a live
instance wrote as AES-CBC during the cutover window. The canonical, more
detailed version of this runbook lives in `UPDATING.md`; this is a summary.
- `AesEngine` allows queryability over encrypted fields; AES-GCM does not.
Any code that filters/queries on an encrypted column directly must be audited
before Phase 3 (none is expected, but it must be verified).
## Rejected alternatives
- **Flip the default immediately.** Rejected: bricks every existing
deployment's secrets with no migration path.
- **Document-only (custom adapter).** Status quo; high friction and no
migration tooling — most operators will never do it.
## Open questions
- GCM→CBC rollback (for operators who need queryability) already works via the
same command (`re-encrypt-secrets --engine aes`), since the migrator is
engine-symmetric. Should rollback be documented as a supported path or
discouraged?
- The migrator already supports a concurrent `SECRET_KEY` rotation + engine
change in a single pass (pass `--previous_secret_key` alongside `--engine`).
Is that combination worth calling out in the operator docs, or kept advanced?

View File

@@ -260,45 +260,10 @@ a > span > svg {
.footer {
position: relative;
padding-top: 130px;
padding-top: 90px;
font-size: 15px;
}
.footer__social-links {
background-color: #173036;
position: absolute;
top: 52px;
left: 0;
width: 100%;
padding: 10px 0;
display: flex;
align-items: center;
justify-content: center;
gap: 24px;
}
.footer__social-links a {
display: inline-flex;
align-items: center;
transition: opacity 0.2s, transform 0.2s;
}
.footer__social-links a:hover {
opacity: 0.8;
transform: scale(1.1);
}
.footer__social-links img {
height: 24px;
width: 24px;
/* The brand SVGs ship in their native colors (e.g. Slack's dark aubergine,
X's near-black), which disappear on the dark footer. Render them all as
uniform white silhouettes. The icons are single-path glyphs whose
counters (the LinkedIn "in", Slack gaps, Reddit face) are transparent
cut-outs, so they stay legible against the footer background. */
filter: brightness(0) invert(1);
}
.footer__ci-services {
background-color: #0d3e49;
color: #e1e1e1;
@@ -344,21 +309,6 @@ a > span > svg {
}
@media only screen and (max-width: 996px) {
.footer {
padding-top: 120px;
}
.footer__social-links {
top: 44px;
gap: 20px;
padding: 8px 16px;
}
.footer__social-links img {
height: 20px;
width: 20px;
}
.footer__ci-services {
gap: 12px;
padding: 10px 16px;

View File

@@ -1,21 +0,0 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" width="40" height="40" fill="#FF4500">
<path d="M12 0A12 12 0 0 0 0 12a12 12 0 0 0 12 12 12 12 0 0 0 12-12A12 12 0 0 0 12 0zm5.01 4.744c.688 0 1.25.561 1.25 1.249a1.25 1.25 0 0 1-2.498.056l-2.597-.547-.8 3.747c1.824.07 3.48.632 4.674 1.488.308-.309.73-.491 1.207-.491.968 0 1.754.786 1.754 1.754 0 .716-.435 1.333-1.01 1.614a3.111 3.111 0 0 1 .042.52c0 2.694-3.13 4.87-7.004 4.87-3.874 0-7.004-2.176-7.004-4.87 0-.183.015-.366.043-.534A1.748 1.748 0 0 1 4.028 12c0-.968.786-1.754 1.754-1.754.463 0 .898.196 1.207.49 1.207-.883 2.878-1.43 4.744-1.487l.885-4.182a.342.342 0 0 1 .14-.197.35.35 0 0 1 .238-.042l2.906.617a1.214 1.214 0 0 1 1.108-.701zM9.25 12c-.688 0-1.25.561-1.25 1.25 0 .687.562 1.248 1.25 1.248.687 0 1.248-.561 1.248-1.249 0-.688-.561-1.249-1.249-1.249zm5.5 0c-.687 0-1.248.561-1.248 1.25 0 .687.561 1.248 1.249 1.248.688 0 1.249-.561 1.249-1.249 0-.687-.562-1.249-1.25-1.249zm-5.466 3.99a.327.327 0 0 0-.231.094.33.33 0 0 0 0 .463c.842.842 2.484.913 2.961.913.477 0 2.105-.056 2.961-.913a.361.361 0 0 0 .029-.463.33.33 0 0 0-.464 0c-.547.533-1.684.73-2.512.73-.828 0-1.979-.196-2.512-.73a.326.326 0 0 0-.232-.095z"/>
</svg>

Before

Width:  |  Height:  |  Size: 1.9 KiB

View File

@@ -1,21 +0,0 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" width="40" height="40" fill="#4A154B">
<path d="M5.042 15.165a2.528 2.528 0 0 1-2.52 2.523A2.528 2.528 0 0 1 0 15.165a2.527 2.527 0 0 1 2.522-2.52h2.52v2.52zm1.271 0a2.527 2.527 0 0 1 2.521-2.52 2.527 2.527 0 0 1 2.521 2.52v6.313A2.528 2.528 0 0 1 8.834 24a2.528 2.528 0 0 1-2.521-2.522v-6.313zM8.834 5.042a2.528 2.528 0 0 1-2.521-2.52A2.528 2.528 0 0 1 8.834 0a2.528 2.528 0 0 1 2.521 2.522v2.52H8.834zm0 1.271a2.528 2.528 0 0 1 2.521 2.521 2.528 2.528 0 0 1-2.521 2.521H2.522A2.528 2.528 0 0 1 0 8.834a2.528 2.528 0 0 1 2.522-2.521h6.312zm10.124 2.521a2.528 2.528 0 0 1 2.522-2.521A2.528 2.528 0 0 1 24 8.834a2.528 2.528 0 0 1-2.52 2.521h-2.522V8.834zm-1.271 0a2.528 2.528 0 0 1-2.521 2.521 2.528 2.528 0 0 1-2.521-2.521V2.522A2.528 2.528 0 0 1 15.166 0a2.528 2.528 0 0 1 2.521 2.522v6.312zm-2.521 10.124a2.528 2.528 0 0 1 2.521 2.522A2.528 2.528 0 0 1 15.166 24a2.528 2.528 0 0 1-2.521-2.52v-2.522h2.521zm0-1.271a2.528 2.528 0 0 1-2.521-2.521 2.528 2.528 0 0 1 2.521-2.521h6.312A2.528 2.528 0 0 1 24 15.165a2.528 2.528 0 0 1-2.52 2.521h-6.313z"/>
</svg>

Before

Width:  |  Height:  |  Size: 1.9 KiB

File diff suppressed because it is too large Load Diff

View File

@@ -15,7 +15,7 @@
# limitations under the License.
#
apiVersion: v2
appVersion: "6.1.0"
appVersion: "5.0.0"
description: Apache Superset is a modern, enterprise-ready business intelligence web application
name: superset
icon: https://artifacthub.io/image/68c1d717-0e97-491f-b046-754e46f46922@2x
@@ -29,7 +29,7 @@ maintainers:
- name: craig-rueda
email: craig@craigrueda.com
url: https://github.com/craig-rueda
version: 0.16.0 # See [README](https://github.com/apache/superset/blob/master/helm/superset/README.md#versioning) for version details.
version: 0.15.5 # See [README](https://github.com/apache/superset/blob/master/helm/superset/README.md#versioning) for version details.
dependencies:
- name: postgresql
version: 16.7.27

View File

@@ -23,7 +23,7 @@ NOTE: This file is generated by helm-docs: https://github.com/norwoodj/helm-docs
# superset
![Version: 0.16.0](https://img.shields.io/badge/Version-0.16.0-informational?style=flat-square)
![Version: 0.15.5](https://img.shields.io/badge/Version-0.15.5-informational?style=flat-square)
Apache Superset is a modern, enterprise-ready business intelligence web application

View File

@@ -39,7 +39,7 @@ dependencies = [
"apache-superset-core",
"backoff>=1.8.0",
"celery>=5.3.6, <6.0.0",
"click>=8.4.0",
"click>=8.0.3",
"click-option-group",
"colorama",
"flask-cors>=6.0.0, <7.0",
@@ -55,23 +55,23 @@ dependencies = [
"flask-login>=0.6.0, < 1.0",
"flask-migrate>=3.1.0, <5.0",
"flask-session>=0.4.0, <1.0",
"flask-wtf>=1.3.0, <2.0",
"flask-wtf>=1.1.0, <2.0",
"geopy",
"greenlet>=3.0.3, <=3.5.0",
"gunicorn>=25.3.0, <26; sys_platform != 'win32'",
"gunicorn>=22.0.0; sys_platform != 'win32'",
"hashids>=1.3.1, <2",
# holidays>=0.45 required for security fix
"holidays>=0.45, <1",
"humanize",
"isodate",
"jsonpath-ng>=1.8.0, <2",
"jsonpath-ng>=1.6.1, <2",
"Mako>=1.2.2",
"markdown>=3.10.2",
"markdown>=3.0",
# marshmallow>=4 has issues: https://github.com/apache/superset/issues/33162
"marshmallow>=3.0, <4",
"marshmallow-union>=0.1",
"msgpack>=1.0.0, <1.2",
"nh3>=0.3.5, <0.4",
"nh3>=0.2.11, <0.4",
"numpy>1.23.5, <2.3",
"packaging",
# --------------------------
@@ -80,7 +80,7 @@ dependencies = [
"bottleneck", # recommended performance dependency for pandas, see https://pandas.pydata.org/docs/getting_started/install.html#performance-dependencies-recommended
# --------------------------
"parsedatetime",
"paramiko>=3.4.0, <4.0", # 4.0 removed DSSKey, still referenced by sshtunnel
"paramiko>=3.4.0",
"pgsanity",
"Pillow>=11.0.0, <13",
"polyline>=2.0.0, <3.0",
@@ -89,27 +89,28 @@ dependencies = [
"python-dateutil",
"python-dotenv", # optional dependencies for Flask but required for Superset, see https://flask.palletsprojects.com/en/stable/installation/#optional-dependencies
"pygeohash",
"pyarrow>=24.0.0, <25", # before upgrading pyarrow, check that all db dependencies support this, see e.g. https://github.com/apache/superset/pull/34693
"pyarrow>=16.1.0, <21", # before upgrading pyarrow, check that all db dependencies support this, see e.g. https://github.com/apache/superset/pull/34693
"pyyaml>=6.0.0, <7.0.0",
"PyJWT>=2.4.0, <3.0",
"redis>=5.0.0, <6.0",
"rison>=2.0.0, <3.0",
"selenium>=4.44.0, <5.0",
"selenium>=4.14.0, <5.0",
"shillelagh[gsheetsapi]>=1.4.4, <2.0",
"sshtunnel>=0.4.0, <0.5",
"simplejson>=3.15.0",
"slack_sdk>=3.19.0, <4",
"sqlalchemy>=1.4, <2",
"sqlalchemy-continuum>=1.6.0, <2.0.0",
"sqlalchemy-utils>=0.38.0, <0.43", # expanding lowerbound to work with pydoris
"sqlglot>=30.8.0, <31",
"sqlglot>=28.10.0, <29",
# newer pandas needs 0.9+
"tabulate>=0.10.0, <1.0",
"tabulate>=0.9.0, <1.0",
"typing-extensions>=4, <5",
"waitress; sys_platform == 'win32'",
"watchdog>=6.0.0",
"wtforms>=3.2.2, <4",
"wtforms>=2.3.3, <4",
"wtforms-json",
"xlsxwriter>=3.2.9, <3.3",
"xlsxwriter>=3.0.7, <3.3",
]
[project.optional-dependencies]
@@ -118,12 +119,12 @@ athena = ["pyathena[pandas]>=2, <4"]
aurora-data-api = ["preset-sqlalchemy-aurora-data-api>=0.2.8,<0.3"]
bigquery = [
"pandas-gbq>=0.19.1",
"sqlalchemy-bigquery>=1.17.0",
"sqlalchemy-bigquery>=1.15.0",
"google-cloud-bigquery>=3.10.0",
]
clickhouse = ["clickhouse-connect>=1.1.1, <2.0"]
clickhouse = ["clickhouse-connect>=0.13.0, <2.0"]
cockroachdb = ["cockroachdb>=0.3.5, <0.4"]
crate = ["sqlalchemy-cratedb>=0.41.0, <1"]
crate = ["sqlalchemy-cratedb>=0.40.1, <1"]
d1 = [
"superset-engine-d1>=0.1.0",
"sqlalchemy-d1>=0.1.0",
@@ -135,39 +136,39 @@ databricks = [
"databricks-sqlalchemy==1.0.5",
]
db2 = ["ibm-db-sa>0.3.8, <=0.4.4"]
denodo = ["denodo-sqlalchemy>=2.0.5,<2.1.0"]
denodo = ["denodo-sqlalchemy>=1.0.6,<2.1.0"]
dremio = ["sqlalchemy-dremio>=1.2.1, <4"]
drill = ["sqlalchemy-drill>=1.1.10, <2"]
drill = ["sqlalchemy-drill>=1.1.4, <2"]
druid = ["pydruid>=0.6.5,<0.7"]
duckdb = ["duckdb>=1.5.2,<2", "duckdb-engine>=0.17.0"]
duckdb = ["duckdb>=1.4.2,<2", "duckdb-engine>=0.17.0"]
dynamodb = ["pydynamodb>=0.4.2"]
solr = ["sqlalchemy-solr >= 0.2.0"]
elasticsearch = ["elasticsearch-dbapi>=0.2.13, <0.3.0"]
exasol = ["sqlalchemy-exasol>=2.4.0, <8.0"]
exasol = ["sqlalchemy-exasol >= 2.4.0, < 8.0"]
excel = ["xlrd>=1.2.0, <1.3"]
fastmcp = [
"fastmcp>=3.2.4,<4.0",
# tiktoken backs the response-size-guard token estimator. Without
# it, the middleware falls back to a coarser character-based
# heuristic that under-counts JSON-heavy MCP responses.
"tiktoken>=0.13.0,<1.0",
"tiktoken>=0.7.0,<1.0",
]
firebird = ["sqlalchemy-firebird>=0.7.0, <2.2"]
firebolt = ["firebolt-sqlalchemy>=1.0.0, <2"]
gevent = ["gevent>=26.4.0"]
gevent = ["gevent>=23.9.1"]
gsheets = ["shillelagh[gsheetsapi]>=1.4.4, <2"]
hana = ["hdbcli==2.28.20", "sqlalchemy_hana==0.4.0"]
hive = [
"pyhive[hive]>=0.6.5;python_version<'3.11'",
"pyhive[hive_pure_sasl]>=0.7.0",
"tableschema",
"thrift>=0.23.0, <1.0.0",
"thrift>=0.14.1, <1.0.0",
"thrift_sasl>=0.4.3, < 1.0.0",
]
impala = ["impyla>0.16.2, <0.23"]
kusto = ["sqlalchemy-kusto>=3.1.2, <4"]
kusto = ["sqlalchemy-kusto>=3.0.0, <4"]
kylin = ["kylinpy>=2.8.1, <2.9"]
mssql = ["pymssql>=2.3.13, <3"]
mssql = ["pymssql>=2.2.8, <3"]
# motherduck is an alias for duckdb - MotherDuck works via the duckdb driver
motherduck = ["apache-superset[duckdb]"]
mysql = ["mysqlclient>=2.1.0, <3"]
@@ -180,7 +181,7 @@ ocient = [
oracle = ["cx-Oracle>8.0.0, <8.4"]
parseable = ["sqlalchemy-parseable>=0.1.3,<0.2.0"]
pinot = ["pinotdb>=5.0.0, <10.0.0"]
playwright = ["playwright>=1.60.0, <2"]
playwright = ["playwright>=1.37.0, <2"]
postgres = ["psycopg2-binary==2.9.12"]
presto = ["pyhive[presto]>=0.6.5"]
trino = ["trino>=0.328.0"]
@@ -190,25 +191,25 @@ risingwave = ["sqlalchemy-risingwave"]
shillelagh = ["shillelagh[all]>=1.4.4, <2"]
singlestore = ["sqlalchemy-singlestoredb>=1.1.1, <2"]
snowflake = ["snowflake-sqlalchemy>=1.2.4, <2"]
sqlite = ["syntaqlite>=0.1.0,<0.5.0"]
sqlite = ["syntaqlite>=0.1.0"]
spark = [
"pyhive[hive]>=0.6.5;python_version<'3.11'",
"pyhive[hive_pure_sasl]>=0.7",
"tableschema",
"thrift>=0.23.0, <1",
"thrift>=0.14.1, <1",
]
tdengine = [
"taospy>=2.7.21",
"taos-ws-py>=0.6.9"
"taos-ws-py>=0.3.8"
]
teradata = ["teradatasql>=16.20.0.23"]
thumbnails = [] # deprecated, will be removed in 7.0
vertica = ["sqlalchemy-vertica-python>= 0.6.3, < 0.7"]
vertica = ["sqlalchemy-vertica-python>= 0.5.9, < 0.7"]
netezza = ["nzalchemy>=11.0.2"]
starrocks = ["starrocks>=1.3.3, <2"]
starrocks = ["starrocks>=1.0.0"]
doris = ["pydoris>=1.0.0, <2.0.0"]
oceanbase = ["oceanbase_py>=0.0.1.2"]
ydb = ["ydb-sqlalchemy>=0.1.2", "ydb-sqlglot-plugin>=0.2.5"]
oceanbase = ["oceanbase_py>=0.0.1"]
ydb = ["ydb-sqlalchemy>=0.1.2"]
development = [
# no bounds for apache-superset-extensions-cli until a stable version
"apache-superset-extensions-cli",
@@ -220,22 +221,21 @@ development = [
"openapi-spec-validator",
"parameterized",
"pip",
"polib", # used by scripts/translations/ and their unit tests
"pre-commit",
"progress>=1.6.1,<2",
"progress>=1.5,<2",
"psutil",
"pyfakefs",
"pyinstrument>=5.1.2,<6",
"pyinstrument>=4.0.2,<6",
"pylint",
"pytest<8.0.0", # hairy issue with pytest >=8 where current_app proxies are not set in time
"pytest-asyncio",
"pytest-cov",
"pytest-mock",
"python-ldap>=3.4.7",
"python-ldap>=3.4.4",
"ruff",
"sqloxide",
"statsd",
"syntaqlite>=0.4.2,<0.5.0",
"syntaqlite>=0.1.0",
]
[project.urls]
@@ -447,7 +447,6 @@ requirement_txt_file = "requirements/base.txt"
authorized_licenses = [
"academic free license (afl)",
"any-osi",
"apache-2.0",
"apache license 2.0",
"apache software",
"apache software, bsd",
@@ -457,7 +456,6 @@ authorized_licenses = [
"isc license (iscl)",
"isc license",
"mit",
"mit and psf-2.0",
"mit-cmu",
"mozilla public license 2.0 (mpl 2.0)",
"osi approved",

View File

@@ -30,7 +30,7 @@ cryptography>=46.0.7,<47.0.0
# Security: Snyk - XSS vulnerability in Mako templates
mako>=1.3.11,<2.0.0
# Security: CVE-2024-52338 (CRITICAL) - Deserialization of untrusted data in IPC/Parquet readers
pyarrow>=24.0.0,<25.0.0
pyarrow>=20.0.0,<21.0.0
# Security: CVE-2026-27459 - pyopenssl certificate validation
pyopenssl>=26.0.0,<27.0.0
# Security: CVE-2026-25645 (MEDIUM) - Insecure Temporary File

View File

@@ -50,7 +50,7 @@ cattrs==25.1.1
# via requests-cache
celery==5.5.2
# via apache-superset (pyproject.toml)
certifi==2026.5.20
certifi==2025.6.15
# via
# requests
# selenium
@@ -60,7 +60,7 @@ cffi==2.0.0
# pynacl
charset-normalizer==3.4.2
# via requests
click==8.4.1
click==8.2.1
# via
# apache-superset (pyproject.toml)
# celery
@@ -151,7 +151,7 @@ flask-sqlalchemy==2.5.1
# flask-migrate
flask-talisman==1.1.0
# via apache-superset (pyproject.toml)
flask-wtf==1.3.0
flask-wtf==1.2.2
# via
# apache-superset (pyproject.toml)
# flask-appbuilder
@@ -161,12 +161,12 @@ geopy==2.4.1
# via apache-superset (pyproject.toml)
google-auth==2.43.0
# via shillelagh
greenlet==3.5.0
greenlet==3.1.1
# via
# apache-superset (pyproject.toml)
# shillelagh
# sqlalchemy
gunicorn==25.3.0
gunicorn==23.0.0
# via apache-superset (pyproject.toml)
h11==0.16.0
# via wsproto
@@ -176,7 +176,7 @@ holidays==0.82
# via apache-superset (pyproject.toml)
humanize==4.12.3
# via apache-superset (pyproject.toml)
idna==3.15
idna==3.10
# via
# email-validator
# requests
@@ -194,7 +194,7 @@ jinja2==3.1.6
# via
# flask
# flask-babel
jsonpath-ng==1.8.0
jsonpath-ng==1.7.0
# via apache-superset (pyproject.toml)
jsonschema==4.23.0
# via
@@ -208,12 +208,12 @@ kombu==5.5.3
# via celery
limits==5.1.0
# via flask-limiter
mako==1.3.12
mako==1.3.11
# via
# -r requirements/base.in
# apache-superset (pyproject.toml)
# alembic
markdown==3.10.2
markdown==3.8.1
# via apache-superset (pyproject.toml)
markdown-it-py==3.0.0
# via rich
@@ -241,7 +241,7 @@ msgpack==1.0.8
# via apache-superset (pyproject.toml)
msgspec==0.19.0
# via flask-session
nh3==0.3.5
nh3==0.2.21
# via apache-superset (pyproject.toml)
numexpr==2.10.2
# via -r requirements/base.in
@@ -286,13 +286,15 @@ pillow==12.2.0
# via apache-superset (pyproject.toml)
platformdirs==4.3.8
# via requests-cache
ply==3.11
# via jsonpath-ng
polyline==2.0.2
# via apache-superset (pyproject.toml)
prison==0.2.1
# via flask-appbuilder
prompt-toolkit==3.0.51
# via click-repl
pyarrow==24.0.0
pyarrow==20.0.0
# via
# -r requirements/base.in
# apache-superset (pyproject.toml)
@@ -378,7 +380,7 @@ rpds-py==0.25.0
# referencing
rsa==4.9.1
# via google-auth
selenium==4.44.0
selenium==4.32.0
# via apache-superset (pyproject.toml)
setuptools==80.9.0
# via -r requirements/base.in
@@ -407,21 +409,24 @@ sqlalchemy==1.4.54
# flask-sqlalchemy
# marshmallow-sqlalchemy
# shillelagh
# sqlalchemy-continuum
# sqlalchemy-utils
sqlalchemy-continuum==1.6.0
# via apache-superset (pyproject.toml)
sqlalchemy-utils==0.42.0
# via
# apache-superset (pyproject.toml)
# apache-superset-core
# flask-appbuilder
sqlglot==30.8.0
sqlglot==28.10.0
# via
# apache-superset (pyproject.toml)
# apache-superset-core
sshtunnel==0.4.0
# via apache-superset (pyproject.toml)
tabulate==0.10.0
tabulate==0.9.0
# via apache-superset (pyproject.toml)
trio==0.33.0
trio==0.30.0
# via
# selenium
# trio-websocket
@@ -449,7 +454,7 @@ tzdata==2025.2
# pandas
url-normalize==2.2.1
# via requests-cache
urllib3==2.7.0
urllib3==2.6.3
# via
# -r requirements/base.in
# requests
@@ -478,7 +483,7 @@ wrapt==1.17.2
# via deprecated
wsproto==1.2.0
# via trio-websocket
wtforms==3.2.2
wtforms==3.2.1
# via
# apache-superset (pyproject.toml)
# flask-appbuilder
@@ -488,7 +493,7 @@ wtforms-json==0.3.5
# via apache-superset (pyproject.toml)
xlrd==2.0.1
# via pandas
xlsxwriter==3.2.9
xlsxwriter==3.0.9
# via
# apache-superset (pyproject.toml)
# pandas

View File

@@ -52,7 +52,7 @@ attrs==25.3.0
# referencing
# requests-cache
# trio
authlib==1.6.12
authlib==1.6.9
# via fastmcp
babel==2.17.0
# via
@@ -112,7 +112,7 @@ celery==5.5.2
# via
# -c requirements/base-constraint.txt
# apache-superset
certifi==2026.5.20
certifi==2025.6.15
# via
# -c requirements/base-constraint.txt
# httpcore
@@ -130,7 +130,7 @@ charset-normalizer==3.4.2
# via
# -c requirements/base-constraint.txt
# requests
click==8.4.1
click==8.2.1
# via
# -c requirements/base-constraint.txt
# apache-superset
@@ -219,7 +219,7 @@ docstring-parser==0.17.0
# via cyclopts
docutils==0.22.2
# via rich-rst
duckdb==1.5.3
duckdb==1.4.2
# via
# apache-superset
# duckdb-engine
@@ -312,12 +312,12 @@ flask-talisman==1.1.0
# apache-superset
flask-testing==0.8.1
# via apache-superset
flask-wtf==1.3.0
flask-wtf==1.2.2
# via
# -c requirements/base-constraint.txt
# apache-superset
# flask-appbuilder
fonttools==4.60.2
fonttools==4.55.0
# via matplotlib
freezegun==1.5.1
# via apache-superset
@@ -331,7 +331,7 @@ geopy==2.4.1
# via
# -c requirements/base-constraint.txt
# apache-superset
gevent==26.4.0
gevent==24.2.1
# via apache-superset
google-api-core==2.23.0
# via
@@ -346,7 +346,6 @@ google-auth==2.43.0
# google-api-core
# google-auth-oauthlib
# google-cloud-bigquery
# google-cloud-bigquery-storage
# google-cloud-core
# pandas-gbq
# pydata-google-auth
@@ -361,7 +360,7 @@ google-cloud-bigquery==3.27.0
# apache-superset
# pandas-gbq
# sqlalchemy-bigquery
google-cloud-bigquery-storage==2.26.0
google-cloud-bigquery-storage==2.19.1
# via pandas-gbq
google-cloud-core==2.4.1
# via google-cloud-bigquery
@@ -373,7 +372,7 @@ googleapis-common-protos==1.66.0
# via
# google-api-core
# grpcio-status
greenlet==3.5.0
greenlet==3.1.1
# via
# -c requirements/base-constraint.txt
# apache-superset
@@ -389,7 +388,7 @@ grpcio==1.71.0
# grpcio-status
grpcio-status==1.60.1
# via google-api-core
gunicorn==25.3.0
gunicorn==23.0.0
# via
# -c requirements/base-constraint.txt
# apache-superset
@@ -422,7 +421,7 @@ humanize==4.12.3
# apache-superset
identify==2.5.36
# via pre-commit
idna==3.15
idna==3.10
# via
# -c requirements/base-constraint.txt
# anyio
@@ -471,7 +470,7 @@ jmespath==1.1.0
# via
# boto3
# botocore
jsonpath-ng==1.8.0
jsonpath-ng==1.7.0
# via
# -c requirements/base-constraint.txt
# apache-superset
@@ -507,12 +506,12 @@ limits==5.1.0
# via
# -c requirements/base-constraint.txt
# flask-limiter
mako==1.3.12
mako==1.3.11
# via
# -c requirements/base-constraint.txt
# alembic
# apache-superset
markdown==3.10.2
markdown==3.8.1
# via
# -c requirements/base-constraint.txt
# apache-superset
@@ -566,7 +565,7 @@ msgspec==0.19.0
# flask-session
mysqlclient==2.2.6
# via apache-superset
nh3==0.3.5
nh3==0.2.21
# via
# -c requirements/base-constraint.txt
# apache-superset
@@ -674,8 +673,10 @@ platformdirs==4.3.8
# virtualenv
pluggy==1.5.0
# via pytest
polib==1.2.0
# via apache-superset
ply==3.11
# via
# -c requirements/base-constraint.txt
# jsonpath-ng
polyline==2.0.2
# via
# -c requirements/base-constraint.txt
@@ -686,7 +687,7 @@ prison==0.2.1
# via
# -c requirements/base-constraint.txt
# flask-appbuilder
progress==1.6.1
progress==1.6
# via apache-superset
prompt-toolkit==3.0.51
# via
@@ -698,7 +699,7 @@ proto-plus==1.25.0
# via
# google-api-core
# google-cloud-bigquery-storage
protobuf==5.29.6
protobuf==4.25.8
# via
# google-api-core
# google-cloud-bigquery-storage
@@ -711,7 +712,7 @@ psycopg2-binary==2.9.12
# via apache-superset
py-key-value-aio==0.4.4
# via fastmcp
pyarrow==24.0.0
pyarrow==20.0.0
# via
# -c requirements/base-constraint.txt
# apache-superset
@@ -764,7 +765,7 @@ pygments==2.20.0
# rich
pyhive==0.7.0
# via apache-superset
pyinstrument==5.1.2
pyinstrument==4.4.0
# via apache-superset
pyjwt==2.12.0
# via
@@ -834,9 +835,9 @@ python-dotenv==1.2.2
# apache-superset
# fastmcp
# pydantic-settings
python-ldap==3.4.7
python-ldap==3.4.4
# via apache-superset
python-multipart==0.0.29
python-multipart==0.0.20
# via mcp
pytz==2025.2
# via
@@ -921,7 +922,7 @@ s3transfer==0.16.0
# via boto3
secretstorage==3.5.0
# via keyring
selenium==4.44.0
selenium==4.32.0
# via
# -c requirements/base-constraint.txt
# apache-superset
@@ -975,16 +976,21 @@ sqlalchemy==1.4.54
# marshmallow-sqlalchemy
# shillelagh
# sqlalchemy-bigquery
# sqlalchemy-continuum
# sqlalchemy-utils
sqlalchemy-bigquery==1.17.0
sqlalchemy-bigquery==1.15.0
# via apache-superset
sqlalchemy-continuum==1.6.0
# via
# -c requirements/base-constraint.txt
# apache-superset
sqlalchemy-utils==0.42.0
# via
# -c requirements/base-constraint.txt
# apache-superset
# apache-superset-core
# flask-appbuilder
sqlglot==30.8.0
sqlglot==28.10.0
# via
# -c requirements/base-constraint.txt
# apache-superset
@@ -1001,13 +1007,13 @@ starlette==0.49.1
# via mcp
statsd==4.0.1
# via apache-superset
syntaqlite==0.4.2
syntaqlite==0.1.0
# via apache-superset
tabulate==0.10.0
tabulate==0.9.0
# via
# -c requirements/base-constraint.txt
# apache-superset
tiktoken==0.13.0
tiktoken==0.12.0
# via apache-superset
tomli-w==1.2.0
# via apache-superset-extensions-cli
@@ -1019,7 +1025,7 @@ tqdm==4.67.1
# prophet
trino==0.330.0
# via apache-superset
trio==0.33.0
trio==0.30.0
# via
# -c requirements/base-constraint.txt
# selenium
@@ -1068,7 +1074,7 @@ url-normalize==2.2.1
# via
# -c requirements/base-constraint.txt
# requests-cache
urllib3==2.7.0
urllib3==2.6.3
# via
# -c requirements/base-constraint.txt
# botocore
@@ -1086,7 +1092,7 @@ vine==5.1.0
# amqp
# celery
# kombu
virtualenv==20.36.1
virtualenv==20.29.2
# via pre-commit
watchdog==6.0.0
# via
@@ -1121,7 +1127,7 @@ wsproto==1.2.0
# via
# -c requirements/base-constraint.txt
# trio-websocket
wtforms==3.2.2
wtforms==3.2.1
# via
# -c requirements/base-constraint.txt
# apache-superset
@@ -1136,7 +1142,7 @@ xlrd==2.0.1
# via
# -c requirements/base-constraint.txt
# pandas
xlsxwriter==3.2.9
xlsxwriter==3.0.9
# via
# -c requirements/base-constraint.txt
# apache-superset

View File

@@ -31,9 +31,7 @@ PATTERNS = {
r"^superset/",
r"^scripts/",
r"^setup\.py",
r"^pyproject\.toml$",
r"^requirements/.+\.txt",
r"^pyproject\.toml",
r"^.pylintrc",
],
"frontend": [
@@ -157,7 +155,7 @@ def main(event_type: str, sha: str, repo: str) -> None:
def get_git_sha() -> str:
return os.getenv("GITHUB_SHA") or subprocess.check_output( # noqa: S603
["git", "rev-parse", "HEAD"] # noqa: S603, S607
["git", "rev-parse", "HEAD"] # noqa: S607
).strip().decode("utf-8")

View File

@@ -55,7 +55,7 @@ if [ ${#js_ts_files[@]} -gt 0 ]; then
echo "$output" >&2
exit 1
}
if [ -n "$output" ]; then echo "$output"; fi
[ -n "$output" ] && echo "$output"
else
echo "No JavaScript/TypeScript files to lint"
fi

View File

@@ -0,0 +1,682 @@
#!/usr/bin/env python3
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#
# ----------------------------------------------------------------------
# Stress-test data generator for the composite-PK migration (sc-105349).
#
# Bulk-inserts synthetic parent rows and many-to-many junction rows for
# the eight association tables that the composite-PK migration touches.
# Useful for measuring migration runtime at varying scales — run this at
# 100K / 1M / 5M / 10M rows and time the migration at each scale to
# verify the O(N log N) extrapolation.
#
# Idempotent: rerunning with the same target is a no-op; rerunning with
# a higher target adds rows up to the new total. Batched bulk INSERTs
# (10K rows per statement) make it fast on Postgres, MySQL, and SQLite.
#
# Usage (inside the Superset container):
#
# docker exec superset-superset-1 \\
# /app/.venv/bin/python /app/scripts/seed_junction_load.py \\
# --dashboard-slices 1000000 \\
# --slice-user 100000 \\
# --dashboard-user 100000
#
# Run with no flags for the defaults shown below. Use ``--dry-run`` to
# print the planned inserts without writing anything.
#
# The script connects via Superset's standard ``DATABASE_*`` env vars
# (or ``SUPERSET__SQLALCHEMY_DATABASE_URI`` if set), so it works
# automatically inside the Superset container regardless of which
# metadata DB backend is in use.
from __future__ import annotations
import argparse
import logging
import os
import sys
import time
from contextlib import contextmanager
from typing import Iterator
from uuid import uuid4
import sqlalchemy as sa
from sqlalchemy.engine import Connection, Engine
logger = logging.getLogger("seed_junction_load")
# Bulk INSERT batch size. Larger values = fewer statements but more memory.
BATCH = 10_000
# Default per-junction-table target row counts. Tuned to mimic the shape
# of a large multi-team Superset install. Override via CLI flags.
DEFAULTS: dict[str, int] = {
"dashboard_slices": 1_000_000,
"slice_user": 100_000,
"dashboard_user": 100_000,
"dashboard_roles": 10_000,
}
# (junction_table, fk1_col, fk2_col, parent1_table, parent2_table)
# parents reference id columns; we generate (fk1, fk2) pairs by sampling
# from the parents' existing IDs.
JUNCTIONS: list[tuple[str, str, str, str, str]] = [
("dashboard_slices", "dashboard_id", "slice_id", "dashboards", "slices"),
("slice_user", "user_id", "slice_id", "ab_user", "slices"),
("dashboard_user", "user_id", "dashboard_id", "ab_user", "dashboards"),
("dashboard_roles", "dashboard_id", "role_id", "dashboards", "ab_role"),
]
# Junction tables that originally carried ``UNIQUE(fk1, fk2)`` and therefore
# cannot accept duplicate ``(fk1, fk2)`` pairs even on the pre-migration
# (downgrade) schema. The other JUNCTIONS allow duplicates pre-migration.
JUNCTIONS_WITH_UNIQUE: set[str] = {"dashboard_slices", "report_schedule_user"}
# ----------------------------------------------------------------------
# Connection setup
# ----------------------------------------------------------------------
def build_engine() -> Engine:
"""Build a SQLAlchemy engine from Superset env vars."""
if uri := os.environ.get("SUPERSET__SQLALCHEMY_DATABASE_URI"):
logger.info("Using SUPERSET__SQLALCHEMY_DATABASE_URI from env")
return sa.create_engine(uri)
try:
dialect = os.environ["DATABASE_DIALECT"]
user = os.environ["DATABASE_USER"]
password = os.environ["DATABASE_PASSWORD"]
host = os.environ["DATABASE_HOST"]
port = os.environ["DATABASE_PORT"]
db = os.environ["DATABASE_DB"]
except KeyError as exc:
sys.exit(
f"Missing env var {exc}; either set DATABASE_DIALECT/USER/PASSWORD/"
f"HOST/PORT/DB or SUPERSET__SQLALCHEMY_DATABASE_URI before running."
)
uri = f"{dialect}://{user}:{password}@{host}:{port}/{db}"
logger.info(
"Built URI from DATABASE_* env vars (dialect=%s, host=%s)", dialect, host
)
return sa.create_engine(uri)
# ----------------------------------------------------------------------
# Helpers
# ----------------------------------------------------------------------
def uuid_value(dialect_name: str) -> bytes | str:
"""Return a UUID in the form the active dialect expects.
MySQL stores UUIDs as ``BINARY(16)`` (16 raw bytes); Postgres has a
native ``UUID`` type that accepts strings; SQLite stores them as
BLOB/TEXT and accepts either. Branching here keeps the seed script
backend-agnostic without depending on Superset's custom column types.
"""
if dialect_name.startswith("mysql"):
return uuid4().bytes
return str(uuid4())
@contextmanager
def time_phase(name: str) -> Iterator[None]:
"""Log elapsed wall time for a named phase."""
start = time.monotonic()
logger.info("[%s] starting", name)
try:
yield
finally:
elapsed = time.monotonic() - start
logger.info("[%s] done in %.2fs", name, elapsed)
def count_rows(conn: Connection, table: str) -> int:
return conn.scalar(sa.text(f"SELECT COUNT(*) FROM {table}")) or 0 # noqa: S608
def existing_ids(conn: Connection, table: str, limit: int | None = None) -> list[int]:
sql = f"SELECT id FROM {table} ORDER BY id" # noqa: S608
if limit is not None:
sql += f" LIMIT {limit}"
return [row[0] for row in conn.execute(sa.text(sql))]
# ----------------------------------------------------------------------
# Parent seeders
#
# Each function ensures the named parent table has at least ``target``
# rows by inserting synthetic ones with minimal-but-valid columns.
# Returns nothing; subsequent code reads back IDs via ``existing_ids``.
# ----------------------------------------------------------------------
def seed_dashboards(conn: Connection, target: int, dry_run: bool) -> None:
current = count_rows(conn, "dashboards")
if current >= target:
logger.info(
"dashboards: %d rows (target %d) — no insert needed", current, target
)
return
needed = target - current
logger.info("dashboards: %d%d (+%d)", current, target, needed)
if dry_run:
return
dialect = conn.engine.dialect.name
sql = sa.text(
"INSERT INTO dashboards (uuid, dashboard_title, slug, published) "
"VALUES (:uuid, :title, :slug, :published)"
)
for batch_start in range(0, needed, BATCH):
rows = [
{
"uuid": uuid_value(dialect),
"title": f"seed_dashboard_{current + i}",
"slug": f"seed-dashboard-{current + i}-{uuid4().hex[:8]}",
"published": False,
}
for i in range(batch_start, min(batch_start + BATCH, needed))
]
conn.execute(sql, rows)
logger.info(" dashboards: inserted %d / %d", batch_start + len(rows), needed)
def seed_dbs(conn: Connection, dry_run: bool) -> int:
"""Ensure at least one row exists in ``dbs`` (parent of ``tables``).
Returns the id to use as ``database_id`` when seeding ``tables``."""
ids = existing_ids(conn, "dbs", limit=1)
if ids:
return ids[0]
if dry_run:
return -1 # placeholder
dialect = conn.engine.dialect.name
logger.info("dbs: inserting one synthetic database (no rows present)")
conn.execute(
sa.text(
"INSERT INTO dbs (uuid, database_name, sqlalchemy_uri, expose_in_sqllab) "
"VALUES (:uuid, :name, :uri, :expose)"
),
{
"uuid": uuid_value(dialect),
"name": f"seed_db_{uuid4().hex[:8]}",
"uri": "sqlite:///seed.db",
"expose": False,
},
)
return existing_ids(conn, "dbs", limit=1)[0]
def seed_tables(conn: Connection, target: int, dry_run: bool) -> None:
current = count_rows(conn, "tables")
if current >= target:
logger.info("tables: %d rows (target %d) — no insert needed", current, target)
return
needed = target - current
logger.info("tables: %d%d (+%d)", current, target, needed)
if dry_run:
return
database_id = seed_dbs(conn, dry_run=False)
dialect = conn.engine.dialect.name
sql = sa.text(
"INSERT INTO tables (uuid, table_name, database_id) "
"VALUES (:uuid, :name, :db_id)"
)
for batch_start in range(0, needed, BATCH):
rows = [
{
"uuid": uuid_value(dialect),
"name": f"seed_table_{current + i}",
"db_id": database_id,
}
for i in range(batch_start, min(batch_start + BATCH, needed))
]
conn.execute(sql, rows)
logger.info(" tables: inserted %d / %d", batch_start + len(rows), needed)
def seed_slices(conn: Connection, target: int, dry_run: bool) -> None:
current = count_rows(conn, "slices")
if current >= target:
logger.info("slices: %d rows (target %d) — no insert needed", current, target)
return
needed = target - current
logger.info("slices: %d%d (+%d)", current, target, needed)
if dry_run:
return
# Slices reference tables.id; ensure at least one ``tables`` row exists
# so the FK is satisfiable (datasource_id is nullable but we set it for
# realism). The migration test doesn't care, but a real Superset that
# re-renders these slices does.
seed_tables(conn, target=1, dry_run=False)
table_id = existing_ids(conn, "tables", limit=1)[0]
dialect = conn.engine.dialect.name
sql = sa.text(
"INSERT INTO slices "
"(uuid, slice_name, datasource_id, datasource_type, viz_type) "
"VALUES (:uuid, :name, :ds_id, :ds_type, :viz)"
)
for batch_start in range(0, needed, BATCH):
rows = [
{
"uuid": uuid_value(dialect),
"name": f"seed_slice_{current + i}",
"ds_id": table_id,
"ds_type": "table",
"viz": "table",
}
for i in range(batch_start, min(batch_start + BATCH, needed))
]
conn.execute(sql, rows)
logger.info(" slices: inserted %d / %d", batch_start + len(rows), needed)
def seed_users(conn: Connection, target: int, dry_run: bool) -> None:
current = count_rows(conn, "ab_user")
if current >= target:
logger.info("ab_user: %d rows (target %d) — no insert needed", current, target)
return
needed = target - current
logger.info("ab_user: %d%d (+%d)", current, target, needed)
if dry_run:
return
sql = sa.text(
"INSERT INTO ab_user (first_name, last_name, username, email, active) "
"VALUES (:first, :last, :username, :email, :active)"
)
for batch_start in range(0, needed, BATCH):
rows = [
{
"first": "seed",
"last": f"user_{current + i}",
"username": f"seed_user_{current + i}_{uuid4().hex[:8]}",
"email": f"seed_user_{current + i}_{uuid4().hex[:8]}@example.invalid",
"active": True,
}
for i in range(batch_start, min(batch_start + BATCH, needed))
]
conn.execute(sql, rows)
logger.info(" ab_user: inserted %d / %d", batch_start + len(rows), needed)
def seed_roles(conn: Connection, target: int, dry_run: bool) -> None:
current = count_rows(conn, "ab_role")
if current >= target:
logger.info("ab_role: %d rows (target %d) — no insert needed", current, target)
return
needed = target - current
logger.info("ab_role: %d%d (+%d)", current, target, needed)
if dry_run:
return
sql = sa.text("INSERT INTO ab_role (name) VALUES (:name)")
for batch_start in range(0, needed, BATCH):
rows = [
{"name": f"seed_role_{current + i}_{uuid4().hex[:8]}"}
for i in range(batch_start, min(batch_start + BATCH, needed))
]
conn.execute(sql, rows)
logger.info(" ab_role: inserted %d / %d", batch_start + len(rows), needed)
# ----------------------------------------------------------------------
# Junction seeder
# ----------------------------------------------------------------------
def _load_existing_pairs(
conn: Connection, junction: str, fk1_col: str, fk2_col: str
) -> set[tuple[int, int]]:
"""Load existing ``(fk1, fk2)`` pairs from a junction table into a set.
Used so the seeder can skip them when generating new pairs (junction
tables enforce uniqueness on the FK pair). Memory is ~32 bytes/tuple
on CPython, so 10M existing pairs is ~320MB — acceptable for a dev
machine. The junction / column names come from ``JUNCTIONS``, not
user input, so the f-string interpolation is safe.
"""
sql_text = f"SELECT {fk1_col}, {fk2_col} FROM {junction}" # noqa: S608
return {(row[0], row[1]) for row in conn.execute(sa.text(sql_text))}
def _generate_new_pairs(
p1_ids: list[int],
p2_ids: list[int],
existing_pairs: set[tuple[int, int]],
) -> Iterator[tuple[int, int]]:
"""Yield ``(fk1, fk2)`` pairs from the parent1 × parent2 cross-product
that are not already in ``existing_pairs``."""
for fk1 in p1_ids:
for fk2 in p2_ids:
if (fk1, fk2) not in existing_pairs:
yield (fk1, fk2)
def seed_junction(
conn: Connection,
junction: str,
fk1_col: str,
fk2_col: str,
parent1: str,
parent2: str,
target: int,
dry_run: bool,
) -> None:
"""Bulk-insert junction rows up to ``target`` rows total.
Generates ``(fk1, fk2)`` pairs by walking the cross-product of
parent1 IDs × parent2 IDs in row-major order, skipping pairs that
already exist. Walking the cross-product deterministically keeps
the script replayable: re-running with the same target is a no-op,
and re-running with a higher target appends new pairs in a stable
order regardless of how many runs preceded.
"""
current = count_rows(conn, junction)
if current >= target:
logger.info(
"%s: %d rows (target %d) — no insert needed", junction, current, target
)
return
needed = target - current
logger.info("%s: %d%d (+%d)", junction, current, target, needed)
if dry_run:
return
p1_ids = existing_ids(conn, parent1)
p2_ids = existing_ids(conn, parent2)
max_pairs = len(p1_ids) * len(p2_ids)
if max_pairs < target:
sys.exit(
f"Cannot reach {target} rows in {junction}: "
f"only {max_pairs} unique pairs available "
f"({len(p1_ids)} × {len(p2_ids)}). "
f"Increase parent targets and rerun."
)
existing_pairs: set[tuple[int, int]] = (
_load_existing_pairs(conn, junction, fk1_col, fk2_col) if current > 0 else set()
)
if existing_pairs:
logger.info(
" %s: loaded %d existing pairs into avoidance set",
junction,
len(existing_pairs),
)
insert_sql = sa.text(
f"INSERT INTO {junction} ({fk1_col}, {fk2_col}) " # noqa: S608
f"VALUES (:fk1, :fk2)"
)
inserted = 0
batch: list[dict[str, int]] = []
for fk1, fk2 in _generate_new_pairs(p1_ids, p2_ids, existing_pairs):
batch.append({"fk1": fk1, "fk2": fk2})
inserted += 1
if len(batch) == BATCH or inserted == needed:
conn.execute(insert_sql, batch)
logger.info(" %s: inserted %d / %d", junction, inserted, needed)
batch = []
if inserted == needed:
return
if inserted < needed:
sys.exit(
f"Ran out of unique pairs at {inserted}/{needed} for {junction} "
f"(parents have {len(p1_ids)} × {len(p2_ids)} = {max_pairs} pairs, "
f"{len(existing_pairs)} already present)"
)
# ----------------------------------------------------------------------
# Orchestration
# ----------------------------------------------------------------------
def required_parent_count(target_pairs: int, other_parent: int) -> int:
"""How many rows we need in this parent so that
(this_parent × other_parent) ≥ target_pairs."""
if other_parent == 0:
# Bootstrapping: assume we'll create at least 1
other_parent = 1
return -(-target_pairs // other_parent) # ceil(target_pairs / other_parent)
def _compute_parent_requirements(targets: dict[str, int]) -> dict[str, int]:
"""For each parent table, return the minimum row count needed so that
parent1 × parent2 ≥ target for every junction it participates in.
Allocates ceil(sqrt(target)) rows per parent, balanced across the two
parents of each junction. The actual junction seeder will then walk
the cross-product to produce the target number of unique pairs.
"""
parent_req: dict[str, int] = {}
for junction, _, _, p1, p2 in JUNCTIONS:
target = targets.get(junction, 0)
if target == 0:
continue
sqrt_n = int(target**0.5) + 1
parent_req[p1] = max(parent_req.get(p1, 0), sqrt_n)
parent_req[p2] = max(parent_req.get(p2, 0), sqrt_n)
return parent_req
def _seed_parents(conn: Connection, parent_req: dict[str, int], dry_run: bool) -> None:
"""Seed parent tables in dependency order:
independent parents (ab_user, ab_role) first, then dashboards / slices /
tables (which transitively depend on dbs, seeded inside seed_tables)."""
if "ab_user" in parent_req:
seed_users(conn, parent_req["ab_user"], dry_run)
if "ab_role" in parent_req:
seed_roles(conn, parent_req["ab_role"], dry_run)
if "dashboards" in parent_req:
seed_dashboards(conn, parent_req["dashboards"], dry_run)
if "slices" in parent_req:
seed_slices(conn, parent_req["slices"], dry_run)
if "tables" in parent_req:
seed_tables(conn, parent_req["tables"], dry_run)
def _seed_all_junctions(
conn: Connection, targets: dict[str, int], dry_run: bool
) -> None:
for junction, fk1, fk2, p1, p2 in JUNCTIONS:
target = targets.get(junction, 0)
if target == 0:
continue
with time_phase(f"junction:{junction}"):
seed_junction(conn, junction, fk1, fk2, p1, p2, target, dry_run)
def inject_duplicates(
conn: Connection,
junction: str,
fk1_col: str,
fk2_col: str,
pct: float,
dry_run: bool,
) -> None:
"""Insert duplicate ``(fk1, fk2)`` rows on a non-UNIQUE junction table.
Used to stress-test the migration's ``_dedupe_by_min_id`` phase, which
is otherwise a no-op on cleanly-seeded data. Computes ``count =
current_rows * pct / 100`` and inserts that many rows by re-sampling
existing ``(fk1, fk2)`` pairs in row-major order. The synthetic
duplicates land on top of distinct existing pairs (one duplicate per
distinct pair, then wraps), so the migration's dedupe finds and
deletes them.
**Pre-condition: the table must NOT have UNIQUE on (fk1, fk2)**, i.e.,
the schema must be the pre-migration shape (after running
``superset db downgrade``). On the post-migration schema the composite
PK rejects duplicates and this function will error.
"""
if pct == 0:
return
current = count_rows(conn, junction)
count = int(current * pct / 100)
if count == 0:
logger.info(
"%s: 0 duplicates to inject (current=%d, pct=%g)",
junction,
current,
pct,
)
return
logger.info(
"%s: injecting %d duplicate rows (%g%% of %d existing)",
junction,
count,
pct,
current,
)
if dry_run:
return
select_sql = sa.text(
f"SELECT {fk1_col}, {fk2_col} FROM {junction} ORDER BY id LIMIT :n" # noqa: S608
)
sample = conn.execute(select_sql, {"n": count}).fetchall()
if not sample:
logger.warning("%s: no rows to duplicate (table is empty)", junction)
return
insert_sql = sa.text(
f"INSERT INTO {junction} ({fk1_col}, {fk2_col}) " # noqa: S608
f"VALUES (:fk1, :fk2)"
)
inserted = 0
while inserted < count:
batch: list[dict[str, int]] = []
while len(batch) < BATCH and inserted < count:
row = sample[inserted % len(sample)]
batch.append({"fk1": row[0], "fk2": row[1]})
inserted += 1
conn.execute(insert_sql, batch)
logger.info(" %s: injected %d / %d duplicates", junction, inserted, count)
def _inject_dirty_data(conn: Connection, dirty_pct: float, dry_run: bool) -> None:
"""Inject duplicate rows on every non-UNIQUE seeded junction.
The two tables that originally carried ``UNIQUE(fk1, fk2)`` are
skipped because their composite-PK successor (and their pre-migration
UNIQUE constraint) both reject duplicate inserts.
"""
if dirty_pct == 0:
return
for junction, fk1, fk2, _, _ in JUNCTIONS:
if junction in JUNCTIONS_WITH_UNIQUE:
logger.info(
"%s: skipping duplicate injection (table has UNIQUE on FK pair)",
junction,
)
continue
with time_phase(f"dirty:{junction}"):
inject_duplicates(conn, junction, fk1, fk2, dirty_pct, dry_run)
def run(targets: dict[str, int], dry_run: bool, dirty_duplicates_pct: float) -> None:
engine = build_engine()
with engine.begin() as conn:
parent_req = _compute_parent_requirements(targets)
logger.info("Required parent row counts: %s", parent_req)
with time_phase("parents"):
_seed_parents(conn, parent_req, dry_run)
with time_phase("junctions"):
_seed_all_junctions(conn, targets, dry_run)
if dirty_duplicates_pct > 0:
with time_phase("dirty-duplicates"):
_inject_dirty_data(conn, dirty_duplicates_pct, dry_run)
# ----------------------------------------------------------------------
# CLI
# ----------------------------------------------------------------------
def main() -> None:
parser = argparse.ArgumentParser(
description=__doc__,
formatter_class=argparse.RawDescriptionHelpFormatter,
)
for table, default in DEFAULTS.items():
parser.add_argument(
f"--{table.replace('_', '-')}",
type=int,
default=default,
help=f"target row count for {table} (default: {default:,})",
)
parser.add_argument(
"--dry-run",
"-n",
action="store_true",
help="print planned inserts without writing to the DB",
)
parser.add_argument(
"--dirty-duplicates-pct",
type=float,
default=0,
help=(
"after seeding distinct pairs, inject this percentage of duplicate "
"rows on each non-UNIQUE junction (slice_user, dashboard_user, "
"dashboard_roles). Stress-tests the migration's _dedupe_by_min_id "
"phase. Requires the DB to be at the pre-migration revision "
"(33d7e0e21daa) — the post-migration composite PK rejects "
"duplicates and this will error. Default: 0 (no duplicates)."
),
)
parser.add_argument(
"--verbose",
"-v",
action="store_true",
help="increase log verbosity",
)
args = parser.parse_args()
logging.basicConfig(
level=logging.DEBUG if args.verbose else logging.INFO,
format="%(asctime)s [%(levelname)s] %(message)s",
datefmt="%H:%M:%S",
)
targets = {table: getattr(args, table) for table in DEFAULTS}
logger.info("Targets: %s", targets)
logger.info("Dry run: %s", args.dry_run)
logger.info("Dirty duplicates pct: %g", args.dirty_duplicates_pct)
with time_phase("total"):
run(
targets,
dry_run=args.dry_run,
dirty_duplicates_pct=args.dirty_duplicates_pct,
)
if __name__ == "__main__":
main()

View File

@@ -55,21 +55,10 @@ msgcat --sort-by-msgid --no-wrap --no-location superset/translations/messages.po
cat $LICENSE_TMP superset/translations/messages.pot > messages.pot.tmp \
&& mv messages.pot.tmp superset/translations/messages.pot
# --no-fuzzy-matching: when a *new* source string is added, Babel's fuzzy
# matcher otherwise guesses a "close" existing translation and marks it
# `#, fuzzy` in every language catalog. Those guesses are (a) usually wrong
# (e.g. a new "valuename" string mapped onto an unrelated "table name"
# translation) and (b) counted by check_translation_regression.py as a
# regression, so every PR that merely adds a translatable string failed the
# babel-extract check. Disabling fuzzy matching means new strings land as
# cleanly untranslated (empty msgstr) instead — accurate, and no spurious
# regression. Renames likewise drop the stale translation rather than
# stranding a wrong guess; the string is re-translated by the community.
pybabel update \
-i superset/translations/messages.pot \
-d superset/translations \
--ignore-obsolete \
--no-fuzzy-matching
--ignore-obsolete
# Chop off last blankline from po/pot files, see https://github.com/python-babel/babel/issues/799
for file in $( find superset/translations/** );

View File

@@ -1,686 +0,0 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
"""Backfill missing translations in a .po file using Claude AI.
For each untranslated (empty msgstr) entry in the target language, the script
sends the English source string along with all available translations in other
languages to Claude as context, then writes the AI-generated translation back
into the .po file marked as #, fuzzy for human review.
Usage:
# Build the translation index first (one-time or when .po files change)
python scripts/translations/build_translation_index.py
# Backfill French translations
python scripts/translations/backfill_po.py --lang fr
# Dry run (print what would be translated without writing)
python scripts/translations/backfill_po.py --lang de --dry-run
# Limit to 100 entries and use a specific model
python scripts/translations/backfill_po.py --lang es --limit 100 \
--model claude-opus-4-6
Options:
--lang LANG ISO language code to backfill (required)
--batch-size N Number of strings per Claude request (default: 50)
--limit N Stop after translating N entries (default: unlimited)
--min-context N Skip entries with fewer than N existing translations across
reference languages (default: 0 — translate everything)
--model MODEL Claude model ID (default: claude-sonnet-4-6)
--index PATH Path to translation_index.json (default: auto-detect)
--dry-run Print translations without writing to .po file
--no-fuzzy Do not mark generated translations as fuzzy (default: mark fuzzy)
"""
from __future__ import annotations
import argparse
import json
import re
import shutil
import subprocess
import sys
from pathlib import Path
from typing import Any
try:
import polib # type: ignore[import-untyped]
except ImportError:
print("polib is required. Run: pip install polib", file=sys.stderr)
sys.exit(1)
TRANSLATIONS_DIR = Path(__file__).parent.parent.parent / "superset" / "translations"
DEFAULT_INDEX = TRANSLATIONS_DIR / "translation_index.json"
DEFAULT_MODEL = "claude-sonnet-4-6"
DEFAULT_BATCH_SIZE = 50
_ASF_LICENSE_HEADER = """\
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
"""
# Language names for the prompt, keyed by ISO code
LANGUAGE_NAMES: dict[str, str] = {
"ar": "Arabic",
"ca": "Catalan",
"de": "German",
"es": "Spanish",
"fa": "Persian (Farsi)",
"fr": "French",
"it": "Italian",
"ja": "Japanese",
"ko": "Korean",
"mi": "Māori",
"nl": "Dutch",
"pl": "Polish",
"pt": "Portuguese",
"pt_BR": "Brazilian Portuguese",
"ru": "Russian",
"sk": "Slovak",
"sl": "Slovenian",
"tr": "Turkish",
"uk": "Ukrainian",
"zh": "Chinese (Simplified)",
"zh_TW": "Chinese (Traditional)",
}
def _ensure_license_header(po_path: Path, *, dry_run: bool = False) -> None:
"""Prepend the ASF license header to the .po file if it is missing."""
content = po_path.read_text(encoding="utf-8")
if "Licensed to the Apache Software Foundation" not in content:
if dry_run:
print(
f"[dry-run] Would add ASF license header to {po_path}", file=sys.stderr
)
else:
po_path.write_text(_ASF_LICENSE_HEADER + content, encoding="utf-8")
print(f"Added ASF license header to {po_path}", file=sys.stderr)
def _lang_name(code: str) -> str:
"""Return a human-readable language name for an ISO language code."""
return LANGUAGE_NAMES.get(code, code)
def _plural_key(msgid: str, msgid_plural: str) -> str:
"""Build the translation index key used for pluralized entries."""
return f"{msgid}\x00{msgid_plural}"
def _is_missing(entry: polib.POEntry) -> bool:
"""Return True for entries that need a translation."""
if entry.obsolete:
return False
if entry.msgid_plural:
return not any(v for v in entry.msgstr_plural.values())
return not entry.msgstr
def _context_langs(
item: dict[str, Any], index: dict[str, Any], target_lang: str
) -> list[str]:
"""Return sorted list of language codes that have translations for this entry."""
key = item["index_key"]
if key not in index:
return []
return sorted(
lang for lang, val in index[key].items() if lang != target_lang and val
)
def _context_count(
item: dict[str, Any], index: dict[str, Any], target_lang: str
) -> int:
"""Return the number of other-language translations available for this entry."""
return len(_context_langs(item, index, target_lang))
def _render_item(
i: int,
item: dict[str, Any],
index: dict[str, Any],
target_lang: str,
reference_langs_sorted: list[str],
) -> list[str]:
"""Render one batch entry as prompt lines."""
lines: list[str] = []
ctx = _context_count(item, index, target_lang)
if ctx == 0:
lines.append(
f"--- [{i}] (no reference translations — translate conservatively) ---"
)
else:
plural = "s" if ctx != 1 else ""
lines.append(f"--- [{i}] ({ctx} reference translation{plural}) ---")
lines.append(f"English: {json.dumps(item['msgid'], ensure_ascii=False)}")
if item.get("msgid_plural"):
plural_json = json.dumps(item["msgid_plural"], ensure_ascii=False)
lines.append(f"English plural: {plural_json}")
key = item["index_key"]
if key in index and reference_langs_sorted:
for lang in reference_langs_sorted:
val = index[key].get(lang)
if val is None:
continue
if isinstance(val, dict):
forms = "; ".join(
f"[{k}] {json.dumps(v, ensure_ascii=False)}" for k, v in val.items()
)
lines.append(f"{_lang_name(lang)}: {forms}")
else:
lines.append(
f"{_lang_name(lang)}: {json.dumps(val, ensure_ascii=False)}"
)
lines.append("")
return lines
def build_prompt(
target_lang: str,
batch: list[dict[str, Any]],
index: dict[str, Any],
) -> str:
"""Build the Claude prompt for a batch of entries."""
lang_name = _lang_name(target_lang)
# Collect which other languages actually have translations for this batch
reference_langs: set[str] = set()
for item in batch:
key = item["index_key"]
if key in index:
reference_langs.update(
lang for lang, val in index[key].items() if lang != target_lang and val
)
reference_langs_sorted = sorted(reference_langs)
lines: list[str] = [
"You are a professional translator specializing in software UI strings.",
f"Translate the following English strings into {lang_name} ({target_lang}).",
"",
"Rules:",
"- Preserve all format placeholders exactly: %(name)s, {name}, %s, %d, etc.",
"- Preserve HTML tags if present.",
"- Keep the same tone and register as the reference translations.",
"- For plural forms, provide translations for all plural forms"
" required by the language.",
"- Return ONLY a JSON object mapping each numeric index (as a string)"
" to its translation.",
"- Do not add any explanation, preamble, or markdown fences.",
"",
"Important: Many strings are short fragments or single words that are"
" ambiguous in English (e.g. 'Scale' could mean a measurement scale,"
" to scale an image, or fish scales). Use the translations in other"
" languages as your primary signal for which meaning is intended —"
" they collectively disambiguate the intended sense. When no"
" other-language translations are available for an entry, translate"
" conservatively based on the most common meaning in a data"
" visualization UI context.",
"",
]
if reference_langs_sorted:
lines.append(
f"Reference translations are provided per string where available "
f"({', '.join(_lang_name(lc) for lc in reference_langs_sorted)})."
)
lines.append("")
lines.append("Strings to translate:")
lines.append("")
for i, item in enumerate(batch):
lines.extend(_render_item(i, item, index, target_lang, reference_langs_sorted))
# Add guidance on plural form counts per language whenever ANY entry in
# the batch is plural — batches mix singular and plural in .po order, so
# gating on the first entry would silently drop the guidance whenever
# the plural entries happen to land after a singular one.
if any(item.get("msgid_plural") for item in batch):
lines.append(
"Note: provide ALL plural forms required by the target language "
"(e.g. French needs 2, Russian needs 3, Arabic needs 6)."
)
lines.append("")
lines.append(
'Expected output format: {"0": "<translation>", "1": "<translation>", ...}'
)
lines.append("(keys are the numeric indices of the strings above)")
return "\n".join(lines)
def parse_response(text: str, batch_size: int) -> dict[int, str]:
"""Parse the JSON object from Claude's response."""
# Strip any accidental markdown fences
text = re.sub(r"^```[^\n]*\n", "", text.strip())
text = re.sub(r"\n```$", "", text)
try:
raw = json.loads(text)
except json.JSONDecodeError as exc:
raise ValueError(
f"Could not parse response as JSON: {exc}\n\nResponse:\n{text}"
) from exc
# _process_batches only catches ValueError/RuntimeError, so a non-object
# response (list, scalar, null) must surface as ValueError rather than
# bubbling up an AttributeError from .items() and aborting the whole run.
if not isinstance(raw, dict):
raise ValueError(
f"Expected a JSON object mapping indices to translations, "
f"got {type(raw).__name__}.\n\nResponse:\n{text}"
)
# Preserve dict/list values as JSON strings so plural responses (where
# v is a dict of plural forms) can be re-parsed downstream by
# _apply_translation's json.loads. str(v) on a dict produces Python
# repr ({'0': 'x'}) which is not valid JSON.
return {
int(k): (
json.dumps(v, ensure_ascii=False) if isinstance(v, (dict, list)) else str(v)
)
for k, v in raw.items()
if str(k).isdigit()
}
def translate_batch(
model: str,
target_lang: str,
batch: list[dict[str, Any]],
index: dict[str, Any],
) -> dict[int, str]:
"""Send a batch of strings to Claude via `claude -p`.
Returns a dict mapping batch index to translated string.
"""
claude_bin = shutil.which("claude")
if not claude_bin:
raise RuntimeError(
"claude CLI not found. Install Claude Code or add it to PATH."
)
prompt = build_prompt(target_lang, batch, index)
# Pipe the prompt over stdin rather than passing it as argv: a single batch
# with many reference languages can grow into the tens of KB and approach
# ARG_MAX on some platforms.
# claude_bin is resolved via shutil.which — not user-controlled input
result = subprocess.run( # noqa: S603
[claude_bin, "--model", model, "-p"],
input=prompt,
capture_output=True,
text=True,
check=False,
)
if result.returncode != 0:
raise RuntimeError(
f"claude exited with code {result.returncode}:\n{result.stderr}"
)
return parse_response(result.stdout.strip(), len(batch))
def _apply_plural_translation(entry: polib.POEntry, translation: str) -> None:
"""Distribute a model response across the entry's plural forms.
Model may return a JSON dict ({"0": "form0", "1": "form1"}), a JSON list
(["form0", "form1"], also valid since plural forms are ordered), a JSON
scalar (a single translation that fills every form), or a plain non-JSON
string (older models that ignore the JSON instruction).
"""
try:
plural_value = json.loads(translation)
except (json.JSONDecodeError, ValueError):
for k in entry.msgstr_plural:
entry.msgstr_plural[k] = translation
return
if isinstance(plural_value, dict):
entry.msgstr_plural = {int(k): str(v) for k, v in plural_value.items()}
return
if isinstance(plural_value, list) and plural_value:
# Distribute list items across plural form indices in order; if the
# model returned fewer forms than the language requires, repeat the
# last form rather than leaving slots blank.
forms = [str(v) for v in plural_value]
for k in sorted(entry.msgstr_plural):
entry.msgstr_plural[k] = forms[k] if k < len(forms) else forms[-1]
return
# Scalar (or empty list) — broadcast to every form.
fill = str(plural_value) if plural_value not in (None, []) else translation
for k in entry.msgstr_plural:
entry.msgstr_plural[k] = fill
def _apply_translation(
entry: polib.POEntry,
translation: str,
item: dict[str, Any],
model: str,
mark_fuzzy: bool,
) -> None:
"""Write a translation string into a POEntry and add attribution."""
if entry.msgid_plural:
_apply_plural_translation(entry, translation)
else:
entry.msgstr = translation
if mark_fuzzy and "fuzzy" not in entry.flags:
entry.flags.append("fuzzy")
refs = item["context_langs"]
refs_tag = f" [refs: {', '.join(refs)}]" if refs else " [no refs]"
attribution = f"Machine-translated via backfill_po.py ({model}){refs_tag}"
if entry.tcomment:
if attribution not in entry.tcomment:
entry.tcomment = f"{entry.tcomment}\n{attribution}"
else:
entry.tcomment = attribution
def _build_batch_items(
entries: list[polib.POEntry],
index: dict[str, Any],
lang: str,
) -> list[dict[str, Any]]:
"""Convert a list of POEntries into the dict format used by translate_batch."""
items: list[dict[str, Any]] = []
for entry in entries:
if entry.msgid_plural:
item: dict[str, Any] = {
"msgid": entry.msgid,
"msgid_plural": entry.msgid_plural,
"index_key": _plural_key(entry.msgid, entry.msgid_plural),
"is_plural": True,
}
else:
item = {
"msgid": entry.msgid,
"index_key": entry.msgid,
"is_plural": False,
}
item["context_langs"] = _context_langs(item, index, lang)
item["context_count"] = len(item["context_langs"])
items.append(item)
return items
def _process_batches(
missing: list[polib.POEntry],
index: dict[str, Any],
lang: str,
batch_size: int,
model: str,
dry_run: bool,
mark_fuzzy: bool,
cat: polib.POFile | None = None,
po_path: Path | None = None,
) -> tuple[int, int]:
"""Translate missing entries in batches. Returns (translated, failed) counts.
When ``cat`` and ``po_path`` are provided and ``dry_run`` is False, the
catalog is saved to disk after each batch that produced at least one
successful translation. This means a crash mid-run only loses the in-flight
batch rather than every batch translated so far.
"""
translated_count = 0
failed_count = 0
for batch_start in range(0, len(missing), batch_size):
batch_entries = missing[batch_start : batch_start + batch_size]
batch_items = _build_batch_items(batch_entries, index, lang)
end = min(batch_start + batch_size, len(missing))
print(
f" Translating entries {batch_start + 1}{end} of {len(missing)}",
file=sys.stderr,
)
try:
translations = translate_batch(model, lang, batch_items, index)
except (ValueError, RuntimeError) as exc:
print(f" ERROR in batch starting at {batch_start}: {exc}", file=sys.stderr)
failed_count += len(batch_entries)
continue
batch_applied = 0
for i, entry in enumerate(batch_entries):
translation = translations.get(i)
if translation is None:
print(
f" WARNING: no translation returned for index {i} "
f"(msgid: {entry.msgid[:60]!r})",
file=sys.stderr,
)
failed_count += 1
continue
if dry_run:
ctx = batch_items[i]["context_count"]
ctx_tag = f" [ctx:{ctx}]" if ctx < 3 else ""
print(
f" [{lang}]{ctx_tag} {entry.msgid[:60]!r}{translation[:60]!r}"
)
else:
_apply_translation(
entry, translation, batch_items[i], model, mark_fuzzy
)
batch_applied += 1
translated_count += 1
if (
not dry_run
and batch_applied > 0
and cat is not None
and po_path is not None
):
cat.save()
print(
f" Saved {po_path} ({batch_applied} entry(ies) in this batch).",
file=sys.stderr,
)
return translated_count, failed_count
def backfill(
lang: str,
*,
batch_size: int = DEFAULT_BATCH_SIZE,
limit: int | None = None,
min_context: int = 0,
model: str = DEFAULT_MODEL,
index_path: Path = DEFAULT_INDEX,
dry_run: bool = False,
mark_fuzzy: bool = True,
) -> None:
"""Backfill missing translations in the target language's .po file."""
# Defense against path traversal: ``lang`` lands in a filesystem path
# without further sanitization, so reject anything that isn't an
# ISO 639-1/639-2 code with an optional ISO 3166 region (e.g. ``pt_BR``).
if not re.fullmatch(r"[a-z]{2,3}(_[A-Z]{2})?", lang):
print(
f"Invalid language code: {lang!r} "
"(expected ISO 639 code, optionally with _<REGION>, e.g. 'fr' or 'pt_BR')",
file=sys.stderr,
)
sys.exit(1)
po_path = TRANSLATIONS_DIR / lang / "LC_MESSAGES" / "messages.po"
if not po_path.exists():
print(f"No .po file found for language '{lang}': {po_path}", file=sys.stderr)
sys.exit(1)
if not index_path.exists():
print(
f"Translation index not found at {index_path}.\n"
"Run: python scripts/translations/build_translation_index.py",
file=sys.stderr,
)
sys.exit(1)
print("Loading translation index …", file=sys.stderr)
with open(index_path, encoding="utf-8") as f:
index: dict[str, Any] = json.load(f)
_ensure_license_header(po_path, dry_run=dry_run)
print(f"Loading {po_path}", file=sys.stderr)
cat = polib.pofile(str(po_path))
missing: list[polib.POEntry] = [e for e in cat if e.msgid and _is_missing(e)]
print(f"Found {len(missing)} untranslated entries for '{lang}'.", file=sys.stderr)
if min_context > 0:
before = len(missing)
missing = [
e
for e in missing
if _context_count(
{
"index_key": (
_plural_key(e.msgid, e.msgid_plural)
if e.msgid_plural
else e.msgid
)
},
index,
lang,
)
>= min_context
]
skipped = before - len(missing)
print(
f"Skipping {skipped} entries with fewer than {min_context} reference "
f"translation(s) (use --min-context 0 to include them).",
file=sys.stderr,
)
if limit is not None:
missing = missing[:limit]
print(f"Limiting to {limit} entries.", file=sys.stderr)
if not missing:
print("Nothing to do.", file=sys.stderr)
return
translated_count, failed_count = _process_batches(
missing,
index,
lang,
batch_size,
model,
dry_run,
mark_fuzzy,
cat=cat,
po_path=po_path,
)
print(
f"\nDone. Translated: {translated_count}, Failed/skipped: {failed_count}.",
file=sys.stderr,
)
if not dry_run and translated_count > 0:
print(
f"Translations written to {po_path} (marked #, fuzzy for review).",
file=sys.stderr,
)
def main() -> None:
"""Parse CLI arguments and run translation backfill."""
parser = argparse.ArgumentParser(
description="Backfill missing .po translations using Claude AI",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog=__doc__,
)
parser.add_argument(
"--lang", required=True, help="ISO language code (e.g. fr, de, ja)"
)
parser.add_argument(
"--batch-size",
type=int,
default=DEFAULT_BATCH_SIZE,
help=f"Strings per Claude request (default: {DEFAULT_BATCH_SIZE})",
)
parser.add_argument(
"--limit",
type=int,
default=None,
help="Maximum number of entries to translate (default: unlimited)",
)
parser.add_argument(
"--model",
default=DEFAULT_MODEL,
help=f"Claude model ID (default: {DEFAULT_MODEL})",
)
parser.add_argument(
"--index",
type=Path,
default=DEFAULT_INDEX,
help=f"Path to translation_index.json (default: {DEFAULT_INDEX})",
)
parser.add_argument(
"--dry-run",
action="store_true",
help="Print translations without modifying the .po file",
)
parser.add_argument(
"--min-context",
type=int,
default=0,
metavar="N",
help=(
"Skip entries with fewer than N reference translations in other languages "
"(default: 0 = translate everything). Strings with low context are more "
"likely to be ambiguous single words or fragments — set to e.g. 2 to only "
"translate strings that have been confirmed in at least 2 other languages."
),
)
parser.add_argument(
"--no-fuzzy",
dest="mark_fuzzy",
action="store_false",
default=True,
help=(
"Do not mark generated translations as #, fuzzy. "
"WARNING: fuzzy entries are excluded from compiled .mo files. "
"Removing this flag causes AI-generated translations to be served "
"to end users without human review — only use after you have "
"manually verified the .po file."
),
)
args = parser.parse_args()
backfill(
lang=args.lang,
batch_size=args.batch_size,
limit=args.limit,
min_context=args.min_context,
model=args.model,
index_path=args.index,
dry_run=args.dry_run,
mark_fuzzy=args.mark_fuzzy,
)
if __name__ == "__main__":
main()

View File

@@ -1,153 +0,0 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
"""Build a cross-language translation index from all .po files.
Outputs a JSON file structured as:
{
"<msgid>": {
"<lang>": "<translated string or null>",
...
},
...
}
For plural entries the key is "<msgid>\x00<msgid_plural>" and the value
is a dict mapping lang -> {0: "...", 1: "..."} (or null if untranslated).
Usage:
python scripts/translations/build_translation_index.py
python scripts/translations/build_translation_index.py \
--translations-dir superset/translations \
--output /tmp/translation_index.json
"""
from __future__ import annotations
import argparse
import json
import os
import sys
from pathlib import Path
from typing import Any
try:
import polib # type: ignore[import-untyped]
except ImportError:
print("polib is required. Install with: pip install polib", file=sys.stderr)
sys.exit(1)
TRANSLATIONS_DIR = Path(__file__).parent.parent.parent / "superset" / "translations"
DEFAULT_OUTPUT = (
Path(__file__).parent.parent.parent
/ "superset"
/ "translations"
/ "translation_index.json"
)
def _is_translated(entry: polib.POEntry) -> bool:
"""Return True if the entry has a non-empty, non-fuzzy translation."""
if "fuzzy" in entry.flags:
return False
if entry.msgid_plural:
return any(v for v in entry.msgstr_plural.values())
return bool(entry.msgstr)
def _plural_key(entry: polib.POEntry) -> str:
"""Build the combined key used for plural translation entries."""
return f"{entry.msgid}\x00{entry.msgid_plural}"
def build_index(translations_dir: Path) -> dict[str, Any]:
"""Read all .po files and build a combined translation index."""
index: dict[str, dict[str, Any]] = {}
langs = sorted(
d
for d in os.listdir(translations_dir)
if (translations_dir / d / "LC_MESSAGES" / "messages.po").exists()
and d != "en" # en has empty msgstr by convention (source = target)
)
for lang in langs:
po_path = translations_dir / lang / "LC_MESSAGES" / "messages.po"
cat = polib.pofile(str(po_path))
for entry in cat:
if not entry.msgid:
continue # skip header entry
if entry.msgid_plural:
key = _plural_key(entry)
if key not in index:
index[key] = {}
# Fuzzy entries are unreviewed (often machine-generated drafts),
# so excluding them prevents feeding unverified translations
# back into the AI backfill prompt as trusted context.
index[key][lang] = (
dict(entry.msgstr_plural) if _is_translated(entry) else None
)
else:
key = entry.msgid
if key not in index:
index[key] = {}
index[key][lang] = entry.msgstr if _is_translated(entry) else None
# Ensure every entry has a slot for every language (null if missing)
for key in index:
for lang in langs:
index[key].setdefault(lang, None)
return index
def main() -> None:
"""Parse arguments, build the translation index, and write it to disk."""
parser = argparse.ArgumentParser(
description="Build cross-language translation index"
)
parser.add_argument(
"--translations-dir",
type=Path,
default=TRANSLATIONS_DIR,
help="Path to the translations directory (default: superset/translations)",
)
parser.add_argument(
"--output",
"-o",
type=Path,
default=DEFAULT_OUTPUT,
help=(
"Output JSON file path"
" (default: superset/translations/translation_index.json)"
),
)
args = parser.parse_args()
print(f"Reading .po files from {args.translations_dir}", file=sys.stderr)
index = build_index(args.translations_dir)
print(f"Indexed {len(index)} message IDs.", file=sys.stderr)
args.output.parent.mkdir(parents=True, exist_ok=True)
with open(args.output, "w", encoding="utf-8") as f:
json.dump(index, f, ensure_ascii=False, indent=2)
print(f"Written to {args.output}", file=sys.stderr)
if __name__ == "__main__":
main()

View File

@@ -1,342 +0,0 @@
#!/usr/bin/env python3
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
"""
Check that source-code changes don't cause translation regressions.
What counts as a regression
---------------------------
A regression is an *existing translation that a source change invalidated*.
The check keys on the **increase in fuzzy entries** rather than a drop in the
translated count, because a count drop happens identically for a benign
*deletion* and a real *rename*, so it cannot distinguish the two — whereas a
``#, fuzzy`` marker unambiguously flags a stranded translation.
Note ``babel_update.sh`` runs ``pybabel update`` with ``--no-fuzzy-matching``,
so *adding* (or renaming) a source string does **not** auto-generate a fuzzy
guess against an unrelated existing translation — new strings land as cleanly
untranslated (empty ``msgstr``). This deliberately avoids the prior behaviour
where *every* PR that merely added a translatable string tripped this check on
spurious fuzzies. As a result the check now guards against ``#, fuzzy`` entries
that arrive another way — e.g. a committed ``.po`` edit — rather than ones the
update step synthesises. *Deleting* a string is still not a regression: with
``--ignore-obsolete`` it is simply dropped and no fuzzy is created.
Usage
-----
Count translated + fuzzy entries in all .po files and write JSON to stdout:
python check_translation_regression.py --count
Compare the current .po state against a previously-recorded baseline and fail
if a source change invalidated existing translations (new fuzzies):
python check_translation_regression.py --compare /path/to/before.json
Optionally write a markdown report to a file (used by CI to post a PR comment):
python check_translation_regression.py --compare before.json --report report.md
Use a translations directory other than the repo default (used by CI to count
against a separate base-branch worktree):
python check_translation_regression.py --count \\
--translations-dir /tmp/base-worktree/superset/translations
Typical CI workflow
-------------------
1. Create a base-branch worktree alongside the PR worktree
2. Run babel_update.sh in the base worktree (extract from BASE source)
3. Record baseline: python ... --count --translations-dir BASE_TREE > before.json
4. Run babel_update.sh in the PR worktree (extract from PR source and keep
any committed PR .po updates)
5. Compare: python ... --compare before.json [--report report.md]
Running babel_update on the base branch first isolates regressions caused by
the PR's source diff from any pre-existing drift on the base branch, while the
PR worktree run still allows committed .po updates to resolve the fuzzies (and
thus clear the regression) before merging.
"""
import argparse
import json
import re
import subprocess
import sys
from pathlib import Path
from typing import Optional
DEFAULT_TRANSLATIONS_DIR = (
Path(__file__).resolve().parent.parent.parent / "superset" / "translations"
)
# English .po files use empty msgstr by convention (source language == target),
# so they always show 0 translated entries and should not be checked.
SKIP_LANGS = {"en"}
def count_stats(po_file: Path) -> dict[str, int]:
"""Return ``{"translated": int, "fuzzy": int}`` for a .po file.
``translated`` is the number of non-fuzzy translated messages; ``fuzzy`` is
the number of fuzzy translations. The fuzzy count is what the regression
check keys on — a source rename invalidates an existing translation by
making it fuzzy, whereas a deletion simply drops it (``--ignore-obsolete``).
Raises:
subprocess.CalledProcessError: if ``msgfmt`` fails (e.g. malformed
.po file). The regression check exists to surface translation
problems, so a silent zero would defeat its purpose — let the
caller see a malformed file as a hard failure.
"""
import shutil # noqa: PLC0415
msgfmt = shutil.which("msgfmt") or "msgfmt"
result = subprocess.run( # noqa: S603
[msgfmt, "--statistics", "-o", "/dev/null", str(po_file)],
capture_output=True,
text=True,
check=True,
)
# stderr: "123 translated messages, 4 fuzzy translations, 56 untranslated messages."
# The fuzzy and untranslated clauses are omitted by msgfmt when they are 0.
translated_match = re.search(r"(\d+) translated message", result.stderr)
if not translated_match:
raise RuntimeError(
f"Could not parse msgfmt --statistics output for {po_file}: "
f"{result.stderr!r}"
)
fuzzy_match = re.search(r"(\d+) fuzzy translation", result.stderr)
return {
"translated": int(translated_match.group(1)),
"fuzzy": int(fuzzy_match.group(1)) if fuzzy_match else 0,
}
def get_counts(
translations_dir: Path,
failures: Optional[set[str]] = None,
) -> dict[str, dict[str, int]]:
"""Count translated/fuzzy entries for every ``.po`` file in a directory.
If ``failures`` is provided, the name of each language whose ``.po`` file
is present on disk but could not be counted (msgfmt non-zero exit, or
unparseable output) is added to it. Such a language is deliberately absent
from the returned mapping — but, unlike a language whose catalog was simply
deleted, it must not be mistaken for an intentional removal: a caller that
cares about the distinction (see :func:`cmd_compare`) can inspect
``failures`` and treat it as a hard error.
"""
counts: dict[str, dict[str, int]] = {}
for po_file in sorted(translations_dir.glob("*/LC_MESSAGES/messages.po")):
lang = po_file.parent.parent.name
if lang in SKIP_LANGS:
continue
try:
counts[lang] = count_stats(po_file)
except (subprocess.CalledProcessError, RuntimeError) as exc:
# A malformed .po file (msgfmt non-zero exit, or stderr we
# can't parse) is a real problem worth seeing, but it shouldn't
# take the whole regression check down with it — that would
# hide every other language's status. Skip and warn here; the
# caller is told which langs failed via ``failures`` so it can
# decide whether a present-but-uncountable catalog is fatal.
if failures is not None:
failures.add(lang)
print(
f"WARNING: skipping {lang}{po_file} could not be counted: {exc}",
file=sys.stderr,
)
return counts
def _normalize(entry: object) -> dict[str, int]:
"""Coerce a baseline entry into ``{"translated", "fuzzy"}``.
Tolerates the legacy baseline format where each language mapped directly to
an integer translated count (no fuzzy data); such entries contribute a
fuzzy baseline of 0.
"""
if isinstance(entry, dict):
return {
"translated": int(entry.get("translated", 0)),
"fuzzy": int(entry.get("fuzzy", 0)),
}
if isinstance(entry, int):
return {"translated": entry, "fuzzy": 0}
raise TypeError(f"Unsupported baseline entry: {entry!r}")
def build_regression_report(regressions: list[tuple[str, int, int]]) -> str:
"""Build a markdown report for posting as a PR comment.
Each regression tuple is ``(lang, before_fuzzy, after_fuzzy)``.
"""
rows = "\n".join(
f"| `{lang}` | {b} | {a} | +{a - b} |" for lang, b, a in regressions
)
affected = ", ".join(f"`{lang}`" for lang, _, _ in regressions)
return (
"## ⚠️ Translation Regression Detected\n\n"
f"A source change in this PR renamed or reworded strings, invalidating "
f"existing translations (they are now `#, fuzzy`) in {affected}. Please "
f"resolve the affected `.po` files before merging.\n\n"
"_Note: intentionally **deleting** a translatable string is not a "
"regression and is not flagged here — only translations invalidated by "
"a renamed/reworded source string are._\n\n"
"| Language | Fuzzy before | Fuzzy after | New |\n"
"|----------|-------------:|------------:|----:|\n"
f"{rows}\n\n"
"### How to fix\n\n"
"**1. Install dependencies** (if not already set up):\n\n"
"```bash\n"
"pip install -r superset/translations/requirements.txt\n"
"sudo apt-get install gettext # or: brew install gettext\n"
"```\n\n"
"**2. Re-extract strings and sync `.po` files:**\n\n"
"```bash\n"
"./scripts/translations/babel_update.sh\n"
"```\n\n"
"This rewrites `superset/translations/messages.pot` from the current "
"source files and merges the changes into every `.po` file. Strings "
"whose `msgid` changed will be marked `#, fuzzy`.\n\n"
f"**3. Resolve the fuzzy entries** in the affected language files "
f"({affected}):\n\n"
"```bash\n"
"grep -n '#, fuzzy' superset/translations/<lang>/LC_MESSAGES/messages.po\n"
"```\n\n"
"For each fuzzy entry, either rewrite the `msgstr` to match the new "
"string and remove the `#, fuzzy` line, or clear the `msgstr` to "
'`""` if you cannot provide a translation.\n\n'
"**4. Commit your changes to the `.po` files.**\n"
)
def cmd_count(translations_dir: Path) -> None:
counts = get_counts(translations_dir)
print(json.dumps(counts, indent=2))
def cmd_compare(
before_path: str,
translations_dir: Path,
report_path: Optional[str] = None,
) -> None:
with open(before_path) as f:
before_raw: dict[str, object] = json.load(f)
before = {lang: _normalize(entry) for lang, entry in before_raw.items()}
failures: set[str] = set()
after = get_counts(translations_dir, failures=failures)
# A baseline language whose catalog is *missing* from `after` is fine —
# that's an intentional catalog deletion (handled below like any other
# deletion). But a language whose .po file is still present yet could not
# be counted (msgfmt failed / output unparseable) is a hard error: leaving
# it out silently would let a corrupt catalog pass as "no regression".
broken = sorted(lang for lang in failures if lang in before)
if broken:
print("Translation check failed!\n")
for lang in broken:
print(f" {lang}: catalog present but could not be counted (msgfmt error)")
print(
"\nFix the malformed .po file(s) above before merging — a catalog "
"that cannot be parsed must not be silently dropped."
)
sys.exit(1)
# A regression is an *increase* in fuzzy entries: the PR's source diff
# renamed/reworded strings, leaving their committed translations stranded.
# A plain drop in the translated count is NOT used — deleting a string
# lowers it identically to a rename but is a legitimate change, and with
# `pybabel update --ignore-obsolete` a deletion creates no fuzzy entry.
regressions: list[tuple[str, int, int]] = []
for lang, before_stats in sorted(before.items()):
after_stats = after.get(lang, {"translated": 0, "fuzzy": 0})
if after_stats["fuzzy"] > before_stats["fuzzy"]:
regressions.append((lang, before_stats["fuzzy"], after_stats["fuzzy"]))
if regressions:
print("Translation regression detected!\n")
for lang, b, a in regressions:
print(
f" {lang}: {a - b} translation(s) invalidated "
f"(fuzzy {b} -> {a}) by a renamed/reworded source string"
)
print(
"\nResolve the newly-fuzzy entries in the affected .po files "
"before merging."
)
if report_path:
Path(report_path).write_text(
build_regression_report(regressions), encoding="utf-8"
)
sys.exit(1)
# All good — print a summary so it's easy to read in CI logs.
print("No translation regressions.\n")
for lang in sorted(after):
before_stats = before.get(lang, {"translated": 0, "fuzzy": 0})
after_stats = after[lang]
t_delta = after_stats["translated"] - before_stats["translated"]
f_delta = after_stats["fuzzy"] - before_stats["fuzzy"]
print(
f" {lang}: translated {before_stats['translated']} -> "
f"{after_stats['translated']} ({t_delta:+d}), fuzzy "
f"{before_stats['fuzzy']} -> {after_stats['fuzzy']} ({f_delta:+d})"
)
def main() -> None:
parser = argparse.ArgumentParser(
description="Check for translation regressions in .po files."
)
action = parser.add_mutually_exclusive_group(required=True)
action.add_argument(
"--count",
action="store_true",
help="Output translation counts per language as JSON.",
)
action.add_argument(
"--compare",
metavar="BEFORE_JSON",
help="Compare current counts against a baseline JSON file.",
)
parser.add_argument(
"--report",
metavar="REPORT_MD",
help="When --compare detects regressions, write a markdown report here.",
)
parser.add_argument(
"--translations-dir",
type=Path,
default=DEFAULT_TRANSLATIONS_DIR,
help=(
"Path to the translations directory containing per-language "
"LC_MESSAGES/messages.po files (default: <repo>/superset/translations)."
),
)
args = parser.parse_args()
if args.count:
cmd_count(args.translations_dir)
else:
cmd_compare(args.compare, args.translations_dir, args.report)
if __name__ == "__main__":
main()

View File

@@ -22,41 +22,15 @@ set -e
# If not already running in Docker, run this script inside Docker
if [ -z "$RUNNING_IN_DOCKER" ]; then
# Extract "current" Python version from CI config (single source of truth)
PYTHON_VERSION=$(grep -A 1 'if.*"current"' .github/actions/setup-backend/action.yml | grep 'RESOLVED_VERSION=' | sed 's/.*RESOLVED_VERSION="\([0-9.]*\)".*/\1/')
if [ -z "$PYTHON_VERSION" ]; then
echo "Failed to determine Python version from .github/actions/setup-backend/action.yml" >&2
exit 1
fi
PYTHON_VERSION=$(grep -A 1 'if.*"current"' .github/actions/setup-backend/action.yml | grep 'PYTHON_VERSION=' | sed 's/.*PYTHON_VERSION=\([0-9.]*\).*/\1/')
echo "Running in Docker (Python ${PYTHON_VERSION} on Linux)..."
IMAGE="python:${PYTHON_VERSION}-slim"
# Pre-pull the image with a few retries to absorb transient Docker Hub
# registry failures ("context deadline exceeded" / anonymous rate-limit blips
# on shared CI runners). Without this a flaky pull fails the whole
# check-python-deps job on an infrastructure hiccup rather than a real
# dependency drift. The pull is in the `until` condition so `set -e` does not
# abort on an individual failed attempt.
attempt=1
max_attempts=4
until docker pull "$IMAGE"; do
if [ "$attempt" -ge "$max_attempts" ]; then
echo "docker pull $IMAGE failed after ${max_attempts} attempts" >&2
exit 1
fi
delay=$((attempt * 10))
echo "docker pull $IMAGE failed (attempt ${attempt}/${max_attempts}); retrying in ${delay}s..." >&2
sleep "$delay"
attempt=$((attempt + 1))
done
docker run --rm \
-v "$(pwd)":/app \
-w /app \
-e RUNNING_IN_DOCKER=1 \
"$IMAGE" \
python:${PYTHON_VERSION}-slim \
bash -c "pip install uv && ./scripts/uv-pip-compile.sh $*"
exit $?

View File

@@ -48,7 +48,7 @@ dependencies = [
"pydantic>=2.8.0",
"sqlalchemy>=1.4.0,<2.0",
"sqlalchemy-utils>=0.38.0, <0.43", # expanding lowerbound to work with pydoris
"sqlglot>=30.8.0, <31",
"sqlglot>=28.10.0, <29",
"typing-extensions>=4.0.0",
]

View File

@@ -92,26 +92,6 @@ class Dimension:
grain: Grain | None = None
class AggregationType(str, enum.Enum):
"""
Aggregation function applied by a metric.
Additivity (across an arbitrary set of grouping dimensions):
* ``SUM``, ``COUNT``: fully additive — sub-group sums roll up via ``sum``.
* ``MIN``, ``MAX``: roll up via ``min`` / ``max`` of sub-group values.
* ``AVG``, ``COUNT_DISTINCT``, ``OTHER``: not safely roll-uppable from
sub-aggregates without auxiliary data.
"""
SUM = "SUM"
COUNT = "COUNT"
MIN = "MIN"
MAX = "MAX"
AVG = "AVG"
COUNT_DISTINCT = "COUNT_DISTINCT"
OTHER = "OTHER"
@dataclass(frozen=True)
class Metric:
id: str
@@ -120,7 +100,6 @@ class Metric:
definition: str
description: str | None = None
aggregation: AggregationType | None = None
@dataclass(frozen=True)

View File

@@ -1 +1 @@
v24.16.0
v22.22.0

View File

@@ -29,8 +29,8 @@ Embedding is done by inserting an iframe, containing a Superset page, into the h
## Prerequisites
- Activate the feature flag `EMBEDDED_SUPERSET`
- Set a strong password in configuration variable `GUEST_TOKEN_JWT_SECRET` (see configuration file config.py). Be aware that its default value must be changed in production.
* Activate the feature flag `EMBEDDED_SUPERSET`
* Set a strong password in configuration variable `GUEST_TOKEN_JWT_SECRET` (see configuration file config.py). Be aware that its default value must be changed in production.
## Embedding a Dashboard
@@ -41,37 +41,32 @@ npm install --save @superset-ui/embedded-sdk
```
```js
import { embedDashboard } from '@superset-ui/embedded-sdk';
import { embedDashboard } from "@superset-ui/embedded-sdk";
embedDashboard({
id: 'abc123', // given by the Superset embedding UI
supersetDomain: 'https://superset.example.com',
mountPoint: document.getElementById('my-superset-container'), // any html element that can contain an iframe
id: "abc123", // given by the Superset embedding UI
supersetDomain: "https://superset.example.com",
mountPoint: document.getElementById("my-superset-container"), // any html element that can contain an iframe
fetchGuestToken: () => fetchGuestTokenFromBackend(),
dashboardUiConfig: {
// dashboard UI config: hideTitle, hideTab, hideChartControls, filters.visible, filters.expanded (optional), urlParams (optional)
hideTitle: true,
filters: {
expanded: true,
},
urlParams: {
foo: 'value1',
bar: 'value2',
// themeMode: 'dark', // set the initial theme: 'dark' | 'system' | 'default' (default: 'default')
// ...
},
dashboardUiConfig: { // dashboard UI config: hideTitle, hideTab, hideChartControls, filters.visible, filters.expanded (optional), urlParams (optional)
hideTitle: true,
filters: {
expanded: true,
},
urlParams: {
foo: 'value1',
bar: 'value2',
// ...
}
},
// optional additional iframe sandbox attributes
iframeSandboxExtras: [
'allow-top-navigation',
'allow-popups-to-escape-sandbox',
],
iframeSandboxExtras: ['allow-top-navigation', 'allow-popups-to-escape-sandbox'],
// optional Permissions Policy features
iframeAllowExtras: ['clipboard-write', 'fullscreen'],
// optional config to enforce a particular referrerPolicy
referrerPolicy: 'same-origin',
referrerPolicy: "same-origin",
// optional callback to customize permalink URLs
resolvePermalinkUrl: ({ key }) => `https://my-app.com/analytics/share/${key}`,
resolvePermalinkUrl: ({ key }) => `https://my-app.com/analytics/share/${key}`
});
```
@@ -102,7 +97,7 @@ Guest tokens can have Row Level Security rules which filter data for the user ca
The agent making the `POST` request must be authenticated with the `can_grant_guest_token` permission.
Within your app, using the Guest Token will then allow authentication to your Superset instance via creating an Anonymous user object. This guest anonymous user will default to the public role as per this setting `GUEST_ROLE_NAME = "Public"`.
Within your app, using the Guest Token will then allow authentication to your Superset instance via creating an Anonymous user object. This guest anonymous user will default to the public role as per this setting `GUEST_ROLE_NAME = "Public"`.
The user parameters in the example below are optional and are provided as a means of passing user attributes that may be accessed in jinja templates inside your charts.
@@ -115,13 +110,13 @@ Example `POST /security/guest_token` payload:
"first_name": "Stan",
"last_name": "Lee"
},
"resources": [
{
"type": "dashboard",
"id": "abc123"
}
],
"rls": [{ "clause": "publisher = 'Nintendo'" }]
"resources": [{
"type": "dashboard",
"id": "abc123"
}],
"rls": [
{ "clause": "publisher = 'Nintendo'" }
]
}
```
@@ -157,43 +152,15 @@ In this example, the configuration file includes the following setting:
GUEST_TOKEN_JWT_AUDIENCE="superset"
```
### Setting the Initial Theme Mode
Use the `themeMode` URL parameter to control the embedded dashboard's initial colour scheme:
```js
embedDashboard({
id: 'abc123',
supersetDomain: 'https://superset.example.com',
mountPoint: document.getElementById('my-superset-container'),
fetchGuestToken: () => fetchGuestTokenFromBackend(),
dashboardUiConfig: {
urlParams: {
themeMode: 'dark', // 'dark' | 'system' | 'default' (default: 'default')
},
},
});
```
The supported values are:
| Value | Behaviour |
| --------- | --------------------------------------------------------- |
| `default` | Light theme (Superset default) |
| `dark` | Dark theme |
| `system` | Follows the user's OS preference (`prefers-color-scheme`) |
The theme can also be changed at runtime via `embeddedDashboard.setThemeMode(mode)`.
### Sandbox iframe
The Embedded SDK creates an iframe with [sandbox](https://developer.mozilla.org/es/docs/Web/HTML/Element/iframe#sandbox) mode by default
which applies certain restrictions to the iframe's content.
To pass additional sandbox attributes you can use `iframeSandboxExtras`:
```js
// optional additional iframe sandbox attributes
iframeSandboxExtras: ['allow-top-navigation', 'allow-popups-to-escape-sandbox'];
// optional additional iframe sandbox attributes
iframeSandboxExtras: ['allow-top-navigation', 'allow-popups-to-escape-sandbox']
```
### Permissions Policy
@@ -201,12 +168,11 @@ iframeSandboxExtras: ['allow-top-navigation', 'allow-popups-to-escape-sandbox'];
To enable specific browser features within the embedded iframe, use `iframeAllowExtras` to set the iframe's [Permissions Policy](https://developer.mozilla.org/en-US/docs/Web/HTTP/Permissions_Policy) (the `allow` attribute):
```js
// optional Permissions Policy features
iframeAllowExtras: ['clipboard-write', 'fullscreen'];
// optional Permissions Policy features
iframeAllowExtras: ['clipboard-write', 'fullscreen']
```
Common permissions you might need:
- `clipboard-write` - Required for "Copy permalink to clipboard" functionality
- `fullscreen` - Required for fullscreen chart viewing
- `camera`, `microphone` - If your dashboards include media capture features
@@ -225,16 +191,16 @@ When users click share buttons inside an embedded dashboard, Superset generates
```js
embedDashboard({
id: 'abc123',
supersetDomain: 'https://superset.example.com',
mountPoint: document.getElementById('my-superset-container'),
id: "abc123",
supersetDomain: "https://superset.example.com",
mountPoint: document.getElementById("my-superset-container"),
fetchGuestToken: () => fetchGuestTokenFromBackend(),
// Customize permalink URLs
resolvePermalinkUrl: ({ key }) => {
// key: the permalink key (e.g., "xyz789")
return `https://my-app.com/analytics/share/${key}`;
},
}
});
```
@@ -245,15 +211,15 @@ To restore the dashboard state from a permalink in your app:
const permalinkKey = routeParams.key;
embedDashboard({
id: 'abc123',
supersetDomain: 'https://superset.example.com',
mountPoint: document.getElementById('my-superset-container'),
id: "abc123",
supersetDomain: "https://superset.example.com",
mountPoint: document.getElementById("my-superset-container"),
fetchGuestToken: () => fetchGuestTokenFromBackend(),
resolvePermalinkUrl: ({ key }) => `https://my-app.com/analytics/share/${key}`,
dashboardUiConfig: {
urlParams: {
permalink_key: permalinkKey, // Restores filters, tabs, chart states, and scrolls to anchor
},
},
permalink_key: permalinkKey, // Restores filters, tabs, chart states, and scrolls to anchor
}
}
});
```

View File

@@ -27,6 +27,19 @@
"webpack-cli": "^5.1.4"
}
},
"node_modules/@ampproject/remapping": {
"version": "2.3.0",
"resolved": "https://registry.npmjs.org/@ampproject/remapping/-/remapping-2.3.0.tgz",
"integrity": "sha512-30iZtAPgz+LTIYoeivqYo853f02jBYSd5uGnGpkFV0M3xOt9aN73erkgYAmZU43x4VfqcnLxW9Kpg3R5LC4YYw==",
"dev": true,
"dependencies": {
"@jridgewell/gen-mapping": "^0.3.5",
"@jridgewell/trace-mapping": "^0.3.24"
},
"engines": {
"node": ">=6.0.0"
}
},
"node_modules/@babel/cli": {
"version": "7.25.6",
"resolved": "https://registry.npmjs.org/@babel/cli/-/cli-7.25.6.tgz",
@@ -58,12 +71,12 @@
}
},
"node_modules/@babel/code-frame": {
"version": "7.29.7",
"resolved": "https://registry.npmjs.org/@babel/code-frame/-/code-frame-7.29.7.tgz",
"integrity": "sha512-Aup7aUOfpbAUg2ROOJN6Iw5f9DMBlzu0mIkm/malLQFN/YQgO48wCj0Kxa3sEHJvPVFg7siR+qRInwXd2qhQKw==",
"version": "7.29.0",
"resolved": "https://registry.npmjs.org/@babel/code-frame/-/code-frame-7.29.0.tgz",
"integrity": "sha512-9NhCeYjq9+3uxgdtp20LSiJXJvN0FeCtNGpJxuMFZ1Kv3cWUNb6DOhJwUvcVCzKGR66cw4njwM6hrJLqgOwbcw==",
"dev": true,
"dependencies": {
"@babel/helper-validator-identifier": "^7.29.7",
"@babel/helper-validator-identifier": "^7.28.5",
"js-tokens": "^4.0.0",
"picocolors": "^1.1.1"
},
@@ -72,30 +85,32 @@
}
},
"node_modules/@babel/compat-data": {
"version": "7.29.7",
"resolved": "https://registry.npmjs.org/@babel/compat-data/-/compat-data-7.29.7.tgz",
"integrity": "sha512-locTkQyKvwIEgBzVrn8693ebc97F2U8ZHjbXwDXJ5Fn2TCpNwTlKcaKLkdHop5c/icOFE7qt7Q9JC5hnKNa6Gg==",
"version": "7.25.4",
"resolved": "https://registry.npmjs.org/@babel/compat-data/-/compat-data-7.25.4.tgz",
"integrity": "sha512-+LGRog6RAsCJrrrg/IO6LGmpphNe5DiK30dGjCoxxeGv49B10/3XYGxPsAwrDlMFcFEvdAUavDT8r9k/hSyQqQ==",
"dev": true,
"license": "MIT",
"engines": {
"node": ">=6.9.0"
}
},
"node_modules/@babel/core": {
"version": "7.29.6",
"resolved": "https://registry.npmjs.org/@babel/core/-/core-7.29.6.tgz",
"integrity": "sha512-QdxmAo/ikZqqRGA8s43ww8lcql6naWRvEz0FFrl6MIlc7Gi6TroXnSdWa5U/kq6fzcpqpHesicQxFZIieZbyIA==",
"version": "7.25.2",
"resolved": "https://registry.npmjs.org/@babel/core/-/core-7.25.2.tgz",
"integrity": "sha512-BBt3opiCOxUr9euZ5/ro/Xv8/V7yJ5bjYMqG/C1YAo8MIKAnumZalCN+msbci3Pigy4lIQfPUpfMM27HMGaYEA==",
"dev": true,
"license": "MIT",
"dependencies": {
"@babel/code-frame": "^7.29.0",
"@babel/generator": "^7.29.6",
"@babel/helper-compilation-targets": "^7.28.6",
"@babel/helper-module-transforms": "^7.28.6",
"@babel/helpers": "^7.29.2",
"@babel/parser": "^7.29.3",
"@babel/template": "^7.28.6",
"@babel/traverse": "^7.29.0",
"@babel/types": "^7.29.0",
"@jridgewell/remapping": "^2.3.5",
"@ampproject/remapping": "^2.2.0",
"@babel/code-frame": "^7.24.7",
"@babel/generator": "^7.25.0",
"@babel/helper-compilation-targets": "^7.25.2",
"@babel/helper-module-transforms": "^7.25.2",
"@babel/helpers": "^7.25.0",
"@babel/parser": "^7.25.0",
"@babel/template": "^7.25.0",
"@babel/traverse": "^7.25.2",
"@babel/types": "^7.25.2",
"convert-source-map": "^2.0.0",
"debug": "^4.1.0",
"gensync": "^1.0.0-beta.2",
@@ -111,13 +126,13 @@
}
},
"node_modules/@babel/generator": {
"version": "7.29.7",
"resolved": "https://registry.npmjs.org/@babel/generator/-/generator-7.29.7.tgz",
"integrity": "sha512-DkXD5OJQaAQIdZ1bt3UZdEnHAn9Imd3IVBdX03UFe+ony9Ojw5pzr9YVKGDY1jt+Gcn/FnGkNf8r+Vj5NOJWtQ==",
"version": "7.29.1",
"resolved": "https://registry.npmjs.org/@babel/generator/-/generator-7.29.1.tgz",
"integrity": "sha512-qsaF+9Qcm2Qv8SRIMMscAvG4O3lJ0F1GuMo5HR/Bp02LopNgnZBC/EkbevHFeGs4ls/oPz9v+Bsmzbkbe+0dUw==",
"dev": true,
"dependencies": {
"@babel/parser": "^7.29.7",
"@babel/types": "^7.29.7",
"@babel/parser": "^7.29.0",
"@babel/types": "^7.29.0",
"@jridgewell/gen-mapping": "^0.3.12",
"@jridgewell/trace-mapping": "^0.3.28",
"jsesc": "^3.0.2"
@@ -154,14 +169,15 @@
}
},
"node_modules/@babel/helper-compilation-targets": {
"version": "7.29.7",
"resolved": "https://registry.npmjs.org/@babel/helper-compilation-targets/-/helper-compilation-targets-7.29.7.tgz",
"integrity": "sha512-wem6WaBj4NaVYVdNhLPPVacES6ZJ+KBBfSkTMD3YZxbP3rm3Di85tJU5ljaUNhaOynt+Aj0xruhYuzQBt8n71g==",
"version": "7.25.2",
"resolved": "https://registry.npmjs.org/@babel/helper-compilation-targets/-/helper-compilation-targets-7.25.2.tgz",
"integrity": "sha512-U2U5LsSaZ7TAt3cfaymQ8WHh0pxvdHoEk6HVpaexxixjyEquMh0L0YNJNM6CTGKMXV1iksi0iZkGw4AcFkPaaw==",
"dev": true,
"license": "MIT",
"dependencies": {
"@babel/compat-data": "^7.29.7",
"@babel/helper-validator-option": "^7.29.7",
"browserslist": "^4.24.0",
"@babel/compat-data": "^7.25.2",
"@babel/helper-validator-option": "^7.24.8",
"browserslist": "^4.23.1",
"lru-cache": "^5.1.1",
"semver": "^6.3.1"
},
@@ -366,28 +382,29 @@
}
},
"node_modules/@babel/helper-string-parser": {
"version": "7.29.7",
"resolved": "https://registry.npmjs.org/@babel/helper-string-parser/-/helper-string-parser-7.29.7.tgz",
"integrity": "sha512-Pb5ijPrZ89GDH8223L4UP8i6QApWxs04RbPQJTeWDV0/keR2E36MeKnyr6LYmUUvqRRI+Iv87SuF1W6ErINzYw==",
"version": "7.27.1",
"resolved": "https://registry.npmjs.org/@babel/helper-string-parser/-/helper-string-parser-7.27.1.tgz",
"integrity": "sha512-qMlSxKbpRlAridDExk92nSobyDdpPijUq2DW6oDnUqd0iOGxmQjyqhMIihI9+zv4LPyZdRje2cavWPbCbWm3eA==",
"dev": true,
"engines": {
"node": ">=6.9.0"
}
},
"node_modules/@babel/helper-validator-identifier": {
"version": "7.29.7",
"resolved": "https://registry.npmjs.org/@babel/helper-validator-identifier/-/helper-validator-identifier-7.29.7.tgz",
"integrity": "sha512-qehxGkRj55h/ff8EMaJ+cYhyaKlHIxqYDn682wQD7RNp9UujOQsHog2uS0r2vzr4pW+sXf90NeeayjcNaX3fFg==",
"version": "7.28.5",
"resolved": "https://registry.npmjs.org/@babel/helper-validator-identifier/-/helper-validator-identifier-7.28.5.tgz",
"integrity": "sha512-qSs4ifwzKJSV39ucNjsvc6WVHs6b7S03sOh2OcHF9UHfVPqWWALUsNUVzhSBiItjRZoLHx7nIarVjqKVusUZ1Q==",
"dev": true,
"engines": {
"node": ">=6.9.0"
}
},
"node_modules/@babel/helper-validator-option": {
"version": "7.29.7",
"resolved": "https://registry.npmjs.org/@babel/helper-validator-option/-/helper-validator-option-7.29.7.tgz",
"integrity": "sha512-N9ZErrD+yW5geCDtBqnOoxmR8+tNKiGuxKlDpuJxfsqpa2dFcexaziGAE/qoHLiDDreVNMupxGmSoNlyvsA3gw==",
"version": "7.24.8",
"resolved": "https://registry.npmjs.org/@babel/helper-validator-option/-/helper-validator-option-7.24.8.tgz",
"integrity": "sha512-xb8t9tD1MHLungh/AIoWYN+gVHaB9kwlu8gffXGSt3FFEIT7RjS+xWbc2vUD1UTZdIpKj/ab3rdqJ7ufngyi2Q==",
"dev": true,
"license": "MIT",
"engines": {
"node": ">=6.9.0"
}
@@ -408,25 +425,26 @@
}
},
"node_modules/@babel/helpers": {
"version": "7.29.7",
"resolved": "https://registry.npmjs.org/@babel/helpers/-/helpers-7.29.7.tgz",
"integrity": "sha512-1k2lAGRMfHTcwuNYcCNUmaUffmQv8KWMfh2iJUUeRlwlwH4FdNG7mfPI10NPfLHJFThE4Tyr4mv7kTNZOiPuBg==",
"version": "7.25.6",
"resolved": "https://registry.npmjs.org/@babel/helpers/-/helpers-7.25.6.tgz",
"integrity": "sha512-Xg0tn4HcfTijTwfDwYlvVCl43V6h4KyVVX2aEm4qdO/PC6L2YvzLHFdmxhoeSA3eslcE6+ZVXHgWwopXYLNq4Q==",
"dev": true,
"license": "MIT",
"dependencies": {
"@babel/template": "^7.29.7",
"@babel/types": "^7.29.7"
"@babel/template": "^7.25.0",
"@babel/types": "^7.25.6"
},
"engines": {
"node": ">=6.9.0"
}
},
"node_modules/@babel/parser": {
"version": "7.29.7",
"resolved": "https://registry.npmjs.org/@babel/parser/-/parser-7.29.7.tgz",
"integrity": "sha512-hnORnjP/1P/zFEndoeX+n+t1RwWRJiJpM/jO7FW32Kn9r5+sJB2JWOdYo4L6k78j15eCwY3Gm/7364B1EMwtNg==",
"version": "7.29.3",
"resolved": "https://registry.npmjs.org/@babel/parser/-/parser-7.29.3.tgz",
"integrity": "sha512-b3ctpQwp+PROvU/cttc4OYl4MzfJUWy6FZg+PMXfzmt/+39iHVF0sDfqay8TQM3JA2EUOyKcFZt75jWriQijsA==",
"dev": true,
"dependencies": {
"@babel/types": "^7.29.7"
"@babel/types": "^7.29.0"
},
"bin": {
"parser": "bin/babel-parser.js"
@@ -1825,14 +1843,14 @@
}
},
"node_modules/@babel/template": {
"version": "7.29.7",
"resolved": "https://registry.npmjs.org/@babel/template/-/template-7.29.7.tgz",
"integrity": "sha512-puq+Gf35oI24FeN11LkoUQFqv9uwNeWpxXZi/Ji3rRIoKAzKnxRaZ+Gkj0vKS9ZCiTESfng1N9LyOyXvo+m+Gg==",
"version": "7.28.6",
"resolved": "https://registry.npmjs.org/@babel/template/-/template-7.28.6.tgz",
"integrity": "sha512-YA6Ma2KsCdGb+WC6UpBVFJGXL58MDA6oyONbjyF/+5sBgxY/dwkhLogbMT2GXXyU84/IhRw/2D1Os1B/giz+BQ==",
"dev": true,
"dependencies": {
"@babel/code-frame": "^7.29.7",
"@babel/parser": "^7.29.7",
"@babel/types": "^7.29.7"
"@babel/code-frame": "^7.28.6",
"@babel/parser": "^7.28.6",
"@babel/types": "^7.28.6"
},
"engines": {
"node": ">=6.9.0"
@@ -1857,13 +1875,13 @@
}
},
"node_modules/@babel/types": {
"version": "7.29.7",
"resolved": "https://registry.npmjs.org/@babel/types/-/types-7.29.7.tgz",
"integrity": "sha512-4zBIxpPzowiZpusoFkyGVwakdRJUyuH5PxQ/PrqghfdFWWasvnCdPfQXHrenDai+gyLARulZjZowCOj6fjT4pA==",
"version": "7.29.0",
"resolved": "https://registry.npmjs.org/@babel/types/-/types-7.29.0.tgz",
"integrity": "sha512-LwdZHpScM4Qz8Xw2iKSzS+cfglZzJGvofQICy7W7v4caru4EaAmyUuO6BGrbyQ2mYV11W0U8j5mBhd14dd3B0A==",
"dev": true,
"dependencies": {
"@babel/helper-string-parser": "^7.29.7",
"@babel/helper-validator-identifier": "^7.29.7"
"@babel/helper-string-parser": "^7.27.1",
"@babel/helper-validator-identifier": "^7.28.5"
},
"engines": {
"node": ">=6.9.0"
@@ -2631,16 +2649,6 @@
"@jridgewell/trace-mapping": "^0.3.24"
}
},
"node_modules/@jridgewell/remapping": {
"version": "2.3.5",
"resolved": "https://registry.npmjs.org/@jridgewell/remapping/-/remapping-2.3.5.tgz",
"integrity": "sha512-LI9u/+laYG4Ds1TDKSJW2YPrIlcVYOwi2fUC6xB43lueCjgxV4lffOCZCtYFiH6TNOX+tQKXx97T4IKHbhyHEQ==",
"dev": true,
"dependencies": {
"@jridgewell/gen-mapping": "^0.3.5",
"@jridgewell/trace-mapping": "^0.3.24"
}
},
"node_modules/@jridgewell/resolve-uri": {
"version": "3.1.0",
"resolved": "https://registry.npmjs.org/@jridgewell/resolve-uri/-/resolve-uri-3.1.0.tgz",
@@ -7975,6 +7983,16 @@
}
},
"dependencies": {
"@ampproject/remapping": {
"version": "2.3.0",
"resolved": "https://registry.npmjs.org/@ampproject/remapping/-/remapping-2.3.0.tgz",
"integrity": "sha512-30iZtAPgz+LTIYoeivqYo853f02jBYSd5uGnGpkFV0M3xOt9aN73erkgYAmZU43x4VfqcnLxW9Kpg3R5LC4YYw==",
"dev": true,
"requires": {
"@jridgewell/gen-mapping": "^0.3.5",
"@jridgewell/trace-mapping": "^0.3.24"
}
},
"@babel/cli": {
"version": "7.25.6",
"resolved": "https://registry.npmjs.org/@babel/cli/-/cli-7.25.6.tgz",
@@ -7993,38 +8011,38 @@
}
},
"@babel/code-frame": {
"version": "7.29.7",
"resolved": "https://registry.npmjs.org/@babel/code-frame/-/code-frame-7.29.7.tgz",
"integrity": "sha512-Aup7aUOfpbAUg2ROOJN6Iw5f9DMBlzu0mIkm/malLQFN/YQgO48wCj0Kxa3sEHJvPVFg7siR+qRInwXd2qhQKw==",
"version": "7.29.0",
"resolved": "https://registry.npmjs.org/@babel/code-frame/-/code-frame-7.29.0.tgz",
"integrity": "sha512-9NhCeYjq9+3uxgdtp20LSiJXJvN0FeCtNGpJxuMFZ1Kv3cWUNb6DOhJwUvcVCzKGR66cw4njwM6hrJLqgOwbcw==",
"dev": true,
"requires": {
"@babel/helper-validator-identifier": "^7.29.7",
"@babel/helper-validator-identifier": "^7.28.5",
"js-tokens": "^4.0.0",
"picocolors": "^1.1.1"
}
},
"@babel/compat-data": {
"version": "7.29.7",
"resolved": "https://registry.npmjs.org/@babel/compat-data/-/compat-data-7.29.7.tgz",
"integrity": "sha512-locTkQyKvwIEgBzVrn8693ebc97F2U8ZHjbXwDXJ5Fn2TCpNwTlKcaKLkdHop5c/icOFE7qt7Q9JC5hnKNa6Gg==",
"version": "7.25.4",
"resolved": "https://registry.npmjs.org/@babel/compat-data/-/compat-data-7.25.4.tgz",
"integrity": "sha512-+LGRog6RAsCJrrrg/IO6LGmpphNe5DiK30dGjCoxxeGv49B10/3XYGxPsAwrDlMFcFEvdAUavDT8r9k/hSyQqQ==",
"dev": true
},
"@babel/core": {
"version": "7.29.6",
"resolved": "https://registry.npmjs.org/@babel/core/-/core-7.29.6.tgz",
"integrity": "sha512-QdxmAo/ikZqqRGA8s43ww8lcql6naWRvEz0FFrl6MIlc7Gi6TroXnSdWa5U/kq6fzcpqpHesicQxFZIieZbyIA==",
"version": "7.25.2",
"resolved": "https://registry.npmjs.org/@babel/core/-/core-7.25.2.tgz",
"integrity": "sha512-BBt3opiCOxUr9euZ5/ro/Xv8/V7yJ5bjYMqG/C1YAo8MIKAnumZalCN+msbci3Pigy4lIQfPUpfMM27HMGaYEA==",
"dev": true,
"requires": {
"@babel/code-frame": "^7.29.0",
"@babel/generator": "^7.29.6",
"@babel/helper-compilation-targets": "^7.28.6",
"@babel/helper-module-transforms": "^7.28.6",
"@babel/helpers": "^7.29.2",
"@babel/parser": "^7.29.3",
"@babel/template": "^7.28.6",
"@babel/traverse": "^7.29.0",
"@babel/types": "^7.29.0",
"@jridgewell/remapping": "^2.3.5",
"@ampproject/remapping": "^2.2.0",
"@babel/code-frame": "^7.24.7",
"@babel/generator": "^7.25.0",
"@babel/helper-compilation-targets": "^7.25.2",
"@babel/helper-module-transforms": "^7.25.2",
"@babel/helpers": "^7.25.0",
"@babel/parser": "^7.25.0",
"@babel/template": "^7.25.0",
"@babel/traverse": "^7.25.2",
"@babel/types": "^7.25.2",
"convert-source-map": "^2.0.0",
"debug": "^4.1.0",
"gensync": "^1.0.0-beta.2",
@@ -8033,13 +8051,13 @@
}
},
"@babel/generator": {
"version": "7.29.7",
"resolved": "https://registry.npmjs.org/@babel/generator/-/generator-7.29.7.tgz",
"integrity": "sha512-DkXD5OJQaAQIdZ1bt3UZdEnHAn9Imd3IVBdX03UFe+ony9Ojw5pzr9YVKGDY1jt+Gcn/FnGkNf8r+Vj5NOJWtQ==",
"version": "7.29.1",
"resolved": "https://registry.npmjs.org/@babel/generator/-/generator-7.29.1.tgz",
"integrity": "sha512-qsaF+9Qcm2Qv8SRIMMscAvG4O3lJ0F1GuMo5HR/Bp02LopNgnZBC/EkbevHFeGs4ls/oPz9v+Bsmzbkbe+0dUw==",
"dev": true,
"requires": {
"@babel/parser": "^7.29.7",
"@babel/types": "^7.29.7",
"@babel/parser": "^7.29.0",
"@babel/types": "^7.29.0",
"@jridgewell/gen-mapping": "^0.3.12",
"@jridgewell/trace-mapping": "^0.3.28",
"jsesc": "^3.0.2"
@@ -8065,14 +8083,14 @@
}
},
"@babel/helper-compilation-targets": {
"version": "7.29.7",
"resolved": "https://registry.npmjs.org/@babel/helper-compilation-targets/-/helper-compilation-targets-7.29.7.tgz",
"integrity": "sha512-wem6WaBj4NaVYVdNhLPPVacES6ZJ+KBBfSkTMD3YZxbP3rm3Di85tJU5ljaUNhaOynt+Aj0xruhYuzQBt8n71g==",
"version": "7.25.2",
"resolved": "https://registry.npmjs.org/@babel/helper-compilation-targets/-/helper-compilation-targets-7.25.2.tgz",
"integrity": "sha512-U2U5LsSaZ7TAt3cfaymQ8WHh0pxvdHoEk6HVpaexxixjyEquMh0L0YNJNM6CTGKMXV1iksi0iZkGw4AcFkPaaw==",
"dev": true,
"requires": {
"@babel/compat-data": "^7.29.7",
"@babel/helper-validator-option": "^7.29.7",
"browserslist": "^4.24.0",
"@babel/compat-data": "^7.25.2",
"@babel/helper-validator-option": "^7.24.8",
"browserslist": "^4.23.1",
"lru-cache": "^5.1.1",
"semver": "^6.3.1"
}
@@ -8211,21 +8229,21 @@
}
},
"@babel/helper-string-parser": {
"version": "7.29.7",
"resolved": "https://registry.npmjs.org/@babel/helper-string-parser/-/helper-string-parser-7.29.7.tgz",
"integrity": "sha512-Pb5ijPrZ89GDH8223L4UP8i6QApWxs04RbPQJTeWDV0/keR2E36MeKnyr6LYmUUvqRRI+Iv87SuF1W6ErINzYw==",
"version": "7.27.1",
"resolved": "https://registry.npmjs.org/@babel/helper-string-parser/-/helper-string-parser-7.27.1.tgz",
"integrity": "sha512-qMlSxKbpRlAridDExk92nSobyDdpPijUq2DW6oDnUqd0iOGxmQjyqhMIihI9+zv4LPyZdRje2cavWPbCbWm3eA==",
"dev": true
},
"@babel/helper-validator-identifier": {
"version": "7.29.7",
"resolved": "https://registry.npmjs.org/@babel/helper-validator-identifier/-/helper-validator-identifier-7.29.7.tgz",
"integrity": "sha512-qehxGkRj55h/ff8EMaJ+cYhyaKlHIxqYDn682wQD7RNp9UujOQsHog2uS0r2vzr4pW+sXf90NeeayjcNaX3fFg==",
"version": "7.28.5",
"resolved": "https://registry.npmjs.org/@babel/helper-validator-identifier/-/helper-validator-identifier-7.28.5.tgz",
"integrity": "sha512-qSs4ifwzKJSV39ucNjsvc6WVHs6b7S03sOh2OcHF9UHfVPqWWALUsNUVzhSBiItjRZoLHx7nIarVjqKVusUZ1Q==",
"dev": true
},
"@babel/helper-validator-option": {
"version": "7.29.7",
"resolved": "https://registry.npmjs.org/@babel/helper-validator-option/-/helper-validator-option-7.29.7.tgz",
"integrity": "sha512-N9ZErrD+yW5geCDtBqnOoxmR8+tNKiGuxKlDpuJxfsqpa2dFcexaziGAE/qoHLiDDreVNMupxGmSoNlyvsA3gw==",
"version": "7.24.8",
"resolved": "https://registry.npmjs.org/@babel/helper-validator-option/-/helper-validator-option-7.24.8.tgz",
"integrity": "sha512-xb8t9tD1MHLungh/AIoWYN+gVHaB9kwlu8gffXGSt3FFEIT7RjS+xWbc2vUD1UTZdIpKj/ab3rdqJ7ufngyi2Q==",
"dev": true
},
"@babel/helper-wrap-function": {
@@ -8240,22 +8258,22 @@
}
},
"@babel/helpers": {
"version": "7.29.7",
"resolved": "https://registry.npmjs.org/@babel/helpers/-/helpers-7.29.7.tgz",
"integrity": "sha512-1k2lAGRMfHTcwuNYcCNUmaUffmQv8KWMfh2iJUUeRlwlwH4FdNG7mfPI10NPfLHJFThE4Tyr4mv7kTNZOiPuBg==",
"version": "7.25.6",
"resolved": "https://registry.npmjs.org/@babel/helpers/-/helpers-7.25.6.tgz",
"integrity": "sha512-Xg0tn4HcfTijTwfDwYlvVCl43V6h4KyVVX2aEm4qdO/PC6L2YvzLHFdmxhoeSA3eslcE6+ZVXHgWwopXYLNq4Q==",
"dev": true,
"requires": {
"@babel/template": "^7.29.7",
"@babel/types": "^7.29.7"
"@babel/template": "^7.25.0",
"@babel/types": "^7.25.6"
}
},
"@babel/parser": {
"version": "7.29.7",
"resolved": "https://registry.npmjs.org/@babel/parser/-/parser-7.29.7.tgz",
"integrity": "sha512-hnORnjP/1P/zFEndoeX+n+t1RwWRJiJpM/jO7FW32Kn9r5+sJB2JWOdYo4L6k78j15eCwY3Gm/7364B1EMwtNg==",
"version": "7.29.3",
"resolved": "https://registry.npmjs.org/@babel/parser/-/parser-7.29.3.tgz",
"integrity": "sha512-b3ctpQwp+PROvU/cttc4OYl4MzfJUWy6FZg+PMXfzmt/+39iHVF0sDfqay8TQM3JA2EUOyKcFZt75jWriQijsA==",
"dev": true,
"requires": {
"@babel/types": "^7.29.7"
"@babel/types": "^7.29.0"
}
},
"@babel/plugin-bugfix-firefox-class-in-computed-class-key": {
@@ -9139,14 +9157,14 @@
}
},
"@babel/template": {
"version": "7.29.7",
"resolved": "https://registry.npmjs.org/@babel/template/-/template-7.29.7.tgz",
"integrity": "sha512-puq+Gf35oI24FeN11LkoUQFqv9uwNeWpxXZi/Ji3rRIoKAzKnxRaZ+Gkj0vKS9ZCiTESfng1N9LyOyXvo+m+Gg==",
"version": "7.28.6",
"resolved": "https://registry.npmjs.org/@babel/template/-/template-7.28.6.tgz",
"integrity": "sha512-YA6Ma2KsCdGb+WC6UpBVFJGXL58MDA6oyONbjyF/+5sBgxY/dwkhLogbMT2GXXyU84/IhRw/2D1Os1B/giz+BQ==",
"dev": true,
"requires": {
"@babel/code-frame": "^7.29.7",
"@babel/parser": "^7.29.7",
"@babel/types": "^7.29.7"
"@babel/code-frame": "^7.28.6",
"@babel/parser": "^7.28.6",
"@babel/types": "^7.28.6"
}
},
"@babel/traverse": {
@@ -9165,13 +9183,13 @@
}
},
"@babel/types": {
"version": "7.29.7",
"resolved": "https://registry.npmjs.org/@babel/types/-/types-7.29.7.tgz",
"integrity": "sha512-4zBIxpPzowiZpusoFkyGVwakdRJUyuH5PxQ/PrqghfdFWWasvnCdPfQXHrenDai+gyLARulZjZowCOj6fjT4pA==",
"version": "7.29.0",
"resolved": "https://registry.npmjs.org/@babel/types/-/types-7.29.0.tgz",
"integrity": "sha512-LwdZHpScM4Qz8Xw2iKSzS+cfglZzJGvofQICy7W7v4caru4EaAmyUuO6BGrbyQ2mYV11W0U8j5mBhd14dd3B0A==",
"dev": true,
"requires": {
"@babel/helper-string-parser": "^7.29.7",
"@babel/helper-validator-identifier": "^7.29.7"
"@babel/helper-string-parser": "^7.27.1",
"@babel/helper-validator-identifier": "^7.28.5"
}
},
"@bcoe/v8-coverage": {
@@ -9753,16 +9771,6 @@
"@jridgewell/trace-mapping": "^0.3.24"
}
},
"@jridgewell/remapping": {
"version": "2.3.5",
"resolved": "https://registry.npmjs.org/@jridgewell/remapping/-/remapping-2.3.5.tgz",
"integrity": "sha512-LI9u/+laYG4Ds1TDKSJW2YPrIlcVYOwi2fUC6xB43lueCjgxV4lffOCZCtYFiH6TNOX+tQKXx97T4IKHbhyHEQ==",
"dev": true,
"requires": {
"@jridgewell/gen-mapping": "^0.3.5",
"@jridgewell/trace-mapping": "^0.3.24"
}
},
"@jridgewell/resolve-uri": {
"version": "3.1.0",
"resolved": "https://registry.npmjs.org/@jridgewell/resolve-uri/-/resolve-uri-3.1.0.tgz",

Some files were not shown because too many files have changed in this diff Show More