Snapshots all four versioned Docusaurus sections at v6.1.0, cut from
master after the version-cutting tooling (#39837) and broken-internal-
links fixes (#40102) landed. Captures fresh auto-generated content and
freezes data dependencies so the historical snapshot stays correct.
Versioning behavior: lastVersion stays at current for every section,
so the canonical URLs (/docs/..., /admin-docs/..., /developer-docs/...,
/components/...) continue to render content from master. The current
version is consistently labeled "Next" with an unreleased banner, and
6.1.0 is a historical pin accessible only via its explicit version
segment.
Component playground: previously disabled: true in versions-config.json,
now enabled and versioned. The plugin block in docusaurus.config.ts
was already gated only by the disabled flag, so no other code changes
were needed to bring it back online.
Snapshot includes:
- All MDX content for the four sections.
- Auto-gen captured fresh: 74 database pages (engine spec metadata),
~1,800 API reference files (openapi.json), 59 component pages
(Storybook stories).
- Data imports frozen at cut time into snapshot-local _versioned_data/
dirs:
versioned_docs/version-6.1.0/_versioned_data/src/data/databases.json
(canonical 80-database diagnostics from master, preserved by the
generator's input-hash cache)
admin_docs_versioned_docs/version-6.1.0/_versioned_data/data/countries.json
admin_docs_versioned_docs/version-6.1.0/_versioned_data/static/feature-flags.json
developer_docs_versioned_docs/version-6.1.0/_versioned_data/static/data/components.json
- Import paths in deeply-nested files rewritten so they still resolve
from one directory deeper inside the snapshot.
Verified via full yarn build: exit 0, no broken links surfaced by
onBrokenLinks: throw. Anchor warnings present are pre-existing on
master (community#superset-community-calendar) and unrelated.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Joe Li <joe@preset.io>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Joe Li <joe@preset.io>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Joe Li <joe@preset.io>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Joe Li <joe@preset.io>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: codeant-ai-for-open-source[bot] <244253245+codeant-ai-for-open-source[bot]@users.noreply.github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Joe Li <joe@preset.io>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Joe Li <joe@preset.io>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: codeant-ai-for-open-source[bot] <244253245+codeant-ai-for-open-source[bot]@users.noreply.github.com>
Co-authored-by: Diego Pucci <diegopucci.me@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: codeant-ai-for-open-source[bot] <244253245+codeant-ai-for-open-source[bot]@users.noreply.github.com>
We kindly ask you to include the following information in your report:
- Apache Superset version that you are using
- A sanitized copy of your `superset_config.py` file or any config overrides
-Detailed steps to reproduce the vulnerability
**Submission Standards & AI Policy**
To ensure engineering focus remains on verified risks and to manage high reporting volumes, all reports must meet the following criteria:
-Plain Text Format: In accordance with Apache guidelines, please provide all details in plain text within the email body. Avoid sending PDFs, Word documents, or password-protected archives.
- Mandatory AI Disclosure: If you utilized Large Language Models (LLMs) or AI tools to identify a flaw or assist in writing a report, you must disclose this in your submission so our triage team can contextualize the findings.
- Human-Verified PoC: All submissions must include a manual, step-by-step Proof of Concept (PoC) performed on a supported release. Raw AI outputs, hypothetical chat transcripts, or unverified scanner logs will be closed as Invalid.
We kindly ask you to include the following information in your report to assist our developers in triaging and remediating issues efficiently:
- Version/Commit: The specific version of Apache Superset or the Git commit hash you are using.
- Configuration: A sanitized copy of your `superset_config.py` file or any config overrides.
- Environment: Your deployment method (e.g., Docker Compose, Helm, or source) and relevant OS/Browser details.
- Impacted Component: Identification of the affected area (e.g., Python backend, React frontend, or a specific database connector).
- Expected vs. Actual Behavior: A clear description of the intended system behavior versus the observed vulnerability.
- Detailed Reproduction Steps: Clear, manual steps to reproduce the vulnerability.
**Out of Scope Vulnerabilities**
To prioritize engineering efforts on genuine architectural risks, the following scenarios are explicitly out of scope and will not be issued a CVE:
- Attacks requiring Admin privileges: (e.g., CSS injection, template manipulation, dashboard ownership overrides, or modifying global system settings). Per the CVE vulnerability definition in CNA Operational Rules 4.1, a qualifying vulnerability must allow violation of a security policy. The Admin role is a fully trusted operational boundary defined by Apache Superset's security policy; actions within this boundary do not violate that policy and are therefore considered intended capabilities 'by design,' not vulnerabilities.
- Brute Force and Rate Limiting: Reports targeting a lack of resource exhaustion protections, generic rate-limiting, or volumetric Denial of Service (DoS) attempts.
- Theoretical attack vectors: Issues without a demonstrable, reproducible exploit path.
- Non-Exploitable Findings: Missing security headers, generic banner disclosures, or descriptive error messages that do not lead to a direct, documented exploit.
**Outcome of Reports**
Reports that are deemed out-of-scope for a CVE but represent valid security best practices or hardening opportunities may be converted into public GitHub issues. This allows the community to contribute to the general hardening of the platform even when a specific vulnerability threshold is not met.
Note that Apache Superset is not responsible for any third-party dependencies that may
have security issues. Any vulnerabilities found in third-party dependencies should be
@@ -29,6 +51,13 @@ reported to the maintainers of those projects. Results from security scans of Ap
Superset dependencies found on its official Docker image can be remediated at release time
by extending the image itself.
**Vulnerability Aggregation & CVE Attribution**
In accordance with MITRE CNA Operational Rules (4.1.10, 4.1.11, and 4.2.13), Apache Superset issues CVEs based on the underlying architectural root cause rather than the number of affected endpoints or exploit payloads.
- Aggregation: If multiple exploit vectors stem from the same programmatic failure or shared vulnerable code, they must be aggregated into a single, comprehensive report.
- Independent Fixes: Separate CVEs will only be assigned if the vulnerabilities reside in decoupled architectural modules and can be fixed independently of one another.
Reports that fail to aggregate related findings will be merged during triage to ensure an accurate and defensible CVE record.
**Your responsible disclosure and collaboration are invaluable.**
`❗ @${pull.user.login} Your base branch \`${currentBranch}\` has ` +
'also updated `superset/migrations`.\n' +
'\n' +
'**Please consider rebasing your branch and [resolving potential db migration conflicts](https://github.com/apache/superset/blob/master/CONTRIBUTING.md#merging-db-migrations).**',
'**Please consider rebasing your branch and [resolving potential db migration conflicts](https://superset.apache.org/docs/contributing/development#merging-db-migrations).**',
Looks like your PR contains new `.js` or `.jsx` files:
```
${{steps.check.outputs.js_files_added}}
```
As decided in [SIP-36](https://github.com/apache/superset/issues/9101), all new frontend code should be written in TypeScript. Please convert above files to TypeScript then re-request review.
A modern, enterprise-ready business intelligence web application.
### Documentation
- **[User Guide](https://superset.apache.org/user-docs/)** — For analysts and business users. Explore data, build charts, create dashboards, and connect databases.
- **[Administrator Guide](https://superset.apache.org/admin-docs/)** — Install, configure, and operate Superset. Covers security, scaling, and database drivers.
- **[Developer Guide](https://superset.apache.org/developer-docs/)** — Contribute to Superset or build on its REST API and extension framework.
[**Why Superset?**](#why-superset) |
[**Supported Databases**](#supported-databases) |
[**Installation and Configuration**](#installation-and-configuration) |
Superset can query data from any SQL-speaking datastore or data engine (Presto, Trino, Athena, [and more](https://superset.apache.org/docs/configuration/databases)) that has a Python DB-API driver and a SQLAlchemy dialect.
Superset can query data from any SQL-speaking datastore or data engine (Presto, Trino, Athena, [and more](https://superset.apache.org/docs/databases)) that has a Python DB-API driver and a SQLAlchemy dialect.
Here are some of the major database solutions that are supported:
**A more comprehensive list of supported databases** along with the configuration instructions can be found [here](https://superset.apache.org/docs/configuration/databases).
**A more comprehensive list of supported databases** along with the configuration instructions can be found [here](https://superset.apache.org/docs/databases).
Want to add support for your datastore or data engine? Read more [here](https://superset.apache.org/docs/frequently-asked-questions#does-superset-work-with-insert-database-engine-here) about the technical requirements.
@@ -165,14 +195,14 @@ Try out Superset's [quickstart](https://superset.apache.org/docs/quickstart/) gu
to find resources around contributing along with a detailed guide on
how to set up a development environment.
## Resources
- [Superset "In the Wild"](https://superset.apache.org/inTheWild) - see who's using Superset, and [add your organization](https://github.com/apache/superset/edit/master/RESOURCES/INTHEWILD.yaml) to the list!
- [Feature Flags](https://github.com/apache/superset/blob/master/RESOURCES/FEATURE_FLAGS.md) - the status of Superset's Feature Flags.
- [Feature Flags](https://superset.apache.org/docs/configuration/feature-flags) - the status of Superset's Feature Flags.
- [Standard Roles](https://github.com/apache/superset/blob/master/RESOURCES/STANDARD_ROLES.md) - How RBAC permissions map to roles.
- [Superset Wiki](https://github.com/apache/superset/wiki) - Tons of additional community resources: best practices, community content and other information.
- [Superset SIPs](https://github.com/orgs/apache/projects/170) - The status of Superset's SIPs (Superset Improvement Proposals) for both consensus and implementation status.
# Part 2: Verify RSA key - this is the same as running `gpg --verify {release}.asc {release}` and comparing the RSA key and email address against the KEYS file # noqa: E501
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
# Superset Feature Flags
This is a list of the current Superset optional features. See config.py for default values. These features can be turned on/off by setting your preferred values in superset_config.py to True/False respectively
## In Development
These features are considered **unfinished** and should only be used on development environments.
[//]: # "PLEASE KEEP THE LIST SORTED ALPHABETICALLY"
- ALERT_REPORT_TABS
- DATE_RANGE_TIMESHIFTS_ENABLED
- ENABLE_ADVANCED_DATA_TYPES
- PRESTO_EXPAND_DATA
- SHARE_QUERIES_VIA_KV_STORE
- TAGGING_SYSTEM
- CHART_PLUGINS_EXPERIMENTAL
## In Testing
These features are **finished** but currently being tested. They are usable, but may still contain some bugs.
[//]: # "PLEASE KEEP THE LIST SORTED ALPHABETICALLY"
These features flags currently default to True and **will be removed in a future major release**. For this current release you can turn them off by setting your config to False, but it is advised to remove or set these flags in your local configuration to **True** so that you do not experience any unexpected changes in a future release.
[//]: # "PLEASE KEEP THE LIST SORTED ALPHABETICALLY"
When the feature flag is enabled, these permissions are enforced on both the frontend (disabled buttons with tooltips) and backend (403 responses from API endpoints). When disabled, legacy `can_csv` behavior is preserved.
**Migration behavior:** All three new permissions are granted to every role that currently has `can_csv`, preserving existing access. Admins can then selectively revoke individual export permissions from specific roles as needed.
### Deck.gl MapBox viewport and opacity controls are functional
The Deck.gl MapBox chart's **Opacity**, **Default longitude**, **Default latitude**, and **Zoom** controls were previously non-functional — changing them had no effect on the rendered map. These controls are now wired up correctly.
**Behavior change for existing charts:** Previously, the viewport controls had hard-coded default values (`-122.405293`, `37.772123`, zoom `11` — San Francisco) that were stored in each chart's `form_data` but never applied. The map always used `fitBounds` to center on the data. With this fix, those stored values are now respected, which means existing MapBox charts may open centered on the old default coordinates instead of fitting to data bounds.
**To restore fit-to-data behavior:** Open the chart in Explore, clear the **Default longitude**, **Default latitude**, and **Zoom** fields in the Viewport section, and re-save the chart.
### Combined datasource list endpoint
Added a new combined datasource list endpoint at `GET /api/v1/datasource/` to serve datasets and semantic views in one response.
- The endpoint is available to users with at least one of `can_read` on `Dataset` or `SemanticView`.
- Semantic views are included only when the `SEMANTIC_LAYERS` feature flag is enabled.
- The endpoint enforces strict `order_column` validation and returns `400` for invalid sort columns.
### ClickHouse minimum driver version bump
The minimum required version of `clickhouse-connect` has been raised to `>=0.13.0`. If you are using the ClickHouse connector, please upgrade your `clickhouse-connect` package. The `_mutate_label` workaround that appended hash suffixes to column aliases has also been removed, as it is no longer needed with modern versions of the driver.
### MCP Tool Observability
MCP (Model Context Protocol) tools now include enhanced observability instrumentation for monitoring and debugging:
**Two-layer instrumentation:**
1.**Middleware layer** (`LoggingMiddleware`): Automatically logs all MCP tool calls with `duration_ms` and `success` status in the audit log (Action Log UI, logs table)
2.**Sub-operation tracking**: All 19 MCP tools include granular `event_logger.log_context()` blocks for tracking individual operations like validation, database writes, and query execution
**Security note:** Sensitive parameters (passwords, API keys, tokens) are automatically redacted in logs as `[REDACTED]`.
### Distributed Coordination Backend
A new `DISTRIBUTED_COORDINATION_CONFIG` configuration provides a unified Redis-based backend for real-time coordination features in Superset. This backend enables:
- **Pub/sub messaging** for real-time event notifications between workers
- **Atomic distributed locking** using Redis SET NX EX (more performant than database-backed locks)
- **Event-based coordination** for background task management
The distributed coordination is used by the Global Task Framework (GTF) for abort notifications and task completion signaling, and will eventually replace `GLOBAL_ASYNC_QUERIES_CACHE_BACKEND` as the standard signaling backend. Configuring this is recommended for Redis enabled production deployments.
Example configuration in `superset_config.py`:
```python
DISTRIBUTED_COORDINATION_CONFIG={
"CACHE_TYPE":"RedisCache",
"CACHE_KEY_PREFIX":"signal_",
"CACHE_REDIS_URL":"redis://localhost:6379/1",
"CACHE_DEFAULT_TIMEOUT":300,
}
```
See `superset/config.py` for complete configuration options.
### WebSocket config for GAQ with Docker
[35896](https://github.com/apache/superset/pull/35896) and [37624](https://github.com/apache/superset/pull/37624) updated documentation on how to run and configure Superset with Docker. Specifically for the WebSocket configuration, a new `docker/superset-websocket/config.example.json` was added to the repo, so that users could copy it to create a `docker/superset-websocket/config.json` file. The existing `docker/superset-websocket/config.json` was removed and git-ignored, so if you're using GAQ / WebSocket make sure to:
- Stash/backup your existing `config.json` file, to re-apply it after (will get git-ignored going forward)
- Update the `volumes` configuration for the `superset-websocket` service in your `docker-compose.override.yml` file, to include the `docker/superset-websocket/config.json` file. For example:
- Auto-discovery: create `superset/examples/my_dataset/data.parquet` to add a new example
- Parquet is an Apache project format: compressed (~27% smaller), self-describing schema
- YAML configs define datasets, charts, and dashboards declaratively
- Removed Python-based data generation from individual example files
#### Test Data Reorganization
- Moved `big_data.py` to `superset/cli/test_loaders.py` - better reflects its purpose as a test utility
- Fixed inverted logic for `--load-test-data` flag (now correctly includes .test.yaml files when flag is set)
- Clarified CLI flags:
- `--force` / `-f`: Force reload even if tables exist
- `--only-metadata` / `-m`: Create table metadata without loading data
- `--load-test-data` / `-t`: Include test dashboards and .test.yaml configs
- `--load-big-data` / `-b`: Generate synthetic stress-test data
#### Bug Fixes
- Fixed numpy array serialization for PostgreSQL (converts complex types to JSON strings)
- Fixed KeyError for `allow_csv_upload` field in database configs (now optional with default)
- Fixed test data loading logic that was incorrectly filtering files
### MCP Service
The MCP (Model Context Protocol) service enables AI assistants and automation tools to interact programmatically with Superset.
@@ -128,6 +269,28 @@ See `superset/mcp_service/PRODUCTION.md` for deployment guides.
- [35062](https://github.com/apache/superset/pull/35062): Changed the function signature of `setupExtensions` to `setupCodeOverrides` with options as arguments.
### Breaking Changes
- [37370](https://github.com/apache/superset/pull/37370): The `APP_NAME` configuration variable no longer controls the browser window/tab title or other frontend branding. Application names should now be configured using the theme system with the `brandAppName` token. The `APP_NAME` config is still used for backend contexts (MCP service, logs, etc.) and serves as a fallback if `brandAppName` is not set.
- **Migration:**
```python
# Before (Superset 5.x)
APP_NAME = "My Custom App"
# After (Superset 6.x) - Option 1: Use theme system (recommended)
# After (Superset 6.x) - Option 2: Temporary fallback
# Keep APP_NAME for now (will be used as fallback for brandAppName)
APP_NAME = "My Custom App"
# But you should migrate to THEME_DEFAULT.token.brandAppName
```
- **Note:** For dark mode, set the same tokens in `THEME_DARK` configuration.
- [36317](https://github.com/apache/superset/pull/36317): The `CUSTOM_FONT_URLS` configuration option has been removed. Use the new per-theme `fontUrls` token in `THEME_DEFAULT` or database-managed themes instead.
- **Before:**
```python
@@ -142,7 +305,7 @@ See `superset/mcp_service/PRODUCTION.md` for deployment guides.
"fontUrls": [
"https://fonts.example.com/myfont.css",
],
# ... other tokens
# ... other tokens
}
}
```
@@ -165,14 +328,14 @@ Note: Pillow is now a required dependency (previously optional) to support image
- [33116](https://github.com/apache/superset/pull/33116) In Echarts Series charts (e.g. Line, Area, Bar, etc.) charts, the `x_axis_sort_series` and `x_axis_sort_series_ascending` form data items have been renamed with `x_axis_sort` and `x_axis_sort_asc`.
There's a migration added that can potentially affect a significant number of existing charts.
- [32317](https://github.com/apache/superset/pull/32317) The horizontal filter bar feature is now out of testing/beta development and its feature flag `HORIZONTAL_FILTER_BAR` has been removed.
- [31590](https://github.com/apache/superset/pull/31590) Marks the begining of intricate work around supporting dynamic Theming, and breaks support for [THEME_OVERRIDES](https://github.com/apache/superset/blob/732de4ac7fae88e29b7f123b6cbb2d7cd411b0e4/superset/config.py#L671) in favor of a new theming system based on AntD V5. Likely this will be in disrepair until settling over the 5.x lifecycle.
- [32432](https://github.com/apache/superset/pull/31260) Moves the List Roles FAB view to the frontend and requires `FAB_ADD_SECURITY_API` to be enabled in the configuration and `superset init` to be executed.
- [31590](https://github.com/apache/superset/pull/31590) Marks the beginning of intricate work around supporting dynamic Theming, and breaks support for [THEME_OVERRIDES](https://github.com/apache/superset/blob/732de4ac7fae88e29b7f123b6cbb2d7cd411b0e4/superset/config.py#L671) in favor of a new theming system based on AntD V5. Likely this will be in disrepair until settling over the 5.x lifecycle.
- [32432](https://github.com/apache/superset/pull/32432) Moves the List Roles FAB view to the frontend and requires `FAB_ADD_SECURITY_API` to be enabled in the configuration and `superset init` to be executed.
- [34319](https://github.com/apache/superset/pull/34319) Drill to Detail and Drill By is now supported in Embedded mode, and also with the `DASHBOARD_RBAC` FF. If you don't want to expose these features in Embedded / `DASHBOARD_RBAC`, make sure the roles used for Embedded / `DASHBOARD_RBAC`don't have the required permissions to perform D2D actions.
## 5.0.0
- [31976](https://github.com/apache/superset/pull/31976) Removed the `DISABLE_LEGACY_DATASOURCE_EDITOR` feature flag. The previous value of the feature flag was `True` and now the feature is permanently removed.
- [31959](https://github.com/apache/superset/pull/32000) Removes CSV_UPLOAD_MAX_SIZE config, use your web server to control file upload size.
- [32000](https://github.com/apache/superset/pull/32000) Removes CSV_UPLOAD_MAX_SIZE config, use your web server to control file upload size.
- [31959](https://github.com/apache/superset/pull/31959) Removes the following endpoints from data uploads: `/api/v1/database/<id>/<file type>_upload` and `/api/v1/database/<file type>_metadata`, in favour of new one (Details on the PR). And simplifies permissions.
- [31844](https://github.com/apache/superset/pull/31844) The `ALERT_REPORTS_EXECUTE_AS` and `THUMBNAILS_EXECUTE_AS` config parameters have been renamed to `ALERT_REPORTS_EXECUTORS` and `THUMBNAILS_EXECUTORS` respectively. A new config flag `CACHE_WARMUP_EXECUTORS` has also been introduced to be able to control which user is used to execute cache warmup tasks. Finally, the config flag `THUMBNAILS_SELENIUM_USER` has been removed. To use a fixed executor for async tasks, use the new `FixedExecutor` class. See the config and docs for more info on setting up different executor profiles.
- [31894](https://github.com/apache/superset/pull/31894) Domain sharding is deprecated in favor of HTTP2. The `SUPERSET_WEBSERVER_DOMAINS` configuration will be removed in the next major version (6.0)
@@ -180,7 +343,7 @@ Note: Pillow is now a required dependency (previously optional) to support image
- [31774](https://github.com/apache/superset/pull/31774): Fixes the spelling of the `USE-ANALAGOUS-COLORS` feature flag. Please update any scripts/configuration item to use the new/corrected `USE-ANALOGOUS-COLORS` flag spelling.
- [31582](https://github.com/apache/superset/pull/31582) Removed the legacy Area, Bar, Event Flow, Heatmap, Histogram, Line, Sankey, and Sankey Loop charts. They were all automatically migrated to their ECharts counterparts with the exception of the Event Flow and Sankey Loop charts which were removed as they were not actively maintained and not widely used. If you were using the Event Flow or Sankey Loop charts, you will need to find an alternative solution.
- [31198](https://github.com/apache/superset/pull/31198) Disallows by default the use of the following ClickHouse functions: "version", "currentDatabase", "hostName".
- [29798](https://github.com/apache/superset/pull/29798) Since 3.1.0, the intial schedule for an alert or report was mistakenly offset by the specified timezone's relation to UTC. The initial schedule should now begin at the correct time.
- [29798](https://github.com/apache/superset/pull/29798) Since 3.1.0, the initial schedule for an alert or report was mistakenly offset by the specified timezone's relation to UTC. The initial schedule should now begin at the correct time.
- [30021](https://github.com/apache/superset/pull/30021) The `dev` layer in our Dockerfile no long includes firefox binaries, only Chromium to reduce bloat/docker-build-time.
- [30099](https://github.com/apache/superset/pull/30099) Translations are no longer included in the default docker image builds. If your environment requires translations, you'll want to set the docker build arg `BUILD_TRANSLATIONS=true`.
- [31262](https://github.com/apache/superset/pull/31262) NOTE: deprecated `pylint` in favor of `ruff` as our only python linter. Only affect development workflows positively (not the release itself). It should cover most important rules, be much faster, but some things linting rules that were enforced before may not be enforce in the exact same way as before.
@@ -193,7 +356,7 @@ Note: Pillow is now a required dependency (previously optional) to support image
- [25166](https://github.com/apache/superset/pull/25166) Changed the default configuration of `UPLOAD_FOLDER` from `/app/static/uploads/` to `/static/uploads/`. It also removed the unused `IMG_UPLOAD_FOLDER` and `IMG_UPLOAD_URL` configuration options.
- [30284](https://github.com/apache/superset/pull/30284) Deprecated GLOBAL_ASYNC_QUERIES_REDIS_CONFIG in favor of the new GLOBAL_ASYNC_QUERIES_CACHE_BACKEND configuration. To leverage Redis Sentinel, set CACHE_TYPE to RedisSentinelCache, or use RedisCache for standalone Redis
- [31961](https://github.com/apache/superset/pull/31961) Upgraded React from version 16.13.1 to 17.0.2. If you are using custom frontend extensions or plugins, you may need to update them to be compatible with React 17.
- [31260](https://github.com/apache/superset/pull/31260) Docker images now use `uv pip install` instead of `pip install` to manage the python envrionment. Most docker-based deployments will be affected, whether you derive one of the published images, or have custom bootstrap script that install python libraries (drivers)
- [31260](https://github.com/apache/superset/pull/31260) Docker images now use `uv pip install` instead of `pip install` to manage the python environment. Most docker-based deployments will be affected, whether you derive one of the published images, or have custom bootstrap script that install python libraries (drivers)
### Potential Downtime
@@ -270,7 +433,7 @@ Note: Pillow is now a required dependency (previously optional) to support image
- [26462](https://github.com/apache/superset/issues/26462): Removes the Profile feature given that it's not actively maintained and not widely used.
- [26377](https://github.com/apache/superset/pull/26377): Removes the deprecated Redirect API that supported short URLs used before the permalink feature.
- [26329](https://github.com/apache/superset/issues/26329): Removes the deprecated `DASHBOARD_NATIVE_FILTERS` feature flag. The previous value of the feature flag was `True` and now the feature is permanently enabled.
- [25510](https://github.com/apache/superset/pull/25510): Reenforces that any newly defined Python data format (other than epoch) must adhere to the ISO 8601 standard (enforced by way of validation at the API and database level) after a previous relaxation to include slashes in addition to dashes. From now on when specifying new columns, dataset owners will need to use a SQL expression instead to convert their string columns of the form %Y/%m/%d etc. to a `DATE`, `DATETIME`, etc. type.
- [25510](https://github.com/apache/superset/pull/25510): Reinforces that any newly defined Python data format (other than epoch) must adhere to the ISO 8601 standard (enforced by way of validation at the API and database level) after a previous relaxation to include slashes in addition to dashes. From now on when specifying new columns, dataset owners will need to use a SQL expression instead to convert their string columns of the form %Y/%m/%d etc. to a `DATE`, `DATETIME`, etc. type.
- [26372](https://github.com/apache/superset/issues/26372): Removes the deprecated `GENERIC_CHART_AXES` feature flag. The previous value of the feature flag was `True` and now the feature is permanently enabled.
WEBDRIVER_BASEURL=f"http://superset_app{os.environ.get('SUPERSET_APP_ROOT','/')}/"# When using docker compose baseurl should be http://superset_nginx{ENV{BASEPATH}}/ # noqa: E501
## Core Principle: Stories Are the Single Source of Truth
When working on the Storybook-to-MDX documentation system:
**ALWAYS fix the story first. NEVER add workarounds to the generator.**
## Why This Matters
The generator (`scripts/generate-superset-components.mjs`) should be lightweight - it extracts data from stories and passes it through. When you add special cases to the generator:
- It becomes harder to maintain
- Stories diverge from their docs representation
- Future stories need to know about generator quirks
When you fix stories to match the expected patterns:
- Stories work identically in Storybook and Docs
- The generator stays simple and predictable
- Patterns are consistent and learnable
## Story Patterns for Docs Generation
### Required Structure
```tsx
// Use inline export default (NOT const meta = ...; export default meta)
exportdefault{
title:'Components/MyComponent',
component: MyComponent,
};
// Name interactive stories with Interactive prefix
exportconstInteractiveMyComponent: Story={
args:{
// Default prop values
},
argTypes:{
// Control definitions - MUST be at story level, not meta level
propName:{
control:{type:'select'},
options:['a','b','c'],
description:'What this prop does',
},
},
};
```
### For Components with Variants (size × style grids)
```tsx
constsizes=['small','medium','large'];
constvariants=['primary','secondary','danger'];
InteractiveButton.parameters={
docs:{
gallery:{
component:'Button',
sizes,
styles: variants,
sizeProp:'size',
styleProp:'variant',
},
},
};
```
### For Components Requiring Children
```tsx
InteractiveIconTooltip.parameters={
docs:{
// Component descriptors with dot notation for nested components
Each section maintains its own version history and can be versioned independently.
@@ -36,23 +37,45 @@ Each section maintains its own version history and can be versioned independentl
To create a new version for any section, use the Docusaurus version command with the appropriate plugin ID or use our automated scripts:
#### Before You Cut
The cut snapshots whatever's on disk into a frozen historical version, including auto-generated content (database pages from `superset/db_engine_specs/`, API reference from `static/resources/openapi.json`, component pages from Storybook stories). The cut script refreshes these via `generate:smart` before snapshotting, but the **`databases.json` diagnostics file** needs special care to capture full detail:
1.**Canonical release cut**: download the `database-diagnostics` artifact from a green `Python-Integration` run on master, place it at `docs/src/data/databases.json`, then run the cut script with `--skip-generate` to preserve it. This is what the production deploy uses and includes full Flask-context diagnostics (driver versions, feature support matrix, etc.).
2.**Local dev cut**: just run the script normally. `generate:smart` will regenerate `databases.json` using your local Flask environment — accurate to whatever drivers/extras you have installed, but typically less complete than the CI artifact.
3.**No Flask available**: also fine — the database generator falls back to AST parsing of engine spec files. The MDX pages are still correct; only the diagnostics JSON is leaner.
Also: confirm `master` CI is green, and that your local checkout matches the SHA you intend to cut from.
#### Using Automated Scripts (Required)
**⚠️ Important:** Always use these custom commands instead of the native Docusaurus commands. These scripts ensure that both the Docusaurus versioning system AND the `versions-config.json` file are updated correctly.
**⚠️ Important:** Always use these custom commands instead of the native Docusaurus commands. These scripts ensure that both the Docusaurus versioning system AND the `versions-config.json` file are updated correctly, AND that auto-generated content is refreshed before snapshotting.
```bash
# Main Documentation
yarn version:add:docs 1.2.0
# Developer Portal
yarn version:add:developer_portal 1.2.0
# Admin Docs
yarn version:add:admin_docs 1.2.0
# Component Playground (when enabled)
# Developer Docs
yarn version:add:developer_docs 1.2.0
# Component Playground
yarn version:add:components 1.2.0
```
What the script does:
1. Refreshes auto-generated content via `generate:smart` (database pages, API reference, component pages).
2. Calls `yarn docusaurus docs:version` (or the per-section equivalent) to snapshot the section.
3. Freezes any data-file imports (`@site/static/*.json`, `../../data/*.json`) into a snapshot-local `_versioned_data/` dir so the historical version doesn't silently mutate when the source files change.
4. Adjusts relative import paths (`../../src/...` → `../../../src/...`) for files now one directory deeper.
5. Updates `versions-config.json` and `<section>_versions.json`.
**Do NOT use** the native Docusaurus commands directly (`yarn docusaurus docs:version`), as they will:
- ❌ Create version files but NOT update `versions-config.json`
- ❌ Skip auto-gen refresh, freezing whatever was on disk
- ❌ Skip data-import freezing, leaving the snapshot pointed at live data
- ❌ Cause versions to not appear in dropdown menus
- ❌ Require manual fixes to synchronize the configuration
@@ -90,8 +113,11 @@ If creating versions manually, you'll need to:
# Main Documentation
yarn version:remove:docs 1.0.0
# Developer Portal
yarn version:remove:developer_portal 1.0.0
# Admin Docs
yarn version:remove:admin_docs 1.0.0
# Developer Docs
yarn version:remove:developer_docs 1.0.0
# Component Playground
yarn version:remove:components 1.0.0
@@ -102,17 +128,20 @@ To manually remove a version:
1. **Delete the version folder** from the appropriate location:
- Main docs: `versioned_docs/version-X.X.X/` (no prefix for main)
Users can configure automated alerts and reports to send dashboards or charts to an email recipient or Slack channel.
- *Alerts* are sent when a SQL condition is reached
- *Reports* are sent on a schedule
Alerts and reports are disabled by default. To turn them on, you'll need to change configuration settings and install a suitable headless browser in your environment.
## Requirements
### Commons
#### In your `superset_config.py` or `superset_config_docker.py`
- `"ALERT_REPORTS"` [feature flag](/admin-docs/configuration/configuring-superset#feature-flags) must be turned to True.
- `beat_schedule` in CeleryConfig must contain schedule for `reports.scheduler`.
- At least one of those must be configured, depending on what you want to use:
- emails: `SMTP_*` settings
- Slack messages: `SLACK_API_TOKEN`
- Users can customize the email subject by including date code placeholders, which will automatically be replaced with the corresponding UTC date when the email is sent. To enable this functionality, activate the `"DATE_FORMAT_IN_EMAIL_SUBJECT"` [feature flag](/admin-docs/configuration/configuring-superset#feature-flags). This enables date formatting in email subjects, preventing all reporting emails from being grouped into the same thread (optional for the reporting feature).
- Use date codes from [strftime.org](https://strftime.org/) to create the email subject.
- If no date code is provided, the original string will be used as the email subject.
##### Disable dry-run mode
Screenshots will be taken but no messages actually sent as long as `ALERT_REPORTS_NOTIFICATION_DRY_RUN = True`, its default value in `docker/pythonpath_dev/superset_config.py`. To disable dry-run mode and start receiving email/Slack notifications, set `ALERT_REPORTS_NOTIFICATION_DRY_RUN` to `False` in [superset config](https://github.com/apache/superset/blob/master/docker/pythonpath_dev/superset_config.py).
#### In your `Dockerfile`
You'll need to extend the Superset image to include a headless browser. Your options include:
- Use Playwright with Chrome: this is the recommended approach as of version 4.1.x or greater. A working example of a Dockerfile that installs these tools is provided under "Building your own production Docker image" on the [Docker Builds](/admin-docs/installation/docker-builds#building-your-own-production-docker-image) page. Read the code comments there as you'll also need to change a feature flag in your config.
- Use Firefox: you'll need to install geckodriver and Firefox.
- Use Chrome without Playwright: you'll need to install Chrome and set the value of `WEBDRIVER_TYPE` to `"chrome"` in your `superset_config.py`.
In Superset versions <=4.0x, users installed Firefox or Chrome and that was documented here.
Only the worker container needs the browser.
### Slack integration
To send alerts and reports to Slack channels, you need to create a new Slack Application on your workspace.
1. Connect to your Slack workspace, then head to [https://api.slack.com/apps].
2. Create a new app.
3. Go to "OAuth & Permissions" section, and give the following scopes to your app:
- `incoming-webhook`
- `files:write`
- `chat:write`
- `channels:read`
- `groups:read`
4. At the top of the "OAuth and Permissions" section, click "install to workspace".
5. Select a default channel for your app and continue.
(You can post to any channel by inviting your Superset app into that channel).
6. The app should now be installed in your workspace, and a "Bot User OAuth Access Token" should have been created. Copy that token in the `SLACK_API_TOKEN` variable of your `superset_config.py`.
7. Ensure the feature flag `ALERT_REPORT_SLACK_V2` is set to True in `superset_config.py`
8. Restart the service (or run `superset init`) to pull in the new configuration.
Note: when you configure an alert or a report, the Slack channel list takes channel names without the leading '#' e.g. use `alerts` instead of `#alerts`.
#### Large Slack Workspaces (10k+ channels)
For workspaces with many channels, fetching the complete channel list can take several minutes and may encounter Slack API rate limits. Add the following to your `superset_config.py`:
When a report includes file attachments (CSV, PDF, or PNG screenshots), the request is sent as `multipart/form-data` instead. In that case, each top-level payload field (`name`, `text`, `description`, `url`) becomes its own form field, and nested structures like `header` are serialized as a JSON-encoded string in their own field. Every attachment is added as a repeated form field named `files`:
Webhook consumers should branch on `Content-Type`: parse the body as JSON when `application/json`, or read the individual form fields (decoding `header` as JSON) when `multipart/form-data`.
#### HTTPS Enforcement
To require HTTPS webhook URLs (recommended for production), set:
```python
ALERT_REPORTS_WEBHOOK_HTTPS_ONLY = True
```
When enabled, Superset rejects webhook configurations that use `http://` URLs.
#### Retry Behavior
Superset automatically retries webhook deliveries on `429 Too Many Requests` and `5xx` server errors using exponential backoff.
### Kubernetes-specific
- You must have a `celery beat` pod running. If you're using the chart included in the GitHub repository under [helm/superset](https://github.com/apache/superset/tree/master/helm/superset), you need to put `supersetCeleryBeat.enabled = true` in your values override.
- You can see the dedicated docs about [Kubernetes installation](/admin-docs/installation/kubernetes) for more details.
### Docker Compose specific
#### You must have in your `docker-compose.yml`
- A Redis message broker
- PostgreSQL DB instead of SQLlite
- One or more `celery worker`
- A single `celery beat`
This process also works in a Docker swarm environment, you would just need to add `Deploy:` to the Superset, Redis and Postgres services along with your specific configs for your swarm.
### Detailed config
The following configurations need to be added to the `superset_config.py` file. This file is loaded when the image runs, and any configurations in it will override the default configurations found in the `config.py`.
You can find documentation about each field in the default `config.py` in the GitHub repository under [superset/config.py](https://github.com/apache/superset/blob/master/superset/config.py).
You need to replace default values with your custom Redis, Slack and/or SMTP config.
Superset uses Celery beat and Celery worker(s) to send alerts and reports.
- The beat is the scheduler that tells the worker when to perform its tasks. This schedule is defined when you create the alert or report.
- The worker will process the tasks that need to be performed when an alert or report is fired.
In the `CeleryConfig`, only the `beat_schedule` is relevant to this feature, the rest of the `CeleryConfig` can be changed for your needs.
Alternatively, you can assign a function to `ALERT_MINIMUM_INTERVAL` and/or `REPORT_MINIMUM_INTERVAL`. This is useful to dynamically retrieve a value as needed:
For security, Superset rewrites external links in alert/report email HTML so
they go through a warning page before the user is navigated to the external
site. Internal links (matching your configured base URL) are not affected.
```python
# Disable external link redirection entirely (default: True)
ALERT_REPORTS_ENABLE_LINK_REDIRECT = False
```
The feature uses `WEBDRIVER_BASEURL_USER_FRIENDLY` (or `WEBDRIVER_BASEURL`)
to determine which hosts are internal.
## Troubleshooting
There are many reasons that reports might not be working. Try these steps to check for specific issues.
### Confirm feature flag is enabled and you have sufficient permissions
If you don't see "Alerts & Reports" under the *Manage* section of the Settings dropdown in the Superset UI, you need to enable the `ALERT_REPORTS` feature flag (see above). Enable another feature flag and check to see that it took effect, to verify that your config file is getting loaded.
Log in as an admin user to ensure you have adequate permissions.
### Check the logs of your Celery worker
This is the best source of information about the problem. In a docker compose deployment, you can do this with a command like `docker logs superset_worker --since 1h`.
### Check web browser and webdriver installation
To take a screenshot, the worker visits the dashboard or chart using a headless browser, then takes a screenshot. If you are able to send a chart as CSV or text but can't send as PNG, your problem may lie with the browser.
If you are handling the installation of the headless browser on your own, do your own verification to ensure that the headless browser opens successfully in the worker environment.
### Send a test email
One symptom of an invalid connection to an email server is receiving an error of `[Errno 110] Connection timed out` in your logs when the report tries to send.
Confirm via testing that your outbound email configuration is correct. Here is the simplest test, for an un-authenticated email SMTP email service running on port 25. If you are sending over SSL, for instance, study how [Superset's codebase sends emails](https://github.com/apache/superset/blob/master/superset/utils/core.py#L818) and then test with those commands and arguments.
Start Python in your worker environment, replace all example values, and run:
- Some cloud hosts disable outgoing unauthenticated SMTP email to prevent spam. For instance, [Azure blocks port 25 by default on some machines](https://learn.microsoft.com/en-us/azure/virtual-network/troubleshoot-outbound-smtp-connectivity). Enable that port or use another sending method.
- Use another set of SMTP credentials that you verify works in this setup.
### Browse to your report from the worker
The worker may be unable to reach the report. It will use the value of `WEBDRIVER_BASEURL` to browse to the report. If that route is invalid, or presents an authentication challenge that the worker can't pass, the report screenshot will fail.
Check this by attempting to `curl` the URL of a report that you see in the error logs of your worker. For instance, from the worker environment, run `curl http://superset_app:8088/superset/dashboard/1/`. You may get different responses depending on whether the dashboard exists - for example, you may need to change the `1` in that URL. If there's a URL in your logs from a failed report screenshot, that's a good place to start. The goal is to determine a valid value for `WEBDRIVER_BASEURL` and determine if an issue like HTTPS or authentication is redirecting your worker.
In a deployment with authentication measures enabled like HTTPS and Single Sign-On, it may make sense to have the worker navigate directly to the Superset application running in the same location, avoiding the need to sign in. For instance, you could use `WEBDRIVER_BASEURL="http://superset_app:8088"` for a docker compose deployment, and set `"force_https": False,` in your `TALISMAN_CONFIG`.
### Duplicate report deliveries
In some deployment configurations a scheduled report can be delivered more than once around its planned time. This typically happens when more than one process is responsible for running the alerts & reports schedule (for example, multiple schedulers or Celery beat instances). To avoid duplicate emails or notifications:
- Ensure that only a **single scheduler/beat process** is configured to trigger alerts and reports for a given environment.
- If you run **multiple Celery workers**, verify that there is still only one component responsible for scheduling the report tasks (workers should execute tasks, not schedule them independently).
- Review your deployment/orchestration setup (for example systemd, Docker, or Kubernetes) to make sure the alerts & reports scheduler is **not started from multiple places by accident**.
## Scheduling Queries as Reports
You can optionally allow your users to schedule queries directly in SQL Lab. This is done by adding
extra metadata to saved queries, which are then picked up by an external scheduled (like
[Apache Airflow](https://airflow.apache.org/)).
To allow scheduled queries, add the following to `SCHEDULED_QUERIES` in your configuration file:
```python
SCHEDULED_QUERIES = {
# This information is collected when the user clicks "Schedule query",
# and saved into the `extra` field of saved queries.
[react-jsonschema-form](https://github.com/mozilla-services/react-jsonschema-form) and will add a
menu item called “Schedule” to SQL Lab. When the menu item is clicked, a modal will show up where
the user can add the metadata required for scheduling the query.
This information can then be retrieved from the endpoint `/api/v1/saved_query/` and used to
schedule the queries that have `schedule_info` in their JSON metadata. For schedulers other than
Airflow, additional fields can be easily added to the configuration file above.
:::resources
- [Tutorial: Automated Alerts and Reporting via Slack/Email in Superset](https://dev.to/ngtduc693/apache-superset-topic-5-automated-alerts-and-reporting-via-slackemail-in-superset-2gbe)
- [Blog: Integrating Slack alerts and Apache Superset for better data observability](https://medium.com/affinityanswers-tech/integrating-slack-alerts-and-apache-superset-for-better-data-observability-fd2f9a12c350)
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
*/}
---
title: AWS IAM Authentication
sidebar_label: AWS IAM Authentication
sidebar_position: 15
---
# AWS IAM Authentication for AWS Databases
Superset supports IAM-based authentication for **Amazon Aurora** (PostgreSQL and MySQL) and **Amazon Redshift**. IAM auth eliminates the need for database passwords — Superset generates a short-lived auth token using temporary AWS credentials instead.
Cross-account IAM role assumption via STS `AssumeRole` is supported, allowing a Superset deployment in one AWS account to connect to databases in a different account.
## Prerequisites
- Enable the `AWS_DATABASE_IAM_AUTH` feature flag in `superset_config.py`. IAM authentication is gated behind this flag; if it is disabled, connections using `aws_iam` fail with *"AWS IAM database authentication is not enabled."*
```python
FEATURE_FLAGS = {
"AWS_DATABASE_IAM_AUTH": True,
}
```
- `boto3` must be installed in your Superset environment:
```bash
pip install boto3
```
- The Superset server's IAM role (or static credentials) must have permission to call `sts:AssumeRole` (for cross-account) or the same-account permissions for the target service:
- **Redshift Serverless**: `redshift-serverless:GetCredentials` and `redshift-serverless:GetWorkgroup`
- SSL must be enabled on the Aurora / Redshift endpoint (required for IAM token auth).
## Configuration
IAM authentication is configured via the **encrypted_extra** field of the database connection. Access this field in the **Advanced** → **Security** section of the database connection form, under **Secure Extra**.
**3. Configure the database connection in Superset** using the `role_arn` and `external_id` from the trust policy (as shown in the configuration example above).
## Credential Caching
STS credentials are cached in memory keyed by `(role_arn, region, external_id)` with a 10-minute TTL. This reduces the number of STS API calls when multiple queries are executed with the same connection. Tokens are refreshed automatically before expiry.
Caching can be configured by providing dictionaries in
`superset_config.py` that comply with [the Flask-Caching config specifications](https://flask-caching.readthedocs.io/en/latest/#configuring-flask-caching).
The following cache configurations can be customized in this way:
- Dashboard filter state (required): `FILTER_STATE_CACHE_CONFIG`.
- Explore chart form data (required): `EXPLORE_FORM_DATA_CACHE_CONFIG`
- Metadata cache (optional): `CACHE_CONFIG`
- Charting data queried from datasets (optional): `DATA_CACHE_CONFIG`
For example, to configure the filter state cache using Redis:
```python
FILTER_STATE_CACHE_CONFIG = {
'CACHE_TYPE': 'RedisCache',
'CACHE_DEFAULT_TIMEOUT': 86400,
'CACHE_KEY_PREFIX': 'superset_filter_cache',
'CACHE_REDIS_URL': 'redis://localhost:6379/0'
}
```
## Dependencies
In order to use dedicated cache stores, additional python libraries must be installed
- For Redis: we recommend the [redis](https://pypi.python.org/pypi/redis) Python package
- Memcached: we recommend using [pylibmc](https://pypi.org/project/pylibmc/) client library as
`python-memcached` does not handle storing binary data correctly.
These libraries can be installed using pip.
## Fallback Metastore Cache
Note, that some form of Filter State and Explore caching are required. If either of these caches
are undefined, Superset falls back to using a built-in cache that stores data in the metadata
database. While it is recommended to use a dedicated cache, the built-in cache can also be used
to cache other data.
For example, to use the built-in cache to store chart data, use the following config:
```python
DATA_CACHE_CONFIG = {
"CACHE_TYPE": "SupersetMetastoreCache",
"CACHE_KEY_PREFIX": "superset_results", # make sure this string is unique to avoid collisions
The cache timeout for charts may be overridden by the settings for an individual chart, dataset, or
database. Each of these configurations will be checked in order before falling back to the default
value defined in `DATA_CACHE_CONFIG`.
Note, that by setting the cache timeout to `-1`, caching for charting data can be disabled, either
per chart, dataset or database, or by default if set in `DATA_CACHE_CONFIG`.
## SQL Lab Query Results
Caching for SQL Lab query results is used when async queries are enabled and is configured using
`RESULTS_BACKEND`.
Note that this configuration does not use a flask-caching dictionary for its configuration, but
instead requires a cachelib object.
See [Async Queries via Celery](/admin-docs/configuration/async-queries-celery) for details.
## Caching Thumbnails
This is an optional feature that can be turned on by activating its [feature flag](/admin-docs/configuration/configuring-superset#feature-flags) on config:
```
FEATURE_FLAGS = {
"THUMBNAILS": True,
"THUMBNAILS_SQLA_LISTENERS": True,
}
```
By default thumbnails are rendered per user, and will fall back to the Selenium user for anonymous users.
To always render thumbnails as a fixed user (`admin` in this example), use the following configuration:
```python
from superset.tasks.types import FixedExecutor
THUMBNAIL_EXECUTORS = [FixedExecutor("admin")]
```
For this feature you will need a cache system and celery workers. All thumbnails are stored on cache
and are processed asynchronously by the workers.
An example config where images are stored on S3 could be:
```python
from flask import Flask
from s3cache.s3cache import S3Cache
...
class CeleryConfig(object):
broker_url = "redis://localhost:6379/0"
imports = (
"superset.sql_lab",
"superset.tasks.thumbnails",
)
result_backend = "redis://localhost:6379/0"
worker_prefetch_multiplier = 10
task_acks_late = True
CELERY_CONFIG = CeleryConfig
def init_thumbnail_cache(app: Flask) -> S3Cache:
return S3Cache("bucket_name", 'thumbs_cache/')
THUMBNAIL_CACHE_CONFIG = init_thumbnail_cache
```
Using the above example cache keys for dashboards will be `superset_thumb__dashboard__{ID}`. You can
Thumbnail and screenshot endpoints return `ETag` response headers based on the cached content digest. Clients can use conditional requests to avoid downloading unchanged images:
```
GET /api/v1/chart/42/thumbnail/
If-None-Match: "abc123..."
→ 304 Not Modified (if unchanged)
→ 200 OK (with new image if changed)
```
This is particularly useful for embedded dashboards and external integrations that periodically poll for updated screenshots — unchanged thumbnails return immediately with no payload.
## Distributed Coordination Backend
Superset supports an optional distributed coordination (`DISTRIBUTED_COORDINATION_CONFIG`) for
high-performance distributed operations. This configuration enables:
- **Distributed locking**: Moves lock operations from the metadata database to Redis, improving
performance and reducing metastore load
- **Real-time event notifications**: Enables instant pub/sub messaging for task abort signals and
completion notifications instead of polling-based approaches
:::note
This requires Redis or Valkey specifically—it uses Redis-specific features (pub/sub, `SET NX EX`)
that are not available in general Flask-Caching backends.
:::
### Configuration
The distributed coordination uses Flask-Caching style configuration for consistency with other cache
backends. Configure `DISTRIBUTED_COORDINATION_CONFIG` in `superset_config.py`:
Individual lock acquisitions can override this value when needed.
### Database-Only Mode
When `DISTRIBUTED_COORDINATION_CONFIG` is not configured, Superset uses database-backed operations:
- **Locking**: Uses the KeyValue table with periodic cleanup of expired entries
- **Event notifications**: Uses database polling instead of pub/sub
While database-backed operations work reliably, the Redis backend is recommended for production
deployments where low latency and reduced database load are important.
:::resources
- [Blog: The Data Engineer's Guide to Lightning-Fast Superset Dashboards](https://preset.io/blog/the-data-engineers-guide-to-lightning-fast-apache-superset-dashboards/)
- [Blog: Accelerating Dashboards with Materialized Views](https://preset.io/blog/accelerating-apache-superset-dashboards-with-materialized-views/)
At the very least, you'll want to change `SECRET_KEY` and `SQLALCHEMY_DATABASE_URI`. Continue reading for more about each of these.
## Specifying a SECRET_KEY
### Adding an initial SECRET_KEY
Superset requires a user-specified SECRET_KEY to start up. This requirement was [added in version 2.1.0 to force secure configurations](https://preset.io/blog/superset-security-update-default-secret_key-vulnerability/). Add a strong SECRET_KEY to your `superset_config.py` file like:
Properly setting up metadata store is beyond the scope of this documentation. We recommend
using a hosted managed service such as [Amazon RDS](https://aws.amazon.com/rds/) or
[Google Cloud Databases](https://cloud.google.com/products/databases?hl=en) to handle
service and supporting infrastructure and backup strategy.
:::
To configure Superset metastore set `SQLALCHEMY_DATABASE_URI` config key on `superset_config`
to the appropriate connection string.
## Running on a WSGI HTTP Server
While you can run Superset on NGINX or Apache, we recommend using Gunicorn in async mode. This
enables impressive concurrency even and is fairly easy to install and configure. Please refer to the
documentation of your preferred technology to set up this Flask WSGI application in a way that works
well in your environment. Here’s an async setup known to work well in production:
```
-w 10 \
-k gevent \
--worker-connections 1000 \
--timeout 120 \
-b 0.0.0.0:6666 \
--limit-request-line 0 \
--limit-request-field_size 0 \
--statsd-host localhost:8125 \
"superset.app:create_app()"
```
Refer to the [Gunicorn documentation](https://docs.gunicorn.org/en/stable/design.html) for more
information. _Note that the development web server (`superset run` or `flask run`) is not intended
for production use._
If you're not using Gunicorn, you may want to disable the use of `flask-compress` by setting
`COMPRESS_REGISTER = False` in your `superset_config.py`.
Currently, the Google BigQuery Python SDK is not compatible with `gevent`, due to some dynamic monkeypatching on python core library by `gevent`.
So, when you use `BigQuery` datasource on Superset, you have to use `gunicorn` worker type except `gevent`.
## HTTPS Configuration
You can configure HTTPS upstream via a load balancer or a reverse proxy (such as nginx) and do SSL/TLS Offloading before traffic reaches the Superset application. In this setup, local traffic from a Celery worker taking a snapshot of a chart for Alerts & Reports can access Superset at a `http://` URL, from behind the ingress point.
You can also configure [SSL in Gunicorn](https://docs.gunicorn.org/en/stable/settings.html#ssl) (the Python webserver) if you are using an official Superset Docker image.
## Configuration Behind a Load Balancer
If you are running superset behind a load balancer or reverse proxy (e.g. NGINX or ELB on AWS), you
may need to utilize a healthcheck endpoint so that your load balancer knows if your superset
instance is running. This is provided at `/health` which will return a 200 response containing “OK”
if the webserver is running.
If the load balancer is inserting `X-Forwarded-For/X-Forwarded-Proto` headers, you should set
`ENABLE_PROXY_FIX = True` in the superset config file (`superset_config.py`) to extract and use the
headers.
In case the reverse proxy is used for providing SSL encryption, an explicit definition of the
`X-Forwarded-Proto` may be required. For the Apache webserver this can be set as follows:
```
RequestHeader set X-Forwarded-Proto "https"
```
## Configuring the application root
*Please be advised that this feature is in BETA.*
Superset supports running the application under a non-root path. The root path
prefix can be specified in one of three ways:
- Customizing the [Flask entrypoint](https://github.com/apache/superset/blob/master/superset/app.py#L29)
by passing the `superset_app_root` variable; or
- Setting the `SUPERSET_APP_ROOT` environment variable to the desired prefix; or
- Setting the `APPLICATION_ROOT` config in your `superset_config.py` file.
Note, the prefix should start with a `/`.
### Customizing the Flask entrypoint
To configure a prefix, e.g `/analytics`, pass the `superset_app_root` argument to
`create_app` when calling flask run either through the `FLASK_APP`
# Will allow user self registration, allowing to create Flask users from Authorized User
AUTH_USER_REGISTRATION = True
# The default user self registration role
AUTH_USER_REGISTRATION_ROLE = "Public"
```
In case you want to assign the `Admin` role on new user registration, it can be assigned as follows:
```python
AUTH_USER_REGISTRATION_ROLE = "Admin"
```
If you encounter the [issue](https://github.com/apache/superset/issues/13243) of not being able to list users from the Superset main page settings, although a newly registered user has an `Admin` role, please re-run `superset init` to sync the required permissions. Below is the command to re-run `superset init` using docker compose.
```
docker-compose exec superset superset init
```
Then, create a `CustomSsoSecurityManager` that extends `SupersetSecurityManager` and overrides
`oauth_user_info`:
```python
import logging
from superset.security import SupersetSecurityManager
class CustomSsoSecurityManager(SupersetSecurityManager):
For public OAuth2 clients that cannot securely store a client secret, enable Proof Key for Code Exchange (PKCE) by adding `code_challenge_method` to the `remote_app` configuration:
```python
OAUTH_PROVIDERS = [
{
'name': 'myProvider',
'remote_app': {
'client_id': 'myClientId',
'client_secret': 'mySecret', # may be empty for pure public clients
PKCE (`S256`) is recommended for all OAuth2 flows, even when a client secret is present, as it protects against authorization code interception attacks.
## LDAP Authentication
FAB supports authenticating user credentials against an LDAP server.
To use LDAP you must install the [python-ldap](https://www.python-ldap.org/en/latest/installing.html) package.
See [FAB's LDAP documentation](https://flask-appbuilder.readthedocs.io/en/latest/security.html#authentication-ldap)
for details.
## Mapping LDAP or OAUTH groups to Superset roles
AUTH_ROLES_MAPPING in Flask-AppBuilder is a dictionary that maps from LDAP/OAUTH group names to FAB roles.
It is used to assign roles to users who authenticate using LDAP or OAuth.
### Mapping OAUTH groups to Superset roles
The following `AUTH_ROLES_MAPPING` dictionary would map the OAUTH group "superset_users" to the Superset roles "Gamma" as well as "Alpha", and the OAUTH group "superset_admins" to the Superset role "Admin".
```python
AUTH_ROLES_MAPPING = {
"superset_users": ["Gamma","Alpha"],
"superset_admins": ["Admin"],
}
```
### Mapping LDAP groups to Superset roles
The following `AUTH_ROLES_MAPPING` dictionary would map the LDAP DN "cn=superset_users,ou=groups,dc=example,dc=com" to the Superset roles "Gamma" as well as "Alpha", and the LDAP DN "cn=superset_admins,ou=groups,dc=example,dc=com" to the Superset role "Admin".
Note: This requires `AUTH_LDAP_SEARCH` to be set. For more details, please see the [FAB Security documentation](https://flask-appbuilder.readthedocs.io/en/latest/security.html).
### Syncing roles at login
You can also use the `AUTH_ROLES_SYNC_AT_LOGIN` configuration variable to control how often Flask-AppBuilder syncs the user's roles with the LDAP/OAUTH groups. If `AUTH_ROLES_SYNC_AT_LOGIN` is set to True, Flask-AppBuilder will sync the user's roles each time they log in. If `AUTH_ROLES_SYNC_AT_LOGIN` is set to False, Flask-AppBuilder will only sync the user's roles when they first register.
## Flask app Configuration Hook
`FLASK_APP_MUTATOR` is a configuration function that can be provided in your environment, receives
the app object and can alter it in any way. For example, add `FLASK_APP_MUTATOR` into your
`superset_config.py` to setup session cookie expiration time to 24 hours:
To support a diverse set of users, Superset has some features that are not enabled by default. For
example, some users have stronger security restrictions, while some others may not. So Superset
allows users to enable or disable some features by config. For feature owners, you can add optional
functionalities in Superset, but will be only affected by a subset of users.
You can enable or disable features with flag from `superset_config.py`:
```python
FEATURE_FLAGS = {
'PRESTO_EXPAND_DATA': False,
}
```
A current list of feature flags can be found in the [Feature Flags](/admin-docs/configuration/feature-flags) documentation.
## Security Configuration
### HASH_ALGORITHM
Controls the hashing algorithm used for internal checksums and cache keys (thumbnails, cache keys, etc.). The default is `sha256`, which satisfies environments with stricter compliance requirements (e.g., FedRAMP). Set it to `md5` to retain the legacy behavior from older Superset deployments:
```python
HASH_ALGORITHM = "sha256" # default; set to "md5" for legacy behavior
```
A companion `HASH_ALGORITHM_FALLBACKS` list (default: `["md5"]`) lets UUID lookups fall back to older algorithms, which enables gradual migration without breaking existing entries. Set it to `[]` for strict mode (use only `HASH_ALGORITHM`).
:::note
This setting affects internal Superset operations only, not user passwords or authentication tokens. Changing it in an existing deployment may invalidate cached values but does not require a database migration.
:::
## SQL Lab Query History Pruning
SQL Lab query history is stored in the metadata database and is **not** pruned by default. To trim older rows, enable the `prune_query` Celery beat task by uncommenting it in `CELERY_BEAT_SCHEDULE` and choosing a retention window:
Adjust `retention_period_days` to control how long query rows are kept. Companion opt-in tasks (`prune_logs`, `prune_tasks`) exist for pruning the logs and tasks tables; see the commented-out examples in `superset/config.py`. Without enabling these tasks, the metadata database will grow unbounded over time.
:::resources
- [Blog: Feature Flags in Apache Superset](https://preset.io/blog/feature-flags-in-apache-superset-and-preset/)
The superset cli allows you to import and export datasources from and to YAML. Datasources include
databases. The data is expected to be organized in the following hierarchy:
:::info
Superset's ZIP-based import/export also covers **dashboards**, **charts**, and **saved queries**, exercised through the UI and REST API. The [Dashboard Import Overwrite Behavior](#dashboard-import-overwrite-behavior) and [UUIDs in API Responses](#uuids-in-api-responses) sections below document the behavior shared across all asset types.
:::
```text
├──databases
| ├──database_1
| | ├──table_1
| | | ├──columns
| | | | ├──column_1
| | | | ├──column_2
| | | | └──... (more columns)
| | | └──metrics
| | | ├──metric_1
| | | ├──metric_2
| | | └──... (more metrics)
| | └── ... (more tables)
| └── ... (more databases)
```
:::note
When you export a database connection, the `masked_encrypted_extra` field (used for sensitive connection parameters such as service account JSON, OAuth tokens, and other encrypted credentials) is included in the export. When importing on another instance, these values are decrypted and re-encrypted using the destination instance's `SECRET_KEY`. Ensure the receiving instance has a valid `SECRET_KEY` configured before importing.
:::
## Exporting Datasources to YAML
You can print your current datasources to stdout by running:
```bash
superset export_datasources
```
To save your datasources to a ZIP file run:
```bash
superset export_datasources -f <filename>
```
By default, default (null) values will be omitted. Use the -d flag to include them. If you want back
references to be included (e.g. a column to include the table id it belongs to) use the -b flag.
Alternatively, you can export datasources using the UI:
1. Open **Sources -> Databases** to export all tables associated to a single or multiple databases.
(**Tables** for one or more tables)
2. Select the items you would like to export.
3. Click **Actions -> Export** to YAML
4. If you want to import an item that you exported through the UI, you will need to nest it inside
its parent element, e.g. a database needs to be nested under databases a table needs to be nested
inside a database element.
In order to obtain an **exhaustive list of all fields** you can import using the YAML import run:
```bash
superset export_datasource_schema
```
As a reminder, you can use the `-b` flag to include back references.
## Importing Datasources
In order to import datasources from a ZIP file, run:
```bash
superset import_datasources -p <path / filename>
```
The optional username flag **-u** sets the user used for the datasource import. The default is 'admin'. Example:
When importing a dashboard ZIP with the **overwrite** option enabled, any existing charts that are part of the dashboard are **replaced** rather than duplicated. This applies to:
- Charts whose UUID matches a chart already present in the target instance
- The full chart configuration (query, visualization type, columns, metrics) is replaced by the imported version
If you import without the overwrite flag, existing charts with conflicting UUIDs are left unchanged and the import skips those objects. Use overwrite when you want to push a fully updated dashboard (including chart definitions) from a development or staging environment to production.
## UUIDs in API Responses
The REST API POST endpoints for **datasets**, **charts**, and **dashboards** include the auto-generated `uuid` field in the response body:
```json
{
"id": 42,
"uuid": "b8a8d5c3-1234-4abc-8def-0123456789ab",
...
}
```
UUIDs remain stable across import/export cycles and can be used for cross-environment workflows — for example, recording a UUID when creating a chart in development and using it to identify the matching chart after importing into production.
## Legacy Importing Datasources
### From older versions of Superset to current version
When using Superset version 4.x.x to import from an older version (2.x.x or 3.x.x) importing is supported as the command `legacy_import_datasources` and expects a JSON or directory of JSONs. The options are `-r` for recursive and `-u` for specifying a user. Example of legacy import without options:
```bash
superset legacy_import_datasources -p <path or filename>
```
### From older versions of Superset to older versions
When using an older Superset version (2.x.x & 3.x.x) of Superset, the command is `import_datasources`. ZIP and YAML files are supported and to switch between them the feature flag `VERSIONED_EXPORT` is used. When `VERSIONED_EXPORT` is `True`, `import_datasources` expects a ZIP file, otherwise YAML. Example:
```bash
superset import_datasources -p <path or filename>
```
When `VERSIONED_EXPORT` is `False`, if you supply a path all files ending with **yaml** or **yml** will be parsed. You can apply
additional flags (e.g. to search the supplied path recursively):
```bash
superset import_datasources -p <path> -r
```
The sync flag **-s** takes parameters in order to sync the supplied elements with your file. Be
careful this can delete the contents of your meta database. Example:
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
# MCP Server Deployment & Authentication
Superset includes a built-in [Model Context Protocol (MCP)](https://modelcontextprotocol.io/) server that lets AI assistants -- Claude, ChatGPT, and other MCP-compatible clients -- interact with your Superset instance. Through MCP, clients can list dashboards, query datasets, execute SQL, create charts, and more.
This guide covers how to run, secure, and deploy the MCP server.
:::tip Looking for user docs?
See **[Using AI with Superset](/user-docs/using-superset/using-ai-with-superset)** for a guide on what AI can do with Superset and how to connect your AI client.
Both containers share the same `superset_config.py`, so authentication settings, database connections, and feature flags stay in sync.
### Multi-Pod (Kubernetes)
For high-availability deployments, configure Redis so that replicas share session state:
```mermaid
flowchart TD
LB["Load Balancer"] --> M1["MCP Pod 1"]
LB --> M2["MCP Pod 2"]
LB --> M3["MCP Pod 3"]
M1 --> R[("Redis<br/>(session store)")]
M2 --> R
M3 --> R
M1 --> DB[("Postgres")]
M2 --> DB
M3 --> DB
```
**superset_config.py:**
```python
MCP_STORE_CONFIG = {
"enabled": True,
"CACHE_REDIS_URL": "redis://redis-host:6379/0",
"event_store_max_events": 100,
"event_store_ttl": 3600,
}
```
When `CACHE_REDIS_URL` is set, the MCP server uses a Redis-backed EventStore for session management, allowing replicas to share state. Without Redis, each pod manages its own in-memory sessions and stateful MCP interactions may fail when requests hit different replicas.
---
## Configuration Reference
All MCP settings go in `superset_config.py`. Defaults are defined in `superset/mcp_service/mcp_config.py`.
### Core
| Setting | Default | Description |
|---------|---------|-------------|
| `MCP_SERVICE_HOST` | `"localhost"` | Host the MCP server binds to |
| `MCP_SERVICE_PORT` | `5008` | Port the MCP server binds to |
| `MCP_SERVICE_URL` | `None` | Public base URL for MCP-generated links (set this when behind a reverse proxy) |
| `MCP_DEBUG` | `False` | Enable debug logging |
| `MCP_DEV_USERNAME` | -- | Superset username for development mode (no auth) |
| `MCP_RBAC_ENABLED` | `True` | Enforce Superset's role-based access control on MCP tool calls. When `True`, each tool checks that the authenticated user has the required FAB permission before executing. Disable only for testing or trusted-network deployments. |
| `MCP_USER_RESOLVER` | `None` | Custom function `(app, access_token) -> username` to extract a Superset username from a validated JWT token. When `None`, the default resolver checks `preferred_username`, `username`, `email`, and `sub` claims in that order. |
### Response Size Guard
Limits response sizes to prevent exceeding LLM context windows:
| `event_store_max_events` | `100` | Maximum events retained per session |
| `event_store_ttl` | `3600` | Event TTL in seconds |
### Tool Search
By default the MCP server exposes a lightweight tool-search interface instead of advertising every tool at once. This reduces the initial context sent to the LLM by ~70%, which lowers cost and latency. The AI client discovers tools on demand by calling `search_tools` and then invokes them via `call_tool`.
```python
MCP_TOOL_SEARCH_CONFIG = {
"enabled": True,
"strategy": "bm25", # "bm25" (natural language) or "regex"
| `max_results` | `5` | Maximum tools returned per search query |
| `always_visible` | See above | Tools that always appear in `list_tools`, regardless of search |
| `include_schemas` | `False` | When `False` (default, "summary mode"), search results omit `inputSchema` entirely and include a lightweight `parameters_hint` listing top-level parameter names. Set to `True` to include the full `inputSchema` in search results. Full schemas are always used when a tool is actually invoked via `call_tool`. |
| `compact_schemas` | `True` | Strip `$defs` / `$ref` and replace with `{"type": "object"}` in search results to reduce token cost. Only takes effect when `include_schemas=True` — ignored in summary mode. |
| `max_description_length` | `300` | Truncate tool descriptions in search results (0 = no truncation). Applies in both summary and full-schema modes. |
:::tip
Set `enabled: False` to revert to the traditional "show all tools at once" behavior, which some clients or workflows may prefer.
:::
Tool search reduces the initial token cost from ~15–20K tokens (full catalog) down to ~4–5K tokens (pinned tools + search interface) — roughly 85% savings at the start of each conversation.
### Session & CSRF
These values are flat-merged into the Flask app config used by the MCP server process:
```python
MCP_SESSION_CONFIG = {
"SESSION_COOKIE_HTTPONLY": True,
"SESSION_COOKIE_SECURE": False,
"SESSION_COOKIE_SAMESITE": "Lax",
"SESSION_COOKIE_NAME": "superset_session",
"PERMANENT_SESSION_LIFETIME": 86400,
}
MCP_CSRF_CONFIG = {
"WTF_CSRF_ENABLED": True,
"WTF_CSRF_TIME_LIMIT": None,
}
```
---
## Access Control
### RBAC Enforcement
The MCP server respects Superset's full role-based access control (RBAC). Every authenticated user can only access the data and operations their Superset roles permit — the same rules that apply in the Superset UI apply through MCP.
Each tool declares one or more required FAB permissions. The table below maps tool groups to their permission requirements:
| Tool group | Required FAB permission |
|------------|------------------------|
| `list_charts`, `get_chart_info`, `get_chart_data`, `get_chart_preview`, `generate_chart`, `update_chart` | `can_read` on `Chart` (read), `can_write` on `Chart` (mutate) |
| `list_dashboards`, `get_dashboard_info`, `generate_dashboard`, `add_chart_to_existing_dashboard` | `can_read` on `Dashboard` (read), `can_write` on `Dashboard` (mutate) |
| `list_datasets`, `get_dataset_info`, `create_virtual_dataset` | `can_read` on `Dataset` (read), `can_write` on `Dataset` (mutate) |
| `list_databases`, `get_database_info` | `can_read` on `Database` |
| `execute_sql` | `can_execute_sql_query` on `SQLLab` |
| `open_sql_lab_with_context` | `can_read` on `SQLLab` |
| `save_sql_query` | `can_write` on `SavedQuery` |
| `health_check` | None (public) |
To disable RBAC checking globally (for trusted-network deployments or testing), set:
```python
# superset_config.py
MCP_RBAC_ENABLED = False
```
:::warning
Disabling RBAC removes all permission checks from MCP tool calls. Only do this on isolated, internal deployments where all MCP users are trusted admins.
:::
### Audit Log
All MCP tool calls are recorded in Superset's action log. You can view them at **Settings → Action Log** (admin only). Each log entry records:
- The tool name (e.g., `mcp.generate_chart.db_write`)
- The authenticated user
- A timestamp
This makes MCP activity fully auditable alongside regular Superset activity. The action log uses the same event logger as the rest of Superset, so existing log ingestion pipelines (e.g., sending logs to Elasticsearch or a SIEM) capture MCP events automatically.
### Middleware Pipeline
Every MCP request passes through a middleware stack before reaching the tool function. The default stack (assembled in `build_middleware_list()` in `server.py`) is:
| Middleware | Purpose | Default |
|------------|---------|---------|
| `StructuredContentStripperMiddleware` | Strips `structuredContent` from responses for Claude.ai bridge compatibility | Enabled |
| `LoggingMiddleware` | Logs each tool call with user, parameters, and duration | Enabled |
| `GlobalErrorHandlerMiddleware` | Catches unhandled exceptions and sanitizes sensitive data before it reaches the client | Enabled |
| `ResponseSizeGuardMiddleware` | Estimates token count, warns at 80% of limit, blocks at limit | Enabled (configurable via `MCP_RESPONSE_SIZE_CONFIG`) |
| `ResponseCachingMiddleware` | Caches read-heavy tool responses (in-memory or Redis) | Disabled (enable via `MCP_CACHE_CONFIG`) |
Additional middleware classes (`RateLimitMiddleware`, `FieldPermissionsMiddleware`, `PrivateToolMiddleware`) are implemented in `superset/mcp_service/middleware.py` but are not added to the default pipeline. They are available for operators who want to layer them in via a custom startup path.
### Error Sanitization
The `GlobalErrorHandlerMiddleware` automatically redacts sensitive information from all error messages before they reach the LLM client. The following are replaced with generic messages:
- **Database connection strings** — replaced with a generic connection error message
- **API keys and tokens** — redacted from error traces
- **File system paths** — stripped to prevent information disclosure
- **IP addresses** — removed from error context
This ensures that a misconfigured database connection or an unexpected exception never leaks credentials or internal topology to the LLM or its users. All regex patterns used for redaction are bounded to prevent ReDoS attacks.
---
## Performance
### Connection Pooling
Each MCP server process maintains its own SQLAlchemy connection pool to the database. For multi-worker deployments, total open connections = **workers × pool size**.
```python
# superset_config.py
SQLALCHEMY_POOL_SIZE = 5
SQLALCHEMY_MAX_OVERFLOW = 10
SQLALCHEMY_POOL_TIMEOUT = 30
SQLALCHEMY_POOL_RECYCLE = 3600 # Recycle connections after 1 hour
```
For a 3-pod Kubernetes deployment with the defaults above, expect up to 3 × (5 + 10) = 45 connections. Size your database's `max_connections` accordingly.
### Response Caching
Enable response caching for read-heavy workloads (dashboards/datasets that don't change frequently). With the in-memory backend (default when `MCP_STORE_CONFIG` is disabled), caching is per-process. Use Redis-backed caching for consistent cache hits across multiple pods:
Mutating tools (`generate_chart`, `update_chart`, `execute_sql`, `generate_dashboard`) are always excluded from caching regardless of this setting.
---
## Troubleshooting
### Server won't start
- Verify `fastmcp` is installed: `pip install fastmcp`
- Check that `MCP_DEV_USERNAME` is set if auth is disabled -- the server requires a user identity
- Confirm the port is not already in use: `lsof -i :5008`
### 401 Unauthorized
- Verify your JWT token has not expired (`exp` claim)
- Check that `MCP_JWT_ISSUER` and `MCP_JWT_AUDIENCE` match the token's `iss` and `aud` claims exactly
- For RS256 with JWKS: confirm the JWKS URI is reachable from the MCP server
- For RS256 with static key: confirm the public key string includes the `BEGIN`/`END` markers
- For HS256: confirm the secret matches between the token issuer and `MCP_JWT_SECRET`
- Enable `MCP_JWT_DEBUG_ERRORS = True` for detailed server-side logging (errors are never leaked to the client)
### Tool not found
- Ensure the MCP server and Superset share the same `superset_config.py`
- Check server logs at startup -- tool registration errors are logged with the tool name and reason
### Client can't connect
- Verify the MCP server URL is reachable from the client machine
- For Claude Desktop: fully quit the app (not just close the window) and restart after config changes
- For remote access: ensure your firewall and reverse proxy allow traffic to the MCP port
- Confirm the URL path ends with `/mcp` (e.g., `http://localhost:5008/mcp`)
### Permission errors on tool calls
- The MCP server enforces Superset's RBAC permissions -- the authenticated user must have the required roles
- In development mode, ensure `MCP_DEV_USERNAME` maps to a user with appropriate roles (e.g., Admin)
- Check `superset/security/manager.py` for the specific permission tuples required by each tool domain (e.g., `("can_execute_sql_query", "SQLLab")`)
### Response too large
- If a tool call returns an error about exceeding token limits, the response size guard is blocking an oversized result
- Reduce `page_size` or `limit` parameters, use `select_columns` to exclude large fields, or add filters to narrow results
- To adjust the threshold, change `token_limit` in `MCP_RESPONSE_SIZE_CONFIG`
- To disable the guard entirely, set `MCP_RESPONSE_SIZE_CONFIG = {"enabled": False}`
---
## Audit Events
All MCP tool calls are logged to Superset's event logger, the same system used by the web UI (viewable at **Settings → Action Log**). Each event captures:
- **User**: the resolved Superset username from the JWT or dev config
- **Timestamp**: when the operation ran
This means MCP activity is auditable alongside normal user activity. No additional configuration is required — logging is on by default whenever the event logger is enabled in your Superset deployment.
## Tool Pagination
MCP list tools (`list_datasets`, `list_charts`, `list_dashboards`, `list_databases`) use **offset pagination** via `page` (1-based) and `page_size` parameters. Responses include `page`, `page_size`, `total_count`, `total_pages`, `has_previous`, and `has_next`. To iterate through all results:
```python
# Example: fetch all charts across pages
all_charts = []
page = 1
while True:
result = mcp.list_charts(page=page, page_size=50)
all_charts.extend(result["charts"])
if not result.get("has_next"):
break
page += 1
```
## Security Best Practices
- **Use TLS** for all production MCP endpoints -- place the server behind a reverse proxy with HTTPS
- **Enable JWT authentication** for any internet-facing deployment
- **RBAC enforcement** -- The MCP server respects Superset's role-based access control. Users can only access data their roles permit
- **Secrets management** -- Store `MCP_JWT_SECRET`, database credentials, and API keys in environment variables or a secrets manager, never in config files committed to version control
- **Scoped tokens** -- Use `MCP_REQUIRED_SCOPES` to limit what operations a token can perform
- **Network isolation** -- In Kubernetes, restrict MCP pod network policies to only allow traffic from your AI client endpoints
- Review the **[Security documentation](/developer-docs/extensions/security)** for additional extension security guidance
---
## Next Steps
- **[Using AI with Superset](/user-docs/using-superset/using-ai-with-superset)** -- What AI can do with Superset and how to get started
- **[MCP Integration](/developer-docs/extensions/mcp)** -- Build custom MCP tools and prompts via Superset extensions
- **[Security](/developer-docs/extensions/security)** -- Security best practices for extensions
- **[Deployment](/developer-docs/extensions/deployment)** -- Package and deploy Superset extensions
Note that Superset bundles [flask-talisman](https://pypi.org/project/talisman/)
Self-described as a small Flask extension that handles setting HTTP headers that can help
protect against a few common web application security issues.
## HTML Embedding of Dashboards and Charts
There are two ways to embed a dashboard: Using the [SDK](https://www.npmjs.com/package/@superset-ui/embedded-sdk) or embedding a direct link. Note that in the latter case everybody who knows the link is able to access the dashboard.
### Embedding a Public Direct Link to a Dashboard
This works by first changing the content security policy (CSP) of [flask-talisman](https://github.com/GoogleCloudPlatform/flask-talisman) to allow for certain domains to display Superset content. Then a dashboard can be made publicly accessible, i.e. **bypassing authentication**. Once made public, the dashboard's URL can be added to an iframe in another website's HTML code.
#### Changing flask-talisman CSP
Add to `superset_config.py` the entire `TALISMAN_CONFIG` section from `config.py` and include a `frame-ancestors` section:
A chart's embed code can be generated by going to a chart's edit view and then clicking at the top right on `...` > `Share` > `Embed code`
### Enabling Embedding via the SDK
Clicking on `...` next to `EDIT DASHBOARD` on the top right of the dashboard's overview page should yield a drop-down menu including the entry "Embed dashboard".
To enable this entry, add the following line to the `.env` file:
```text
SUPERSET_FEATURE_EMBEDDED_SUPERSET=true
```
### Hiding the Logout Button in Embedded Contexts
When Superset is embedded in an application that manages authentication via SSO (OAuth2, SAML, or JWT), the logout button should be hidden since session management is handled by the parent application.
To hide the logout button in embedded contexts, add to `superset_config.py`:
```python
FEATURE_FLAGS = {
"DISABLE_EMBEDDED_SUPERSET_LOGOUT": True,
}
```
This flag only hides the logout button when Superset detects it is running inside an iframe. Users accessing Superset directly (not embedded) will still see the logout button regardless of this setting.
:::note
When embedding with SSO, also set `SESSION_COOKIE_SAMESITE = 'None'` and `SESSION_COOKIE_SECURE = True`. See [Security documentation](/admin-docs/security/securing_superset) for details.
:::
## CSRF settings
Similarly, [flask-wtf](https://flask-wtf.readthedocs.io/en/0.15.x/config/) is used to manage
some CSRF configurations. If you need to exempt endpoints from CSRF (e.g. if you are
running a custom auth postback endpoint), you can add the endpoints to `WTF_CSRF_EXEMPT_LIST`:
## SSH Tunneling
1. Turn on feature flag
- Change [`SSH_TUNNELING`](https://github.com/apache/superset/blob/eb8386e3f0647df6d1bbde8b42073850796cc16f/superset/config.py#L489) to `True`
- If you want to add more security when establishing the tunnel we allow users to overwrite the `SSHTunnelManager` class [here](https://github.com/apache/superset/blob/eb8386e3f0647df6d1bbde8b42073850796cc16f/superset/config.py#L507)
- You can also set the [`SSH_TUNNEL_LOCAL_BIND_ADDRESS`](https://github.com/apache/superset/blob/eb8386e3f0647df6d1bbde8b42073850796cc16f/superset/config.py#L508) this the host address where the tunnel will be accessible on your VPC
2. Create database w/ ssh tunnel enabled
- With the feature flag enabled you should now see ssh tunnel toggle.
- Click the toggle to enable SSH tunneling and add your credentials accordingly.
- Superset allows for two different types of authentication (Basic + Private Key). These credentials should come from your service provider.
3. Verify data is flowing
- Once SSH tunneling has been enabled, go to SQL Lab and write a query to verify data is properly flowing.
## Domain Sharding
:::note
Domain Sharding is deprecated as of Superset 5.0.0, and will be removed in Superset 6.0.0. Please Enable HTTP2 to keep more open connections per domain.
:::
Chrome allows up to 6 open connections per domain at a time. When there are more than 6 slices in
dashboard, a lot of time fetch requests are queued up and wait for next available socket.
[PR 5039](https://github.com/apache/superset/pull/5039) adds domain sharding to Superset,
and this feature will be enabled by configuration only (by default Superset doesn’t allow
cross-domain request).
Add the following setting in your `superset_config.py` file:
- `SUPERSET_WEBSERVER_DOMAINS`: list of allowed hostnames for domain sharding feature.
Please create your domain shards as subdomains of your main domain for authorization to
For a user-focused guide on writing Jinja templates in SQL Lab and virtual datasets, see the [SQL Templating User Guide](/user-docs/using-superset/sql-templating). This page covers administrator configuration options.
:::
## Jinja Templates
SQL Lab and Explore supports [Jinja templating](https://jinja.palletsprojects.com/en/2.11.x/) in queries.
To enable templating, the `ENABLE_TEMPLATE_PROCESSING` [feature flag](/admin-docs/configuration/configuring-superset#feature-flags) needs to be enabled in `superset_config.py`.
:::warning[Security Warning]
While powerful, this feature executes template code on the server. Within the Superset security model, this is **intended functionality**, as users with permissions to edit charts and virtual datasets are considered **trusted users**.
If you grant these permissions to untrusted users, this feature can be exploited as a **Server-Side Template Injection (SSTI)** vulnerability. Do not enable `ENABLE_TEMPLATE_PROCESSING` unless you fully understand and accept the associated security risks.
Additionally:
- The `url_param()` macro allows URL parameters to influence the rendered SQL. Always validate or restrict `url_param()` values in your templates rather than interpolating them directly.
- `filter.get('val')` returns raw filter values without escaping. Use the safe helpers described below (`|where_in`, `| replace("'", "''")`) rather than concatenating values directly into SQL strings.
:::
:::tip
`ENABLE_TEMPLATE_PROCESSING` defaults to `False`. Only enable it if your deployment requires Jinja templates and all users with dataset/chart edit access are administrators or fully trusted internal users.
:::
When templating is enabled, python code can be embedded in virtual datasets and
in Custom SQL in the filter and metric controls in Explore. By default, the following variables are
made available in the Jinja context:
- `columns`: columns which to group by in the query
- `filter`: filters applied in the query
- `from_dttm`: start `datetime` value from the selected time range (`None` if undefined). **Note:** Only available in virtual datasets when a time range filter is applied in Explore/Chart views—not available in standalone SQL Lab queries. (deprecated beginning in version 5.0, use `get_time_filter` instead)
- `to_dttm`: end `datetime` value from the selected time range (`None` if undefined). **Note:** Only available in virtual datasets when a time range filter is applied in Explore/Chart views—not available in standalone SQL Lab queries. (deprecated beginning in version 5.0, use `get_time_filter` instead)
- `groupby`: columns which to group by in the query (deprecated)
- `metrics`: aggregate expressions in the query
- `row_limit`: row limit of the query
- `row_offset`: row offset of the query
- `table_columns`: columns available in the dataset
- `time_column`: temporal column of the query (`None` if undefined)
- `time_grain`: selected time grain (`None` if undefined)
For example, to add a time range to a virtual dataset, you can write the following:
```sql
SELECT *
FROM tbl
WHERE dttm_col > '{{ from_dttm }}' and dttm_col < '{{ to_dttm }}'
```
You can also use [Jinja's logic](https://jinja.palletsprojects.com/en/2.11.x/templates/#tests)
to make your query robust to clearing the timerange filter:
```sql
SELECT *
FROM tbl
WHERE (
{% if from_dttm is not none %}
dttm_col > '{{ from_dttm }}' AND
{% endif %}
{% if to_dttm is not none %}
dttm_col < '{{ to_dttm }}' AND
{% endif %}
1 = 1
)
```
The `1 = 1` at the end ensures a value is present for the `WHERE` clause even when
the time filter is not set. For many database engines, this could be replaced with `true`.
Note that the Jinja parameters are called within _double_ brackets in the query and with
_single_ brackets in the logic blocks.
### Understanding Context Availability
Some Jinja variables like `from_dttm`, `to_dttm`, and `filter` are **only available when a chart or dashboard provides them**. They are populated from:
- Time range filters applied in Explore/Chart views
- Dashboard native filters
- Filter components
**These variables are NOT available in standalone SQL Lab queries** because there's no filter context. If you try to use `{{ from_dttm }}` directly in SQL Lab, you'll get an "undefined parameter" error.
#### Testing Time-Filtered Queries in SQL Lab
To test queries that use time variables in SQL Lab, you have several options:
**Option 1: Use Jinja defaults (recommended)**
```sql
SELECT *
FROM tbl
WHERE dttm_col > '{{ from_dttm | default("2024-01-01", true) }}'
AND dttm_col < '{{ to_dttm | default("2024-12-31", true) }}'
```
**Option 2: Use SQL Lab Parameters**
Set parameters in the SQL Lab UI (Parameters menu):
```json
{
"from_dttm": "2024-01-01",
"to_dttm": "2024-12-31"
}
```
**Option 3: Use `{% set %}` for testing**
```sql
{% set from_dttm = "2024-01-01" %}
{% set to_dttm = "2024-12-31" %}
SELECT *
FROM tbl
WHERE dttm_col > '{{ from_dttm }}' AND dttm_col < '{{ to_dttm }}'
```
:::tip
When you save a SQL Lab query as a virtual dataset and use it in a chart with time filters,
the actual filter values will override any defaults or test values you set.
:::
To add custom functionality to the Jinja context, you need to overload the default Jinja
context in your environment by defining the `JINJA_CONTEXT_ADDONS` in your superset configuration
(`superset_config.py`). Objects referenced in this dictionary are made available for users to use
where the Jinja context is made available.
```python
JINJA_CONTEXT_ADDONS = {
'my_crazy_macro': lambda x: x*2,
}
```
Default values for jinja templates can be specified via `Parameters` menu in the SQL Lab user interface.
In the UI you can assign a set of parameters as JSON
```json
{
"my_table": "foo"
}
```
The parameters become available in your SQL (example: `SELECT * FROM {{ my_table }}` ) by using Jinja templating syntax.
SQL Lab template parameters are stored with the dataset as `TEMPLATE PARAMETERS`.
There is a special ``_filters`` parameter which can be used to test filters used in the jinja template.
```json
{
"_filters": [
{
"col": "action_type",
"op": "IN",
"val": ["sell", "buy"]
}
]
}
```
```sql
SELECT action, count(*) as times
FROM logs
WHERE action in {{ filter_values('action_type')|where_in }}
GROUP BY action
```
Note ``_filters`` is not stored with the dataset. It's only used within the SQL Lab UI.
Besides default Jinja templating, SQL lab also supports self-defined template processor by setting
the `CUSTOM_TEMPLATE_PROCESSORS` in your superset configuration. The values in this dictionary
overwrite the default Jinja template processors of the specified database engine. The example below
configures a custom presto template processor which implements its own logic of processing macro
template with regex parsing. It uses the `$` style macro instead of `{{ }}` style in Jinja
templating.
By configuring it with `CUSTOM_TEMPLATE_PROCESSORS`, a SQL template on a presto database is
processed by the custom one rather than the default one.
In this section, we'll walkthrough the pre-defined Jinja macros in Superset.
### Current Username
The `{{ current_username() }}` macro returns the `username` of the currently logged in user.
If you have caching enabled in your Superset configuration, then by default the `username` value will be used
by Superset when calculating the cache key. A cache key is a unique identifier that determines if there's a
cache hit in the future and Superset can retrieve cached data.
You can disable the inclusion of the `username` value in the calculation of the
cache key by adding the following parameter to your Jinja code:
```python
{{ current_username(add_to_cache_keys=False) }}
```
### Current User ID
The `{{ current_user_id() }}` macro returns the account ID of the currently logged in user.
If you have caching enabled in your Superset configuration, then by default the account `id` value will be used
by Superset when calculating the cache key. A cache key is a unique identifier that determines if there's a
cache hit in the future and Superset can retrieve cached data.
You can disable the inclusion of the account `id` value in the calculation of the
cache key by adding the following parameter to your Jinja code:
```python
{{ current_user_id(add_to_cache_keys=False) }}
```
### Current User Email
The `{{ current_user_email() }}` macro returns the email address of the currently logged in user.
If you have caching enabled in your Superset configuration, then by default the email address value will be used
by Superset when calculating the cache key. A cache key is a unique identifier that determines if there's a
cache hit in the future and Superset can retrieve cached data.
You can disable the inclusion of the email value in the calculation of the
cache key by adding the following parameter to your Jinja code:
```python
{{ current_user_email(add_to_cache_keys=False) }}
```
### Current User Roles
The `{{ current_user_roles() }}` macro returns an array of roles for the logged in user.
If you have caching enabled in your Superset configuration, then by default the roles value will be used
by Superset when calculating the cache key. A cache key is a unique identifier that determines if there's a
cache hit in the future and Superset can retrieve cached data.
You can disable the inclusion of the roles value in the calculation of the
cache key by adding the following parameter to your Jinja code:
```python
{{ current_user_roles(add_to_cache_keys=False) }}
```
You can json-stringify the array by adding `|tojson` to your Jinja code:
```python
{{ current_user_roles()|tojson }}
```
You can use the `|where_in` filter to use your roles in a SQL statement. For example, if `current_user_roles()` returns `['admin', 'viewer']`, the following template:
```python
SELECT * FROM users WHERE role IN {{ current_user_roles()|where_in }}
```
Will be rendered as:
```sql
SELECT * FROM users WHERE role IN ('admin', 'viewer')
```
### Current User RLS Rules
The `{{ current_user_rls_rules() }}` macro returns an array of RLS rules applied to the current dataset for the logged in user.
If you have caching enabled in your Superset configuration, then the list of RLS Rules will be used
by Superset when calculating the cache key. A cache key is a unique identifier that determines if there's a
cache hit in the future and Superset can retrieve cached data.
### Custom URL Parameters
The `{{ url_param('custom_variable') }}` macro lets you define arbitrary URL
parameters and reference them in your SQL code.
:::warning
Always treat `url_param()` values as untrusted input. Escaping behaviour varies by context and configuration, so do not rely on it. Restrict values to an explicit allowlist before using them in SQL:
```sql
{% set cc = url_param('countrycode') %}
{% if cc not in ('US', 'ES', 'FR') %}{% set cc = 'US' %}{% endif %}
WHERE country_code = '{{ cc }}'
```
:::
Here's a concrete example:
- You write the following query in SQL Lab:
```sql
SELECT count(*)
FROM ORDERS
WHERE country_code = '{{ url_param('countrycode') }}'
```
- You're hosting Superset at the domain www.example.com and you send your
coworker in Spain the following SQL Lab URL `www.example.com/superset/sqllab?countrycode=ES`
and your coworker in the USA the following SQL Lab URL `www.example.com/superset/sqllab?countrycode=US`
- For your coworker in Spain, the SQL Lab query will be rendered as:
```sql
SELECT count(*)
FROM ORDERS
WHERE country_code = 'ES'
```
- For your coworker in the USA, the SQL Lab query will be rendered as:
```sql
SELECT count(*)
FROM ORDERS
WHERE country_code = 'US'
```
### Explicitly Including Values in Cache Key
The `{{ cache_key_wrapper() }}` function explicitly instructs Superset to add a value to the
accumulated list of values used in the calculation of the cache key.
This function is only needed when you want to wrap your own custom function return values
Note that this function powers the caching of the `user_id` and `username` values
in the `current_user_id()` and `current_username()` function calls (if you have caching enabled).
### Filter Values
You can retrieve the value for a specific filter as a list using `{{ filter_values() }}`.
This is useful if:
- You want to use a filter component to filter a query where the name of filter component column doesn't match the one in the select statement
- You want to have the ability to filter inside the main query for performance purposes
Here's a concrete example:
```sql
SELECT action, count(*) as times
FROM logs
WHERE
action in {{ filter_values('action_type')|where_in }}
GROUP BY action
```
There `where_in` filter converts the list of values from `filter_values('action_type')` into a string suitable for an `IN` expression.
### Filters for a Specific Column
The `{{ get_filters() }}` macro returns the filters applied to a given column. In addition to
returning the values (similar to how `filter_values()` does), the `get_filters()` macro
returns the operator specified in the Explore UI.
This is useful if:
- You want to handle more than the IN operator in your SQL clause
- You want to handle generating custom SQL conditions for a filter
- You want to have the ability to filter inside the main query for speed purposes
:::warning
`filter.get('val')` returns the raw filter value without escaping. For multi-value filters, use the `|where_in` Jinja filter, which handles quoting safely. For single-value operators like `LIKE`, escape single quotes before interpolating:
```sql
{%- if filter.get('op') == 'LIKE' -%}
AND full_name LIKE '{{ filter.get('val') | replace("'", "''") }}'
{%- endif -%}
```
:::
Here's a concrete example:
```sql
WITH RECURSIVE
superiors(employee_id, manager_id, full_name, level, lineage) AS (
SELECT
employee_id,
manager_id,
full_name,
1 as level,
employee_id as lineage
FROM
employees
WHERE
1=1
{# Render a blank line #}
{%- for filter in get_filters('full_name', remove_filter=True) -%}
{%- if filter.get('op') == 'IN' -%}
AND
full_name IN {{ filter.get('val')|where_in }}
{%- endif -%}
{%- if filter.get('op') == 'LIKE' -%}
AND
full_name LIKE '{{ filter.get('val') | replace("'", "''") }}'
Assuming we are creating a table chart with a simple `COUNT(*)` as the metric with a time filter `Last week` on the
`dttm` column, this would render the following query on Postgres (note the formatting of the temporal filters, and
the absence of time filters on the outer query):
```
SELECT COUNT(*) AS count
FROM
(SELECT *,
'Last week' AS time_range
FROM public.logs
WHERE 1 = 1
AND dttm >= TO_TIMESTAMP('2024-08-27 00:00:00.000000', 'YYYY-MM-DD HH24:MI:SS.US')
AND dttm < TO_TIMESTAMP('2024-09-03 00:00:00.000000', 'YYYY-MM-DD HH24:MI:SS.US')) AS virtual_table
ORDER BY count DESC
LIMIT 1000;
```
When using the `default` parameter, the templated query can be simplified, as the endpoints will always be defined
(to use a fixed time range, you can also use something like `default="2024-08-27 : 2024-09-03"`)
```
{% set time_filter = get_time_filter("dttm", default="Last week", remove_filter=True) %}
SELECT
*,
'{{ time_filter.time_range }}' as time_range
FROM logs
WHERE
dttm >= {{ time_filter.from_expr }}
AND dttm < {{ time_filter.to_expr }}
```
### Datasets
It's possible to query physical and virtual datasets using the `dataset` macro. This is useful if you've defined computed columns and metrics on your datasets, and want to reuse the definition in adhoc SQL Lab queries.
To use the macro, first you need to find the ID of the dataset. This can be done by going to the view showing all the datasets, hovering over the dataset you're interested in, and looking at its URL. For example, if the URL for a dataset is https://superset.example.org/explore/?dataset_type=table&dataset_id=42 its ID is 42.
Once you have the ID you can query it as if it were a table:
```sql
SELECT * FROM {{ dataset(42) }} LIMIT 10
```
If you want to select the metric definitions as well, in addition to the columns, you need to pass an additional keyword argument:
```sql
SELECT * FROM {{ dataset(42, include_metrics=True) }} LIMIT 10
```
Since metrics are aggregations, the resulting SQL expression will be grouped by all non-metric columns. You can specify a subset of columns to group by instead:
The `{{ metric('metric_key', dataset_id) }}` macro can be used to retrieve the metric SQL syntax from a dataset. This can be useful for different purposes:
- Override the metric label in the chart level
- Combine multiple metrics in a calculation
- Retrieve a metric syntax in SQL lab
- Re-use metrics across datasets
This macro avoids copy/paste, allowing users to centralize the metric definition in the dataset layer.
The `dataset_id` parameter is optional, and if not provided Superset will use the current dataset from context (for example, when using this macro in the Chart Builder, by default the `macro_key` will be searched in the dataset powering the chart).
The parameter can be used in SQL Lab, or when fetching a metric from another dataset.
## Available Filters
Superset supports [builtin filters from the Jinja2 templating package](https://jinja.palletsprojects.com/en/stable/templates/#builtin-filters). Custom filters have also been implemented:
### Where In
Parses a list into a SQL-compatible statement. This is useful with macros that return an array (for example the `filter_values` macro):
```
Dashboard filter with "First", "Second" and "Third" options selected
By default, this filter returns `()` (as a string) in case the value is null. The `default_to_none` parameter can be se to `True` to return null in this case:
Superset now rides on **Ant Design v5's token-based theming**.
Every Antd token works, plus a handful of Superset-specific ones for charts and dashboard chrome.
## Managing Themes via UI
Superset includes a built-in **Theme Management** interface accessible from the admin menu under **Settings > Themes**.
### Creating a New Theme
1. Navigate to **Settings > Themes** in the Superset interface
2. Click **+ Theme** to create a new theme
3. Use the [Ant Design Theme Editor](https://ant.design/theme-editor) to design your theme:
- Design your palette, typography, and component overrides
- Open the `CONFIG` modal and copy the JSON configuration
4. Paste the JSON into the theme definition field in Superset
5. Give your theme a descriptive name and save
You can also extend with Superset-specific tokens (documented in the default theme object) before you import.
### System Theme Administration
When `ENABLE_UI_THEME_ADMINISTRATION = True` is configured, administrators can manage system-wide themes directly from the UI:
#### Setting System Themes
- **System Default Theme**: Click the sun icon on any theme to set it as the system-wide default
- **System Dark Theme**: Click the moon icon on any theme to set it as the system dark mode theme
- **Automatic OS Detection**: When both default and dark themes are set, Superset automatically detects and applies the appropriate theme based on OS preferences
#### Managing System Themes
- System themes are indicated with special badges in the theme list
- Only administrators with write permissions can modify system theme settings
- Removing a system theme designation reverts to configuration file defaults
### Applying Themes to Dashboards
Once created, themes can be applied to individual dashboards:
- Edit any dashboard and select your custom theme from the theme dropdown
- Each dashboard can have its own theme, allowing for branded or context-specific styling
## Configuration Options
### Python Configuration
Configure theme behavior via `superset_config.py`:
```python
# Enable UI-based theme administration for admins
ENABLE_UI_THEME_ADMINISTRATION = True
# Optional: Set initial default themes via configuration
# These can be overridden via the UI when ENABLE_UI_THEME_ADMINISTRATION = True
THEME_DEFAULT = {
"token": {
"colorPrimary": "#2893B3",
"colorSuccess": "#5ac189",
# ... your theme JSON configuration
}
}
# Optional: Dark theme configuration
THEME_DARK = {
"algorithm": "dark",
"token": {
"colorPrimary": "#2893B3",
# ... your dark theme overrides
}
}
# To force a single theme on all users, set THEME_DARK = None
# When both themes are defined (via UI or config):
# - Users can manually switch between themes
# - OS preference detection is automatically enabled
```
### App Branding
The application name shown in the browser title bar and navigation can be
set via the `brandAppName` theme token:
```python
THEME_DEFAULT = {
"token": {
"brandAppName": "Acme Analytics",
# ... other tokens
}
}
```
Or in the theme CRUD UI JSON editor:
```json
{
"token": {
"brandAppName": "Acme Analytics"
}
}
```
The existing `APP_NAME` Python config key continues to work for backward compatibility.
`brandAppName` takes precedence when both are set, and allows different themes to carry different brand names.
Email and alert/report notification subjects are driven by backend settings such as
`EMAIL_REPORTS_SUBJECT_PREFIX` and `APP_NAME`, not by this theme token.
### Migration from Configuration to UI
When `ENABLE_UI_THEME_ADMINISTRATION = True`:
1. System themes set via the UI take precedence over configuration file settings
2. The UI shows which themes are currently set as system defaults
3. Administrators can change system themes without restarting Superset
4. Configuration file themes serve as fallbacks when no UI themes are set
### Theme Validation and Fallback
Superset validates theme JSON when it is saved, either through the UI or via configuration. If a theme contains invalid tokens or an unrecognized structure, Superset logs a warning and falls back to the built-in default theme rather than applying a broken configuration. This prevents a bad theme from rendering the application unusable.
The fallback order is:
1. **UI-configured system theme** (highest priority, if `ENABLE_UI_THEME_ADMINISTRATION = True`)
2. **`THEME_DEFAULT` / `THEME_DARK`** from `superset_config.py`
3. **Built-in Superset default theme** (always present as a safety net)
If you see unexpected styling after a config change, check the Superset server logs for theme validation warnings.
### Copying Themes Between Systems
To export a theme for use in configuration files or another instance:
1. Navigate to **Settings > Themes** and click the export icon on your desired theme
2. Extract the JSON configuration from the exported YAML file
3. Use this JSON in your `superset_config.py` or import it into another Superset instance
## Theme Development Workflow
1. **Design**: Use the [Ant Design Theme Editor](https://ant.design/theme-editor) to iterate on your design
2. **Test**: Create themes in Superset's CRUD interface for testing
3. **Apply**: Assign themes to specific dashboards or configure instance-wide
4. **Iterate**: Modify theme JSON directly in the CRUD interface or re-import from the theme editor
## Custom Fonts
Superset supports custom fonts through the theme configuration, allowing you to use branded or custom typefaces without rebuilding the application.
### Default Fonts
By default, Superset uses **Inter** for UI text and **IBM Plex Mono** for code (SQL editors, JSON fields, and other monospace contexts). Both fonts are bundled with the application via `@fontsource` packages and work offline without any external network calls.
:::note
IBM Plex Mono replaced Fira Code as the default code font in Superset 6.1. If you have an existing theme that explicitly sets `fontFamilyCode: "Fira Code, ..."`, you may want to update it.
:::
### Configuring Custom Fonts
To use custom fonts, add font URLs to your theme configuration using the `fontUrls` token:
```python
THEME_DEFAULT = {
"token": {
# Load fonts from external sources (e.g., Google Fonts, Adobe Fonts)
Font URLs are validated against a configurable allowlist. By default, fonts from `fonts.googleapis.com`, `fonts.gstatic.com`, and `use.typekit.net` are allowed. Configure `THEME_FONT_URL_ALLOWED_DOMAINS` to customize the allowed domains.
:::
### Font Sources
- **Google Fonts**: Free, CDN-hosted fonts with wide variety
- **Adobe Fonts**: Premium fonts (requires subscription and kit ID)
- **Self-hosted**: Place font files in `/static/assets/fonts/` and reference via CSS
This feature works with the stock Docker image - no custom build required!
## ECharts Configuration Overrides
:::note
Available since Superset 6.0
:::
Superset provides fine-grained control over ECharts visualizations through theme-level configuration overrides. This allows you to customize the appearance and behavior of all ECharts-based charts without modifying individual chart configurations.
### Global ECharts Overrides
Apply settings to all ECharts visualizations using `echartsOptionsOverrides`:
```python
THEME_DEFAULT = {
"token": {
"colorPrimary": "#2893B3",
# ... other Ant Design tokens
},
"echartsOptionsOverrides": {
"grid": {
"left": "10%",
"right": "10%",
"top": "15%",
"bottom": "15%"
},
"tooltip": {
"backgroundColor": "rgba(0, 0, 0, 0.8)",
"borderColor": "#ccc",
"textStyle": {
"color": "#fff"
}
},
"legend": {
"textStyle": {
"fontSize": 14,
"fontWeight": "bold"
}
}
}
}
```
### Chart-Specific Overrides
Target specific chart types using `echartsOptionsOverridesByChartType`:
```python
THEME_DEFAULT = {
"token": {
"colorPrimary": "#2893B3",
# ... other tokens
},
"echartsOptionsOverridesByChartType": {
"echarts_pie": {
"legend": {
"orient": "vertical",
"right": 10,
"top": "center"
}
},
"echarts_timeseries": {
"xAxis": {
"axisLabel": {
"rotate": 45,
"fontSize": 12
}
},
"dataZoom": [{
"type": "slider",
"show": True,
"start": 0,
"end": 100
}]
},
"echarts_bubble": {
"grid": {
"left": "15%",
"bottom": "20%"
}
}
}
}
```
### UI Configuration
You can also configure ECharts overrides through the theme CRUD interface:
```json
{
"token": {
"colorPrimary": "#2893B3"
},
"echartsOptionsOverrides": {
"grid": {
"left": "10%",
"right": "10%"
},
"tooltip": {
"backgroundColor": "rgba(0, 0, 0, 0.8)"
}
},
"echartsOptionsOverridesByChartType": {
"echarts_pie": {
"legend": {
"orient": "vertical",
"right": 10
}
}
}
}
```
### Override Precedence
The system applies overrides in the following order (last wins):
This ensures chart-specific overrides take precedence over global ones.
### Common Chart Types
Available chart types for `echartsOptionsOverridesByChartType`:
- `echarts_timeseries` - Time series/line charts
- `echarts_pie` - Pie and donut charts
- `echarts_bubble` - Bubble/scatter charts
- `echarts_funnel` - Funnel charts
- `echarts_gauge` - Gauge charts
- `echarts_radar` - Radar charts
- `echarts_boxplot` - Box plot charts
- `echarts_treemap` - Treemap charts
- `echarts_sunburst` - Sunburst charts
- `echarts_graph` - Network/graph charts
- `echarts_sankey` - Sankey diagrams
- `echarts_heatmap` - Heatmaps
- `echarts_mixed_timeseries` - Mixed time series
### Array Property Overrides
Array properties (such as color palettes) are fully supported in overrides. Arrays are **replaced entirely** rather than merged, so specify the complete array:
```python
THEME_DEFAULT = {
"token": { ... },
"echartsOptionsOverrides": {
# Replace the default color palette for all ECharts visualizations
# Complete corporate theme with ECharts customization
THEME_DEFAULT = {
"token": {
"colorPrimary": "#1B4D3E",
"fontFamily": "Corporate Sans, Arial, sans-serif"
},
"echartsOptionsOverrides": {
"grid": {
"left": "8%",
"right": "8%",
"top": "12%",
"bottom": "12%"
},
"textStyle": {
"fontFamily": "Corporate Sans, Arial, sans-serif"
},
"title": {
"textStyle": {
"color": "#1B4D3E",
"fontSize": 18,
"fontWeight": "bold"
}
}
},
"echartsOptionsOverridesByChartType": {
"echarts_timeseries": {
"xAxis": {
"axisLabel": {
"color": "#666",
"fontSize": 11
}
}
},
"echarts_pie": {
"legend": {
"textStyle": {
"fontSize": 12
},
"itemGap": 20
}
}
}
}
```
This feature provides powerful theming capabilities while maintaining the flexibility of ECharts' extensive configuration options.
## Advanced Features
- **System Themes**: Manage system-wide default and dark themes via UI or configuration
- **Per-Dashboard Theming**: Each dashboard can have its own visual identity
- **JSON Editor**: Edit theme configurations directly within Superset's interface
- **Custom Fonts**: Load external fonts via configuration without rebuilding
- **OS Dark Mode Detection**: Automatically switches themes based on system preferences
- **Theme Import/Export**: Share themes between instances via YAML files
## API Access
For programmatic theme management, Superset provides REST endpoints:
- `GET /api/v1/theme/` - List all themes
- `POST /api/v1/theme/` - Create a new theme
- `PUT /api/v1/theme/{id}` - Update a theme
- `DELETE /api/v1/theme/{id}` - Delete a theme
- `PUT /api/v1/theme/{id}/set_system_default` - Set as system default theme (admin only)
- `PUT /api/v1/theme/{id}/set_system_dark` - Set as system dark theme (admin only)
- `DELETE /api/v1/theme/unset_system_default` - Remove system default designation
- `DELETE /api/v1/theme/unset_system_dark` - Remove system dark designation
- `GET /api/v1/theme/export/` - Export themes as YAML
- `POST /api/v1/theme/import/` - Import themes from YAML
These endpoints require appropriate permissions and are subject to RBAC controls.
:::resources
- [Video: Live Demo — Theming Apache Superset](https://www.youtube.com/watch?v=XsZAsO9tC3o)
- [CSS and Theming](https://docs.preset.io/docs/css-and-theming) - Additional theming techniques and CSS customization
- [Blog: Customizing Apache Superset Dashboards with CSS](https://preset.io/blog/customizing-superset-dashboards-with-css/)
- [Blog: Customizing Dashboards with CSS — Tips and Tricks](https://preset.io/blog/customizing-apache-superset-dashboards-with-css-additional-tips-and-tricks/)
There are four distinct timezone components which relate to Apache Superset,
1. The timezone that the underlying data is encoded in.
2. The timezone of the database engine.
3. The timezone of the Apache Superset backend.
4. The timezone of the Apache Superset client.
where if a temporal field (`DATETIME`, `TIME`, `TIMESTAMP`, etc.) does not explicitly define a timezone it defaults to the underlying timezone of the component.
To help make the problem somewhat tractable—given that Apache Superset has no control on either how the data is ingested (1) or the timezone of the client (4)—from a consistency standpoint it is highly recommended that both (2) and (3) are configured to use the same timezone with a strong preference given to [UTC](https://en.wikipedia.org/wiki/Coordinated_Universal_Time) to ensure temporal fields without an explicit timestamp are not incorrectly coerced into the wrong timezone. Actually Apache Superset currently has implicit assumptions that timestamps are in UTC and thus configuring (3) to a non-UTC timezone could be problematic.
To strive for data consistency (regardless of the timezone of the client) the Apache Superset backend tries to ensure that any timestamp sent to the client has an explicit (or semi-explicit as in the case with [Epoch time](https://en.wikipedia.org/wiki/Unix_time) which is always in reference to UTC) timezone encoded within.
The challenge however lies with the slew of [database engines](/user-docs/databases#installing-drivers-in-docker) which Apache Superset supports and various inconsistencies between their [Python Database API (DB-API)](https://www.python.org/dev/peps/pep-0249/) implementations combined with the fact that we use [Pandas](https://pandas.pydata.org/) to read SQL into a DataFrame prior to serializing to JSON. Regrettably Pandas ignores the DB-API [type_code](https://www.python.org/dev/peps/pep-0249/#type-objects) relying by default on the underlying Python type returned by the DB-API. Currently only a subset of the supported database engines work correctly with Pandas, i.e., ensuring timestamps without an explicit timestamp are serialized to JSON with the server timezone, thus guaranteeing the client will display timestamps in a consistent manner irrespective of the client's timezone.
For example the following is a comparison of MySQL and Presto,
```python
import pandas as pd
from sqlalchemy import create_engine
pd.read_sql_query(
sql="SELECT TIMESTAMP('2022-01-01 00:00:00') AS ts",
con=create_engine("mysql://root@localhost:3360"),
).to_json()
pd.read_sql_query(
sql="SELECT TIMESTAMP '2022-01-01 00:00:00' AS ts",
con=create_engine("presto://localhost:8080"),
).to_json()
```
which outputs `{"ts":{"0":1640995200000}}` (which infers the UTC timezone per the Epoch time definition) and `{"ts":{"0":"2022-01-01 00:00:00.000"}}` (without an explicit timezone) respectively and thus are treated differently in JavaScript:
```js
new Date(1640995200000)
> Sat Jan 01 2022 13:00:00 GMT+1300 (New Zealand Daylight Time)
new Date("2022-01-01 00:00:00.000")
> Sat Jan 01 2022 00:00:00 GMT+1300 (New Zealand Daylight Time)
If you install with Kubernetes or Docker Compose, all of these components will be created.
However, installing from PyPI only creates the application itself. Users installing from PyPI will need to configure a caching layer, worker, and beat on their own if they wish to enable the above features. Configuration of those components for a PyPI install is not currently covered in this documentation.
Here are further details on each component.
### The Superset Application
This is the core application. Superset operates like this:
- A user visits a chart or dashboard
- That triggers a SQL query to the data warehouse holding the underlying dataset
- The resulting data is served up in a data visualization
- The Superset application is comprised of the Python (Flask) backend application (server), API layer, and the React frontend, built via Webpack, and static assets needed for the application to work
### Metadata Database
This is where chart and dashboard definitions, user information, logs, etc. are stored. Superset is tested to work with PostgreSQL and MySQL databases as the metadata database (not be confused with a data source like your data warehouse, which could be a much greater variety of options like Snowflake, Redshift, etc.).
Some installation methods like our Quickstart and PyPI come configured by default to use a SQLite on-disk database. And in a Docker Compose installation, the data would be stored in a PostgreSQL container volume. Neither of these cases are recommended for production instances of Superset.
For production, a properly-configured, managed, standalone database is recommended. No matter what database you use, you should plan to back it up regularly.
### Caching Layer
The caching layer serves two main functions:
- Store the results of queries to your data warehouse so that when a chart is loaded twice, it pulls from the cache the second time, speeding up the application and reducing load on your data warehouse.
- Act as a message broker for the worker, enabling the Alerts & Reports, async queries, and thumbnail caching features.
Most people use Redis for their cache, but Superset supports other options too. See the [cache docs](/admin-docs/configuration/cache/) for more.
### Worker and Beat
This is one or more workers who execute tasks like run async queries or take snapshots of reports and send emails, and a "beat" that acts as the scheduler and tells workers when to perform their tasks. Most installations use Celery for these components.
## Other components
Other components can be incorporated into Superset. The best place to learn about additional configurations is the [Configuration page](/admin-docs/configuration/configuring-superset). For instance, you could set up a load balancer or reverse proxy to implement HTTPS in front of your Superset application, or specify a Mapbox URL to enable geospatial charts, etc.
Superset won't even start without certain configuration settings established, so it's essential to review that page.
How should you install Superset? Here's a comparison of the different options. It will help if you've first read the [Architecture](/admin-docs/installation/architecture) page to understand Superset's different components.
The fundamental trade-off is between you needing to do more of the detail work yourself vs. using a more complex deployment route that handles those details.
**Summary:** This takes advantage of containerization while remaining simpler than Kubernetes. This is the best way to try out Superset; it's also useful for developing & contributing back to Superset.
If you're not just demoing the software, you'll need a moderate understanding of Docker to customize your deployment and avoid a few risks. Even when fully-optimized this is not as robust a method as Kubernetes when it comes to large-scale production deployments.
You manage a superset-config.py file and a docker-compose.yml file. Docker Compose brings up all the needed services - the Superset application, a Postgres metadata DB, Redis cache, Celery worker and beat. They are automatically connected to each other.
**Responsibilities**
You will need to back up your metadata DB. That could mean backing up the service running as a Docker container and its volume; ideally you are running Postgres as a service outside of that container and backing up that service.
You will also need to extend the Superset docker image. The default `lean` images do not contain drivers needed to access your metadata database (Postgres or MySQL), nor to access your data warehouse, nor the headless browser needed for Alerts & Reports. You could run a `-dev` image while demoing Superset, which has some of this, but you'll still need to install the driver for your data warehouse. The `-dev` images run as root, which is not recommended for production.
Ideally you will build your own image of Superset that extends `lean`, adding what your deployment needs. See [Building your own production Docker image](/admin-docs/installation/docker-builds/#building-your-own-production-docker-image).
**Summary:** This is the best-practice way to deploy a production instance of Superset, but has the steepest skill requirement - someone who knows Kubernetes.
You will deploy Superset into a K8s cluster. The most common method is using the community-maintained Helm chart, though work is now underway to implement [SIP-149 - a Kubernetes Operator for Superset](https://github.com/apache/superset/issues/31408).
A K8s deployment can scale up and down based on usage and deploy rolling updates with zero downtime - features that big deployments appreciate.
**Responsibilities**
You will need to build your own Docker image, and back up your metadata DB, both as described in Docker Compose above. You'll also need to customize your Helm chart values and deploy and maintain your Kubernetes cluster.
## [PyPI (Python)](/admin-docs/installation/pypi)
**Summary:** This is the only method that requires no knowledge of containers. It requires the most hands-on work to deploy, connect, and maintain each component.
You install Superset as a Python package and run it that way, providing your own metadata database. Superset has documentation on how to install this way, but it is updated infrequently.
If you want caching, you'll set up Redis or RabbitMQ. If you want Alerts & Reports, you'll set up Celery.
**Responsibilities**
You will need to get the component services running and communicating with each other. You'll need to arrange backups of your metadata database.
When upgrading, you'll need to manage the system environment and packages and ensure all components have functional dependencies.
superset/superset 0.1.1 1.0 Apache Superset is a modern, enterprise-ready b...
```
3. Configure your setting overrides
Just like any typical Helm chart, you'll need to craft a `values.yaml` file that would define/override any of the values exposed into the default [values.yaml](https://github.com/apache/superset/tree/master/helm/superset/values.yaml), or from any of the dependent charts it depends on:
The exact list will depend on some of your specific configuration overrides but you should generally expect:
- N `superset-xxxx-yyyy` and `superset-worker-xxxx-yyyy` pods (depending on your `supersetNode.replicaCount` and `supersetWorker.replicaCount` values)
- 1 `superset-postgresql-0` depending on your postgres settings
- 1 `superset-redis-master-0` depending on your redis settings
- 1 `superset-celerybeat-xxxx-yyyy` pod if you have `supersetCeleryBeat.enabled = true` in your values overrides
1. Access it
The chart will publish appropriate services to expose the Superset UI internally within your k8s cluster. To access it externally you will have to either:
- Configure the Service as a `LoadBalancer` or `NodePort`
- Set up an `Ingress` for it - the chart includes a definition, but will need to be tuned to your needs (hostname, tls, annotations etc...)
- Run `kubectl port-forward superset-xxxx-yyyy :8088` to directly tunnel one pod's port into your localhost
Depending how you configured external access, the URL will vary. Once you've identified the appropriate URL you can log in with:
- user: `admin`
- password: `admin`
## Important settings
### Security settings
Default security settings and passwords are included but you **MUST** update them to run `prod` instances, in particular:
```yaml
postgresql:
postgresqlPassword: superset
```
Make sure, you set a unique strong complex alphanumeric string for your SECRET_KEY and use a tool to help you generate
a sufficiently random sequence.
- To generate a good key you can run, `openssl rand -base64 42`
Superset uses [Scarf Gateway](https://about.scarf.sh/scarf-gateway) to collect telemetry data. Knowing the installation counts for different Superset versions informs the project's decisions about patching and long-term support. Scarf purges personally identifiable information (PII) and provides only aggregated statistics.
To opt-out of this data collection in your Helm-based installation, edit the `repository:` line in your `helm/superset/values.yaml` file, replacing `apachesuperset.docker.scarf.sh/apache/superset` with `apache/superset` to pull the image directly from Docker Hub.
:::
### Dependencies
Install additional packages and do any other bootstrap configuration in the bootstrap script.
For production clusters it's recommended to build own image with this step done in CI.
:::note
Superset requires a Python DB-API database driver and a SQLAlchemy
dialect to be installed for each datastore you want to connect to.
See [Install Database Drivers](/user-docs/databases#installing-database-drivers) for more information.
It is recommended that you refer to versions listed in
instead of hard-coding them in your bootstrap script, as seen below.
:::
The following example installs the drivers for BigQuery and Elasticsearch, allowing you to connect to these data sources within your Superset setup:
```yaml
bootstrapScript: |
#!/bin/bash
uv pip install .[postgres] \
.[bigquery] \
.[elasticsearch] &&\
if [ ! -f ~/bootstrap ]; then echo "Running Superset with uid {{ .Values.runAsUser }}" > ~/bootstrap; fi
```
### superset_config.py
The default `superset_config.py` is fairly minimal and you will very likely need to extend it. This is done by specifying one or more key/value entries in `configOverrides`, e.g.:
```yaml
configOverrides:
my_override: |
# This will make sure the redirect_uri is properly computed, even with SSL offloading
ENABLE_PROXY_FIX = True
FEATURE_FLAGS = {
"DYNAMIC_PLUGINS": True
}
```
Those will be evaluated as Helm templates and therefore will be able to reference other `values.yaml` variables e.g. `{{ .Values.ingress.hosts[0] }}` will resolve to your ingress external domain.
The entire `superset_config.py` will be installed as a secret, so it is safe to pass sensitive parameters directly... however it might be more readable to use secret env variables for that.
Full python files can be provided by running `helm upgrade --install --values my-values.yaml --set-file configOverrides.oauth=set_oauth.py`
### Environment Variables
Those can be passed as key/values either with `extraEnv` or `extraSecretEnv` if they're sensitive. They can then be referenced from `superset_config.py` using e.g. `os.environ.get("VAR")`.
# Will allow user self registration, allowing to create Flask users from Authorized User
AUTH_USER_REGISTRATION = True
# The default user self registration role
AUTH_USER_REGISTRATION_ROLE = "Admin"
```
### Enable Alerts and Reports
For this, as per the [Alerts and Reports doc](/admin-docs/configuration/alerts-reports), you will need to:
#### Install a supported webdriver in the Celery worker
This is done either by using a custom image that has the webdriver pre-installed, or installing at startup time by overriding the `command`. Here's a working example for `chromedriver`:
```yaml
supersetWorker:
command:
- /bin/sh
- -c
- |
# Install chrome webdriver
# See https://github.com/apache/superset/blob/4fa3b6c7185629b87c27fc2c0e5435d458f7b73d/docs/src/pages/admin-docs/installation/email_reports.mdx
# This is required because our process runs as root (in order to install pip packages)
"--no-sandbox",
"--disable-setuid-sandbox",
"--disable-extensions",
]
```
### Load the Examples data and dashboards
If you are trying Superset out and want some data and dashboards to explore, you can load some examples by creating a `my_values.yaml` and deploying it as described above in the **Configure your setting overrides** step of the **Running** section.
To load the examples, add the following to the `my_values.yaml` file:
```yaml
init:
loadExamples: true
```
:::resources
- [Tutorial: Mastering Data Visualization — Installing Superset on Kubernetes with Helm Chart](https://mahira-technology.medium.com/mastering-data-visualization-installing-superset-on-kubernetes-cluster-using-helm-chart-e4ec99199e1e)
- [Tutorial: Installing Apache Superset in Kubernetes](https://aws.plainenglish.io/installing-apache-superset-in-kubernetes-1aec192ac495)
:::
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.