docs: Superset 6.1 documentation catch-up — batch 5 (#39454)

Co-authored-by: Superset Dev <dev@superset.apache.org>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
Evan Rusackas
2026-04-21 20:30:27 -04:00
committed by GitHub
parent e6853894ab
commit e1ed5003a8
7 changed files with 484 additions and 5 deletions

View File

@@ -0,0 +1,162 @@
{/*
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
*/}
---
title: AWS IAM Authentication
sidebar_label: AWS IAM Authentication
sidebar_position: 15
---
# AWS IAM Authentication for AWS Databases
Superset supports IAM-based authentication for **Amazon Aurora** (PostgreSQL and MySQL) and **Amazon Redshift**. IAM auth eliminates the need for database passwords — Superset generates a short-lived auth token using temporary AWS credentials instead.
Cross-account IAM role assumption via STS `AssumeRole` is supported, allowing a Superset deployment in one AWS account to connect to databases in a different account.
## Prerequisites
- Enable the `AWS_DATABASE_IAM_AUTH` feature flag in `superset_config.py`. IAM authentication is gated behind this flag; if it is disabled, connections using `aws_iam` fail with *"AWS IAM database authentication is not enabled."*
```python
FEATURE_FLAGS = {
"AWS_DATABASE_IAM_AUTH": True,
}
```
- `boto3` must be installed in your Superset environment:
```bash
pip install boto3
```
- The Superset server's IAM role (or static credentials) must have permission to call `sts:AssumeRole` (for cross-account) or the same-account permissions for the target service:
- **Aurora (RDS)**: `rds-db:connect`
- **Redshift provisioned**: `redshift:GetClusterCredentials`
- **Redshift Serverless**: `redshift-serverless:GetCredentials` and `redshift-serverless:GetWorkgroup`
- SSL must be enabled on the Aurora / Redshift endpoint (required for IAM token auth).
## Configuration
IAM authentication is configured via the **encrypted_extra** field of the database connection. Access this field in the **Advanced** → **Security** section of the database connection form, under **Secure Extra**.
### Aurora PostgreSQL or Aurora MySQL
```json
{
"aws_iam": {
"enabled": true,
"role_arn": "arn:aws:iam::222222222222:role/SupersetDatabaseAccess",
"external_id": "superset-prod-12345",
"region": "us-east-1",
"db_username": "superset_iam_user",
"session_duration": 3600
}
}
```
| Field | Required | Description |
|-------|----------|-------------|
| `enabled` | Yes | Set to `true` to activate IAM auth |
| `role_arn` | No | ARN of the cross-account IAM role to assume via STS. Omit for same-account auth |
| `external_id` | No | External ID for the STS `AssumeRole` call, if required by the target role's trust policy |
| `region` | Yes | AWS region of the database cluster |
| `db_username` | Yes | The database username associated with the IAM identity |
| `session_duration` | No | STS session duration in seconds (default: `3600`) |
### Redshift (Serverless)
```json
{
"aws_iam": {
"enabled": true,
"role_arn": "arn:aws:iam::222222222222:role/SupersetRedshiftAccess",
"region": "us-east-1",
"workgroup_name": "my-workgroup",
"db_name": "dev"
}
}
```
### Redshift (Provisioned Cluster)
```json
{
"aws_iam": {
"enabled": true,
"role_arn": "arn:aws:iam::222222222222:role/SupersetRedshiftAccess",
"region": "us-east-1",
"cluster_identifier": "my-cluster",
"db_username": "superset_iam_user",
"db_name": "dev"
}
}
```
## Cross-Account IAM Setup
To connect to a database in Account B from a Superset deployment in Account A:
**1. In Account B — create a database-access role:**
```json
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": ["rds-db:connect"],
"Resource": "arn:aws:rds-db:us-east-1:222222222222:dbuser/db-XXXXXXXXXXXX/superset_iam_user"
}
]
}
```
**Trust policy** (allows Account A's Superset role to assume it):
```json
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::111111111111:role/SupersetInstanceRole"
},
"Action": "sts:AssumeRole",
"Condition": {
"StringEquals": {
"sts:ExternalId": "superset-prod-12345"
}
}
}
]
}
```
**2. In Account A — grant Superset's role permission to assume the Account B role:**
```json
{
"Effect": "Allow",
"Action": "sts:AssumeRole",
"Resource": "arn:aws:iam::222222222222:role/SupersetDatabaseAccess"
}
```
**3. Configure the database connection in Superset** using the `role_arn` and `external_id` from the trust policy (as shown in the configuration example above).
## Credential Caching
STS credentials are cached in memory keyed by `(role_arn, region, external_id)` with a 10-minute TTL. This reduces the number of STS API calls when multiple queries are executed with the same connection. Tokens are refreshed automatically before expiry.

View File

@@ -10,6 +10,10 @@ version: 1
The superset cli allows you to import and export datasources from and to YAML. Datasources include
databases. The data is expected to be organized in the following hierarchy:
:::info
Superset's ZIP-based import/export also covers **dashboards**, **charts**, and **saved queries**, exercised through the UI and REST API. The [Dashboard Import Overwrite Behavior](#dashboard-import-overwrite-behavior) and [UUIDs in API Responses](#uuids-in-api-responses) sections below document the behavior shared across all asset types.
:::
```text
├──databases
| ├──database_1
@@ -75,6 +79,29 @@ The optional username flag **-u** sets the user used for the datasource import.
superset import_datasources -p <path / filename> -u 'admin'
```
## Dashboard Import Overwrite Behavior
When importing a dashboard ZIP with the **overwrite** option enabled, any existing charts that are part of the dashboard are **replaced** rather than duplicated. This applies to:
- Charts whose UUID matches a chart already present in the target instance
- The full chart configuration (query, visualization type, columns, metrics) is replaced by the imported version
If you import without the overwrite flag, existing charts with conflicting UUIDs are left unchanged and the import skips those objects. Use overwrite when you want to push a fully updated dashboard (including chart definitions) from a development or staging environment to production.
## UUIDs in API Responses
The REST API POST endpoints for **datasets**, **charts**, and **dashboards** include the auto-generated `uuid` field in the response body:
```json
{
"id": 42,
"uuid": "b8a8d5c3-1234-4abc-8def-0123456789ab",
...
}
```
UUIDs remain stable across import/export cycles and can be used for cross-environment workflows — for example, recording a UUID when creating a chart in development and using it to identify the matching chart after importing into production.
## Legacy Importing Datasources
### From older versions of Superset to current version

View File

@@ -501,6 +501,7 @@ All MCP settings go in `superset_config.py`. Defaults are defined in `superset/m
| `MCP_SERVICE_URL` | `None` | Public base URL for MCP-generated links (set this when behind a reverse proxy) |
| `MCP_DEBUG` | `False` | Enable debug logging |
| `MCP_DEV_USERNAME` | -- | Superset username for development mode (no auth) |
| `MCP_PARSE_REQUEST_ENABLED` | `True` | Pre-parse MCP tool inputs from JSON strings into objects. Set to `False` for clients (Claude Desktop, LangChain) that do not double-serialize arguments — this produces cleaner tool schemas for those clients |
### Authentication
@@ -664,6 +665,32 @@ MCP_CSRF_CONFIG = {
---
## Audit Events
All MCP tool calls are logged to Superset's event logger, the same system used by the web UI (viewable at **Settings → Action Log**). Each event captures:
- **Action**: `mcp.<tool_name>.<phase>` (e.g., `mcp.list_databases.query`)
- **User**: the resolved Superset username from the JWT or dev config
- **Timestamp**: when the operation ran
This means MCP activity is auditable alongside normal user activity. No additional configuration is required — logging is on by default whenever the event logger is enabled in your Superset deployment.
## Tool Pagination
MCP list tools (`list_datasets`, `list_charts`, `list_dashboards`, `list_databases`) use **offset pagination** via `page` (1-based) and `page_size` parameters. Responses include `page`, `page_size`, `total_count`, `total_pages`, `has_previous`, and `has_next`. To iterate through all results:
```python
# Example: fetch all charts across pages
all_charts = []
page = 1
while True:
result = mcp.list_charts(page=page, page_size=50)
all_charts.extend(result["charts"])
if not result.get("has_next"):
break
page += 1
```
## Security Best Practices
- **Use TLS** for all production MCP endpoints -- place the server behind a reverse proxy with HTTPS