docs(mcp): add comprehensive architecture, security, and production deployment documentation (#36017)

Amin Ghadersohi
2025-11-20 01:41:56 +10:00
committed by GitHub
parent 4582f0e8d2
commit 66afdfd119
4 changed files with 2929 additions and 0 deletions


@@ -23,6 +23,107 @@ This file documents any backwards-incompatible changes in Superset and
assists people when migrating to a new version.
## Next
### MCP Service
The MCP (Model Context Protocol) service enables AI assistants and automation tools to interact programmatically with Superset.
#### New Features
- MCP service infrastructure with FastMCP framework
- Tools for dashboards, charts, datasets, SQL Lab, and instance metadata
- Optional dependency: install with `pip install apache-superset[fastmcp]`
- Runs as a separate process from the Superset web server
- JWT-based authentication for production deployments
#### New Configuration Options
**Development** (single-user, local testing):
```python
# superset_config.py
MCP_DEV_USERNAME = "admin" # User for MCP authentication
MCP_SERVICE_HOST = "localhost"
MCP_SERVICE_PORT = 5008
```
**Production** (JWT-based, multi-user):
```python
# superset_config.py
MCP_AUTH_ENABLED = True
MCP_JWT_ISSUER = "https://your-auth-provider.com"
MCP_JWT_AUDIENCE = "superset-mcp"
MCP_JWT_ALGORITHM = "RS256" # or "HS256" for shared secrets
# Option 1: Use JWKS endpoint (recommended for RS256)
MCP_JWKS_URI = "https://auth.example.com/.well-known/jwks.json"
# Option 2: Use static public key (RS256)
MCP_JWT_PUBLIC_KEY = "-----BEGIN PUBLIC KEY-----..."
# Option 3: Use shared secret (HS256)
MCP_JWT_ALGORITHM = "HS256"
MCP_JWT_SECRET = "your-shared-secret-key"
# Optional overrides
MCP_SERVICE_HOST = "0.0.0.0"
MCP_SERVICE_PORT = 5008
MCP_SESSION_CONFIG = {
    "SESSION_COOKIE_SECURE": True,
    "SESSION_COOKIE_HTTPONLY": True,
    "SESSION_COOKIE_SAMESITE": "Strict",
}
```
#### Running the MCP Service
```bash
# Development
superset mcp run --port 5008 --debug
# Production
superset mcp run --port 5008
# With factory config
superset mcp run --port 5008 --use-factory-config
```
#### Deployment Considerations
The MCP service runs as a **separate process** from the Superset web server.
**Important**:
- Requires the same Python environment and configuration as Superset
- Shares the metadata database with the main Superset app (each process has its own connection pool)
- Can be scaled independently of the web server
- Requires `fastmcp` package (optional dependency)
**Installation**:
```bash
# Install with MCP support
pip install apache-superset[fastmcp]
# Or add to requirements.txt
apache-superset[fastmcp]>=X.Y.Z
```
**Process Management**:
Use systemd, supervisord, or Kubernetes to manage the MCP service process.
See `superset/mcp_service/PRODUCTION.md` for deployment guides.
**Security**:
- Development: Uses `MCP_DEV_USERNAME` for single-user access
- Production: **MUST** configure JWT authentication
- See `superset/mcp_service/SECURITY.md` for details
#### Documentation
- Architecture: `superset/mcp_service/ARCHITECTURE.md`
- Security: `superset/mcp_service/SECURITY.md`
- Production: `superset/mcp_service/PRODUCTION.md`
- Developer Guide: `superset/mcp_service/CLAUDE.md`
- Quick Start: `superset/mcp_service/README.md`
---
- [33055](https://github.com/apache/superset/pull/33055): Upgrades Flask-AppBuilder to 5.0.0. The AUTH_OID authentication type has been deprecated and is no longer available as an option in Flask-AppBuilder. OpenID (OID) is considered a deprecated authentication protocol - if you are using AUTH_OID, you will need to migrate to an alternative authentication method such as OAuth, LDAP, or database authentication before upgrading.
- [35062](https://github.com/apache/superset/pull/35062): Changed the function signature of `setupExtensions` to `setupCodeOverrides` with options as arguments.
- [34871](https://github.com/apache/superset/pull/34871): Fixed Jest test hanging issue from Ant Design v5 upgrade. MessageChannel is now mocked in test environment to prevent rc-overflow from causing Jest to hang. Test environment only - no production impact.


@@ -0,0 +1,693 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
# MCP Service Architecture
## Overview
The Apache Superset MCP (Model Context Protocol) service provides programmatic access to Superset functionality through a standardized protocol that enables AI assistants and automation tools to interact with dashboards, charts, datasets, and SQL Lab.
The MCP service runs as a **separate process** from the Superset web server, using its own Flask application instance and HTTP server while sharing the same database and configuration with the main Superset application.
## Flask Singleton Pattern
### Why Module-Level Singleton?
The MCP service uses a module-level singleton Flask application instance rather than creating a new app instance per request. This design decision is based on several important considerations:
**Separate Process Architecture**:
- The MCP service runs as an independent process from the Superset web server
- It has its own HTTP server (via FastMCP/Starlette) handling MCP protocol requests
- Each MCP tool invocation occurs within the context of this single, long-lived Flask app
**Benefits of Module-Level Singleton**:
1. **Consistent Database Connection Pool**
- A single SQLAlchemy connection pool is maintained across all tool calls
- Connections are efficiently reused rather than recreated
- Connection pool configuration (size, timeout, etc.) behaves predictably
2. **Shared Configuration Access**
- Flask app configuration is loaded once at startup
- All tools access the same configuration state
- Changes to runtime config affect all subsequent tool calls consistently
3. **Thread-Safe Initialization**
- The Flask app is created exactly once using `threading.Lock()`
- Multiple concurrent requests safely share the same app instance
- No risk of duplicate initialization or race conditions
4. **Lower Resource Overhead**
- No per-request app creation/teardown overhead
- Memory footprint remains constant regardless of request volume
- Extension initialization (Flask-AppBuilder, Flask-Migrate, etc.) happens once
**When Module-Level Singleton Is Appropriate**:
- Service runs as dedicated daemon/process
- Application state is consistent across all requests
- No per-request application context needed
- Long-lived server process with many requests
**When Module-Level Singleton Is NOT Appropriate**:
- Testing with different configurations (use app fixtures instead)
- Multi-tenant deployments requiring different app configs per tenant
- Dynamic plugin loading requiring app recreation
- Development scenarios requiring hot-reload of app configuration
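The lock-guarded, create-exactly-once behavior described above can be sketched in a few lines. This is an illustrative pattern, not the actual Superset helpers (a stand-in object replaces `create_app()`):

```python
import threading

_app = None
_lock = threading.Lock()

def get_app():
    """Create the app on first call; every later call returns the same object."""
    global _app
    if _app is None:  # fast path: no lock once initialized
        with _lock:
            if _app is None:  # double-checked: another thread may have won the race
                _app = object()  # stand-in for create_app()
    return _app

app1 = get_app()
app2 = get_app()
assert app1 is app2  # same instance every time
```

In practice Superset leans on module-level initialization, which Python's import machinery already serializes, so the explicit lock is only needed for lazy creation.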
### Implementation Details
The singleton is implemented in `flask_singleton.py`:
```python
# Module-level instance - created once on import
from flask import Flask

from superset.app import create_app
from superset.mcp_service.mcp_config import get_mcp_config

_temp_app = create_app()
with _temp_app.app_context():
    mcp_config = get_mcp_config(_temp_app.config)
    _temp_app.config.update(mcp_config)
app = _temp_app


def get_flask_app() -> Flask:
    """Get the Flask app instance."""
    return app
```
**Key characteristics**:
- No complex patterns or metaclasses needed
- The module itself acts as the singleton container
- Clean, Pythonic approach: module-level code runs exactly once per process, so the module itself is the singleton
- Application context pushed during initialization to avoid "Working outside of application context" errors
## Multitenant Architecture
### Current Implementation
The MCP service uses **Option B: Shared Process with Tenant Isolation**:
```mermaid
graph LR
T1[Tenant 1]
T2[Tenant 2]
T3[Tenant 3]
MCP[Single MCP Process]
DB[(Superset Database)]
T1 --> MCP
T2 --> MCP
T3 --> MCP
MCP --> DB
MCP -.->|Isolation via| ISO[User authentication JWT or dev user<br/>Flask-AppBuilder RBAC<br/>Dataset access filters<br/>Row-level security]
style ISO fill:#f9f,stroke:#333,stroke-width:2px
```
### Tenant Isolation Mechanisms
#### Database Level
**Superset's Existing RLS (Row-Level Security)**:
- RLS rules are defined at the dataset level
- Rules filter queries based on user attributes (e.g., `department = '{{ current_user.department }}'`)
- The MCP service respects all RLS rules automatically through Superset's query execution layer
**No Schema-Based Isolation**:
- The current implementation does NOT use separate database schemas per tenant
- All Superset metadata (dashboards, charts, datasets) exists in the same database schema
- Database-level isolation is achieved through Superset's permission system rather than physical schema separation
#### Application Level
**Flask-AppBuilder Security Manager**:
- Every MCP tool call uses `@mcp_auth_hook` decorator
- The auth hook sets `g.user` to the authenticated user (from JWT or `MCP_DEV_USERNAME`)
- Superset's security manager then enforces permissions based on this user's roles
**User-Based Access Control**:
- Users can only access resources they have permissions for
- Dashboard ownership and role-based permissions are enforced
- The `can_access_datasource()` method validates dataset access
**Dataset Access Filters**:
- All list operations (dashboards, charts, datasets) use Superset's access filters:
- `DashboardAccessFilter` - filters dashboards based on user permissions
- `ChartAccessFilter` - filters charts based on user permissions
- `DatasourceFilter` - filters datasets based on user permissions
**Row-Level Security Enforcement**:
- RLS rules are applied transparently during query execution
- The MCP service makes no modifications to bypass RLS
- SQL queries executed through `execute_sql` tool respect RLS policies
#### JWT Tenant Claims
**Development Mode** (single user):
```python
# superset_config.py
MCP_DEV_USERNAME = "admin"
```
**Production Mode** (JWT-based):
```json
{
  "sub": "user@company.com",
  "email": "user@company.com",
  "scopes": ["superset:read", "superset:chart:create"],
  "exp": 1672531200
}
```
**Future Enhancement** (multi-tenant JWT):
```json
{
  "sub": "user@tenant-a.com",
  "tenant_id": "tenant-a",
  "scopes": ["superset:read"],
  "exp": 1672531200
}
```
The `tenant_id` claim could be used in future versions to:
- Further isolate data by tenant context
- Apply tenant-specific RLS rules
- Log and audit actions by tenant
- Implement tenant-specific rate limits
## Process Model
### Single Process Deployment
**When to Use**:
- Development and testing environments
- Small deployments with low request volume (< 100 requests/minute)
- Single-tenant installations
- Resource-constrained environments
**Resource Characteristics**:
- Memory: ~500MB-1GB (includes Flask app, SQLAlchemy, screenshot pool)
- CPU: Mostly I/O bound (database queries, screenshot generation)
- Database connections: Configurable via `SQLALCHEMY_POOL_SIZE` (default: 5)
**Scaling Limitations**:
- Single Python process = GIL limitations for CPU-bound operations
- Screenshot generation can block other requests
- Limited horizontal scalability without load balancer
**Example Command**:
```bash
superset mcp run --port 5008
```
### Multi-Process Deployment
**Using Gunicorn Workers**:
```bash
gunicorn \
--workers 4 \
--bind 0.0.0.0:5008 \
--worker-class uvicorn.workers.UvicornWorker \
superset.mcp_service.server:app
```
**Configuration Considerations**:
- Worker count: `2-4 x CPU cores` (typical recommendation)
- Each worker has its own Flask app instance via module-level singleton
- Workers share nothing - fully isolated processes
- Database connection pool per worker (watch total connections)
**Process Pool Management**:
- Use process manager (systemd, supervisord) for auto-restart
- Health checks to detect and restart failed workers
- Graceful shutdown to complete in-flight requests
**Load Balancing**:
- Use nginx/HAProxy to distribute requests across workers
- Round-robin or least-connections algorithms work well
- Sticky sessions NOT required (stateless API)
### Containerized Deployment
**Docker**:
```dockerfile
FROM apache/superset:latest
CMD ["superset", "mcp", "run", "--port", "5008"]
```
**Kubernetes Deployment**:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: superset-mcp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: superset-mcp
  template:
    metadata:
      labels:
        app: superset-mcp
    spec:
      containers:
        - name: mcp
          image: apache/superset:latest
          command: ["superset", "mcp", "run", "--port", "5008"]
          ports:
            - containerPort: 5008
          env:
            - name: SUPERSET_CONFIG_PATH
              value: /app/pythonpath/superset_config.py
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
```
**Horizontal Pod Autoscaling**:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: superset-mcp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: superset-mcp
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```
**Service Mesh Integration**:
- Istio/Linkerd can provide:
- Automatic retries and circuit breaking
- Distributed tracing
- Mutual TLS between pods
- Advanced traffic routing
## Database Connection Management
### Connection Pooling
The MCP service uses SQLAlchemy's connection pooling with configuration inherited from Superset:
```python
# superset_config.py
SQLALCHEMY_POOL_SIZE = 5 # Max connections per worker
SQLALCHEMY_POOL_TIMEOUT = 30 # Seconds to wait for connection
SQLALCHEMY_MAX_OVERFLOW = 10 # Extra connections beyond pool_size
SQLALCHEMY_POOL_RECYCLE = 3600 # Recycle connections after 1 hour
```
**Connection Lifecycle**:
1. Request arrives at MCP tool
2. Tool calls DAO method which accesses `db.session`
3. SQLAlchemy checks out connection from pool
4. Query executes on borrowed connection
5. Connection returns to pool (not closed)
6. Connection reused for next request
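The lifecycle above can be shown in miniature with a toy pool (plain Python, not SQLAlchemy) to make the "return, don't close" behavior concrete:

```python
import queue

class TinyPool:
    """Toy pool: checkout borrows a connection, checkin returns it (never closes)."""
    def __init__(self, size: int):
        self._q = queue.Queue()
        for n in range(size):
            self._q.put(f"conn-{n}")  # stand-ins for real DBAPI connections

    def checkout(self) -> str:
        # Blocks up to 1s, like SQLALCHEMY_POOL_TIMEOUT when the pool is exhausted
        return self._q.get(timeout=1)

    def checkin(self, conn: str) -> None:
        self._q.put(conn)  # step 5: returned to the pool, not closed

pool = TinyPool(size=1)   # a single connection makes the reuse obvious
c1 = pool.checkout()      # request 1 borrows the connection
pool.checkin(c1)          # ...and returns it after the query
c2 = pool.checkout()      # step 6: request 2 reuses the same connection
assert c1 == c2
```

SQLAlchemy's real pool adds overflow connections, timeouts, and recycling on top of this basic borrow/return cycle.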
**Pool Size Recommendations**:
- **Single process**: 5-10 connections
- **Multi-worker (4 workers)**: 3-5 connections per worker = 12-20 total
- **Monitor**: Database max_connections setting must exceed total pool size across all MCP workers
**Example with 4 Gunicorn workers**:
```python
SQLALCHEMY_POOL_SIZE = 5
SQLALCHEMY_MAX_OVERFLOW = 5
# Total potential connections: 4 workers × (5 + 5) = 40 connections
# Ensure database supports 40+ connections
```
### Transaction Handling
**MCP Tool Transaction Pattern**:
```python
@mcp.tool
@mcp_auth_hook
def my_tool(param: str) -> Result:
    # Auth hook sets g.user and manages session
    try:
        # Tool executes within implicit transaction
        result = DashboardDAO.find_by_id(123)
        return Result(data=result)
    except Exception:
        # On error: rollback happens in auth hook's except block
        raise
    finally:
        # On success: rollback happens in auth hook's finally block
        # (read-only operations don't commit)
        pass
```
**Session Cleanup in Auth Hook**:
The `@mcp_auth_hook` decorator manages session lifecycle:
```python
# On error path
except Exception:
    try:
        db.session.rollback()
        db.session.remove()
    except Exception as e:
        logger.warning("Error cleaning up session: %s", e)
    raise

# On success path (finally block)
finally:
    try:
        if db.session.is_active:
            db.session.rollback()  # Cleanup, don't commit
    except Exception as e:
        logger.warning("Error in finally block: %s", e)
```
**Why Rollback on Success?**
- MCP tools are primarily **read-only operations**
- No explicit commits needed for queries
- Rollback ensures clean slate for next request
- Write operations (create chart, etc.) use Superset's command pattern which handles commits internally
## Deployment Considerations
### Resource Requirements
**Memory Per Process**:
- Base Flask app: ~200MB
- SQLAlchemy + models: ~100MB
- WebDriver pool (if screenshots enabled): ~200MB
- Request processing overhead: ~50MB per concurrent request
- **Total**: 500MB-1GB per process
**CPU Usage Patterns**:
- I/O bound: Most time spent waiting on database/screenshots
- Low CPU during normal operations (< 20% per core)
- CPU spikes during:
- Screenshot generation (WebDriver rendering)
- Large dataset query processing
- Complex chart configuration validation
**Database Connections**:
- **Single process**: 5-10 connections (pool_size + max_overflow)
- **Multi-process**: `(pool_size + max_overflow) × worker_count`
- **Example**: 4 workers × 10 max connections = 40 total database connections
### Scaling Strategy
**When to Scale Horizontally**:
- Request latency increases beyond acceptable threshold (e.g., p95 > 2 seconds)
- CPU utilization consistently > 70%
- Request queue depth growing
- Database connection pool frequently exhausted
**Load Balancing Between MCP Instances**:
**Option 1: Nginx Round-Robin**:
```nginx
upstream mcp_backend {
    server mcp-1:5008;
    server mcp-2:5008;
    server mcp-3:5008;
}

server {
    location / {
        proxy_pass http://mcp_backend;
    }
}
```
**Option 2: Kubernetes Service**:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: superset-mcp
spec:
  selector:
    app: superset-mcp
  ports:
    - port: 5008
      targetPort: 5008
  type: ClusterIP
```
**Session Affinity**:
- NOT required - MCP service is stateless
- Each request is independent
- No session state maintained between requests
- Load balancer can freely distribute requests
### High Availability
**Multiple MCP Instances**:
- Deploy at least 2 instances for redundancy
- Use load balancer health checks to detect failures
- Failed instances automatically removed from rotation
**Health Checks**:
The MCP service provides a health check tool:
```python
# Internal health check
from datetime import datetime, timezone

@mcp.tool
def health_check() -> HealthCheckResponse:
    return HealthCheckResponse(
        status="healthy",
        timestamp=datetime.now(timezone.utc),
        database_connection="ok",
    )
```
**Load balancer health check**:
```nginx
# Nginx example
upstream mcp_backend {
    server mcp-1:5008 max_fails=3 fail_timeout=30s;
    server mcp-2:5008 max_fails=3 fail_timeout=30s;
}
```
**Kubernetes health check**:
```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 5008
  initialDelaySeconds: 30
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /health
    port: 5008
  initialDelaySeconds: 10
  periodSeconds: 5
```
**Failover Handling**:
- Load balancer automatically routes around failed instances
- MCP clients should implement retry logic for transient failures
- Use circuit breaker pattern for repeated failures
- Monitor and alert on instance failures
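The client-side retry suggested above might look like this sketch (the flaky callable stands in for a hypothetical MCP client call; it is not part of the service):

```python
import time

def with_retries(fn, attempts=3, base_delay=0.1):
    """Retry a callable on transient errors with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # retries exhausted: surface the failure
            time.sleep(base_delay * (2 ** attempt))  # 0.1s, 0.2s, 0.4s, ...

# Example: a flaky call that succeeds on the third try
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

assert with_retries(flaky) == "ok"
assert calls["n"] == 3
```

A circuit breaker adds one more piece of state on top of this: after N consecutive failures, stop calling the backend entirely for a cool-down period.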
### Database Considerations
**Shared Database with Superset**:
- MCP service and Superset web server share the same database
- Same SQLAlchemy models and schema
- Database migrations applied once, affect both services
**Connection Pool Sizing**:
```
Total DB Connections =
    Superset Web (workers × (pool_size + max_overflow)) +
    MCP Service (workers × (pool_size + max_overflow)) +
    Other services

Must be < Database max_connections
```
**Example Calculation**:
- Superset web: 8 workers × 10 connections = 80
- MCP service: 4 workers × 10 connections = 40
- Other: 20 reserved
- **Total**: 140 connections
- **Database**: Set max_connections >= 150
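The same arithmetic can be captured in a small helper (illustrative only, not part of the service):

```python
def total_connections(services, reserved=0):
    """Worst-case DB connections: workers x (pool_size + max_overflow) per service."""
    return reserved + sum(
        workers * (pool_size + max_overflow)
        for workers, pool_size, max_overflow in services
    )

# Superset web: 8 workers x (5 + 5); MCP: 4 workers x (5 + 5); 20 reserved
total = total_connections([(8, 5, 5), (4, 5, 5)], reserved=20)
assert total == 140  # so set database max_connections >= 150 for headroom
```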
### Monitoring Recommendations
**Key Metrics to Track**:
- Request rate per tool
- Request latency (p50, p95, p99)
- Error rate by tool and error type
- Database connection pool utilization
- Memory usage per process
- Active concurrent requests
**Example Prometheus Metrics** (future implementation):
```python
mcp_requests_total{tool="list_charts", status="success"}
mcp_request_duration_seconds{tool="list_charts", quantile="0.95"}
mcp_database_connections_active
mcp_database_connections_idle
mcp_memory_usage_bytes
```
**Log Aggregation**:
- Centralize logs from all MCP instances
- Use structured logging (JSON format)
- Include trace IDs for request correlation
- Alert on error rate spikes
## Architecture Diagrams
### Request Flow
```mermaid
sequenceDiagram
participant Client as MCP Client<br/>(Claude/automation)
participant FastMCP as FastMCP Server<br/>(Starlette/Uvicorn)
participant Auth as MCP Auth Hook
participant Tool as Tool Implementation<br/>(e.g., list_charts)
participant DAO as Superset DAO Layer<br/>(ChartDAO, DashboardDAO)
participant DB as Database<br/>(PostgreSQL/MySQL)
Client->>FastMCP: MCP Protocol (HTTP/SSE)
FastMCP->>Auth: @mcp.tool decorator
Auth->>Auth: Sets g.user, manages session
Auth->>Tool: Execute tool
Tool->>DAO: Uses DAO pattern
DAO->>DB: SQLAlchemy ORM
DB-->>DAO: Query results
DAO-->>Tool: Processed data
Tool-->>Auth: Tool response
Auth-->>FastMCP: Response with cleanup
FastMCP-->>Client: MCP response
```
### Multi-Instance Deployment
```mermaid
graph TD
LB[Load Balancer<br/>Nginx/K8s Service]
MCP1[MCP Instance 1<br/>port 5008]
MCP2[MCP Instance 2<br/>port 5008]
MCP3[MCP Instance 3<br/>port 5008]
DB[(Superset Database<br/>shared by all instances)]
LB --> MCP1
LB --> MCP2
LB --> MCP3
MCP1 --> DB
MCP2 --> DB
MCP3 --> DB
```
### Tenant Isolation
```mermaid
graph TD
UserA[User A<br/>JWT: tenant=acme]
UserB[User B<br/>JWT: tenant=beta]
MCP[MCP Service<br/>single process]
Auth[@mcp_auth_hook<br/>Sets g.user from JWT]
RBAC[Flask-AppBuilder<br/>RBAC]
Filters[Dataset Access<br/>Filters]
DB[(Superset Database<br/>single schema, filtered by permissions)]
UserA --> MCP
UserB --> MCP
MCP --> Auth
Auth --> RBAC
Auth --> Filters
RBAC --> |User A sees only<br/>acme dashboards| DB
Filters --> |User A queries filtered<br/>by RLS rules for acme| DB
```
## Comparison with Alternative Architectures
### Module-Level Singleton (Current) vs Per-Request App
| Aspect | Module-Level Singleton | Per-Request App |
|--------|----------------------|-----------------|
| Connection Pool | Single shared pool | New pool per request |
| Memory Overhead | Constant (~500MB) | 500MB × concurrent requests |
| Thread Safety | Must ensure thread-safe access | Each request isolated |
| Configuration | Loaded once at startup | Can vary per request |
| Performance | Fast (no setup overhead) | Slow (initialization cost) |
| Use Case | Production daemon | Testing/multi-config scenarios |
### Shared Process (Current) vs Separate Process Per Tenant
| Aspect | Shared Process | Process Per Tenant |
|--------|---------------|-------------------|
| Isolation | Application-level (RBAC/RLS) | Process-level (OS isolation) |
| Resource Usage | Efficient (shared resources) | Higher (duplicate resources) |
| Scaling | Horizontal (add instances) | Vertical (more processes) |
| Complexity | Simpler deployment | Complex orchestration |
| Security | Depends on Superset RBAC | Stronger isolation |
| Use Case | Most deployments | High-security multi-tenant |
## Future Architectural Considerations
### Async/Await Support
The current implementation uses synchronous request handling. Future versions could:
- Use `async`/`await` for I/O operations
- Implement connection pooling with `asyncpg` (PostgreSQL) or `aiomysql`
- Improve throughput for I/O-bound operations
### Caching Layer
Adding caching between MCP service and database:
- Redis cache for frequently accessed resources (dashboards, charts, datasets)
- Cache invalidation on updates
- Reduced database load for read-heavy workloads
### Event-Driven Updates
WebSocket support for real-time updates:
- Push notifications when dashboards/charts change
- Streaming query results for large datasets
- Live dashboard editing collaboration
## References
- **Flask Application Context**: https://flask.palletsprojects.com/en/stable/appcontext/
- **SQLAlchemy Connection Pooling**: https://docs.sqlalchemy.org/en/stable/core/pooling.html
- **FastMCP Documentation**: https://github.com/jlowin/fastmcp
- **Superset Security Model**: https://superset.apache.org/docs/security

File diff suppressed because it is too large.


@@ -0,0 +1,803 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
# MCP Service Security
## Overview
The MCP service implements multiple layers of security to ensure safe programmatic access to Superset functionality. This document covers authentication, authorization, session management, audit logging, and compliance considerations.
## Authentication
### Current Implementation (Development)
For development and testing, the MCP service uses a simple username-based authentication:
```python
# superset_config.py
MCP_DEV_USERNAME = "admin"
```
**How it works**:
1. The `@mcp_auth_hook` decorator calls `get_user_from_request()`
2. `get_user_from_request()` reads `MCP_DEV_USERNAME` from config
3. User is queried from database and set as `g.user`
4. All subsequent Superset operations use this user's permissions
**Development Use Only**:
- No token validation
- No multi-user support
- No authentication security
- Single user for all MCP requests
- NOT suitable for production
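The dev-mode flow above reduces to a config lookup plus a user lookup. In spirit (plain-Python simulation with simplified names, not the actual Superset code):

```python
# Simplified simulation of the dev-mode auth flow (hypothetical, not Superset internals)
config = {"MCP_DEV_USERNAME": "admin"}
users_db = {"admin": {"username": "admin", "roles": ["Admin"]}}

def get_user_from_request(cfg, users):
    """Steps 2-3: read MCP_DEV_USERNAME from config and load that user."""
    username = cfg.get("MCP_DEV_USERNAME")
    user = users.get(username)
    if user is None:
        raise LookupError(f"MCP_DEV_USERNAME {username!r} not found")
    return user

g_user = get_user_from_request(config, users_db)  # step 3: becomes g.user
assert g_user["username"] == "admin"
```

Every request resolves to this one user, which is exactly why the mode is development-only.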
### Production Implementation (JWT)
For production deployments, the MCP service supports JWT (JSON Web Token) authentication:
```python
# superset_config.py
MCP_AUTH_ENABLED = True
MCP_JWT_ISSUER = "https://your-auth-provider.com"
MCP_JWT_AUDIENCE = "superset-mcp"
MCP_JWT_ALGORITHM = "RS256" # or "HS256" for symmetric keys
# Option 1: Use JWKS endpoint (recommended for RS256)
MCP_JWKS_URI = "https://your-auth-provider.com/.well-known/jwks.json"
# Option 2: Use static public key (RS256)
MCP_JWT_PUBLIC_KEY = """-----BEGIN PUBLIC KEY-----
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA...
-----END PUBLIC KEY-----"""
# Option 3: Use shared secret (HS256 - less secure)
MCP_JWT_ALGORITHM = "HS256"
MCP_JWT_SECRET = "your-shared-secret-key"
```
**JWT Token Structure**:
```json
{
  "iss": "https://your-auth-provider.com",
  "sub": "user@company.com",
  "aud": "superset-mcp",
  "exp": 1735689600,
  "iat": 1735686000,
  "email": "user@company.com",
  "scopes": ["superset:read", "superset:chart:create"]
}
```
**Required Claims**:
- `iss` (issuer): Must match `MCP_JWT_ISSUER`
- `sub` (subject): User identifier (username/email)
- `aud` (audience): Must match `MCP_JWT_AUDIENCE`
- `exp` (expiration): Token expiration timestamp
- `iat` (issued at): Token creation timestamp
**Optional Claims**:
- `email`: User's email address
- `username`: Alternative to `sub` for user identification
- `scopes`: Array of permission scopes
- `tenant_id`: Multi-tenant identifier (future use)
**Token Validation Process**:
1. Extract Bearer token from `Authorization` header
2. Verify token signature using public key or JWKS
3. Validate `iss`, `aud`, and `exp` claims
4. Check required scopes (if configured)
5. Extract user identifier from `sub`, `email`, or `username` claim
6. Look up Superset user from database
7. Set `g.user` for request context
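Signature verification (steps 1-2) is handled by a JWT library; the claim checks and user-identifier extraction (steps 3-5) can be sketched in plain Python over the decoded payload (function name is illustrative):

```python
import time

def validate_claims(claims: dict, issuer: str, audience: str) -> str:
    """Validate iss/aud/exp on a decoded JWT payload and return the user id."""
    if claims.get("iss") != issuer:
        raise ValueError("issuer mismatch")
    if claims.get("aud") != audience:
        raise ValueError("audience mismatch")
    if claims.get("exp", 0) <= time.time():
        raise ValueError("token expired")
    # Step 5: prefer sub, fall back to email or username
    user_id = claims.get("sub") or claims.get("email") or claims.get("username")
    if not user_id:
        raise ValueError("no user identifier in token")
    return user_id

claims = {
    "iss": "https://your-auth-provider.com",
    "aud": "superset-mcp",
    "exp": time.time() + 900,
    "sub": "user@company.com",
}
assert validate_claims(claims, "https://your-auth-provider.com", "superset-mcp") == "user@company.com"
```

Steps 6-7 (database lookup and setting `g.user`) then map the returned identifier onto an existing Superset user.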
**Example Client Usage**:
```bash
# Using curl
curl -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  http://localhost:5008/list_charts
```

MCP client configuration (Claude Desktop):
```json
{
  "mcpServers": {
    "superset": {
      "url": "http://localhost:5008",
      "headers": {
        "Authorization": "Bearer YOUR_JWT_TOKEN"
      }
    }
  }
}
```
### Token Renewal and Refresh
**Short-lived Access Tokens** (recommended):
- Issue tokens with short expiration (e.g., 15 minutes)
- Client must refresh token before expiration
- Reduces risk of token theft
**Refresh Token Pattern**:
```mermaid
sequenceDiagram
participant Client
participant AuthProvider as Auth Provider
participant MCP as MCP Service
Client->>AuthProvider: Request access
AuthProvider-->>Client: access_token (15 min)<br/>refresh_token (30 days)
Client->>MCP: Request with access_token
MCP-->>Client: Response
Note over Client,MCP: Access token expires
Client->>AuthProvider: Request new token with refresh_token
AuthProvider-->>Client: New access_token (15 min)
Client->>MCP: Request with new access_token
MCP-->>Client: Response
Note over Client,AuthProvider: Refresh token expires
Client->>AuthProvider: User must re-authenticate
```
**MCP Service Responsibility**:
- MCP service only validates access tokens
- Refresh token handling is the client's responsibility
- Auth provider (OAuth2/OIDC server) handles token refresh
### Service Account Patterns
For automation and batch jobs, use service accounts instead of user credentials:
```json
{
  "iss": "https://your-auth-provider.com",
  "sub": "service-account@automation.company.com",
  "aud": "superset-mcp",
  "exp": 1735689600,
  "client_id": "superset-automation",
  "scopes": ["superset:read", "superset:chart:create"]
}
```
**Service Account Best Practices**:
- Create dedicated Superset users for service accounts
- Grant minimal required permissions
- Use long-lived tokens only when necessary
- Rotate service account credentials regularly
- Log all service account activity
- Use separate service accounts per automation job
**Example Superset Service Account Setup**:
```bash
# Create service account user in Superset
superset fab create-user \
--role Alpha \
--username automation-service \
--firstname Automation \
--lastname Service \
--email automation@company.com \
--password <generated-password>
# Grant specific permissions
# (Use Superset UI or FAB CLI to configure role permissions)
```
## Authorization
### RBAC Integration
The MCP service fully integrates with Superset's Flask-AppBuilder role-based access control:
**Role Hierarchy**:
- **Admin**: Full access to all resources
- **Alpha**: Can create and edit dashboards, charts, datasets
- **Gamma**: Read-only access to permitted resources
- **Custom Roles**: Fine-grained permission sets
**Permission Checking Flow**:
```python
# In MCP tool
@mcp.tool
@mcp_auth_hook  # Sets g.user
def list_dashboards(filters: List[Filter]) -> DashboardList:
    # Flask-AppBuilder security manager automatically filters
    # based on g.user's permissions
    dashboards = DashboardDAO.find_by_ids(...)
    # Only dashboards g.user can access are returned
    return dashboards
```
**Permission Types**:
| Permission | Description | Example |
|------------|-------------|---------|
| `can_read` | View resource | View dashboard details |
| `can_write` | Edit resource | Update chart configuration |
| `can_delete` | Delete resource | Remove dashboard |
| `datasource_access` | Access dataset | Query dataset in chart |
| `database_access` | Access database | Execute SQL in SQL Lab |
### Row-Level Security (RLS)
RLS rules filter query results based on user attributes:
**RLS Rule Example**:
```sql
-- Only show records for user's department
department = '{{ current_user().department }}'
```
**How RLS Works with MCP**:
```mermaid
sequenceDiagram
participant Client
participant Auth as @mcp_auth_hook
participant Tool as MCP Tool
participant DAO as Superset DAO
participant DB as Database
Client->>Auth: Request with JWT/dev username
Auth->>Auth: Set g.user
Auth->>Tool: Execute tool
Tool->>DAO: Call ChartDAO.get_chart_data()
DAO->>DAO: Apply RLS rules<br/>Replace template variables<br/>with g.user attributes
DAO->>DB: Query with RLS filters in WHERE clause
DB-->>DAO: Only permitted rows
DAO-->>Tool: Filtered data
Tool-->>Client: Response
```
**RLS Configuration**:
RLS is configured per dataset in Superset UI:
1. Navigate to dataset → Edit → Row Level Security
2. Create RLS rule with SQL filter template
3. Assign rule to roles or users
4. MCP service automatically applies rules (no code changes needed)
**MCP Service Guarantees**:
- Cannot bypass RLS rules
- No privileged access mode
- RLS applied consistently across all tools
- Same security model as Superset web UI
### Dataset Access Control
The MCP service validates dataset access before executing queries:
```python
# In chart generation tool
@mcp.tool
@mcp_auth_hook
def generate_chart(dataset_id: int, ...) -> ChartResponse:
dataset = DatasetDAO.find_by_id(dataset_id)
# Check if user has access
if not has_dataset_access(dataset):
raise ValueError(
f"User {g.user.username} does not have access to dataset {dataset_id}"
)
# Proceed with chart creation
...
```
**Dataset Access Filters**:
All listing operations automatically filter by user access:
- `list_datasets`: Uses `DatasourceFilter` - only shows datasets user can query
- `list_charts`: Uses `ChartAccessFilter` - only shows charts with accessible datasets
- `list_dashboards`: Uses `DashboardAccessFilter` - only shows dashboards user can view
**Access Check Implementation**:
```python
from flask import g

from superset import security_manager
from superset.connectors.sqla.models import SqlaTable

def has_dataset_access(dataset: SqlaTable) -> bool:
"""Check if g.user can access dataset."""
if hasattr(g, "user") and g.user:
return security_manager.can_access_datasource(datasource=dataset)
return False
```
### Tool-Level Permissions
Different MCP tools require different Superset permissions:
| Tool | Required Permissions | Notes |
|------|---------------------|-------|
| `list_dashboards` | `can_read` on Dashboard | Returns only accessible dashboards |
| `get_dashboard_info` | `can_read` on Dashboard + dataset access | Validates dashboard and dataset permissions |
| `list_charts` | `can_read` on Slice | Returns only charts with accessible datasets |
| `get_chart_info` | `can_read` on Slice + dataset access | Validates chart and dataset permissions |
| `get_chart_data` | `can_read` on Slice + `datasource_access` | Executes query with RLS applied |
| `generate_chart` | `can_write` on Slice + `datasource_access` | Creates new chart |
| `update_chart` | `can_write` on Slice + ownership or Admin | Must own chart or be Admin |
| `list_datasets` | `datasource_access` | Returns only accessible datasets |
| `get_dataset_info` | `datasource_access` | Validates dataset access |
| `execute_sql` | `can_sql_json` or `can_sqllab` on Database | Executes SQL with RLS |
| `generate_dashboard` | `can_write` on Dashboard + dataset access | Creates new dashboard |
**Permission Denied Handling**:
```python
# If the user lacks permission, Superset raises an exception
from superset.exceptions import SupersetSecurityException

try:
result = DashboardDAO.find_by_id(dashboard_id)
except SupersetSecurityException as e:
raise ValueError(f"Access denied: {e}")
```
### JWT Scope Validation
Future implementation will support scope-based authorization:
```python
# superset_config.py
MCP_REQUIRED_SCOPES = ["superset:read"] # Minimum scopes required
```
**Scope Hierarchy**:
- `superset:read`: List and view resources
- `superset:chart:create`: Create new charts
- `superset:chart:update`: Update existing charts
- `superset:chart:delete`: Delete charts
- `superset:dashboard:create`: Create dashboards
- `superset:sql:execute`: Execute SQL queries
- `superset:admin`: Full administrative access
**Scope Enforcement** (future):
```python
@mcp.tool
@mcp_auth_hook
@require_scopes(["superset:chart:create"])
def generate_chart(...) -> ChartResponse:
# Only proceeds if JWT contains required scope
...
```
**Scope Validation Logic**:
1. Extract `scopes` array from JWT payload
2. Check if all required scopes present
3. Deny access if any scope missing
4. Log denied attempts for audit
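The validation steps above could be sketched as a decorator. This is hypothetical — `require_scopes` is not yet implemented, and for illustration the scopes arrive via a `jwt_scopes` keyword rather than from the validated token on the request context:

```python
import functools
import logging

logger = logging.getLogger("superset.mcp_service.auth")

def require_scopes(required: list[str]):
    """Hypothetical sketch of the future decorator: deny the call unless
    every required scope is present in the JWT's scopes claim."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, jwt_scopes=(), **kwargs):
            missing = [s for s in required if s not in jwt_scopes]
            if missing:
                # Log denied attempts for audit (step 4 above)
                logger.warning("Scope check failed: missing=%s", missing)
                raise PermissionError(f"Missing required scopes: {missing}")
            return func(*args, **kwargs)
        return wrapper
    return decorator

@require_scopes(["superset:chart:create"])
def generate_chart() -> str:
    return "chart created"
```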
## Session and CSRF Handling
### Session Configuration
The MCP service configures sessions for authentication context:
```python
# superset_config.py
MCP_SESSION_CONFIG = {
"SESSION_COOKIE_HTTPONLY": True, # Prevent JavaScript access
"SESSION_COOKIE_SECURE": True, # HTTPS only (production)
"SESSION_COOKIE_SAMESITE": "Strict", # CSRF protection
"SESSION_COOKIE_NAME": "superset_session",
"PERMANENT_SESSION_LIFETIME": 86400, # 24 hours
}
```
**Why Session Config in MCP?**
The MCP service uses Flask's session mechanism for:
- **Authentication context**: Storing `g.user` across request lifecycle
- **CSRF token generation**: Protecting state-changing operations
- **Request correlation**: Linking related tool calls
**Important Notes**:
- MCP service is **stateless** - no server-side session storage
- Sessions used only for request-scoped auth context
- Cookies used for auth token transmission (alternative to Bearer header)
- Session data NOT persisted between MCP service restarts
### CSRF Protection
CSRF (Cross-Site Request Forgery) protection is configured but currently **not enforced** for MCP tools:
```python
MCP_CSRF_CONFIG = {
"WTF_CSRF_ENABLED": True,
"WTF_CSRF_TIME_LIMIT": None, # No time limit
}
```
**Why CSRF Config Exists**:
- Flask-AppBuilder and Superset expect CSRF configuration
- Prevents errors during app initialization
- Future-proofing for potential web UI for MCP service
**Why CSRF NOT Enforced**:
- MCP protocol uses Bearer tokens (not cookies for auth)
- CSRF attacks rely on credentials the browser attaches automatically (cookies)
- Stateless, token-based design gives the browser nothing to attach on an attacker's behalf
- MCP clients are programmatic (not browsers)
**If Using Cookie-Based Auth** (future):
- Enable CSRF token requirement
- Include CSRF token in MCP tool requests
- Validate token on state-changing operations
**CSRF Token Flow** (if enabled):
```mermaid
sequenceDiagram
participant Client
participant MCP as MCP Service
participant Session as Session Store
Client->>MCP: Request CSRF token
MCP->>Session: Generate and store token
MCP-->>Client: Return CSRF token
Client->>MCP: Request with CSRF token
MCP->>Session: Validate token matches session
alt Token valid
MCP-->>Client: Process request
else Token invalid/missing
MCP-->>Client: Reject request (403)
end
```
### Production Security Recommendations
**HTTPS Required**:
```python
MCP_SESSION_CONFIG = {
"SESSION_COOKIE_SECURE": True, # MUST be True in production
}
```
Without HTTPS:
- Cookies transmitted in plaintext
- Session hijacking risk
- JWT tokens exposed
- Man-in-the-middle attacks possible
**SameSite Configuration**:
- `Strict`: Cookies never sent cross-site (most secure)
- `Lax`: Cookies sent on top-level navigation (less secure)
- `None`: Cookies sent everywhere (requires Secure flag, least secure)
**Recommended Production Settings**:
```python
MCP_SESSION_CONFIG = {
"SESSION_COOKIE_HTTPONLY": True, # Always
"SESSION_COOKIE_SECURE": True, # Always (HTTPS required)
"SESSION_COOKIE_SAMESITE": "Strict", # Recommended
"PERMANENT_SESSION_LIFETIME": 3600, # 1 hour (adjust as needed)
}
```
## Audit Logging
### Current Logging
The MCP service logs basic authentication events:
```python
# In @mcp_auth_hook
logger.debug(
"MCP tool call: user=%s, tool=%s",
user.username,
tool_func.__name__
)
```
**What's Logged**:
- User who made the request
- Which tool was called
- Timestamp (from log formatter)
- Success/failure (via exception logging)
**Log Format**:
```
2025-01-01 10:30:45,123 DEBUG [mcp_auth_hook] MCP tool call: user=admin, tool=list_dashboards
2025-01-01 10:30:45,456 ERROR [mcp_auth_hook] Tool execution failed: user=admin, tool=generate_chart, error=Permission denied
```
### Enhanced Audit Logging (Recommended)
For production deployments, implement structured logging:
```python
# superset_config.py
import logging
import json
class StructuredFormatter(logging.Formatter):
def format(self, record):
log_data = {
"timestamp": self.formatTime(record),
"level": record.levelname,
"logger": record.name,
"message": record.getMessage(),
"user": getattr(record, "user", None),
"tool": getattr(record, "tool", None),
"resource_type": getattr(record, "resource_type", None),
"resource_id": getattr(record, "resource_id", None),
"action": getattr(record, "action", None),
"result": getattr(record, "result", None),
"error": getattr(record, "error", None),
}
return json.dumps(log_data)
# Apply formatter
handler = logging.StreamHandler()
handler.setFormatter(StructuredFormatter())
logging.getLogger("superset.mcp_service").addHandler(handler)
```
**Structured Log Example**:
```json
{
"timestamp": "2025-01-01T10:30:45.123Z",
"level": "INFO",
"logger": "superset.mcp_service.auth",
"message": "MCP tool execution",
"user": "admin",
"tool": "generate_chart",
"resource_type": "chart",
"resource_id": 42,
"action": "create",
"result": "success",
"duration_ms": 234
}
```
### Audit Events
**Key Events to Log**:
| Event | Data to Capture | Severity |
|-------|----------------|----------|
| Authentication success | User, timestamp, IP | INFO |
| Authentication failure | Username attempted, reason | WARNING |
| Tool execution | User, tool, parameters, result | INFO |
| Permission denied | User, tool, resource, reason | WARNING |
| Chart created | User, chart_id, dataset_id | INFO |
| Dashboard created | User, dashboard_id, chart_ids | INFO |
| SQL executed | User, database, query (sanitized), rows | INFO |
| Error occurred | User, tool, error type, stack trace | ERROR |
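The events above map naturally onto the `logging` module's `extra` mechanism, which is how fields like `user` and `tool` end up on the `LogRecord` where a structured formatter can read them with `getattr()`. A minimal sketch:

```python
import logging

audit_logger = logging.getLogger("superset.mcp_service.audit")

def log_audit_event(
    user: str,
    tool: str,
    resource_type: str,
    resource_id: int,
    action: str,
    result: str,
) -> None:
    """Attach audit fields via `extra` so they land on the LogRecord."""
    audit_logger.info(
        "MCP tool execution",
        extra={
            "user": user,
            "tool": tool,
            "resource_type": resource_type,
            "resource_id": resource_id,
            "action": action,
            "result": result,
        },
    )
```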
### Integration with SIEM Systems
**Export to External Systems**:
**Option 1: Syslog**:
```python
import logging.handlers
syslog_handler = logging.handlers.SysLogHandler(
address=("syslog.company.com", 514)
)
logging.getLogger("superset.mcp_service").addHandler(syslog_handler)
```
**Option 2: Log Aggregation (ELK, Splunk)**:
```python
# Send JSON logs to stdout, collected by a log shipper (e.g. Filebeat, Fluentd)
import sys
import logging

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(StructuredFormatter())
logging.getLogger("superset.mcp_service").addHandler(handler)
```
**Option 3: Cloud Logging (CloudWatch, Stackdriver)**:
```python
# AWS CloudWatch example
import watchtower
handler = watchtower.CloudWatchLogHandler(
log_group="/superset/mcp",
stream_name="mcp-service"
)
logging.getLogger("superset.mcp_service").addHandler(handler)
```
### Log Retention
**Recommended Retention Policies**:
- **Authentication logs**: 90 days minimum
- **Tool execution logs**: 30 days minimum
- **Error logs**: 180 days minimum
- **Compliance logs**: Per regulatory requirements (e.g., 7 years for HIPAA)
## Compliance Considerations
### GDPR (General Data Protection Regulation)
**User Data Access Tracking**:
- Log all data access by user
- Provide audit trail for data subject access requests (DSAR)
- Implement data retention policies
- Support right to be forgotten (delete user data from logs)
**MCP Service Compliance**:
- All tool calls logged with user identification
- Can generate reports of user's data access
- Logs can be filtered/redacted for privacy
- No personal data stored in MCP service (only in Superset DB)
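A DSAR export over the structured JSON logs can be sketched as a simple filter (illustrative; field names follow the structured log example above):

```python
import json
from typing import Iterable

# Fields worth surfacing in a data subject access request report
DSAR_FIELDS = ("timestamp", "tool", "resource_type", "resource_id", "action")

def dsar_report(log_lines: Iterable[str], username: str) -> list[dict]:
    """Collect one user's access events from JSON-per-line audit logs."""
    events = []
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines (e.g. startup banners)
        if entry.get("user") == username:
            events.append({field: entry.get(field) for field in DSAR_FIELDS})
    return events
```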
### SOC 2 (Service Organization Control 2)
**Audit Trail Requirements**:
- Log all administrative actions
- Maintain immutable audit logs
- Implement log integrity verification
- Provide audit log export functionality
**MCP Service Compliance**:
- Structured logging provides audit trail
- Logs include who, what, when for all actions
- Export logs to secure, immutable storage (S3, etc.)
- Implement log signing for integrity verification
### HIPAA (Health Insurance Portability and Accountability Act)
**PHI Access Logging**:
- Log all access to protected health information
- Include user, timestamp, data accessed
- Maintain logs for 6 years minimum
- Implement access controls on audit logs
**MCP Service Compliance**:
- All dataset queries logged
- Row-level security enforces data access controls
- Can identify which users accessed which PHI records
- Logs exportable for compliance reporting
**Example HIPAA Audit Log Entry**:
```json
{
"timestamp": "2025-01-01T10:30:45.123Z",
"user": "doctor@hospital.com",
"action": "query_dataset",
"dataset_id": 123,
"dataset_name": "patient_records",
"rows_returned": 5,
"phi_accessed": true,
"purpose": "Treatment",
"ip_address": "10.0.1.25"
}
```
### Access Control Matrix
For compliance audits, maintain a matrix of who can access what:
| Role | Dashboards | Charts | Datasets | SQL Lab | Admin |
|------|-----------|--------|----------|---------|-------|
| Admin | All | All | All | All | Yes |
| Alpha | Owned + Shared | Owned + Shared | Permitted | Permitted DBs | No |
| Gamma | Shared | Shared | Permitted | No | No |
| Viewer | Shared | Shared | None | No | No |
## Security Checklist for Production
Before deploying MCP service to production:
**Authentication**:
- [ ] `MCP_AUTH_ENABLED = True`
- [ ] JWT issuer, audience, and keys configured
- [ ] `MCP_DEV_USERNAME` removed or set to `None`
- [ ] Token expiration enforced (short-lived tokens)
- [ ] Refresh token mechanism implemented (client-side)
**Authorization**:
- [ ] RBAC roles configured in Superset
- [ ] RLS rules tested for all datasets
- [ ] Dataset access permissions verified
- [ ] Minimum required permissions granted per role
- [ ] Service accounts use dedicated roles
**Network Security**:
- [ ] HTTPS enabled (`SESSION_COOKIE_SECURE = True`)
- [ ] TLS 1.2+ enforced
- [ ] Firewall rules restrict access to MCP service
- [ ] Network isolation between MCP and database
- [ ] Load balancer health checks configured
**Session Security**:
- [ ] `SESSION_COOKIE_HTTPONLY = True`
- [ ] `SESSION_COOKIE_SECURE = True`
- [ ] `SESSION_COOKIE_SAMESITE = "Strict"`
- [ ] Session timeout configured appropriately
- [ ] No sensitive data stored in sessions
**Audit Logging**:
- [ ] Structured logging enabled
- [ ] All tool executions logged
- [ ] Authentication events logged
- [ ] Logs exported to SIEM/aggregation system
- [ ] Log retention policy implemented
**Monitoring**:
- [ ] Failed authentication attempts alerted
- [ ] Permission denied events monitored
- [ ] Error rate alerts configured
- [ ] Unusual access patterns detected
- [ ] Service availability monitored
**Compliance**:
- [ ] Data access logs retained per regulations
- [ ] Audit trail exportable
- [ ] Privacy policy updated for MCP service
- [ ] User consent obtained (if required)
- [ ] Security incident response plan includes MCP
## Security Incident Response
### Suspected Token Compromise
**Immediate Actions**:
1. Revoke compromised token at auth provider
2. Review audit logs for unauthorized access
3. Identify affected resources
4. Notify affected users/stakeholders
5. Force token refresh for all users (if provider supports)
**Investigation**:
1. Check MCP service logs for unusual activity
2. Correlate access patterns with compromised token
3. Determine scope of data accessed
4. Document timeline of events
### Unauthorized Access Detected
**Response Procedure**:
1. Block user/IP immediately (firewall/load balancer)
2. Disable user account in Superset
3. Review all actions by user in audit logs
4. Assess data exposure
5. Notify security team and management
6. Preserve logs for forensic analysis
### Data Breach
**MCP-Specific Considerations**:
1. Identify which datasets were accessed via MCP
2. Determine if RLS was bypassed (should not be possible)
3. Check for SQL injection attempts (should be prevented by Superset)
4. Review all tool executions in timeframe
5. Export detailed audit logs for incident report
## References
- **JWT Best Practices**: https://tools.ietf.org/html/rfc8725
- **OWASP API Security**: https://owasp.org/www-project-api-security/
- **Superset Security Documentation**: https://superset.apache.org/docs/security
- **Flask-AppBuilder Security**: https://flask-appbuilder.readthedocs.io/en/latest/security.html
- **GDPR Compliance Guide**: https://gdpr.eu/
- **SOC 2 Framework**: https://www.aicpa.org/soc2
- **HIPAA Security Rule**: https://www.hhs.gov/hipaa/for-professionals/security/