mirror of
https://github.com/apache/superset.git
synced 2026-04-19 16:14:52 +00:00
docs(mcp): add comprehensive architecture, security, and production deployment documentation (#36017)
101
UPDATING.md
@@ -23,6 +23,107 @@ This file documents any backwards-incompatible changes in Superset and
|
|||||||
assists people when migrating to a new version.
|
||||||
|
|
||||||
## Next
|
||||||
|
|
||||||
|
### MCP Service
|
||||||
|
|
||||||
|
The MCP (Model Context Protocol) service enables AI assistants and automation tools to interact programmatically with Superset.
|
||||||
|
|
||||||
|
#### New Features
|
||||||
|
- MCP service infrastructure with FastMCP framework
|
||||||
|
- Tools for dashboards, charts, datasets, SQL Lab, and instance metadata
|
||||||
|
- Optional dependency: install with `pip install apache-superset[fastmcp]`
|
||||||
|
- Runs as a separate process from the Superset web server
|
||||||
|
- JWT-based authentication for production deployments
|
||||||
|
|
||||||
|
#### New Configuration Options
|
||||||
|
|
||||||
|
**Development** (single-user, local testing):
|
||||||
|
```python
|
||||||
|
# superset_config.py
MCP_DEV_USERNAME = "admin"  # User for MCP authentication
MCP_SERVICE_HOST = "localhost"
MCP_SERVICE_PORT = 5008
|
||||||
|
```
|
||||||
|
|
||||||
|
**Production** (JWT-based, multi-user):
|
||||||
|
```python
|
||||||
|
# superset_config.py
MCP_AUTH_ENABLED = True
MCP_JWT_ISSUER = "https://your-auth-provider.com"
MCP_JWT_AUDIENCE = "superset-mcp"
MCP_JWT_ALGORITHM = "RS256"  # or "HS256" for shared secrets

# The three options below are alternatives: configure one of them.

# Option 1: Use JWKS endpoint (recommended for RS256)
MCP_JWKS_URI = "https://auth.example.com/.well-known/jwks.json"

# Option 2: Use static public key (RS256)
MCP_JWT_PUBLIC_KEY = "-----BEGIN PUBLIC KEY-----..."

# Option 3: Use shared secret (HS256)
MCP_JWT_ALGORITHM = "HS256"
MCP_JWT_SECRET = "your-shared-secret-key"

# Optional overrides
MCP_SERVICE_HOST = "0.0.0.0"
MCP_SERVICE_PORT = 5008
MCP_SESSION_CONFIG = {
    "SESSION_COOKIE_SECURE": True,
    "SESSION_COOKIE_HTTPONLY": True,
    "SESSION_COOKIE_SAMESITE": "Strict",
}
|
||||||
|
```
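
Because the three key options above are alternatives, a startup check can catch a configuration that sets none or several of them. The sketch below is hypothetical: `pick_jwt_key_source` is not part of Superset, and it assumes the settings are available as a plain mapping that uses the option names shown above.

```python
from typing import Mapping, Optional

# Setting names come from the config example above; the helper itself is hypothetical.
KEY_SOURCES = ("MCP_JWKS_URI", "MCP_JWT_PUBLIC_KEY", "MCP_JWT_SECRET")


def pick_jwt_key_source(config: Mapping[str, Optional[str]]) -> str:
    """Return the single configured key source, or raise ValueError."""
    configured = [name for name in KEY_SOURCES if config.get(name)]
    if len(configured) != 1:
        raise ValueError(
            f"Expected exactly one of {KEY_SOURCES}, found {configured or 'none'}"
        )
    source = configured[0]
    algorithm = config.get("MCP_JWT_ALGORITHM")
    # A shared secret pairs with HS256; a JWKS URI or public key pairs with RS256.
    expected = "HS256" if source == "MCP_JWT_SECRET" else "RS256"
    if algorithm != expected:
        raise ValueError(f"{source} expects {expected}, got {algorithm}")
    return source
```

Failing fast at startup like this tends to be easier to debug than a token rejection at request time.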
|
||||||
|
|
||||||
|
#### Running the MCP Service
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Development
superset mcp run --port 5008 --debug

# Production
superset mcp run --port 5008

# With factory config
superset mcp run --port 5008 --use-factory-config
|
||||||
|
```
|
||||||
|
|
||||||
|
#### Deployment Considerations
|
||||||
|
|
||||||
|
The MCP service runs as a **separate process** from the Superset web server.
|
||||||
|
|
||||||
|
**Important**:
|
||||||
|
- Requires the same Python environment and configuration as Superset
- Connects to the same metadata database as the main Superset app (each process maintains its own connection pool)
- Can be scaled independently of the web server
- Requires the `fastmcp` package (optional dependency)
|
||||||
|
|
||||||
|
**Installation**:
|
||||||
|
```bash
|
||||||
|
# Install with MCP support
|
||||||
|
pip install "apache-superset[fastmcp]"  # quotes keep shells such as zsh from globbing the brackets
|
||||||
|
|
||||||
|
# Or add to requirements.txt
|
||||||
|
apache-superset[fastmcp]>=X.Y.Z
|
||||||
|
```
|
||||||
|
|
||||||
|
**Process Management**:
|
||||||
|
Use systemd, supervisord, or Kubernetes to manage the MCP service process.
|
||||||
|
See `superset/mcp_service/PRODUCTION.md` for deployment guides.
|
||||||
|
|
||||||
|
**Security**:
|
||||||
|
- Development: Uses `MCP_DEV_USERNAME` for single-user access
|
||||||
|
- Production: **MUST** configure JWT authentication
|
||||||
|
- See `superset/mcp_service/SECURITY.md` for details
|
||||||
|
|
||||||
|
#### Documentation
|
||||||
|
|
||||||
|
- Architecture: `superset/mcp_service/ARCHITECTURE.md`
|
||||||
|
- Security: `superset/mcp_service/SECURITY.md`
|
||||||
|
- Production: `superset/mcp_service/PRODUCTION.md`
|
||||||
|
- Developer Guide: `superset/mcp_service/CLAUDE.md`
|
||||||
|
- Quick Start: `superset/mcp_service/README.md`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
- [33055](https://github.com/apache/superset/pull/33055): Upgrades Flask-AppBuilder to 5.0.0. The AUTH_OID authentication type has been deprecated and is no longer available as an option in Flask-AppBuilder. OpenID (OID) is considered a deprecated authentication protocol - if you are using AUTH_OID, you will need to migrate to an alternative authentication method such as OAuth, LDAP, or database authentication before upgrading.
|
||||||
- [35062](https://github.com/apache/superset/pull/35062): Changed the function signature of `setupExtensions` to `setupCodeOverrides` with options as arguments.
|
||||||
- [34871](https://github.com/apache/superset/pull/34871): Fixed Jest test hanging issue from Ant Design v5 upgrade. MessageChannel is now mocked in test environment to prevent rc-overflow from causing Jest to hang. Test environment only - no production impact.
|
||||||
|
|||||||
693
superset/mcp_service/ARCHITECTURE.md
Normal file
@@ -0,0 +1,693 @@
|
|||||||
|
<!--
|
||||||
|
Licensed to the Apache Software Foundation (ASF) under one
|
||||||
|
or more contributor license agreements. See the NOTICE file
|
||||||
|
distributed with this work for additional information
|
||||||
|
regarding copyright ownership. The ASF licenses this file
|
||||||
|
to you under the Apache License, Version 2.0 (the
|
||||||
|
"License"); you may not use this file except in compliance
|
||||||
|
with the License. You may obtain a copy of the License at
|
||||||
|
|
||||||
|
http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
|
||||||
|
Unless required by applicable law or agreed to in writing,
|
||||||
|
software distributed under the License is distributed on an
|
||||||
|
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||||
|
KIND, either express or implied. See the License for the
|
||||||
|
specific language governing permissions and limitations
|
||||||
|
under the License.
|
||||||
|
-->
|
||||||
|
|
||||||
|
# MCP Service Architecture
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
The Apache Superset MCP (Model Context Protocol) service provides programmatic access to Superset functionality through a standardized protocol that enables AI assistants and automation tools to interact with dashboards, charts, datasets, and SQL Lab.
|
||||||
|
|
||||||
|
The MCP service runs as a **separate process** from the Superset web server, using its own Flask application instance and HTTP server while sharing the same database and configuration with the main Superset application.
|
||||||
|
|
||||||
|
## Flask Singleton Pattern
|
||||||
|
|
||||||
|
### Why Module-Level Singleton?
|
||||||
|
|
||||||
|
The MCP service uses a module-level singleton Flask application instance rather than creating a new app instance per request. This design decision is based on several important considerations:
|
||||||
|
|
||||||
|
**Separate Process Architecture**:
|
||||||
|
- The MCP service runs as an independent process from the Superset web server
|
||||||
|
- It has its own HTTP server (via FastMCP/Starlette) handling MCP protocol requests
|
||||||
|
- Each MCP tool invocation occurs within the context of this single, long-lived Flask app
|
||||||
|
|
||||||
|
**Benefits of Module-Level Singleton**:
|
||||||
|
|
||||||
|
1. **Consistent Database Connection Pool**
|
||||||
|
- A single SQLAlchemy connection pool is maintained across all tool calls
|
||||||
|
- Connections are efficiently reused rather than recreated
|
||||||
|
- Connection pool configuration (size, timeout, etc.) behaves predictably
|
||||||
|
|
||||||
|
2. **Shared Configuration Access**
|
||||||
|
- Flask app configuration is loaded once at startup
|
||||||
|
- All tools access the same configuration state
|
||||||
|
- Changes to runtime config affect all subsequent tool calls consistently
|
||||||
|
|
||||||
|
3. **Thread-Safe Initialization**
|
||||||
|
- The Flask app is created exactly once: module import is serialized by Python's import lock, and initialization is guarded with `threading.Lock()`
|
||||||
|
- Multiple concurrent requests safely share the same app instance
|
||||||
|
- No risk of duplicate initialization or race conditions
|
||||||
|
|
||||||
|
4. **Lower Resource Overhead**
|
||||||
|
- No per-request app creation/teardown overhead
|
||||||
|
- Memory footprint remains constant regardless of request volume
|
||||||
|
- Extension initialization (Flask-AppBuilder, Flask-Migrate, etc.) happens once
|
||||||
|
|
||||||
|
**When Module-Level Singleton Is Appropriate**:
|
||||||
|
- Service runs as dedicated daemon/process
|
||||||
|
- Application state is consistent across all requests
|
||||||
|
- No per-request application context needed
|
||||||
|
- Long-lived server process with many requests
|
||||||
|
|
||||||
|
**When Module-Level Singleton Is NOT Appropriate**:
|
||||||
|
- Testing with different configurations (use app fixtures instead)
|
||||||
|
- Multi-tenant deployments requiring different app configs per tenant
|
||||||
|
- Dynamic plugin loading requiring app recreation
|
||||||
|
- Development scenarios requiring hot-reload of app configuration
|
||||||
|
|
||||||
|
### Implementation Details
|
||||||
|
|
||||||
|
The singleton is implemented in `flask_singleton.py`:
|
||||||
|
|
||||||
|
```python
|
||||||
|
# Module-level instance - created once on import
from flask import Flask

from superset.app import create_app
from superset.mcp_service.mcp_config import get_mcp_config

_temp_app = create_app()

# Load the MCP settings inside an application context
with _temp_app.app_context():
    mcp_config = get_mcp_config(_temp_app.config)
    _temp_app.config.update(mcp_config)

app = _temp_app


def get_flask_app() -> Flask:
    """Get the Flask app instance."""
    return app
|
||||||
|
```
|
||||||
|
|
||||||
|
**Key characteristics**:
|
||||||
|
- No complex patterns or metaclasses needed
|
||||||
|
- The module itself acts as the singleton container
|
||||||
|
- Clean, Pythonic approach: Python caches imported modules in `sys.modules`, so module-level state is created once per process
|
||||||
|
- Application context pushed during initialization to avoid "Working outside of application context" errors
|
||||||
|
|
||||||
|
## Multitenant Architecture
|
||||||
|
|
||||||
|
### Current Implementation
|
||||||
|
|
||||||
|
The MCP service uses a **shared process with tenant isolation** model: all tenants connect to a single MCP process, and isolation is enforced at the application level:
|
||||||
|
|
||||||
|
```mermaid
|
||||||
|
graph LR
|
||||||
|
T1[Tenant 1]
|
||||||
|
T2[Tenant 2]
|
||||||
|
T3[Tenant 3]
|
||||||
|
MCP[Single MCP Process]
|
||||||
|
DB[(Superset Database)]
|
||||||
|
|
||||||
|
T1 --> MCP
|
||||||
|
T2 --> MCP
|
||||||
|
T3 --> MCP
|
||||||
|
MCP --> DB
|
||||||
|
|
||||||
|
MCP -.->|Isolation via| ISO[User authentication JWT or dev user<br/>Flask-AppBuilder RBAC<br/>Dataset access filters<br/>Row-level security]
|
||||||
|
|
||||||
|
style ISO fill:#f9f,stroke:#333,stroke-width:2px
|
||||||
|
```
|
||||||
|
|
||||||
|
### Tenant Isolation Mechanisms
|
||||||
|
|
||||||
|
#### Database Level
|
||||||
|
|
||||||
|
**Superset's Existing RLS (Row-Level Security)**:
|
||||||
|
- RLS rules are defined at the dataset level
|
||||||
|
- Rules filter queries based on user attributes (e.g., `department = '{{ current_user.department }}'`)
|
||||||
|
- The MCP service respects all RLS rules automatically through Superset's query execution layer
|
||||||
|
|
||||||
|
**No Schema-Based Isolation**:
|
||||||
|
- The current implementation does NOT use separate database schemas per tenant
|
||||||
|
- All Superset metadata (dashboards, charts, datasets) exists in the same database schema
|
||||||
|
- Database-level isolation is achieved through Superset's permission system rather than physical schema separation
|
||||||
|
|
||||||
|
#### Application Level
|
||||||
|
|
||||||
|
**Flask-AppBuilder Security Manager**:
|
||||||
|
- Every MCP tool is wrapped with the `@mcp_auth_hook` decorator
|
||||||
|
- The auth hook sets `g.user` to the authenticated user (from JWT or `MCP_DEV_USERNAME`)
|
||||||
|
- Superset's security manager then enforces permissions based on this user's roles
|
||||||
|
|
||||||
|
**User-Based Access Control**:
|
||||||
|
- Users can only access resources they have permissions for
|
||||||
|
- Dashboard ownership and role-based permissions are enforced
|
||||||
|
- The `can_access_datasource()` method validates dataset access
|
||||||
|
|
||||||
|
**Dataset Access Filters**:
|
||||||
|
- All list operations (dashboards, charts, datasets) use Superset's access filters:
|
||||||
|
- `DashboardAccessFilter` - filters dashboards based on user permissions
|
||||||
|
- `ChartAccessFilter` - filters charts based on user permissions
|
||||||
|
- `DatasourceFilter` - filters datasets based on user permissions
|
||||||
|
|
||||||
|
**Row-Level Security Enforcement**:
|
||||||
|
- RLS rules are applied transparently during query execution
|
||||||
|
- The MCP service makes no modifications to bypass RLS
|
||||||
|
- SQL queries executed through `execute_sql` tool respect RLS policies
|
||||||
|
|
||||||
|
#### JWT Tenant Claims
|
||||||
|
|
||||||
|
**Development Mode** (single user):
|
||||||
|
```python
|
||||||
|
# superset_config.py
|
||||||
|
MCP_DEV_USERNAME = "admin"
|
||||||
|
```
|
||||||
|
|
||||||
|
**Production Mode** (JWT-based):
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"sub": "user@company.com",
|
||||||
|
"email": "user@company.com",
|
||||||
|
"scopes": ["superset:read", "superset:chart:create"],
|
||||||
|
"exp": 1672531200
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Future Enhancement** (multi-tenant JWT):
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"sub": "user@tenant-a.com",
|
||||||
|
"tenant_id": "tenant-a",
|
||||||
|
"scopes": ["superset:read"],
|
||||||
|
"exp": 1672531200
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The `tenant_id` claim could be used in future versions to:
|
||||||
|
- Further isolate data by tenant context
|
||||||
|
- Apply tenant-specific RLS rules
|
||||||
|
- Log and audit actions by tenant
|
||||||
|
- Implement tenant-specific rate limits
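
To make the claim layout concrete, here is a minimal sketch of checking an already-decoded payload. It assumes signature verification happened elsewhere (for example in a JWT library); `check_claims` and its handling of `tenant_id` are hypothetical, and the claim names mirror the JSON examples above.

```python
import time
from typing import Any, Mapping, Optional


def check_claims(
    claims: Mapping[str, Any],
    required_scope: str,
    now: Optional[float] = None,
) -> Optional[str]:
    """Validate expiry and scope; return the tenant_id claim if present."""
    current = time.time() if now is None else now
    # A token is only valid strictly before its exp timestamp.
    if claims.get("exp", 0) <= current:
        raise PermissionError("token expired")
    if required_scope not in claims.get("scopes", []):
        raise PermissionError(f"missing scope: {required_scope}")
    # tenant_id is the proposed future claim; current tokens may omit it.
    return claims.get("tenant_id")
```

Passing `now` explicitly keeps the check deterministic in tests; production callers would omit it.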
|
||||||
|
|
||||||
|
## Process Model
|
||||||
|
|
||||||
|
### Single Process Deployment
|
||||||
|
|
||||||
|
**When to Use**:
|
||||||
|
- Development and testing environments
|
||||||
|
- Small deployments with low request volume (< 100 requests/minute)
|
||||||
|
- Single-tenant installations
|
||||||
|
- Resource-constrained environments
|
||||||
|
|
||||||
|
**Resource Characteristics**:
|
||||||
|
- Memory: ~500MB-1GB (includes Flask app, SQLAlchemy, screenshot pool)
|
||||||
|
- CPU: Mostly I/O bound (database queries, screenshot generation)
|
||||||
|
- Database connections: Configurable via `SQLALCHEMY_POOL_SIZE` (default: 5)
|
||||||
|
|
||||||
|
**Scaling Limitations**:
|
||||||
|
- Single Python process = GIL limitations for CPU-bound operations
|
||||||
|
- Screenshot generation can block other requests
|
||||||
|
- Limited horizontal scalability without load balancer
|
||||||
|
|
||||||
|
**Example Command**:
|
||||||
|
```bash
|
||||||
|
superset mcp run --port 5008
|
||||||
|
```
|
||||||
|
|
||||||
|
### Multi-Process Deployment
|
||||||
|
|
||||||
|
**Using Gunicorn Workers**:
|
||||||
|
```bash
|
||||||
|
gunicorn \
  --workers 4 \
  --bind 0.0.0.0:5008 \
  --worker-class uvicorn.workers.UvicornWorker \
  superset.mcp_service.server:app
|
||||||
|
```
|
||||||
|
|
||||||
|
**Configuration Considerations**:
|
||||||
|
- Worker count: `2-4 x CPU cores` (typical recommendation)
|
||||||
|
- Each worker has its own Flask app instance via module-level singleton
|
||||||
|
- Workers share nothing - fully isolated processes
|
||||||
|
- Database connection pool per worker (watch total connections)
|
||||||
|
|
||||||
|
**Process Pool Management**:
|
||||||
|
- Use process manager (systemd, supervisord) for auto-restart
|
||||||
|
- Health checks to detect and restart failed workers
|
||||||
|
- Graceful shutdown to complete in-flight requests
|
||||||
|
|
||||||
|
**Load Balancing**:
|
||||||
|
- Use nginx/HAProxy to distribute requests across workers
|
||||||
|
- Round-robin or least-connections algorithms work well
|
||||||
|
- Sticky sessions NOT required (stateless API)
|
||||||
|
|
||||||
|
### Containerized Deployment
|
||||||
|
|
||||||
|
**Docker**:
|
||||||
|
```dockerfile
|
||||||
|
FROM apache/superset:latest
|
||||||
|
CMD ["superset", "mcp", "run", "--port", "5008"]
|
||||||
|
```
|
||||||
|
|
||||||
|
**Kubernetes Deployment**:
|
||||||
|
```yaml
|
||||||
|
apiVersion: apps/v1
kind: Deployment
metadata:
  name: superset-mcp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: superset-mcp
  template:
    metadata:
      labels:
        app: superset-mcp
    spec:
      containers:
        - name: mcp
          image: apache/superset:latest
          command: ["superset", "mcp", "run", "--port", "5008"]
          ports:
            - containerPort: 5008
          env:
            - name: SUPERSET_CONFIG_PATH
              value: /app/pythonpath/superset_config.py
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
|
||||||
|
```
|
||||||
|
|
||||||
|
**Horizontal Pod Autoscaling**:
|
||||||
|
```yaml
|
||||||
|
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: superset-mcp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: superset-mcp
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
|
||||||
|
```
|
||||||
|
|
||||||
|
**Service Mesh Integration**:
|
||||||
|
- Istio/Linkerd can provide:
|
||||||
|
- Automatic retries and circuit breaking
|
||||||
|
- Distributed tracing
|
||||||
|
- Mutual TLS between pods
|
||||||
|
- Advanced traffic routing
|
||||||
|
|
||||||
|
## Database Connection Management
|
||||||
|
|
||||||
|
### Connection Pooling
|
||||||
|
|
||||||
|
The MCP service uses SQLAlchemy's connection pooling with configuration inherited from Superset:
|
||||||
|
|
||||||
|
```python
|
||||||
|
# superset_config.py
|
||||||
|
SQLALCHEMY_POOL_SIZE = 5 # Max connections per worker
|
||||||
|
SQLALCHEMY_POOL_TIMEOUT = 30 # Seconds to wait for connection
|
||||||
|
SQLALCHEMY_MAX_OVERFLOW = 10 # Extra connections beyond pool_size
|
||||||
|
SQLALCHEMY_POOL_RECYCLE = 3600 # Recycle connections after 1 hour
|
||||||
|
```
|
||||||
|
|
||||||
|
**Connection Lifecycle**:
|
||||||
|
1. Request arrives at MCP tool
|
||||||
|
2. Tool calls DAO method which accesses `db.session`
|
||||||
|
3. SQLAlchemy checks out connection from pool
|
||||||
|
4. Query executes on borrowed connection
|
||||||
|
5. Connection returns to pool (not closed)
|
||||||
|
6. Connection reused for next request
|
||||||
|
|
||||||
|
**Pool Size Recommendations**:
|
||||||
|
- **Single process**: 5-10 connections
|
||||||
|
- **Multi-worker (4 workers)**: 3-5 connections per worker = 12-20 total
|
||||||
|
- **Monitor**: Database max_connections setting must exceed total pool size across all MCP workers
|
||||||
|
|
||||||
|
**Example with 4 Gunicorn workers**:
|
||||||
|
```python
|
||||||
|
SQLALCHEMY_POOL_SIZE = 5
|
||||||
|
SQLALCHEMY_MAX_OVERFLOW = 5
|
||||||
|
# Total potential connections: 4 workers × (5 + 5) = 40 connections
|
||||||
|
# Ensure database supports 40+ connections
|
||||||
|
```
|
||||||
|
|
||||||
|
### Transaction Handling
|
||||||
|
|
||||||
|
**MCP Tool Transaction Pattern**:
|
||||||
|
```python
|
||||||
|
@mcp.tool
@mcp_auth_hook
def my_tool(param: str) -> Result:
    # Auth hook sets g.user and manages session
    try:
        # Tool executes within implicit transaction
        result = DashboardDAO.find_by_id(123)
        return Result(data=result)
    except Exception:
        # On error: rollback happens in auth hook's except block
        raise
    finally:
        # On success: rollback happens in auth hook's finally block
        # (read-only operations don't commit)
        pass
|
||||||
|
```
|
||||||
|
|
||||||
|
**Session Cleanup in Auth Hook**:
|
||||||
|
|
||||||
|
The `@mcp_auth_hook` decorator manages session lifecycle:
|
||||||
|
|
||||||
|
```python
|
||||||
|
# Excerpt from the decorator body (the enclosing try block is omitted)

# On error path
except Exception:
    try:
        db.session.rollback()
        db.session.remove()
    except Exception as e:
        logger.warning("Error cleaning up session: %s", e)
    raise

# Runs on both the success and error paths (finally block)
finally:
    try:
        if db.session.is_active:
            db.session.rollback()  # Cleanup, don't commit
    except Exception as e:
        logger.warning("Error in finally block: %s", e)
|
||||||
|
```
|
||||||
|
|
||||||
|
**Why Rollback on Success?**
|
||||||
|
- MCP tools are primarily **read-only operations**
|
||||||
|
- No explicit commits needed for queries
|
||||||
|
- Rollback ensures clean slate for next request
|
||||||
|
- Write operations (create chart, etc.) use Superset's command pattern which handles commits internally
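
The cleanup behavior described above can be sketched with a stand-in session object. `FakeSession` and `with_session_cleanup` are hypothetical names used so that the example runs without a database; the real `@mcp_auth_hook` also handles authentication, which is omitted here.

```python
import functools
from typing import Any, Callable


class FakeSession:
    """Hypothetical stand-in that records calls instead of touching a database."""

    def __init__(self) -> None:
        self.calls = []
        self.is_active = True

    def rollback(self) -> None:
        self.calls.append("rollback")

    def remove(self) -> None:
        self.calls.append("remove")


session = FakeSession()


def with_session_cleanup(func: Callable[..., Any]) -> Callable[..., Any]:
    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        try:
            return func(*args, **kwargs)
        except Exception:
            # Error path: discard pending state and drop the session.
            session.rollback()
            session.remove()
            raise
        finally:
            # Runs on both paths: roll back so the next call starts clean.
            if session.is_active:
                session.rollback()

    return wrapper


@with_session_cleanup
def read_only_tool() -> str:
    return "ok"
```

Calling `read_only_tool()` records a single rollback and no commit, which matches the read-only assumption stated in the list above.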
|
||||||
|
|
||||||
|
## Deployment Considerations
|
||||||
|
|
||||||
|
### Resource Requirements
|
||||||
|
|
||||||
|
**Memory Per Process**:
|
||||||
|
- Base Flask app: ~200MB
|
||||||
|
- SQLAlchemy + models: ~100MB
|
||||||
|
- WebDriver pool (if screenshots enabled): ~200MB
|
||||||
|
- Request processing overhead: ~50MB per concurrent request
|
||||||
|
- **Total**: 500MB-1GB per process
|
||||||
|
|
||||||
|
**CPU Usage Patterns**:
|
||||||
|
- I/O bound: Most time spent waiting on database/screenshots
|
||||||
|
- Low CPU during normal operations (< 20% per core)
|
||||||
|
- CPU spikes during:
|
||||||
|
- Screenshot generation (WebDriver rendering)
|
||||||
|
- Large dataset query processing
|
||||||
|
- Complex chart configuration validation
|
||||||
|
|
||||||
|
**Database Connections**:
|
||||||
|
- **Single process**: at most `pool_size + max_overflow` connections (10 with the 5 + 5 settings used in the Gunicorn example)
|
||||||
|
- **Multi-process**: `(pool_size + max_overflow) × worker_count`
|
||||||
|
- **Example**: 4 workers × 10 max connections = 40 total database connections
|
||||||
|
|
||||||
|
### Scaling Strategy
|
||||||
|
|
||||||
|
**When to Scale Horizontally**:
|
||||||
|
- Request latency increases beyond acceptable threshold (e.g., p95 > 2 seconds)
|
||||||
|
- CPU utilization consistently > 70%
|
||||||
|
- Request queue depth growing
|
||||||
|
- Database connection pool frequently exhausted
|
||||||
|
|
||||||
|
**Load Balancing Between MCP Instances**:
|
||||||
|
|
||||||
|
**Option 1: Nginx Round-Robin**:
|
||||||
|
```nginx
|
||||||
|
upstream mcp_backend {
    server mcp-1:5008;
    server mcp-2:5008;
    server mcp-3:5008;
}

server {
    location / {
        proxy_pass http://mcp_backend;
    }
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Option 2: Kubernetes Service**:
|
||||||
|
```yaml
|
||||||
|
apiVersion: v1
kind: Service
metadata:
  name: superset-mcp
spec:
  selector:
    app: superset-mcp
  ports:
    - port: 5008
      targetPort: 5008
  type: ClusterIP
|
||||||
|
```
|
||||||
|
|
||||||
|
**Session Affinity**:
|
||||||
|
- NOT required - MCP service is stateless
|
||||||
|
- Each request is independent
|
||||||
|
- No session state maintained between requests
|
||||||
|
- Load balancer can freely distribute requests
|
||||||
|
|
||||||
|
### High Availability
|
||||||
|
|
||||||
|
**Multiple MCP Instances**:
|
||||||
|
- Deploy at least 2 instances for redundancy
|
||||||
|
- Use load balancer health checks to detect failures
|
||||||
|
- Failed instances automatically removed from rotation
|
||||||
|
|
||||||
|
**Health Checks**:
|
||||||
|
|
||||||
|
The MCP service provides a health check tool:
|
||||||
|
|
||||||
|
```python
|
||||||
|
# Internal health check
from datetime import datetime, timezone


@mcp.tool
def health_check() -> HealthCheckResponse:
    return HealthCheckResponse(
        status="healthy",
        timestamp=datetime.now(timezone.utc),
        database_connection="ok",
    )
|
||||||
|
```
|
||||||
|
|
||||||
|
**Load balancer health check**:
|
||||||
|
```nginx
|
||||||
|
# Nginx example
upstream mcp_backend {
    server mcp-1:5008 max_fails=3 fail_timeout=30s;
    server mcp-2:5008 max_fails=3 fail_timeout=30s;
}
|
||||||
|
```
|
||||||
|
|
||||||
|
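**Kubernetes health check** (assumes the deployment exposes an HTTP `/health` route; the `health_check` MCP tool is invoked over the MCP protocol rather than a plain HTTP GET):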
|
||||||
|
```yaml
|
||||||
|
livenessProbe:
  httpGet:
    path: /health
    port: 5008
  initialDelaySeconds: 30
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /health
    port: 5008
  initialDelaySeconds: 10
  periodSeconds: 5
|
||||||
|
```
|
||||||
|
|
||||||
|
**Failover Handling**:
|
||||||
|
- Load balancer automatically routes around failed instances
|
||||||
|
- MCP clients should implement retry logic for transient failures
|
||||||
|
- Use circuit breaker pattern for repeated failures
|
||||||
|
- Monitor and alert on instance failures
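
Client-side retry for transient failures could look like the sketch below. It is a generic illustration rather than part of any MCP client library: `call_with_retry` is a hypothetical helper, and treating `ConnectionError` as the transient failure type is an assumption.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry on ConnectionError with exponential backoff; re-raise on the last attempt."""
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * (2 ** attempt))
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter is injectable so that tests can record delays instead of waiting. A circuit breaker would sit one level above this helper and stop calling it after repeated failures.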
|
||||||
|
|
||||||
|
### Database Considerations
|
||||||
|
|
||||||
|
**Shared Database with Superset**:
|
||||||
|
- MCP service and Superset web server share the same database
|
||||||
|
- Same SQLAlchemy models and schema
|
||||||
|
- Database migrations applied once, affect both services
|
||||||
|
|
||||||
|
**Connection Pool Sizing**:
|
||||||
|
```
|
||||||
|
Total DB Connections =
  Superset Web (workers × (pool_size + max_overflow)) +
  MCP Service (workers × (pool_size + max_overflow)) +
  Other services

Must be < Database max_connections
|
||||||
|
```
|
||||||
|
|
||||||
|
**Example Calculation**:
|
||||||
|
- Superset web: 8 workers × 10 connections = 80
|
||||||
|
- MCP service: 4 workers × 10 connections = 40
|
||||||
|
- Other: 20 reserved
|
||||||
|
- **Total**: 140 connections
|
||||||
|
- **Database**: Set max_connections >= 150
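
The same arithmetic can be scripted so that the budget is re-checked whenever worker counts change. This is a minimal sketch: `Service` and `total_connections` are hypothetical names, and the split of 10 connections per worker into `pool_size = 5` and `max_overflow = 5` is an assumption carried over from the Gunicorn example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Service:
    name: str
    workers: int
    pool_size: int
    max_overflow: int

    @property
    def max_connections(self) -> int:
        # Each worker process owns its own pool, so the per-worker peaks add up.
        return self.workers * (self.pool_size + self.max_overflow)


def total_connections(services: Iterable[Service], reserved: int = 0) -> int:
    """Worst-case connection count across all services plus a reserve."""
    return sum(s.max_connections for s in services) + reserved


services = [
    Service("superset-web", workers=8, pool_size=5, max_overflow=5),
    Service("mcp", workers=4, pool_size=5, max_overflow=5),
]
print(total_connections(services, reserved=20))  # 8*10 + 4*10 + 20 = 140
```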
|
||||||
|
|
||||||
|
### Monitoring Recommendations
|
||||||
|
|
||||||
|
**Key Metrics to Track**:
|
||||||
|
- Request rate per tool
|
||||||
|
- Request latency (p50, p95, p99)
|
||||||
|
- Error rate by tool and error type
|
||||||
|
- Database connection pool utilization
|
||||||
|
- Memory usage per process
|
||||||
|
- Active concurrent requests
|
||||||
|
|
||||||
|
**Example Prometheus Metrics** (future implementation):
|
||||||
|
```text
|
||||||
|
mcp_requests_total{tool="list_charts", status="success"}
|
||||||
|
mcp_request_duration_seconds{tool="list_charts", quantile="0.95"}
|
||||||
|
mcp_database_connections_active
|
||||||
|
mcp_database_connections_idle
|
||||||
|
mcp_memory_usage_bytes
|
||||||
|
```
|
||||||
|
|
||||||
|
**Log Aggregation**:
|
||||||
|
- Centralize logs from all MCP instances
|
||||||
|
- Use structured logging (JSON format)
|
||||||
|
- Include trace IDs for request correlation
|
||||||
|
- Alert on error rate spikes
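
Structured logging with a trace ID can be sketched using only the standard library. The field names (`trace_id`, `tool`) and the logger name are hypothetical choices for illustration, not names the MCP service is known to use.

```python
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # trace_id and tool arrive through the `extra` argument of the log call.
            "trace_id": getattr(record, "trace_id", None),
            "tool": getattr(record, "tool", None),
        }
        return json.dumps(payload)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("mcp.example")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("tool finished", extra={"trace_id": uuid.uuid4().hex, "tool": "list_charts"})
```

Emitting one JSON object per line lets a log aggregator index `trace_id` and correlate entries from different MCP instances.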
|
||||||
|
|
||||||
|
## Architecture Diagrams
|
||||||
|
|
||||||
|
### Request Flow
|
||||||
|
|
||||||
|
```mermaid
|
||||||
|
sequenceDiagram
|
||||||
|
participant Client as MCP Client<br/>(Claude/automation)
|
||||||
|
participant FastMCP as FastMCP Server<br/>(Starlette/Uvicorn)
|
||||||
|
participant Auth as MCP Auth Hook
|
||||||
|
participant Tool as Tool Implementation<br/>(e.g., list_charts)
|
||||||
|
participant DAO as Superset DAO Layer<br/>(ChartDAO, DashboardDAO)
|
||||||
|
participant DB as Database<br/>(PostgreSQL/MySQL)
|
||||||
|
|
||||||
|
Client->>FastMCP: MCP Protocol (HTTP/SSE)
|
||||||
|
FastMCP->>Auth: @mcp.tool decorator
|
||||||
|
Auth->>Auth: Sets g.user, manages session
|
||||||
|
Auth->>Tool: Execute tool
|
||||||
|
Tool->>DAO: Uses DAO pattern
|
||||||
|
DAO->>DB: SQLAlchemy ORM
|
||||||
|
DB-->>DAO: Query results
|
||||||
|
DAO-->>Tool: Processed data
|
||||||
|
Tool-->>Auth: Tool response
|
||||||
|
Auth-->>FastMCP: Response with cleanup
|
||||||
|
FastMCP-->>Client: MCP response
|
||||||
|
```
|
||||||
|
|
||||||
|
### Multi-Instance Deployment
|
||||||
|
|
||||||
|
```mermaid
|
||||||
|
graph TD
    LB[Load Balancer<br/>Nginx/K8s Service]
    MCP1[MCP Instance 1<br/>port 5008]
    MCP2[MCP Instance 2<br/>port 5008]
    MCP3[MCP Instance 3<br/>port 5008]
    DB[(Superset Database<br/>one connection pool per instance)]

    LB --> MCP1
    LB --> MCP2
    LB --> MCP3
    MCP1 --> DB
    MCP2 --> DB
    MCP3 --> DB
|
||||||
|
```
|
||||||
|
|
||||||
|
### Tenant Isolation
|
||||||
|
|
||||||
|
```mermaid
|
||||||
|
graph TD
|
||||||
|
UserA[User A<br/>JWT: tenant=acme]
|
||||||
|
UserB[User B<br/>JWT: tenant=beta]
|
||||||
|
MCP[MCP Service<br/>single process]
|
||||||
|
Auth[@mcp_auth_hook<br/>Sets g.user from JWT]
|
||||||
|
RBAC[Flask-AppBuilder<br/>RBAC]
|
||||||
|
Filters[Dataset Access<br/>Filters]
|
||||||
|
DB[(Superset Database<br/>single schema, filtered by permissions)]
|
||||||
|
|
||||||
|
UserA --> MCP
|
||||||
|
UserB --> MCP
|
||||||
|
MCP --> Auth
|
||||||
|
Auth --> RBAC
|
||||||
|
Auth --> Filters
|
||||||
|
RBAC --> |User A sees only<br/>acme dashboards| DB
|
||||||
|
Filters --> |User A queries filtered<br/>by RLS rules for acme| DB
|
||||||
|
```
|
||||||
|
|
||||||
|
## Comparison with Alternative Architectures
|
||||||
|
|
||||||
|
### Module-Level Singleton (Current) vs Per-Request App
|
||||||
|
|
||||||
|
| Aspect | Module-Level Singleton | Per-Request App |
|--------|----------------------|-----------------|
| Connection Pool | Single shared pool | New pool per request |
| Memory Overhead | Constant (~500MB) | 500MB × concurrent requests |
| Thread Safety | Must ensure thread-safe access | Each request isolated |
| Configuration | Loaded once at startup | Can vary per request |
| Performance | Fast (no setup overhead) | Slow (initialization cost) |
| Use Case | Production daemon | Testing/multi-config scenarios |
|
||||||
|
|
||||||
|
### Shared Process (Current) vs Separate Process Per Tenant
|
||||||
|
|
||||||
|
| Aspect | Shared Process | Process Per Tenant |
|--------|---------------|-------------------|
| Isolation | Application-level (RBAC/RLS) | Process-level (OS isolation) |
| Resource Usage | Efficient (shared resources) | Higher (duplicate resources) |
| Scaling | Horizontal (add instances) | Vertical (more processes) |
| Complexity | Simpler deployment | Complex orchestration |
| Security | Depends on Superset RBAC | Stronger isolation |
| Use Case | Most deployments | High-security multi-tenant |
|
||||||
|
|
||||||
|
## Future Architectural Considerations
|
||||||
|
|
||||||
|
### Async/Await Support
|
||||||
|
|
||||||
|
The current implementation uses synchronous request handling. Future versions could:
|
||||||
|
- Use `async`/`await` for I/O operations
|
||||||
|
- Implement connection pooling with `asyncpg` (PostgreSQL) or `aiomysql` (MySQL)
|
||||||
|
- Improve throughput for I/O-bound operations
|
||||||
|
|
||||||
|
### Caching Layer
|
||||||
|
|
||||||
|
A caching layer between the MCP service and the database could provide:
|
||||||
|
- Redis cache for frequently accessed resources (dashboards, charts, datasets)
|
||||||
|
- Cache invalidation on updates
|
||||||
|
- Reduced database load for read-heavy workloads
|
||||||
|
|
||||||
|
### Event-Driven Updates
|
||||||
|
|
||||||
|
WebSocket support could enable real-time updates:
|
||||||
|
- Push notifications when dashboards/charts change
|
||||||
|
- Streaming query results for large datasets
|
||||||
|
- Live dashboard editing collaboration
|
||||||
|
|
||||||
|
## References
|
||||||
|
|
||||||
|
- **Flask Application Context**: https://flask.palletsprojects.com/en/stable/appcontext/
|
||||||
|
- **SQLAlchemy Connection Pooling**: https://docs.sqlalchemy.org/en/stable/core/pooling.html
|
||||||
|
- **FastMCP Documentation**: https://github.com/jlowin/fastmcp
|
||||||
|
- **Superset Security Model**: https://superset.apache.org/docs/security
|
||||||
1332
superset/mcp_service/PRODUCTION.md
Normal file
File diff suppressed because it is too large
803
superset/mcp_service/SECURITY.md
Normal file
@@ -0,0 +1,803 @@
|
|||||||
|
<!--
|
||||||
|
Licensed to the Apache Software Foundation (ASF) under one
|
||||||
|
or more contributor license agreements. See the NOTICE file
|
||||||
|
distributed with this work for additional information
|
||||||
|
regarding copyright ownership. The ASF licenses this file
|
||||||
|
to you under the Apache License, Version 2.0 (the
|
||||||
|
"License"); you may not use this file except in compliance
|
||||||
|
with the License. You may obtain a copy of the License at
|
||||||
|
|
||||||
|
http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
|
||||||
|
Unless required by applicable law or agreed to in writing,
|
||||||
|
software distributed under the License is distributed on an
|
||||||
|
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||||
|
KIND, either express or implied. See the License for the
|
||||||
|
specific language governing permissions and limitations
|
||||||
|
under the License.
|
||||||
|
-->
|
||||||
|
|
||||||
|
# MCP Service Security
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
The MCP service implements multiple layers of security to ensure safe programmatic access to Superset functionality. This document covers authentication, authorization, session management, audit logging, and compliance considerations.
|
||||||
|
|
||||||
|
## Authentication
|
||||||
|
|
||||||
|
### Current Implementation (Development)
|
||||||
|
|
||||||
|
For development and testing, the MCP service uses simple username-based authentication:
|
||||||
|
|
||||||
|
```python
|
||||||
|
# superset_config.py
|
||||||
|
MCP_DEV_USERNAME = "admin"
|
||||||
|
```
|
||||||
|
|
||||||
|
**How it works**:
|
||||||
|
1. The `@mcp_auth_hook` decorator calls `get_user_from_request()`
|
||||||
|
2. `get_user_from_request()` reads `MCP_DEV_USERNAME` from config
|
||||||
|
3. User is queried from database and set as `g.user`
|
||||||
|
4. All subsequent Superset operations use this user's permissions
|
||||||
|
|
||||||
|
**Development Use Only**:
|
||||||
|
- No token validation
|
||||||
|
- No multi-user support
|
||||||
|
- No authentication security
|
||||||
|
- Single user for all MCP requests
|
||||||
|
- NOT suitable for production
|
||||||
|
|
||||||
|
### Production Implementation (JWT)
|
||||||
|
|
||||||
|
For production deployments, the MCP service supports JWT (JSON Web Token) authentication:
|
||||||
|
|
||||||
|
```python
|
||||||
|
# superset_config.py
|
||||||
|
MCP_AUTH_ENABLED = True
|
||||||
|
MCP_JWT_ISSUER = "https://your-auth-provider.com"
|
||||||
|
MCP_JWT_AUDIENCE = "superset-mcp"
|
||||||
|
MCP_JWT_ALGORITHM = "RS256" # or "HS256" for symmetric keys
|
||||||
|
|
||||||
|
# Option 1: Use JWKS endpoint (recommended for RS256)
|
||||||
|
MCP_JWKS_URI = "https://your-auth-provider.com/.well-known/jwks.json"
|
||||||
|
|
||||||
|
# Option 2: Use static public key (RS256)
|
||||||
|
MCP_JWT_PUBLIC_KEY = """-----BEGIN PUBLIC KEY-----
|
||||||
|
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA...
|
||||||
|
-----END PUBLIC KEY-----"""
|
||||||
|
|
||||||
|
# Option 3: Use shared secret (HS256 - less secure)
|
||||||
|
MCP_JWT_ALGORITHM = "HS256"
|
||||||
|
MCP_JWT_SECRET = "your-shared-secret-key"
|
||||||
|
```
|
||||||
|
|
||||||
|
**JWT Token Structure**:
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"iss": "https://your-auth-provider.com",
|
||||||
|
"sub": "user@company.com",
|
||||||
|
"aud": "superset-mcp",
|
||||||
|
"exp": 1735689600,
|
||||||
|
"iat": 1735686000,
|
||||||
|
"email": "user@company.com",
|
||||||
|
"scopes": ["superset:read", "superset:chart:create"]
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Required Claims**:
|
||||||
|
- `iss` (issuer): Must match `MCP_JWT_ISSUER`
|
||||||
|
- `sub` (subject): User identifier (username/email)
|
||||||
|
- `aud` (audience): Must match `MCP_JWT_AUDIENCE`
|
||||||
|
- `exp` (expiration): Token expiration timestamp
|
||||||
|
- `iat` (issued at): Token creation timestamp
|
||||||
|
|
||||||
|
**Optional Claims**:
|
||||||
|
- `email`: User's email address
|
||||||
|
- `username`: Alternative to `sub` for user identification
|
||||||
|
- `scopes`: Array of permission scopes
|
||||||
|
- `tenant_id`: Multi-tenant identifier (future use)
|
||||||
|
|
||||||
|
**Token Validation Process**:
|
||||||
|
|
||||||
|
1. Extract Bearer token from `Authorization` header
|
||||||
|
2. Verify token signature using public key or JWKS
|
||||||
|
3. Validate `iss`, `aud`, and `exp` claims
|
||||||
|
4. Check required scopes (if configured)
|
||||||
|
5. Extract user identifier from `sub`, `email`, or `username` claim
|
||||||
|
6. Look up Superset user from database
|
||||||
|
7. Set `g.user` for request context
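The validation steps above can be sketched with the standard library alone for the HS256 case. This is an illustration under stated assumptions, not the service's implementation: the function name `validate_hs256_token` is hypothetical, a production deployment would normally rely on a maintained JWT library (and on RS256 with JWKS, as recommended above), and the database lookup and `g.user` assignment in steps 6 and 7 are left out.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Any, Dict, Optional


def _b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment that has had its padding stripped."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def validate_hs256_token(
    authorization_header: str,
    secret: str,
    issuer: str,
    audience: str,
    now: Optional[float] = None,
) -> str:
    """Return the user identifier of a valid token or raise ValueError."""
    # Step 1: extract the Bearer token from the Authorization header.
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not token:
        raise ValueError("missing bearer token")
    try:
        header_b64, payload_b64, signature_b64 = token.strip().split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims: Dict[str, Any] = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(signature_b64)
    except ValueError as exc:
        raise ValueError("malformed token") from exc
    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise ValueError("malformed token")

    # Step 2: pin the algorithm, then verify the signature in constant time.
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(secret.encode("utf-8"), signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature):
        raise ValueError("bad signature")

    # Step 3: validate the iss, aud, and exp claims.
    if claims.get("iss") != issuer:
        raise ValueError("wrong issuer")
    aud = claims.get("aud")
    if audience not in (aud if isinstance(aud, list) else [aud]):
        raise ValueError("wrong audience")
    current = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= current:
        raise ValueError("token expired")

    # Step 5: take the user identifier from sub, email, or username.
    for key in ("sub", "email", "username"):
        if claims.get(key):
            return str(claims[key])
    raise ValueError("no user identifier in token")
```

Pinning the algorithm before verifying the signature matters: accepting whatever `alg` the token header declares is a well-known source of JWT bypasses.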
|
||||||
|
|
||||||
|
**Example Client Usage**:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Using curl
|
||||||
|
curl -H "Authorization: Bearer YOUR_JWT_TOKEN" \
|
||||||
|
http://localhost:5008/list_charts
|
||||||
|
|
||||||
|
# Using MCP client (Claude Desktop)
|
||||||
|
{
|
||||||
|
"mcpServers": {
|
||||||
|
"superset": {
|
||||||
|
"url": "http://localhost:5008",
|
||||||
|
"headers": {
|
||||||
|
"Authorization": "Bearer YOUR_JWT_TOKEN"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Token Renewal and Refresh
|
||||||
|
|
||||||
|
**Short-lived Access Tokens** (recommended):
|
||||||
|
- Issue tokens with short expiration (e.g., 15 minutes)
|
||||||
|
- Client must refresh token before expiration
|
||||||
|
- Reduces risk of token theft
|
||||||
|
|
||||||
|
**Refresh Token Pattern**:
|
||||||
|
|
||||||
|
```mermaid
|
||||||
|
sequenceDiagram
|
||||||
|
participant Client
|
||||||
|
participant AuthProvider as Auth Provider
|
||||||
|
participant MCP as MCP Service
|
||||||
|
|
||||||
|
Client->>AuthProvider: Request access
|
||||||
|
AuthProvider-->>Client: access_token (15 min)<br/>refresh_token (30 days)
|
||||||
|
|
||||||
|
Client->>MCP: Request with access_token
|
||||||
|
MCP-->>Client: Response
|
||||||
|
|
||||||
|
Note over Client,MCP: Access token expires
|
||||||
|
|
||||||
|
Client->>AuthProvider: Request new token with refresh_token
|
||||||
|
AuthProvider-->>Client: New access_token (15 min)
|
||||||
|
|
||||||
|
Client->>MCP: Request with new access_token
|
||||||
|
MCP-->>Client: Response
|
||||||
|
|
||||||
|
Note over Client,AuthProvider: Refresh token expires
|
||||||
|
|
||||||
|
Client->>AuthProvider: User must re-authenticate
|
||||||
|
```
|
||||||
|
|
||||||
|
**MCP Service Responsibility**:
|
||||||
|
- MCP service only validates access tokens
|
||||||
|
- Refresh token handling is the client's responsibility
|
||||||
|
- Auth provider (OAuth2/OIDC server) handles token refresh
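Because refresh is the client's job, a client typically keeps the current access token in a small cache and renews it shortly before expiry. The sketch below shows one way to do that; `AccessTokenCache` and the `fetch` callable are hypothetical names, and the call that actually exchanges a refresh token with the auth provider is deliberately left to the caller.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple

# Hypothetical callable supplied by the client: asks the auth provider for a
# new access token and returns (access_token, lifetime_in_seconds).
TokenFetcher = Callable[[], Tuple[str, float]]


@dataclass
class AccessTokenCache:
    """Hand out a cached access token, renewing it shortly before expiry."""

    fetch: TokenFetcher
    refresh_margin: float = 60.0  # renew this many seconds before expiry
    clock: Callable[[], float] = time.monotonic
    _token: Optional[str] = field(default=None, init=False, repr=False)
    _expires_at: float = field(default=0.0, init=False, repr=False)

    def get(self) -> str:
        """Return a token that is valid for at least refresh_margin seconds."""
        if self._token is None or self.clock() >= self._expires_at - self.refresh_margin:
            token, lifetime = self.fetch()
            self._token = token
            self._expires_at = self.clock() + lifetime
        return self._token

    def authorization_header(self) -> Dict[str, str]:
        """Build the header the MCP service expects on each request."""
        return {"Authorization": f"Bearer {self.get()}"}
```

With 15-minute tokens and a 60-second margin, the client renews at roughly the 14-minute mark, so a request in flight never carries a token that expires before the service validates it.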
|
||||||
|
|
||||||
|
### Service Account Patterns
|
||||||
|
|
||||||
|
For automation and batch jobs, use service accounts instead of user credentials:
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"iss": "https://your-auth-provider.com",
|
||||||
|
"sub": "service-account@automation.company.com",
|
||||||
|
"aud": "superset-mcp",
|
||||||
|
"exp": 1735689600,
|
||||||
|
"client_id": "superset-automation",
|
||||||
|
"scopes": ["superset:read", "superset:chart:create"]
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Service Account Best Practices**:
|
||||||
|
- Create dedicated Superset users for service accounts
|
||||||
|
- Grant minimal required permissions
|
||||||
|
- Use long-lived tokens only when necessary
|
||||||
|
- Rotate service account credentials regularly
|
||||||
|
- Log all service account activity
|
||||||
|
- Use separate service accounts per automation job
|
||||||
|
|
||||||
|
**Example Superset Service Account Setup**:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Create service account user in Superset
|
||||||
|
superset fab create-user \
|
||||||
|
--role Alpha \
|
||||||
|
--username automation-service \
|
||||||
|
--firstname Automation \
|
||||||
|
--lastname Service \
|
||||||
|
--email automation@company.com \
|
||||||
|
--password <generated-password>
|
||||||
|
|
||||||
|
# Grant specific permissions
|
||||||
|
# (Use Superset UI or FAB CLI to configure role permissions)
|
||||||
|
```
|
||||||
|
|
||||||
|
## Authorization
|
||||||
|
|
||||||
|
### RBAC Integration
|
||||||
|
|
||||||
|
The MCP service fully integrates with Superset's Flask-AppBuilder role-based access control:
|
||||||
|
|
||||||
|
**Role Hierarchy**:
|
||||||
|
- **Admin**: Full access to all resources
|
||||||
|
- **Alpha**: Can create and edit dashboards, charts, datasets
|
||||||
|
- **Gamma**: Read-only access to permitted resources
|
||||||
|
- **Custom Roles**: Fine-grained permission sets
|
||||||
|
|
||||||
|
**Permission Checking Flow**:
|
||||||
|
|
||||||
|
```python
|
||||||
|
# In MCP tool
|
||||||
|
@mcp.tool
|
||||||
|
@mcp_auth_hook # Sets g.user
|
||||||
|
def list_dashboards(filters: List[Filter]) -> DashboardList:
|
||||||
|
    # Flask-AppBuilder security manager automatically filters
|
||||||
|
    # based on g.user's permissions
|
||||||
|
    dashboards = DashboardDAO.find_by_ids(...)
|
||||||
|
    # Only returns dashboards g.user can access
|
||||||
|
```
|
||||||
|
|
||||||
|
**Permission Types**:
|
||||||
|
|
||||||
|
| Permission | Description | Example |
|
||||||
|
|------------|-------------|---------|
|
||||||
|
| `can_read` | View resource | View dashboard details |
|
||||||
|
| `can_write` | Edit resource | Update chart configuration |
|
||||||
|
| `can_delete` | Delete resource | Remove dashboard |
|
||||||
|
| `datasource_access` | Access dataset | Query dataset in chart |
|
||||||
|
| `database_access` | Access database | Execute SQL in SQL Lab |
|
||||||
|
|
||||||
|
### Row-Level Security (RLS)
|
||||||
|
|
||||||
|
RLS rules filter query results based on user attributes:
|
||||||
|
|
||||||
|
**RLS Rule Example**:
|
||||||
|
```sql
|
||||||
|
-- Only show rows owned by the current user (owner_username is an illustrative column)
|
||||||
|
owner_username = '{{ current_username() }}'
|
||||||
|
```
|
||||||
|
|
||||||
|
**How RLS Works with MCP**:
|
||||||
|
|
||||||
|
```mermaid
|
||||||
|
sequenceDiagram
|
||||||
|
participant Client
|
||||||
|
participant Auth as @mcp_auth_hook
|
||||||
|
participant Tool as MCP Tool
|
||||||
|
participant DAO as Superset DAO
|
||||||
|
participant DB as Database
|
||||||
|
|
||||||
|
Client->>Auth: Request with JWT/dev username
|
||||||
|
Auth->>Auth: Set g.user
|
||||||
|
Auth->>Tool: Execute tool
|
||||||
|
Tool->>DAO: Call ChartDAO.get_chart_data()
|
||||||
|
DAO->>DAO: Apply RLS rules<br/>Replace template variables<br/>with g.user attributes
|
||||||
|
DAO->>DB: Query with RLS filters in WHERE clause
|
||||||
|
DB-->>DAO: Only permitted rows
|
||||||
|
DAO-->>Tool: Filtered data
|
||||||
|
Tool-->>Client: Response
|
||||||
|
```
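The "query with RLS filters in WHERE clause" step in the diagram can be pictured with a deliberately simplified sketch. It is not Superset's implementation (Superset builds these filters through SQLAlchemy and has its own grouping rules for combining them); the hypothetical `apply_rls` helper only shows the idea that each rule is ANDed onto whatever filter the query already has, so a rule can narrow results but never widen them.

```python
from typing import Iterable


def apply_rls(base_where: str, rls_clauses: Iterable[str]) -> str:
    """AND every RLS clause onto the query's own filter.

    Each part is parenthesized so that an OR inside one clause cannot
    escape and loosen the others.
    """
    parts = [f"({clause})" for clause in [base_where, *rls_clauses] if clause and clause.strip()]
    return " AND ".join(parts) if parts else "1 = 1"


where = apply_rls("year = 2024", ["department = 'sales'"])
```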
|
||||||
|
|
||||||
|
**RLS Configuration**:
|
||||||
|
|
||||||
|
RLS is configured per dataset in Superset UI:
|
||||||
|
1. Navigate to dataset → Edit → Row Level Security
|
||||||
|
2. Create RLS rule with SQL filter template
|
||||||
|
3. Assign rule to roles or users
|
||||||
|
4. MCP service automatically applies rules (no code changes needed)
|
||||||
|
|
||||||
|
**MCP Service Guarantees**:
|
||||||
|
- Cannot bypass RLS rules
|
||||||
|
- No privileged access mode
|
||||||
|
- RLS applied consistently across all tools
|
||||||
|
- Same security model as Superset web UI
|
||||||
|
|
||||||
|
### Dataset Access Control
|
||||||
|
|
||||||
|
The MCP service validates dataset access before executing queries:
|
||||||
|
|
||||||
|
```python
|
||||||
|
# In chart generation tool
|
||||||
|
@mcp.tool
|
||||||
|
@mcp_auth_hook
|
||||||
|
def generate_chart(dataset_id: int, ...) -> ChartResponse:
|
||||||
|
    dataset = DatasetDAO.find_by_id(dataset_id)
|
||||||
|
|
||||||
|
    # Check if user has access
|
||||||
|
    if not has_dataset_access(dataset):
|
||||||
|
        raise ValueError(
|
||||||
|
            f"User {g.user.username} does not have access to dataset {dataset_id}"
|
||||||
|
        )
|
||||||
|
|
||||||
|
    # Proceed with chart creation
|
||||||
|
    ...
|
||||||
|
```
|
||||||
|
|
||||||
|
**Dataset Access Filters**:
|
||||||
|
|
||||||
|
All listing operations automatically filter by user access:
|
||||||
|
|
||||||
|
- `list_datasets`: Uses `DatasourceFilter` - only shows datasets user can query
|
||||||
|
- `list_charts`: Uses `ChartAccessFilter` - only shows charts with accessible datasets
|
||||||
|
- `list_dashboards`: Uses `DashboardAccessFilter` - only shows dashboards user can view
|
||||||
|
|
||||||
|
**Access Check Implementation**:
|
||||||
|
|
||||||
|
```python
|
||||||
|
from flask import g
|
||||||
|
from superset import security_manager
|
||||||
|
from superset.connectors.sqla.models import SqlaTable
|
||||||
|
|
||||||
|
def has_dataset_access(dataset: SqlaTable) -> bool:
|
||||||
|
    """Check if g.user can access dataset."""
|
||||||
|
    if hasattr(g, "user") and g.user:
|
||||||
|
        return security_manager.can_access_datasource(datasource=dataset)
|
||||||
|
    return False
|
||||||
|
```
|
||||||
|
|
||||||
|
### Tool-Level Permissions
|
||||||
|
|
||||||
|
Different MCP tools require different Superset permissions:
|
||||||
|
|
||||||
|
| Tool | Required Permissions | Notes |
|
||||||
|
|------|---------------------|-------|
|
||||||
|
| `list_dashboards` | `can_read` on Dashboard | Returns only accessible dashboards |
|
||||||
|
| `get_dashboard_info` | `can_read` on Dashboard + dataset access | Validates dashboard and dataset permissions |
|
||||||
|
| `list_charts` | `can_read` on Slice | Returns only charts with accessible datasets |
|
||||||
|
| `get_chart_info` | `can_read` on Slice + dataset access | Validates chart and dataset permissions |
|
||||||
|
| `get_chart_data` | `can_read` on Slice + `datasource_access` | Executes query with RLS applied |
|
||||||
|
| `generate_chart` | `can_write` on Slice + `datasource_access` | Creates new chart |
|
||||||
|
| `update_chart` | `can_write` on Slice + ownership or Admin | Must own chart or be Admin |
|
||||||
|
| `list_datasets` | `datasource_access` | Returns only accessible datasets |
|
||||||
|
| `get_dataset_info` | `datasource_access` | Validates dataset access |
|
||||||
|
| `execute_sql` | `can_sql_json` or `can_sqllab` on Database | Executes SQL with RLS |
|
||||||
|
| `generate_dashboard` | `can_write` on Dashboard + dataset access | Creates new dashboard |
|
||||||
|
|
||||||
|
**Permission Denied Handling**:
|
||||||
|
|
||||||
|
```python
|
||||||
|
# If user lacks permission, Superset raises exception
|
||||||
|
try:
|
||||||
|
    result = DashboardDAO.find_by_id(dashboard_id)
|
||||||
|
except SupersetSecurityException as e:
|
||||||
|
    # Chain the original exception so the root cause stays in the traceback
|
||||||
|
    raise ValueError(f"Access denied: {e}") from e
|
||||||
|
```
|
||||||
|
|
||||||
|
### JWT Scope Validation
|
||||||
|
|
||||||
|
Future implementation will support scope-based authorization:
|
||||||
|
|
||||||
|
```python
|
||||||
|
# superset_config.py
|
||||||
|
MCP_REQUIRED_SCOPES = ["superset:read"] # Minimum scopes required
|
||||||
|
```
|
||||||
|
|
||||||
|
**Scope Hierarchy**:
|
||||||
|
- `superset:read`: List and view resources
|
||||||
|
- `superset:chart:create`: Create new charts
|
||||||
|
- `superset:chart:update`: Update existing charts
|
||||||
|
- `superset:chart:delete`: Delete charts
|
||||||
|
- `superset:dashboard:create`: Create dashboards
|
||||||
|
- `superset:sql:execute`: Execute SQL queries
|
||||||
|
- `superset:admin`: Full administrative access
|
||||||
|
|
||||||
|
**Scope Enforcement** (future):
|
||||||
|
|
||||||
|
```python
|
||||||
|
@mcp.tool
|
||||||
|
@mcp_auth_hook
|
||||||
|
@require_scopes(["superset:chart:create"])
|
||||||
|
def generate_chart(...) -> ChartResponse:
|
||||||
|
    # Only proceeds if JWT contains required scope
|
||||||
|
    ...
|
||||||
|
```
|
||||||
|
|
||||||
|
**Scope Validation Logic**:
|
||||||
|
1. Extract `scopes` array from JWT payload
|
||||||
|
2. Check if all required scopes present
|
||||||
|
3. Deny access if any scope missing
|
||||||
|
4. Log denied attempts for audit
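Since scope enforcement is described as future work, the following is only a sketch of how a `require_scopes` decorator could implement the four steps above. The `current_scopes` context variable and the logger name are assumptions made for this example; in a real service the auth layer would populate the scopes from the validated JWT.

```python
import functools
import logging
from contextvars import ContextVar
from typing import Any, Callable, FrozenSet, Iterable

logger = logging.getLogger("superset.mcp_service.scopes")  # hypothetical logger name

# Hypothetical: set by the auth layer from the JWT "scopes" claim (step 1).
current_scopes: ContextVar[FrozenSet[str]] = ContextVar(
    "current_scopes", default=frozenset()
)


def require_scopes(required: Iterable[str]) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Reject the call unless every required scope is present."""
    needed = frozenset(required)

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            # Steps 2 and 3: all required scopes must be present.
            missing = needed - current_scopes.get()
            if missing:
                # Step 4: record the denied attempt for audit.
                logger.warning(
                    "Scope check denied: tool=%s missing=%s",
                    func.__name__,
                    sorted(missing),
                )
                raise PermissionError(f"Missing required scopes: {sorted(missing)}")
            return func(*args, **kwargs)

        return wrapper

    return decorator
```

A scope check of this kind would sit in front of the Superset RBAC checks rather than replace them: a token with `superset:chart:create` still needs a user whose role grants `can_write` on charts.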
|
||||||
|
|
||||||
|
## Session and CSRF Handling
|
||||||
|
|
||||||
|
### Session Configuration
|
||||||
|
|
||||||
|
The MCP service configures sessions for authentication context:
|
||||||
|
|
||||||
|
```python
|
||||||
|
# superset_config.py
|
||||||
|
MCP_SESSION_CONFIG = {
|
||||||
|
"SESSION_COOKIE_HTTPONLY": True, # Prevent JavaScript access
|
||||||
|
"SESSION_COOKIE_SECURE": True, # HTTPS only (production)
|
||||||
|
"SESSION_COOKIE_SAMESITE": "Strict", # CSRF protection
|
||||||
|
"SESSION_COOKIE_NAME": "superset_session",
|
||||||
|
"PERMANENT_SESSION_LIFETIME": 86400, # 24 hours
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Why Session Config in MCP?**
|
||||||
|
|
||||||
|
The MCP service uses Flask's session mechanism for:
|
||||||
|
- **Authentication context**: Storing `g.user` across request lifecycle
|
||||||
|
- **CSRF token generation**: Protecting state-changing operations
|
||||||
|
- **Request correlation**: Linking related tool calls
|
||||||
|
|
||||||
|
**Important Notes**:
|
||||||
|
- MCP service is **stateless** - no server-side session storage
|
||||||
|
- Sessions used only for request-scoped auth context
|
||||||
|
- Cookies used for auth token transmission (alternative to Bearer header)
|
||||||
|
- Session data NOT persisted between MCP service restarts
|
||||||
|
|
||||||
|
### CSRF Protection
|
||||||
|
|
||||||
|
CSRF (Cross-Site Request Forgery) protection is configured but currently **not enforced** for MCP tools:
|
||||||
|
|
||||||
|
```python
|
||||||
|
MCP_CSRF_CONFIG = {
|
||||||
|
"WTF_CSRF_ENABLED": True,
|
||||||
|
"WTF_CSRF_TIME_LIMIT": None, # No time limit
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Why CSRF Config Exists**:
|
||||||
|
- Flask-AppBuilder and Superset expect CSRF configuration
|
||||||
|
- Prevents errors during app initialization
|
||||||
|
- Future-proofing for potential web UI for MCP service
|
||||||
|
|
||||||
|
**Why CSRF NOT Enforced**:
|
||||||
|
- MCP protocol uses Bearer tokens (not cookies for auth)
|
||||||
|
- CSRF attacks require browser cookie-based authentication
|
||||||
|
- Stateless API design prevents CSRF vulnerability
|
||||||
|
- MCP clients are programmatic (not browsers)
|
||||||
|
|
||||||
|
**If Using Cookie-Based Auth** (future):
|
||||||
|
- Enable CSRF token requirement
|
||||||
|
- Include CSRF token in MCP tool requests
|
||||||
|
- Validate token on state-changing operations
|
||||||
|
|
||||||
|
**CSRF Token Flow** (if enabled):
|
||||||
|
|
||||||
|
```mermaid
|
||||||
|
sequenceDiagram
|
||||||
|
participant Client
|
||||||
|
participant MCP as MCP Service
|
||||||
|
participant Session as Session Store
|
||||||
|
|
||||||
|
Client->>MCP: Request CSRF token
|
||||||
|
MCP->>Session: Generate and store token
|
||||||
|
MCP-->>Client: Return CSRF token
|
||||||
|
|
||||||
|
Client->>MCP: Request with CSRF token
|
||||||
|
MCP->>Session: Validate token matches session
|
||||||
|
alt Token valid
|
||||||
|
MCP-->>Client: Process request
|
||||||
|
else Token invalid/missing
|
||||||
|
MCP-->>Client: Reject request (403)
|
||||||
|
end
|
||||||
|
```
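The token flow in the diagram reduces to two operations: issue a random token bound to the session, and compare the submitted token against it in constant time. The sketch below uses plain dictionaries for the session and hypothetical function names; in a Flask application this job is normally done by Flask-WTF rather than hand-written code.

```python
import hmac
import secrets
from typing import Dict, Optional


def issue_csrf_token(session: Dict[str, str]) -> str:
    """Generate an unguessable token and remember it in the session."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token


def is_csrf_token_valid(session: Dict[str, str], submitted: Optional[str]) -> bool:
    """Return True only if the submitted token matches the session's token."""
    expected = session.get("csrf_token")
    if not expected or not submitted:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), submitted.encode("utf-8"))
```

A request whose token is missing or does not match would be rejected with HTTP 403, as in the `else` branch of the diagram.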
|
||||||
|
|
||||||
|
### Production Security Recommendations
|
||||||
|
|
||||||
|
**HTTPS Required**:
|
||||||
|
```python
|
||||||
|
MCP_SESSION_CONFIG = {
|
||||||
|
"SESSION_COOKIE_SECURE": True, # MUST be True in production
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Without HTTPS:
|
||||||
|
- Cookies transmitted in plaintext
|
||||||
|
- Session hijacking risk
|
||||||
|
- JWT tokens exposed
|
||||||
|
- Man-in-the-middle attacks possible
|
||||||
|
|
||||||
|
**SameSite Configuration**:
|
||||||
|
- `Strict`: Cookies never sent cross-site (most secure)
|
||||||
|
- `Lax`: Cookies sent on top-level navigation (less secure)
|
||||||
|
- `None`: Cookies sent everywhere (requires Secure flag, least secure)
|
||||||
|
|
||||||
|
**Recommended Production Settings**:
|
||||||
|
```python
|
||||||
|
MCP_SESSION_CONFIG = {
|
||||||
|
"SESSION_COOKIE_HTTPONLY": True, # Always
|
||||||
|
"SESSION_COOKIE_SECURE": True, # Always (HTTPS required)
|
||||||
|
"SESSION_COOKIE_SAMESITE": "Strict", # Recommended
|
||||||
|
"PERMANENT_SESSION_LIFETIME": 3600, # 1 hour (adjust as needed)
|
||||||
|
}
|
||||||
|
```
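A deployment can check these settings at startup instead of relying on review alone. The helper below is a sketch with a hypothetical name (`check_session_config`); it only inspects the four keys recommended above and returns a list of human-readable problems.

```python
from typing import Any, Dict, List


def check_session_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in an MCP_SESSION_CONFIG-style dict."""
    problems: List[str] = []
    if config.get("SESSION_COOKIE_HTTPONLY") is not True:
        problems.append("SESSION_COOKIE_HTTPONLY must be True")
    if config.get("SESSION_COOKIE_SECURE") is not True:
        problems.append("SESSION_COOKIE_SECURE must be True (HTTPS required)")
    if config.get("SESSION_COOKIE_SAMESITE") != "Strict":
        problems.append('SESSION_COOKIE_SAMESITE should be "Strict"')
    lifetime = config.get("PERMANENT_SESSION_LIFETIME")
    if not isinstance(lifetime, int) or lifetime <= 0:
        problems.append("PERMANENT_SESSION_LIFETIME must be a positive number of seconds")
    return problems


recommended = {
    "SESSION_COOKIE_HTTPONLY": True,
    "SESSION_COOKIE_SECURE": True,
    "SESSION_COOKIE_SAMESITE": "Strict",
    "PERMANENT_SESSION_LIFETIME": 3600,
}
problems = check_session_config(recommended)
```

An empty list means the configuration matches the recommendations; a non-empty list can be logged or turned into a startup failure, depending on how strict the deployment wants to be.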
|
||||||
|
|
||||||
|
## Audit Logging
|
||||||
|
|
||||||
|
### Current Logging
|
||||||
|
|
||||||
|
The MCP service logs basic authentication events:
|
||||||
|
|
||||||
|
```python
|
||||||
|
# In @mcp_auth_hook
|
||||||
|
logger.debug(
|
||||||
|
"MCP tool call: user=%s, tool=%s",
|
||||||
|
user.username,
|
||||||
|
tool_func.__name__
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
|
**What's Logged**:
|
||||||
|
- User who made the request
|
||||||
|
- Which tool was called
|
||||||
|
- Timestamp (from log formatter)
|
||||||
|
- Success/failure (via exception logging)
|
||||||
|
|
||||||
|
**Log Format**:
|
||||||
|
```
|
||||||
|
2025-01-01 10:30:45,123 DEBUG [mcp_auth_hook] MCP tool call: user=admin, tool=list_dashboards
|
||||||
|
2025-01-01 10:30:45,456 ERROR [mcp_auth_hook] Tool execution failed: user=admin, tool=generate_chart, error=Permission denied
|
||||||
|
```
|
||||||
|
|
||||||
|
### Enhanced Audit Logging (Recommended)
|
||||||
|
|
||||||
|
For production deployments, implement structured logging:
|
||||||
|
|
||||||
|
```python
|
||||||
|
# superset_config.py
|
||||||
|
import logging
|
||||||
|
import json
|
||||||
|
|
||||||
|
class StructuredFormatter(logging.Formatter):
|
||||||
|
    def format(self, record):
|
||||||
|
        log_data = {
|
||||||
|
            "timestamp": self.formatTime(record),
|
||||||
|
            "level": record.levelname,
|
||||||
|
            "logger": record.name,
|
||||||
|
            "message": record.getMessage(),
|
||||||
|
            "user": getattr(record, "user", None),
|
||||||
|
            "tool": getattr(record, "tool", None),
|
||||||
|
            "resource_type": getattr(record, "resource_type", None),
|
||||||
|
            "resource_id": getattr(record, "resource_id", None),
|
||||||
|
            "action": getattr(record, "action", None),
|
||||||
|
            "result": getattr(record, "result", None),
|
||||||
|
            "error": getattr(record, "error", None),
|
||||||
|
        }
|
||||||
|
        return json.dumps(log_data)
|
||||||
|
|
||||||
|
# Apply formatter
|
||||||
|
handler = logging.StreamHandler()
|
||||||
|
handler.setFormatter(StructuredFormatter())
|
||||||
|
logging.getLogger("superset.mcp_service").addHandler(handler)
|
||||||
|
```
|
||||||
|
|
||||||
|
**Structured Log Example**:
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"timestamp": "2025-01-01T10:30:45.123Z",
|
||||||
|
"level": "INFO",
|
||||||
|
"logger": "superset.mcp_service.auth",
|
||||||
|
"message": "MCP tool execution",
|
||||||
|
"user": "admin",
|
||||||
|
"tool": "generate_chart",
|
||||||
|
"resource_type": "chart",
|
||||||
|
"resource_id": 42,
|
||||||
|
"action": "create",
|
||||||
|
"result": "success",
|
||||||
|
"duration_ms": 234
|
||||||
|
}
|
||||||
|
```
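Fields such as `user` and `tool` only appear in the JSON output if the caller attaches them to the log record, which the standard `logging` module supports through the `extra` argument. The sketch below is self-contained, so it defines its own compact formatter (`JsonAuditFormatter`, a hypothetical name) and writes to an in-memory stream; the logger name is likewise an assumption.

```python
import io
import json
import logging

AUDIT_FIELDS = ("user", "tool", "resource_type", "resource_id", "action", "result")


class JsonAuditFormatter(logging.Formatter):
    """Emit one JSON object per record, including any audit fields present."""

    def format(self, record: logging.LogRecord) -> str:
        data = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        for name in AUDIT_FIELDS:
            # Keys passed via extra= become attributes on the record.
            data[name] = getattr(record, name, None)
        return json.dumps(data)


stream = io.StringIO()  # stands in for stdout or a log shipper
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonAuditFormatter())

audit_logger = logging.getLogger("superset.mcp_service.audit_example")  # hypothetical name
audit_logger.setLevel(logging.INFO)
audit_logger.addHandler(handler)
audit_logger.propagate = False

audit_logger.info(
    "MCP tool execution",
    extra={
        "user": "admin",
        "tool": "generate_chart",
        "resource_type": "chart",
        "resource_id": 42,
        "action": "create",
        "result": "success",
    },
)
entry = json.loads(stream.getvalue())
```

Keys passed through `extra` must not collide with built-in `LogRecord` attributes such as `message` or `name`; the field names used here are safe.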
|
||||||
|
|
||||||
|
### Audit Events
|
||||||
|
|
||||||
|
**Key Events to Log**:
|
||||||
|
|
||||||
|
| Event | Data to Capture | Severity |
|
||||||
|
|-------|----------------|----------|
|
||||||
|
| Authentication success | User, timestamp, IP | INFO |
|
||||||
|
| Authentication failure | Username attempted, reason | WARNING |
|
||||||
|
| Tool execution | User, tool, parameters, result | INFO |
|
||||||
|
| Permission denied | User, tool, resource, reason | WARNING |
|
||||||
|
| Chart created | User, chart_id, dataset_id | INFO |
|
||||||
|
| Dashboard created | User, dashboard_id, chart_ids | INFO |
|
||||||
|
| SQL executed | User, database, query (sanitized), rows | INFO |
|
||||||
|
| Error occurred | User, tool, error type, stack trace | ERROR |
|
||||||
|
|
||||||
|
### Integration with SIEM Systems
|
||||||
|
|
||||||
|
**Export to External Systems**:
|
||||||
|
|
||||||
|
**Option 1: Syslog**:
|
||||||
|
```python
|
||||||
|
import logging.handlers
|
||||||
|
|
||||||
|
syslog_handler = logging.handlers.SysLogHandler(
|
||||||
|
address=("syslog.company.com", 514)
|
||||||
|
)
|
||||||
|
logging.getLogger("superset.mcp_service").addHandler(syslog_handler)
|
||||||
|
```
|
||||||
|
|
||||||
|
**Option 2: Log Aggregation (ELK, Splunk)**:
|
||||||
|
```python
|
||||||
|
# Send JSON logs to stdout, collected by log shipper
|
||||||
|
import sys
|
||||||
|
import logging
|
||||||
|
|
||||||
|
handler = logging.StreamHandler(sys.stdout)
|
||||||
|
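handler.setFormatter(StructuredFormatter())  # StructuredFormatter as defined earlier
|
||||||
|
logging.getLogger("superset.mcp_service").addHandler(handler)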
|
||||||
|
```
|
||||||
|
|
||||||
|
**Option 3: Cloud Logging (CloudWatch, Stackdriver)**:
|
||||||
|
```python
|
||||||
|
# AWS CloudWatch example
|
||||||
|
import logging
|
||||||
|
import watchtower
|
||||||
|
|
||||||
|
handler = watchtower.CloudWatchLogHandler(
|
||||||
|
    log_group_name="/superset/mcp",  # keyword names used by watchtower 2.0+
|
||||||
|
    log_stream_name="mcp-service",
|
||||||
|
)
|
||||||
|
logging.getLogger("superset.mcp_service").addHandler(handler)
|
||||||
|
```
|
||||||
|
|
||||||
|
### Log Retention
|
||||||
|
|
||||||
|
**Recommended Retention Policies**:
|
||||||
|
- **Authentication logs**: 90 days minimum
|
||||||
|
- **Tool execution logs**: 30 days minimum
|
||||||
|
- **Error logs**: 180 days minimum
|
||||||
|
- **Compliance logs**: Per regulatory requirements (e.g., 6 years for HIPAA)
|
||||||
|
|
||||||
|
## Compliance Considerations
|
||||||
|
|
||||||
|
### GDPR (General Data Protection Regulation)
|
||||||
|
|
||||||
|
**User Data Access Tracking**:
|
||||||
|
- Log all data access by user
|
||||||
|
- Provide audit trail for data subject access requests (DSAR)
|
||||||
|
- Implement data retention policies
|
||||||
|
- Support right to be forgotten (delete user data from logs)
|
||||||
|
|
||||||
|
**MCP Service Compliance**:
|
||||||
|
- All tool calls logged with user identification
|
||||||
|
- Can generate reports of user's data access
|
||||||
|
- Logs can be filtered/redacted for privacy
|
||||||
|
- No personal data stored in MCP service (only in Superset DB)
|
||||||
|
|
||||||
|
### SOC 2 (Service Organization Control 2)
|
||||||
|
|
||||||
|
**Audit Trail Requirements**:
|
||||||
|
- Log all administrative actions
|
||||||
|
- Maintain immutable audit logs
|
||||||
|
- Implement log integrity verification
|
||||||
|
- Provide audit log export functionality
|
||||||
|
|
||||||
|
**MCP Service Compliance**:
|
||||||
|
- Structured logging provides audit trail
|
||||||
|
- Logs include who, what, when for all actions
|
||||||
|
- Export logs to secure, immutable storage (S3, etc.)
|
||||||
|
- Implement log signing for integrity verification
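Log signing for integrity verification can be sketched as an HMAC chain: each entry's signature covers the entry itself and the previous signature, so editing, removing, or reordering an entry invalidates everything after it. The function names and the in-memory list are assumptions made for this example; a real deployment would keep the key in a secrets manager and write the signed records to immutable storage.

```python
import hashlib
import hmac
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # placeholder "previous signature" for the first entry


def _sign(key: bytes, entry: Dict[str, Any], previous: str) -> str:
    """HMAC-SHA256 over the previous signature plus a canonical JSON encoding."""
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, previous.encode("ascii") + payload, hashlib.sha256).hexdigest()


def sign_log(key: bytes, entries: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Attach a chained signature to every entry, in order."""
    signed: List[Dict[str, Any]] = []
    previous = GENESIS
    for entry in entries:
        signature = _sign(key, entry, previous)
        signed.append({"entry": entry, "signature": signature})
        previous = signature
    return signed


def verify_log(key: bytes, signed: List[Dict[str, Any]]) -> bool:
    """Recompute the chain and compare each signature in constant time."""
    previous = GENESIS
    for record in signed:
        expected = _sign(key, record["entry"], previous)
        actual = str(record.get("signature", ""))
        if not hmac.compare_digest(expected.encode("utf-8"), actual.encode("utf-8")):
            return False
        previous = expected
    return True
```

Verification needs the same key that produced the signatures, so this scheme detects tampering by anyone without the key; it does not by itself prove integrity to a third party, for which an asymmetric signature would be needed.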
|
||||||
|
|
||||||
|
### HIPAA (Health Insurance Portability and Accountability Act)
|
||||||
|
|
||||||
|
**PHI Access Logging**:
|
||||||
|
- Log all access to protected health information
|
||||||
|
- Include user, timestamp, data accessed
|
||||||
|
- Maintain logs for 6 years minimum
|
||||||
|
- Implement access controls on audit logs
|
||||||
|
|
||||||
|
**MCP Service Compliance**:
|
||||||
|
- All dataset queries logged
|
||||||
|
- Row-level security enforces data access controls
|
||||||
|
- Can identify which users accessed which PHI records
|
||||||
|
- Logs exportable for compliance reporting
|
||||||
|
|
||||||
|
**Example HIPAA Audit Log Entry**:
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"timestamp": "2025-01-01T10:30:45.123Z",
|
||||||
|
"user": "doctor@hospital.com",
|
||||||
|
"action": "query_dataset",
|
||||||
|
"dataset_id": 123,
|
||||||
|
"dataset_name": "patient_records",
|
||||||
|
"rows_returned": 5,
|
||||||
|
"phi_accessed": true,
|
||||||
|
"purpose": "Treatment",
|
||||||
|
"ip_address": "10.0.1.25"
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Access Control Matrix
|
||||||
|
|
||||||
|
For compliance audits, maintain a matrix of who can access what:
|
||||||
|
|
||||||
|
| Role | Dashboards | Charts | Datasets | SQL Lab | Admin |
|
||||||
|
|------|-----------|--------|----------|---------|-------|
|
||||||
|
| Admin | All | All | All | All | Yes |
|
||||||
|
| Alpha | Owned + Shared | Owned + Shared | Permitted | Permitted DBs | No |
|
||||||
|
| Gamma | Shared | Shared | Permitted | No | No |
|
||||||
|
| Viewer | Shared | Shared | None | No | No |
|
||||||
|
|
||||||
|
## Security Checklist for Production
|
||||||
|
|
||||||
|
Before deploying MCP service to production:
|
||||||
|
|
||||||
|
**Authentication**:
|
||||||
|
- [ ] `MCP_AUTH_ENABLED = True`
|
||||||
|
- [ ] JWT issuer, audience, and keys configured
|
||||||
|
- [ ] `MCP_DEV_USERNAME` removed or set to `None`
|
||||||
|
- [ ] Token expiration enforced (short-lived tokens)
|
||||||
|
- [ ] Refresh token mechanism implemented (client-side)
|
||||||
|
|
||||||
|
**Authorization**:
|
||||||
|
- [ ] RBAC roles configured in Superset
|
||||||
|
- [ ] RLS rules tested for all datasets
|
||||||
|
- [ ] Dataset access permissions verified
|
||||||
|
- [ ] Minimum required permissions granted per role
|
||||||
|
- [ ] Service accounts use dedicated roles
|
||||||
|
|
||||||
|
**Network Security**:
|
||||||
|
- [ ] HTTPS enabled (`SESSION_COOKIE_SECURE = True`)
|
||||||
|
- [ ] TLS 1.2+ enforced
|
||||||
|
- [ ] Firewall rules restrict access to MCP service
|
||||||
|
- [ ] Network isolation between MCP and database
|
||||||
|
- [ ] Load balancer health checks configured
|
||||||
|
|
||||||
|
**Session Security**:
|
||||||
|
- [ ] `SESSION_COOKIE_HTTPONLY = True`
|
||||||
|
- [ ] `SESSION_COOKIE_SECURE = True`
|
||||||
|
- [ ] `SESSION_COOKIE_SAMESITE = "Strict"`
|
||||||
|
- [ ] Session timeout configured appropriately
|
||||||
|
- [ ] No sensitive data stored in sessions
|
||||||
|
|
||||||
|
**Audit Logging**:
|
||||||
|
- [ ] Structured logging enabled
|
||||||
|
- [ ] All tool executions logged
|
||||||
|
- [ ] Authentication events logged
|
||||||
|
- [ ] Logs exported to SIEM/aggregation system
|
||||||
|
- [ ] Log retention policy implemented
|
||||||
|
|
||||||
|
**Monitoring**:
|
||||||
|
- [ ] Failed authentication attempts alerted
|
||||||
|
- [ ] Permission denied events monitored
|
||||||
|
- [ ] Error rate alerts configured
|
||||||
|
- [ ] Unusual access patterns detected
|
||||||
|
- [ ] Service availability monitored

**Compliance**:
- [ ] Data access logs retained per regulations
- [ ] Audit trail exportable
- [ ] Privacy policy updated for MCP service
- [ ] User consent obtained (if required)
- [ ] Security incident response plan includes MCP

## Security Incident Response

### Suspected Token Compromise

**Immediate Actions**:
1. Revoke compromised token at auth provider
2. Review audit logs for unauthorized access
3. Identify affected resources
4. Notify affected users/stakeholders
5. Force token refresh for all users (if the provider supports it)

**Investigation**:
1. Check MCP service logs for unusual activity
2. Correlate access patterns with compromised token
3. Determine scope of data accessed
4. Document timeline of events
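
The investigation steps above can be partly scripted when audit logs are stored as JSON lines. This is a minimal sketch under that assumption; the field names `token_id` (for example the JWT `jti` claim), `timestamp`, and `dataset` are hypothetical and need to be mapped to whatever your logs actually contain. Timestamps are assumed to be ISO 8601 strings in a single timezone, so that sorting them as text gives chronological order.

```python
import json
from typing import Any, Iterable


def build_token_timeline(log_lines: Iterable[str], token_id: str) -> list[dict[str, Any]]:
    """Return the audit events for one token, ordered by timestamp."""
    events = []
    for line in log_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if isinstance(event, dict) and event.get("token_id") == token_id:
            events.append(event)
    return sorted(events, key=lambda e: e.get("timestamp", ""))


def datasets_touched(events: Iterable[dict[str, Any]]) -> list[str]:
    """Summarise the scope of access as a sorted list of dataset names."""
    return sorted({e["dataset"] for e in events if e.get("dataset")})
```

The ordered event list serves as the raw material for the incident timeline, and `datasets_touched` gives a first estimate of the scope of data accessed.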

### Unauthorized Access Detected

**Response Procedure**:
1. Block user/IP immediately (firewall/load balancer)
2. Disable user account in Superset
3. Review all actions by user in audit logs
4. Assess data exposure
5. Notify security team and management
6. Preserve logs for forensic analysis

### Data Breach

**MCP-Specific Considerations**:
1. Identify which datasets were accessed via MCP
2. Determine whether RLS was bypassed (should not be possible)
3. Check for SQL injection attempts (should be prevented by Superset)
4. Review all tool executions in the incident timeframe
5. Export detailed audit logs for incident report
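
Reviewing tool executions in a timeframe and exporting them for a report could look like the following sketch. It assumes audit events are already parsed into dictionaries with an ISO 8601 `timestamp` field; the function name and the column names (`user`, `tool`, `dataset`) are hypothetical.

```python
import csv
import io
from datetime import datetime
from typing import Any, Iterable, Sequence


def export_events_in_window(
    events: Iterable[dict[str, Any]],
    start: datetime,
    end: datetime,
    fields: Sequence[str] = ("timestamp", "user", "tool", "dataset"),
) -> str:
    """Return a CSV document with the events whose timestamp lies in [start, end]."""
    out = io.StringIO()
    # Keys not listed in `fields` are ignored; missing keys become empty cells.
    writer = csv.DictWriter(out, fieldnames=list(fields), extrasaction="ignore")
    writer.writeheader()
    for event in events:
        ts = datetime.fromisoformat(event["timestamp"])
        if start <= ts <= end:
            writer.writerow(event)
    return out.getvalue()
```

`start`, `end`, and the logged timestamps must all be timezone-aware or all be naive, otherwise the comparison raises `TypeError`. On Python versions before 3.11, `datetime.fromisoformat` does not accept a trailing `Z`, so timestamps should use an explicit offset such as `+00:00`.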

## References

- **JWT Best Practices**: https://tools.ietf.org/html/rfc8725
- **OWASP API Security**: https://owasp.org/www-project-api-security/
- **Superset Security Documentation**: https://superset.apache.org/docs/security
- **Flask-AppBuilder Security**: https://flask-appbuilder.readthedocs.io/en/latest/security.html
- **GDPR Compliance Guide**: https://gdpr.eu/
- **SOC 2 Framework**: https://www.aicpa.org/soc2
- **HIPAA Security Rule**: https://www.hhs.gov/hipaa/for-professionals/security/