# MCP Service Architecture

## Overview

The Apache Superset MCP (Model Context Protocol) service provides programmatic access to Superset functionality through a standardized protocol, enabling AI assistants and automation tools to interact with dashboards, charts, datasets, and SQL Lab.

The MCP service runs as a separate process from the Superset web server, using its own Flask application instance and HTTP server while sharing the same database and configuration as the main Superset application.
## Flask Singleton Pattern

### Why Module-Level Singleton?

The MCP service uses a module-level singleton Flask application instance rather than creating a new app instance per request. This design decision rests on several considerations:

Separate Process Architecture:
- The MCP service runs as an independent process from the Superset web server
- It has its own HTTP server (via FastMCP/Starlette) handling MCP protocol requests
- Each MCP tool invocation occurs within the context of this single, long-lived Flask app
Benefits of Module-Level Singleton:

1. Consistent Database Connection Pool
   - A single SQLAlchemy connection pool is maintained across all tool calls
   - Connections are efficiently reused rather than recreated
   - Connection pool configuration (size, timeout, etc.) behaves predictably
2. Shared Configuration Access
   - Flask app configuration is loaded once at startup
   - All tools access the same configuration state
   - Changes to runtime config affect all subsequent tool calls consistently
3. Thread-Safe Initialization
   - The Flask app is created exactly once using `threading.Lock()`
   - Multiple concurrent requests safely share the same app instance
   - No risk of duplicate initialization or race conditions
4. Lower Resource Overhead
   - No per-request app creation/teardown overhead
   - Memory footprint remains constant regardless of request volume
   - Extension initialization (Flask-AppBuilder, Flask-Migrate, etc.) happens once
When Module-Level Singleton Is Appropriate:
- Service runs as dedicated daemon/process
- Application state is consistent across all requests
- No per-request application context needed
- Long-lived server process with many requests
When Module-Level Singleton Is NOT Appropriate:
- Testing with different configurations (use app fixtures instead)
- Multi-tenant deployments requiring different app configs per tenant
- Dynamic plugin loading requiring app recreation
- Development scenarios requiring hot-reload of app configuration
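The "app fixtures" alternative can be sketched generically. `AppStub` and `create_app` below are illustrative stand-ins for a Flask app and its factory, not part of the MCP service code; real tests would use Flask's app-factory pattern with pytest fixtures:

```python
# Sketch: per-test app factory vs. the process-wide singleton.
# AppStub stands in for a Flask app; the names are hypothetical.

class AppStub:
    def __init__(self, config):
        # Each instance carries its own, independent configuration.
        self.config = dict(config)

def create_app(config=None):
    """Factory: returns a fresh, independently configured app per call."""
    return AppStub(config or {"DEBUG": False})

# Module-level singleton: one shared instance for the daemon process.
app = create_app()

# A test that needs different configuration builds its own instance
# instead of mutating the shared singleton.
test_app = create_app({"DEBUG": True})

assert app.config["DEBUG"] is False
assert test_app.config["DEBUG"] is True
assert app is not test_app  # fixture stays isolated from the singleton
```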
### Implementation Details

The singleton is implemented in `flask_singleton.py`:

```python
# Module-level instance - created once on import
from flask import Flask

from superset.app import create_app
from superset.mcp_service.mcp_config import get_mcp_config

_temp_app = create_app()
with _temp_app.app_context():
    mcp_config = get_mcp_config(_temp_app.config)
    _temp_app.config.update(mcp_config)
app = _temp_app


def get_flask_app() -> Flask:
    """Get the Flask app instance."""
    return app
```
Key characteristics:
- No complex patterns or metaclasses needed
- The module itself acts as the singleton container - a clean, Pythonic approach, since Python modules are themselves initialized only once per process
- The application context is pushed during initialization to avoid "Working outside of application context" errors
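The lock-guarded initialization the text describes can also be written lazily. The sketch below is an alternative formulation, not the shipped module (which creates the app eagerly at import time); `create_app` here is a placeholder for Superset's real factory:

```python
# Sketch: lazy, lock-guarded singleton creation with threading.Lock().
# create_app is a stand-in for superset.app.create_app.
import threading

_app = None
_app_lock = threading.Lock()

def create_app():
    return object()  # placeholder for the real Flask app factory

def get_flask_app():
    """Return the process-wide app, creating it exactly once."""
    global _app
    if _app is None:                  # fast path: no lock once created
        with _app_lock:
            if _app is None:          # double-checked under the lock
                _app = create_app()
    return _app

# Concurrent callers all receive the same instance.
apps = []
threads = [threading.Thread(target=lambda: apps.append(get_flask_app()))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert len(set(map(id, apps))) == 1  # exactly one app was created
```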
## Multitenant Architecture

### Current Implementation

The MCP service uses Option B: Shared Process with Tenant Isolation:
```mermaid
graph LR
    T1[Tenant 1]
    T2[Tenant 2]
    T3[Tenant 3]
    MCP[Single MCP Process]
    DB[(Superset Database)]
    T1 --> MCP
    T2 --> MCP
    T3 --> MCP
    MCP --> DB
    MCP -.->|Isolation via| ISO[User authentication JWT or dev user<br/>Flask-AppBuilder RBAC<br/>Dataset access filters<br/>Row-level security]
    style ISO fill:#f9f,stroke:#333,stroke-width:2px
```
### Tenant Isolation Mechanisms

#### Database Level

Superset's Existing RLS (Row-Level Security):
- RLS rules are defined at the dataset level
- Rules filter queries based on user attributes (e.g., `department = '{{ current_user.department }}'`)
- The MCP service respects all RLS rules automatically through Superset's query execution layer

No Schema-Based Isolation:
- The current implementation does NOT use separate database schemas per tenant
- All Superset metadata (dashboards, charts, datasets) exists in the same database schema
- Database-level isolation is achieved through Superset's permission system rather than physical schema separation
#### Application Level

Flask-AppBuilder Security Manager:
- Every MCP tool call uses the `@mcp_auth_hook` decorator
- The auth hook sets `g.user` to the authenticated user (from JWT or `MCP_DEV_USERNAME`)
- Superset's security manager then enforces permissions based on this user's roles

User-Based Access Control:
- Users can only access resources they have permissions for
- Dashboard ownership and role-based permissions are enforced
- The `can_access_datasource()` method validates dataset access

Dataset Access Filters:
- All list operations (dashboards, charts, datasets) use Superset's access filters:
  - `DashboardAccessFilter` - filters dashboards based on user permissions
  - `ChartAccessFilter` - filters charts based on user permissions
  - `DatasourceFilter` - filters datasets based on user permissions

Row-Level Security Enforcement:
- RLS rules are applied transparently during query execution
- The MCP service makes no modifications to bypass RLS
- SQL queries executed through the `execute_sql` tool respect RLS policies
### JWT Tenant Claims

Development Mode (single user):

```python
# superset_config.py
MCP_DEV_USERNAME = "admin"
```

Production Mode (JWT-based):

```json
{
  "sub": "user@company.com",
  "email": "user@company.com",
  "scopes": ["superset:read", "superset:chart:create"],
  "exp": 1672531200
}
```
Future Enhancement (multi-tenant JWT):

```json
{
  "sub": "user@tenant-a.com",
  "tenant_id": "tenant-a",
  "scopes": ["superset:read"],
  "exp": 1672531200
}
```
The `tenant_id` claim could be used in future versions to:
- Further isolate data by tenant context
- Apply tenant-specific RLS rules
- Log and audit actions by tenant
- Implement tenant-specific rate limits
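Extracting such claims can be sketched with the standard library alone. Note the sketch decodes the payload without verifying the signature; a real deployment must verify the token (e.g. with PyJWT) before trusting any claim:

```python
# Sketch: decoding the (unverified) payload segment of a JWT.
# Signature verification is deliberately omitted - never do this
# in production without verifying the token first.
import base64
import json

def decode_claims(token: str) -> dict:
    """Return the claims dict from a JWT's payload segment."""
    payload_b64 = token.split(".")[1]
    padding = "=" * (-len(payload_b64) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(payload_b64 + padding))

# Build a toy token matching the multi-tenant example above.
claims = {"sub": "user@tenant-a.com", "tenant_id": "tenant-a",
          "scopes": ["superset:read"], "exp": 1672531200}
payload = (base64.urlsafe_b64encode(json.dumps(claims).encode())
           .decode().rstrip("="))
token = f"header.{payload}.signature"

decoded = decode_claims(token)
assert decoded["tenant_id"] == "tenant-a"
assert "superset:read" in decoded["scopes"]
```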
## Process Model

### Single Process Deployment
When to Use:
- Development and testing environments
- Small deployments with low request volume (< 100 requests/minute)
- Single-tenant installations
- Resource-constrained environments
Resource Characteristics:
- Memory: ~500MB-1GB (includes Flask app, SQLAlchemy, screenshot pool)
- CPU: Mostly I/O bound (database queries, screenshot generation)
- Database connections: configurable via `SQLALCHEMY_POOL_SIZE` (default: 5)
Scaling Limitations:
- Single Python process = GIL limitations for CPU-bound operations
- Screenshot generation can block other requests
- Limited horizontal scalability without load balancer
Example Command:

```shell
superset mcp run --port 5008
```
### Multi-Process Deployment

Using Gunicorn Workers:

```shell
gunicorn \
  --workers 4 \
  --bind 0.0.0.0:5008 \
  --worker-class uvicorn.workers.UvicornWorker \
  superset.mcp_service.server:app
```

Configuration Considerations:
- Worker count: `2-4 x CPU cores` (typical recommendation)
- Each worker has its own Flask app instance via the module-level singleton
- Workers share nothing - fully isolated processes
- Database connection pool per worker (watch total connections)
Process Pool Management:
- Use process manager (systemd, supervisord) for auto-restart
- Health checks to detect and restart failed workers
- Graceful shutdown to complete in-flight requests
Load Balancing:
- Use nginx/HAProxy to distribute requests across workers
- Round-robin or least-connections algorithms work well
- Sticky sessions NOT required (stateless API)
### Containerized Deployment

Docker:

```dockerfile
FROM apache/superset:latest
CMD ["superset", "mcp", "run", "--port", "5008"]
```

Kubernetes Deployment:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: superset-mcp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: superset-mcp
  template:
    metadata:
      labels:
        app: superset-mcp
    spec:
      containers:
        - name: mcp
          image: apache/superset:latest
          command: ["superset", "mcp", "run", "--port", "5008"]
          ports:
            - containerPort: 5008
          env:
            - name: SUPERSET_CONFIG_PATH
              value: /app/pythonpath/superset_config.py
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
```
Horizontal Pod Autoscaling:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: superset-mcp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: superset-mcp
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```
Service Mesh Integration:
- Istio/Linkerd can provide:
- Automatic retries and circuit breaking
- Distributed tracing
- Mutual TLS between pods
- Advanced traffic routing
## Database Connection Management

### Connection Pooling

The MCP service uses SQLAlchemy's connection pooling with configuration inherited from Superset:

```python
# superset_config.py
SQLALCHEMY_POOL_SIZE = 5        # Max connections per worker
SQLALCHEMY_POOL_TIMEOUT = 30    # Seconds to wait for a connection
SQLALCHEMY_MAX_OVERFLOW = 10    # Extra connections beyond pool_size
SQLALCHEMY_POOL_RECYCLE = 3600  # Recycle connections after 1 hour
```
Connection Lifecycle:
1. A request arrives at an MCP tool
2. The tool calls a DAO method, which accesses `db.session`
3. SQLAlchemy checks out a connection from the pool
4. The query executes on the borrowed connection
5. The connection returns to the pool (not closed)
6. The connection is reused for the next request
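The checkout/return lifecycle above can be modeled with a stdlib queue. This is a toy illustration, not SQLAlchemy's implementation; SQLAlchemy's `QueuePool` follows the same borrow-and-return discipline with real DBAPI connections:

```python
# Sketch: a minimal connection pool showing the borrow/return cycle.
# "Connections" here are just integer handles.
import queue

class TinyPool:
    def __init__(self, size):
        # LIFO: the most recently returned connection is reused first.
        self._pool = queue.LifoQueue()
        for conn_id in range(size):
            self._pool.put(conn_id)  # pre-create `size` connections

    def checkout(self, timeout=30):
        # Blocks up to `timeout` seconds, like SQLALCHEMY_POOL_TIMEOUT.
        return self._pool.get(timeout=timeout)

    def checkin(self, conn):
        self._pool.put(conn)  # returned to the pool, not closed

pool = TinyPool(size=5)
conn = pool.checkout()
# ... run the query on the borrowed connection ...
pool.checkin(conn)
reused = pool.checkout()
assert reused == conn  # the same connection is handed out again
```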
Pool Size Recommendations:
- Single process: 5-10 connections
- Multi-worker (4 workers): 3-5 connections per worker = 12-20 total
- Monitor: Database max_connections setting must exceed total pool size across all MCP workers
Example with 4 Gunicorn workers:

```python
SQLALCHEMY_POOL_SIZE = 5
SQLALCHEMY_MAX_OVERFLOW = 5
# Total potential connections: 4 workers × (5 + 5) = 40 connections
# Ensure the database supports 40+ connections
```
### Transaction Handling

MCP Tool Transaction Pattern:

```python
@mcp.tool
@mcp_auth_hook
def my_tool(param: str) -> Result:
    # Auth hook sets g.user and manages the session
    try:
        # Tool executes within an implicit transaction
        result = DashboardDAO.find_by_id(123)
        return Result(data=result)
    except Exception:
        # On error: rollback happens in the auth hook's except block
        raise
    finally:
        # On success: rollback happens in the auth hook's finally block
        # (read-only operations don't commit)
        pass
```
Session Cleanup in Auth Hook:

The `@mcp_auth_hook` decorator manages the session lifecycle:

```python
# On the error path
except Exception:
    try:
        db.session.rollback()
        db.session.remove()
    except Exception as e:
        logger.warning("Error cleaning up session: %s", e)
    raise

# On the success path (finally block)
finally:
    try:
        if db.session.is_active:
            db.session.rollback()  # Cleanup, don't commit
    except Exception as e:
        logger.warning("Error in finally block: %s", e)
```
Why Rollback on Success?
- MCP tools are primarily read-only operations
- No explicit commits needed for queries
- Rollback ensures clean slate for next request
- Write operations (create chart, etc.) use Superset's command pattern which handles commits internally
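The rollback-on-both-paths discipline can be demonstrated with a stand-in session object. `FakeSession` and `with_session_cleanup` below are illustrative only; the real hook operates on Superset's `db.session`:

```python
# Sketch: a decorator that rolls back on both the error and success
# paths, mirroring the auth hook's cleanup behavior.
import functools

class FakeSession:
    """Stand-in for db.session, just to observe rollback calls."""
    def __init__(self):
        self.rolled_back = False
        self.is_active = True
    def rollback(self):
        self.rolled_back = True

session = FakeSession()

def with_session_cleanup(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            session.rollback()  # error path: discard partial state
            raise
        finally:
            if session.is_active:
                session.rollback()  # success path: clean slate, no commit
    return wrapper

@with_session_cleanup
def read_only_tool():
    return "result"

assert read_only_tool() == "result"
assert session.rolled_back  # rolled back even though the call succeeded
```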
## Deployment Considerations

### Resource Requirements
Memory Per Process:
- Base Flask app: ~200MB
- SQLAlchemy + models: ~100MB
- WebDriver pool (if screenshots enabled): ~200MB
- Request processing overhead: ~50MB per concurrent request
- Total: 500MB-1GB per process
CPU Usage Patterns:
- I/O bound: Most time spent waiting on database/screenshots
- Low CPU during normal operations (< 20% per core)
- CPU spikes during:
- Screenshot generation (WebDriver rendering)
- Large dataset query processing
- Complex chart configuration validation
Database Connections:
- Single process: 5-10 connections (pool_size + max_overflow)
- Multi-process: `(pool_size + max_overflow) × worker_count`
- Example: 4 workers × 10 max connections = 40 total database connections
### Scaling Strategy
When to Scale Horizontally:
- Request latency increases beyond acceptable threshold (e.g., p95 > 2 seconds)
- CPU utilization consistently > 70%
- Request queue depth growing
- Database connection pool frequently exhausted
Load Balancing Between MCP Instances:

Option 1: Nginx Round-Robin:

```nginx
upstream mcp_backend {
    server mcp-1:5008;
    server mcp-2:5008;
    server mcp-3:5008;
}

server {
    location / {
        proxy_pass http://mcp_backend;
    }
}
```
Option 2: Kubernetes Service:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: superset-mcp
spec:
  selector:
    app: superset-mcp
  ports:
    - port: 5008
      targetPort: 5008
  type: ClusterIP
```
Session Affinity:
- NOT required - MCP service is stateless
- Each request is independent
- No session state maintained between requests
- Load balancer can freely distribute requests
### High Availability
Multiple MCP Instances:
- Deploy at least 2 instances for redundancy
- Use load balancer health checks to detect failures
- Failed instances automatically removed from rotation
Health Checks:

The MCP service provides a health check tool:

```python
# Internal health check
@mcp.tool
def health_check() -> HealthCheckResponse:
    return HealthCheckResponse(
        status="healthy",
        timestamp=datetime.now(timezone.utc),
        database_connection="ok",
    )
```
Load balancer health check:

```nginx
# Nginx example
upstream mcp_backend {
    server mcp-1:5008 max_fails=3 fail_timeout=30s;
    server mcp-2:5008 max_fails=3 fail_timeout=30s;
}
```
Kubernetes health check:

```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 5008
  initialDelaySeconds: 30
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /health
    port: 5008
  initialDelaySeconds: 10
  periodSeconds: 5
```
Failover Handling:
- Load balancer automatically routes around failed instances
- MCP clients should implement retry logic for transient failures
- Use circuit breaker pattern for repeated failures
- Monitor and alert on instance failures
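Client-side retry for transient failures can be sketched with exponential backoff. `flaky_call` below is a stand-in that fails twice before succeeding; a real MCP client would wrap its HTTP request at that point:

```python
# Sketch: retry with exponential backoff for transient failures.
import time

def retry(call, attempts=4, base_delay=0.01):
    """Invoke `call`, retrying ConnectionError with growing delays."""
    for attempt in range(attempts):
        try:
            return call()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # retries exhausted: surface the failure
            time.sleep(base_delay * (2 ** attempt))  # 10ms, 20ms, 40ms...

failures = {"left": 2}

def flaky_call():
    # Simulated transient failure: errors twice, then succeeds.
    if failures["left"] > 0:
        failures["left"] -= 1
        raise ConnectionError("transient")
    return "ok"

assert retry(flaky_call) == "ok"
```

A circuit breaker extends this idea by refusing calls outright for a cooldown period once failures repeat, instead of retrying indefinitely.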
### Database Considerations
Shared Database with Superset:
- MCP service and Superset web server share the same database
- Same SQLAlchemy models and schema
- Database migrations applied once, affect both services
Connection Pool Sizing:

```
Total DB Connections =
    Superset Web (workers × pool_size) +
    MCP Service (workers × pool_size) +
    Other services

Must be < Database max_connections
```
Example Calculation:
- Superset web: 8 workers × 10 connections = 80
- MCP service: 4 workers × 10 connections = 40
- Other: 20 reserved
- Total: 140 connections
- Database: Set max_connections >= 150
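The budget arithmetic above can be captured as a small check. The numbers mirror the example calculation; adjust them to your deployment:

```python
# Sketch: connection-budget arithmetic for sizing max_connections.
def total_connections(workers, pool_size, max_overflow):
    """Worst-case connections one service can open."""
    return workers * (pool_size + max_overflow)

# Numbers from the example above (10 = pool_size + overflow combined).
superset_web = total_connections(workers=8, pool_size=10, max_overflow=0)
mcp_service = total_connections(workers=4, pool_size=10, max_overflow=0)
reserved = 20  # other services

budget = superset_web + mcp_service + reserved
assert budget == 140
assert budget < 150  # database max_connections must exceed the budget
```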
### Monitoring Recommendations
Key Metrics to Track:
- Request rate per tool
- Request latency (p50, p95, p99)
- Error rate by tool and error type
- Database connection pool utilization
- Memory usage per process
- Active concurrent requests
Example Prometheus Metrics (future implementation):

```
mcp_requests_total{tool="list_charts", status="success"}
mcp_request_duration_seconds{tool="list_charts", quantile="0.95"}
mcp_database_connections_active
mcp_database_connections_idle
mcp_memory_usage_bytes
```
Log Aggregation:
- Centralize logs from all MCP instances
- Use structured logging (JSON format)
- Include trace IDs for request correlation
- Alert on error rate spikes
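Structured logging with trace IDs can be sketched with the stdlib alone. The field names below are illustrative, not a fixed schema; production setups often use a library such as `python-json-logger` instead:

```python
# Sketch: JSON log lines carrying a trace ID for request correlation.
import json
import logging
import uuid

class JsonFormatter(logging.Formatter):
    def format(self, record):
        # Emit one JSON object per log line.
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            "trace_id": getattr(record, "trace_id", None),
        })

logger = logging.getLogger("mcp")

# Each request gets a trace_id so its log lines can be correlated
# across instances after aggregation.
trace_id = str(uuid.uuid4())
record = logger.makeRecord("mcp", logging.INFO, __file__, 0,
                           "tool invoked", None, None,
                           extra={"trace_id": trace_id})
line = JsonFormatter().format(record)
assert json.loads(line)["trace_id"] == trace_id
```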
## Architecture Diagrams

### Request Flow

```mermaid
sequenceDiagram
    participant Client as MCP Client<br/>(Claude/automation)
    participant FastMCP as FastMCP Server<br/>(Starlette/Uvicorn)
    participant Auth as MCP Auth Hook
    participant Tool as Tool Implementation<br/>(e.g., list_charts)
    participant DAO as Superset DAO Layer<br/>(ChartDAO, DashboardDAO)
    participant DB as Database<br/>(PostgreSQL/MySQL)
    Client->>FastMCP: MCP Protocol (HTTP/SSE)
    FastMCP->>Auth: @mcp.tool decorator
    Auth->>Auth: Sets g.user, manages session
    Auth->>Tool: Execute tool
    Tool->>DAO: Uses DAO pattern
    DAO->>DB: SQLAlchemy ORM
    DB-->>DAO: Query results
    DAO-->>Tool: Processed data
    Tool-->>Auth: Tool response
    Auth-->>FastMCP: Response with cleanup
    FastMCP-->>Client: MCP response
```
### Multi-Instance Deployment

```mermaid
graph TD
    LB[Load Balancer<br/>Nginx/K8s Service]
    MCP1[MCP Instance 1<br/>port 5008]
    MCP2[MCP Instance 2<br/>port 5008]
    MCP3[MCP Instance 3<br/>port 5008]
    DB[(Superset Database<br/>shared connection pool)]
    LB --> MCP1
    LB --> MCP2
    LB --> MCP3
    MCP1 --> DB
    MCP2 --> DB
    MCP3 --> DB
```
### Tenant Isolation

```mermaid
graph TD
    UserA[User A<br/>JWT: tenant=acme]
    UserB[User B<br/>JWT: tenant=beta]
    MCP[MCP Service<br/>single process]
    Auth[@mcp_auth_hook<br/>Sets g.user from JWT]
    RBAC[Flask-AppBuilder<br/>RBAC]
    Filters[Dataset Access<br/>Filters]
    DB[(Superset Database<br/>single schema, filtered by permissions)]
    UserA --> MCP
    UserB --> MCP
    MCP --> Auth
    Auth --> RBAC
    Auth --> Filters
    RBAC --> |User A sees only<br/>acme dashboards| DB
    Filters --> |User A queries filtered<br/>by RLS rules for acme| DB
```
## Comparison with Alternative Architectures

### Module-Level Singleton (Current) vs Per-Request App
| Aspect | Module-Level Singleton | Per-Request App |
|---|---|---|
| Connection Pool | Single shared pool | New pool per request |
| Memory Overhead | Constant (~500MB) | 500MB × concurrent requests |
| Thread Safety | Must ensure thread-safe access | Each request isolated |
| Configuration | Loaded once at startup | Can vary per request |
| Performance | Fast (no setup overhead) | Slow (initialization cost) |
| Use Case | Production daemon | Testing/multi-config scenarios |
### Shared Process (Current) vs Separate Process Per Tenant
| Aspect | Shared Process | Process Per Tenant |
|---|---|---|
| Isolation | Application-level (RBAC/RLS) | Process-level (OS isolation) |
| Resource Usage | Efficient (shared resources) | Higher (duplicate resources) |
| Scaling | Horizontal (add instances) | Vertical (more processes) |
| Complexity | Simpler deployment | Complex orchestration |
| Security | Depends on Superset RBAC | Stronger isolation |
| Use Case | Most deployments | High-security multi-tenant |
## Future Architectural Considerations

### Async/Await Support

The current implementation uses synchronous request handling. Future versions could:
- Use `async`/`await` for I/O operations
- Implement connection pooling with `asyncpg` (PostgreSQL) or `aiomysql` (MySQL)
- Improve throughput for I/O-bound operations
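The throughput benefit can be illustrated with the stdlib: `asyncio.sleep` below stands in for a database round trip, and four simulated queries complete in roughly the time of one instead of running back to back:

```python
# Sketch: concurrent I/O with async/await vs. sequential waiting.
import asyncio
import time

async def fake_query(n):
    await asyncio.sleep(0.05)  # simulated I/O wait (database round trip)
    return n * 2

async def main():
    start = time.monotonic()
    # All four "queries" wait concurrently, not sequentially.
    results = await asyncio.gather(*(fake_query(n) for n in range(4)))
    elapsed = time.monotonic() - start
    return results, elapsed

results, elapsed = asyncio.run(main())
assert results == [0, 2, 4, 6]
assert elapsed < 0.2  # ~0.05s total, not 4 × 0.05s
```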
### Caching Layer
Adding caching between MCP service and database:
- Redis cache for frequently accessed resources (dashboards, charts, datasets)
- Cache invalidation on updates
- Reduced database load for read-heavy workloads
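The TTL-plus-invalidation pattern can be sketched with the stdlib; a real deployment would back this with Redis, but the interface is the same idea. All names here are illustrative:

```python
# Sketch: a TTL cache with explicit invalidation on updates.
import time

class TTLCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expiry timestamp)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires = entry
        if time.monotonic() > expires:
            del self._store[key]  # expired: evict lazily on read
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)

    def invalidate(self, key):
        self._store.pop(key, None)  # called when the resource changes

cache = TTLCache(ttl_seconds=60)
cache.set("dashboard:42", {"title": "Sales"})
assert cache.get("dashboard:42") == {"title": "Sales"}
cache.invalidate("dashboard:42")  # e.g. after the dashboard is edited
assert cache.get("dashboard:42") is None
```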
### Event-Driven Updates
WebSocket support for real-time updates:
- Push notifications when dashboards/charts change
- Streaming query results for large datasets
- Live dashboard editing collaboration
## References
- Flask Application Context: https://flask.palletsprojects.com/en/stable/appcontext/
- SQLAlchemy Connection Pooling: https://docs.sqlalchemy.org/en/stable/core/pooling.html
- FastMCP Documentation: https://github.com/jlowin/fastmcp
- Superset Security Model: https://superset.apache.org/docs/security