
MCP Service Architecture

Overview

The Apache Superset MCP (Model Context Protocol) service provides programmatic access to Superset functionality through a standardized protocol that enables AI assistants and automation tools to interact with dashboards, charts, datasets, and SQL Lab.

The MCP service runs as a separate process from the Superset web server, using its own Flask application instance and HTTP server while sharing the same database and configuration with the main Superset application.

Flask Singleton Pattern

Why Module-Level Singleton?

The MCP service uses a module-level singleton Flask application instance rather than creating a new app instance per request. This design decision is based on several important considerations:

Separate Process Architecture:

  • The MCP service runs as an independent process from the Superset web server
  • It has its own HTTP server (via FastMCP/Starlette) handling MCP protocol requests
  • Each MCP tool invocation occurs within the context of this single, long-lived Flask app

Benefits of Module-Level Singleton:

  1. Consistent Database Connection Pool

    • A single SQLAlchemy connection pool is maintained across all tool calls
    • Connections are efficiently reused rather than recreated
    • Connection pool configuration (size, timeout, etc.) behaves predictably
  2. Shared Configuration Access

    • Flask app configuration is loaded once at startup
    • All tools access the same configuration state
    • Changes to runtime config affect all subsequent tool calls consistently
  3. Thread-Safe Initialization

    • The Flask app is created exactly once using threading.Lock()
    • Multiple concurrent requests safely share the same app instance
    • No risk of duplicate initialization or race conditions
  4. Lower Resource Overhead

    • No per-request app creation/teardown overhead
    • Memory footprint remains constant regardless of request volume
    • Extension initialization (Flask-AppBuilder, Flask-Migrate, etc.) happens once
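The thread-safe, create-exactly-once behavior described above can be sketched in isolation (a simplified stand-in; the real service builds the app via create_app() rather than the placeholder class used here):

```python
import threading

class App:
    """Stand-in for a Flask application (illustration only)."""
    def __init__(self) -> None:
        self.config: dict = {}

_app = None
_lock = threading.Lock()

def get_app() -> App:
    """Return the process-wide app, creating it exactly once."""
    global _app
    if _app is None:            # fast path: skip the lock once initialized
        with _lock:
            if _app is None:    # double-checked: only one thread creates
                _app = App()
    return _app
```

Concurrent callers all receive the same instance; the double-checked locking avoids taking the lock on every call after initialization.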

When Module-Level Singleton Is Appropriate:

  • Service runs as dedicated daemon/process
  • Application state is consistent across all requests
  • No per-request application context needed
  • Long-lived server process with many requests

When Module-Level Singleton Is NOT Appropriate:

  • Testing with different configurations (use app fixtures instead)
  • Multi-tenant deployments requiring different app configs per tenant
  • Dynamic plugin loading requiring app recreation
  • Development scenarios requiring hot-reload of app configuration

Implementation Details

The singleton is implemented in flask_singleton.py:

# Module-level instance - created once on import
from flask import Flask

from superset.app import create_app
from superset.mcp_service.mcp_config import get_mcp_config

_temp_app = create_app()

with _temp_app.app_context():
    mcp_config = get_mcp_config(_temp_app.config)
    _temp_app.config.update(mcp_config)

app = _temp_app

def get_flask_app() -> Flask:
    """Get the Flask app instance."""
    return app

Key characteristics:

  • No complex patterns or metaclasses needed
  • The module itself acts as the singleton container
  • Clean, Pythonic approach: using the module itself as the singleton is a well-established Python idiom
  • Application context pushed during initialization to avoid "Working outside of application context" errors

Multitenant Architecture

Current Implementation

The MCP service uses Option B: Shared Process with Tenant Isolation:

graph LR
    T1[Tenant 1]
    T2[Tenant 2]
    T3[Tenant 3]
    MCP[Single MCP Process]
    DB[(Superset Database)]

    T1 --> MCP
    T2 --> MCP
    T3 --> MCP
    MCP --> DB

    MCP -.->|Isolation via| ISO[User authentication JWT or dev user<br/>Flask-AppBuilder RBAC<br/>Dataset access filters<br/>Row-level security]

    style ISO fill:#f9f,stroke:#333,stroke-width:2px

Tenant Isolation Mechanisms

Database Level

Superset's Existing RLS (Row-Level Security):

  • RLS rules are defined at the dataset level
  • Rules filter queries based on user attributes (e.g., department = '{{ current_user.department }}')
  • The MCP service respects all RLS rules automatically through Superset's query execution layer

No Schema-Based Isolation:

  • The current implementation does NOT use separate database schemas per tenant
  • All Superset metadata (dashboards, charts, datasets) exists in the same database schema
  • Database-level isolation is achieved through Superset's permission system rather than physical schema separation

Application Level

Flask-AppBuilder Security Manager:

  • Every MCP tool call uses @mcp_auth_hook decorator
  • The auth hook sets g.user to the authenticated user (from JWT or MCP_DEV_USERNAME)
  • Superset's security manager then enforces permissions based on this user's roles
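In outline, the auth hook behaves like the following (a simplified sketch: the real decorator resolves the user from a JWT or MCP_DEV_USERNAME and works with Flask's g object and db.session; RequestContext and resolve_user here are illustrative stand-ins):

```python
import functools

class RequestContext:
    """Stand-in for Flask's per-request `g` object (illustration only)."""
    user = None

g = RequestContext()

def resolve_user(token):
    # Hypothetical resolution: JWT subject in production, dev user otherwise.
    return token or "admin"

def mcp_auth_hook(func):
    """Set g.user before the tool runs and clear it afterwards."""
    @functools.wraps(func)
    def wrapper(*args, token=None, **kwargs):
        g.user = resolve_user(token)
        try:
            return func(*args, **kwargs)
        finally:
            g.user = None  # user/session cleanup after every call
    return wrapper

@mcp_auth_hook
def whoami_tool() -> str:
    return f"running as {g.user}"
```

Every tool call thus runs with an authenticated identity in place, and cleanup happens even when the tool raises.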

User-Based Access Control:

  • Users can only access resources they have permissions for
  • Dashboard ownership and role-based permissions are enforced
  • The can_access_datasource() method validates dataset access

Dataset Access Filters:

  • All list operations (dashboards, charts, datasets) use Superset's access filters:
    • DashboardAccessFilter - filters dashboards based on user permissions
    • ChartAccessFilter - filters charts based on user permissions
    • DatasourceFilter - filters datasets based on user permissions

Row-Level Security Enforcement:

  • RLS rules are applied transparently during query execution
  • The MCP service makes no modifications to bypass RLS
  • SQL queries executed through execute_sql tool respect RLS policies

JWT Tenant Claims

Development Mode (single user):

# superset_config.py
MCP_DEV_USERNAME = "admin"

Production Mode (JWT-based):

{
  "sub": "user@company.com",
  "email": "user@company.com",
  "scopes": ["superset:read", "superset:chart:create"],
  "exp": 1672531200
}

Future Enhancement (multi-tenant JWT):

{
  "sub": "user@tenant-a.com",
  "tenant_id": "tenant-a",
  "scopes": ["superset:read"],
  "exp": 1672531200
}

The tenant_id claim could be used in future versions to:

  • Further isolate data by tenant context
  • Apply tenant-specific RLS rules
  • Log and audit actions by tenant
  • Implement tenant-specific rate limits
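A sketch of how a tenant_id claim could be read from such a token (illustrative only; a real deployment would verify the signature with a JWT library rather than decoding the payload by hand):

```python
import base64
import json

def unverified_claims(jwt_token: str) -> dict:
    """Decode the JWT payload WITHOUT signature verification (demo only)."""
    payload_b64 = jwt_token.split(".")[1]
    # Restore padding stripped by JWT's base64url encoding
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# Build a token body matching the multi-tenant example above
claims = {"sub": "user@tenant-a.com", "tenant_id": "tenant-a",
          "scopes": ["superset:read"], "exp": 1672531200}
payload = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode().rstrip("=")
token = f"header.{payload}.signature"

decoded = unverified_claims(token)
```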

Process Model

Single Process Deployment

When to Use:

  • Development and testing environments
  • Small deployments with low request volume (< 100 requests/minute)
  • Single-tenant installations
  • Resource-constrained environments

Resource Characteristics:

  • Memory: ~500MB-1GB (includes Flask app, SQLAlchemy, screenshot pool)
  • CPU: Mostly I/O bound (database queries, screenshot generation)
  • Database connections: Configurable via SQLALCHEMY_POOL_SIZE (default: 5)

Scaling Limitations:

  • Single Python process = GIL limitations for CPU-bound operations
  • Screenshot generation can block other requests
  • Limited horizontal scalability without load balancer

Example Command:

superset mcp run --port 5008

Multi-Process Deployment

Using Gunicorn Workers:

gunicorn \
  --workers 4 \
  --bind 0.0.0.0:5008 \
  --worker-class uvicorn.workers.UvicornWorker \
  superset.mcp_service.server:app

Configuration Considerations:

  • Worker count: 2-4 × CPU cores (typical recommendation)
  • Each worker has its own Flask app instance via the module-level singleton
  • Workers share nothing: fully isolated processes
  • Database connection pool per worker (watch total connections)

Process Pool Management:

  • Use process manager (systemd, supervisord) for auto-restart
  • Health checks to detect and restart failed workers
  • Graceful shutdown to complete in-flight requests
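Graceful shutdown can be sketched with a signal handler that stops accepting new work while letting in-flight requests finish (a simplified illustration; Gunicorn and Uvicorn already implement this drain behavior for their workers):

```python
import signal
import threading

shutting_down = threading.Event()
in_flight = 0
lock = threading.Lock()

def handle_sigterm(signum, frame):
    """Stop accepting new requests; in-flight work is allowed to finish."""
    shutting_down.set()

signal.signal(signal.SIGTERM, handle_sigterm)

def try_handle_request(work) -> bool:
    """Run `work` unless shutdown has begun; returns False if rejected."""
    global in_flight
    if shutting_down.is_set():
        return False
    with lock:
        in_flight += 1
    try:
        work()
    finally:
        with lock:
            in_flight -= 1
    return True
```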

Load Balancing:

  • Use nginx/HAProxy to distribute requests across workers
  • Round-robin or least-connections algorithms work well
  • Sticky sessions NOT required (stateless API)

Containerized Deployment

Docker:

FROM apache/superset:latest
CMD ["superset", "mcp", "run", "--port", "5008"]

Kubernetes Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: superset-mcp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: superset-mcp
  template:
    metadata:
      labels:
        app: superset-mcp
    spec:
      containers:
      - name: mcp
        image: apache/superset:latest
        command: ["superset", "mcp", "run", "--port", "5008"]
        ports:
        - containerPort: 5008
        env:
        - name: SUPERSET_CONFIG_PATH
          value: /app/pythonpath/superset_config.py
        resources:
          requests:
            memory: "512Mi"
            cpu: "500m"
          limits:
            memory: "1Gi"
            cpu: "1000m"

Horizontal Pod Autoscaling:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: superset-mcp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: superset-mcp
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

Service Mesh Integration:

  • Istio/Linkerd can provide:
    • Automatic retries and circuit breaking
    • Distributed tracing
    • Mutual TLS between pods
    • Advanced traffic routing

Database Connection Management

Connection Pooling

The MCP service uses SQLAlchemy's connection pooling with configuration inherited from Superset:

# superset_config.py
SQLALCHEMY_POOL_SIZE = 5        # Max connections per worker
SQLALCHEMY_POOL_TIMEOUT = 30    # Seconds to wait for connection
SQLALCHEMY_MAX_OVERFLOW = 10    # Extra connections beyond pool_size
SQLALCHEMY_POOL_RECYCLE = 3600  # Recycle connections after 1 hour

Connection Lifecycle:

  1. Request arrives at MCP tool
  2. Tool calls DAO method which accesses db.session
  3. SQLAlchemy checks out connection from pool
  4. Query executes on borrowed connection
  5. Connection returns to pool (not closed)
  6. Connection reused for next request
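The checkout/return cycle above can be observed directly with SQLAlchemy (a standalone sketch using SQLite; the MCP service gets the same behavior through Superset's configured engine):

```python
from sqlalchemy import create_engine, text
from sqlalchemy.pool import QueuePool

# Small pool, mirroring SQLALCHEMY_POOL_SIZE / SQLALCHEMY_MAX_OVERFLOW
engine = create_engine(
    "sqlite:///:memory:",
    poolclass=QueuePool,
    pool_size=2,
    max_overflow=1,
)

conn = engine.connect()            # step 3: connection checked out of the pool
conn.execute(text("SELECT 1"))     # step 4: query runs on the borrowed connection
conn.close()                       # step 5: returned to the pool, not closed
```

After `close()`, `engine.pool.checkedout()` drops back to zero and the same underlying connection is handed to the next request.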

Pool Size Recommendations:

  • Single process: 5-10 connections
  • Multi-worker (4 workers): 3-5 connections per worker = 12-20 total
  • Monitor: Database max_connections setting must exceed total pool size across all MCP workers

Example with 4 Gunicorn workers:

SQLALCHEMY_POOL_SIZE = 5
SQLALCHEMY_MAX_OVERFLOW = 5
# Total potential connections: 4 workers × (5 + 5) = 40 connections
# Ensure database supports 40+ connections

Transaction Handling

MCP Tool Transaction Pattern:

@mcp.tool
@mcp_auth_hook
def my_tool(param: str) -> Result:
    # Auth hook sets g.user and manages session
    try:
        # Tool executes within implicit transaction
        result = DashboardDAO.find_by_id(123)
        return Result(data=result)
    except Exception:
        # On error: rollback happens in auth hook's except block
        raise
    finally:
        # On success: rollback happens in auth hook's finally block
        # (read-only operations don't commit)
        pass

Session Cleanup in Auth Hook:

The @mcp_auth_hook decorator manages session lifecycle:

# On error path
except Exception:
    try:
        db.session.rollback()
        db.session.remove()
    except Exception as e:
        logger.warning("Error cleaning up session: %s", e)
    raise

# On success path (finally block)
finally:
    try:
        if db.session.is_active:
            db.session.rollback()  # Cleanup, don't commit
    except Exception as e:
        logger.warning("Error in finally block: %s", e)

Why Rollback on Success?

  • MCP tools are primarily read-only operations
  • No explicit commits needed for queries
  • Rollback ensures clean slate for next request
  • Write operations (create chart, etc.) use Superset's command pattern which handles commits internally

Deployment Considerations

Resource Requirements

Memory Per Process:

  • Base Flask app: ~200MB
  • SQLAlchemy + models: ~100MB
  • WebDriver pool (if screenshots enabled): ~200MB
  • Request processing overhead: ~50MB per concurrent request
  • Total: 500MB-1GB per process

CPU Usage Patterns:

  • I/O bound: Most time spent waiting on database/screenshots
  • Low CPU during normal operations (< 20% per core)
  • CPU spikes during:
    • Screenshot generation (WebDriver rendering)
    • Large dataset query processing
    • Complex chart configuration validation

Database Connections:

  • Single process: 5-10 connections (pool_size + max_overflow)
  • Multi-process: (pool_size + max_overflow) × worker_count
  • Example: 4 workers × 10 max connections = 40 total database connections

Scaling Strategy

When to Scale Horizontally:

  • Request latency increases beyond acceptable threshold (e.g., p95 > 2 seconds)
  • CPU utilization consistently > 70%
  • Request queue depth growing
  • Database connection pool frequently exhausted

Load Balancing Between MCP Instances:

Option 1: Nginx Round-Robin:

upstream mcp_backend {
    server mcp-1:5008;
    server mcp-2:5008;
    server mcp-3:5008;
}

server {
    location / {
        proxy_pass http://mcp_backend;
    }
}

Option 2: Kubernetes Service:

apiVersion: v1
kind: Service
metadata:
  name: superset-mcp
spec:
  selector:
    app: superset-mcp
  ports:
  - port: 5008
    targetPort: 5008
  type: ClusterIP

Session Affinity:

  • NOT required - MCP service is stateless
  • Each request is independent
  • No session state maintained between requests
  • Load balancer can freely distribute requests

High Availability

Multiple MCP Instances:

  • Deploy at least 2 instances for redundancy
  • Use load balancer health checks to detect failures
  • Failed instances automatically removed from rotation

Health Checks:

The MCP service provides a health check tool:

# Internal health check
from datetime import datetime, timezone

@mcp.tool
def health_check() -> HealthCheckResponse:
    return HealthCheckResponse(
        status="healthy",
        timestamp=datetime.now(timezone.utc),
        database_connection="ok"
    )

Load balancer health check:

# Nginx example
upstream mcp_backend {
    server mcp-1:5008 max_fails=3 fail_timeout=30s;
    server mcp-2:5008 max_fails=3 fail_timeout=30s;
}

Kubernetes health check:

livenessProbe:
  httpGet:
    path: /health
    port: 5008
  initialDelaySeconds: 30
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /health
    port: 5008
  initialDelaySeconds: 10
  periodSeconds: 5

Failover Handling:

  • Load balancer automatically routes around failed instances
  • MCP clients should implement retry logic for transient failures
  • Use circuit breaker pattern for repeated failures
  • Monitor and alert on instance failures
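The client-side retry logic recommended above can be sketched as follows (an illustrative helper; production clients may prefer a dedicated retry library):

```python
import time

def call_with_retries(func, attempts=3, base_delay=0.1):
    """Retry a call that may fail transiently, with exponential backoff."""
    for attempt in range(attempts):
        try:
            return func()
        except ConnectionError:
            if attempt == attempts - 1:
                raise                                # exhausted: surface the failure
            time.sleep(base_delay * (2 ** attempt))  # 0.1s, 0.2s, 0.4s, ...
```

A circuit breaker would sit one level above this: after repeated exhausted retries it stops calling the instance entirely for a cool-down period.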

Database Considerations

Shared Database with Superset:

  • MCP service and Superset web server share the same database
  • Same SQLAlchemy models and schema
  • Database migrations applied once, affect both services

Connection Pool Sizing:

Total DB Connections =
  Superset Web (workers × pool_size) +
  MCP Service (workers × pool_size) +
  Other services

Must be < Database max_connections

Example Calculation:

  • Superset web: 8 workers × 10 connections = 80
  • MCP service: 4 workers × 10 connections = 40
  • Other: 20 reserved
  • Total: 140 connections
  • Database: Set max_connections >= 150
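The sizing arithmetic above can be captured in a small helper (names are illustrative):

```python
def total_db_connections(services, reserved=0):
    """Sum worst-case connections: workers × (pool_size + max_overflow) per service."""
    total = reserved
    for workers, pool_size, max_overflow in services:
        total += workers * (pool_size + max_overflow)
    return total

# Superset web: 8 workers × (5 + 5); MCP: 4 workers × (5 + 5); 20 reserved
needed = total_db_connections(
    [(8, 5, 5), (4, 5, 5)],  # (workers, pool_size, max_overflow)
    reserved=20,
)
# needed == 140, so set the database's max_connections to at least 150
```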

Monitoring Recommendations

Key Metrics to Track:

  • Request rate per tool
  • Request latency (p50, p95, p99)
  • Error rate by tool and error type
  • Database connection pool utilization
  • Memory usage per process
  • Active concurrent requests

Example Prometheus Metrics (future implementation):

mcp_requests_total{tool="list_charts", status="success"}
mcp_request_duration_seconds{tool="list_charts", quantile="0.95"}
mcp_database_connections_active
mcp_database_connections_idle
mcp_memory_usage_bytes
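Until native Prometheus support lands, a minimal in-process tracker for these metrics might look like this (an illustrative sketch only, using a nearest-rank quantile rather than a real histogram):

```python
import time
from collections import defaultdict

class ToolMetrics:
    """Track per-tool request counts and latencies in-process."""
    def __init__(self):
        self.counts = defaultdict(int)      # (tool, status) -> count
        self.latencies = defaultdict(list)  # tool -> [seconds, ...]

    def observe(self, tool, status, seconds):
        self.counts[(tool, status)] += 1
        self.latencies[tool].append(seconds)

    def quantile(self, tool, q):
        """Nearest-rank quantile of recorded latencies (e.g. q=0.95)."""
        values = sorted(self.latencies[tool])
        index = min(int(q * len(values)), len(values) - 1)
        return values[index]

metrics = ToolMetrics()
start = time.perf_counter()
# ... tool body would run here ...
metrics.observe("list_charts", "success", time.perf_counter() - start)
```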

Log Aggregation:

  • Centralize logs from all MCP instances
  • Use structured logging (JSON format)
  • Include trace IDs for request correlation
  • Alert on error rate spikes

Architecture Diagrams

Request Flow

sequenceDiagram
    participant Client as MCP Client<br/>(Claude/automation)
    participant FastMCP as FastMCP Server<br/>(Starlette/Uvicorn)
    participant Auth as MCP Auth Hook
    participant Tool as Tool Implementation<br/>(e.g., list_charts)
    participant DAO as Superset DAO Layer<br/>(ChartDAO, DashboardDAO)
    participant DB as Database<br/>(PostgreSQL/MySQL)

    Client->>FastMCP: MCP Protocol (HTTP/SSE)
    FastMCP->>Auth: @mcp.tool decorator
    Auth->>Auth: Sets g.user, manages session
    Auth->>Tool: Execute tool
    Tool->>DAO: Uses DAO pattern
    DAO->>DB: SQLAlchemy ORM
    DB-->>DAO: Query results
    DAO-->>Tool: Processed data
    Tool-->>Auth: Tool response
    Auth-->>FastMCP: Response with cleanup
    FastMCP-->>Client: MCP response

Multi-Instance Deployment

graph TD
    LB[Load Balancer<br/>Nginx/K8s Service]
    MCP1[MCP Instance 1<br/>port 5008]
    MCP2[MCP Instance 2<br/>port 5008]
    MCP3[MCP Instance 3<br/>port 5008]
    DB[(Superset Database<br/>shared connection pool)]

    LB --> MCP1
    LB --> MCP2
    LB --> MCP3
    MCP1 --> DB
    MCP2 --> DB
    MCP3 --> DB

Tenant Isolation

graph TD
    UserA[User A<br/>JWT: tenant=acme]
    UserB[User B<br/>JWT: tenant=beta]
    MCP[MCP Service<br/>single process]
    Auth[@mcp_auth_hook<br/>Sets g.user from JWT]
    RBAC[Flask-AppBuilder<br/>RBAC]
    Filters[Dataset Access<br/>Filters]
    DB[(Superset Database<br/>single schema, filtered by permissions)]

    UserA --> MCP
    UserB --> MCP
    MCP --> Auth
    Auth --> RBAC
    Auth --> Filters
    RBAC --> |User A sees only<br/>acme dashboards| DB
    Filters --> |User A queries filtered<br/>by RLS rules for acme| DB

Comparison with Alternative Architectures

Module-Level Singleton (Current) vs Per-Request App

| Aspect | Module-Level Singleton | Per-Request App |
| --- | --- | --- |
| Connection Pool | Single shared pool | New pool per request |
| Memory Overhead | Constant (~500MB) | 500MB × concurrent requests |
| Thread Safety | Must ensure thread-safe access | Each request isolated |
| Configuration | Loaded once at startup | Can vary per request |
| Performance | Fast (no setup overhead) | Slow (initialization cost) |
| Use Case | Production daemon | Testing/multi-config scenarios |

Shared Process (Current) vs Separate Process Per Tenant

| Aspect | Shared Process | Process Per Tenant |
| --- | --- | --- |
| Isolation | Application-level (RBAC/RLS) | Process-level (OS isolation) |
| Resource Usage | Efficient (shared resources) | Higher (duplicate resources) |
| Scaling | Horizontal (add instances) | Vertical (more processes) |
| Complexity | Simpler deployment | Complex orchestration |
| Security | Depends on Superset RBAC | Stronger isolation |
| Use Case | Most deployments | High-security multi-tenant |

Future Architectural Considerations

Async/Await Support

The current implementation uses synchronous request handling. Future versions could:

  • Use async/await for I/O operations
  • Implement connection pooling with asyncpg (PostgreSQL) or aiomysql
  • Improve throughput for I/O-bound operations

Caching Layer

Adding caching between MCP service and database:

  • Redis cache for frequently accessed resources (dashboards, charts, datasets)
  • Cache invalidation on updates
  • Reduced database load for read-heavy workloads
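The cache-aside pattern described here can be sketched without Redis (an in-memory TTL cache as a stand-in; a real deployment would use a shared Redis instance so all MCP workers see the same cache):

```python
import time

class TTLCache:
    """Minimal cache-aside store with per-entry expiry."""
    def __init__(self, ttl_seconds=60):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (value, expires_at)

    def get_or_load(self, key, loader):
        entry = self.store.get(key)
        if entry and entry[1] > time.monotonic():
            return entry[0]                     # cache hit
        value = loader()                        # cache miss: hit the database
        self.store[key] = (value, time.monotonic() + self.ttl)
        return value

    def invalidate(self, key):
        """Call on update so subsequent reads see fresh data."""
        self.store.pop(key, None)
```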

Event-Driven Updates

WebSocket support for real-time updates:

  • Push notifications when dashboards/charts change
  • Streaming query results for large datasets
  • Live dashboard editing collaboration

References