---
title: MCP Integration
hide_title: true
sidebar_position: 8
version: 1
---

<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

# MCP Integration

Model Context Protocol (MCP) integration lets extensions register custom AI agent capabilities with Superset's MCP service. Extensions can provide both **tools** (executable functions) and **prompts** (interactive guidance) that AI agents can discover and use.

## What is MCP?

The Model Context Protocol is an open standard that lets AI agents discover and call capabilities exposed by an application. Through it, extensions can extend Superset's AI capabilities in two ways:

### MCP Tools

Tools are Python functions that AI agents can call to perform specific tasks. They provide executable functionality that extends Superset's capabilities.

**Examples of MCP tools:**

- Data processing and transformation functions
- Custom analytics calculations
- Integration with external APIs
- Specialized report generation
- Business-specific operations

### MCP Prompts

Prompts provide interactive guidance and context to AI agents. They help agents understand how to better assist users with specific workflows or domain knowledge.

**Examples of MCP prompts:**

- Step-by-step workflow guidance
- Domain-specific context and knowledge
- Interactive troubleshooting assistance
- Template generation helpers
- Best practices recommendations

## Getting Started

The sections below cover tools first, then prompts, along with best practices and troubleshooting tips for both.

## MCP Tools

### Basic Tool Registration

The simplest way to create an MCP tool is using the `@tool` decorator:

```python
from superset_core.mcp import tool

@tool
def hello_world() -> dict:
    """A simple greeting tool."""
    return {"message": "Hello from my extension!"}
```

This creates a tool that AI agents can call by name. The tool name defaults to the function name.

### Decorator Parameters

The `@tool` decorator accepts several optional parameters:
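
For example (a minimal sketch; the tool name, tags, and body are illustrative):

```python
from superset_core.mcp import tool

@tool(
    name="my_extension.word_count",                      # identifier AI agents use to call the tool
    description="Count the words in a piece of text.",   # helps agents decide when to use it
    tags=["extension", "utility", "text"],               # categories for organization and discovery
    protect=True,                                         # require an authenticated user (the default)
)
def word_count(text: str) -> dict:
    """Count the words in the provided text."""
    return {"status": "success", "word_count": len(text.split())}
```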

**Parameter details:**

- **`name`**: Tool identifier (AI agents use this to call your tool)
- **`description`**: Explains what the tool does (helps AI agents decide when to use it)
- **`tags`**: Categories for organization and discovery
- **`protect`**: Whether the tool requires user authentication (defaults to `True`)

### Naming Your Tools

For extensions, include your extension ID in tool names to avoid conflicts with tools from other extensions:
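
A short sketch of the convention, where `my_extension` stands in for your extension ID and the tool itself is just an illustration:

```python
from superset_core.mcp import tool

# The extension ID prefix keeps the name unique across extensions
@tool(name="my_extension.summarize_text")
def summarize_text(text: str) -> dict:
    """Return a naive summary: the first sentence of the text."""
    first_sentence = text.split(".")[0].strip()
    return {"status": "success", "summary": first_sentence}
```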

## Complete Example

Here's a more comprehensive example showing best practices:

```python
# backend/mcp_tools.py
import random
from datetime import datetime, timezone
from pydantic import BaseModel, Field
from superset_core.mcp import tool

class RandomNumberRequest(BaseModel):
    """Request schema for random number generation."""

    min_value: int = Field(
        description="Minimum value (inclusive) for random number generation",
        ge=-2147483648,
        le=2147483647
    )
    max_value: int = Field(
        description="Maximum value (inclusive) for random number generation",
        ge=-2147483648,
        le=2147483647
    )

@tool(
    name="example_extension.random_number",
    tags=["extension", "utility", "random", "generator"]
)
def random_number_generator(request: RandomNumberRequest) -> dict:
    """
    Generate a random integer between specified bounds.

    This tool validates input ranges and provides detailed error messages
    for invalid requests.
    """
    # Validate business logic (Pydantic handles type/range validation)
    if request.min_value > request.max_value:
        return {
            "status": "error",
            "error": f"min_value ({request.min_value}) cannot be greater than max_value ({request.max_value})",
            "timestamp": datetime.now(timezone.utc).isoformat()
        }

    # Generate random number
    result = random.randint(request.min_value, request.max_value)

    return {
        "status": "success",
        "random_number": result,
        "min_value": request.min_value,
        "max_value": request.max_value,
        "range_size": request.max_value - request.min_value + 1,
        "timestamp": datetime.now(timezone.utc).isoformat()
    }
```

## Best Practices

### Response Format

Use consistent response structures:

```python
# Success response
{
    "status": "success",
    "result": "your_data_here",
    "timestamp": "2024-01-01T00:00:00Z"
}

# Error response
{
    "status": "error",
    "error": "Clear error message",
    "timestamp": "2024-01-01T00:00:00Z"
}
```
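
One way to keep these shapes consistent is a pair of small helpers shared by all of your extension's tools (a sketch; the helper names are not part of any Superset API):

```python
from datetime import datetime, timezone

def _now() -> str:
    """Current UTC time in ISO-8601 format, included in every response."""
    return datetime.now(timezone.utc).isoformat()

def success(**payload) -> dict:
    """Wrap a successful result in the shared response envelope."""
    return {"status": "success", "timestamp": _now(), **payload}

def error(message: str) -> dict:
    """Wrap an error message in the shared response envelope."""
    return {"status": "error", "error": message, "timestamp": _now()}
```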

### Documentation

Write clear descriptions and docstrings:

```python
@tool(
    name="my_extension.process_data",
    description="Process customer data and generate insights. Requires valid customer ID and date range.",
    tags=["analytics", "customer", "reporting"]
)
def process_data(customer_id: int, start_date: str, end_date: str) -> dict:
    """
    Process customer data for the specified date range.

    This tool analyzes customer behavior patterns and generates
    actionable insights for business decision-making.

    Args:
        customer_id: Unique customer identifier
        start_date: Analysis start date (YYYY-MM-DD format)
        end_date: Analysis end date (YYYY-MM-DD format)

    Returns:
        Dictionary containing analysis results and recommendations
    """
    # Implementation here
    pass
```

### Tool Naming

- **Extension tools**: Use prefixed names like `my_extension.tool_name`
- **Descriptive names**: Prefer `calculate_tax_amount` over a generic `calculate`
- **Consistent naming**: Follow the same patterns throughout your extension

## How AI Agents Use Your Tools

Once registered, AI agents can discover and use your tools automatically:

```
User: "Generate a random number between 1 and 100"
Agent: I'll use the random number generator tool.
  → Calls: example_extension.random_number(min_value=1, max_value=100)
  ← Returns: {"status": "success", "random_number": 42, ...}
Agent: I generated the number 42 for you.
```

The AI agent sees your tool's:

- **Name**: How to call it
- **Description**: What it does and when to use it
- **Parameters**: What inputs it expects (from the Pydantic schema)
- **Tags**: Categories for discovery

## Troubleshooting

### Tool Not Available to AI Agents

1. **Check extension registration**: Verify that your tool module is listed in your extension's entrypoints
2. **Verify decorator**: Ensure `@tool` is correctly applied
3. **Extension loading**: Confirm your extension is installed and enabled

### Input Validation Errors

1. **Pydantic models**: Ensure field types match the expected inputs
2. **Field constraints**: Check that min/max values and string lengths are reasonable
3. **Required fields**: Verify which parameters are required vs optional (see the sketch below)
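
A minimal Pydantic sketch illustrating all three points; the model and field names are hypothetical:

```python
from typing import Optional

from pydantic import BaseModel, Field

class ReportRequest(BaseModel):
    """Request schema with a required, a constrained, and an optional field."""

    customer_id: int = Field(description="Customer identifier")              # required: no default
    limit: int = Field(default=100, ge=1, le=1000, description="Row limit")  # constrained to a sane range
    label: Optional[str] = Field(default=None, max_length=64, description="Optional label")
```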

### Runtime Issues

1. **Error handling**: Wrap risky logic in try/except blocks and return clear error messages (see the sketch below)
2. **Response format**: Use a consistent status/error/timestamp structure
3. **Testing**: Test your tools with various input scenarios
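
A sketch combining the first two points; the tool itself is hypothetical:

```python
from datetime import datetime, timezone

from superset_core.mcp import tool

@tool(name="my_extension.safe_divide")
def safe_divide(numerator: float, denominator: float) -> dict:
    """Divide two numbers, returning the shared status/error/timestamp structure."""
    timestamp = datetime.now(timezone.utc).isoformat()
    try:
        result = numerator / denominator
    except ZeroDivisionError:
        # Specific, actionable message instead of an unhandled exception
        return {"status": "error", "error": "denominator must not be zero", "timestamp": timestamp}
    return {"status": "success", "result": result, "timestamp": timestamp}
```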

### Development Tips

1. **Start simple**: Begin with basic tools, add complexity gradually
2. **Test locally**: Use MCP clients (like Claude Desktop) to test your tools
3. **Clear descriptions**: Write tool descriptions as if explaining to a new user
4. **Meaningful tags**: Use tags that help categorize and discover tools
5. **Error messages**: Provide specific, actionable error messages

## MCP Prompts

### Basic Prompt Registration

Create interactive prompts using the `@prompt` decorator:

```python
from superset_core.mcp import prompt
from fastmcp import Context

@prompt("my_extension.workflow_guide")
async def workflow_guide(ctx: Context) -> str:
    """Interactive guide for data analysis workflows."""
    return """
# Data Analysis Workflow Guide

Here's a step-by-step approach to effective data analysis in Superset:

## 1. Data Discovery
- Start by exploring your datasets using the dataset browser
- Check data quality and identify key metrics
- Look for patterns and relationships in your data

## 2. Chart Creation
- Choose appropriate visualizations for your data types
- Apply filters to focus on relevant subsets
- Configure proper aggregations and groupings

## 3. Dashboard Assembly
- Combine related charts into coherent dashboards
- Use filters and parameters for interactivity
- Add markdown components for context and explanations

Would you like guidance on any specific step?
"""
```

### Advanced Prompt Examples

#### Domain-Specific Context

```python
@prompt(
    "sales_extension.sales_analysis_guide",
    title="Sales Analysis Guide",
    description="Specialized guidance for sales data analysis workflows"
)
async def sales_analysis_guide(ctx: Context) -> str:
    """Provides sales-specific analysis guidance and best practices."""
    return """
# Sales Data Analysis Best Practices

## Key Metrics to Track
- **Revenue Growth**: Month-over-month and year-over-year trends
- **Conversion Rates**: Lead-to-opportunity-to-close ratios
- **Customer Lifetime Value**: Total value per customer segment
- **Sales Cycle Length**: Time from lead to close by product/region

## Recommended Chart Types
- **Time Series**: Revenue trends, seasonal patterns
- **Funnel Charts**: Conversion analysis across sales stages
- **Geographic Maps**: Regional performance comparison
- **Cohort Analysis**: Customer retention and growth patterns

## Common Pitfalls to Avoid
- Don't mix different time granularities without proper context
- Always normalize for business days when comparing periods
- Consider external factors (holidays, market events) in analysis
- Segment by relevant dimensions (product, region, channel)

## Next Steps
1. Identify your primary sales KPIs
2. Create baseline trend charts for each metric
3. Build comparative views across segments
4. Set up automated alerts for significant changes
"""
```

#### Interactive Troubleshooting

```python
@prompt("support_extension.troubleshoot_charts")
async def troubleshoot_charts(ctx: Context) -> str:
    """Interactive troubleshooting assistant for chart issues."""
    return """
# Chart Troubleshooting Assistant

Let's diagnose your chart issue step by step:

## Common Issues and Solutions

### 🚫 No Data Showing
**Possible causes:**
- Filters are too restrictive
- Date range doesn't match your data
- Database connection issues
- Missing permissions

**Check:** Try removing all filters and expanding the date range first.

### 📊 Unexpected Aggregation Results
**Possible causes:**
- Incorrect grouping dimensions
- Wrong aggregation function (SUM vs COUNT vs AVG)
- Data quality issues (duplicates, nulls)
- Time zone mismatches

**Check:** Verify your GROUP BY columns and aggregation logic.

### 🐌 Slow Performance
**Possible causes:**
- Large dataset without proper indexing
- Complex joins or calculations
- Missing query optimizations
- Resource constraints

**Check:** Simplify the query and add appropriate filters first.

## Debug Steps
1. **Start Simple**: Create a basic count query first
2. **Add Gradually**: Introduce complexity step by step
3. **Check SQL**: Review the generated SQL for issues
4. **Test Data**: Verify with a small sample first

What specific issue are you experiencing?
"""
```

### Prompt Best Practices

#### Content Structure

- **Use clear headings** and sections for easy navigation
- **Provide actionable steps** rather than just theory
- **Include examples** relevant to the user's domain
- **Offer next steps** to continue the workflow

#### Interactive Design

- **Ask questions** to engage the user
- **Provide options** for different scenarios
- **Reference specific Superset features** by name
- **Link to related tools** when appropriate

#### Context Awareness

```python
@prompt("analytics_extension.context_aware_guide")
async def context_aware_guide(ctx: Context) -> str:
    """Provides guidance based on current user context."""
    # Access user information if available
    user_info = getattr(ctx, 'user', None)

    guidance = "# Personalized Analytics Guide\n\n"

    if user_info:
        guidance += "Welcome back! Here's guidance tailored for your role:\n\n"

    guidance += """
## Getting Started
Based on your previous activity, here are recommended next steps:

1. **Review Recent Dashboards**: Check your most-used dashboards for updates
2. **Explore New Data**: Look for recently added datasets in your domain
3. **Share Insights**: Consider sharing successful analyses with your team

## Advanced Techniques
- Set up automated alerts for key metrics
- Create parameterized dashboards for different audiences
- Use SQL Lab for complex custom analyses
"""

    return guidance
```

## Combining Tools and Prompts

Extensions can provide both tools and prompts that work together:

```python
# Tool for data processing
@tool(name="analytics_extension.calculate_metrics")
def calculate_metrics(data: dict) -> dict:
    """Calculate advanced analytics metrics."""
    # Implementation here
    pass

# Prompt that guides users to the tool
@prompt("analytics_extension.metrics_guide")
async def metrics_guide(ctx: Context) -> str:
    """Guide users through advanced metrics calculation."""
    return """
# Advanced Metrics Calculation

Use the `calculate_metrics` tool to compute specialized analytics:

## Available Metrics
- Customer Lifetime Value (CLV)
- Cohort Retention Rates
- Statistical Significance Tests
- Predictive Trend Analysis

## Usage
Call the tool with your dataset to get detailed calculations
and recommendations for visualization approaches.

Would you like to calculate metrics for your current dataset?
"""
```

## Next Steps

- **[Development](./development)** - Project structure, APIs, and dev workflow
- **[Security](./security)** - Security best practices for extensions