Compare commits

..

58 Commits

Author SHA1 Message Date
Maxime Beauchemin
6d7467bafd feat: add devcontainer profiles for MCP development
- Split devcontainer config into base + profile structure
  - default: Standard Superset development (port 9001)
  - with-mcp: Superset + MCP service (ports 9001, 5008)
- Add MCP service to docker-compose-light.yml as optional profile
- Update start-superset.sh to conditionally start MCP when ENABLE_MCP=true
- Add MCP case to docker-bootstrap.sh for starting MCP service
- Preserve original devcontainer.json as .old for migration reference

This allows developers to choose their development environment:
- Use default profile for standard Superset development
- Use with-mcp profile for MCP/LLM agent development
- MCP service runs on port 5008 when enabled via profile

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-30 18:59:27 -07:00
Amin Ghadersohi
71f294b7d3 docs: update MCP Chart Test Plan with accurate implementation details
- Add authentication setup and dataset discovery prerequisites
- Remove references to unsupported preview formats (base64, interactive, vega_lite)
- Fix schema examples: operator -> op, xlsx -> excel
- Add performance limits for preview formats
- Update error expectations for empty query results
- Enhance cache testing with cache_timeout parameter
- Add Important Schema Notes section for common gotchas
- Update response patterns to match actual implementation
2025-07-30 20:15:41 -04:00
Amin Ghadersohi
e839d0989a refactor: remove redundant docs folder from mcp_service 2025-07-30 19:41:05 -04:00
Amin Ghadersohi
714e21b3ec feat: enhance MCP documentation with Superset best practices
- Add Docusaurus admonitions (:::note, :::tip, :::warning) following Superset patterns
- Improve code block formatting and language tags
- Add important notes about authentication and installation
- Ensure consistency with Superset documentation style
- Maintain accuracy with FastMCP specifications
2025-07-30 19:26:08 -04:00
Amin Ghadersohi
bf78eb69ed fix: resolve Mermaid diagram syntax errors in development guide
Replace @ symbols in node names that cause Mermaid parsing errors
2025-07-30 19:21:45 -04:00
Amin Ghadersohi
a13c1ba8c2 fix: update broken README.md links in MCP documentation
Replace ./README.md references with proper Docusaurus links to intro page
2025-07-30 19:20:38 -04:00
Amin Ghadersohi
422c34a6ee feat: integrate MCP service documentation into Superset Docusaurus site
- Add comprehensive MCP service documentation to docs/docs/mcp-service/
- Create 7 new documentation files covering all aspects of MCP service
- Add MCP Service section to Docusaurus sidebar configuration
- Update README files to reference official documentation
- Remove outdated standalone architecture documentation
- Install missing docs dependencies to fix eslint-docs pre-commit hook

Documentation includes:
- Introduction and overview with getting started guide
- Complete API reference with all 16 tools and examples
- Authentication setup with JWT and identity provider integration
- Development guide with internal architecture and patterns
- System architecture overview with deployment considerations
- Preset integration guide for enterprise features
2025-07-30 19:17:18 -04:00
Amin Ghadersohi
cd213fc57d update: enhance MCP service README with comprehensive setup instructions
Improve README with official Superset installation instructions, detailed Claude Desktop
connection steps using FastMCP proxy, and troubleshooting guide. Adds verification steps
and clear instructions for HTTP-based MCP service deployment.
2025-07-30 18:24:54 -04:00
Amin Ghadersohi
0f20a88598 feat(mcp): major enhancements to chart generation with validation, pooled screenshots, and simplified tests
This commit introduces significant improvements to the MCP chart generation system:

## Core Features
- Enhanced chart generation with comprehensive validation and error handling
- Implemented WebDriver pooling for 10x faster screenshot generation
- Added detailed validation for chart configurations with helpful error messages
- Support for both table and XY (line/bar/area/scatter) chart types

## Validation & Error Handling
- Column existence validation with similarity-based suggestions
- Data type compatibility checks for aggregations (SUM/AVG for numbers only)
- Filter operator validation
- Detailed error responses with actionable suggestions and available columns/metrics

## Performance Improvements
- WebDriver connection pooling eliminates browser startup overhead
- Reuses browser instances across multiple screenshot requests
- Automatic health checking and recovery of WebDriver instances
- Configurable pool size and timeout settings

## Screenshot Enhancements
- New PooledChartScreenshot and PooledExploreScreenshot classes
- FastAPI endpoints for serving chart and explore screenshots as PNG images
- Proper caching with TTL support
- Clean chart-only screenshots for explore views (hides UI elements)

## Test Improvements
- Simplified test structure focusing on schema validation over complex mocking
- Removed brittle integration-style tests in favor of unit tests
- Fixed all test failures with cleaner, more maintainable approach
- Added comprehensive test plan for manual testing with Claude Desktop

## API Enhancements
- Centralized URL generation for screenshots and explore links
- Consistent error response format across all endpoints
- Support for UUID-based chart lookups alongside numeric IDs

## Code Quality
- Fixed all pre-commit issues (mypy, ruff, long lines, complexity)
- Extracted helper functions to reduce cyclomatic complexity
- Proper type annotations throughout
- Better separation of concerns

## Documentation
- Added comprehensive test plan for chart tools
- Detailed guides for table chart configuration
- WebDriver pooling architecture documentation
- Demo scripts for testing functionality

The changes maintain backward compatibility while significantly improving
performance, reliability, and developer experience for chart generation in the
MCP service.
2025-07-30 18:09:23 -04:00
Amin Ghadersohi
33d16eaca1 refactor: rename pydantic_schemas to schemas for MCP service
Simplified the directory structure by renaming `pydantic_schemas` to `schemas` throughout the MCP service codebase. This follows standard Python conventions and makes imports more concise.

Changes:
- Renamed directory: superset/mcp_service/pydantic_schemas/ → superset/mcp_service/schemas/
- Updated all import statements across 42 files
- Updated documentation references in README files
- All tests pass, pre-commit hooks clean
2025-07-30 14:20:42 -04:00
Amin Ghadersohi
5b56bc622b feat: major MCP service enhancements with cache control, testing infrastructure, and bug fixes
This comprehensive update adds significant functionality and improvements to the MCP service:

  New Features:
  - Cache control utilities for metadata operations with configurable TTL
  - Entity testing plan documentation for systematic API validation
  - Chart data retrieval tool with caching support
  - Comprehensive request schema validation framework
  - Sortable columns testing infrastructure

  Documentation Updates:
  - Expanded README with architecture details and usage examples
  - Phase 1 status tracking and implementation notes
  - Enhanced schema documentation with cache control patterns
  - LLMS.md reference added to CLAUDE.md

  Search Functionality Fixes:
  - Dashboard search: removed invalid columns (owners, roles, tags, published)
  - Dataset search: removed non-existent 'tags' and problematic 'catalog' columns
  - Chart search: maintained correct text-only columns (slice_name, description)
  - Updated corresponding unit tests to match implementations

  Infrastructure Improvements:
  - Added comprehensive cache control test coverage
  - Request schema validation testing
  - Sortable columns validation across all entity types
  - Enhanced error handling and logging

  Technical Debt:
  - Standardized import patterns across MCP tools
  - Improved type safety in Pydantic schemas
  - Better separation of concerns for cache operations
2025-07-30 14:20:42 -04:00
Amin Ghadersohi
8cbfe027b5 update: add chart update tools and centralized URL configuration
- Add update_chart and update_chart_preview tools for modifying saved and cached charts
  - Implement centralized URL configuration using SUPERSET_WEBSERVER_ADDRESS
  - Add comprehensive MCP audit logging with impersonation tracking and payload  sanitization
  - Refactor duplicate chart analysis functions to chart_utils.py
  - Update all README documentation with new tools and features
  - Add 194+ unit tests covering URL utils, audit logging, and chart updates
  - Fix all pre-commit issues and ensure test coverage
2025-07-30 14:20:42 -04:00
Amin Ghadersohi
c3b3edc6ba refactor: rename create_chart to generate_chart and standardize MCP test naming
- Rename create_chart function to generate_chart throughout MCP service
  - Update CreateChartRequest to GenerateChartRequest schema
  - Rename test files to follow test_*.py convention consistently
  - Fix test class and method names (TestCreateChart → TestGenerateChart)
  - Update all documentation and README files with new naming
  - Standardize auth scope mappings and tool references
  - Maintain backward compatibility for Superset CreateChartCommand references
2025-07-30 14:20:42 -04:00
Amin Ghadersohi
aa06bb9fda revert unnecessary config change 2025-07-30 14:20:41 -04:00
Amin Ghadersohi
0f222b9034 update: revert comment change 2025-07-30 14:20:41 -04:00
Amin Ghadersohi
7044153ca4 revert: switch webdriver back to Firefox for chart
screenshots

  Changed WEBDRIVER_TYPE from "chrome" to "firefox" and removed
   Chrome-specific
  binary path from WEBDRIVER_CONFIGURATION. Firefox provides
  more reliable
  screenshot generation for the MCP service chart preview
  functionality.

  - Set WEBDRIVER_TYPE = "firefox"
  - Cleared binary_location in WEBDRIVER_CONFIGURATION to use
  default Firefox
  - Chart preview tools now use geckodriver instead of Chrome
  headless mode

  This resolves webdriver timeout issues in chart screenshot
  generation.
2025-07-30 14:20:41 -04:00
Amin Ghadersohi
ce82c35bb6 docs: clarify MCP service capabilities and available operations
Updated all MCP service READMEs to accurately reflect available operations:

- Removed overuse of 'comprehensive' and 'complete' where it overstated capabilities
- Added clear 'Available Operations' section showing:
  - Read operations available for all entities
  - Create operations only for charts and dashboards
  - No update or delete operations available
- Updated tool descriptions to be more specific about capabilities
- Changed 'complete workflow automation' to 'dashboard creation and chart addition'
- Clarified that we have 5 chart types, not 'comprehensive' support
- Updated test count references to be consistent (185+ tests)

This ensures documentation accurately represents Phase 1 capabilities without
overstating what operations are available.
2025-07-30 14:20:41 -04:00
Amin Ghadersohi
9b25bd973f docs: update MCP service documentation to reflect Phase 1 completion
Updated all MCP service README files to reflect current status:

- README_ARCHITECTURE.md: Updated to 95% completion status, documented new
  dashboard generation and SQL Lab tools, corrected file references
- README_PHASE1_STATUS.md: Marked all core epics as complete, updated tool
  count to 16, reflected recent technical completions
- README_SCHEMAS.md: Updated test coverage numbers, documented new schema
  enhancements, marked Phase 1 features as complete

Changes reflect the successful completion of core MCP service functionality
including BaseDAO enhancements, comprehensive testing coverage, and all
planned Phase 1 tools and features.
2025-07-30 14:20:41 -04:00
Amin Ghadersohi
5f7502b85c feat: comprehensive MCP service enhancements with type safety and new tools
Core Improvements:
- Enhanced BaseDAO with type-safe UUID handling and flexible column support
- Added comprehensive test coverage (185+ passing unit tests)
- Eliminated hardcoded string checks with SQLAlchemy type inspection
- Removed unnecessary async patterns for better performance consistency

New MCP Tools & Features:
- Dashboard generation: create_dashboard and add_chart_to_existing_dashboard
- Chart data/preview: get_chart_data and get_chart_preview with multiple formats
- SQL Lab integration: open_sql_lab_with_context with proper parameter handling
- Enhanced chart creation with comprehensive type support

Technical Enhancements:
- Fixed SQL Lab parameter naming (dbid) for frontend compatibility
- Improved error handling with graceful UUID conversion fallbacks
- Enhanced multi-identifier support (ID, UUID, slug) across all tools
- Comprehensive request schema pattern eliminating LLM validation issues

Testing & Quality:
- Added BaseDAO unit tests with edge cases and error scenarios
- Enhanced MCP tool testing with proper mocking patterns
- Fixed trailing whitespace and pre-commit compliance
- Professional error handling and type validation

Documentation:
- Updated all MCP service READMEs with current status (95% complete)
- Documented all 16 MCP tools with examples and usage patterns
- Enhanced architecture documentation with recent improvements
- Updated Phase 1 status reflecting completion of core features

This commit consolidates Phase 1 MCP service development with production-ready
authentication, comprehensive testing, and complete tool functionality.
2025-07-30 14:20:41 -04:00
Amin Ghadersohi
1d5372210f update: read files 2025-07-30 14:20:41 -04:00
Amin Ghadersohi
64af53f6f6 update: enhance MCP service auth with factory pattern and code cleanup
- Implement configurable auth factory pattern as suggested by @dpgaspar
  - Add unit test coverage
  - Refactor auth.py for better pythonic patterns and error handling
  - Add proper type annotations and logging throughout
  - Split complex functions into focused helper functions
  - Add detailed architecture documentation with Mermaid diagrams
  - Improve JWT token extraction with robust fallback logic
2025-07-30 14:20:41 -04:00
Amin Ghadersohi
56b308340e update: use real Superset user for MCP authentication instead of MCPUser
Replace MCPUser wrapper with actual Flask-AppBuilder User from database
  to enable proper RBAC permission filtering. Fixes MCP service returning
  0 counts due to empty permission queries.
2025-07-30 14:20:41 -04:00
Amin Ghadersohi
3b2b7609a8 update: add optional JWT Bearer authentication to MCP service
Implement comprehensive JWT authentication system for production deployments:

  • Add BearerAuthProvider integration with FastMCP for RS256 token validation
  • Support both static public keys and JWKS endpoints for enterprise key rotation
  • Implement scope-based authorization (dashboard:read, chart:write, etc.)
  • Extract user identity from JWT sub claim for proper audit trails
  • Maintain backward compatibility with admin fallback when auth disabled
  • Add comprehensive test coverage for authentication flows
  • Update documentation with human-friendly security setup guide

  Authentication is disabled by default for development convenience.
  Configure via environment variables for production use.
2025-07-30 14:20:40 -04:00
Amin Ghadersohi
61a91f80fe update deps 2025-07-30 14:20:40 -04:00
Amin Ghadersohi
ed3c5ecbc2 enhance MCP service with multi-identifier support and comprehensive UUID/slug functionality
- Add UUID field to ChartInfo and DatasetInfo Pydantic schemas for complete serialization
  - Include UUID in chart and dataset serialization functions (serialize_chart_object,
  serialize_dataset_object)
  - UUID and slug are now included in default response columns for better discoverability:
    * Dashboards: UUID and slug in DEFAULT_DASHBOARD_COLUMNS and returned by default
    * Charts: UUID in DEFAULT_CHART_COLUMNS and returned by default
    * Datasets: UUID in DEFAULT_DATASET_COLUMNS and returned by default
  - Search functionality enhanced to include UUID/slug fields across all relevant tools
  - Add comprehensive test coverage for UUID/slug functionality:
    * Default column verification tests ensuring UUID/slug are in default responses
    * Response data verification tests confirming UUID/slug values are returned
    * Custom column selection tests for explicit UUID/slug requests
    * Metadata accuracy tests verifying columns_requested/columns_loaded tracking
  - Update documentation to reflect enhanced multi-identifier capabilities
  - All 132 tests pass with comprehensive verification of UUID/slug support
2025-07-30 14:20:40 -04:00
Amin Ghadersohi
c8fbd4233c add multi-identifier support and fix columns metadata accuracy
**Multi-Identifier Support:**
  - Enhance ModelGetInfoTool to support ID, UUID, and slug lookup
  - Add intelligent identifier detection for get_*_info tools
  - Dashboards: support ID, UUID, and slug via id_or_slug_filter
  - Datasets/Charts: support ID and UUID via direct database queries
  - Add GetDashboardInfoRequest, GetDatasetInfoRequest, GetChartInfoRequest schemas

  **Enhanced Default Columns & Metadata:**
  - Add uuid to default columns for all list tools (dashboards, datasets, charts)
  - Include uuid/slug in search columns for better discoverability
  - Fix columns_requested to accurately reflect user input vs defaults
  - Fix columns_loaded to show actual DAO columns requested vs serialized fields

  **Testing & Code Quality:**
  - Add multi-identifier tests for all get_*_info tools (ID/UUID/slug scenarios)
  - Remove unused serialized_columns variable in ModelListTool
  - Fix linting issues (line length, docstrings)

  This provides flexible identifier support while ensuring accurate metadata
  tracking for better LLM compatibility and debugging.
2025-07-30 14:20:40 -04:00
Amin Ghadersohi
fc85f68585 update: implement FastMCP complex inputs pattern for MCP tools
- Replace individual parameters with structured request schemas for list_datasets, list_charts, and
  list_dashboards
  - Fix validation issues where LLMs passed arrays/objects as strings
  - Add ListDatasetsRequest, ListChartsRequest, ListDashboardsRequest schemas
  - Fix object serialization (charts, dashboards, datasets)
  - Add validation to prevent search+filters conflicts
  - Update all tests and fix linting issues

  Resolves string/array validation ambiguity that caused tool failures when LLMs sent complex
  parameters incorrectly.
2025-07-30 14:20:40 -04:00
Amin Ghadersohi
e825bbe1f4 update: generate_explore_link and create_chart working well 2025-07-30 14:20:40 -04:00
Amin Ghadersohi
a80b637a2a update: add a save_chart boolean parameter to create chart so that generate link to explore can use its code and just pass false 2025-07-30 14:20:40 -04:00
Amin Ghadersohi
541b3bd727 update: add docs to proxy 2025-07-30 14:20:40 -04:00
Amin Ghadersohi
a909799e5c update: simplify create chart a lot: support area, line, bar, scatter and table charts.
example prompt:
- can you use superset dataset 2 to plot popular baby names in 2024

- plot airline delay by day of week and group by airline use dataset 6
2025-07-30 14:20:39 -04:00
Amin Ghadersohi
01329f1c62 update: tests and deps 2025-07-30 14:20:39 -04:00
Amin Ghadersohi
67f621d360 update: remove create_chart_simple tool 2025-07-30 14:20:39 -04:00
Amin Ghadersohi
b552dbf4a1 update: fix all lint issues 2025-07-30 14:20:39 -04:00
Amin Ghadersohi
364af98c04 update: create_chart: Remove x_axis from groupby if present, update docs and tests
- Updates create_chart logic to automatically remove x_axis from groupby for ECharts timeseries charts, preventing duplicate dimension usage.
- Updates and expands unit test to verify x_axis is excluded from groupby, using improved test mocks for accurate backend simulation.
- Updates documentation (README.md, README_ARCHITECTURE.md, README_PHASE1_STATUS.md, README_SCHEMAS.md) to clarify create_chart tool behavior and schema, including new groupby/x_axis handling.
- No breaking changes to tool signatures; behavior is now more robust and LLM-friendly.
2025-07-30 14:20:39 -04:00
Amin Ghadersohi
afdb8b38a6 update: Add columns and metrics to DatasetInfo/list_datasets, improve test coverage
- Updated DatasetInfo schema to include columns and metrics fields, with new TableColumnInfo and SqlMetricInfo models.
- Updated serialize_dataset_object to serialize columns and metrics for each dataset.
- Modified list_datasets tool to use serialize_dataset_object and include columns/metrics by default.
- Improved and fixed all related unit tests to use proper MagicMock objects for columns/metrics and to parse JSON responses.
- Ensured LLM/OpenAPI compatibility for dataset listing and info tools.
2025-07-30 14:20:39 -04:00
Amin Ghadersohi
fc7ea804bc update: chart create working with union of types 2025-07-30 14:20:39 -04:00
Amin Ghadersohi
7c256ae9aa update: docs 2025-07-30 14:20:39 -04:00
Amin Ghadersohi
1b190abc3b update: add new ModelGetAvailableFiltersTool 2025-07-30 14:20:39 -04:00
Amin Ghadersohi
c6c71bf835 update: update tool registration to use decorators, and update all tests to use mcp client directly in the tests as recommended 2025-07-30 14:20:38 -04:00
Amin Ghadersohi
cd5ead7f11 update: update tool registration 2025-07-30 14:20:38 -04:00
Amin Ghadersohi
e5eebe28f9 update: refactor auth and model tools 2025-07-30 14:20:38 -04:00
Amin Ghadersohi
9eac6ef433 update: add tests for mcp list and info tools 2025-07-30 14:20:38 -04:00
Amin Ghadersohi
d523d523e5 update: refactor tools directories 2025-07-30 14:20:38 -04:00
Amin Ghadersohi
91a3214ed4 update: wip 2025-07-30 14:20:38 -04:00
Amin Ghadersohi
95b787f024 update: wip 2025-07-30 14:20:38 -04:00
Amin Ghadersohi
39121791e8 update: wip 2025-07-30 14:20:38 -04:00
Amin Ghadersohi
9d40fe913f update: wip 2025-07-30 14:20:37 -04:00
Amin Ghadersohi
748ae49c8c update: wip 2025-07-30 14:20:37 -04:00
Amin Ghadersohi
a9d543b6f4 update: unify FastMCP server, modularize tools, and document new DAO-based architecture
- Replace dual Flask/FastAPI setup with a single, unified FastMCP server (`server.py`)
- Introduce `MCPDAOWrapper` for secure, context-aware DAO access (`dao_wrapper.py`)
- Refactor all MCP tools to be modular and domain-organized (`tools/dashboard/`, `tools/chart/`, `tools/dataset/`, `tools/system/`)
- Strongly type all tool contracts with Pydantic v2 models, including full field documentation for LLM/OpenAPI compatibility
- Refactor and extend `BaseDAO` for robust, generic CRUD/list operations
- Add and update documentation:
  - Architecture and flow diagrams (`README_ARCHITECTURE.md`)
  - Tool schema reference and usage instructions (`README.md`, `README_SCHEMAS.md`)
  - Phase 1 status and roadmap (`README_PHASE1_STATUS.md`)
- Implement and test all core list/info tools for dashboards, datasets, and charts, with full search and filter support
- Add chart creation tool (`create_chart_simple`)
- Provide extension points for Preset-specific auth, RBAC, and logging (stubbed in Phase 1)
- Prepare for LLM/agent workflows and future command-based mutations (create/update/delete)
- Expand and update unit/integration test coverage for all tools
2025-07-30 14:20:37 -04:00
Amin Ghadersohi
55d6130fc4 update: docs and deps 2025-07-30 14:20:37 -04:00
Amin Ghadersohi
b98e3eb309 Superset MCP Service - Adding get_instance_info tool 2025-07-30 14:20:37 -04:00
Amin Ghadersohi
b469077e0e update: add list function to BaseDAO and use it to get and filter the dashboards 2025-07-30 14:20:37 -04:00
Amin Ghadersohi
397b4e450b Superset MCP Service - Initial Commit DRAFT 2025-07-30 14:20:37 -04:00
Amin Ghadersohi
0f97002520 update: Refactor BaseDAO: enhance column operator logic and expand test coverage
- Improved the BaseDAO class to robustly handle column operator logic, ensuring all supported operators (eq, ne, sw, ew, in, nin, gt, gte, lt, lte, like, ilike, is_null, is_not_null) are consistently applied via ColumnOperatorEnum.
- Refactored the apply_column_operators and list methods for clarity and reliability, including better handling of columns, relationships, and search.
- Removed 1 index base page handing from list
2025-07-30 14:20:37 -04:00
Amin Ghadersohi
2312250127 update: add idea of ColumnOperator 2025-07-30 14:20:36 -04:00
Amin Ghadersohi
cd52193869 update: add helper for applying base filter 2025-07-30 14:20:36 -04:00
Amin Ghadersohi
9ffe680aaa feat(dao): add robust generic list and count methods to BaseDAO with full test coverage
- Introduced generic `list` and `count` methods to BaseDAO for consistent querying and counting across all DAOs.
- Both methods support filtering (including IN queries), ordering, pagination, search across columns, custom FAB-style filters, and always-on base filters.
- Added comprehensive unit tests for `list` and `count` in `tests/unit_tests/dao/base_test.py`, covering:
  - Filtering (including boolean, None, and IN queries)
  - Ordering (asc/desc, multiple columns)
  - Pagination (including out-of-range)
  - Search across columns
  - Custom filter logic
  - Always-on base filter logic
  - Edge cases and skip_base_filter
- Moved common test fixtures to `conftest.py` for reuse.
2025-07-30 14:20:36 -04:00
1125 changed files with 43801 additions and 71181 deletions

View File

@@ -1,20 +0,0 @@
# Keep this in sync with the base image in the main Dockerfile (ARG PY_VER)
FROM python:3.11.13-bookworm AS base
# Install system dependencies that Superset needs
# This layer will be cached across Codespace sessions
RUN apt-get update && apt-get install -y \
libsasl2-dev \
libldap2-dev \
libpq-dev \
tmux \
gh \
&& rm -rf /var/lib/apt/lists/*
# Install uv for fast Python package management
# This will also be cached in the image
RUN curl -LsSf https://astral.sh/uv/install.sh | sh && \
echo 'export PATH="/root/.cargo/bin:$PATH"' >> /etc/bash.bashrc
# Set the cargo/bin directory in PATH for all users
ENV PATH="/root/.cargo/bin:${PATH}"

View File

@@ -3,14 +3,3 @@
For complete documentation on using GitHub Codespaces with Apache Superset, please see:
**[Setting up a Development Environment - GitHub Codespaces](https://superset.apache.org/docs/contributing/development#github-codespaces-cloud-development)**
## Pre-installed Development Environment
When you create a new Codespace from this repository, it automatically:
1. **Creates a Python virtual environment** using `uv venv`
2. **Installs all development dependencies** via `uv pip install -r requirements/development.txt`
3. **Sets up pre-commit hooks** with `pre-commit install`
4. **Activates the virtual environment** automatically in all terminals
The virtual environment is located at `/workspaces/{repository-name}/.venv` and is automatically activated through environment variables set in the devcontainer configuration.

View File

@@ -1,62 +0,0 @@
# Superset Codespaces environment setup
# This file is appended to ~/.bashrc during Codespace setup
# Find the workspace directory (handles both 'superset' and 'superset-2' names)
WORKSPACE_DIR=$(find /workspaces -maxdepth 1 -name "superset*" -type d | head -1)
if [ -n "$WORKSPACE_DIR" ]; then
# Check if virtual environment exists
if [ -d "$WORKSPACE_DIR/.venv" ]; then
# Activate the virtual environment
source "$WORKSPACE_DIR/.venv/bin/activate"
echo "✅ Python virtual environment activated"
# Verify pre-commit is installed and set up
if command -v pre-commit &> /dev/null; then
echo "✅ pre-commit is available ($(pre-commit --version))"
# Install git hooks if not already installed
if [ -d "$WORKSPACE_DIR/.git" ] && [ ! -f "$WORKSPACE_DIR/.git/hooks/pre-commit" ]; then
echo "🪝 Installing pre-commit hooks..."
cd "$WORKSPACE_DIR" && pre-commit install
fi
else
echo "⚠️ pre-commit not found. Run: pip install pre-commit"
fi
else
echo "⚠️ Python virtual environment not found at $WORKSPACE_DIR/.venv"
echo " Run: cd $WORKSPACE_DIR && .devcontainer/setup-dev.sh"
fi
# Always cd to the workspace directory for convenience
cd "$WORKSPACE_DIR"
fi
# Add helpful aliases for Superset development
alias start-superset="$WORKSPACE_DIR/.devcontainer/start-superset.sh"
alias setup-dev="$WORKSPACE_DIR/.devcontainer/setup-dev.sh"
# Show helpful message on login
echo ""
echo "🚀 Superset Codespaces Environment"
echo "=================================="
# Check if Superset is running
if docker ps 2>/dev/null | grep -q "superset"; then
echo "✅ Superset is running!"
echo " - Check the 'Ports' tab for your live Superset URL"
echo " - Initial startup takes 10-20 minutes"
echo " - Login: admin/admin"
else
echo "⚠️ Superset is not running. Use: start-superset"
# Check if there's a startup log
if [ -f "/tmp/superset-startup.log" ]; then
echo " 📋 Startup log found: cat /tmp/superset-startup.log"
fi
fi
echo ""
echo "Quick commands:"
echo " start-superset - Start Superset with Docker Compose"
echo " setup-dev - Set up Python environment (if not already done)"
echo " pre-commit run - Run pre-commit checks on staged files"
echo ""

View File

@@ -1,20 +0,0 @@
#!/bin/bash
# Script to build and push the devcontainer image to GitHub Container Registry
# This allows caching the image between Codespace sessions
# You'll need to run this with appropriate GitHub permissions
# gh auth login --scopes write:packages
REGISTRY="ghcr.io"
OWNER="apache"
REPO="superset"
TAG="devcontainer-base"
echo "Building devcontainer image..."
docker build -t $REGISTRY/$OWNER/$REPO:$TAG .devcontainer/
echo "Pushing to GitHub Container Registry..."
docker push $REGISTRY/$OWNER/$REPO:$TAG
echo "Done! Update .devcontainer/devcontainer.json to use:"
echo " \"image\": \"$REGISTRY/$OWNER/$REPO:$TAG\""

View File

@@ -0,0 +1,19 @@
{
// Extend the base configuration
"extends": "../devcontainer-base.json",
"name": "Apache Superset Development (Default)",
// Forward ports for development
"forwardPorts": [9001],
"portsAttributes": {
"9001": {
"label": "Superset (via Webpack Dev Server)",
"onAutoForward": "notify",
"visibility": "public"
}
},
// Auto-start Superset on Codespace resume
"postStartCommand": ".devcontainer/start-superset.sh"
}

View File

@@ -0,0 +1,39 @@
{
"name": "Apache Superset Development",
// Keep this in sync with the base image in Dockerfile (ARG PY_VER)
// Using the same base as Dockerfile, but non-slim for dev tools
"image": "python:3.11.13-bookworm",
"features": {
"ghcr.io/devcontainers/features/docker-in-docker:2": {
"moby": true,
"dockerDashComposeVersion": "v2"
},
"ghcr.io/devcontainers/features/node:1": {
"version": "20"
},
"ghcr.io/devcontainers/features/git:1": {},
"ghcr.io/devcontainers/features/common-utils:2": {
"configureZshAsDefaultShell": true
},
"ghcr.io/devcontainers/features/sshd:1": {
"version": "latest"
}
},
// Run commands after container is created
"postCreateCommand": "chmod +x .devcontainer/setup-dev.sh && .devcontainer/setup-dev.sh",
// VS Code customizations
"customizations": {
"vscode": {
"extensions": [
"ms-python.python",
"ms-python.vscode-pylance",
"charliermarsh.ruff",
"dbaeumer.vscode-eslint",
"esbenp.prettier-vscode"
]
}
}
}

View File

@@ -1,15 +1,8 @@
{
"name": "Apache Superset Development",
// Option 1: Use pre-built image directly
// "image": "ghcr.io/apache/superset:devcontainer-base",
// Option 2: Build from Dockerfile with cache (current approach)
"build": {
"dockerfile": "Dockerfile",
"context": ".",
// Cache from the Apache registry image
"cacheFrom": ["ghcr.io/apache/superset:devcontainer-base"]
},
// Keep this in sync with the base image in Dockerfile (ARG PY_VER)
// Using the same base as Dockerfile, but non-slim for dev tools
"image": "python:3.11.13-bookworm",
"features": {
"ghcr.io/devcontainers/features/docker-in-docker:2": {
@@ -39,17 +32,10 @@
},
// Run commands after container is created
"postCreateCommand": "bash .devcontainer/setup-dev.sh || echo '⚠️ Setup had issues - run .devcontainer/setup-dev.sh manually'",
"postCreateCommand": "chmod +x .devcontainer/setup-dev.sh && .devcontainer/setup-dev.sh",
// Auto-start Superset after ensuring Docker is ready
// Run in foreground to see any errors, but don't block on failures
"postStartCommand": "bash -c 'echo \"Waiting 30s for services to initialize...\"; sleep 30; .devcontainer/start-superset.sh || echo \"⚠️ Auto-start failed - run start-superset manually\"'",
// Set environment variables
"remoteEnv": {
// Removed automatic venv activation to prevent startup issues
// The setup script will handle this
},
// Auto-start Superset on Codespace resume
"postStartCommand": ".devcontainer/start-superset.sh",
// VS Code customizations
"customizations": {

View File

@@ -3,76 +3,30 @@
echo "🔧 Setting up Superset development environment..."
# System dependencies and uv are now pre-installed in the Docker image
# This speeds up Codespace creation significantly!
# The universal image has most tools, just need Superset-specific libs
echo "📦 Installing Superset-specific dependencies..."
sudo apt-get update
sudo apt-get install -y \
libsasl2-dev \
libldap2-dev \
libpq-dev \
tmux \
gh
# Create virtual environment using uv
echo "🐍 Creating Python virtual environment..."
if ! uv venv; then
echo "❌ Failed to create virtual environment"
exit 1
fi
# Install uv for fast Python package management
echo "📦 Installing uv..."
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install Python dependencies
echo "📦 Installing Python dependencies..."
if ! uv pip install -r requirements/development.txt; then
echo "❌ Failed to install Python dependencies"
echo "💡 You may need to run this manually after the Codespace starts"
exit 1
fi
# Install pre-commit hooks
echo "🪝 Installing pre-commit hooks..."
if source .venv/bin/activate && pre-commit install; then
echo "✅ Pre-commit hooks installed"
else
echo "⚠️ Pre-commit hooks installation failed (non-critical)"
fi
# Add cargo/bin to PATH for uv
echo 'export PATH="$HOME/.cargo/bin:$PATH"' >> ~/.bashrc
echo 'export PATH="$HOME/.cargo/bin:$PATH"' >> ~/.zshrc
# Install Claude Code CLI via npm
echo "🤖 Installing Claude Code..."
if npm install -g @anthropic-ai/claude-code; then
echo "✅ Claude Code installed"
else
echo "⚠️ Claude Code installation failed (non-critical)"
fi
npm install -g @anthropic-ai/claude-code
# Make the start script executable
chmod +x .devcontainer/start-superset.sh
# Add bashrc additions for automatic venv activation
echo "🔧 Setting up automatic environment activation..."
if [ -f ~/.bashrc ]; then
# Check if we've already added our additions
if ! grep -q "Superset Codespaces environment setup" ~/.bashrc; then
echo "" >> ~/.bashrc
cat .devcontainer/bashrc-additions >> ~/.bashrc
echo "✅ Added automatic venv activation to ~/.bashrc"
else
echo "✅ Bashrc additions already present"
fi
else
# Create bashrc if it doesn't exist
cat .devcontainer/bashrc-additions > ~/.bashrc
echo "✅ Created ~/.bashrc with automatic venv activation"
fi
# Also add to zshrc since that's the default shell
if [ -f ~/.zshrc ] || [ -n "$ZSH_VERSION" ]; then
if ! grep -q "Superset Codespaces environment setup" ~/.zshrc; then
echo "" >> ~/.zshrc
cat .devcontainer/bashrc-additions >> ~/.zshrc
echo "✅ Added automatic venv activation to ~/.zshrc"
fi
fi
echo "✅ Development environment setup complete!"
echo ""
echo "📝 The virtual environment will be automatically activated in new terminals"
echo ""
echo "🔄 To activate in this terminal, run:"
echo " source ~/.bashrc"
echo ""
echo "🚀 To start Superset:"
echo " start-superset"
echo ""
echo "🚀 Run '.devcontainer/start-superset.sh' to start Superset"

View File

@@ -1,14 +1,14 @@
#!/bin/bash
# Startup script for Superset in Codespaces
# Log to a file for debugging
LOG_FILE="/tmp/superset-startup.log"
echo "[$(date)] Starting Superset startup script" >> "$LOG_FILE"
echo "[$(date)] User: $(whoami), PWD: $(pwd)" >> "$LOG_FILE"
echo "🚀 Starting Superset in Codespaces..."
echo "🌐 Frontend will be available at port 9001"
# Check if MCP is enabled
if [ "$ENABLE_MCP" = "true" ]; then
echo "🤖 MCP Service will be available at port 5008"
fi
# Find the workspace directory (Codespaces clones as 'superset', not 'superset-2')
WORKSPACE_DIR=$(find /workspaces -maxdepth 1 -name "superset*" -type d | head -1)
if [ -n "$WORKSPACE_DIR" ]; then
@@ -18,71 +18,32 @@ else
echo "📁 Using current directory: $(pwd)"
fi
# Wait for Docker to be available
echo "⏳ Waiting for Docker to start..."
echo "[$(date)] Waiting for Docker..." >> "$LOG_FILE"
max_attempts=30
attempt=0
while ! docker info > /dev/null 2>&1; do
if [ $attempt -eq $max_attempts ]; then
echo "❌ Docker failed to start after $max_attempts attempts"
echo "[$(date)] Docker failed to start after $max_attempts attempts" >> "$LOG_FILE"
echo "🔄 Please restart the Codespace or run this script manually later"
exit 1
fi
echo " Attempt $((attempt + 1))/$max_attempts..."
echo "[$(date)] Docker check attempt $((attempt + 1))/$max_attempts" >> "$LOG_FILE"
sleep 2
attempt=$((attempt + 1))
done
echo "✅ Docker is ready!"
echo "[$(date)] Docker is ready" >> "$LOG_FILE"
# Check if Superset containers are already running
if docker ps | grep -q "superset"; then
echo "✅ Superset containers are already running!"
echo ""
echo "🌐 To access Superset:"
echo " 1. Click the 'Ports' tab at the bottom of VS Code"
echo " 2. Find port 9001 and click the globe icon to open"
echo " 3. Wait 10-20 minutes for initial startup"
echo ""
echo "📝 Login credentials: admin/admin"
exit 0
# Check if docker is running
if ! docker info > /dev/null 2>&1; then
echo " Waiting for Docker to start..."
sleep 5
fi
# Clean up any existing containers
echo "🧹 Cleaning up existing containers..."
docker-compose -f docker-compose-light.yml down
docker-compose -f docker-compose-light.yml --profile mcp down
# Start services
echo "🏗️ Starting Superset in background (daemon mode)..."
echo "🏗️ Building and starting services..."
echo ""
echo "📝 Once started, login with:"
echo " Username: admin"
echo " Password: admin"
echo ""
echo "📋 Running in foreground with live logs (Ctrl+C to stop)..."
# Start in detached mode
docker-compose -f docker-compose-light.yml up -d
echo ""
echo "✅ Docker Compose started successfully!"
echo ""
echo "📋 Important information:"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "⏱️ Initial startup takes 10-20 minutes"
echo "🌐 Check the 'Ports' tab for your Superset URL (port 9001)"
echo "👤 Login: admin / admin"
echo ""
echo "📊 Useful commands:"
echo " docker-compose -f docker-compose-light.yml logs -f # Follow logs"
echo " docker-compose -f docker-compose-light.yml ps # Check status"
echo " docker-compose -f docker-compose-light.yml down # Stop services"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo ""
echo "💤 Keeping terminal open for 60 seconds to test persistence..."
sleep 60
echo "✅ Test complete - check if this terminal is still visible!"
# Show final status
docker-compose -f docker-compose-light.yml ps
# Run docker-compose and capture exit code
if [ "$ENABLE_MCP" = "true" ]; then
echo "🤖 Starting with MCP Service enabled..."
docker-compose -f docker-compose-light.yml --profile mcp up
else
docker-compose -f docker-compose-light.yml up
fi
EXIT_CODE=$?
# If it failed, provide helpful instructions

View File

@@ -0,0 +1,29 @@
{
// Extend the base configuration
"extends": "../devcontainer-base.json",
"name": "Apache Superset Development with MCP",
// Forward ports for development
"forwardPorts": [9001, 5008],
"portsAttributes": {
"9001": {
"label": "Superset (via Webpack Dev Server)",
"onAutoForward": "notify",
"visibility": "public"
},
"5008": {
"label": "MCP Service (Model Context Protocol)",
"onAutoForward": "notify",
"visibility": "private"
}
},
// Auto-start Superset with MCP on Codespace resume
"postStartCommand": "ENABLE_MCP=true .devcontainer/start-superset.sh",
// Environment variables
"containerEnv": {
"ENABLE_MCP": "true"
}
}

12
.github/CODEOWNERS vendored
View File

@@ -2,7 +2,7 @@
# https://github.com/apache/superset/issues/13351
/superset/migrations/ @mistercrunch @michael-s-molina @betodealmeida @eschutho @sadpandajoe
/superset/migrations/ @mistercrunch @michael-s-molina @betodealmeida @eschutho
# Notify some committers of changes in the components
@@ -30,13 +30,3 @@
**/*.geojson @villebro @rusackas
/superset-frontend/plugins/legacy-plugin-chart-country-map/ @villebro @rusackas
# Notify PMC members of changes to extension-related files
/superset-core/ @michael-s-molina @villebro
/superset-cli/ @michael-s-molina @villebro
/superset/core/ @michael-s-molina @villebro
/superset/extensions/ @michael-s-molina @villebro
/superset-frontend/src/packages/superset-core/ @michael-s-molina @villebro
/superset-frontend/src/core/ @michael-s-molina @villebro
/superset-frontend/src/extensions/ @michael-s-molina @villebro

View File

@@ -1,27 +1,24 @@
name: Change Detector
description: Detects file changes for pull request and push events
name: 'Change Detector'
description: 'Detects file changes for pull request and push events'
inputs:
token:
description: GitHub token for authentication
description: 'GitHub token for authentication'
required: true
outputs:
python:
description: Whether Python-related files were changed
description: 'Whether Python-related files were changed'
value: ${{ steps.change-detector.outputs.python }}
frontend:
description: Whether frontend-related files were changed
description: 'Whether frontend-related files were changed'
value: ${{ steps.change-detector.outputs.frontend }}
docker:
description: Whether docker-related files were changed
description: 'Whether docker-related files were changed'
value: ${{ steps.change-detector.outputs.docker }}
docs:
description: Whether docs-related files were changed
description: 'Whether docs-related files were changed'
value: ${{ steps.change-detector.outputs.docs }}
superset-cli:
description: Whether superset-cli package-related files were changed
value: ${{ steps.change-detector.outputs.superset-cli }}
runs:
using: composite
using: 'composite'
steps:
- name: Detect file changes
id: change-detector

View File

@@ -102,7 +102,7 @@ jobs:
docker history $IMAGE_TAG
- name: docker-compose sanity check
if: (steps.check.outputs.python || steps.check.outputs.frontend || steps.check.outputs.docker) && matrix.build_preset == 'dev'
if: (steps.check.outputs.python || steps.check.outputs.frontend || steps.check.outputs.docker) && (matrix.build_preset == 'dev' || matrix.build_preset == 'lean')
shell: bash
run: |
export SUPERSET_BUILD_TARGET=${{ matrix.build_preset }}

View File

@@ -1,67 +0,0 @@
name: Superset App CLI tests
on:
push:
branches:
- "master"
- "[0-9].[0-9]*"
pull_request:
types: [synchronize, opened, reopened, ready_for_review]
# cancel previous workflow jobs for PRs
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.run_id }}
cancel-in-progress: true
jobs:
test-load-examples:
runs-on: ubuntu-24.04
env:
PYTHONPATH: ${{ github.workspace }}
SUPERSET_CONFIG: tests.integration_tests.superset_test_config
REDIS_PORT: 16379
SUPERSET__SQLALCHEMY_DATABASE_URI: postgresql+psycopg2://superset:superset@127.0.0.1:15432/superset
services:
postgres:
image: postgres:16-alpine
env:
POSTGRES_USER: superset
POSTGRES_PASSWORD: superset
ports:
# Use custom ports for services to avoid accidentally connecting to
# GitHub action runner's default installations
- 15432:5432
redis:
image: redis:7-alpine
ports:
- 16379:6379
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@v4
with:
persist-credentials: false
submodules: recursive
- name: Check for file changes
id: check
uses: ./.github/actions/change-detector/
with:
token: ${{ secrets.GITHUB_TOKEN }}
- name: Setup Python
if: steps.check.outputs.python
uses: ./.github/actions/setup-backend/
- name: Setup Postgres
if: steps.check.outputs.python
uses: ./.github/actions/cached-dependencies
with:
run: setup-postgres
- name: superset init
if: steps.check.outputs.python
run: |
pip install -e .
superset db upgrade
superset load_test_users
- name: superset load_examples
if: steps.check.outputs.python
run: |
# load examples without test data
superset load_examples --load-big-data

View File

@@ -1,4 +1,4 @@
name: Superset CLI Package Tests
name: Superset CLI tests
on:
push:
@@ -14,51 +14,54 @@ concurrency:
cancel-in-progress: true
jobs:
test-superset-cli-package:
test-load-examples:
runs-on: ubuntu-24.04
strategy:
matrix:
python-version: ["previous", "current", "next"]
defaults:
run:
working-directory: superset-cli
env:
PYTHONPATH: ${{ github.workspace }}
SUPERSET_CONFIG: tests.integration_tests.superset_test_config
REDIS_PORT: 16379
SUPERSET__SQLALCHEMY_DATABASE_URI: postgresql+psycopg2://superset:superset@127.0.0.1:15432/superset
services:
postgres:
image: postgres:16-alpine
env:
POSTGRES_USER: superset
POSTGRES_PASSWORD: superset
ports:
# Use custom ports for services to avoid accidentally connecting to
# GitHub action runner's default installations
- 15432:5432
redis:
image: redis:7-alpine
ports:
- 16379:6379
steps:
- name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
uses: actions/checkout@v4
with:
persist-credentials: false
submodules: recursive
- name: Check for file changes
id: check
uses: ./.github/actions/change-detector/
with:
token: ${{ secrets.GITHUB_TOKEN }}
- name: Setup Python
if: steps.check.outputs.superset-cli
if: steps.check.outputs.python
uses: ./.github/actions/setup-backend/
- name: Setup Postgres
if: steps.check.outputs.python
uses: ./.github/actions/cached-dependencies
with:
python-version: ${{ matrix.python-version }}
requirements-type: dev
- name: Run pytest with coverage
if: steps.check.outputs.superset-cli
run: setup-postgres
- name: superset init
if: steps.check.outputs.python
run: |
pytest --cov=superset_cli --cov-report=xml --cov-report=term-missing --cov-report=html -v --tb=short
- name: Upload coverage reports to Codecov
if: steps.check.outputs.superset-cli
uses: codecov/codecov-action@v3
with:
file: ./coverage.xml
flags: superset-cli
name: superset-cli-coverage
fail_ci_if_error: false
- name: Upload HTML coverage report
if: steps.check.outputs.superset-cli
uses: actions/upload-artifact@v4
with:
name: superset-cli-coverage-html
path: htmlcov/
pip install -e .
superset db upgrade
superset load_test_users
- name: superset load_examples
if: steps.check.outputs.python
run: |
# load examples without test data
superset load_examples --load-big-data

View File

@@ -12,7 +12,7 @@ jobs:
steps:
- name: Welcome Message
uses: actions/first-interaction@v2
uses: actions/first-interaction@v1
continue-on-error: true
with:
repo-token: ${{ github.token }}

5
.gitignore vendored
View File

@@ -43,7 +43,7 @@ _modules
_static
build
app.db
*.egg-info/
apache_superset.egg-info/
changelog.sh
dist
dump.rdb
@@ -130,7 +130,4 @@ superset/static/stats/statistics.html
# LLM-related
CLAUDE.local.md
PROJECT.md
.aider*
.claude_rc*
.env.local

View File

@@ -23,9 +23,7 @@ repos:
rev: v1.15.0
hooks:
- id: mypy
name: mypy (main)
args: [--check-untyped-defs]
exclude: ^superset-cli/
additional_dependencies: [
types-simplejson,
types-python-dateutil,
@@ -40,10 +38,6 @@ repos:
types-paramiko,
types-Markdown,
]
- id: mypy
name: mypy (superset-cli)
args: [--check-untyped-defs]
files: ^superset-cli/
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v5.0.0
hooks:
@@ -60,25 +54,25 @@ repos:
args: ["--markdown-linebreak-ext=md"]
- repo: local
hooks:
- id: eslint-frontend
name: eslint (frontend)
entry: ./scripts/eslint.sh
language: system
pass_filenames: true
files: ^superset-frontend/.*\.(js|jsx|ts|tsx)$
- id: eslint-docs
name: eslint (docs)
entry: bash -c 'cd docs && FILES=$(echo "$@" | sed "s|docs/||g") && yarn eslint --fix --ext .js,.jsx,.ts,.tsx --quiet $FILES'
language: system
pass_filenames: true
files: ^docs/.*\.(js|jsx|ts|tsx)$
- id: type-checking-frontend
name: Type-Checking (Frontend)
entry: ./scripts/check-type.js package=superset-frontend excludeDeclarationDir=cypress-base
language: system
files: ^superset-frontend\/.*\.(js|jsx|ts|tsx)$
exclude: ^superset-frontend/cypress-base\/
require_serial: true
- id: eslint-frontend
name: eslint (frontend)
entry: ./scripts/eslint.sh
language: system
pass_filenames: true
files: ^superset-frontend/.*\.(js|jsx|ts|tsx)$
- id: eslint-docs
name: eslint (docs)
entry: bash -c 'cd docs && FILES=$(echo "$@" | sed "s|docs/||g") && yarn eslint --fix --ext .js,.jsx,.ts,.tsx --quiet $FILES'
language: system
pass_filenames: true
files: ^docs/.*\.(js|jsx|ts|tsx)$
- id: type-checking-frontend
name: Type-Checking (Frontend)
entry: ./scripts/check-type.js package=superset-frontend excludeDeclarationDir=cypress-base
language: system
files: ^superset-frontend\/.*\.(js|jsx|ts|tsx)$
exclude: ^superset-frontend/cypress-base\/
require_serial: true
# blacklist unsafe functions like make_url (see #19526)
- repo: https://github.com/skorokithakis/blacklist-pre-commit-hook
rev: e2f070289d8eddcaec0b580d3bde29437e7c8221
@@ -100,21 +94,21 @@ repos:
args: [--fix]
- repo: local
hooks:
- id: pylint
name: pylint with custom Superset plugins
entry: bash
language: system
types: [python]
exclude: ^(tests/|superset/migrations/|scripts/|RELEASING/|docker/)
args:
- -c
- |
TARGET_BRANCH=${GITHUB_BASE_REF:-master}
git fetch origin "$TARGET_BRANCH"
BASE=$(git merge-base origin/"$TARGET_BRANCH" HEAD)
files=$(git diff --name-only --diff-filter=ACM "$BASE"..HEAD | grep '^superset/.*\.py$' || true)
if [ -n "$files" ]; then
pylint --rcfile=.pylintrc --load-plugins=superset.extensions.pylint --reports=no $files
else
echo "No Python files to lint."
fi
- id: pylint
name: pylint with custom Superset plugins
entry: bash
language: system
types: [python]
exclude: ^(tests/|superset/migrations/|scripts/|RELEASING/|docker/)
args:
- -c
- |
TARGET_BRANCH=${GITHUB_BASE_REF:-master}
git fetch origin "$TARGET_BRANCH"
BASE=$(git merge-base origin/"$TARGET_BRANCH" HEAD)
files=$(git diff --name-only --diff-filter=ACM "$BASE"..HEAD | grep '^superset/.*\.py$' || true)
if [ -n "$files" ]; then
pylint --rcfile=.pylintrc --load-plugins=superset.extensions.pylint --reports=no $files
else
echo "No Python files to lint."
fi

View File

@@ -32,8 +32,6 @@ apache_superset.egg-info
# json and csv in general cannot have comments
.*json
.*csv
# jinja templates often need to be as-is
.*j2
# Generated doc files
env/*
docs/.htaccess*

215
CHART_METADATA_API.md Normal file
View File

@@ -0,0 +1,215 @@
# Chart Metadata API Reference
The Superset MCP service provides rich metadata alongside chart generation to enable better UI integration and user experiences.
## Background & Design Philosophy
Modern chart systems need to provide more than just visual output. Inspired by contemporary web standards and LLM integration patterns, this metadata system addresses several key needs:
**Accessibility-First Design**: Following WCAG guidelines and `aria-*` attribute patterns, charts include semantic descriptions and accessibility metadata to ensure inclusive experiences.
**Rich Context for AI Systems**: Similar to how platforms like social media generate rich previews (OpenGraph, Twitter Cards), charts provide semantic understanding beyond just visual representation - enabling AI agents to reason about and describe visualizations meaningfully.
**Performance-Aware Integration**: Modern web APIs emphasize performance transparency (Core Web Vitals, etc.). Charts include execution metrics and optimization suggestions to help UIs make informed decisions about rendering and user feedback.
**Capability-Driven UX**: Rather than requiring UIs to hardcode chart type behaviors, the system exposes what each chart can actually do - enabling dynamic, contextual interfaces that adapt to chart capabilities.
## Overview
When generating charts via `generate_chart`, the response includes structured metadata that helps UIs:
- Present appropriate controls and interactions
- Generate accessible descriptions
- Optimize rendering performance
- Guide user workflows
## Metadata Types
### ChartCapabilities
Describes what interactions and features the chart supports.
```python
{
"supports_interaction": bool, # User can interact (zoom, pan, hover)
"supports_real_time": bool, # Chart can update with live data
"supports_drill_down": bool, # Can navigate to more detailed views
"supports_export": bool, # Can be exported to other formats
"optimal_formats": [ # Recommended preview formats
"url", # Static image URL
"interactive", # HTML with JavaScript controls
"ascii", # Text-based representation
"vega_lite" # Vega-Lite specification
],
"data_types": [ # Types of data visualized
"time_series", # Time-based data
"categorical", # Discrete categories
"metric" # Numeric measurements
]
}
```
**UI Integration:**
- Show/hide interaction controls based on `supports_interaction`
- Enable real-time updates if `supports_real_time`
- Display drill-down options for `supports_drill_down`
- Choose optimal preview format from `optimal_formats`
### ChartSemantics
Provides semantic understanding of what the chart represents and reveals.
```python
{
"primary_insight": "Shows trends and changes over time",
"data_story": "This line chart analyzes sales, revenue over Q1-Q4",
"recommended_actions": [
"Review data patterns and trends",
"Consider filtering for more detail",
"Export chart for reporting"
],
"anomalies": [], # Notable outliers (future enhancement)
"statistical_summary": {} # Key statistics (future enhancement)
}
```
**UI Integration:**
- Display `primary_insight` as chart description
- Use `data_story` for accessibility and tooltips
- Show `recommended_actions` as suggested next steps
- Highlight `anomalies` in the visualization
### AccessibilityMetadata
Information for creating inclusive, accessible chart experiences.
```python
{
"color_blind_safe": bool, # Uses colorblind-friendly palette
"alt_text": "Chart showing Sales Data over time",
"high_contrast_available": bool # High contrast version available
}
```
**UI Integration:**
- Use `alt_text` for screen readers
- Show accessibility indicators if `color_blind_safe`
- Offer high contrast mode if available
### PerformanceMetadata
Performance information for optimization and user feedback.
```python
{
"query_duration_ms": 1250, # Time to generate chart data
"cache_status": "hit|miss|error", # Whether data came from cache
"optimization_suggestions": [ # Performance improvement tips
"Consider adding date filters to reduce data volume",
"Chart complexity may impact load time"
]
}
```
**UI Integration:**
- Show loading indicators based on `query_duration_ms`
- Display cache status for debugging
- Present `optimization_suggestions` to users
- Warn about slow queries
## Example Response
```json
{
"chart": {
"id": 123,
"slice_name": "Sales Trends Q1-Q4",
"viz_type": "echarts_timeseries_line",
"url": "/explore/?slice_id=123"
},
"capabilities": {
"supports_interaction": true,
"supports_real_time": false,
"supports_drill_down": false,
"supports_export": true,
"optimal_formats": ["url", "interactive", "ascii"],
"data_types": ["time_series", "metric"]
},
"semantics": {
"primary_insight": "Shows trends and changes over time",
"data_story": "This line chart analyzes sales over Q1-Q4",
"recommended_actions": [
"Review seasonal patterns",
"Export for quarterly report"
]
},
"accessibility": {
"color_blind_safe": true,
"alt_text": "Line chart showing sales trends from Q1 to Q4",
"high_contrast_available": false
},
"performance": {
"query_duration_ms": 450,
"cache_status": "miss",
"optimization_suggestions": []
}
}
```
## Usage Examples
### React Component Integration
```jsx
function ChartComponent({ chartData }) {
const { capabilities, semantics, accessibility, performance } = chartData;
return (
<div>
{/* Accessibility */}
<img
src={chartData.chart.url}
alt={accessibility.alt_text}
aria-describedby="chart-description"
/>
{/* Semantic description */}
<p id="chart-description">{semantics.primary_insight}</p>
{/* Conditional controls based on capabilities */}
{capabilities.supports_interaction && (
<InteractiveControls />
)}
{capabilities.supports_export && (
<ExportButton />
)}
{/* Performance feedback */}
{performance.query_duration_ms > 2000 && (
<SlowQueryWarning suggestions={performance.optimization_suggestions} />
)}
{/* Recommended actions */}
<ActionSuggestions actions={semantics.recommended_actions} />
</div>
);
}
```
## Chart Type Mapping
Different chart types provide different capabilities:
| Chart Type | Interaction | Real-time | Drill-down | Optimal Formats |
|------------|------------|-----------|------------|-----------------|
| `echarts_timeseries_line` | ✅ | ✅ | ❌ | url, interactive, ascii |
| `echarts_timeseries_bar` | ✅ | ✅ | ❌ | url, interactive, ascii |
| `table` | ❌ | ❌ | ✅ | url, table, ascii |
| `pie` | ✅ | ❌ | ❌ | url, interactive |
## Future Enhancements
- **Statistical Summary**: Automatic calculation of mean, median, trends
- **Anomaly Detection**: Identification of outliers and unusual patterns
- **Smart Recommendations**: ML-powered suggestions for chart improvements
- **Accessibility Scoring**: Automated accessibility compliance checking

View File

@@ -145,9 +145,6 @@ RUN if [ "$BUILD_TRANSLATIONS" = "true" ]; then \
######################################################################
FROM python-base AS python-common
# Build arg to pre-populate examples DuckDB file
ARG LOAD_EXAMPLES_DUCKDB="false"
ENV SUPERSET_HOME="/app/superset_home" \
HOME="/app/superset_home" \
SUPERSET_ENV="production" \
@@ -199,18 +196,6 @@ RUN /app/docker/apt-install.sh \
libecpg-dev \
libldap2-dev
# Pre-load examples DuckDB file if requested
RUN if [ "$LOAD_EXAMPLES_DUCKDB" = "true" ]; then \
mkdir -p /app/data && \
echo "Downloading pre-built examples.duckdb..." && \
curl -L -o /app/data/examples.duckdb \
"https://raw.githubusercontent.com/apache-superset/examples-data/master/examples.duckdb" && \
chown -R superset:superset /app/data; \
else \
mkdir -p /app/data && \
chown -R superset:superset /app/data; \
fi
# Copy compiled things from previous stages
COPY --from=superset-node /app/superset/static/assets superset/static/assets
@@ -234,10 +219,6 @@ FROM python-common AS lean
# Install Python dependencies using docker/pip-install.sh
COPY requirements/base.txt requirements/
# Copy superset-core package needed for editable install in base.txt
COPY superset-core superset-core
RUN --mount=type=cache,target=${SUPERSET_HOME}/.cache/uv \
/app/docker/pip-install.sh --requires-build-essential -r requirements/base.txt
# Install the superset package
@@ -260,11 +241,6 @@ RUN /app/docker/apt-install.sh \
# Copy development requirements and install them
COPY requirements/*.txt requirements/
# Copy local packages needed for editable installs in development.txt
COPY superset-core superset-core
COPY superset-cli superset-cli
# Install Python dependencies using docker/pip-install.sh
RUN --mount=type=cache,target=${SUPERSET_HOME}/.cache/uv \
/app/docker/pip-install.sh --requires-build-essential -r requirements/development.txt
@@ -282,15 +258,6 @@ USER superset
######################################################################
FROM lean AS ci
USER root
RUN uv pip install .[postgres,duckdb]
USER superset
CMD ["/app/docker/entrypoints/docker-ci.sh"]
######################################################################
# Showtime image - lean + DuckDB for examples database
######################################################################
FROM lean AS showtime
USER root
RUN uv pip install .[duckdb]
RUN uv pip install .[postgres]
USER superset
CMD ["/app/docker/entrypoints/docker-ci.sh"]

View File

@@ -9,16 +9,13 @@ Apache Superset is a data visualization platform with Flask/Python backend and R
### Frontend Modernization
- **NO `any` types** - Use proper TypeScript types
- **NO JavaScript files** - Convert to TypeScript (.ts/.tsx)
- **Use @superset-ui/core** - Don't import Ant Design directly, prefer Ant Design component wrappers from @superset-ui/core/components
- **Use antd theming tokens** - Prefer antd tokens over legacy theming tokens
- **Avoid custom css and styles** - Follow antd best practices and avoid styling and custom CSS whenever possible
- **Use @superset-ui/core** - Don't import Ant Design directly
### Testing Strategy Migration
- **Prefer unit tests** over integration tests
- **Prefer integration tests** over Cypress end-to-end tests
- **Cypress is last resort** - Actively moving away from Cypress
- **Use Jest + React Testing Library** for component testing
- **Use `test()` instead of `describe()`** - Follow [avoid nesting when testing](https://kentcdodds.com/blog/avoid-nesting-when-youre-testing) principles
### Backend Type Safety
- **Add type hints** - All new Python code needs proper typing
@@ -183,6 +180,7 @@ pre-commit run eslint # Frontend linting
## Platform-Specific Instructions
- **[LLMS.md](LLMS.md)** - General LLM development guide (READ THIS FIRST)
- **[CLAUDE.md](CLAUDE.md)** - For Claude/Anthropic tools
- **[.github/copilot-instructions.md](.github/copilot-instructions.md)** - For GitHub Copilot
- **[GEMINI.md](GEMINI.md)** - For Google Gemini tools

View File

@@ -32,10 +32,11 @@ else
SUPERSET_VERSION="${1}"
SUPERSET_RC="${2}"
SUPERSET_PGP_FULLNAME="${3}"
SUPERSET_VERSION_RC="${SUPERSET_VERSION}rc${SUPERSET_RC}"
SUPERSET_RELEASE_RC_TARBALL="apache_superset-${SUPERSET_VERSION_RC}-source.tar.gz"
fi
SUPERSET_VERSION_RC="${SUPERSET_VERSION}rc${SUPERSET_RC}"
if [ -z "${SUPERSET_SVN_DEV_PATH}" ]; then
SUPERSET_SVN_DEV_PATH="$HOME/svn/superset_dev"
fi

View File

@@ -28,7 +28,6 @@ These features are considered **unfinished** and should only be used on developm
[//]: # "PLEASE KEEP THE LIST SORTED ALPHABETICALLY"
- ALERT_REPORT_TABS
- DATE_RANGE_TIMESHIFTS_ENABLED
- ENABLE_ADVANCED_DATA_TYPES
- PRESTO_EXPAND_DATA
- SHARE_QUERIES_VIA_KV_STORE

View File

@@ -94,9 +94,9 @@ under the License.
| can available domains on Superset |:heavy_check_mark:|:heavy_check_mark:|:heavy_check_mark:|O|
| can request access on Superset |:heavy_check_mark:|:heavy_check_mark:|:heavy_check_mark:|O|
| can dashboard on Superset |:heavy_check_mark:|:heavy_check_mark:|:heavy_check_mark:|O|
| can post on TableSchemaView |:heavy_check_mark:|O|O|:heavy_check_mark:|
| can expanded on TableSchemaView |:heavy_check_mark:|O|O|:heavy_check_mark:|
| can delete on TableSchemaView |:heavy_check_mark:|O|O|:heavy_check_mark:|
| can post on TableSchemaView |:heavy_check_mark:|:heavy_check_mark:|O|O|
| can expanded on TableSchemaView |:heavy_check_mark:|:heavy_check_mark:|O|O|
| can delete on TableSchemaView |:heavy_check_mark:|:heavy_check_mark:|O|O|
| can get on TabStateView |:heavy_check_mark:|:heavy_check_mark:|:heavy_check_mark:|:heavy_check_mark:|
| can post on TabStateView |:heavy_check_mark:|:heavy_check_mark:|:heavy_check_mark:|:heavy_check_mark:|
| can delete query on TabStateView |:heavy_check_mark:|:heavy_check_mark:|:heavy_check_mark:|:heavy_check_mark:|

View File

@@ -23,14 +23,6 @@ This file documents any backwards-incompatible changes in Superset and
assists people when migrating to a new version.
## Next
- [34782](https://github.com/apache/superset/pull/34782): Dataset exports now include the dataset ID in their file name (similar to charts and dashboards). If managing assets as code, make sure to rename existing dataset YAMLs to include the ID (and avoid duplicated files).
- [34536](https://github.com/apache/superset/pull/34536): The `ENVIRONMENT_TAG_CONFIG` color values have changed to support only Ant Design semantic colors. Update your `superset_config.py`:
- Change `"error.base"` to just `"error"` after this PR
- Change any hex color values to one of: `"success"`, `"processing"`, `"error"`, `"warning"`, `"default"`
- Custom colors are no longer supported to maintain consistency with Ant Design components
- [34561](https://github.com/apache/superset/pull/34561) Added tiled screenshot functionality for Playwright-based reports to handle large dashboards more efficiently. When enabled (default: `SCREENSHOT_TILED_ENABLED = True`), dashboards with 20+ charts or height exceeding 5000px will be captured using multiple viewport-sized tiles and combined into a single image. This improves report generation performance and reliability for large dashboards.
Note: Pillow is now a required dependency (previously optional) to support image processing for tiled screenshots.
`thumbnails` optional dependency is now deprecated and will be removed in the next major release (7.0).
- [33084](https://github.com/apache/superset/pull/33084) The DISALLOWED_SQL_FUNCTIONS configuration now includes additional potentially sensitive database functions across PostgreSQL, MySQL, SQLite, MS SQL Server, and ClickHouse. Existing queries using these functions may now be blocked. Review your SQL Lab queries and dashboards if you encounter "disallowed function" errors after upgrading
- [34235](https://github.com/apache/superset/pull/34235) CSV exports now use `utf-8-sig` encoding by default to include a UTF-8 BOM, improving compatibility with Excel.
- [34258](https://github.com/apache/superset/pull/34258) changing the default in Dockerfile to INCLUDE_CHROMIUM="false" (from "true") in the past. This ensures the `lean` layer is lean by default, and people can opt-in to the `chromium` layer by setting the build arg `INCLUDE_CHROMIUM=true`. This is a breaking change for anyone using the `lean` layer, as it will no longer include Chromium by default.
@@ -40,7 +32,6 @@ Note: Pillow is now a required dependency (previously optional) to support image
- [32317](https://github.com/apache/superset/pull/32317) The horizontal filter bar feature is now out of testing/beta development and its feature flag `HORIZONTAL_FILTER_BAR` has been removed.
- [31590](https://github.com/apache/superset/pull/31590) Marks the begining of intricate work around supporting dynamic Theming, and breaks support for [THEME_OVERRIDES](https://github.com/apache/superset/blob/732de4ac7fae88e29b7f123b6cbb2d7cd411b0e4/superset/config.py#L671) in favor of a new theming system based on AntD V5. Likely this will be in disrepair until settling over the 5.x lifecycle.
- [32432](https://github.com/apache/superset/pull/31260) Moves the List Roles FAB view to the frontend and requires `FAB_ADD_SECURITY_API` to be enabled in the configuration and `superset init` to be executed.
- [34319](https://github.com/apache/superset/pull/34319) Drill to Detail and Drill By is now supported in Embedded mode, and also with the `DASHBOARD_RBAC` FF. If you don't want to expose these features in Embedded / `DASHBOARD_RBAC`, make sure the roles used for Embedded / `DASHBOARD_RBAC`don't have the required permissions to perform D2D actions.
## 5.0.0

View File

@@ -17,47 +17,22 @@
# -----------------------------------------------------------------------
# Lightweight docker-compose for running multiple Superset instances
# This includes only essential services: database and Superset app (no Redis)
# This includes only essential services: database, Redis, and Superset app
#
# RUNNING SUPERSET:
# 1. Start services: docker-compose -f docker-compose-light.yml up
# 2. Access at: http://localhost:9001 (or NODE_PORT if specified)
#
# RUNNING MULTIPLE INSTANCES:
# IMPORTANT: To run multiple instances in parallel:
# - Use different project names: docker-compose -p project1 -f docker-compose-light.yml up
# - Use different NODE_PORT values: NODE_PORT=9002 docker-compose -p project2 -f docker-compose-light.yml up
# - Volumes are isolated by project name (e.g., project1_db_home_light, project2_db_home_light)
# - Database name is intentionally different (superset_light) to prevent accidental cross-connections
#
# RUNNING TESTS WITH PYTEST:
# Tests run in an isolated environment with a separate test database.
# The pytest-runner service automatically creates and initializes the test database on first use.
# MCP Service (Model Context Protocol):
# - Optional service for LLM agent integration, available under 'mcp' profile
# - To include MCP: docker-compose -f docker-compose-light.yml --profile mcp up
# - MCP runs on port 5008 by default (customize with MCP_PORT=5009)
# - Enable SQL debugging with MCP_SQL_DEBUG=true
#
# Basic usage:
# docker-compose -f docker-compose-light.yml run --rm pytest-runner pytest tests/unit_tests/
#
# Run specific test file:
# docker-compose -f docker-compose-light.yml run --rm pytest-runner pytest tests/unit_tests/test_foo.py
#
# Run with pytest options:
# docker-compose -f docker-compose-light.yml run --rm pytest-runner pytest -v -s -x tests/
#
# Force reload test database and run tests (when tests are failing due to bad state):
# docker-compose -f docker-compose-light.yml run --rm -e FORCE_RELOAD=true pytest-runner pytest tests/
#
# Run any command in test environment:
# docker-compose -f docker-compose-light.yml run --rm pytest-runner bash
# docker-compose -f docker-compose-light.yml run --rm pytest-runner pytest --collect-only
#
# For parallel test execution with different projects:
# docker-compose -p project1 -f docker-compose-light.yml run --rm pytest-runner pytest tests/
#
# DEVELOPMENT TIPS:
# - First test run takes ~20-30 seconds (database creation + initialization)
# - Subsequent runs are fast (~2-3 seconds startup)
# - Use FORCE_RELOAD=true when you need a clean test database
# - Tests use SimpleCache instead of Redis (no Redis required)
# - Set SUPERSET_LOG_LEVEL=debug in docker/.env-local for detailed logs
# For verbose logging during development:
# - Set SUPERSET_LOG_LEVEL=debug in docker/.env-local for detailed Superset logs
# -----------------------------------------------------------------------
x-superset-user: &superset-user root
x-superset-volumes: &superset-volumes
@@ -77,7 +52,6 @@ x-common-build: &common-build
INCLUDE_CHROMIUM: ${INCLUDE_CHROMIUM:-false}
INCLUDE_FIREFOX: ${INCLUDE_FIREFOX:-false}
BUILD_TRANSLATIONS: ${BUILD_TRANSLATIONS:-false}
LOAD_EXAMPLES_DUCKDB: ${LOAD_EXAMPLES_DUCKDB:-true}
services:
db-light:
@@ -88,12 +62,13 @@ services:
required: false
image: postgres:16
restart: unless-stopped
# No host port mapping - only accessible within Docker network
volumes:
- db_home_light:/var/lib/postgresql/data
- ./docker/docker-entrypoint-initdb.d:/docker-entrypoint-initdb.d
environment:
# Override database name to avoid conflicts
POSTGRES_DB: superset_light
command: postgres -c max_connections=200
superset-light:
env_file:
@@ -105,6 +80,7 @@ services:
<<: *common-build
command: ["/app/docker/docker-bootstrap.sh", "app"]
restart: unless-stopped
# No host port mapping - accessed via webpack dev server proxy
extra_hosts:
- "host.docker.internal:host-gateway"
user: *superset-user
@@ -113,10 +89,15 @@ services:
condition: service_completed_successfully
volumes: *superset-volumes
environment:
# Override DB connection for light service
DATABASE_HOST: db-light
DATABASE_DB: superset_light
POSTGRES_DB: superset_light
SUPERSET__SQLALCHEMY_EXAMPLES_URI: "duckdb:////app/data/examples.duckdb"
EXAMPLES_HOST: db-light
EXAMPLES_DB: superset_light
EXAMPLES_USER: superset
EXAMPLES_PASSWORD: superset
# Use light-specific config that disables Redis
SUPERSET_CONFIG_PATH: /app/docker/pythonpath_dev/superset_config_docker_light.py
superset-init-light:
@@ -128,16 +109,21 @@ services:
required: true
- path: docker/.env-local # optional override
required: false
user: *superset-user
depends_on:
db-light:
condition: service_started
user: *superset-user
volumes: *superset-volumes
environment:
# Override DB connection for light service
DATABASE_HOST: db-light
DATABASE_DB: superset_light
POSTGRES_DB: superset_light
SUPERSET__SQLALCHEMY_EXAMPLES_URI: "duckdb:////app/data/examples.duckdb"
EXAMPLES_HOST: db-light
EXAMPLES_DB: superset_light
EXAMPLES_USER: superset
EXAMPLES_PASSWORD: superset
# Use light-specific config that disables Redis
SUPERSET_CONFIG_PATH: /app/docker/pythonpath_dev/superset_config_docker_light.py
healthcheck:
disable: true
@@ -170,30 +156,36 @@ services:
required: false
volumes: *superset-volumes
pytest-runner:
superset-mcp-light:
profiles:
- mcp
build:
<<: *common-build
entrypoint: ["/app/docker/docker-pytest-entrypoint.sh"]
command: ["/app/docker/docker-bootstrap.sh", "mcp"]
restart: unless-stopped
ports:
- "127.0.0.1:${MCP_PORT:-5008}:5008" # Parameterized port
extra_hosts:
- "host.docker.internal:host-gateway"
user: *superset-user
depends_on:
superset-init-light:
condition: service_completed_successfully
volumes: *superset-volumes
env_file:
- path: docker/.env # default
required: true
- path: docker/.env-local # optional override
required: false
profiles:
- test # Only starts when --profile test is used
depends_on:
db-light:
condition: service_started
user: *superset-user
volumes: *superset-volumes
environment:
# Override DB connection for light service
DATABASE_HOST: db-light
DATABASE_DB: test
POSTGRES_DB: test
SUPERSET__SQLALCHEMY_DATABASE_URI: postgresql+psycopg2://superset:superset@db-light:5432/test
SUPERSET__SQLALCHEMY_EXAMPLES_URI: "duckdb:////app/data/examples.duckdb"
SUPERSET_CONFIG: superset_test_config_light
PYTHONPATH: /app/pythonpath:/app/docker/pythonpath_dev:/app
DATABASE_DB: superset_light
POSTGRES_DB: superset_light
# Use light-specific config that disables Redis
SUPERSET_CONFIG_PATH: /app/docker/pythonpath_dev/superset_config_docker_light.py
# Enable SQL debugging for MCP if needed
SQLALCHEMY_DEBUG: ${MCP_SQL_DEBUG:-false}
volumes:
superset_home_light:

View File

@@ -42,7 +42,6 @@ x-common-build: &common-build
INCLUDE_CHROMIUM: ${INCLUDE_CHROMIUM:-false}
INCLUDE_FIREFOX: ${INCLUDE_FIREFOX:-false}
BUILD_TRANSLATIONS: ${BUILD_TRANSLATIONS:-false}
LOAD_EXAMPLES_DUCKDB: ${LOAD_EXAMPLES_DUCKDB:-true}
services:
nginx:
@@ -108,8 +107,6 @@ services:
superset-init:
condition: service_completed_successfully
volumes: *superset-volumes
environment:
SUPERSET__SQLALCHEMY_EXAMPLES_URI: "duckdb:////app/data/examples.duckdb"
superset-websocket:
container_name: superset_websocket
@@ -161,8 +158,6 @@ services:
condition: service_started
user: *superset-user
volumes: *superset-volumes
environment:
SUPERSET__SQLALCHEMY_EXAMPLES_URI: "duckdb:////app/data/examples.duckdb"
healthcheck:
disable: true

View File

@@ -78,6 +78,10 @@ case "${1}" in
echo "Starting web app..."
/usr/bin/run-server.sh
;;
mcp)
echo "Starting MCP service..."
superset mcp run --host 0.0.0.0 --port ${MCP_PORT:-5008} --debug
;;
*)
echo "Unknown Operation!!!"
;;

View File

@@ -69,8 +69,6 @@ echo_step "3" "Complete" "Setting up roles and perms"
if [ "$SUPERSET_LOAD_EXAMPLES" = "yes" ]; then
# Load some data to play with
echo_step "4" "Starting" "Loading examples"
# If Cypress run which consumes superset_test_config load required data for tests
if [ "$CYPRESS_CONFIG" == "true" ]; then
superset load_examples --load-test-data

View File

@@ -1,152 +0,0 @@
#!/bin/bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
set -e
# Wait for PostgreSQL to be ready
echo "Waiting for database to be ready..."
for i in {1..30}; do
if python3 -c "
import psycopg2
try:
conn = psycopg2.connect(host='db-light', user='superset', password='superset', database='superset_light')
conn.close()
print('Database is ready!')
except:
exit(1)
" 2>/dev/null; then
echo "Database connection established!"
break
fi
echo "Waiting for database... ($i/30)"
if [ $i -eq 30 ]; then
echo "Database connection timeout after 30 seconds"
exit 1
fi
sleep 1
done
# Handle database setup based on FORCE_RELOAD
if [ "${FORCE_RELOAD}" = "true" ]; then
echo "Force reload requested - resetting test database"
# Drop and recreate the test database using Python
python3 -c "
import psycopg2
from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT
# Connect to default database
conn = psycopg2.connect(host='db-light', user='superset', password='superset', database='superset_light')
conn.set_isolation_level(ISOLATION_LEVEL_AUTOCOMMIT)
cur = conn.cursor()
# Drop and recreate test database
try:
cur.execute('DROP DATABASE IF EXISTS test')
except:
pass
cur.execute('CREATE DATABASE test')
conn.close()
# Connect to test database to create schemas
conn = psycopg2.connect(host='db-light', user='superset', password='superset', database='test')
conn.set_isolation_level(ISOLATION_LEVEL_AUTOCOMMIT)
cur = conn.cursor()
cur.execute('CREATE SCHEMA sqllab_test_db')
cur.execute('CREATE SCHEMA admin_database')
cur.close()
conn.close()
print('Test database reset successfully')
"
# Use --no-reset-db since we already reset it
FLAGS="--no-reset-db"
else
echo "Using existing test database (set FORCE_RELOAD=true to reset)"
FLAGS="--no-reset-db"
# Ensure test database exists using Python
python3 -c "
import psycopg2
from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT
# Check if test database exists
try:
conn = psycopg2.connect(host='db-light', user='superset', password='superset', database='test')
conn.close()
print('Test database already exists')
except:
print('Creating test database...')
# Connect to default database to create test database
conn = psycopg2.connect(host='db-light', user='superset', password='superset', database='superset_light')
conn.set_isolation_level(ISOLATION_LEVEL_AUTOCOMMIT)
cur = conn.cursor()
# Create test database
cur.execute('CREATE DATABASE test')
conn.close()
# Connect to test database to create schemas
conn = psycopg2.connect(host='db-light', user='superset', password='superset', database='test')
conn.set_isolation_level(ISOLATION_LEVEL_AUTOCOMMIT)
cur = conn.cursor()
cur.execute('CREATE SCHEMA IF NOT EXISTS sqllab_test_db')
cur.execute('CREATE SCHEMA IF NOT EXISTS admin_database')
cur.close()
conn.close()
print('Test database created successfully')
"
fi
# Always run database migrations to ensure schema is up to date
echo "Running database migrations..."
cd /app
superset db upgrade
# Initialize test environment if needed
if [ "${FORCE_RELOAD}" = "true" ] || [ ! -f "/app/superset_home/.test_initialized" ]; then
echo "Initializing test environment..."
# Run initialization commands
superset init
echo "Loading test users..."
superset load-test-users
# Mark as initialized
touch /app/superset_home/.test_initialized
else
echo "Test environment already initialized (skipping init and load-test-users)"
echo "Tip: Use FORCE_RELOAD=true to reinitialize the test database"
fi
# Create missing scripts needed for tests
if [ ! -f "/app/scripts/tag_latest_release.sh" ]; then
echo "Creating missing tag_latest_release.sh script for tests..."
cp /app/docker/tag_latest_release.sh /app/scripts/tag_latest_release.sh 2>/dev/null || true
fi
# Install pip module for Shillelagh compatibility (aligns with CI environment)
echo "Installing pip module for Shillelagh compatibility..."
uv pip install pip
# If arguments provided, execute them
if [ $# -gt 0 ]; then
exec "$@"
fi

View File

@@ -26,7 +26,7 @@ gunicorn \
--workers ${SERVER_WORKER_AMOUNT:-1} \
--worker-class ${SERVER_WORKER_CLASS:-gthread} \
--threads ${SERVER_THREADS_AMOUNT:-20} \
--log-level "${GUNICORN_LOGLEVEL:-info}" \
--log-level "${GUNICORN_LOGLEVEL:info}" \
--timeout ${GUNICORN_TIMEOUT:-60} \
--keep-alive ${GUNICORN_KEEPALIVE:-2} \
--max-requests ${WORKER_MAX_REQUESTS:-0} \

View File

@@ -23,57 +23,25 @@ MIN_MEM_FREE_GB=3
MIN_MEM_FREE_KB=$(($MIN_MEM_FREE_GB*1000000))
echo_mem_warn() {
# Check if running in Codespaces first
if [[ -n "${CODESPACES}" ]]; then
echo "Memory available: Codespaces managed"
return
fi
MEM_FREE_KB=$(awk '/MemFree/ { printf "%s \n", $2 }' /proc/meminfo)
MEM_FREE_GB=$(awk '/MemFree/ { printf "%s \n", $2/1024/1024 }' /proc/meminfo)
# Check platform and get memory accordingly
if [[ -f /proc/meminfo ]]; then
# Linux
if grep -q MemAvailable /proc/meminfo; then
MEM_AVAIL_KB=$(awk '/MemAvailable/ { printf "%s \n", $2 }' /proc/meminfo)
MEM_AVAIL_GB=$(awk '/MemAvailable/ { printf "%s \n", $2/1024/1024 }' /proc/meminfo)
else
MEM_AVAIL_KB=$(awk '/MemFree/ { printf "%s \n", $2 }' /proc/meminfo)
MEM_AVAIL_GB=$(awk '/MemFree/ { printf "%s \n", $2/1024/1024 }' /proc/meminfo)
fi
elif [[ "$(uname)" == "Darwin" ]]; then
# macOS - use vm_stat to get free memory
# vm_stat reports in pages, typically 4096 bytes per page
PAGE_SIZE=$(pagesize)
FREE_PAGES=$(vm_stat | awk '/Pages free:/ {print $3}' | tr -d '.')
INACTIVE_PAGES=$(vm_stat | awk '/Pages inactive:/ {print $3}' | tr -d '.')
# Free + inactive pages give us available memory (similar to MemAvailable on Linux)
AVAIL_PAGES=$((FREE_PAGES + INACTIVE_PAGES))
MEM_AVAIL_KB=$((AVAIL_PAGES * PAGE_SIZE / 1024))
MEM_AVAIL_GB=$(echo "scale=2; $MEM_AVAIL_KB / 1024 / 1024" | bc)
else
# Other platforms
echo "Memory available: Unable to determine"
return
fi
if [[ "${MEM_AVAIL_KB}" -lt "${MIN_MEM_FREE_KB}" ]]; then
if [[ "${MEM_FREE_KB}" -lt "${MIN_MEM_FREE_KB}" ]]; then
cat <<EOF
===============================================
======== Memory Insufficient Warning =========
===============================================
It looks like you only have ${MEM_AVAIL_GB}GB of
memory ${MEM_TYPE}. Please increase your Docker
It looks like you only have ${MEM_FREE_GB}GB of
memory free. Please increase your Docker
resources to at least ${MIN_MEM_FREE_GB}GB
Note: During builds, available memory may be
temporarily low due to caching and compilation.
===============================================
======== Memory Insufficient Warning =========
===============================================
EOF
else
echo "Memory available: ${MEM_AVAIL_GB} GB"
echo "Memory check Ok [${MEM_FREE_GB}GB free]"
fi
}

View File

@@ -49,18 +49,12 @@ SQLALCHEMY_DATABASE_URI = (
f"{DATABASE_HOST}:{DATABASE_PORT}/{DATABASE_DB}"
)
# Use environment variable if set, otherwise construct from components
# This MUST take precedence over any other configuration
SQLALCHEMY_EXAMPLES_URI = os.getenv(
"SUPERSET__SQLALCHEMY_EXAMPLES_URI",
(
f"{DATABASE_DIALECT}://"
f"{EXAMPLES_USER}:{EXAMPLES_PASSWORD}@"
f"{EXAMPLES_HOST}:{EXAMPLES_PORT}/{EXAMPLES_DB}"
),
SQLALCHEMY_EXAMPLES_URI = (
f"{DATABASE_DIALECT}://"
f"{EXAMPLES_USER}:{EXAMPLES_PASSWORD}@"
f"{EXAMPLES_HOST}:{EXAMPLES_PORT}/{EXAMPLES_DB}"
)
REDIS_HOST = os.getenv("REDIS_HOST", "redis")
REDIS_PORT = os.getenv("REDIS_PORT", "6379")
REDIS_CELERY_DB = os.getenv("REDIS_CELERY_DB", "0")

View File

@@ -1,55 +0,0 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#
# Test configuration for docker-compose-light.yml - uses SimpleCache instead of Redis
# Import all settings from the main test config first
import os
import sys
# Add the tests directory to the path to import the test config
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", ".."))
from tests.integration_tests.superset_test_config import * # noqa: F403
# Override Redis-based caching to use simple in-memory cache
CACHE_CONFIG = {
"CACHE_TYPE": "SimpleCache",
"CACHE_DEFAULT_TIMEOUT": 300,
"CACHE_KEY_PREFIX": "superset_test_",
}
DATA_CACHE_CONFIG = {
**CACHE_CONFIG,
"CACHE_DEFAULT_TIMEOUT": 30,
"CACHE_KEY_PREFIX": "superset_test_data_",
}
# Keep SimpleCache for these as they're already using it
# FILTER_STATE_CACHE_CONFIG - already SimpleCache in parent
# EXPLORE_FORM_DATA_CACHE_CONFIG - already SimpleCache in parent
# Disable Celery for lightweight testing
CELERY_CONFIG = None
# Use FileSystemCache for SQL Lab results instead of Redis
from flask_caching.backends.filesystemcache import FileSystemCache # noqa: E402
RESULTS_BACKEND = FileSystemCache("/app/superset_home/sqllab_test")
# Override WEBDRIVER_BASEURL for tests to match expected values
WEBDRIVER_BASEURL = "http://0.0.0.0:8080/"
WEBDRIVER_BASEURL_USER_FRIENDLY = WEBDRIVER_BASEURL

View File

@@ -1,190 +0,0 @@
#! /bin/bash
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
run_git_tag () {
if [[ "$DRY_RUN" == "false" ]] && [[ "$SKIP_TAG" == "false" ]]
then
git tag -a -f latest "${GITHUB_TAG_NAME}" -m "latest tag"
echo "${GITHUB_TAG_NAME} has been tagged 'latest'"
fi
exit 0
}
###
# separating out git commands into functions so they can be mocked in unit tests
###
git_show_ref () {
if [[ "$TEST_ENV" == "true" ]]
then
if [[ "$GITHUB_TAG_NAME" == "does_not_exist" ]]
# mock return for testing only
then
echo ""
else
echo "2817aebd69dc7d199ec45d973a2079f35e5658b6 refs/tags/${GITHUB_TAG_NAME}"
fi
fi
result=$(git show-ref "${GITHUB_TAG_NAME}")
echo "${result}"
}
get_latest_tag_list () {
if [[ "$TEST_ENV" == "true" ]]
then
echo "(tag: 2.1.0, apache/2.1test)"
else
result=$(git show-ref --tags --dereference latest | awk '{print $2}' | xargs git show --pretty=tformat:%d -s | grep tag:)
echo "${result}"
fi
}
###
split_string () {
local version="$1"
local delimiter="$2"
local components=()
local tmp=""
for (( i=0; i<${#version}; i++ )); do
local char="${version:$i:1}"
if [[ "$char" != "$delimiter" ]]; then
tmp="$tmp$char"
elif [[ -n "$tmp" ]]; then
components+=("$tmp")
tmp=""
fi
done
if [[ -n "$tmp" ]]; then
components+=("$tmp")
fi
echo "${components[@]}"
}
DRY_RUN=false
# get params passed in with script when it was run
# --dry-run is optional and returns the value of SKIP_TAG, but does not run the git tag statement
# A tag name is required as a param. A SHA won't work. You must first tag a sha with a release number
# and then run this script
while [[ $# -gt 0 ]]
do
key="$1"
case $key in
--dry-run)
DRY_RUN=true
shift # past value
;;
*) # this should be the tag name
GITHUB_TAG_NAME=$key
shift # past value
;;
esac
done
if [ -z "${GITHUB_TAG_NAME}" ]; then
echo "Missing tag parameter, usage: ./scripts/tag_latest_release.sh <GITHUB_TAG_NAME>"
echo "SKIP_TAG=true" >> $GITHUB_OUTPUT
exit 1
fi
if [ -z "$(git_show_ref)" ]; then
echo "The tag ${GITHUB_TAG_NAME} does not exist. Please use a different tag."
echo "SKIP_TAG=true" >> $GITHUB_OUTPUT
exit 0
fi
# check that this tag only contains a proper semantic version
if ! [[ ${GITHUB_TAG_NAME} =~ ^[0-9]+\.[0-9]+\.[0-9]+$ ]]
then
echo "This tag ${GITHUB_TAG_NAME} is not a valid release version. Not tagging."
echo "SKIP_TAG=true" >> $GITHUB_OUTPUT
exit 1
fi
## split the current GITHUB_TAG_NAME into an array at the dot
THIS_TAG_NAME=$(split_string "${GITHUB_TAG_NAME}" ".")
# look up the 'latest' tag on git
LATEST_TAG_LIST=$(get_latest_tag_list) || echo 'not found'
# if 'latest' tag doesn't exist, then set this commit to latest
if [[ -z "$LATEST_TAG_LIST" ]]
then
echo "there are no latest tags yet, so I'm going to start by tagging this sha as the latest"
run_git_tag
exit 0
fi
# remove parenthesis and tag: from the list of tags
LATEST_TAGS_STRINGS=$(echo "$LATEST_TAG_LIST" | sed 's/tag: \([^,]*\)/\1/g' | tr -d '()')
LATEST_TAGS=$(split_string "$LATEST_TAGS_STRINGS" ",")
TAGS=($(split_string "$LATEST_TAGS" " "))
# Initialize a flag for comparison result
compare_result=""
# Iterate through the tags of the latest release
for tag in $TAGS
do
if [[ $tag == "latest" ]]; then
continue
else
## extract just the version from this tag
LATEST_RELEASE_TAG="$tag"
echo "LATEST_RELEASE_TAG: ${LATEST_RELEASE_TAG}"
# check that this only contains a proper semantic version
if ! [[ ${LATEST_RELEASE_TAG} =~ ^[0-9]+\.[0-9]+\.[0-9]+$ ]]
then
echo "'Latest' has been associated with tag ${LATEST_RELEASE_TAG} which is not a valid release version. Looking for another."
continue
fi
echo "The current release with the latest tag is version ${LATEST_RELEASE_TAG}"
# Split the version strings into arrays
THIS_TAG_NAME_ARRAY=($(split_string "$THIS_TAG_NAME" "."))
LATEST_RELEASE_TAG_ARRAY=($(split_string "$LATEST_RELEASE_TAG" "."))
# Iterate through the components of the version strings
for (( j=0; j<${#THIS_TAG_NAME_ARRAY[@]}; j++ )); do
echo "Comparing ${THIS_TAG_NAME_ARRAY[$j]} to ${LATEST_RELEASE_TAG_ARRAY[$j]}"
if [[ $((THIS_TAG_NAME_ARRAY[$j])) > $((LATEST_RELEASE_TAG_ARRAY[$j])) ]]; then
compare_result="greater"
break
elif [[ $((THIS_TAG_NAME_ARRAY[$j])) < $((LATEST_RELEASE_TAG_ARRAY[$j])) ]]; then
compare_result="lesser"
break
fi
done
fi
done
# Determine the result based on the comparison
if [[ -z "$compare_result" ]]; then
echo "Versions are equal"
echo "SKIP_TAG=true" >> $GITHUB_OUTPUT
elif [[ "$compare_result" == "greater" ]]; then
echo "This release tag ${GITHUB_TAG_NAME} is newer than the latest."
echo "SKIP_TAG=false" >> $GITHUB_OUTPUT
# Add other actions you want to perform for a newer version
elif [[ "$compare_result" == "lesser" ]]; then
echo "This release tag ${GITHUB_TAG_NAME} is older than the latest."
echo "This release tag ${GITHUB_TAG_NAME} is not the latest. Not tagging."
# if you've gotten this far, then we don't want to run any tags in the next step
echo "SKIP_TAG=true" >> $GITHUB_OUTPUT
fi

View File

@@ -21,183 +21,3 @@ This is the public documentation site for Superset, built using
[Docusaurus 3](https://docusaurus.io/). See
[CONTRIBUTING.md](../CONTRIBUTING.md#documentation) for documentation on
contributing to documentation.
## Version Management
The Superset documentation site uses Docusaurus versioning with three independent versioned sections:
- **Main Documentation** (`/docs/`) - Core Superset documentation
- **Developer Portal** (`/developer_portal/`) - Developer guides and tutorials
- **Component Playground** (`/components/`) - Interactive component examples (currently disabled)
Each section maintains its own version history and can be versioned independently.
### Creating a New Version
To create a new version for any section, use the Docusaurus version command with the appropriate plugin ID or use our automated scripts:
#### Using Automated Scripts (Required)
**⚠️ Important:** Always use these custom commands instead of the native Docusaurus commands. These scripts ensure that both the Docusaurus versioning system AND the `versions-config.json` file are updated correctly.
```bash
# Main Documentation
yarn version:add:docs 1.2.0
# Developer Portal
yarn version:add:developer_portal 1.2.0
# Component Playground (when enabled)
yarn version:add:components 1.2.0
```
**Do NOT use** the native Docusaurus commands directly (`yarn docusaurus docs:version`), as they will:
- ❌ Create version files but NOT update `versions-config.json`
- ❌ Cause versions to not appear in dropdown menus
- ❌ Require manual fixes to synchronize the configuration
### Managing Versions
#### With Automated Scripts
The automated scripts handle all configuration updates automatically. No manual editing required!
#### Manual Configuration
If creating versions manually, you'll need to:
1. **Update `versions-config.json`** (or `docusaurus.config.ts` if not using dynamic config):
- Add version to `onlyIncludeVersions` array
- Add version metadata to `versions` object
- Update `lastVersion` if needed
2. **Files Created by Versioning**:
When a new version is created, Docusaurus generates:
- **Versioned docs folder**: `[section]_versioned_docs/version-X.X.X/`
- **Versioned sidebars**: `[section]_versioned_sidebars/version-X.X.X-sidebars.json`
- **Versions list**: `[section]_versions.json`
Note: For main docs, the prefix is omitted (e.g., `versioned_docs/` instead of `docs_versioned_docs/`)
3. **Important**: After adding a version, restart the development server to see changes:
```bash
yarn stop
yarn start
```
### Removing a Version
#### Using Automated Scripts (Recommended)
```bash
# Main Documentation
yarn version:remove:docs 1.0.0
# Developer Portal
yarn version:remove:developer_portal 1.0.0
# Component Playground
yarn version:remove:components 1.0.0
```
#### Manual Removal
To manually remove a version:
1. **Delete the version folder** from the appropriate location:
- Main docs: `versioned_docs/version-X.X.X/` (no prefix for main)
- Developer Portal: `developer_portal_versioned_docs/version-X.X.X/`
- Components: `components_versioned_docs/version-X.X.X/`
2. **Delete the version metadata file**:
- Main docs: `versioned_sidebars/version-X.X.X-sidebars.json` (no prefix)
- Developer Portal: `developer_portal_versioned_sidebars/version-X.X.X-sidebars.json`
- Components: `components_versioned_sidebars/version-X.X.X-sidebars.json`
3. **Update the versions list file**:
- Main docs: `versions.json`
- Developer Portal: `developer_portal_versions.json`
- Components: `components_versions.json`
4. **Update configuration**:
- If using dynamic config: Update `versions-config.json`
- If using static config: Update `docusaurus.config.ts`
5. **Restart the server** to see changes
### Version Configuration Examples
#### Main Documentation (default plugin)
```typescript
docs: {
includeCurrentVersion: true,
lastVersion: 'current', // Makes /docs/ show Next version
onlyIncludeVersions: ['current', '1.1.0', '1.0.0'],
versions: {
current: {
label: 'Next',
path: '', // Empty path for default routing
banner: 'unreleased',
},
'1.1.0': {
label: '1.1.0',
path: '1.1.0',
banner: 'none',
},
},
}
```
#### Developer Portal & Components (custom plugins)
```typescript
{
id: 'developer_portal',
path: 'developer_portal',
routeBasePath: 'developer_portal',
includeCurrentVersion: true,
lastVersion: '1.1.0', // Default version
onlyIncludeVersions: ['current', '1.1.0', '1.0.0'],
versions: {
current: {
label: 'Next',
path: 'next',
banner: 'unreleased',
},
'1.1.0': {
label: '1.1.0',
path: '1.1.0',
banner: 'none',
},
},
}
```
### Best Practices
1. **Version naming**: Use semantic versioning (e.g., 1.0.0, 1.1.0, 2.0.0)
2. **Version banners**: Use `'unreleased'` for development versions, `'none'` for stable releases
3. **Limit displayed versions**: Use `onlyIncludeVersions` to show only relevant versions
4. **Test locally**: Always test version changes locally before deploying
5. **Independent versioning**: Each section can have different version numbers and release cycles
### Troubleshooting
#### Version Not Showing After Creation
If you accidentally used `yarn docusaurus docs:version` instead of `yarn version:add`:
1. **Problem**: The version files were created but `versions-config.json` wasn't updated
2. **Solution**: Either:
- Revert the changes: `git restore versions.json && rm -rf versioned_docs/ versioned_sidebars/`
- Then use the correct command: `yarn version:add:docs <version>`
For other issues:
- **Restart the server**: Changes to version configuration require a server restart
- **Check config file**: Ensure `versions-config.json` includes the new version
- **Verify files exist**: Check that versioned docs folder was created
#### Broken Links in Versioned Documentation
When creating a new version, links in the documentation are preserved as-is. Common issues:
- **Cross-section links**: Links between sections (e.g., from developer_portal to docs) need to be version-aware
- **Absolute vs relative paths**: Use relative paths within the same section
- **Version-specific URLs**: Update hardcoded URLs to use version variables
To fix broken links:
1. Use `type: 'doc'` with `docId` for version-aware navigation in navbar
2. Use relative paths within the same documentation section
3. Test all versions after creation to identify broken links

View File

@@ -1,105 +0,0 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
---
title: Bar Chart
sidebar_position: 1
---
# Bar Chart Component
The Bar Chart component is used to visualize categorical data with rectangular bars.
## Props
| Prop | Type | Default | Description |
|------|------|---------|-------------|
| `data` | `array` | `[]` | Array of data objects to visualize |
| `width` | `number` | `800` | Width of the chart in pixels |
| `height` | `number` | `600` | Height of the chart in pixels |
| `xField` | `string` | - | Field name for x-axis values |
| `yField` | `string` | - | Field name for y-axis values |
| `colorField` | `string` | - | Field name for color encoding |
| `colorScheme` | `string` | `'supersetColors'` | Color scheme to use |
| `showLegend` | `boolean` | `true` | Whether to show the legend |
| `showGrid` | `boolean` | `true` | Whether to show grid lines |
| `labelPosition` | `string` | `'top'` | Position of bar labels: 'top', 'middle', 'bottom' |
## Examples
### Basic Bar Chart
```jsx
import { BarChart } from '@superset-ui/chart-components';
const data = [
{ category: 'A', value: 10 },
{ category: 'B', value: 20 },
{ category: 'C', value: 15 },
{ category: 'D', value: 25 },
];
function Example() {
return (
<BarChart
data={data}
width={800}
height={400}
xField="category"
yField="value"
colorScheme="supersetColors"
/>
);
}
```
### Grouped Bar Chart
```jsx
import { BarChart } from '@superset-ui/chart-components';
const data = [
{ category: 'A', group: 'Group 1', value: 10 },
{ category: 'A', group: 'Group 2', value: 15 },
{ category: 'B', group: 'Group 1', value: 20 },
{ category: 'B', group: 'Group 2', value: 25 },
{ category: 'C', group: 'Group 1', value: 15 },
{ category: 'C', group: 'Group 2', value: 10 },
];
function Example() {
return (
<BarChart
data={data}
width={800}
height={400}
xField="category"
yField="value"
colorField="group"
colorScheme="supersetColors"
/>
);
}
```
## Best Practices
- Use bar charts when comparing quantities across categories
- Sort bars by value for better readability, unless there's a natural order to the categories
- Use consistent colors for the same categories across different charts
- Consider using horizontal bar charts when category labels are long

View File

@@ -1,59 +0,0 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
---
title: Component Library
sidebar_position: 1
---
# Superset Component Library
Welcome to the Apache Superset Component Library documentation. This section provides comprehensive documentation for all the UI components, chart components, and layout components used in Superset.
## What is the Component Library?
The Component Library is a collection of reusable UI components that are used to build the Superset user interface. These components are designed to be consistent, accessible, and easy to use.
## Component Categories
The Component Library is organized into the following categories:
### UI Components
Basic UI components like buttons, inputs, dropdowns, and other form elements.
### Chart Components
Visualization components used to render different types of charts and graphs.
### Layout Components
Components used for page layout, such as containers, grids, and navigation elements.
## Versioning
The Component Library documentation follows its own versioning scheme, independent from the main Superset documentation. This allows us to update the component documentation as the components evolve, without affecting the main documentation.
## Getting Started
Browse the sidebar to explore the different components available in the library. Each component documentation includes:
- Component description and purpose
- Props and configuration options
- Usage examples
- Best practices

View File

@@ -1,113 +0,0 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
---
title: Grid
sidebar_position: 1
---
# Grid Component
The Grid component provides a flexible layout system for arranging content in rows and columns.
## Props
| Prop | Type | Default | Description |
|------|------|---------|-------------|
| `gutter` | `number` or `[number, number]` | `0` | Grid spacing between items, can be a single number or [horizontal, vertical] |
| `columns` | `number` | `12` | Number of columns in the grid |
| `justify` | `string` | `'start'` | Horizontal alignment: 'start', 'center', 'end', 'space-between', 'space-around' |
| `align` | `string` | `'top'` | Vertical alignment: 'top', 'middle', 'bottom' |
| `wrap` | `boolean` | `true` | Whether to wrap items when they overflow |
### Row Props
| Prop | Type | Default | Description |
|------|------|---------|-------------|
| `gutter` | `number` or `[number, number]` | `0` | Spacing between items in the row |
| `justify` | `string` | `'start'` | Horizontal alignment for this row |
| `align` | `string` | `'top'` | Vertical alignment for this row |
### Col Props
| Prop | Type | Default | Description |
|------|------|---------|-------------|
| `span` | `number` | - | Number of columns the grid item spans |
| `offset` | `number` | `0` | Number of columns the grid item is offset |
| `xs`, `sm`, `md`, `lg`, `xl` | `number` or `object` | - | Responsive props for different screen sizes |
## Examples
### Basic Grid
```jsx
import { Grid, Row, Col } from '@superset-ui/core';
function Example() {
return (
<Grid>
<Row gutter={16}>
<Col span={8}>
<div>Column 1</div>
</Col>
<Col span={8}>
<div>Column 2</div>
</Col>
<Col span={8}>
<div>Column 3</div>
</Col>
</Row>
</Grid>
);
}
```
### Responsive Grid
```jsx
import { Grid, Row, Col } from '@superset-ui/core';
function Example() {
return (
<Grid>
<Row gutter={[16, 24]}>
<Col xs={24} sm={12} md={8} lg={6}>
<div>Responsive Column 1</div>
</Col>
<Col xs={24} sm={12} md={8} lg={6}>
<div>Responsive Column 2</div>
</Col>
<Col xs={24} sm={12} md={8} lg={6}>
<div>Responsive Column 3</div>
</Col>
<Col xs={24} sm={12} md={8} lg={6}>
<div>Responsive Column 4</div>
</Col>
</Row>
</Grid>
);
}
```
## Best Practices
- Use the Grid system for complex layouts that need to be responsive
- Specify column widths for different screen sizes to ensure proper responsive behavior
- Use gutters to create appropriate spacing between grid items
- Keep the grid structure consistent throughout your application
- Consider using the grid system for dashboard layouts to ensure consistent spacing and alignment

View File

@@ -1,35 +0,0 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
---
title: Test
---
import { StoryExample } from '../src/components/StorybookWrapper';
# Test
This is a test using our custom StorybookWrapper component.
<StoryExample
component={() => (
<div style={{ padding: '10px', background: '#f0f0f0', borderRadius: '4px' }}>
This is a simple example component
</div>
)}
/>

View File

@@ -1,146 +0,0 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
---
title: Button Component
sidebar_position: 1
---
import { StoryExample, StoryWithControls } from '../../src/components/StorybookWrapper';
import { Button } from '../../../superset-frontend/packages/superset-ui-core/src/components/Button';
# Button Component
The Button component is a fundamental UI element used throughout Superset for user interactions.
## Basic Usage
The default button with primary styling:
<StoryExample
component={() => (
<Button buttonStyle="primary" onClick={() => console.log('Clicked!')}>
Click Me
</Button>
)}
/>
## Interactive Example
<StoryWithControls
component={({ buttonStyle, buttonSize, label, disabled }) => (
<Button
buttonStyle={buttonStyle}
buttonSize={buttonSize}
disabled={disabled}
onClick={() => console.log('Clicked!')}
>
{label}
</Button>
)}
props={{
buttonStyle: 'primary',
buttonSize: 'default',
label: 'Click Me',
disabled: false
}}
controls={[
{
name: 'buttonStyle',
label: 'Button Style',
type: 'select',
options: ['primary', 'secondary', 'tertiary', 'success', 'warning', 'danger', 'default', 'link', 'dashed']
},
{
name: 'buttonSize',
label: 'Button Size',
type: 'select',
options: ['default', 'small', 'xsmall']
},
{
name: 'label',
label: 'Button Text',
type: 'text'
},
{
name: 'disabled',
label: 'Disabled',
type: 'boolean'
}
]}
/>
## Props
| Prop | Type | Default | Description |
|------|------|---------|-------------|
| `buttonStyle` | `'primary' \| 'secondary' \| 'tertiary' \| 'success' \| 'warning' \| 'danger' \| 'default' \| 'link' \| 'dashed'` | `'default'` | Button style |
| `buttonSize` | `'default' \| 'small' \| 'xsmall'` | `'default'` | Button size |
| `disabled` | `boolean` | `false` | Whether the button is disabled |
| `cta` | `boolean` | `false` | Whether the button is a call-to-action button |
| `tooltip` | `ReactNode` | - | Tooltip content |
| `placement` | `TooltipProps['placement']` | - | Tooltip placement |
| `onClick` | `function` | - | Callback when button is clicked |
| `href` | `string` | - | Turns button into an anchor link |
| `target` | `string` | - | Target attribute for anchor links |
## Usage
```jsx
import Button from 'src/components/Button';
function MyComponent() {
return (
<Button
buttonStyle="primary"
onClick={() => console.log('Button clicked')}
>
Click Me
</Button>
);
}
```
## Button Styles
Superset provides a variety of button styles for different purposes:
- **Primary**: Used for primary actions
- **Secondary**: Used for secondary actions
- **Tertiary**: Used for less important actions
- **Success**: Used for successful or confirming actions
- **Warning**: Used for actions that require caution
- **Danger**: Used for destructive actions
- **Link**: Used for navigation
- **Dashed**: Used for adding new items or features
## Button Sizes
Buttons come in three sizes:
- **Default**: Standard size for most use cases
- **Small**: Compact size for tight spaces
- **XSmall**: Extra small size for very limited spaces
## Best Practices
- Use primary buttons for the main action in a form or page
- Use secondary buttons for alternative actions
- Use danger buttons for destructive actions
- Limit the number of primary buttons on a page to avoid confusion
- Use consistent button styles throughout your application
- Add tooltips to buttons when their purpose might not be immediately clear

View File

@@ -1 +0,0 @@
[]

View File

@@ -1 +0,0 @@
[]

View File

@@ -1,477 +0,0 @@
---
title: Frontend API Reference
sidebar_position: 1
hide_title: true
---
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
# Frontend API Reference
The `@apache-superset/core` package provides comprehensive APIs for frontend extension development. All APIs are organized into logical namespaces for easy discovery and use.
## Core API
The core namespace provides fundamental extension functionality.
### registerView
Registers a new view or panel in the specified contribution point.
```typescript
core.registerView(
id: string,
component: React.ComponentType
): Disposable
```
**Example:**
```typescript
const panel = context.core.registerView('my-extension.panel', () => (
<MyPanelComponent />
));
```
### getActiveView
Gets the currently active view in a contribution area.
```typescript
core.getActiveView(area: string): View | undefined
```
## Commands API
Manages command registration and execution.
### registerCommand
Registers a new command that can be triggered by menus, shortcuts, or programmatically.
```typescript
commands.registerCommand(
id: string,
handler: CommandHandler
): Disposable
interface CommandHandler {
title: string;
icon?: string;
execute: (...args: any[]) => any;
isEnabled?: (...args: any[]) => boolean;
}
```
**Example:**
```typescript
const cmd = context.commands.registerCommand('my-extension.analyze', {
title: 'Analyze Query',
icon: 'BarChartOutlined',
execute: () => {
const query = context.sqlLab.getCurrentQuery();
// Perform analysis
},
isEnabled: () => {
return context.sqlLab.hasActiveEditor();
}
});
```
### executeCommand
Executes a registered command by ID.
```typescript
commands.executeCommand(id: string, ...args: any[]): Promise<any>
```
## SQL Lab API
Provides access to SQL Lab functionality and events.
### Query Access
```typescript
// Get current tab
sqlLab.getCurrentTab(): Tab | undefined
// Get all tabs
sqlLab.getTabs(): Tab[]
// Get current query
sqlLab.getCurrentQuery(): string
// Get selected text
sqlLab.getSelectedText(): string | undefined
```
### Database Access
```typescript
// Get available databases
sqlLab.getDatabases(): Database[]
// Get database by ID
sqlLab.getDatabase(id: number): Database | undefined
// Get schemas for database
sqlLab.getSchemas(databaseId: number): Promise<string[]>
// Get tables for schema
sqlLab.getTables(
databaseId: number,
schema: string
): Promise<Table[]>
```
### Events
```typescript
// Query execution events
sqlLab.onDidQueryRun: Event<QueryResult>
sqlLab.onDidQueryStop: Event<QueryResult>
sqlLab.onDidQueryFail: Event<QueryError>
// Editor events
sqlLab.onDidChangeEditorContent: Event<string>
sqlLab.onDidChangeSelection: Event<Selection>
// Tab events
sqlLab.onDidChangeActiveTab: Event<Tab>
sqlLab.onDidCloseTab: Event<Tab>
sqlLab.onDidChangeTabTitle: Event<{tab: Tab, title: string}>
// Panel events
sqlLab.onDidOpenPanel: Event<Panel>
sqlLab.onDidClosePanel: Event<Panel>
sqlLab.onDidChangeActivePanel: Event<Panel>
```
**Event Usage Example:**
```typescript
const disposable = context.sqlLab.onDidQueryRun((result) => {
console.log('Query executed:', result.query);
console.log('Rows returned:', result.rowCount);
console.log('Execution time:', result.executionTime);
});
// Remember to dispose when done
context.subscriptions.push(disposable);
```
## Authentication API
Handles authentication and security tokens.
### getCSRFToken
Gets the current CSRF token for API requests.
```typescript
authentication.getCSRFToken(): Promise<string>
```
### getCurrentUser
Gets information about the current user.
```typescript
authentication.getCurrentUser(): User
interface User {
id: number;
username: string;
email: string;
roles: Role[];
permissions: Permission[];
}
```
### hasPermission
Checks if the current user has a specific permission.
```typescript
authentication.hasPermission(permission: string): boolean
```
## Extensions API
Manages extension lifecycle and inter-extension communication.
### getExtension
Gets information about an installed extension.
```typescript
extensions.getExtension(id: string): Extension | undefined
interface Extension {
id: string;
name: string;
version: string;
isActive: boolean;
metadata: ExtensionMetadata;
}
```
### getActiveExtensions
Gets all currently active extensions.
```typescript
extensions.getActiveExtensions(): Extension[]
```
### Events
```typescript
// Extension lifecycle events
extensions.onDidActivateExtension: Event<Extension>
extensions.onDidDeactivateExtension: Event<Extension>
```
## UI Components
Import pre-built UI components from `@apache-superset/core`:
```typescript
import {
Button,
Select,
Input,
Table,
Modal,
Alert,
Tabs,
Card,
Dropdown,
Menu,
Tooltip,
Icon,
// ... many more
} from '@apache-superset/core';
```
### Example Component Usage
```typescript
import { Button, Alert } from '@apache-superset/core';
function MyExtensionPanel() {
return (
<div>
<Alert
message="Extension Loaded"
description="Your extension is ready to use"
type="success"
/>
<Button
type="primary"
onClick={() => console.log('Clicked!')}
>
Execute Action
</Button>
</div>
);
}
```
## Storage API
Provides persistent storage for extension data.
### Local Storage
```typescript
// Store data
storage.local.set(key: string, value: any): Promise<void>
// Retrieve data
storage.local.get(key: string): Promise<any>
// Remove data
storage.local.remove(key: string): Promise<void>
// Clear all extension data
storage.local.clear(): Promise<void>
```
### Workspace Storage
Workspace storage is shared across all users for collaborative features.
```typescript
storage.workspace.set(key: string, value: any): Promise<void>
storage.workspace.get(key: string): Promise<any>
storage.workspace.remove(key: string): Promise<void>
```
## Network API
Utilities for making API calls to Superset.
### fetch
Enhanced fetch with CSRF token handling.
```typescript
network.fetch(url: string, options?: RequestInit): Promise<Response>
```
### API Client
Type-safe API client for Superset endpoints.
```typescript
// Get chart data
network.api.charts.get(id: number): Promise<Chart>
// Query database
network.api.sqlLab.execute(
databaseId: number,
query: string
): Promise<QueryResult>
// Get datasets
network.api.datasets.list(): Promise<Dataset[]>
```
## Utility Functions
### Formatting
```typescript
// Format numbers
utils.formatNumber(value: number, format?: string): string
// Format dates
utils.formatDate(date: Date, format?: string): string
// Format SQL
utils.formatSQL(sql: string): string
```
### Validation
```typescript
// Validate SQL syntax
utils.validateSQL(sql: string): ValidationResult
// Check if valid database ID
utils.isValidDatabaseId(id: any): boolean
```
## TypeScript Types
Import common types for type safety:
```typescript
import type {
Database,
Dataset,
Chart,
Dashboard,
Query,
QueryResult,
Tab,
Panel,
User,
Role,
Permission,
ExtensionContext,
Disposable,
Event,
// ... more types
} from '@apache-superset/core';
```
## Extension Context
The context object passed to your extension's `activate` function:
```typescript
interface ExtensionContext {
// Subscription management
subscriptions: Disposable[];
// Extension metadata
extensionId: string;
extensionPath: string;
// API namespaces
core: CoreAPI;
commands: CommandsAPI;
sqlLab: SqlLabAPI;
authentication: AuthenticationAPI;
extensions: ExtensionsAPI;
storage: StorageAPI;
network: NetworkAPI;
utils: UtilsAPI;
// Logging
logger: Logger;
}
```
## Event Handling
Events follow the VS Code pattern with subscribe/dispose:
```typescript
// Subscribe to event
const disposable = sqlLab.onDidQueryRun((result) => {
// Handle event
});
// Dispose when done
disposable.dispose();
// Or add to context for automatic cleanup
context.subscriptions.push(disposable);
```
## Best Practices
1. **Always dispose subscriptions** to prevent memory leaks
2. **Use TypeScript** for better IDE support and type safety
3. **Handle errors gracefully** with try-catch blocks
4. **Check permissions** before sensitive operations
5. **Use provided UI components** for consistency
6. **Cache API responses** when appropriate
7. **Validate user input** before processing
## Version Compatibility
The frontend API follows semantic versioning:
- **Major version**: Breaking changes
- **Minor version**: New features, backward compatible
- **Patch version**: Bug fixes
Check compatibility in your `extension.json`:
```json
{
"engines": {
"@apache-superset/core": "^1.0.0"
}
}
```

View File

@@ -1,348 +0,0 @@
---
title: Architecture Overview
sidebar_position: 1
hide_title: true
---
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
# Extension Architecture Overview
The Superset extension architecture is designed to be modular, secure, and performant. This document provides a comprehensive overview of how extensions work and interact with the Superset host application.
## Core Principles
### 1. Lean Core
Superset's core remains minimal, with features delegated to extensions wherever possible. Built-in features use the same APIs as external extensions, ensuring API quality through dogfooding.
### 2. Explicit Contribution Points
All extension points are clearly defined and documented. Extensions declare their capabilities in metadata files, enabling predictable lifecycle management.
### 3. Versioned APIs
Public interfaces follow semantic versioning, ensuring backward compatibility and safe evolution of the platform.
### 4. Lazy Loading
Extensions load only when needed, minimizing performance impact and resource consumption.
### 5. Composability
Architecture patterns and APIs are reusable across different Superset modules, promoting consistency.
### 6. Community-Driven
The system evolves based on real-world feedback, with new extension points added as needs emerge.
## System Architecture
```mermaid
graph TB
subgraph "Superset Host Application"
Core[Core Application]
API[Extension APIs]
Loader[Extension Loader]
Manager[Extension Manager]
end
subgraph "Core Packages"
FrontendCore["@apache-superset/core<br/>(Frontend)"]
BackendCore["apache-superset-core<br/>(Backend)"]
CLI["apache-superset-extensions-cli"]
end
subgraph "Extension"
Metadata[extension.json]
Frontend[Frontend Code]
Backend[Backend Code]
Bundle[.supx Bundle]
end
Core --> API
API --> FrontendCore
API --> BackendCore
Loader --> Manager
Manager --> Bundle
Frontend --> FrontendCore
Backend --> BackendCore
CLI --> Bundle
```
## Key Components
### Host Application
The Superset host application provides:
- **Extension APIs**: Well-defined interfaces for extensions to interact with Superset
- **Extension Manager**: Handles lifecycle, activation, and deactivation
- **Module Loader**: Dynamically loads extension code using Webpack Module Federation
- **Security Context**: Manages permissions and sandboxing for extensions
### Core Packages
#### @apache-superset/core (Frontend)
- Shared UI components and utilities
- TypeScript type definitions
- Frontend API implementations
- Event system and command registry
#### apache-superset-core (Backend)
- Python base classes and utilities
- Database access APIs
- Security and permission helpers
- REST API registration
#### apache-superset-extensions-cli
- Project scaffolding
- Build and bundling tools
- Development server
- Package management
### Extension Structure
Each extension consists of:
- **Metadata** (`extension.json`): Declares capabilities and requirements
- **Frontend**: React components and TypeScript code
- **Backend**: Python modules and API endpoints
- **Assets**: Styles, images, and other resources
- **Bundle** (`.supx`): Packaged distribution format
## Module Federation
Extensions use Webpack Module Federation for dynamic loading:
```javascript
// Extension webpack.config.js
new ModuleFederationPlugin({
name: 'my_extension',
filename: 'remoteEntry.[contenthash].js',
exposes: {
'./index': './src/index.tsx',
},
externals: {
'@apache-superset/core': 'superset',
},
shared: {
react: { singleton: true },
'react-dom': { singleton: true },
}
})
```
This allows:
- **Independent builds**: Extensions compile separately from Superset
- **Shared dependencies**: Common libraries like React aren't duplicated
- **Dynamic loading**: Extensions load at runtime without rebuilding Superset
- **Version compatibility**: Extensions declare compatible core versions
## Extension Lifecycle
### 1. Registration
```typescript
// Extension registered with host
extensionManager.register({
name: 'my-extension',
version: '1.0.0',
manifest: manifestData
});
```
### 2. Activation
```typescript
// activate() called when extension loads
export function activate(context: ExtensionContext) {
// Register contributions
const disposables = [];
// Add panel
disposables.push(
context.core.registerView('my-panel', MyPanel)
);
// Register command
disposables.push(
context.commands.registerCommand('my-command', {
execute: () => { /* ... */ }
})
);
// Store for cleanup
context.subscriptions.push(...disposables);
}
```
### 3. Runtime
- Extension responds to events
- Provides UI components when requested
- Executes commands when triggered
- Accesses APIs as needed
### 4. Deactivation
```typescript
// Automatic cleanup of registered items
export function deactivate() {
// context.subscriptions automatically disposed
// Additional cleanup if needed
}
```
## Contribution Types
### Views
Extensions can add panels and UI components:
```json
{
"views": {
"sqllab.panels": [{
"id": "my-panel",
"name": "My Panel",
"icon": "ToolOutlined"
}]
}
}
```
### Commands
Define executable actions:
```json
{
"commands": [{
"command": "my-extension.run",
"title": "Run Analysis",
"icon": "PlayCircleOutlined"
}]
}
```
### Menus
Add items to existing menus:
```json
{
"menus": {
"sqllab.editor": {
"primary": [{
"command": "my-extension.run",
"when": "editorHasSelection"
}]
}
}
}
```
### API Endpoints
Register backend REST endpoints:
```python
from superset_core.api import rest_api
@rest_api.route('/my-endpoint')
def my_endpoint():
return {'data': 'value'}
```
## Security Model
### Permissions
- Extensions run with user's permissions
- No elevation of privileges
- Access controlled by Superset's RBAC
### Sandboxing
- Frontend code runs in browser context
- Backend code runs in Python process
- Future: Optional sandboxed execution
### Validation
- Manifest validation on upload
- Signature verification (future)
- Dependency scanning
## Performance Considerations
### Lazy Loading
- Extensions load only when features are accessed
- Code splitting for large extensions
- Cached after first load
### Bundle Optimization
- Tree shaking removes unused code
- Minification reduces size
- Compression for network transfer
### Resource Management
- Automatic cleanup on deactivation
- Memory leak prevention
- Event listener management
## Development vs Production
### Development Mode
```python
# superset_config.py
ENABLE_EXTENSIONS = True
LOCAL_EXTENSIONS = ['/path/to/extension']
```
- Hot reloading
- Source maps
- Debug logging
### Production Mode
- Optimized bundles
- Cached assets
- Performance monitoring
## Future Enhancements
### Planned Features
- Enhanced sandboxing
- Extension marketplace
- Inter-extension communication
- Theme contributions
- Chart type extensions
### API Expansion
- Dashboard extensions
- Database connector API
- Security provider interface
- Workflow automation
## Best Practices
### Do's
- ✅ Use TypeScript for type safety
- ✅ Follow semantic versioning
- ✅ Handle errors gracefully
- ✅ Clean up resources properly
- ✅ Document your extension
### Don'ts
- ❌ Access private APIs
- ❌ Modify global state directly
- ❌ Block the main thread
- ❌ Store sensitive data insecurely
- ❌ Assume API stability in 0.x versions
## Learn More
- [API Reference](../api/frontend)
- [Development Guide](../getting-started)
- [Security Guidelines](./security)
- [Performance Optimization](./performance)

View File

@@ -1,466 +0,0 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
---
title: CLI Documentation
sidebar_position: 1
hide_title: true
---
# Superset Extensions CLI
The `apache-superset-extensions-cli` provides command-line tools for creating, developing, and packaging Superset extensions.
## Installation
```bash
pip install apache-superset-extensions-cli
```
## Commands
### init
Creates a new extension project with the standard folder structure.
```bash
superset-extensions init <extension-name> [options]
```
**Options:**
- `--template <template>`: Use a specific template (default: basic)
- `--author <name>`: Set the author name
- `--description <text>`: Set the extension description
- `--with-backend`: Include backend code structure
**Example:**
```bash
superset-extensions init my-extension \
--author "John Doe" \
--description "Adds custom analytics to SQL Lab" \
--with-backend
```
**Generated Structure:**
```
my-extension/
├── extension.json
├── frontend/
│ ├── src/
│ │ └── index.tsx
│ ├── package.json
│ ├── tsconfig.json
│ └── webpack.config.js
├── backend/
│ ├── src/
│ │ └── my_extension/
│ │ ├── __init__.py
│ │ └── entrypoint.py
│ ├── tests/
│ ├── pyproject.toml
│ └── requirements.txt
└── README.md
```
### dev
Starts the development server with hot reloading.
```bash
superset-extensions dev [options]
```
**Options:**
- `--port <port>`: Development server port (default: 9001)
- `--host <host>`: Development server host (default: localhost)
- `--no-watch`: Disable file watching
- `--verbose`: Show detailed output
**Example:**
```bash
# Start development server
superset-extensions dev
# Output:
⚙️ Building frontend assets...
✅ Frontend rebuilt
✅ Backend files synced
✅ Manifest updated
👀 Watching for changes...
```
### build
Builds the extension for production.
```bash
superset-extensions build [options]
```
**Options:**
- `--mode <mode>`: Build mode (development | production)
- `--analyze`: Generate bundle analysis
- `--source-maps`: Include source maps
**Example:**
```bash
# Production build
superset-extensions build --mode production
# With analysis
superset-extensions build --analyze
```
### bundle
Creates a `.supx` package for distribution.
```bash
superset-extensions bundle [options]
```
**Options:**
- `--output <path>`: Output directory (default: current)
- `--sign`: Sign the package (requires certificate)
- `--compress`: Compression level (0-9, default: 6)
**Example:**
```bash
# Create bundle
superset-extensions bundle
# Creates: my-extension-1.0.0.supx
```
### validate
Validates extension configuration and structure.
```bash
superset-extensions validate [options]
```
**Options:**
- `--fix`: Auto-fix common issues
- `--strict`: Enable strict validation
**Checks:**
- Valid extension.json syntax
- Required files present
- Dependency versions
- Module exports
- TypeScript configuration
**Example:**
```bash
superset-extensions validate --strict
# Output:
✅ extension.json valid
✅ Frontend structure valid
✅ Backend structure valid
⚠️ Warning: Missing LICENSE file
✅ Validation passed with warnings
```
### test
Runs extension tests.
```bash
superset-extensions test [options]
```
**Options:**
- `--coverage`: Generate coverage report
- `--watch`: Run in watch mode
- `--frontend-only`: Run only frontend tests
- `--backend-only`: Run only backend tests
**Example:**
```bash
# Run all tests
superset-extensions test --coverage
# Watch mode for frontend
superset-extensions test --frontend-only --watch
```
### publish
Publishes extension to a registry (future feature).
```bash
superset-extensions publish [options]
```
**Options:**
- `--registry <url>`: Registry URL
- `--token <token>`: Authentication token
- `--dry-run`: Simulate publish
## Configuration
### Project Configuration
The CLI reads configuration from multiple sources:
1. **extension.json** - Extension metadata
2. **package.json** - Frontend dependencies
3. **pyproject.toml** - Backend configuration
4. **.extensionrc** - CLI-specific settings
### .extensionrc Example
```json
{
"dev": {
"port": 9001,
"host": "localhost",
"autoReload": true
},
"build": {
"mode": "production",
"sourceMaps": false,
"optimization": true
},
"test": {
"coverage": true,
"threshold": {
"statements": 80,
"branches": 70,
"functions": 80,
"lines": 80
}
}
}
```
## Templates
### Available Templates
- **basic**: Simple extension with frontend only
- **full-stack**: Frontend and backend components
- **sql-panel**: SQL Lab panel extension
- **api-only**: Backend API extension
- **chart-plugin**: Custom chart visualization
### Using Templates
```bash
# Use specific template
superset-extensions init my-chart --template chart-plugin
# List available templates
superset-extensions init --list-templates
```
### Custom Templates
Create custom templates in `~/.superset-extensions/templates/`:
```
~/.superset-extensions/templates/
└── my-template/
├── template.json
└── files/
└── ... template files ...
```
## Development Workflow
### 1. Create Extension
```bash
superset-extensions init awesome-feature
cd awesome-feature
```
### 2. Install Dependencies
```bash
# Frontend
cd frontend && npm install
# Backend (if applicable)
cd ../backend && pip install -r requirements.txt
```
### 3. Configure Superset
```python
# superset_config.py
ENABLE_EXTENSIONS = True
LOCAL_EXTENSIONS = [
"/path/to/awesome-feature"
]
```
### 4. Start Development
```bash
# Terminal 1: Extension dev server
superset-extensions dev
# Terminal 2: Superset
superset run -p 8088 --reload
```
### 5. Test Changes
Make changes to your code and see them reflected immediately in Superset.
### 6. Build and Package
```bash
# Validate
superset-extensions validate
# Test
superset-extensions test
# Build
superset-extensions build --mode production
# Bundle
superset-extensions bundle
```
### 7. Deploy
Upload the `.supx` file to your Superset instance.
## Environment Variables
The CLI respects these environment variables:
- `SUPERSET_EXTENSIONS_DEV_PORT`: Development server port
- `SUPERSET_EXTENSIONS_DEV_HOST`: Development server host
- `SUPERSET_BASE_URL`: Superset instance URL
- `NODE_ENV`: Node environment (development/production)
- `PYTHONPATH`: Python module search path
## Troubleshooting
### Common Issues
#### Port Already in Use
```bash
# Use different port
superset-extensions dev --port 9002
```
#### Module Federation Errors
```bash
# Rebuild with clean cache
rm -rf dist/ node_modules/.cache
superset-extensions build
```
#### Python Import Errors
```bash
# Ensure virtual environment is activated
source venv/bin/activate
superset-extensions dev
```
### Debug Mode
Enable verbose output for troubleshooting:
```bash
# Verbose output
superset-extensions dev --verbose
# Debug webpack
DEBUG=webpack:* superset-extensions build
```
## Best Practices
1. **Version Control**: Commit `extension.json` but not `dist/`
2. **Dependencies**: Pin versions in package.json
3. **Testing**: Write tests for critical functionality
4. **Documentation**: Keep README.md updated
5. **Validation**: Run validate before bundling
6. **Semantic Versioning**: Follow semver for releases
## Advanced Usage
### Custom Webpack Configuration
Extend the default webpack config:
```javascript
// webpack.config.js
const baseConfig = require('./webpack.base.config');
module.exports = {
...baseConfig,
// Custom modifications
resolve: {
...baseConfig.resolve,
alias: {
'@': path.resolve(__dirname, 'src'),
},
},
};
```
### CI/CD Integration
```yaml
# .github/workflows/extension.yml
name: Extension CI
on: [push, pull_request]
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- uses: actions/setup-node@v2
- uses: actions/setup-python@v2
- name: Install CLI
run: pip install apache-superset-extensions-cli
- name: Validate
run: superset-extensions validate --strict
- name: Test
run: superset-extensions test --coverage
- name: Build
run: superset-extensions build --mode production
- name: Bundle
run: superset-extensions bundle
```
## Getting Help
- **Documentation**: [Developer Portal](../)
- **Examples**: [GitHub Repository](https://github.com/apache/superset/tree/master/extensions)
- **Issues**: [GitHub Issues](https://github.com/apache/superset/issues)
- **Community**: [Slack Channel](https://apache-superset.slack.com)

View File

@@ -1,464 +0,0 @@
---
title: Extension Examples
sidebar_position: 1
hide_title: true
---
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
# Extension Examples
Learn from real-world extension implementations that showcase different capabilities of the Superset extension system.
## Dataset References Panel
A SQL Lab panel that analyzes queries and displays information about referenced tables.
### Features
- Parses SQL to extract table references
- Shows table owners and permissions
- Displays last partition information
- Provides row count estimates
### Key Implementation
```typescript
// Parse SQL and extract tables
function extractTables(sql: string): TableReference[] {
const tables = [];
const tableRegex = /FROM\s+(\w+\.?\w+)/gi;
let match;
while ((match = tableRegex.exec(sql)) !== null) {
tables.push({
schema: match[1].split('.')[0],
table: match[1].split('.')[1] || match[1],
});
}
return tables;
}
// Register panel
export function activate(context: ExtensionContext) {
const panel = context.core.registerView('dataset-references.panel', () => (
<DatasetReferencesPanel />
));
// Listen for query changes
const listener = context.sqlLab.onDidChangeEditorContent((content) => {
const tables = extractTables(content);
updatePanelWithTables(tables);
});
context.subscriptions.push(panel, listener);
}
```
### Manifest
```json
{
"name": "dataset-references",
"contributions": {
"views": {
"sqllab.panels": [{
"id": "dataset-references.panel",
"name": "Dataset References",
"icon": "DatabaseOutlined",
"location": "right"
}]
}
}
}
```
## Query Optimizer
Analyzes SQL queries and suggests optimizations.
### Features
- Detects missing indexes
- Suggests query rewrites
- Identifies expensive operations
- Provides execution plan analysis
### Implementation Highlights
```typescript
// Register optimization command
const optimizeCommand = context.commands.registerCommand('query-optimizer.analyze', {
title: 'Analyze Query Performance',
icon: 'ThunderboltOutlined',
execute: async () => {
const query = context.sqlLab.getCurrentQuery();
const database = context.sqlLab.getCurrentDatabase();
// Get execution plan
const plan = await getExecutionPlan(database.id, query);
// Analyze and suggest improvements
const suggestions = analyzeExecutionPlan(plan);
// Show results in panel
showOptimizationResults(suggestions);
}
});
// Add to editor menu
"menus": {
"sqllab.editor": {
"primary": [{
"command": "query-optimizer.analyze",
"when": "editorHasContent"
}]
}
}
```
## Natural Language to SQL
Converts natural language questions to SQL queries using LLM integration.
### Features
- Natural language input
- Context-aware SQL generation
- Query validation
- History tracking
### Key Components
```typescript
// Backend API endpoint
@rest_api.route('/nl2sql/generate')
def generate_sql(prompt: str, context: dict):
# Use LLM to generate SQL
sql = llm_client.generate(
prompt=prompt,
schema=context['schema'],
examples=context['examples']
)
# Validate generated SQL
validation = validate_sql(sql)
return {
'sql': sql,
'valid': validation.is_valid,
'errors': validation.errors
}
```
```typescript
// Frontend integration
function NL2SQLPanel() {
const [prompt, setPrompt] = useState('');
const [loading, setLoading] = useState(false);
const generateSQL = async () => {
setLoading(true);
const response = await context.network.api.post('/extensions/nl2sql/generate', {
prompt,
context: {
database: context.sqlLab.getCurrentDatabase(),
schema: await context.sqlLab.getCurrentSchema(),
}
});
if (response.valid) {
// Insert SQL into editor
context.sqlLab.insertText(response.sql);
}
setLoading(false);
};
return (
<div>
<Input.TextArea
value={prompt}
onChange={(e) => setPrompt(e.target.value)}
placeholder="Describe what data you want..."
/>
<Button onClick={generateSQL} loading={loading}>
Generate SQL
</Button>
</div>
);
}
```
## Schema Visualizer
Interactive database schema visualization.
### Features
- Visual ERD diagram
- Table relationships
- Column details on hover
- Export to image
### Implementation
```typescript
import { Graph } from '@antv/g6';
function SchemaVisualizer() {
const containerRef = useRef<HTMLDivElement>(null);
const [graph, setGraph] = useState<Graph>();
useEffect(() => {
if (!containerRef.current) return;
const g = new Graph({
container: containerRef.current,
layout: {
type: 'dagre',
rankdir: 'LR',
},
defaultNode: {
type: 'sql-table-node',
},
defaultEdge: {
type: 'sql-relation-edge',
},
});
setGraph(g);
loadSchemaData(g);
return () => g.destroy();
}, []);
const loadSchemaData = async (g: Graph) => {
const tables = await context.sqlLab.getTables();
const nodes = tables.map(table => ({
id: table.name,
label: table.name,
columns: table.columns,
}));
const edges = extractRelationships(tables);
g.data({ nodes, edges });
g.render();
};
return <div ref={containerRef} style={{ height: '100%' }} />;
}
```
## SQL Formatter
Formats and beautifies SQL code with customizable rules.
### Features
- Multiple formatting styles
- Custom rule configuration
- Batch formatting
- Format on save
### Simple Implementation
```typescript
import { format } from 'sql-formatter';
const formatCommand = context.commands.registerCommand('sql-formatter.format', {
title: 'Format SQL',
execute: () => {
const sql = context.sqlLab.getCurrentQuery();
const formatted = format(sql, {
language: 'sql',
indent: ' ',
uppercase: true,
linesBetweenQueries: 2,
});
context.sqlLab.replaceQuery(formatted);
}
});
// Auto-format on save
context.sqlLab.onWillSaveQuery((event) => {
if (context.storage.local.get('autoFormat')) {
const formatted = format(event.query);
event.waitUntil(Promise.resolve(formatted));
}
});
```
## Query History Search
Enhanced query history with advanced search and filtering.
### Features
- Full-text search
- Filter by date, user, database
- Query statistics
- Export capabilities
### UI Component
```typescript
function QueryHistoryPanel() {
const [queries, setQueries] = useState<Query[]>([]);
const [filters, setFilters] = useState<Filters>({});
useEffect(() => {
loadQueries();
}, [filters]);
const loadQueries = async () => {
const history = await context.network.api.get('/api/v1/query', {
params: {
...filters,
page_size: 100,
}
});
setQueries(history.result);
};
return (
<div>
<SearchFilters onChange={setFilters} />
<Table
dataSource={queries}
columns={[
{ title: 'Query', dataIndex: 'sql', ellipsis: true },
{ title: 'Database', dataIndex: 'database' },
{ title: 'Status', dataIndex: 'status' },
{ title: 'Duration', dataIndex: 'duration' },
{ title: 'User', dataIndex: 'user' },
{
title: 'Actions',
render: (query) => (
<Button
icon={<CopyOutlined />}
onClick={() => context.sqlLab.insertText(query.sql)}
/>
),
},
]}
/>
</div>
);
}
```
## Git Integration
Version control for SQL queries and dashboards.
### Features
- Save queries to Git
- Track changes
- Collaborative editing
- Branch management
### Backend Integration
```python
from git import Repo
class GitExtension:
def __init__(self, repo_path):
self.repo = Repo(repo_path)
def save_query(self, query, message):
# Save query to file
path = f"queries/{query.name}.sql"
with open(path, 'w') as f:
f.write(query.sql)
# Commit to Git
self.repo.index.add([path])
self.repo.index.commit(message)
return {
'status': 'success',
'commit': self.repo.head.commit.hexsha
}
```
## Best Practices from Examples
### 1. User Experience
- Provide clear feedback for async operations
- Handle errors gracefully
- Include loading states
- Add keyboard shortcuts
### 2. Performance
- Debounce expensive operations
- Cache API responses
- Use virtual scrolling for large lists
- Lazy load heavy components
### 3. Integration
- Respect Superset's theme
- Use provided UI components
- Follow existing UX patterns
- Integrate with existing menus
### 4. Code Organization
```
extension/
├── frontend/
│ ├── src/
│ │ ├── components/ # UI components
│ │ ├── hooks/ # Custom hooks
│ │ ├── services/ # API services
│ │ ├── utils/ # Utilities
│ │ └── index.tsx # Entry point
│ └── tests/
├── backend/
│ ├── src/
│ │ ├── api/ # REST endpoints
│ │ ├── models/ # Data models
│ │ ├── services/ # Business logic
│ │ └── entrypoint.py
│ └── tests/
```
### 5. Testing
```typescript
// Test example
describe('DatasetReferences', () => {
it('should extract tables from SQL', () => {
const sql = 'SELECT * FROM users JOIN orders ON users.id = orders.user_id';
const tables = extractTables(sql);
expect(tables).toEqual([
{ schema: 'public', table: 'users' },
{ schema: 'public', table: 'orders' },
]);
});
});
```
## Learn More
- [API Reference](../api/frontend)
- [Architecture Overview](../architecture/overview)
- [Getting Started Guide](../getting-started)
- [CLI Documentation](../cli/overview)

View File

@@ -1,248 +0,0 @@
---
title: Getting Started
sidebar_position: 2
hide_title: true
---
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
# Getting Started with Extensions
This guide will walk you through creating, developing, and deploying your first Superset extension.
## Prerequisites
Before you begin, make sure you have:
- **Node.js 16+** and npm/yarn installed
- **Python 3.9+** and pip installed
- **Superset** running locally or access to a Superset instance
- **Git** for version control
- Basic knowledge of React and TypeScript
## Quick Start
### 1. Install the CLI
First, install the Superset Extensions CLI globally:
```bash
pip install apache-superset-extensions-cli
```
### 2. Create Your First Extension
Use the CLI to scaffold a new extension project:
```bash
superset-extensions init my-first-extension
cd my-first-extension
```
This creates the following structure:
```
my-first-extension/
├── extension.json # Extension metadata
├── frontend/ # Frontend code
│ ├── src/
│ │ └── index.tsx # Main entry point
│ ├── package.json
│ └── webpack.config.js
├── backend/ # Backend code (optional)
│ ├── src/
│ └── requirements.txt
└── README.md
```
### 3. Configure Your Extension
Edit `extension.json` to define your extension's capabilities:
```json
{
"name": "my-first-extension",
"version": "1.0.0",
"description": "My first Superset extension",
"author": "Your Name",
"frontend": {
"contributions": {
"views": {
"sqllab.panels": [
{
"id": "my-extension.main",
"name": "My Panel",
"icon": "ToolOutlined"
}
]
}
}
}
}
```
### 4. Develop Your Extension
Edit `frontend/src/index.tsx` to implement your extension:
```typescript
import React from 'react';
import { ExtensionContext } from '@apache-superset/core';
export function activate(context: ExtensionContext) {
// Register your panel component
const panel = context.core.registerView('my-extension.main', () => (
<div style={{ padding: 20 }}>
<h2>Hello from My Extension!</h2>
<p>This is my first Superset extension.</p>
</div>
));
// Clean up on deactivation
context.subscriptions.push(panel);
}
export function deactivate() {
// Cleanup code if needed
}
```
### 5. Test Locally
Enable development mode in your Superset configuration:
```python
# superset_config.py
ENABLE_EXTENSIONS = True
LOCAL_EXTENSIONS = [
"/path/to/my-first-extension"
]
```
Run the development server:
```bash
# In your extension directory
superset-extensions dev
# In a separate terminal, start Superset
superset run -p 8088 --with-threads --reload
```
Your extension will now appear in SQL Lab!
### 6. Build and Package
When ready to distribute your extension:
```bash
superset-extensions build
superset-extensions bundle
```
This creates a `my-first-extension-1.0.0.supx` file that can be uploaded to any Superset instance.
### 7. Deploy
Upload your extension via the Superset UI or API:
```bash
curl -X POST http://localhost:8088/api/v1/extensions/import/ \
-H "Authorization: Bearer YOUR_TOKEN" \
-F "file=@my-first-extension-1.0.0.supx"
```
## What's Next?
Now that you have a basic extension working, explore:
- [Extension Architecture](../architecture/overview) - Understand how extensions work
- [API Reference](../api/frontend) - Learn about available APIs
- [Examples](../examples) - See more complex extension examples
- [Best Practices](./best-practices) - Learn extension development best practices
## Common Patterns
### Adding a SQL Lab Panel
```typescript
const panel = context.core.registerView('my-extension.panel', () => (
<MyPanelComponent />
));
```
### Registering Commands
```typescript
const command = context.commands.registerCommand('my-extension.run', {
title: 'Run My Command',
execute: () => {
// Command logic
}
});
```
### Listening to Events
```typescript
const listener = context.sqlLab.onDidQueryRun((query) => {
console.log('Query executed:', query);
});
```
### Adding Menu Items
```json
{
"menus": {
"sqllab.editor": {
"primary": [{
"command": "my-extension.run",
"when": "editorHasSelection"
}]
}
}
}
```
## Troubleshooting
### Extension Not Loading
- Ensure `ENABLE_EXTENSIONS = True` in your config
- Check the browser console for errors
- Verify the extension path in `LOCAL_EXTENSIONS`
### Module Not Found Errors
- Run `npm install` in the frontend directory
- Check that `@apache-superset/core` is properly externalized
### Build Failures
- Ensure all dependencies are installed
- Check webpack.config.js for proper Module Federation setup
- Verify extension.json is valid JSON
## Get Help
- Join the [Superset Slack](https://join.slack.com/t/apache-superset/shared_invite/zt-16jvzmoi8-sI1TY1Pm~y_RnSiUAN0jqQ)
- Ask questions on [GitHub Discussions](https://github.com/apache/superset/discussions)
- Report issues on [GitHub Issues](https://github.com/apache/superset/issues)

View File

@@ -1,126 +0,0 @@
---
title: Developer Portal
sidebar_position: 1
hide_title: true
---
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
# Superset Developer Portal
Welcome to the Apache Superset Developer Portal! This is your comprehensive guide to extending and customizing Superset through our new extension architecture.
## What are Superset Extensions?
Superset Extensions provide a powerful way to enhance and customize Apache Superset without modifying the core codebase. Following the successful model pioneered by VS Code, our extension architecture enables developers to:
- **Add custom features** to SQL Lab and other Superset modules
- **Create reusable components** that can be shared across organizations
- **Integrate external tools** and services seamlessly
- **Customize workflows** to match your team's specific needs
## Why Extensions?
As Superset has grown, we've recognized the need for a more modular architecture that allows:
- **Innovation without fragmentation** - Build features without forking the codebase
- **Community-driven development** - Share and reuse extensions across organizations
- **Stable APIs** - Develop against versioned, well-documented interfaces
- **Rapid iteration** - Deploy custom features without waiting for core releases
## Key Features
### 🎯 Well-Defined Extension Points
Extensions can contribute to specific areas of Superset:
- SQL Lab panels (left, right, bottom, editor)
- Custom commands and menu items
- Status bar components
- API endpoints and backend functionality
### 🔧 Modern Development Experience
- **CLI tools** for scaffolding, building, and packaging
- **Hot reloading** during development
- **TypeScript support** with full type safety
- **Module Federation** for dynamic loading
### 📦 Simple Distribution
- Package extensions as `.supx` files
- Upload via REST API or UI
- Automatic activation and lifecycle management
- Version compatibility checking
## Getting Started
Ready to build your first extension? Check out our [Getting Started Guide](./getting-started) to:
1. Set up your development environment
2. Create your first extension
3. Test it locally
4. Package and deploy it
## Architecture Overview
Our extension architecture is built on several key principles:
- **Lean Core**: Keep Superset's core minimal and delegate features to extensions
- **Explicit APIs**: Clear, versioned interfaces for extension interactions
- **Lazy Loading**: Extensions load only when needed for optimal performance
- **Security First**: Extensions run with appropriate permissions and sandboxing
Learn more in our [Architecture Documentation](./architecture/overview).
## Current Status
The extension architecture is currently in active development. The initial focus is on SQL Lab extensions, with plans to expand to:
- Dashboard extensions
- Chart plugins (enhanced from current system)
- Database connectors
- Security providers
## Example: Dataset References Extension
See the extension system in action with our Dataset References example, which adds a SQL Lab panel showing:
- Tables referenced in queries
- Table owners for permission requests
- Last available partitions
- Estimated row counts
## Join the Community
- **Contribute**: Help shape the future of Superset extensions
- **Share**: Publish your extensions for others to use
- **Learn**: Explore extensions built by the community
## Quick Links
- [Getting Started Guide](./getting-started)
- [Extension Architecture](./architecture/overview)
- [API Reference](./api/frontend)
- [CLI Documentation](./cli/overview)
- [Examples](./examples)
---
*The Superset extension architecture is inspired by the successful model of VS Code Extensions, bringing similar flexibility and power to the data exploration domain.*

View File

@@ -1,59 +0,0 @@
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
module.exports = {
developerPortalSidebar: [
'index',
{
type: 'category',
label: 'Getting Started',
items: [
'getting-started/index',
],
},
{
type: 'category',
label: 'Architecture',
items: [
'architecture/overview',
],
},
{
type: 'category',
label: 'API Reference',
items: [
'api/frontend',
],
},
{
type: 'category',
label: 'CLI',
items: [
'cli/overview',
],
},
{
type: 'category',
label: 'Examples',
items: [
'examples/index',
],
},
],
};

View File

@@ -1 +0,0 @@
[]

View File

@@ -1 +0,0 @@
[]

View File

@@ -13,9 +13,9 @@ apache-superset>=6.0
Superset now rides on **Ant Design v5's token-based theming**.
Every Antd token works, plus a handful of Superset-specific ones for charts and dashboard chrome.
## Managing Themes via UI
## Managing Themes via CRUD Interface
Superset includes a built-in **Theme Management** interface accessible from the admin menu under **Settings > Themes**.
Superset now includes a built-in **Theme Management** interface accessible from the admin menu under **Settings > Themes**.
### Creating a New Theme
@@ -29,38 +29,22 @@ Superset includes a built-in **Theme Management** interface accessible from the
You can also extend with Superset-specific tokens (documented in the default theme object) before you import.
### System Theme Administration
When `ENABLE_UI_THEME_ADMINISTRATION = True` is configured, administrators can manage system-wide themes directly from the UI:
#### Setting System Themes
- **System Default Theme**: Click the sun icon on any theme to set it as the system-wide default
- **System Dark Theme**: Click the moon icon on any theme to set it as the system dark mode theme
- **Automatic OS Detection**: When both default and dark themes are set, Superset automatically detects and applies the appropriate theme based on OS preferences
#### Managing System Themes
- System themes are indicated with special badges in the theme list
- Only administrators with write permissions can modify system theme settings
- Removing a system theme designation reverts to configuration file defaults
### Applying Themes to Dashboards
Once created, themes can be applied to individual dashboards:
- Edit any dashboard and select your custom theme from the theme dropdown
- Each dashboard can have its own theme, allowing for branded or context-specific styling
## Configuration Options
## Alternative: Instance-wide Configuration
### Python Configuration
For system-wide theming, you can configure default themes via Python configuration:
Configure theme behavior via `superset_config.py`:
### Setting Default Themes
```python
# Enable UI-based theme administration for admins
ENABLE_UI_THEME_ADMINISTRATION = True
# superset_config.py
# Optional: Set initial default themes via configuration
# These can be overridden via the UI when ENABLE_UI_THEME_ADMINISTRATION = True
# Default theme (light mode)
THEME_DEFAULT = {
"token": {
"colorPrimary": "#2893B3",
@@ -69,7 +53,7 @@ THEME_DEFAULT = {
}
}
# Optional: Dark theme configuration
# Dark theme configuration
THEME_DARK = {
"algorithm": "dark",
"token": {
@@ -78,28 +62,23 @@ THEME_DARK = {
}
}
# To force a single theme on all users, set THEME_DARK = None
# When both themes are defined (via UI or config):
# - Users can manually switch between themes
# - OS preference detection is automatically enabled
# Theme behavior settings
THEME_SETTINGS = {
"enforced": False, # If True, forces default theme always
"allowSwitching": True, # Allow users to switch between themes
"allowOSPreference": True, # Auto-detect system theme preference
}
```
### Migration from Configuration to UI
### Copying Themes from CRUD Interface
When `ENABLE_UI_THEME_ADMINISTRATION = True`:
To use a theme created via the CRUD interface as your system default:
1. System themes set via the UI take precedence over configuration file settings
2. The UI shows which themes are currently set as system defaults
3. Administrators can change system themes without restarting Superset
4. Configuration file themes serve as fallbacks when no UI themes are set
1. Navigate to **Settings > Themes** and edit your desired theme
2. Copy the complete JSON configuration from the theme definition field
3. Paste it directly into your `superset_config.py` as shown above
### Copying Themes Between Systems
To export a theme for use in configuration files or another instance:
1. Navigate to **Settings > Themes** and click the export icon on your desired theme
2. Extract the JSON configuration from the exported YAML file
3. Use this JSON in your `superset_config.py` or import it into another Superset instance
Restart Superset to apply changes.
## Theme Development Workflow
@@ -108,85 +87,8 @@ To export a theme for use in configuration files or another instance:
3. **Apply**: Assign themes to specific dashboards or configure instance-wide
4. **Iterate**: Modify theme JSON directly in the CRUD interface or re-import from the theme editor
## Custom Fonts
Superset supports custom fonts through runtime configuration, allowing you to use branded or custom typefaces without rebuilding the application.
### Configuring Custom Fonts
Add font URLs to your `superset_config.py`:
```python
# Load fonts from Google Fonts, Adobe Fonts, or self-hosted sources
CUSTOM_FONT_URLS = [
"https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600;700&display=swap",
"https://fonts.googleapis.com/css2?family=JetBrains+Mono:wght@400;500&display=swap",
]
# Update CSP to allow font sources
TALISMAN_CONFIG = {
"content_security_policy": {
"font-src": ["'self'", "https://fonts.googleapis.com", "https://fonts.gstatic.com"],
"style-src": ["'self'", "'unsafe-inline'", "https://fonts.googleapis.com"],
}
}
```
### Using Custom Fonts in Themes
Once configured, reference the fonts in your theme configuration:
```python
THEME_DEFAULT = {
"token": {
"fontFamily": "Inter, -apple-system, BlinkMacSystemFont, sans-serif",
"fontFamilyCode": "JetBrains Mono, Monaco, monospace",
# ... other theme tokens
}
}
```
Or in the CRUD interface theme JSON:
```json
{
"token": {
"fontFamily": "Inter, -apple-system, BlinkMacSystemFont, sans-serif",
"fontFamilyCode": "JetBrains Mono, Monaco, monospace"
}
}
```
### Font Sources
- **Google Fonts**: Free, CDN-hosted fonts with wide variety
- **Adobe Fonts**: Premium fonts (requires subscription and kit ID)
- **Self-hosted**: Place font files in `/static/assets/fonts/` and reference via CSS
This feature works with the stock Docker image - no custom build required!
## Advanced Features
- **System Themes**: Manage system-wide default and dark themes via UI or configuration
- **System Themes**: Superset includes built-in light and dark themes
- **Per-Dashboard Theming**: Each dashboard can have its own visual identity
- **JSON Editor**: Edit theme configurations directly within Superset's interface
- **Custom Fonts**: Load external fonts via configuration without rebuilding
- **OS Dark Mode Detection**: Automatically switches themes based on system preferences
- **Theme Import/Export**: Share themes between instances via YAML files
## API Access
For programmatic theme management, Superset provides REST endpoints:
- `GET /api/v1/theme/` - List all themes
- `POST /api/v1/theme/` - Create a new theme
- `PUT /api/v1/theme/{id}` - Update a theme
- `DELETE /api/v1/theme/{id}` - Delete a theme
- `PUT /api/v1/theme/{id}/set_system_default` - Set as system default theme (admin only)
- `PUT /api/v1/theme/{id}/set_system_dark` - Set as system dark theme (admin only)
- `DELETE /api/v1/theme/unset_system_default` - Remove system default designation
- `DELETE /api/v1/theme/unset_system_dark` - Remove system dark designation
- `GET /api/v1/theme/export/` - Export themes as YAML
- `POST /api/v1/theme/import/` - Import themes from YAML
These endpoints require appropriate permissions and are subject to RBAC controls.

View File

@@ -137,7 +137,7 @@ contributing to Apache Superset more accessible to developers worldwide.
1. **Create a Codespace**: Use this pre-configured link that sets up everything you need:
[**Launch Superset Codespace →**](https://github.com/codespaces/new?skip_quickstart=true&machine=standardLinux32gb&repo=39464018&ref=master&devcontainer_path=.devcontainer%2Fdevcontainer.json&geo=UsWest)
[**Launch Superset Codespace →**](https://github.com/codespaces/new?skip_quickstart=true&machine=standardLinux32gb&repo=39464018&ref=codespaces&geo=UsWest&devcontainer_path=.devcontainer%2Fdevcontainer.json)
:::caution
**Important**: You must select at least the **4 CPU / 16GB RAM** machine type (pre-selected in the link above).
@@ -421,6 +421,14 @@ Then make sure you run your WSGI server using the right worker type:
gunicorn "superset.app:create_app()" -k "geventwebsocket.gunicorn.workers.GeventWebSocketWorker" -b 127.0.0.1:8088 --reload
```
You can log anything to the browser console, including objects:
```python
from superset import app
app.logger.error('An exception occurred!')
app.logger.info(form_data)
```
### Frontend
Frontend assets (TypeScript, JavaScript, CSS, and images) must be compiled in order to properly display the web UI. The `superset-frontend` directory contains all NPM-managed frontend assets. Note that for some legacy pages there are additional frontend assets bundled with Flask-Appbuilder (e.g. jQuery and bootstrap). These are not managed by NPM and may be phased out in the future.

View File

@@ -0,0 +1,741 @@
---
title: API Reference
sidebar_position: 3
version: 1
---
# MCP Tools API Reference
Complete reference for all 16 MCP tools with request/response examples.
> 🚀 **First time here?** Start with [Dashboard Tools](#dashboard-tools) or [Chart Tools](#chart-tools) to see the most commonly used features.
>
> 🔐 **Need authentication?** See the [Authentication Guide](./authentication) for JWT setup.
>
> 🔧 **Want to add tools?** Check the [Development Guide](./development#adding-new-tools) for step-by-step instructions.
## Dashboard Tools
### list_dashboards
List dashboards with search, filtering, and pagination support.
**Request Schema:**
```json
{
"search": "sales", // Optional: Search term
"filters": [ // Optional: Advanced filters
{
"col": "published",
"opr": "eq",
"value": true
}
],
"page": 1, // Optional: Page number (default: 1)
"page_size": 20, // Optional: Items per page (default: 20)
"select_columns": [ // Optional: Specific columns
"id", "dashboard_title", "uuid"
],
"use_cache": true // Optional: Use cached data (default: true)
}
```
**Response Example:**
```json
{
"dashboards": [
{
"id": 1,
"uuid": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"dashboard_title": "Sales Performance",
"url": "/superset/dashboard/1/",
"published": true,
"owners": ["admin"],
"created_on": "2024-01-15T10:30:00Z",
"changed_on": "2024-01-20T14:15:00Z"
}
],
"total_count": 45,
"page": 1,
"page_size": 20,
"cache_status": {
"cache_hit": true,
"cache_age_seconds": 300
}
}
```
### get_dashboard_info
Get detailed information about a specific dashboard.
**Request Schema:**
```json
{
"identifier": "a1b2c3d4-e5f6-7890-abcd-ef1234567890", // ID, UUID, or slug
"use_cache": true
}
```
**Response Example:**
```json
{
"dashboard_id": 1,
"uuid": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"dashboard_title": "Sales Performance Dashboard",
"slug": "sales-performance",
"url": "/superset/dashboard/1/",
"published": true,
"owners": ["admin", "analyst"],
"roles": ["Sales Team"],
"charts": [
{
"id": 10,
"slice_name": "Monthly Revenue",
"viz_type": "line"
},
{
"id": 11,
"slice_name": "Regional Sales",
"viz_type": "bar"
}
],
"filters": [
{
"column": "region",
"type": "select"
}
],
"created_on": "2024-01-15T10:30:00Z",
"changed_on": "2024-01-20T14:15:00Z"
}
```
### generate_dashboard
Create a new dashboard with multiple charts.
**Request Schema:**
```json
{
"chart_ids": [10, 11, 12, 13],
"dashboard_title": "Q4 Performance Dashboard",
"description": "Quarterly performance metrics and KPIs",
"published": true,
"layout_type": "grid" // Optional: "grid" or "tabs"
}
```
**Response Example:**
```json
{
"dashboard_id": 25,
"uuid": "new-dash-uuid-here",
"dashboard_title": "Q4 Performance Dashboard",
"url": "/superset/dashboard/25/",
"charts_added": 4,
"layout": {
"type": "grid",
"columns": 2,
"rows": 2
},
"created_on": "2024-01-25T16:45:00Z"
}
```
## Chart Tools
### list_charts
List charts with advanced filtering and search capabilities.
**Request Schema:**
```json
{
"search": "revenue",
"filters": [
{
"col": "viz_type",
"opr": "in",
"value": ["line", "bar", "area"]
}
],
"page": 1,
"page_size": 25,
"select_columns": ["id", "slice_name", "viz_type", "uuid"],
"use_cache": true
}
```
**Response Example:**
```json
{
"charts": [
{
"id": 10,
"uuid": "chart-uuid-1",
"slice_name": "Monthly Revenue Trend",
"viz_type": "line",
"datasource_name": "sales_data",
"owners": ["admin"],
"created_on": "2024-01-10T09:15:00Z"
}
],
"total_count": 125,
"page": 1,
"page_size": 25
}
```
### get_chart_info
Get comprehensive chart information including configuration.
**Request Schema:**
```json
{
"identifier": 10, // ID or UUID
"include_form_data": true, // Include chart configuration
"use_cache": true
}
```
**Response Example:**
```json
{
"chart_id": 10,
"uuid": "chart-uuid-1",
"slice_name": "Monthly Revenue Trend",
"viz_type": "line",
"datasource_id": 5,
"datasource_name": "sales_data",
"datasource_type": "table",
"form_data": {
"viz_type": "line",
"x_axis": "month",
"metrics": ["sum__revenue"],
"time_range": "Last 12 months"
},
"query_context": {
"datasource": {"id": 5, "type": "table"},
"queries": [{"columns": [], "metrics": ["sum__revenue"]}]
},
"explore_url": "/superset/explore/?form_data=%7B%22slice_id%22%3A10%7D",
"owners": ["admin"],
"created_on": "2024-01-10T09:15:00Z",
"changed_on": "2024-01-15T11:30:00Z"
}
```
### generate_chart
Create a new chart with specified configuration.
**Request Schema:**
```json
{
"dataset_id": "5",
"config": {
"chart_type": "xy",
"x": {"name": "month", "label": "Month"},
"y": [
{
"name": "revenue",
"aggregate": "SUM",
"label": "Total Revenue"
},
{
"name": "orders",
"aggregate": "COUNT",
"label": "Order Count"
}
],
"kind": "line",
"x_axis": {
"title": "Month",
"format": "smart_date"
},
"y_axis": {
"title": "Revenue ($)",
"format": "$,.0f"
},
"legend": {
"show": true,
"position": "top"
}
},
"slice_name": "Revenue and Orders Trend",
"description": "Monthly revenue and order count comparison",
"save_chart": true,
"generate_preview": true,
"preview_formats": ["url", "ascii"]
}
```
**Response Example:**
```json
{
"chart_id": 45,
"uuid": "new-chart-uuid",
"slice_name": "Revenue and Orders Trend",
"viz_type": "echarts_timeseries_line",
"datasource_id": 5,
"explore_url": "/superset/explore/?form_data=%7B%22slice_id%22%3A45%7D",
"query_executed": true,
"query_result": {
"status": "success",
"row_count": 12,
"execution_time": 0.145
},
"preview": {
"url": {
"preview_url": "http://localhost:5008/screenshot/chart/45.png",
"width": 800,
"height": 600
},
"ascii": {
"ascii_content": "Revenue Trend\n==============\nJan |████████████████ $125K\nFeb |██████████████████ $140K\n...",
"width": 80,
"height": 20
}
},
"created_on": "2024-01-25T14:20:00Z"
}
```
### get_chart_data
Export chart data in multiple formats.
**Request Schema:**
```json
{
"identifier": 10,
"format": "json", // "json", "csv", "excel"
"limit": 1000, // Optional: Row limit
"offset": 0, // Optional: Row offset
"filters": [ // Optional: Additional filters
{
"column": "region",
"op": "=",
"value": "US"
}
],
"use_cache": true,
"force_refresh": false
}
```
**Response Example:**
```json
{
"data": [
{
"month": "2024-01",
"revenue": 125000,
"orders": 450
},
{
"month": "2024-02",
"revenue": 140000,
"orders": 520
}
],
"total_rows": 12,
"columns": [
{"name": "month", "type": "DATE"},
{"name": "revenue", "type": "BIGINT"},
{"name": "orders", "type": "BIGINT"}
],
"query": {
"sql": "SELECT month, SUM(revenue) as revenue, COUNT(*) as orders FROM sales_data GROUP BY month ORDER BY month",
"execution_time": 0.089
},
"cache_status": {
"cache_hit": false,
"cache_type": "query",
"refreshed": true
}
}
```
### get_chart_preview
Generate chart previews in multiple formats.
**Request Schema:**
```json
{
"identifier": 10,
"format": "url", // "url", "base64", "ascii", "table"
"width": 800, // For image formats
"height": 600, // For image formats
"ascii_width": 80, // For ASCII format
"ascii_height": 20, // For ASCII format
"use_cache": true
}
```
**Response Examples:**
**URL Format:**
```json
{
"format": "url",
"preview_url": "http://localhost:5008/screenshot/chart/10.png",
"width": 800,
"height": 600,
"supports_interaction": false,
"expires_at": "2024-01-26T14:20:00Z"
}
```
**ASCII Format:**
```json
{
"format": "ascii",
"ascii_content": "Monthly Revenue Trend\n=====================\n\nJan |████████████████████ $125K\nFeb |██████████████████████ $140K\nMar |███████████████████ $135K\nApr |█████████████████████████ $155K\n\nRange: $125K to $155K\n▁▃▂▅▇▆▄▃▂▄▅▆▇▅▃▂",
"width": 80,
"height": 20,
"supports_color": false
}
```
**Table Format:**
```json
{
"format": "table",
"table_data": "Monthly Revenue Data\n====================\n\nMonth | Revenue | Orders\n---------|----------|--------\nJan 2024 | $125,000 | 450\nFeb 2024 | $140,000 | 520\nMar 2024 | $135,000 | 495\n\nTotal: 12 rows × 3 columns",
"row_count": 12,
"supports_sorting": true
}
```
## Dataset Tools
### list_datasets
List available datasets with columns and metrics.
**Request Schema:**
```json
{
"search": "sales",
"filters": [
{
"col": "is_active",
"opr": "eq",
"value": true
}
],
"include_columns": true, // Include column metadata
"include_metrics": true, // Include metric metadata
"page": 1,
"page_size": 15
}
```
**Response Example:**
```json
{
"datasets": [
{
"id": 1,
"uuid": "dataset-uuid-1",
"table_name": "sales_data",
"database_name": "main_warehouse",
"schema": "public",
"owners": ["admin"],
"columns": [
{
"column_name": "region",
"type": "VARCHAR",
"is_active": true,
"is_dttm": false
},
{
"column_name": "revenue",
"type": "DECIMAL",
"is_active": true,
"is_dttm": false
}
],
"metrics": [
{
"metric_name": "sum__revenue",
"expression": "SUM(revenue)",
"metric_type": "sum"
}
],
"created_on": "2024-01-05T08:00:00Z"
}
],
"total_count": 23,
"page": 1,
"page_size": 15
}
```
### get_dataset_info
Get detailed dataset information with full column/metric metadata.
**Request Schema:**
```json
{
"identifier": "dataset-uuid-1", // ID or UUID
"include_columns": true,
"include_metrics": true,
"use_cache": true
}
```
**Response Example:**
```json
{
"dataset_id": 1,
"uuid": "dataset-uuid-1",
"table_name": "sales_data",
"database_name": "main_warehouse",
"database_id": 1,
"schema": "public",
"sql": null,
"is_active": true,
"owners": ["admin", "data_team"],
"columns": [
{
"id": 101,
"column_name": "region",
"type": "VARCHAR",
"is_active": true,
"is_dttm": false,
"groupby": true,
"filterable": true,
"description": "Geographic region"
},
{
"id": 102,
"column_name": "order_date",
"type": "DATE",
"is_active": true,
"is_dttm": true,
"groupby": true,
"filterable": true
}
],
"metrics": [
{
"id": 201,
"metric_name": "sum__revenue",
"expression": "SUM(revenue)",
"metric_type": "sum",
"is_active": true,
"description": "Total revenue"
}
],
"created_on": "2024-01-05T08:00:00Z",
"changed_on": "2024-01-18T12:30:00Z"
}
```
## System Tools
### get_superset_instance_info
Get Superset instance information and statistics.
**Request Schema:**
```json
{
"include_statistics": true, // Include usage statistics
"include_tools": true, // Include available MCP tools
"use_cache": true
}
```
**Response Example:**
```json
{
"version": "4.1.0",
"build": "apache-superset-4.1.0",
"mcp_service_version": "1.0.0",
"authentication": {
"enabled": true,
"type": "jwt_bearer",
"required_scopes": ["dashboard:read", "chart:read"]
},
"statistics": {
"dashboards": {
"total": 45,
"published": 32
},
"charts": {
"total": 125,
"by_viz_type": {
"line": 35,
"bar": 28,
"table": 42,
"pie": 20
}
},
"datasets": {
"total": 23,
"active": 18
},
"users": {
"total": 15,
"active": 12
}
},
"mcp_tools": [
{
"name": "list_dashboards",
"description": "List dashboards with search and filtering",
"category": "dashboard"
},
{
"name": "generate_chart",
"description": "Create new charts programmatically",
"category": "chart"
}
],
"database_connections": [
{
"id": 1,
"database_name": "main_warehouse",
"backend": "postgresql",
"status": "healthy"
}
],
"cache_status": {
"enabled": true,
"backend": "redis",
"hit_rate": 0.85
}
}
```
### generate_explore_link
Generate Superset explore URLs with pre-configured chart settings.
**Request Schema:**
```json
{
"dataset_id": "1",
"chart_config": {
"viz_type": "line",
"x_axis": "month",
"metrics": ["sum__revenue"],
"time_range": "Last 6 months"
},
"title": "Revenue Analysis",
"cache_form_data": true
}
```
**Response Example:**
```json
{
"explore_url": "/superset/explore/?form_data_key=abc123def456",
"full_url": "http://localhost:8088/superset/explore/?form_data_key=abc123def456",
"form_data_key": "abc123def456",
"expires_at": "2024-01-26T16:45:00Z",
"chart_config": {
"viz_type": "line",
"datasource": "1__table",
"x_axis": "month",
"metrics": ["sum__revenue"],
"time_range": "Last 6 months"
}
}
```
## SQL Lab Tools
### open_sql_lab_with_context
Open SQL Lab with pre-configured database, schema, and SQL.
**Request Schema:**
```json
{
"database_connection_id": 1,
"schema": "public",
"dataset_in_context": "sales_data",
"sql": "SELECT region, SUM(revenue) as total_revenue\nFROM sales_data \nWHERE order_date >= '2024-01-01'\nGROUP BY region\nORDER BY total_revenue DESC",
"title": "Regional Sales Analysis"
}
```
**Response Example:**
```json
{
"sql_lab_url": "/superset/sqllab/?dbid=1&schema=public&sql_template=encoded_sql_here",
"full_url": "http://localhost:8088/superset/sqllab/?dbid=1&schema=public&sql_template=encoded_sql_here",
"database_connection": {
"id": 1,
"database_name": "main_warehouse",
"backend": "postgresql"
},
"schema": "public",
"sql_template": "SELECT region, SUM(revenue) as total_revenue...",
"context": {
"dataset": "sales_data",
"title": "Regional Sales Analysis"
}
}
```
## Error Responses
All tools can return error responses with this structure:
```json
{
"error": "Chart not found with identifier: 999",
"error_type": "NotFound",
"suggestions": [
"Verify the chart ID exists",
"Check if you have permission to access this chart",
"Try using the chart UUID instead of ID"
],
"details": {
"identifier": 999,
"identifier_type": "id"
}
}
```
## Cache Status
Many responses include cache status information:
```json
{
"cache_status": {
"cache_hit": true, // Data served from cache
"cache_type": "query", // Type: query, metadata, form_data
"cache_age_seconds": 300, // Age of cached data
"refreshed": false // Whether cache was refreshed
}
}
```
This API reference provides complete documentation for integrating with the Superset MCP service, including all request schemas, response formats, and error handling patterns.
## What's Next?
### 🔐 **Ready for Production?**
Set up authentication and security with the [Authentication Guide](./authentication).
### 🔧 **Want to Add More Tools?**
Learn how to extend the MCP service in the [Development Guide](./development).
### 🏗️ **Need Architecture Details?**
Understand the system design in the [Architecture Overview](./architecture).
### 🏢 **Enterprise Features?**
Explore advanced capabilities in the [Preset Integration Guide](./preset-integration).
> 📖 **Back to Documentation Index**: [MCP Service](./intro)

View File

@@ -0,0 +1,191 @@
---
title: Architecture Overview
sidebar_position: 5
version: 1
---
# Architecture Overview
The Superset Model Context Protocol (MCP) service provides a modular, schema-driven interface for programmatic access to Superset dashboards, charts, datasets, and instance metadata. Built on FastMCP for LLM agents and automation tools.
**Status:** Phase 1 Complete. Core functionality stable, authentication production-ready. See [SIP-171](https://github.com/apache/superset/issues/33870) for roadmap.
## Core Architecture
### Tool Structure
- **16 MCP tools** organized by domain: `dashboard/`, `chart/`, `dataset/`, `system/`
- All tools decorated with `@mcp.tool` and `@mcp_auth_hook`
- **Import inside functions**: All Superset DAOs/commands imported in function body to ensure proper app context
- Pydantic v2 schemas with LLM/OpenAPI-compatible field descriptions
### Request Schema Pattern
Eliminates LLM parameter validation issues using structured request objects:
```python
# New approach - single request object
get_dataset_info(request={"identifier": 123}) # ID
get_dataset_info(request={"identifier": "uuid-string"}) # UUID
# Old approach - replaced
get_dataset_info(dataset_id=123)
```
### Multi-Identifier Support
- **Charts/Datasets**: ID (numeric) or UUID (string)
- **Dashboards**: ID (numeric), UUID (string), or slug (string)
- Validation prevents conflicting parameters (search + filters)
## Available Tools
### Dashboard Tools (5)
- `list_dashboards` - List with search/filters/pagination
- `get_dashboard_info` - Get by ID/UUID/slug
- `get_dashboard_available_filters` - Discover filterable columns
- `generate_dashboard` - Create dashboards with multiple charts
- `add_chart_to_existing_dashboard` - Add charts to existing dashboards
### Chart Tools (8)
- `list_charts` - List with search/filters/pagination
- `get_chart_info` - Get by ID/UUID
- `get_chart_available_filters` - Discover filterable columns
- `generate_chart` - Create charts (table, line, bar, area, scatter)
- `update_chart` - Update saved charts
- `update_chart_preview` - Update cached previews
- `get_chart_data` - Export data (JSON/CSV/Excel)
- `get_chart_preview` - Screenshots, ASCII art, table previews
### Dataset Tools (3)
- `list_datasets` - List with columns/metrics
- `get_dataset_info` - Get by ID/UUID with metadata
- `get_dataset_available_filters` - Discover filterable columns
### System Tools (2)
- `get_superset_instance_info` - Instance statistics and version
- `generate_explore_link` - Generate chart exploration URLs
### SQL Lab Tools (1)
- `open_sql_lab_with_context` - Pre-configured SQL Lab sessions
## Authentication & Security
### JWT Bearer Authentication
Production-ready authentication with configurable factory pattern:
```python
# In superset_config.py
MCP_AUTH_ENABLED = True
MCP_JWKS_URI = "https://auth.company.com/.well-known/jwks.json"
MCP_JWT_ISSUER = "https://auth.company.com/"
MCP_JWT_AUDIENCE = "superset-mcp-api"
```
### Scope-Based Authorization
| Tool Category | Required Scope |
|---------------|----------------|
| Dashboard ops | `dashboard:read` |
| Chart ops | `chart:read` / `chart:write` |
| Dataset ops | `dataset:read` |
| System ops | `instance:read` |
### Audit Logging
All operations logged with MCP context:
- User impersonation tracking
- Tool execution details
- Sanitized payloads (sensitive data redacted)
## Cache Control
Leverages Superset's existing cache layers with comprehensive control:
### Cache Types
1. **Query Result Cache** - Database query results
2. **Metadata Cache** - Table schemas, columns, metrics
3. **Form Data Cache** - Chart configurations
4. **Dashboard Cache** - Rendered components
### Cache Parameters
Tools support cache control through request schemas:
- `use_cache`: Enable/disable caching (default: true)
- `force_refresh`: Force cache refresh (default: false)
- `cache_timeout`: Override timeout in seconds
- `refresh_metadata`: Force metadata refresh
### Cache Status Reporting
```json
{
"cache_status": {
"cache_hit": true,
"cache_type": "query",
"cache_age_seconds": 300,
"refreshed": false
}
}
```
## Tool Abstractions
### Generic Base Classes
- **ModelListTool**: Handles list/search/filter operations with pagination
- **ModelGetInfoTool**: Single object retrieval by multiple identifier types
- **ModelGetAvailableFiltersTool**: Returns filterable columns/operators
### Implementation Pattern
```python
@mcp.tool
@mcp_auth_hook
def my_tool(request: MyRequest) -> MyResponse:
# Import Superset modules inside function
from superset.daos.dashboard import DashboardDAO
from superset.commands.chart.create import CreateChartCommand
# Tool implementation
return response
```
## Configuration
### URL Configuration
Centralized URL management for consistent link generation:
```python
# In superset_config.py
SUPERSET_WEBSERVER_ADDRESS = "http://localhost:8088" # Development
SUPERSET_WEBSERVER_ADDRESS = "https://superset.company.com" # Production
```
### Schema Design Principles
- **Minimal columns** in list responses
- **Optional fields** in info schemas for missing data handling
- **Null exclusion** for cleaner JSON responses
- **Type safety** with clear Pydantic validation
## Adding New Tools
1. **Choose domain folder**: `dashboard/`, `chart/`, `dataset/`, or `system/`
2. **Define schemas**: Use Pydantic with field descriptions
3. **Implement tool**:
- Decorate with `@mcp.tool` and `@mcp_auth_hook`
- Import Superset modules inside function body
- Use generic abstractions where applicable
4. **Register**: Add to appropriate `__init__.py`
5. **Test**: Add unit tests in `tests/unit_tests/mcp_service/`
## Current Status
### ✅ Phase 1 Complete
- FastMCP server with CLI
- JWT authentication with RBAC
- All 16 core tools implemented
- Request schema pattern
- Cache control system
- Audit logging
- 194+ unit tests
### 🎯 Future Enhancements
- Demo notebooks and video examples
- OAuth integration for user impersonation
- Enhanced chart rendering formats
- Advanced security features
**Production Ready**: Core functionality stable with comprehensive testing and authentication.
---
For setup and usage, see the [MCP Service overview](./intro).

View File

@@ -0,0 +1,434 @@
---
title: Authentication & Security
sidebar_position: 4
version: 1
---
# Authentication & Security
The MCP service provides enterprise-grade JWT Bearer authentication with flexible configuration options and comprehensive security controls.
## Quick Start
### Development Mode (Default)
:::tip
Authentication is **disabled by default** for local development - no configuration needed.
:::
```bash
# No configuration needed - service runs without authentication
superset mcp run --port 5008 --debug
```
### Production Mode
:::warning
Always enable authentication for production deployments to secure your Superset instance.
:::
Enable JWT authentication in your Superset configuration:
```python
# In superset_config.py
MCP_AUTH_ENABLED = True
MCP_JWKS_URI = "https://auth.company.com/.well-known/jwks.json"
MCP_JWT_ISSUER = "https://auth.company.com/"
MCP_JWT_AUDIENCE = "superset-mcp-api"
MCP_REQUIRED_SCOPES = ["dashboard:read", "chart:read", "dataset:read"]
```
## Configuration Options
### Option 1: Simple Configuration
Add to your `superset_config.py`:
```python
# Enable authentication
MCP_AUTH_ENABLED = True
# JWT settings
MCP_JWKS_URI = "https://auth.company.com/.well-known/jwks.json"
MCP_JWT_ISSUER = "https://auth.company.com/"
MCP_JWT_AUDIENCE = "superset-mcp-api"
MCP_REQUIRED_SCOPES = ["dashboard:read", "chart:read"]
# Optional: User resolution
MCP_JWT_USER_CLAIM = "sub" # JWT claim for username (default: "sub")
MCP_JWT_EMAIL_CLAIM = "email" # JWT claim for email (default: "email")
MCP_FALLBACK_USER = "admin" # Fallback user if JWT user not found
```
### Option 2: Custom Factory
For advanced authentication requirements:
```python
def create_custom_mcp_auth(app):
"""Custom auth factory for enterprise environments."""
from fastmcp.server.auth.providers.bearer import BearerAuthProvider
return BearerAuthProvider(
jwks_uri=app.config["MCP_JWKS_URI"],
issuer=app.config["MCP_JWT_ISSUER"],
audience=app.config["MCP_JWT_AUDIENCE"],
required_scopes=app.config.get("MCP_REQUIRED_SCOPES", []),
user_resolver=custom_user_resolver,
cache_ttl=300 # Cache JWKS for 5 minutes
)
MCP_AUTH_FACTORY = create_custom_mcp_auth
```
### Option 3: Environment Variables
For containerized deployments:
```bash
# Environment variables
export MCP_AUTH_ENABLED=true
export MCP_JWKS_URI=https://auth.company.com/.well-known/jwks.json
export MCP_JWT_ISSUER=https://auth.company.com/
export MCP_JWT_AUDIENCE=superset-mcp-api
export MCP_REQUIRED_SCOPES=dashboard:read,chart:read,dataset:read
```
## Identity Provider Integration
### Auth0
```python
# Auth0 configuration
MCP_JWKS_URI = "https://your-tenant.auth0.com/.well-known/jwks.json"
MCP_JWT_ISSUER = "https://your-tenant.auth0.com/"
MCP_JWT_AUDIENCE = "superset-mcp-api"
```
### Okta
```python
# Okta configuration
MCP_JWKS_URI = "https://your-org.okta.com/oauth2/default/v1/keys"
MCP_JWT_ISSUER = "https://your-org.okta.com/oauth2/default"
MCP_JWT_AUDIENCE = "api://superset-mcp"
```
### AWS Cognito
```python
# Cognito configuration
MCP_JWKS_URI = "https://cognito-idp.{region}.amazonaws.com/{userPoolId}/.well-known/jwks.json"
MCP_JWT_ISSUER = "https://cognito-idp.{region}.amazonaws.com/{userPoolId}"
MCP_JWT_AUDIENCE = "your-app-client-id"
```
### Azure AD
```python
# Azure AD configuration
MCP_JWKS_URI = "https://login.microsoftonline.com/{tenant}/discovery/v2.0/keys"
MCP_JWT_ISSUER = "https://login.microsoftonline.com/{tenant}/v2.0"
MCP_JWT_AUDIENCE = "api://superset-mcp"
```
## Scope-Based Authorization
### Standard Scopes
The MCP service defines these standard scopes:
| Scope | Description | Required For |
|-------|-------------|--------------|
| `dashboard:read` | Read dashboard information | `list_dashboards`, `get_dashboard_info` |
| `dashboard:write` | Create/modify dashboards | `generate_dashboard`, `add_chart_to_existing_dashboard` |
| `chart:read` | Read chart information | `list_charts`, `get_chart_info`, `get_chart_data` |
| `chart:write` | Create/modify charts | `generate_chart`, `update_chart` |
| `dataset:read` | Read dataset information | `list_datasets`, `get_dataset_info` |
| `instance:read` | Read instance information | `get_superset_instance_info` |
### Custom Scopes
Define custom scopes for specific use cases:
```python
# Custom scope definitions
CUSTOM_MCP_SCOPES = {
"analytics:export": "Export analytical data",
"reports:generate": "Generate automated reports",
"admin:config": "Access administrative configuration"
}
# Map tools to custom scopes
def get_custom_required_scopes(tool_name: str) -> List[str]:
scope_map = {
"get_chart_data": ["chart:read", "analytics:export"],
"generate_dashboard": ["dashboard:write", "reports:generate"],
"get_superset_instance_info": ["instance:read", "admin:config"]
}
return scope_map.get(tool_name, [])
MCP_SCOPE_RESOLVER = get_custom_required_scopes
```
## JWT Token Format
### Required Claims
Your JWT tokens must include these standard claims:
```json
{
"iss": "https://auth.company.com/", // Issuer
"aud": "superset-mcp-api", // Audience
"sub": "user@company.com", // Subject (username)
"exp": 1704118800, // Expiration timestamp
"iat": 1704115200, // Issued at timestamp
"scope": "dashboard:read chart:read" // Space-separated scopes
}
```
### Optional Claims
Additional claims for enhanced functionality:
```json
{
"email": "user@company.com", // User email
"name": "John Doe", // Full name
"groups": ["analysts", "sales_team"], // User groups
"tenant_id": "company_123", // Multi-tenant ID
"role": "analyst" // User role
}
```
## Client Integration
### API Client Usage
```python
import requests
# Get JWT token from your identity provider
token = get_jwt_token()
# Call MCP service with Bearer authentication
headers = {
"Authorization": f"Bearer {token}",
"Content-Type": "application/json"
}
response = requests.post(
"http://localhost:5008/call_tool",
headers=headers,
json={
"tool": "list_dashboards",
"arguments": {"search": "sales"}
}
)
data = response.json()
```
### Claude Desktop with Authentication
For Claude Desktop, the proxy script handles authentication:
```bash
#!/bin/bash
# run_proxy_with_auth.sh
# Get token from environment or file
if [ -f ~/.superset_mcp_token ]; then
TOKEN=$(cat ~/.superset_mcp_token)
else
TOKEN=${SUPERSET_MCP_TOKEN}
fi
# Export token for proxy
export MCP_AUTH_TOKEN="$TOKEN"
cd /path/to/superset
source venv/bin/activate
exec fastmcp proxy http://localhost:5008 --auth-header "Authorization: Bearer $TOKEN"
```
## User Resolution
### Default User Resolution
The service maps JWT claims to Superset users:
```python
def default_user_resolver(claims: Dict[str, Any]) -> User:
"""Default user resolution from JWT claims."""
# Extract username from configurable claim
username = claims.get(app.config.get("MCP_JWT_USER_CLAIM", "sub"))
# Find Superset user
user = security_manager.find_user(username=username)
if not user:
# Try email lookup
email = claims.get(app.config.get("MCP_JWT_EMAIL_CLAIM", "email"))
if email:
user = security_manager.find_user(email=email)
if not user and app.config.get("MCP_FALLBACK_USER"):
# Use fallback user for development
user = security_manager.find_user(username=app.config["MCP_FALLBACK_USER"])
return user
```
### Custom User Resolution
Implement custom user resolution logic:
```python
def custom_user_resolver(claims: Dict[str, Any]) -> User:
"""Custom user resolution for enterprise environments."""
# Extract custom claims
employee_id = claims.get("employee_id")
tenant_id = claims.get("tenant_id")
# Multi-tenant user lookup
user = find_user_by_employee_id(employee_id, tenant_id)
if user:
# Set additional context
user.mcp_tenant_id = tenant_id
user.mcp_groups = claims.get("groups", [])
return user
# Use custom resolver
MCP_USER_RESOLVER = custom_user_resolver
```
## Security Features
### Token Validation
Comprehensive JWT validation:
- **Signature verification**: RS256 with JWKS key rotation support
- **Expiration checking**: Automatic token expiry validation
- **Audience validation**: Prevents token reuse across services
- **Issuer validation**: Ensures tokens from trusted sources only
- **Scope validation**: Enforces tool-level permissions
### Request Security
- **HTTPS enforcement**: Production deployments should use HTTPS
- **Rate limiting**: Configurable per-user rate limits
- **Request logging**: All authenticated requests logged with user context
- **Input validation**: Comprehensive request schema validation
### Audit Logging
Every tool call is logged with security context:
```json
{
"timestamp": "2024-01-25T14:30:00Z",
"user_id": "user@company.com",
"tool_name": "get_chart_data",
"source": "mcp",
"jwt_subject": "user@company.com",
"jwt_scopes": ["chart:read", "analytics:export"],
"tenant_id": "company_123",
"request_id": "req_12345",
"execution_time": 0.145,
"status": "success"
}
```
## Testing Authentication
### Generate Test Tokens
For development and testing:
```python
from fastmcp.server.auth.providers.bearer import RSAKeyPair
# Generate test keypair
keypair = RSAKeyPair.generate()
print("Public key:", keypair.public_key)
# Create test token
token = keypair.create_token(
subject="test@example.com",
issuer="https://test.example.com",
audience="superset-mcp-api",
scopes=["dashboard:read", "chart:read", "dataset:read"],
expires_in=3600 # 1 hour
)
print("Test token:", token)
```
### Test Configuration
```python
# Test configuration with generated keypair
MCP_AUTH_ENABLED = True
MCP_JWT_PUBLIC_KEY = """-----BEGIN PUBLIC KEY-----
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA...
-----END PUBLIC KEY-----"""
MCP_JWT_ISSUER = "https://test.example.com"
MCP_JWT_AUDIENCE = "superset-mcp-api"
MCP_FALLBACK_USER = "admin"
```
### Manual Testing
```bash
# Test with curl
curl -X POST http://localhost:5008/call_tool \
-H "Authorization: Bearer $TEST_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"tool": "get_superset_instance_info",
"arguments": {"include_statistics": true}
}'
```
## Troubleshooting
### Common Issues
**Token Validation Errors:**
```
Error: Invalid JWT signature
Solution: Verify JWKS_URI is accessible and contains correct keys
```
**User Not Found:**
```
Error: User not found for JWT subject
Solution: Check MCP_JWT_USER_CLAIM configuration and user exists in Superset
```
**Insufficient Scopes:**
```
Error: Missing required scope 'chart:read'
Solution: Update JWT token to include required scopes
```
### Debug Configuration
Enable debug logging for authentication issues:
```python
# Enhanced logging for auth debugging
import logging
logging.getLogger('superset.mcp_service.auth').setLevel(logging.DEBUG)
# Log all JWT validation steps
MCP_AUTH_DEBUG = True
```
This authentication guide provides comprehensive coverage for securing the MCP service in production environments while maintaining development flexibility.

View File

@@ -0,0 +1,705 @@
---
title: Development Guide
sidebar_position: 2
version: 1
---
# MCP Service Development Guide
This guide covers the internal architecture, development workflows, and patterns for extending the Superset MCP service.
> 🚀 **New to MCP?** Start with the [Overview](./overview) to understand what the service does before diving into development.
>
> 📚 **Need API examples?** Check the [API Reference](./api-reference) to see how existing tools work.
>
> 🔐 **Planning production use?** Review [Authentication](./authentication) for security considerations.
## Internal Architecture
### Component Overview
The MCP service follows a layered architecture with clear separation of concerns:
```mermaid
graph TB
subgraph "Transport Layer"
HTTP[HTTP Server :5008]
FastMCP[FastMCP Protocol Handler]
end
subgraph "Auth & Middleware Layer"
AuthHook[Auth Hook Decorator]
JWT[JWT Validator]
RBAC[RBAC Engine]
Audit[Audit Logger]
end
subgraph "Tool Layer"
Tools[16 MCP Tools<br/>Tool Decorated]
Schemas[Pydantic Schemas]
Validation[Request Validation]
end
subgraph "Business Logic Layer"
Generic[Generic Tool Abstractions]
ModelList[ModelListTool]
ModelGet[ModelGetInfoTool]
ModelFilter[ModelGetAvailableFiltersTool]
end
subgraph "Data Access Layer"
DAOs[Superset DAOs]
Commands[Superset Commands]
Cache[Cache Manager]
end
subgraph "Storage Layer"
MetaDB[(Metadata DB)]
DataWH[(Data Warehouse)]
Redis[(Redis Cache)]
end
HTTP --> FastMCP
FastMCP --> AuthHook
AuthHook --> JWT
JWT --> RBAC
RBAC --> Audit
Audit --> Tools
Tools --> Schemas
Schemas --> Validation
Validation --> Generic
Generic --> ModelList
Generic --> ModelGet
Generic --> ModelFilter
ModelList --> DAOs
ModelGet --> DAOs
ModelFilter --> DAOs
Tools --> Commands
Commands --> Cache
DAOs --> MetaDB
Commands --> MetaDB
Commands --> DataWH
Cache --> Redis
```
### Request Flow
Every MCP tool call follows this execution pattern:
```mermaid
sequenceDiagram
participant Client as LLM Client
participant MCP as FastMCP Server
participant Auth as Auth Hook
participant Tool as MCP Tool
participant Generic as Generic Abstraction
participant DAO as Superset DAO
participant DB as Database
Client->>+MCP: tool_call(request)
MCP->>+Auth: validate_and_authorize()
Auth->>Auth: Validate JWT token
Auth->>Auth: Check required scopes
Auth->>Auth: Set Flask g.user context
Auth->>Auth: Log audit event
Auth->>+Tool: execute_tool(validated_request)
Tool->>Tool: Parse Pydantic request schema
Tool->>+Generic: Use generic abstraction
Generic->>+DAO: Query Superset data
DAO->>+DB: Execute SQL
DB-->>-DAO: Return results
DAO-->>-Generic: Return objects
Generic->>Generic: Apply pagination/filtering
Generic-->>-Tool: Return formatted data
Tool->>Tool: Build Pydantic response schema
Tool-->>-Auth: Return response
Auth->>Auth: Log success audit event
Auth-->>-MCP: Return validated response
MCP-->>-Client: JSON response
```
### Tool Registration System
Tools are automatically discovered and registered through the decorator pattern:
```python
# superset/mcp_service/mcp_app.py
from fastmcp import FastMCP
# Global MCP instance
mcp = FastMCP("Superset MCP Service")
# Tools register themselves via decorators
@mcp.tool
@mcp_auth_hook(['chart:read'])
def get_chart_info(request: GetChartInfoRequest) -> GetChartInfoResponse:
# Tool implementation
pass
# All tool modules imported to trigger registration
from superset.mcp_service.chart.tool import *
from superset.mcp_service.dashboard.tool import *
from superset.mcp_service.dataset.tool import *
from superset.mcp_service.system.tool import *
```
## Development Patterns
### Tool Implementation Pattern
All tools follow this standardized pattern:
```python
# Example: superset/mcp_service/chart/tool/get_chart_info.py
from superset.mcp_service.auth import mcp_auth_hook
from superset.mcp_service.mcp_app import mcp
from superset.mcp_service.schemas.chart_schemas import (
GetChartInfoRequest,
GetChartInfoResponse,
ChartError
)
@mcp.tool
@mcp_auth_hook(['chart:read'])
def get_chart_info(request: GetChartInfoRequest) -> GetChartInfoResponse:
"""
Get detailed information about a specific chart.
Supports lookup by ID or UUID with comprehensive metadata.
"""
try:
# CRITICAL: Import Superset modules inside function
from superset.daos.chart import ChartDAO
from superset.models.slice import Slice
# Use generic abstraction for common operations
from superset.mcp_service.generic_tools import ModelGetInfoTool
tool = ModelGetInfoTool(
dao=ChartDAO,
model=Slice,
response_schema=GetChartInfoResponse,
identifier_field_map={
'id': 'id',
'uuid': 'uuid'
}
)
return tool.execute(request)
except Exception as e:
return ChartError(
error=f"Failed to get chart info: {str(e)}",
error_type="ChartInfoError"
)
```
### Schema Design Patterns
Pydantic schemas follow these conventions:
```python
# Request Schema Pattern
class GetChartInfoRequest(BaseModel):
"""Request to get detailed chart information."""
identifier: Union[int, str] = Field(
...,
description="Chart ID (numeric) or UUID (string)"
)
include_form_data: bool = Field(
default=True,
description="Whether to include chart configuration"
)
use_cache: bool = Field(
default=True,
description="Whether to use cached data"
)
# Response Schema Pattern
class GetChartInfoResponse(BaseModel):
"""Detailed chart information response."""
chart_id: int = Field(description="Chart numeric ID")
uuid: Optional[str] = Field(description="Chart UUID")
slice_name: str = Field(description="Chart display name")
viz_type: str = Field(description="Visualization type")
datasource_id: Optional[int] = Field(description="Dataset ID")
form_data: Optional[Dict[str, Any]] = Field(description="Chart configuration")
explore_url: Optional[str] = Field(description="Explore URL for editing")
# Cache status for transparency
cache_status: Optional[CacheStatus] = Field(description="Cache hit information")
# Error Schema Pattern
class ChartError(BaseModel):
"""Chart operation error response."""
error: str = Field(description="Error message")
error_type: str = Field(description="Error type identifier")
suggestions: Optional[List[str]] = Field(description="Suggested fixes")
```
### Generic Tool Abstractions
Common operations are abstracted into reusable classes:
```python
# superset/mcp_service/generic_tools.py
from typing import Type, Dict, Any, List, Optional
from pydantic import BaseModel
class ModelListTool:
"""Generic tool for list operations with pagination and filtering."""
def __init__(self,
dao: Type,
model: Type,
response_schema: Type[BaseModel],
default_columns: List[str] = None,
searchable_columns: List[str] = None):
self.dao = dao
self.model = model
self.response_schema = response_schema
self.default_columns = default_columns or []
self.searchable_columns = searchable_columns or []
def execute(self, request: BaseModel) -> BaseModel:
"""Execute list operation with pagination and filtering."""
# Build query with filters
query = self.dao.find_all()
# Apply search if provided
if hasattr(request, 'search') and request.search:
query = self._apply_search(query, request.search)
# Apply filters if provided
if hasattr(request, 'filters') and request.filters:
query = self._apply_filters(query, request.filters)
# Apply pagination
total = query.count()
if hasattr(request, 'page') and hasattr(request, 'page_size'):
offset = (request.page - 1) * request.page_size
query = query.offset(offset).limit(request.page_size)
# Execute query and serialize
results = query.all()
serialized = [self._serialize_model(obj) for obj in results]
return self.response_schema(
results=serialized,
total_count=total,
page=getattr(request, 'page', 1),
page_size=getattr(request, 'page_size', len(serialized))
)
class ModelGetInfoTool:
"""Generic tool for getting single object by multiple identifier types."""
def __init__(self,
dao: Type,
model: Type,
response_schema: Type[BaseModel],
identifier_field_map: Dict[str, str]):
self.dao = dao
self.model = model
self.response_schema = response_schema
self.identifier_field_map = identifier_field_map
def execute(self, request: BaseModel) -> BaseModel:
"""Execute get operation with multi-identifier support."""
identifier = request.identifier
# Determine identifier type and field
if isinstance(identifier, int):
field = self.identifier_field_map.get('id', 'id')
obj = self.dao.find_by_id(identifier)
elif isinstance(identifier, str):
if len(identifier) == 36 and '-' in identifier: # UUID format
field = self.identifier_field_map.get('uuid', 'uuid')
obj = self.dao.find_by_uuid(identifier)
else: # Assume slug
field = self.identifier_field_map.get('slug', 'slug')
obj = getattr(self.dao, 'find_by_slug', lambda x: None)(identifier)
if not obj:
raise ValueError(f"Object not found with identifier: {identifier}")
# Serialize and return
serialized = self._serialize_model(obj)
return self.response_schema(**serialized)
```
## Adding New Tools
### Step-by-Step Process
1. **Define the Domain**
Choose the appropriate domain folder:
- `dashboard/` - Dashboard operations
- `chart/` - Chart operations
- `dataset/` - Dataset operations
- `system/` - System-level operations
2. **Create Schemas**
```bash
# Create schema file
touch superset/mcp_service/schemas/my_domain_schemas.py
```
```python
# Define request/response schemas
class MyToolRequest(BaseModel):
param1: str = Field(description="Parameter description")
param2: Optional[int] = Field(default=None, description="Optional parameter")
class MyToolResponse(BaseModel):
result: str = Field(description="Result description")
metadata: Dict[str, Any] = Field(description="Additional metadata")
```
3. **Implement the Tool**
```bash
# Create tool file
touch superset/mcp_service/my_domain/tool/my_tool.py
```
```python
@mcp.tool
@mcp_auth_hook(['required:scope'])
def my_tool(request: MyToolRequest) -> MyToolResponse:
"""Tool description for LLM."""
# Import Superset modules inside function
from superset.daos.my_dao import MyDAO
# Implement business logic
result = MyDAO.do_something(request.param1)
return MyToolResponse(
result=result,
metadata={"processed_at": datetime.utcnow()}
)
```
4. **Register the Tool**
```python
# Add to superset/mcp_service/my_domain/tool/__init__.py
from .my_tool import my_tool
__all__ = ['my_tool']
```
```python
# Import in superset/mcp_service/mcp_app.py
from superset.mcp_service.my_domain.tool import *
```
5. **Add Tests**
```bash
# Create test file
touch tests/unit_tests/mcp_service/test_my_tool.py
```
```python
import pytest
from superset.mcp_service.my_domain.tool.my_tool import my_tool
from superset.mcp_service.schemas.my_domain_schemas import MyToolRequest
class TestMyTool:
def test_my_tool_success(self):
request = MyToolRequest(param1="test")
response = my_tool(request)
assert response.result == "expected_result"
```
### Tool Best Practices
1. **Import Inside Functions**
```python
# ❌ DON'T: Import at module level
from superset.daos.chart import ChartDAO
@mcp.tool
def my_tool():
# Tool implementation
pass
# ✅ DO: Import inside function
@mcp.tool
def my_tool():
from superset.daos.chart import ChartDAO
# Tool implementation
pass
```
2. **Use Generic Abstractions**
```python
# ✅ Leverage existing patterns
@mcp.tool
def list_my_objects(request):
from superset.mcp_service.generic_tools import ModelListTool
tool = ModelListTool(
dao=MyDAO,
model=MyModel,
response_schema=ListMyObjectsResponse
)
return tool.execute(request)
```
3. **Comprehensive Error Handling**
```python
@mcp.tool
def my_tool(request):
try:
# Tool implementation
return success_response
except PermissionError as e:
return MyToolError(
error="Permission denied",
error_type="PermissionError",
suggestions=["Check user permissions"]
)
except Exception as e:
return MyToolError(
error=f"Unexpected error: {str(e)}",
error_type="InternalError"
)
```
## Testing Patterns
### Unit Test Structure
```python
# tests/unit_tests/mcp_service/test_chart_tools.py
import pytest
from unittest.mock import Mock, patch
from superset.mcp_service.chart.tool.get_chart_info import get_chart_info
from superset.mcp_service.schemas.chart_schemas import GetChartInfoRequest
class TestGetChartInfo:
"""Test suite for get_chart_info tool."""
@patch('superset.mcp_service.chart.tool.get_chart_info.ChartDAO')
def test_get_chart_info_by_id_success(self, mock_dao):
"""Test successful chart lookup by ID."""
# Setup mock
mock_chart = Mock()
mock_chart.id = 1
mock_chart.slice_name = "Test Chart"
mock_chart.viz_type = "line"
mock_dao.find_by_id.return_value = mock_chart
# Execute
request = GetChartInfoRequest(identifier=1)
response = get_chart_info(request)
# Verify
assert response.chart_id == 1
assert response.slice_name == "Test Chart"
mock_dao.find_by_id.assert_called_once_with(1)
@patch('superset.mcp_service.chart.tool.get_chart_info.ChartDAO')
def test_get_chart_info_not_found(self, mock_dao):
"""Test chart not found scenario."""
# Setup mock
mock_dao.find_by_id.return_value = None
# Execute
request = GetChartInfoRequest(identifier=999)
response = get_chart_info(request)
# Verify error response
assert hasattr(response, 'error')
assert "not found" in response.error.lower()
```
### Integration Test Patterns
```python
# tests/integration_tests/mcp_service/test_chart_integration.py
import pytest
from superset.app import create_app
from superset.mcp_service.mcp_app import mcp
from tests.integration_tests.base_tests import SupersetTestCase
class TestChartIntegration(SupersetTestCase):
"""Integration tests for chart tools."""
def setUp(self):
super().setUp()
self.app = create_app()
self.app_context = self.app.app_context()
self.app_context.push()
def tearDown(self):
self.app_context.pop()
super().tearDown()
def test_chart_workflow_integration(self):
"""Test complete chart workflow."""
# Create chart
create_request = {
"dataset_id": "1",
"config": {
"chart_type": "table",
"columns": [{"name": "region"}]
}
}
create_response = mcp.call_tool("generate_chart", create_request)
chart_id = create_response["chart_id"]
# Get chart info
info_request = {"identifier": chart_id}
info_response = mcp.call_tool("get_chart_info", info_request)
assert info_response["chart_id"] == chart_id
assert info_response["viz_type"] == "table"
# Get chart data
data_request = {"identifier": chart_id, "limit": 10}
data_response = mcp.call_tool("get_chart_data", data_request)
assert "data" in data_response
assert len(data_response["data"]) <= 10
```
## Performance Considerations
### Caching Strategy
The MCP service leverages Superset's existing cache layers:
```python
# Cache control in tools
@mcp.tool
def get_chart_data(request: GetChartDataRequest):
"""Tool with cache control."""
cache_config = {
'use_cache': request.use_cache,
'force_refresh': request.force_refresh,
'cache_timeout': request.cache_timeout
}
# Use Superset's cache infrastructure
result = execute_with_cache(query, cache_config)
return ChartDataResponse(
data=result.data,
cache_status=result.cache_status
)
```
### Query Optimization
```python
# Efficient pagination
def list_objects(query, page, page_size):
"""Optimized pagination pattern."""
# Count query optimization
total = query.count()
# Limit columns for list operations
query = query.options(load_only('id', 'name', 'created_on'))
# Apply pagination
offset = (page - 1) * page_size
results = query.offset(offset).limit(page_size).all()
return results, total
```
## Security Considerations
### Authentication Flow
```python
# JWT validation and user context
@mcp_auth_hook(['chart:read'])
def secure_tool(request):
"""Tool with proper security context."""
# g.user is set by auth hook
user_id = g.user.id
# Apply user-specific filtering
query = ChartDAO.find_all().filter(
Chart.owners.contains(g.user)
)
return execute_query(query)
```
### Input Validation
```python
# Comprehensive request validation
class CreateChartRequest(BaseModel):
"""Validated chart creation request."""
dataset_id: Union[int, str] = Field(
...,
description="Dataset ID or UUID"
)
config: ChartConfig = Field(
...,
description="Chart configuration"
)
@validator('dataset_id')
def validate_dataset_id(cls, v):
"""Validate dataset exists and user has access."""
# Validation logic
return v
@validator('config')
def validate_chart_config(cls, v):
"""Validate chart configuration."""
# Configuration validation
return v
```
This development guide provides comprehensive coverage of the MCP service's internal architecture and development patterns, enabling team members to effectively extend and maintain the system.
## Related Documentation
### 📚 **Ready to Use Your New Tools?**
Test your implementations with examples from the [API Reference](./api-reference).
### 🔐 **Securing Your Extensions?**
Add authentication to your tools using the [Authentication Guide](./authentication).
### 🏗️ **Understanding the Big Picture?**
See the complete system design in the [Architecture Overview](./architecture).
### 🏢 **Building Enterprise Features?**
Explore advanced patterns in the [Preset Integration Guide](./preset-integration).
> 📖 **Back to Documentation Index**: [MCP Service](./intro)

View File

@@ -0,0 +1,124 @@
---
title: MCP Service
sidebar_position: 1
version: 1
---
# Superset MCP Service
The Superset Model Context Protocol (MCP) service provides programmatic access to Superset dashboards, charts, datasets, and instance metadata. Built for LLM agents and automation tools.
## What is MCP?
The Model Context Protocol (MCP) is an open standard that allows AI assistants to securely connect to data sources and tools. Superset's MCP service exposes **16 production-ready tools** that enable:
- 📊 **Data Exploration**: List and query dashboards, charts, and datasets
- 🔧 **Chart Creation**: Generate visualizations programmatically
- 📈 **Data Export**: Extract data in multiple formats (JSON, CSV, Excel)
- 🔗 **Navigation**: Generate explore links and SQL Lab sessions
## Quick Start
### Installation
:::note
The MCP service is included with Superset development setup. FastMCP dependencies are installed automatically with `make install`.
:::
```bash
# MCP service is included with Superset development setup
git clone https://github.com/apache/superset.git
cd superset
make venv && source venv/bin/activate
make install
# Start Superset
superset run -p 8088 --with-threads --reload --debugger
# Start MCP service (separate terminal)
source venv/bin/activate
superset mcp run --port 5008 --debug
```
### Claude Desktop Integration
```json
{
"mcpServers": {
"Superset MCP": {
"command": "/path/to/superset/superset/mcp_service/run_proxy.sh",
"args": [],
"env": {}
}
}
}
```
## Key Features
### 🔧 **16 Production Tools**
| Category | Tools | Purpose |
|----------|-------|---------|
| **Dashboard** (5) | List, get info, create, add charts | Dashboard management |
| **Chart** (8) | Full CRUD, data export, previews | Chart operations |
| **Dataset** (3) | List, get info, discover filters | Dataset exploration |
| **System** (2) | Instance info, explore links | System integration |
| **SQL Lab** (1) | Pre-configured sessions | SQL development |
### 🔐 **Enterprise Security**
- **JWT Bearer Authentication**: Production-ready with configurable factory pattern
- **RBAC Integration**: Scope-based permissions with Superset's security model
- **Audit Logging**: Comprehensive MCP context tracking
### 📊 **Advanced Capabilities**
- **Multi-format Export**: JSON, CSV, Excel data export
- **Chart Previews**: Screenshots, ASCII art, table representations
- **Cache Control**: Leverage Superset's existing cache infrastructure
- **Request Schemas**: Eliminates LLM parameter validation issues
## Example Usage
```python
# List dashboards
dashboards = client.call_tool("list_dashboards", {
"search": "sales",
"page_size": 10
})
# Create a chart
chart = client.call_tool("generate_chart", {
"dataset_id": "1",
"config": {
"chart_type": "line",
"x": {"name": "date"},
"y": [{"name": "revenue", "aggregate": "SUM"}]
}
})
# Export chart data
data = client.call_tool("get_chart_data", {
"identifier": chart["chart_id"],
"format": "json",
"limit": 1000
})
```
## Status
✅ **Phase 1 Complete** - Core functionality stable, authentication production-ready, comprehensive testing coverage.
## Documentation Structure
### Getting Started
- **[Overview](./overview)** - Features, use cases, and examples
- **[API Reference](./api-reference)** - Complete tool documentation
### Development
- **[Development Guide](./development)** - Internal architecture and adding tools
- **[Architecture](./architecture)** - System design and patterns
### Production
- **[Authentication](./authentication)** - JWT setup and security
- **[Preset Integration](./preset-integration)** - Enterprise features
> 🚀 **Ready to start?** Continue with the [Overview](./overview) for detailed examples and use cases.

View File

@@ -0,0 +1,196 @@
---
title: MCP Service Overview
sidebar_position: 1
version: 1
---
# Superset MCP Service
The Superset Model Context Protocol (MCP) service provides a modular, schema-driven interface for programmatic access to Superset dashboards, charts, datasets, and instance metadata. Built on FastMCP, it's designed for LLM agents and automation tools.
**Status:** ✅ Phase 1 Complete. Core functionality stable, authentication production-ready, comprehensive testing coverage.
## What is MCP?
The Model Context Protocol (MCP) is an open standard for connecting AI assistants to data sources and tools. Superset's MCP service exposes 16 tools that allow LLM agents to:
- **Explore data**: List and query dashboards, charts, and datasets
- **Create visualizations**: Generate charts and dashboards programmatically
- **Export data**: Extract chart data in multiple formats
- **Navigate interfaces**: Generate explore links and SQL Lab sessions
## Key Features
### 🔧 **16 Production-Ready Tools**
- **Dashboard Tools (5)**: List, get info, create dashboards, add charts
- **Chart Tools (8)**: Full CRUD operations, data export, screenshot previews
- **Dataset Tools (3)**: List, get info, discover filterable columns
- **System Tools (2)**: Instance info, explore link generation
- **SQL Lab Tools (1)**: Pre-configured SQL sessions
### 🔐 **Enterprise Authentication**
- **JWT Bearer Authentication**: Production-ready with configurable factory pattern
- **RBAC Integration**: Scope-based permissions with Superset's security model
- **Audit Logging**: Comprehensive MCP context tracking with impersonation support
### 📊 **Advanced Capabilities**
- **Multi-format Export**: JSON, CSV, Excel data export
- **Chart Previews**: Screenshots, ASCII art, and table representations
- **Cache Control**: Comprehensive control over Superset's cache layers
- **Request Schema Pattern**: Eliminates LLM parameter validation issues
## Architecture Overview
```mermaid
graph TB
subgraph "Client Layer"
LLM[LLM/Agent Client]
Claude[Claude Desktop]
SDK[Custom SDK]
end
subgraph "MCP Service Layer"
FastMCP[FastMCP Server<br/>Port 5008]
Auth[JWT Auth Hook]
Tools[16 MCP Tools]
end
subgraph "Superset Integration"
DAOs[Superset DAOs]
Commands[Superset Commands]
Cache[Cache Layer]
end
subgraph "Data Layer"
DB[(Superset Database)]
DataWarehouse[(Data Warehouse)]
end
LLM --> FastMCP
Claude --> FastMCP
SDK --> FastMCP
FastMCP --> Auth
Auth --> Tools
Tools --> DAOs
Tools --> Commands
Tools --> Cache
DAOs --> DB
Commands --> DB
Commands --> DataWarehouse
```
## Getting Started
### Quick Setup
```bash
# Clone and install Superset
git clone https://github.com/apache/superset.git
cd superset
make venv && source venv/bin/activate
make install
# Start Superset
superset run -p 8088 --with-threads --reload --debugger
# Start MCP service (in separate terminal)
source venv/bin/activate
superset mcp run --port 5008 --debug
```
### Connect to Claude Desktop
:::note
The MCP service runs on HTTP and requires a proxy for Claude Desktop integration.
:::
```bash
# Install FastMCP proxy
pip install fastmcp
```
Configure Claude Desktop (`~/.config/Claude/claude_desktop_config.json`):
```json
{
"mcpServers": {
"Superset MCP": {
"command": "/path/to/superset/superset/mcp_service/run_proxy.sh",
"args": [],
"env": {}
}
}
}
```
## Use Cases
### Data Exploration
- "List all dashboards related to sales"
- "Show me the charts in the Q4 Performance dashboard"
- "What datasets are available for customer analysis?"
### Chart Creation
- "Create a line chart showing revenue trends by month"
- "Generate a table showing top 10 products by sales"
- "Build a bar chart comparing regional performance"
### Data Export
- "Export the sales data from this chart as CSV"
- "Get the underlying data for this dashboard as JSON"
- "Show me a preview of this chart as ASCII art"
### Dashboard Management
- "Create a new dashboard with these 4 charts"
- "Add this revenue chart to the executive dashboard"
- "Generate an explore link for this chart configuration"
## Example Workflow
```python
# List available dashboards
dashboards = client.call_tool("list_dashboards", {
"search": "sales",
"page_size": 10
})
# Get detailed dashboard info
dashboard = client.call_tool("get_dashboard_info", {
"identifier": dashboards["dashboards"][0]["id"]
})
# Create a new chart
chart = client.call_tool("generate_chart", {
"dataset_id": "1",
"config": {
"chart_type": "line",
"x": {"name": "date"},
"y": [{"name": "revenue", "aggregate": "SUM"}]
}
})
# Export chart data
data = client.call_tool("get_chart_data", {
"identifier": chart["chart_id"],
"format": "json",
"limit": 1000
})
```
## Next Steps
### Ready to Use MCP?
- **[📚 API Reference](./api-reference)** - Try all 16 tools with request/response examples
- **[🔐 Authentication](./authentication)** - Set up JWT security for production use
### Want to Extend MCP?
- **[🔧 Development Guide](./development)** - Learn internal architecture and add new tools
- **[🏗️ Architecture](./architecture)** - Understand system design and deployment patterns
### Enterprise Deployment?
- **[🏢 Preset Integration](./preset-integration)** - RBAC extensions and OIDC integration for enterprise
> 💡 **Getting started?** Return to the [MCP Service intro](./intro) for a complete overview.

View File

@@ -0,0 +1,483 @@
---
title: Preset.io Integration
sidebar_position: 6
version: 1
---
# Preset.io Integration Guide
This document outlines integration points for the Preset.io team to extend the Superset MCP service with enterprise features, RBAC customizations, and OIDC integration.
## RBAC Extension Points
### Custom Authorization Factory
The MCP service supports custom authorization logic through the factory pattern:
```python
# In preset_config.py or superset_config.py
def create_preset_mcp_auth(app):
"""Custom auth factory for Preset.io environments."""
from superset.mcp_service.auth import create_auth_provider
from preset.auth.mcp import PresetMCPAuthProvider
return PresetMCPAuthProvider(
jwks_uri=app.config["PRESET_JWKS_URI"],
issuer=app.config["PRESET_JWT_ISSUER"],
audience=app.config["PRESET_JWT_AUDIENCE"],
tenant_resolver=preset_tenant_resolver,
rbac_manager=app.security_manager,
)
MCP_AUTH_FACTORY = create_preset_mcp_auth
```
### Multi-Tenant RBAC
Extend the base auth hook for tenant-aware permissions:
```python
# preset/mcp/auth.py
from superset.mcp_service.auth import mcp_auth_hook
from functools import wraps
def preset_tenant_auth_hook(required_permissions=None):
"""Preset-specific auth hook with tenant isolation."""
def decorator(func):
@wraps(func)
@mcp_auth_hook(required_permissions)
def wrapper(*args, **kwargs):
# Extract tenant from JWT claims
tenant_id = g.user.tenant_id if hasattr(g.user, 'tenant_id') else None
# Inject tenant context
g.mcp_tenant_id = tenant_id
g.mcp_tenant_context = get_tenant_context(tenant_id)
return func(*args, **kwargs)
return wrapper
return decorator
```
### Custom Permission Scopes
Define Preset-specific permission scopes:
```python
# preset/mcp/permissions.py
PRESET_MCP_SCOPES = {
# Tenant-level permissions
"tenant:admin": "Full tenant administration",
"tenant:read": "Read tenant resources",
# Workspace-level permissions
"workspace:admin": "Full workspace administration",
"workspace:read": "Read workspace resources",
# Enhanced dashboard permissions
"dashboard:publish": "Publish dashboards to marketplace",
"dashboard:embed": "Generate embed tokens",
# Enhanced chart permissions
"chart:export": "Export chart data and configs",
"chart:alerts": "Manage chart alerts and notifications",
# Dataset permissions with row-level security
"dataset:rls": "Apply row-level security filters",
"dataset:pii": "Access PII-flagged columns",
}
def get_preset_required_scopes(tool_name: str, context: dict = None) -> List[str]:
"""Map tool calls to Preset-specific permission requirements."""
base_scopes = get_base_required_scopes(tool_name)
# Add tenant-aware scopes
if context and context.get('tenant_id'):
base_scopes.append(f"tenant:{context['tenant_id']}")
# Add workspace-aware scopes
if context and context.get('workspace_id'):
base_scopes.append(f"workspace:{context['workspace_id']}")
return base_scopes
```
### Row-Level Security Integration
Extend data access tools with RLS:
```python
# preset/mcp/rls.py
def apply_preset_rls_filters(query_context: dict, user_context: dict) -> dict:
"""Apply Preset row-level security filters to query context."""
# Get user's RLS rules from Preset metadata
rls_rules = get_user_rls_rules(
user_id=user_context['user_id'],
tenant_id=user_context['tenant_id'],
workspace_id=user_context.get('workspace_id')
)
# Apply RLS filters to query
for rule in rls_rules:
if rule.applies_to_dataset(query_context['datasource']['id']):
query_context = rule.apply_filter(query_context)
return query_context
# Usage in custom tools
@mcp.tool
@preset_tenant_auth_hook(['dataset:read', 'dataset:rls'])
def preset_get_chart_data(request: GetChartDataRequest) -> ChartDataResponse:
"""Get chart data with Preset RLS applied."""
# Apply RLS before executing query
query_context = build_query_context(request)
query_context = apply_preset_rls_filters(
query_context,
{'user_id': g.user.id, 'tenant_id': g.mcp_tenant_id}
)
return execute_chart_data_query(query_context)
```
## OIDC Integration Points
### Preset OIDC Provider
Custom OIDC integration for Preset environments:
```python
# preset/mcp/oidc.py
from superset.mcp_service.auth.providers.bearer import BearerAuthProvider
import requests
from typing import Dict, Any
class PresetOIDCAuthProvider(BearerAuthProvider):
"""OIDC-specific auth provider for Preset.io."""
def __init__(self,
oidc_discovery_url: str,
client_id: str,
client_secret: str = None,
**kwargs):
# Discover OIDC endpoints
self.discovery_doc = self._fetch_discovery_document(oidc_discovery_url)
super().__init__(
jwks_uri=self.discovery_doc['jwks_uri'],
issuer=self.discovery_doc['issuer'],
**kwargs
)
self.client_id = client_id
self.client_secret = client_secret
def _fetch_discovery_document(self, discovery_url: str) -> Dict[str, Any]:
"""Fetch OIDC discovery document."""
response = requests.get(discovery_url)
response.raise_for_status()
return response.json()
def validate_token(self, token: str) -> Dict[str, Any]:
"""Validate JWT token with OIDC-specific claims."""
claims = super().validate_token(token)
# Validate OIDC-specific claims
if claims.get('aud') != self.client_id:
raise ValueError("Invalid audience claim")
# Extract Preset-specific claims
claims['preset_tenant_id'] = claims.get('tenant_id')
claims['preset_workspace_id'] = claims.get('workspace_id')
claims['preset_roles'] = claims.get('roles', [])
return claims
def resolve_user(self, claims: Dict[str, Any]) -> Any:
"""Resolve Superset user from OIDC claims."""
from preset.auth.user_resolver import resolve_preset_user
return resolve_preset_user(
subject=claims['sub'],
email=claims.get('email'),
tenant_id=claims.get('preset_tenant_id'),
roles=claims.get('preset_roles', [])
)
```
### Configuration for OIDC
```python
# In preset_config.py
def create_preset_oidc_auth(app):
"""Factory for Preset OIDC authentication."""
from preset.mcp.oidc import PresetOIDCAuthProvider
return PresetOIDCAuthProvider(
oidc_discovery_url=app.config["PRESET_OIDC_DISCOVERY_URL"],
client_id=app.config["PRESET_OIDC_CLIENT_ID"],
client_secret=app.config["PRESET_OIDC_CLIENT_SECRET"],
audience=app.config["PRESET_MCP_AUDIENCE"],
required_scopes=app.config.get("PRESET_MCP_REQUIRED_SCOPES", [])
)
# MCP Configuration
MCP_AUTH_ENABLED = True
MCP_AUTH_FACTORY = create_preset_oidc_auth
# OIDC Configuration
PRESET_OIDC_DISCOVERY_URL = "https://auth.preset.io/.well-known/openid_configuration"
PRESET_OIDC_CLIENT_ID = "preset-mcp-service"
PRESET_OIDC_CLIENT_SECRET = os.environ.get("PRESET_OIDC_CLIENT_SECRET")
PRESET_MCP_AUDIENCE = "preset-superset-mcp"
PRESET_MCP_REQUIRED_SCOPES = [
"openid", "profile", "email",
"superset:read", "superset:write"
]
```
## Preset-Specific Tools
### Tenant Management Tools
```python
# preset/mcp/tools/tenant.py
@mcp.tool
@preset_tenant_auth_hook(['tenant:read'])
def get_tenant_info(request: GetTenantInfoRequest) -> TenantInfoResponse:
"""Get Preset tenant information and quotas."""
tenant_id = g.mcp_tenant_id
tenant = get_tenant_by_id(tenant_id)
return TenantInfoResponse(
tenant_id=tenant.id,
name=tenant.name,
plan=tenant.plan,
quotas=tenant.quotas,
usage=get_tenant_usage(tenant_id),
workspaces=list_tenant_workspaces(tenant_id)
)
@mcp.tool
@preset_tenant_auth_hook(['workspace:read'])
def list_workspace_assets(request: ListWorkspaceAssetsRequest) -> ListWorkspaceAssetsResponse:
"""List all assets in a Preset workspace."""
workspace_id = request.workspace_id
tenant_id = g.mcp_tenant_id
# Validate workspace belongs to tenant
validate_workspace_access(workspace_id, tenant_id)
assets = {
'dashboards': list_workspace_dashboards(workspace_id),
'charts': list_workspace_charts(workspace_id),
'datasets': list_workspace_datasets(workspace_id)
}
return ListWorkspaceAssetsResponse(
workspace_id=workspace_id,
assets=assets,
total_count=sum(len(v) for v in assets.values())
)
```
### Embed Token Generation
```python
# preset/mcp/tools/embed.py
@mcp.tool
@preset_tenant_auth_hook(['dashboard:embed'])
def generate_embed_token(request: GenerateEmbedTokenRequest) -> EmbedTokenResponse:
"""Generate secure embed token for dashboard/chart."""
# Validate resource access
resource = validate_embed_resource_access(
resource_type=request.resource_type,
resource_id=request.resource_id,
tenant_id=g.mcp_tenant_id
)
# Generate signed embed token
embed_token = create_embed_token(
resource=resource,
user_id=g.user.id,
tenant_id=g.mcp_tenant_id,
permissions=request.permissions,
expiry=request.expiry_hours
)
return EmbedTokenResponse(
embed_token=embed_token,
embed_url=f"{get_preset_base_url()}/embed/{embed_token}",
expires_at=embed_token.expires_at
)
```
## Audit and Compliance Extensions
### Enhanced Audit Logging
```python
# preset/mcp/audit.py
from superset.mcp_service.auth import get_audit_context
def create_preset_audit_context(user_context: dict, tool_name: str,
request_data: dict) -> dict:
"""Create Preset-specific audit context."""
base_context = get_audit_context(user_context, tool_name, request_data)
# Add Preset-specific fields
preset_context = {
**base_context,
'tenant_id': user_context.get('tenant_id'),
'workspace_id': user_context.get('workspace_id'),
'preset_user_role': user_context.get('preset_role'),
'data_classification': classify_request_data(request_data),
'compliance_flags': get_compliance_flags(tool_name, request_data)
}
return preset_context
def log_preset_mcp_access(audit_context: dict):
"""Log MCP access to Preset audit systems."""
# Log to Superset's audit system
log_superset_audit_event(audit_context)
# Log to Preset's compliance system
log_preset_compliance_event(audit_context)
# Log to external SIEM if configured
if app.config.get('PRESET_SIEM_ENABLED'):
log_to_siem(audit_context)
```
### Data Classification
```python
# preset/mcp/classification.py
def classify_request_data(request_data: dict) -> dict:
"""Classify data sensitivity in MCP requests."""
classification = {
'contains_pii': False,
'data_level': 'public',
'retention_policy': 'standard'
}
# Check for PII in request
if contains_pii_fields(request_data):
classification['contains_pii'] = True
classification['data_level'] = 'restricted'
classification['retention_policy'] = 'pii_compliant'
# Check for sensitive datasets
if references_sensitive_datasets(request_data):
classification['data_level'] = 'confidential'
return classification
```
## Deployment Considerations
### Multi-Region Deployment
```python
# preset/mcp/deployment.py
def get_region_specific_config():
"""Get region-specific MCP configuration."""
region = os.environ.get('PRESET_REGION', 'us-east-1')
config_map = {
'us-east-1': {
'jwks_uri': 'https://auth-us.preset.io/.well-known/jwks.json',
'base_url': 'https://app.preset.io',
'data_residency': 'US'
},
'eu-west-1': {
'jwks_uri': 'https://auth-eu.preset.io/.well-known/jwks.json',
'base_url': 'https://eu.preset.io',
'data_residency': 'EU'
}
}
return config_map.get(region, config_map['us-east-1'])
# Usage in config
region_config = get_region_specific_config()
PRESET_JWKS_URI = region_config['jwks_uri']
SUPERSET_WEBSERVER_ADDRESS = region_config['base_url']
```
### Health Check Extensions
```python
# preset/mcp/health.py
@mcp.tool
def preset_health_check() -> HealthCheckResponse:
"""Preset-specific health check for MCP service."""
checks = {
'mcp_service': check_mcp_service_health(),
'database': check_database_health(),
'auth_provider': check_auth_provider_health(),
'tenant_isolation': check_tenant_isolation(),
'rls_engine': check_rls_engine_health()
}
overall_status = 'healthy' if all(
check['status'] == 'healthy' for check in checks.values()
) else 'degraded'
return HealthCheckResponse(
status=overall_status,
checks=checks,
region=os.environ.get('PRESET_REGION'),
version=get_preset_mcp_version()
)
```
## Configuration Templates
### Production Configuration
```python
# preset_production_config.py
from preset.mcp.auth import create_preset_oidc_auth
from preset.mcp.audit import create_preset_audit_context
# MCP Service Configuration
MCP_AUTH_ENABLED = True
MCP_AUTH_FACTORY = create_preset_oidc_auth
MCP_AUDIT_CONTEXT_FACTORY = create_preset_audit_context
# Preset OIDC Configuration
PRESET_OIDC_DISCOVERY_URL = "https://auth.preset.io/.well-known/openid_configuration"
PRESET_OIDC_CLIENT_ID = "preset-mcp-production"
PRESET_MCP_AUDIENCE = "preset-superset-mcp"
# Security Configuration
PRESET_MCP_REQUIRED_SCOPES = [
"openid", "profile", "email",
"tenant:read", "workspace:read",
"dashboard:read", "chart:read", "dataset:read"
]
# Audit Configuration
PRESET_AUDIT_ENABLED = True
PRESET_SIEM_ENABLED = True
PRESET_COMPLIANCE_MODE = "SOC2"
# Performance Configuration
PRESET_MCP_CACHE_ENABLED = True
PRESET_MCP_RATE_LIMIT = "1000/hour"
PRESET_MCP_TIMEOUT = 30
```
This integration guide provides the Preset.io team with concrete extension points for implementing enterprise features while maintaining compatibility with the base MCP service architecture.

View File

@@ -2,20 +2,6 @@
title: CVEs fixed by release
sidebar_position: 2
---
#### Version 5.0.0
| CVE | Title | Affected |
|:---------------|:-----------------------------------------------------------------------------------|---------:|
| CVE-2025-55673 | Exposure of Sensitive Information to an Unauthorized Actor | < 5.0.0 |
| CVE-2025-55674 | Improper Neutralization of Special Elements used in an SQL Command | < 5.0.0 |
| CVE-2025-55675 | Improper Access Control leading to Information Disclosure | < 5.0.0 |
#### Version 4.1.3
| CVE | Title | Affected |
|:---------------|:-----------------------------------------------------------------------------------|---------:|
| CVE-2025-55672 | Improper Neutralization of Input During Web Page Generation | < 4.1.3 |
#### Version 4.1.2
| CVE | Title | Affected |

View File

@@ -20,125 +20,16 @@
import type { Config } from '@docusaurus/types';
import type { Options, ThemeConfig } from '@docusaurus/preset-classic';
import { themes } from 'prism-react-renderer';
import remarkImportPartial from 'remark-import-partial';
import * as fs from 'fs';
import * as path from 'path';
const { github: lightCodeTheme, vsDark: darkCodeTheme } = themes;
// Load version configuration from external file
const versionsConfigPath = path.join(__dirname, 'versions-config.json');
const versionsConfig = JSON.parse(fs.readFileSync(versionsConfigPath, 'utf8'));
// Build plugins array dynamically based on disabled flags
const dynamicPlugins = [];
// Add components plugin if not disabled
if (!versionsConfig.components.disabled) {
dynamicPlugins.push([
'@docusaurus/plugin-content-docs',
{
id: 'components',
path: 'components',
routeBasePath: 'components',
sidebarPath: require.resolve('./sidebarComponents.js'),
editUrl:
'https://github.com/apache/superset/edit/master/docs/components',
remarkPlugins: [remarkImportPartial],
docItemComponent: '@theme/DocItem',
includeCurrentVersion: versionsConfig.components.includeCurrentVersion,
lastVersion: versionsConfig.components.lastVersion,
onlyIncludeVersions: versionsConfig.components.onlyIncludeVersions,
versions: versionsConfig.components.versions,
disableVersioning: false,
showLastUpdateAuthor: true,
showLastUpdateTime: true,
},
]);
}
// Add developer_portal plugin if not disabled
if (!versionsConfig.developer_portal.disabled) {
dynamicPlugins.push([
'@docusaurus/plugin-content-docs',
{
id: 'developer_portal',
path: 'developer_portal',
routeBasePath: 'developer_portal',
sidebarPath: require.resolve('./sidebarTutorials.js'),
editUrl:
'https://github.com/apache/superset/edit/master/docs/developer_portal',
remarkPlugins: [remarkImportPartial],
docItemComponent: '@theme/DocItem',
includeCurrentVersion: versionsConfig.developer_portal.includeCurrentVersion,
lastVersion: versionsConfig.developer_portal.lastVersion,
onlyIncludeVersions: versionsConfig.developer_portal.onlyIncludeVersions,
versions: versionsConfig.developer_portal.versions,
disableVersioning: false,
showLastUpdateAuthor: true,
showLastUpdateTime: true,
},
]);
}
// Build navbar items dynamically based on disabled flags
const dynamicNavbarItems = [];
// Add Component Playground navbar item if not disabled
if (!versionsConfig.components.disabled) {
dynamicNavbarItems.push({
label: 'Component Playground',
to: '/components',
items: [
{
label: 'Introduction',
to: '/components',
},
{
label: 'UI Components',
to: '/components/ui-components/button',
},
{
label: 'Chart Components',
to: '/components/chart-components/bar-chart',
},
{
label: 'Layout Components',
to: '/components/layout-components/grid',
},
],
});
}
// Add Developer Portal navbar item if not disabled
if (!versionsConfig.developer_portal.disabled) {
dynamicNavbarItems.push({
label: 'Developer Portal',
position: 'left',
items: [
{
type: 'doc',
docsPluginId: 'developer_portal',
docId: 'index',
label: 'Introduction',
},
{
type: 'doc',
docsPluginId: 'developer_portal',
docId: 'getting-started/index',
label: 'Getting Started',
},
],
});
}
const config: Config = {
title: 'Superset',
tagline:
'Apache Superset is a modern data exploration and visualization platform',
url: 'https://superset.apache.org',
baseUrl: '/',
onBrokenLinks: 'warn',
onBrokenLinks: 'throw',
onBrokenMarkdownLinks: 'throw',
markdown: {
mermaid: true,
@@ -148,7 +39,6 @@ const config: Config = {
projectName: 'superset',
themes: ['@saucelabs/theme-github-codeblock', '@docusaurus/theme-mermaid'],
plugins: [
require.resolve('./src/webpack.extend.ts'),
[
'docusaurus-plugin-less',
{
@@ -157,10 +47,11 @@ const config: Config = {
},
},
],
...dynamicPlugins,
[
'@docusaurus/plugin-client-redirects',
{
fromExtensions: ['html', 'htm'],
toExtensions: ['exe', 'zip'],
redirects: [
{
to: '/docs/installation/docker-compose',
@@ -319,13 +210,6 @@ const config: Config = {
}
return `https://github.com/apache/superset/edit/master/docs/${versionDocsDirPath}/${docPath}`;
},
includeCurrentVersion: versionsConfig.docs.includeCurrentVersion,
lastVersion: versionsConfig.docs.lastVersion, // Make 'next' the default
onlyIncludeVersions: versionsConfig.docs.onlyIncludeVersions,
versions: versionsConfig.docs.versions,
disableVersioning: false,
showLastUpdateAuthor: true,
showLastUpdateTime: true,
},
blog: {
showReadingTime: true,
@@ -351,13 +235,6 @@ const config: Config = {
apiKey: 'd0d22810f2e9b614ffac3a73b26891fe',
indexName: 'superset-apache',
},
mermaid: {
theme: { light: 'neutral', dark: 'dark' },
options: {
// Any Mermaid config options go here...
maxTextSize: 100000,
},
},
navbar: {
logo: {
alt: 'Superset Logo',
@@ -367,22 +244,20 @@ const config: Config = {
items: [
{
label: 'Documentation',
position: 'left',
to: '/docs/intro',
items: [
{
type: 'doc',
docId: 'intro',
label: 'Getting Started',
to: '/docs/intro',
},
{
type: 'doc',
docId: 'faq',
label: 'FAQ',
to: '/docs/faq',
},
],
},
{
label: 'Community Resources',
label: 'Community',
to: '/community',
items: [
{
@@ -407,7 +282,6 @@ const config: Config = {
},
],
},
...dynamicNavbarItems,
{
href: '/docs/intro',
position: 'right',
@@ -462,6 +336,7 @@ const config: Config = {
// src: 'https://www.bugherd.com/sidebarv2.js?apikey=enilpiu7bgexxsnoqfjtxa',
// async: true,
// },
'/script/matomo.js',
{
src: 'https://widget.kapa.ai/kapa-widget.bundle.js',
async: true,

View File

@@ -28,9 +28,6 @@ const globals = require('globals');
const { defineConfig, globalIgnores } = require('eslint/config');
module.exports = defineConfig([
{
files: ['**/*.{js,jsx,ts,tsx}'],
},
globalIgnores(['build/**/*', '.docusaurus/**/*', 'node_modules/**/*']),
js.configs.recommended,
...ts.configs.recommended,
@@ -39,7 +36,7 @@ module.exports = defineConfig([
files: ['eslint.config.js'],
rules: {
'@typescript-eslint/no-require-imports': 'off',
},
}
},
{
languageOptions: {
@@ -71,5 +68,5 @@ module.exports = defineConfig([
version: 'detect',
},
},
},
]);
}
])

View File

@@ -6,8 +6,7 @@
"scripts": {
"docusaurus": "docusaurus",
"_init": "cat src/intro_header.txt ../README.md > docs/intro.md",
"start": "yarn run _init && NODE_ENV=development docusaurus start",
"stop": "pkill -f 'docusaurus start' || echo 'No docusaurus server running'",
"start": "yarn run _init && docusaurus start",
"build": "yarn run _init && DEBUG=docusaurus:* docusaurus build",
"swizzle": "docusaurus swizzle",
"deploy": "docusaurus deploy",
@@ -16,73 +15,44 @@
"write-translations": "docusaurus write-translations",
"write-heading-ids": "docusaurus write-heading-ids",
"typecheck": "tsc",
"eslint": "eslint .",
"version:add": "node scripts/manage-versions.mjs add",
"version:remove": "node scripts/manage-versions.mjs remove",
"version:add:docs": "node scripts/manage-versions.mjs add docs",
"version:add:developer_portal": "node scripts/manage-versions.mjs add developer_portal",
"version:add:components": "node scripts/manage-versions.mjs add components",
"version:remove:docs": "node scripts/manage-versions.mjs remove docs",
"version:remove:developer_portal": "node scripts/manage-versions.mjs remove developer_portal",
"version:remove:components": "node scripts/manage-versions.mjs remove components"
"eslint": "eslint . --ext .js,.jsx,.ts,.tsx"
},
"dependencies": {
"@ant-design/icons": "^6.0.0",
"@docusaurus/core": "3.8.1",
"@docusaurus/plugin-client-redirects": "3.8.1",
"@docusaurus/preset-classic": "3.8.1",
"@docusaurus/theme-mermaid": "^3.8.1",
"@emotion/core": "^10.0.27",
"@emotion/react": "^11.13.3",
"@docusaurus/theme-mermaid": "3.8.1",
"@emotion/styled": "^10.0.27",
"@mdx-js/react": "^3.1.0",
"@saucelabs/theme-github-codeblock": "^0.3.0",
"@storybook/addon-docs": "^8.6.11",
"@storybook/blocks": "^8.6.11",
"@storybook/channels": "^8.6.11",
"@storybook/client-logger": "^8.6.11",
"@storybook/components": "^8.6.11",
"@storybook/core": "^8.6.11",
"@storybook/core-events": "^8.6.11",
"@storybook/csf": "^0.1.13",
"@storybook/docs-tools": "^8.6.11",
"@storybook/preview-api": "^8.6.11",
"@storybook/theming": "^8.6.11",
"@superset-ui/core": "^0.20.4",
"antd": "^5.26.7",
"caniuse-lite": "^1.0.30001707",
"@superset-ui/style": "^0.14.23",
"antd": "^5.26.3",
"docusaurus-plugin-less": "^2.0.2",
"json-bigint": "^1.0.0",
"less": "^4.4.0",
"less": "^4.3.0",
"less-loader": "^12.3.0",
"prism-react-renderer": "^2.4.1",
"react": "^18.3.1",
"react-dom": "^18.3.1",
"react-github-btn": "^1.4.0",
"react-svg-pan-zoom": "^3.13.1",
"remark-import-partial": "^0.0.2",
"reselect": "^5.1.1",
"storybook": "^8.6.11",
"swagger-ui-react": "^5.27.1",
"tinycolor2": "^1.4.2",
"ts-loader": "^9.5.2"
"swagger-ui-react": "^5.26.0"
},
"devDependencies": {
"@docusaurus/module-type-aliases": "^3.8.1",
"@docusaurus/tsconfig": "^3.8.1",
"@eslint/js": "^9.32.0",
"@eslint/js": "^9.31.0",
"@types/react": "^19.1.8",
"@typescript-eslint/eslint-plugin": "^8.37.0",
"@typescript-eslint/parser": "^8.37.0",
"eslint": "^9.32.0",
"eslint-config-prettier": "^10.1.8",
"eslint-plugin-prettier": "^5.5.3",
"eslint": "^9.31.0",
"eslint-config-prettier": "^10.1.5",
"eslint-plugin-prettier": "^5.5.1",
"eslint-plugin-react": "^7.37.5",
"globals": "^16.3.0",
"prettier": "^3.6.2",
"typescript": "~5.8.3",
"typescript-eslint": "^8.39.0",
"webpack": "^5.101.0"
"typescript-eslint": "^8.37.0",
"webpack": "^5.99.9"
},
"browserslist": {
"production": [

View File

@@ -1,242 +0,0 @@
#!/usr/bin/env node
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
import fs from 'fs';
import path from 'path';
import { execSync } from 'child_process';
import { fileURLToPath } from 'url';
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
const CONFIG_FILE = path.join(__dirname, '..', 'versions-config.json');
// Parse command line arguments
const args = process.argv.slice(2);
const command = args[0]; // 'add' or 'remove'
const section = args[1]; // 'docs', 'developer_portal', or 'components'
const version = args[2]; // version string like '1.2.0'
function loadConfig() {
return JSON.parse(fs.readFileSync(CONFIG_FILE, 'utf8'));
}
function saveConfig(config) {
fs.writeFileSync(CONFIG_FILE, JSON.stringify(config, null, 2) + '\n');
}
function fixVersionedImports(version) {
const versionedDocsPath = path.join(__dirname, '..', 'versioned_docs', `version-${version}`);
// Files that need import path fixes
const filesToFix = [
'contributing/resources.mdx',
'configuration/country-map-tools.mdx'
];
console.log(` Fixing relative imports in versioned docs...`);
filesToFix.forEach(filePath => {
const fullPath = path.join(versionedDocsPath, filePath);
if (fs.existsSync(fullPath)) {
let content = fs.readFileSync(fullPath, 'utf8');
// Fix imports that go up two directories to go up three instead
content = content.replace(
/from ['"]\.\.\/\.\.\/src\//g,
"from '../../../src/"
);
content = content.replace(
/from ['"]\.\.\/\.\.\/data\//g,
"from '../../../data/"
);
fs.writeFileSync(fullPath, content);
console.log(` Fixed imports in ${filePath}`);
}
});
}
function addVersion(section, version) {
const config = loadConfig();
if (!config[section]) {
console.error(`Section '${section}' not found in config`);
process.exit(1);
}
// Check if version already exists
if (config[section].onlyIncludeVersions.includes(version)) {
console.error(`Version ${version} already exists in ${section}`);
process.exit(1);
}
console.log(`Creating version ${version} for ${section}...`);
// Run Docusaurus version command
const docusaurusCommand = section === 'docs'
? `yarn docusaurus docs:version ${version}`
: `yarn docusaurus docs:version:${section} ${version}`;
try {
execSync(docusaurusCommand, { stdio: 'inherit' });
} catch (error) {
console.error(`Failed to create version: ${error.message}`);
process.exit(1);
}
// Fix relative imports in versioned docs (for main docs section only)
if (section === 'docs') {
fixVersionedImports(version);
}
// Update config
// Add to onlyIncludeVersions array (after 'current')
const versionIndex = config[section].onlyIncludeVersions.indexOf('current') + 1;
config[section].onlyIncludeVersions.splice(versionIndex, 0, version);
// Add version metadata
const versionPath = section === 'docs' ? version : version;
config[section].versions[version] = {
label: version,
path: versionPath,
banner: 'none'
};
// Optionally update lastVersion if this is the first non-current version
if (config[section].onlyIncludeVersions.length === 2) {
config[section].lastVersion = version;
}
saveConfig(config);
console.log(`✅ Version ${version} added successfully to ${section}`);
console.log(`📝 Updated versions-config.json`);
}
function removeVersion(section, version) {
const config = loadConfig();
if (!config[section]) {
console.error(`Section '${section}' not found in config`);
process.exit(1);
}
if (version === 'current') {
console.error(`Cannot remove 'current' version`);
process.exit(1);
}
if (!config[section].onlyIncludeVersions.includes(version)) {
console.error(`Version ${version} not found in ${section}`);
process.exit(1);
}
console.log(`Removing version ${version} from ${section}...`);
// Determine file paths based on section
const versionedDocsDir = section === 'docs'
? `versioned_docs/version-${version}`
: `${section}_versioned_docs/version-${version}`;
const versionedSidebarsFile = section === 'docs'
? `versioned_sidebars/version-${version}-sidebars.json`
: `${section}_versioned_sidebars/version-${version}-sidebars.json`;
// Remove versioned files
const docsPath = path.join(__dirname, '..', versionedDocsDir);
const sidebarsPath = path.join(__dirname, '..', versionedSidebarsFile);
if (fs.existsSync(docsPath)) {
fs.rmSync(docsPath, { recursive: true });
console.log(` Removed ${versionedDocsDir}`);
}
if (fs.existsSync(sidebarsPath)) {
fs.unlinkSync(sidebarsPath);
console.log(` Removed ${versionedSidebarsFile}`);
}
// Update versions.json file
const versionsJsonFile = section === 'docs'
? 'versions.json'
: `${section}_versions.json`;
const versionsJsonPath = path.join(__dirname, '..', versionsJsonFile);
if (fs.existsSync(versionsJsonPath)) {
const versions = JSON.parse(fs.readFileSync(versionsJsonPath, 'utf8'));
const versionIndex = versions.indexOf(version);
if (versionIndex > -1) {
versions.splice(versionIndex, 1);
fs.writeFileSync(versionsJsonPath, JSON.stringify(versions, null, 2) + '\n');
console.log(` Updated ${versionsJsonFile}`);
}
}
// Update config
const versionIndex = config[section].onlyIncludeVersions.indexOf(version);
config[section].onlyIncludeVersions.splice(versionIndex, 1);
delete config[section].versions[version];
// Update lastVersion if needed
if (config[section].lastVersion === version) {
// Set to the next available version or 'current'
const remainingVersions = config[section].onlyIncludeVersions.filter(v => v !== 'current');
config[section].lastVersion = remainingVersions.length > 0 ? remainingVersions[0] : 'current';
console.log(` Updated lastVersion to ${config[section].lastVersion}`);
}
saveConfig(config);
console.log(`✅ Version ${version} removed successfully from ${section}`);
console.log(`📝 Updated versions-config.json`);
}
function printUsage() {
console.log(`
Usage:
node scripts/manage-versions.js add <section> <version>
node scripts/manage-versions.js remove <section> <version>
Where:
- section: 'docs', 'developer_portal', or 'components'
- version: version string (e.g., '1.2.0', '2.0.0')
Examples:
node scripts/manage-versions.js add docs 2.0.0
node scripts/manage-versions.js add developer_portal 1.3.0
node scripts/manage-versions.js remove components 1.0.0
`);
}
// Main execution
if (!command || !section || !version) {
printUsage();
process.exit(1);
}
if (command === 'add') {
addVersion(section, version);
} else if (command === 'remove') {
removeVersion(section, version);
} else {
console.error(`Unknown command: ${command}`);
printUsage();
process.exit(1);
}

View File

@@ -1,68 +0,0 @@
/* eslint-env node */
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
// @ts-check
/** @type {import('@docusaurus/plugin-content-docs').SidebarsConfig} */
const sidebars = {
// By default, Docusaurus generates a sidebar from the docs folder structure
//tutorialSidebar: [{type: 'autogenerated', dirName: '.'}],
// But we're not doing that.
ComponentSidebar: [
{
type: 'doc',
label: 'Introduction',
id: 'index',
},
{
type: 'category',
label: 'UI Components',
items: [
{
type: 'autogenerated',
dirName: 'ui-components',
},
],
},
{
type: 'category',
label: 'Chart Components',
items: [
{
type: 'autogenerated',
dirName: 'chart-components',
},
],
},
{
type: 'category',
label: 'Layout Components',
items: [
{
type: 'autogenerated',
dirName: 'layout-components',
},
],
},
],
};
module.exports = sidebars;

View File

@@ -1,48 +0,0 @@
/* eslint-env node */
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
// @ts-check
/** @type {import('@docusaurus/plugin-content-docs').SidebarsConfig} */
const sidebars = {
// By default, Docusaurus generates a sidebar from the docs folder structure
//tutorialSidebar: [{type: 'autogenerated', dirName: '.'}],
// But we're not doing that.
TutorialsSidebar: [
{
type: 'doc',
label: 'Introduction',
id: 'index',
},
{
type: 'category',
label: 'Getting Started',
items: [
{
type: 'autogenerated',
dirName: 'getting-started',
},
],
},
],
};
module.exports = sidebars;

View File

@@ -87,6 +87,16 @@ const sidebars = {
},
],
},
{
type: 'category',
label: 'MCP Service',
items: [
{
type: 'autogenerated',
dirName: 'mcp-service',
},
],
},
{
type: 'doc',
label: 'FAQ',

View File

@@ -1,23 +0,0 @@
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
// Re-export the Button component as a default export
import { Button } from '../../../superset-frontend/packages/superset-ui-core/src/components/Button';
export default Button;

View File

@@ -1,121 +0,0 @@
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
import React from 'react';
import { supersetTheme, ThemeProvider } from '@superset-ui/core';
// A simple component to display a story example
export function StoryExample({ component: Component, props = {} }) {
return (
<ThemeProvider theme={supersetTheme}>
<div
className="storybook-example"
style={{
border: '1px solid #e8e8e8',
borderRadius: '4px',
padding: '20px',
marginBottom: '20px',
}}
>
{Component && <Component {...props} />}
</div>
</ThemeProvider>
);
}
// A simple component to display a story with controls
export function StoryWithControls({
component: Component,
props = {},
controls = [],
}) {
const [stateProps, setStateProps] = React.useState(props);
const updateProp = (key, value) => {
setStateProps(prev => ({
...prev,
[key]: value,
}));
};
return (
<ThemeProvider theme={supersetTheme}>
<div className="storybook-with-controls">
<div
className="storybook-example"
style={{
border: '1px solid #e8e8e8',
borderRadius: '4px',
padding: '20px',
marginBottom: '20px',
}}
>
{Component && <Component {...stateProps} />}
</div>
{controls.length > 0 && (
<div
className="storybook-controls"
style={{
border: '1px solid #e8e8e8',
borderRadius: '4px',
padding: '20px',
marginBottom: '20px',
}}
>
<h4>Controls</h4>
{controls.map(control => (
<div key={control.name} style={{ marginBottom: '10px' }}>
<label style={{ display: 'block', marginBottom: '5px' }}>
{control.label || control.name}:
</label>
{control.type === 'select' ? (
<select
value={stateProps[control.name]}
onChange={e => updateProp(control.name, e.target.value)}
style={{ width: '100%', padding: '5px' }}
>
{control.options.map(option => (
<option key={option} value={option}>
{option}
</option>
))}
</select>
) : control.type === 'boolean' ? (
<input
type="checkbox"
checked={stateProps[control.name]}
onChange={e => updateProp(control.name, e.target.checked)}
/>
) : (
<input
type="text"
value={stateProps[control.name]}
onChange={e => updateProp(control.name, e.target.value)}
style={{ width: '100%', padding: '5px' }}
/>
)}
</div>
))}
</div>
)}
</div>
</ThemeProvider>
);
}

View File

@@ -460,23 +460,15 @@ export default function Home(): JSX.Element {
const changeToDark = () => {
const navbar = document.body.querySelector('.navbar');
const logo = document.body.querySelector('.navbar__logo img');
if (navbar) {
navbar.classList.add('navbar--dark');
}
if (logo) {
logo.setAttribute('src', '/img/superset-logo-horiz-dark.svg');
}
navbar.classList.add('navbar--dark');
logo.setAttribute('src', '/img/superset-logo-horiz-dark.svg');
};
const changeToLight = () => {
const navbar = document.body.querySelector('.navbar');
const logo = document.body.querySelector('.navbar__logo img');
if (navbar) {
navbar.classList.remove('navbar--dark');
}
if (logo) {
logo.setAttribute('src', '/img/superset-logo-horiz.svg');
}
navbar.classList.remove('navbar--dark');
logo.setAttribute('src', '/img/superset-logo-horiz.svg');
};
// Set up dark <-> light navbar change
@@ -484,9 +476,7 @@ export default function Home(): JSX.Element {
changeToDark();
const navbarToggle = document.body.querySelector('.navbar__toggle');
if (navbarToggle) {
navbarToggle.addEventListener('click', () => changeToLight());
}
navbarToggle.addEventListener('click', () => changeToLight());
const scrollListener = () => {
if (window.scrollY > 0) {

View File

@@ -45,60 +45,6 @@ ul.dropdown__menu svg {
display: none;
}
/* Consistent dropdown styling for navbar */
.navbar__item.dropdown .dropdown__menu {
background-color: white;
border: 1px solid rgba(0, 0, 0, 0.1);
box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1);
}
.navbar__item.dropdown .dropdown__link {
color: #1c1e21 !important;
background-color: transparent !important;
border-radius: 0 !important;
padding: 0.5rem 1rem !important;
display: block !important;
}
.navbar__item.dropdown .dropdown__link:hover {
background-color: #f5f5f5 !important;
color: #1c1e21 !important;
text-decoration: none !important;
}
/* Remove the blue box styling for doc links in dropdowns */
.navbar__item.dropdown .dropdown__link--active {
background-color: transparent !important;
color: #1c1e21 !important;
}
.navbar__item.dropdown .dropdown__link--active:hover {
background-color: #f5f5f5 !important;
}
/* Dark mode support */
[data-theme='dark'] .navbar__item.dropdown .dropdown__menu {
background-color: #242526;
border: 1px solid rgba(255, 255, 255, 0.1);
}
[data-theme='dark'] .navbar__item.dropdown .dropdown__link {
color: #f5f6f7 !important;
}
[data-theme='dark'] .navbar__item.dropdown .dropdown__link:hover {
background-color: #3a3b3c !important;
color: #f5f6f7 !important;
}
[data-theme='dark'] .navbar__item.dropdown .dropdown__link--active {
color: #f5f6f7 !important;
}
[data-theme='dark'] .navbar__item.dropdown .dropdown__link--active:hover {
background-color: #3a3b3c !important;
}
:root {
--ifm-color-primary: #20a7c9;
--ifm-color-primary-dark: #1985a0;

View File

@@ -1,119 +0,0 @@
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
import React from 'react';
import {
useActivePlugin,
useDocsVersion,
useVersions,
} from '@docusaurus/plugin-content-docs/client';
import { useLocation } from '@docusaurus/router';
import { useDocsPreferredVersion } from '@docusaurus/theme-common';
import { Dropdown } from 'antd';
import { DownOutlined } from '@ant-design/icons';
import styles from './styles.module.css';
export default function DocVersionBadge() {
const activePlugin = useActivePlugin();
const { pathname } = useLocation();
const pluginId = activePlugin?.pluginId;
const [versionedPath, setVersionedPath] = React.useState('');
// Show version selector for all versioned sections
const isVersioned = [
'default', // main docs
'components',
'tutorials',
'developer_portal',
].includes(pluginId);
const { preferredVersion } = useDocsPreferredVersion(pluginId);
const versions = useVersions(pluginId);
const version = useDocsVersion();
// Extract the current page path relative to the version
React.useEffect(() => {
if (!pathname || !version || !pluginId) return;
let relativePath = '';
const basePath = pluginId === 'default' ? '/docs' : `/${pluginId}`;
// Handle different version path patterns
if (pathname.includes(basePath)) {
// Extract the part after the base path
const parts = pathname.split(basePath);
if (parts.length > 1) {
const afterBase = parts[1];
// For versioned paths, remove the version segment
if (afterBase.startsWith('/')) {
const segments = afterBase.substring(1).split('/');
// Check if first segment is a version (e.g., "1.1.0", "next")
if (segments[0] && (segments[0].match(/^\d+\.\d+\.\d+$/) || segments[0] === 'next')) {
// Skip the version segment
relativePath = segments.length > 1 ? '/' + segments.slice(1).join('/') : '';
} else {
// No version in path (e.g., /docs/intro for current version with empty path)
relativePath = afterBase;
}
}
}
}
setVersionedPath(relativePath);
}, [pathname, version, pluginId]);
// Create dropdown items for version selection
const items = versions.map(v => {
// Construct the URL for this version, preserving the current page
// v.path contains the full path including base, e.g., "/docs/1.1.0" or "/docs"
let versionUrl = v.path;
if (versionedPath) {
// Append the current page path to the version base
versionUrl = v.path + versionedPath;
}
return {
key: v.name,
label: (
<a href={versionUrl}>
{v.label}
{v.name === version.name && ' (current)'}
{v.name === preferredVersion?.name && ' (preferred)'}
</a>
),
};
});
if (!isVersioned) {
return null;
}
return (
<span className={styles.versionBadge}>
Version:{' '}
<Dropdown menu={{ items }} trigger={['click']}>
<a onClick={e => e.preventDefault()} className={styles.versionSelector}>
{version.label} <DownOutlined />
</a>
</Dropdown>
</span>
);
}

View File

@@ -1,42 +0,0 @@
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
.versionBadge {
display: inline-flex;
align-items: center;
font-size: 0.8rem;
font-weight: 500;
color: var(--ifm-color-emphasis-700);
background-color: var(--ifm-color-emphasis-200);
border-radius: 0.5rem;
padding: 0.2rem 0.5rem;
margin-right: 0.5rem;
}
.versionSelector {
cursor: pointer;
color: var(--ifm-color-primary);
font-weight: 500;
margin-left: 0.25rem;
}
.versionSelector:hover {
text-decoration: none;
color: var(--ifm-color-primary-darker);
}

View File

@@ -1,121 +0,0 @@
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
import React, { useState, useEffect } from 'react';
import DocVersionBanner from '@theme-original/DocVersionBanner';
import {
useActivePlugin,
useDocsVersion,
useVersions,
} from '@docusaurus/plugin-content-docs/client';
import { useLocation } from '@docusaurus/router';
import { useDocsPreferredVersion } from '@docusaurus/theme-common';
import { Dropdown } from 'antd';
import { DownOutlined } from '@ant-design/icons';
import styles from './styles.module.css';
export default function DocVersionBannerWrapper(props) {
const activePlugin = useActivePlugin();
const { pathname } = useLocation();
const pluginId = activePlugin?.pluginId;
const [versionedPath, setVersionedPath] = useState('');
// Only show version selector for tutorials
// Main docs, components, and developer_portal use the DocVersionBadge component instead
const isVersioned = pluginId && ['tutorials'].includes(pluginId);
const { preferredVersion } = useDocsPreferredVersion(pluginId);
const versions = useVersions(pluginId);
const version = useDocsVersion();
// Early return if required data is not available
if (!isVersioned || !versions || !version) {
return <DocVersionBanner {...props} />;
}
// Extract the current page path relative to the version
useEffect(() => {
if (!pathname || !version || !pluginId) return;
let relativePath = '';
// Handle different version path patterns
if (pathname.includes(`/${pluginId}/`)) {
// Extract the part after the version
// Example: /components/1.1.0/ui-components/button -> /ui-components/button
const parts = pathname.split(`/${pluginId}/`);
if (parts.length > 1) {
const afterPluginId = parts[1];
// Find where the version part ends
const versionParts = afterPluginId.split('/');
if (versionParts.length > 1) {
// Remove the version part and join the rest
relativePath = '/' + versionParts.slice(1).join('/');
}
}
}
setVersionedPath(relativePath);
}, [pathname, version, pluginId]);
// Create dropdown items for version selection
const items = versions.map(v => {
// Construct the URL for this version, preserving the current page
// v.path is the version-specific path like "1.0.0" or "next"
let versionUrl = v.path;
if (versionedPath) {
// Construct the full URL with the version and the current page path
versionUrl = v.path + versionedPath;
}
return {
key: v.name,
label: (
<a href={versionUrl}>
{v.label}
{v.name === version.name && ' (current)'}
{v.name === preferredVersion?.name && ' (preferred)'}
</a>
),
};
});
return (
<>
<DocVersionBanner {...props} />
{isVersioned && (
<div className={styles.versionBanner}>
<div className={styles.versionContainer}>
<span className={styles.versionLabel}>Version:</span>
<Dropdown menu={{ items }} trigger={['click']}>
<a
onClick={e => e.preventDefault()}
className={styles.versionSelector}
>
{version.label} <DownOutlined />
</a>
</Dropdown>
</div>
</div>
)}
</>
);
}

View File

@@ -1,49 +0,0 @@
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
.versionBanner {
background-color: var(--ifm-color-emphasis-100);
padding: 0.5rem 1rem;
margin-bottom: 1rem;
border-bottom: 1px solid var(--ifm-color-emphasis-200);
}
.versionContainer {
display: flex;
align-items: center;
max-width: var(--ifm-container-width);
margin: 0 auto;
padding: 0 var(--ifm-spacing-horizontal);
}
.versionLabel {
font-weight: bold;
margin-right: 0.5rem;
}
.versionSelector {
cursor: pointer;
color: var(--ifm-color-primary);
font-weight: 500;
}
.versionSelector:hover {
text-decoration: none;
color: var(--ifm-color-primary-darker);
}

View File

@@ -1,115 +0,0 @@
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
import path from 'path';
import type { Plugin } from '@docusaurus/types';
export default function webpackExtendPlugin(): Plugin<void> {
return {
name: 'custom-webpack-plugin',
configureWebpack(config, isServer, utils) {
const { isDev } = utils;
return {
devtool: isDev ? 'eval-source-map' : config.devtool,
...(isDev && {
optimization: {
...config.optimization,
minimize: false,
removeAvailableModules: false,
removeEmptyChunks: false,
splitChunks: false,
},
}),
resolve: {
alias: {
...config.resolve.alias,
// Allow importing from superset-frontend
src: path.resolve(__dirname, '../../superset-frontend/src'),
// '@superset-ui/core': path.resolve(
// __dirname,
// '../../superset-frontend/packages/superset-ui-core',
// ),
// Add aliases for our components to make imports easier
'@docs/components': path.resolve(__dirname, '../src/components'),
'@superset/components': path.resolve(
__dirname,
'../../superset-frontend/packages/superset-ui-core/src/components',
),
// Add proper Storybook aliases
'@storybook/blocks': path.resolve(
__dirname,
'../node_modules/@storybook/blocks',
),
'@storybook/components': path.resolve(
__dirname,
'../node_modules/@storybook/components',
),
'@storybook/theming': path.resolve(
__dirname,
'../node_modules/@storybook/theming',
),
'@storybook/client-logger': path.resolve(
__dirname,
'../node_modules/@storybook/client-logger',
),
'@storybook/core-events': path.resolve(
__dirname,
'../node_modules/@storybook/core-events',
),
// Add internal Storybook aliases
'storybook/internal/components': path.resolve(
__dirname,
'../node_modules/@storybook/components',
),
'storybook/internal/theming': path.resolve(
__dirname,
'../node_modules/@storybook/theming',
),
'storybook/internal/client-logger': path.resolve(
__dirname,
'../node_modules/@storybook/client-logger',
),
'storybook/internal/csf': path.resolve(
__dirname,
'../node_modules/@storybook/csf',
),
'storybook/internal/preview-api': path.resolve(
__dirname,
'../node_modules/@storybook/preview-api',
),
'storybook/internal/docs-tools': path.resolve(
__dirname,
'../node_modules/@storybook/docs-tools',
),
'storybook/internal/core-events': path.resolve(
__dirname,
'../node_modules/@storybook/core-events',
),
'storybook/internal/channels': path.resolve(
__dirname,
'../node_modules/@storybook/channels',
),
},
},
// We're removing the ts-loader rule that was processing superset-frontend files
// This will prevent TypeScript errors from files outside the docs directory
};
},
};
}

View File

@@ -1,3 +0,0 @@
[
"1.0.0"
]

View File

@@ -1,35 +0,0 @@
---
title: API
hide_title: true
sidebar_position: 10
---
import SwaggerUI from 'swagger-ui-react';
import openapi from '/resources/openapi.json';
import 'swagger-ui-react/swagger-ui.css';
import { Alert } from 'antd';
## API
Superset's public **REST API** follows the
[OpenAPI specification](https://swagger.io/specification/), and is
documented here. The docs below are generated using
[Swagger React UI](https://www.npmjs.com/package/swagger-ui-react).
<Alert
type="info"
message={
<div>
<strong>NOTE! </strong>
You can find an interactive version of this documentation on your local Superset
instance at <strong>/swagger/v1</strong> (unless disabled)
</div>
}
/>
<br />
<br />
<hr />
<div className="swagger-container">
<SwaggerUI spec={openapi} />
</div>

View File

@@ -1,436 +0,0 @@
---
title: Alerts and Reports
hide_title: true
sidebar_position: 2
version: 2
---
# Alerts and Reports
Users can configure automated alerts and reports to send dashboards or charts to an email recipient or Slack channel.
- *Alerts* are sent when a SQL condition is reached
- *Reports* are sent on a schedule
Alerts and reports are disabled by default. To turn them on, you need to do some setup, described here.
## Requirements
### Commons
#### In your `superset_config.py` or `superset_config_docker.py`
- `"ALERT_REPORTS"` [feature flag](/docs/configuration/configuring-superset#feature-flags) must be turned to True.
- `beat_schedule` in CeleryConfig must contain schedule for `reports.scheduler`.
- At least one of those must be configured, depending on what you want to use:
- emails: `SMTP_*` settings
- Slack messages: `SLACK_API_TOKEN`
- Users can customize the email subject by including date code placeholders, which will automatically be replaced with the corresponding UTC date when the email is sent. To enable this functionality, activate the `"DATE_FORMAT_IN_EMAIL_SUBJECT"` [feature flag](/docs/configuration/configuring-superset#feature-flags). This enables date formatting in email subjects, preventing all reporting emails from being grouped into the same thread (optional for the reporting feature).
- Use date codes from [strftime.org](https://strftime.org/) to create the email subject.
- If no date code is provided, the original string will be used as the email subject.
##### Disable dry-run mode
Screenshots will be taken but no messages actually sent as long as `ALERT_REPORTS_NOTIFICATION_DRY_RUN = True`, its default value in `docker/pythonpath_dev/superset_config.py`. To disable dry-run mode and start receiving email/Slack notifications, set `ALERT_REPORTS_NOTIFICATION_DRY_RUN` to `False` in [superset config](https://github.com/apache/superset/blob/master/docker/pythonpath_dev/superset_config.py).
#### In your `Dockerfile`
- You must install a headless browser, for taking screenshots of the charts and dashboards. Only Firefox and Chrome are currently supported.
> If you choose Chrome, you must also change the value of `WEBDRIVER_TYPE` to `"chrome"` in your `superset_config.py`.
Note: All the components required (Firefox headless browser, Redis, Postgres db, celery worker and celery beat) are present in the *dev* docker image if you are following [Installing Superset Locally](/docs/installation/docker-compose/).
All you need to do is add the required config variables described in this guide (See `Detailed Config`).
If you are running a non-dev docker image, e.g., a stable release like `apache/superset:3.1.0`, that image does not include a headless browser. Only the `superset_worker` container needs this headless browser to browse to the target chart or dashboard.
You can either install and configure the headless browser - see "Custom Dockerfile" section below - or when deploying via `docker compose`, modify your `docker-compose.yml` file to use a dev image for the worker container and a stable release image for the `superset_app` container.
*Note*: In this context, a "dev image" is the same application software as its corresponding non-dev image, just bundled with additional tools. So an image like `3.1.0-dev` is identical to `3.1.0` when it comes to stability, functionality, and running in production. The actual "in-development" versions of Superset - cutting-edge and unstable - are not tagged with version numbers on Docker Hub and will display version `0.0.0-dev` within the Superset UI.
### Slack integration
To send alerts and reports to Slack channels, you need to create a new Slack Application on your workspace.
1. Connect to your Slack workspace, then head to [https://api.slack.com/apps].
2. Create a new app.
3. Go to "OAuth & Permissions" section, and give the following scopes to your app:
- `incoming-webhook`
- `files:write`
- `chat:write`
- `channels:read`
- `groups:read`
4. At the top of the "OAuth and Permissions" section, click "install to workspace".
5. Select a default channel for your app and continue.
(You can post to any channel by inviting your Superset app into that channel).
6. The app should now be installed in your workspace, and a "Bot User OAuth Access Token" should have been created. Copy that token in the `SLACK_API_TOKEN` variable of your `superset_config.py`.
7. Ensure the feature flag `ALERT_REPORT_SLACK_V2` is set to True in `superset_config.py`
8. Restart the service (or run `superset init`) to pull in the new configuration.
Note: when you configure an alert or a report, the Slack channel list takes channel names without the leading '#' e.g. use `alerts` instead of `#alerts`.
### Kubernetes-specific
- You must have a `celery beat` pod running. If you're using the chart included in the GitHub repository under [helm/superset](https://github.com/apache/superset/tree/master/helm/superset), you need to put `supersetCeleryBeat.enabled = true` in your values override.
- You can see the dedicated docs about [Kubernetes installation](/docs/installation/kubernetes) for more details.
### Docker Compose specific
#### You must have in your `docker-compose.yml`
- A Redis message broker
- PostgreSQL DB instead of SQLlite
- One or more `celery worker`
- A single `celery beat`
This process also works in a Docker swarm environment, you would just need to add `Deploy:` to the Superset, Redis and Postgres services along with your specific configs for your swarm.
### Detailed config
The following configurations need to be added to the `superset_config.py` file. This file is loaded when the image runs, and any configurations in it will override the default configurations found in the `config.py`.
You can find documentation about each field in the default `config.py` in the GitHub repository under [superset/config.py](https://github.com/apache/superset/blob/master/superset/config.py).
You need to replace default values with your custom Redis, Slack and/or SMTP config.
Superset uses Celery beat and Celery worker(s) to send alerts and reports.
- The beat is the scheduler that tells the worker when to perform its tasks. This schedule is defined when you create the alert or report.
- The worker will process the tasks that need to be performed when an alert or report is fired.
In the `CeleryConfig`, only the `beat_schedule` is relevant to this feature, the rest of the `CeleryConfig` can be changed for your needs.
```python
from celery.schedules import crontab
FEATURE_FLAGS = {
"ALERT_REPORTS": True
}
REDIS_HOST = "superset_cache"
REDIS_PORT = "6379"
class CeleryConfig:
broker_url = f"redis://{REDIS_HOST}:{REDIS_PORT}/0"
imports = (
"superset.sql_lab",
"superset.tasks.scheduler",
)
result_backend = f"redis://{REDIS_HOST}:{REDIS_PORT}/0"
worker_prefetch_multiplier = 10
task_acks_late = True
task_annotations = {
"sql_lab.get_sql_results": {
"rate_limit": "100/s",
},
}
beat_schedule = {
"reports.scheduler": {
"task": "reports.scheduler",
"schedule": crontab(minute="*", hour="*"),
},
"reports.prune_log": {
"task": "reports.prune_log",
"schedule": crontab(minute=0, hour=0),
},
}
CELERY_CONFIG = CeleryConfig
SCREENSHOT_LOCATE_WAIT = 100
SCREENSHOT_LOAD_WAIT = 600
# Slack configuration
SLACK_API_TOKEN = "xoxb-"
# Email configuration
SMTP_HOST = "smtp.sendgrid.net" # change to your host
SMTP_PORT = 2525 # your port, e.g. 587
SMTP_STARTTLS = True
SMTP_SSL_SERVER_AUTH = True # If you're using an SMTP server with a valid certificate
SMTP_SSL = False
SMTP_USER = "your_user" # use the empty string "" if using an unauthenticated SMTP server
SMTP_PASSWORD = "your_password" # use the empty string "" if using an unauthenticated SMTP server
SMTP_MAIL_FROM = "noreply@youremail.com"
EMAIL_REPORTS_SUBJECT_PREFIX = "[Superset] " # optional - overwrites default value in config.py of "[Report] "
# WebDriver configuration
# If you use Firefox, you can stick with default values
# If you use Chrome, then add the following WEBDRIVER_TYPE and WEBDRIVER_OPTION_ARGS
WEBDRIVER_TYPE = "chrome"
WEBDRIVER_OPTION_ARGS = [
"--force-device-scale-factor=2.0",
"--high-dpi-support=2.0",
"--headless",
"--disable-gpu",
"--disable-dev-shm-usage",
"--no-sandbox",
"--disable-setuid-sandbox",
"--disable-extensions",
]
# This is for internal use, you can keep http
WEBDRIVER_BASEURL = "http://superset:8088" # When running using docker compose use "http://superset_app:8088'
# This is the link sent to the recipient. Change to your domain, e.g. https://superset.mydomain.com
WEBDRIVER_BASEURL_USER_FRIENDLY = "http://localhost:8088"
```
You also need
to specify on behalf of which username to render the dashboards. In general, dashboards and charts
are not accessible to unauthorized requests, that is why the worker needs to take over credentials
of an existing user to take a snapshot.
By default, Alerts and Reports are executed as the owner of the alert/report object. To use a fixed user account,
just change the config as follows (`admin` in this example):
```python
from superset.tasks.types import FixedExecutor
ALERT_REPORTS_EXECUTORS = [FixedExecutor("admin")]
```
Please refer to `ExecutorType` in the codebase for other executor types.
**Important notes**
- Be mindful of the concurrency setting for celery (using `-c 4`). Selenium/webdriver instances can
consume a lot of CPU / memory on your servers.
- In some cases, if you notice a lot of leaked geckodriver processes, try running your celery
processes with `celery worker --pool=prefork --max-tasks-per-child=128 ...`
- It is recommended to run separate workers for the `sql_lab` and `email_reports` tasks. This can be
done using the `queue` field in `task_annotations`.
- Adjust `WEBDRIVER_BASEURL` in your configuration file if celery workers cant access Superset via
its default value of `http://0.0.0.0:8080/`.
It's also possible to specify a minimum interval between each report's execution through the config file:
``` python
# Set a minimum interval threshold between executions (for each Alert/Report)
# Value should be an integer
ALERT_MINIMUM_INTERVAL = int(timedelta(minutes=10).total_seconds())
REPORT_MINIMUM_INTERVAL = int(timedelta(minutes=5).total_seconds())
```
Alternatively, you can assign a function to `ALERT_MINIMUM_INTERVAL` and/or `REPORT_MINIMUM_INTERVAL`. This is useful to dynamically retrieve a value as needed:
``` python
def alert_dynamic_minimal_interval(**kwargs) -> int:
"""
Define logic here to retrieve the value dynamically
"""
ALERT_MINIMUM_INTERVAL = alert_dynamic_minimal_interval
```
## Custom Dockerfile
If you're running the dev version of a released Superset image, like `apache/superset:3.1.0-dev`, you should be set with the above.
But if you're building your own image, or starting with a non-dev version, a webdriver (and headless browser) is needed to capture screenshots of the charts and dashboards which are then sent to the recipient.
Here's how you can modify your Dockerfile to take the screenshots either with Firefox or Chrome.
### Using Firefox
```docker
FROM apache/superset:3.1.0
USER root
RUN apt-get update && \
apt-get install --no-install-recommends -y firefox-esr
ENV GECKODRIVER_VERSION=0.29.0
RUN wget -q https://github.com/mozilla/geckodriver/releases/download/v${GECKODRIVER_VERSION}/geckodriver-v${GECKODRIVER_VERSION}-linux64.tar.gz && \
tar -x geckodriver -zf geckodriver-v${GECKODRIVER_VERSION}-linux64.tar.gz -O > /usr/bin/geckodriver && \
chmod 755 /usr/bin/geckodriver && \
rm geckodriver-v${GECKODRIVER_VERSION}-linux64.tar.gz
RUN pip install --no-cache gevent psycopg2 redis
USER superset
```
### Using Chrome
```docker
FROM apache/superset:3.1.0
USER root
RUN apt-get update && \
apt-get install -y wget zip libaio1
RUN export CHROMEDRIVER_VERSION=$(curl --silent https://googlechromelabs.github.io/chrome-for-testing/LATEST_RELEASE_116) && \
wget -O google-chrome-stable_current_amd64.deb -q http://dl.google.com/linux/chrome/deb/pool/main/g/google-chrome-stable/google-chrome-stable_${CHROMEDRIVER_VERSION}-1_amd64.deb && \
apt-get install -y --no-install-recommends ./google-chrome-stable_current_amd64.deb && \
rm -f google-chrome-stable_current_amd64.deb
RUN export CHROMEDRIVER_VERSION=$(curl --silent https://googlechromelabs.github.io/chrome-for-testing/LATEST_RELEASE_116) && \
wget -q https://storage.googleapis.com/chrome-for-testing-public/${CHROMEDRIVER_VERSION}/linux64/chromedriver-linux64.zip && \
unzip -j chromedriver-linux64.zip -d /usr/bin && \
chmod 755 /usr/bin/chromedriver && \
rm -f chromedriver-linux64.zip
RUN pip install --no-cache gevent psycopg2 redis
USER superset
```
Don't forget to set `WEBDRIVER_TYPE` and `WEBDRIVER_OPTION_ARGS` in your config if you use Chrome.
## Troubleshooting
There are many reasons that reports might not be working. Try these steps to check for specific issues.
### Confirm feature flag is enabled and you have sufficient permissions
If you don't see "Alerts & Reports" under the *Manage* section of the Settings dropdown in the Superset UI, you need to enable the `ALERT_REPORTS` feature flag (see above). Enable another feature flag and check to see that it took effect, to verify that your config file is getting loaded.
Log in as an admin user to ensure you have adequate permissions.
### Check the logs of your Celery worker
This is the best source of information about the problem. In a docker compose deployment, you can do this with a command like `docker logs superset_worker --since 1h`.
### Check web browser and webdriver installation
To take a screenshot, the worker visits the dashboard or chart using a headless browser, then takes a screenshot. If you are able to send a chart as CSV or text but can't send as PNG, your problem may lie with the browser.
Superset docker images that have a tag ending with `-dev` have the Firefox headless browser and geckodriver already installed. You can test that these are installed and in the proper path by entering your Superset worker and running `firefox --headless` and then `geckodriver`. Both commands should start those applications.
If you are handling the installation of that software on your own, or wish to use Chromium instead, do your own verification to ensure that the headless browser opens successfully in the worker environment.
### Send a test email
One symptom of an invalid connection to an email server is receiving an error of `[Errno 110] Connection timed out` in your logs when the report tries to send.
Confirm via testing that your outbound email configuration is correct. Here is the simplest test, for an un-authenticated email SMTP email service running on port 25. If you are sending over SSL, for instance, study how [Superset's codebase sends emails](https://github.com/apache/superset/blob/master/superset/utils/core.py#L818) and then test with those commands and arguments.
Start Python in your worker environment, replace all example values, and run:
```python
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from_email = 'superset_emails@example.com'
to_email = 'your_email@example.com'
msg = MIMEMultipart()
msg['From'] = from_email
msg['To'] = to_email
msg['Subject'] = 'Superset SMTP config test'
message = 'It worked'
msg.attach(MIMEText(message))
mailserver = smtplib.SMTP('smtpmail.example.com', 25)
mailserver.sendmail(from_email, to_email, msg.as_string())
mailserver.quit()
```
This should send an email.
Possible fixes:
- Some cloud hosts disable outgoing unauthenticated SMTP email to prevent spam. For instance, [Azure blocks port 25 by default on some machines](https://learn.microsoft.com/en-us/azure/virtual-network/troubleshoot-outbound-smtp-connectivity). Enable that port or use another sending method.
- Use another set of SMTP credentials that you verify works in this setup.
### Browse to your report from the worker
The worker may be unable to reach the report. It will use the value of `WEBDRIVER_BASEURL` to browse to the report. If that route is invalid, or presents an authentication challenge that the worker can't pass, the report screenshot will fail.
Check this by attempting to `curl` the URL of a report that you see in the error logs of your worker. For instance, from the worker environment, run `curl http://superset_app:8088/superset/dashboard/1/`. You may get different responses depending on whether the dashboard exists - for example, you may need to change the `1` in that URL. If there's a URL in your logs from a failed report screenshot, that's a good place to start. The goal is to determine a valid value for `WEBDRIVER_BASEURL` and determine if an issue like HTTPS or authentication is redirecting your worker.
In a deployment with authentication measures enabled like HTTPS and Single Sign-On, it may make sense to have the worker navigate directly to the Superset application running in the same location, avoiding the need to sign in. For instance, you could use `WEBDRIVER_BASEURL="http://superset_app:8088"` for a docker compose deployment, and set `"force_https": False,` in your `TALISMAN_CONFIG`.
## Scheduling Queries as Reports
You can optionally allow your users to schedule queries directly in SQL Lab. This is done by adding
extra metadata to saved queries, which are then picked up by an external scheduled (like
[Apache Airflow](https://airflow.apache.org/)).
To allow scheduled queries, add the following to `SCHEDULED_QUERIES` in your configuration file:
```python
SCHEDULED_QUERIES = {
# This information is collected when the user clicks "Schedule query",
# and saved into the `extra` field of saved queries.
# See: https://github.com/mozilla-services/react-jsonschema-form
'JSONSCHEMA': {
'title': 'Schedule',
'description': (
'In order to schedule a query, you need to specify when it '
'should start running, when it should stop running, and how '
'often it should run. You can also optionally specify '
'dependencies that should be met before the query is '
'executed. Please read the documentation for best practices '
'and more information on how to specify dependencies.'
),
'type': 'object',
'properties': {
'output_table': {
'type': 'string',
'title': 'Output table name',
},
'start_date': {
'type': 'string',
'title': 'Start date',
# date-time is parsed using the chrono library, see
# https://www.npmjs.com/package/chrono-node#usage
'format': 'date-time',
'default': 'tomorrow at 9am',
},
'end_date': {
'type': 'string',
'title': 'End date',
# date-time is parsed using the chrono library, see
# https://www.npmjs.com/package/chrono-node#usage
'format': 'date-time',
'default': '9am in 30 days',
},
'schedule_interval': {
'type': 'string',
'title': 'Schedule interval',
},
'dependencies': {
'type': 'array',
'title': 'Dependencies',
'items': {
'type': 'string',
},
},
},
},
'UISCHEMA': {
'schedule_interval': {
'ui:placeholder': '@daily, @weekly, etc.',
},
'dependencies': {
'ui:help': (
'Check the documentation for the correct format when '
'defining dependencies.'
),
},
},
'VALIDATION': [
# ensure that start_date <= end_date
{
'name': 'less_equal',
'arguments': ['start_date', 'end_date'],
'message': 'End date cannot be before start date',
# this is where the error message is shown
'container': 'end_date',
},
],
# link to the scheduler; this example links to an Airflow pipeline
# that uses the query id and the output table as its name
'linkback': (
'https://airflow.example.com/admin/airflow/tree?'
'dag_id=query_${id}_${extra_json.schedule_info.output_table}'
),
}
```
This configuration is based on
[react-jsonschema-form](https://github.com/mozilla-services/react-jsonschema-form) and will add a
menu item called “Schedule” to SQL Lab. When the menu item is clicked, a modal will show up where
the user can add the metadata required for scheduling the query.
This information can then be retrieved from the endpoint `/api/v1/saved_query/` and used to
schedule the queries that have `schedule_info` in their JSON metadata. For schedulers other than
Airflow, additional fields can be easily added to the configuration file above.

View File

@@ -1,104 +0,0 @@
---
title: Async Queries via Celery
hide_title: true
sidebar_position: 4
version: 1
---
# Async Queries via Celery
## Celery
On large analytic databases, its common to run queries that execute for minutes or hours. To enable
support for long running queries that execute beyond the typical web requests timeout (30-60
seconds), it is necessary to configure an asynchronous backend for Superset which consists of:
- one or many Superset workers (which is implemented as a Celery worker), and can be started with
the `celery worker` command, run `celery worker --help` to view the related options.
- a celery broker (message queue) for which we recommend using Redis or RabbitMQ
- a results backend that defines where the worker will persist the query results
Configuring Celery requires defining a `CELERY_CONFIG` in your `superset_config.py`. Both the worker
and web server processes should have the same configuration.
```python
class CeleryConfig(object):
broker_url = "redis://localhost:6379/0"
imports = (
"superset.sql_lab",
"superset.tasks.scheduler",
)
result_backend = "redis://localhost:6379/0"
worker_prefetch_multiplier = 10
task_acks_late = True
task_annotations = {
"sql_lab.get_sql_results": {
"rate_limit": "100/s",
},
}
CELERY_CONFIG = CeleryConfig
```
To start a Celery worker to leverage the configuration, run the following command:
```bash
celery --app=superset.tasks.celery_app:app worker --pool=prefork -O fair -c 4
```
To start a job which schedules periodic background jobs, run the following command:
```bash
celery --app=superset.tasks.celery_app:app beat
```
To setup a result backend, you need to pass an instance of a derivative of from
from flask_caching.backends.base import BaseCache to the RESULTS_BACKEND configuration key in your superset_config.py. You can
use Memcached, Redis, S3 (https://pypi.python.org/pypi/s3werkzeugcache), memory or the file system
(in a single server-type setup or for testing), or to write your own caching interface. Your
`superset_config.py` may look something like:
```python
# On S3
from s3cache.s3cache import S3Cache
S3_CACHE_BUCKET = 'foobar-superset'
S3_CACHE_KEY_PREFIX = 'sql_lab_result'
RESULTS_BACKEND = S3Cache(S3_CACHE_BUCKET, S3_CACHE_KEY_PREFIX)
# On Redis
from flask_caching.backends.rediscache import RedisCache
RESULTS_BACKEND = RedisCache(
host='localhost', port=6379, key_prefix='superset_results')
```
For performance gains, [MessagePack](https://github.com/msgpack/msgpack-python) and
[PyArrow](https://arrow.apache.org/docs/python/) are now used for results serialization. This can be
disabled by setting `RESULTS_BACKEND_USE_MSGPACK = False` in your `superset_config.py`, should any
issues arise. Please clear your existing results cache store when upgrading an existing environment.
**Important Notes**
- It is important that all the worker nodes and web servers in the Superset cluster _share a common
metadata database_. This means that SQLite will not work in this context since it has limited
support for concurrency and typically lives on the local file system.
- There should _only be one instance of celery beat running_ in your entire setup. If not,
background jobs can get scheduled multiple times resulting in weird behaviors like duplicate
delivery of reports, higher than expected load / traffic etc.
- SQL Lab will _only run your queries asynchronously if_ you enable **Asynchronous Query Execution**
in your database settings (Sources > Databases > Edit record).
## Celery Flower
Flower is a web based tool for monitoring the Celery cluster which you can install from pip:
```bash
pip install flower
```
You can run flower using:
```bash
celery --app=superset.tasks.celery_app:app flower
```

View File

@@ -1,154 +0,0 @@
---
title: Caching
hide_title: true
sidebar_position: 3
version: 1
---
# Caching
Superset uses [Flask-Caching](https://flask-caching.readthedocs.io/) for caching purposes.
Flask-Caching supports various caching backends, including Redis (recommended), Memcached,
SimpleCache (in-memory), or the local filesystem.
[Custom cache backends](https://flask-caching.readthedocs.io/en/latest/#custom-cache-backends)
are also supported.
Caching can be configured by providing dictionaries in
`superset_config.py` that comply with [the Flask-Caching config specifications](https://flask-caching.readthedocs.io/en/latest/#configuring-flask-caching).
The following cache configurations can be customized in this way:
- Dashboard filter state (required): `FILTER_STATE_CACHE_CONFIG`.
- Explore chart form data (required): `EXPLORE_FORM_DATA_CACHE_CONFIG`
- Metadata cache (optional): `CACHE_CONFIG`
- Charting data queried from datasets (optional): `DATA_CACHE_CONFIG`
For example, to configure the filter state cache using Redis:
```python
FILTER_STATE_CACHE_CONFIG = {
'CACHE_TYPE': 'RedisCache',
'CACHE_DEFAULT_TIMEOUT': 86400,
'CACHE_KEY_PREFIX': 'superset_filter_cache',
'CACHE_REDIS_URL': 'redis://localhost:6379/0'
}
```
## Dependencies
In order to use dedicated cache stores, additional python libraries must be installed
- For Redis: we recommend the [redis](https://pypi.python.org/pypi/redis) Python package
- Memcached: we recommend using [pylibmc](https://pypi.org/project/pylibmc/) client library as
`python-memcached` does not handle storing binary data correctly.
These libraries can be installed using pip.
## Fallback Metastore Cache
Note, that some form of Filter State and Explore caching are required. If either of these caches
are undefined, Superset falls back to using a built-in cache that stores data in the metadata
database. While it is recommended to use a dedicated cache, the built-in cache can also be used
to cache other data.
For example, to use the built-in cache to store chart data, use the following config:
```python
DATA_CACHE_CONFIG = {
"CACHE_TYPE": "SupersetMetastoreCache",
"CACHE_KEY_PREFIX": "superset_results", # make sure this string is unique to avoid collisions
"CACHE_DEFAULT_TIMEOUT": 86400, # 60 seconds * 60 minutes * 24 hours
}
```
## Chart Cache Timeout
The cache timeout for charts may be overridden by the settings for an individual chart, dataset, or
database. Each of these configurations will be checked in order before falling back to the default
value defined in `DATA_CACHE_CONFIG`.
Note, that by setting the cache timeout to `-1`, caching for charting data can be disabled, either
per chart, dataset or database, or by default if set in `DATA_CACHE_CONFIG`.
## SQL Lab Query Results
Caching for SQL Lab query results is used when async queries are enabled and is configured using
`RESULTS_BACKEND`.
Note that this configuration does not use a flask-caching dictionary for its configuration, but
instead requires a cachelib object.
See [Async Queries via Celery](/docs/configuration/async-queries-celery) for details.
## Caching Thumbnails
This is an optional feature that can be turned on by activating its [feature flag](/docs/configuration/configuring-superset#feature-flags) on config:
```
FEATURE_FLAGS = {
"THUMBNAILS": True,
"THUMBNAILS_SQLA_LISTENERS": True,
}
```
By default thumbnails are rendered per user, and will fall back to the Selenium user for anonymous users.
To always render thumbnails as a fixed user (`admin` in this example), use the following configuration:
```python
from superset.tasks.types import FixedExecutor
THUMBNAIL_EXECUTORS = [FixedExecutor("admin")]
```
For this feature you will need a cache system and celery workers. All thumbnails are stored on cache
and are processed asynchronously by the workers.
An example config where images are stored on S3 could be:
```python
from flask import Flask
from s3cache.s3cache import S3Cache
...
class CeleryConfig(object):
broker_url = "redis://localhost:6379/0"
imports = (
"superset.sql_lab",
"superset.tasks.thumbnails",
)
result_backend = "redis://localhost:6379/0"
worker_prefetch_multiplier = 10
task_acks_late = True
CELERY_CONFIG = CeleryConfig
def init_thumbnail_cache(app: Flask) -> S3Cache:
return S3Cache("bucket_name", 'thumbs_cache/')
THUMBNAIL_CACHE_CONFIG = init_thumbnail_cache
```
Using the above example cache keys for dashboards will be `superset_thumb__dashboard__{ID}`. You can
override the base URL for selenium using:
```
WEBDRIVER_BASEURL = "https://superset.company.com"
```
Additional selenium web drive configuration can be set using `WEBDRIVER_CONFIGURATION`. You can
implement a custom function to authenticate selenium. The default function uses the `flask-login`
session cookie. Here's an example of a custom function signature:
```python
def auth_driver(driver: WebDriver, user: "User") -> WebDriver:
pass
```
Then on configuration:
```
WEBDRIVER_AUTH_FUNC = auth_driver
```

View File

@@ -1,548 +0,0 @@
---
title: Configuring Superset
hide_title: true
sidebar_position: 1
version: 1
---
# Configuring Superset
## superset_config.py
Superset exposes hundreds of configurable parameters through its
[config.py module](https://github.com/apache/superset/blob/master/superset/config.py). The
variables and objects exposed act as a public interface of the bulk of what you may want
to configure, alter and interface with. In this python module, you'll find all these
parameters, sensible defaults, as well as rich documentation in the form of comments
To configure your application, you need to create your own configuration module, which
will allow you to override few or many of these parameters. Instead of altering the core module,
you'll want to define your own module (typically a file named `superset_config.py`).
Add this file to your `PYTHONPATH` or create an environment variable
`SUPERSET_CONFIG_PATH` specifying the full path of the `superset_config.py`.
For example, if deploying on Superset directly on a Linux-based system where your
`superset_config.py` is under `/app` directory, you can run:
```bash
export SUPERSET_CONFIG_PATH=/app/superset_config.py
```
If you are using your own custom Dockerfile with the official Superset image as base image,
then you can add your overrides as shown below:
```bash
COPY --chown=superset superset_config.py /app/
ENV SUPERSET_CONFIG_PATH /app/superset_config.py
```
Docker compose deployments handle application configuration differently using specific conventions.
Refer to the [docker compose tips & configuration](/docs/installation/docker-compose#docker-compose-tips--configuration)
for details.
The following is an example of just a few of the parameters you can set in your `superset_config.py` file:
```
# Superset specific config
ROW_LIMIT = 5000
# Flask App Builder configuration
# Your App secret key will be used for securely signing the session cookie
# and encrypting sensitive information on the database
# Make sure you are changing this key for your deployment with a strong key.
# Alternatively you can set it with `SUPERSET_SECRET_KEY` environment variable.
# You MUST set this for production environments or the server will refuse
# to start and you will see an error in the logs accordingly.
SECRET_KEY = 'YOUR_OWN_RANDOM_GENERATED_SECRET_KEY'
# The SQLAlchemy connection string to your database backend
# This connection defines the path to the database that stores your
# superset metadata (slices, connections, tables, dashboards, ...).
# Note that the connection information to connect to the datasources
# you want to explore are managed directly in the web UI
# The check_same_thread=false property ensures the sqlite client does not attempt
# to enforce single-threaded access, which may be problematic in some edge cases
SQLALCHEMY_DATABASE_URI = 'sqlite:////path/to/superset.db?check_same_thread=false'
# Flask-WTF flag for CSRF
WTF_CSRF_ENABLED = True
# Add endpoints that need to be exempt from CSRF protection
WTF_CSRF_EXEMPT_LIST = []
# A CSRF token that expires in 1 year
WTF_CSRF_TIME_LIMIT = 60 * 60 * 24 * 365
# Set this API key to enable Mapbox visualizations
MAPBOX_API_KEY = ''
```
:::tip
Note that it is typical to copy and paste [only] the portions of the
core [superset/config.py](https://github.com/apache/superset/blob/master/superset/config.py) that
you want to alter, along with the related comments into your own `superset_config.py` file.
:::
All the parameters and default values defined
in [superset/config.py](https://github.com/apache/superset/blob/master/superset/config.py)
can be altered in your local `superset_config.py`. Administrators will want to read through the file
to understand what can be configured locally as well as the default values in place.
Since `superset_config.py` acts as a Flask configuration module, it can be used to alter the
settings of Flask itself, as well as Flask extensions that Superset bundles like
`flask-wtf`, `flask-caching`, `flask-migrate`,
and `flask-appbuilder`. Each one of these extensions offers intricate configurability.
Flask App Builder, the web framework used by Superset, also offers many
configuration settings. Please consult the
[Flask App Builder Documentation](https://flask-appbuilder.readthedocs.org/en/latest/config.html)
for more information on how to configure it.
At the very least, you'll want to change `SECRET_KEY` and `SQLALCHEMY_DATABASE_URI`. Continue reading for more about each of these.
## Specifying a SECRET_KEY
### Adding an initial SECRET_KEY
Superset requires a user-specified SECRET_KEY to start up. This requirement was [added in version 2.1.0 to force secure configurations](https://preset.io/blog/superset-security-update-default-secret_key-vulnerability/). Add a strong SECRET_KEY to your `superset_config.py` file like:
```python
SECRET_KEY = 'YOUR_OWN_RANDOM_GENERATED_SECRET_KEY'
```
You can generate a strong secure key with `openssl rand -base64 42`.
:::caution Use a strong secret key
This key will be used for securely signing session cookies and encrypting sensitive information stored in Superset's application metadata database.
Your deployment must use a complex, unique key.
:::
### Rotating to a newer SECRET_KEY
If you wish to change your existing SECRET_KEY, add the existing SECRET_KEY to your `superset_config.py` file as
`PREVIOUS_SECRET_KEY =`and provide your new key as `SECRET_KEY =`. You can find your current SECRET_KEY with these
commands - if running Superset with Docker, execute from within the Superset application container:
```python
superset shell
from flask import current_app; print(current_app.config["SECRET_KEY"])
```
Save your `superset_config.py` with these values and then run `superset re-encrypt-secrets`.
## Setting up a production metadata database
Superset needs a database to store the information it manages, like the definitions of
charts, dashboards, and many other things.
By default, Superset is configured to use [SQLite](https://www.sqlite.org/),
a self-contained, single-file database that offers a simple and fast way to get started
(without requiring any installation). However, for production environments,
using SQLite is highly discouraged due to security, scalability, and data integrity reasons.
It's important to use only the supported database engines and consider using a different
database engine on a separate host or container.
Superset supports the following database engines/versions:
| Database Engine | Supported Versions |
| ----------------------------------------- | ---------------------------------------- |
| [PostgreSQL](https://www.postgresql.org/) | 10.X, 11.X, 12.X, 13.X, 14.X, 15.X, 16.X |
| [MySQL](https://www.mysql.com/) | 5.7, 8.X |
Use the following database drivers and connection strings:
| Database | PyPI package | Connection String |
| ----------------------------------------- | ------------------------- | ---------------------------------------------------------------------- |
| [PostgreSQL](https://www.postgresql.org/) | `pip install psycopg2` | `postgresql://<UserName>:<DBPassword>@<Database Host>/<Database Name>` |
| [MySQL](https://www.mysql.com/) | `pip install mysqlclient` | `mysql://<UserName>:<DBPassword>@<Database Host>/<Database Name>` |
:::tip
Properly setting up metadata store is beyond the scope of this documentation. We recommend
using a hosted managed service such as [Amazon RDS](https://aws.amazon.com/rds/) or
[Google Cloud Databases](https://cloud.google.com/products/databases?hl=en) to handle
service and supporting infrastructure and backup strategy.
:::
To configure Superset metastore set `SQLALCHEMY_DATABASE_URI` config key on `superset_config`
to the appropriate connection string.
## Running on a WSGI HTTP Server
While you can run Superset on NGINX or Apache, we recommend using Gunicorn in async mode. This
enables impressive concurrency even and is fairly easy to install and configure. Please refer to the
documentation of your preferred technology to set up this Flask WSGI application in a way that works
well in your environment. Heres an async setup known to work well in production:
```
-w 10 \
-k gevent \
--worker-connections 1000 \
--timeout 120 \
-b 0.0.0.0:6666 \
--limit-request-line 0 \
--limit-request-field_size 0 \
--statsd-host localhost:8125 \
"superset.app:create_app()"
```
Refer to the [Gunicorn documentation](https://docs.gunicorn.org/en/stable/design.html) for more
information. _Note that the development web server (`superset run` or `flask run`) is not intended
for production use._
If you're not using Gunicorn, you may want to disable the use of `flask-compress` by setting
`COMPRESS_REGISTER = False` in your `superset_config.py`.
Currently, the Google BigQuery Python SDK is not compatible with `gevent`, due to some dynamic monkeypatching on python core library by `gevent`.
So, when you use `BigQuery` datasource on Superset, you have to use `gunicorn` worker type except `gevent`.
## HTTPS Configuration
You can configure HTTPS upstream via a load balancer or a reverse proxy (such as nginx) and do SSL/TLS Offloading before traffic reaches the Superset application. In this setup, local traffic from a Celery worker taking a snapshot of a chart for Alerts & Reports can access Superset at a `http://` URL, from behind the ingress point.
You can also configure [SSL in Gunicorn](https://docs.gunicorn.org/en/stable/settings.html#ssl) (the Python webserver) if you are using an official Superset Docker image.
## Configuration Behind a Load Balancer
If you are running superset behind a load balancer or reverse proxy (e.g. NGINX or ELB on AWS), you
may need to utilize a healthcheck endpoint so that your load balancer knows if your superset
instance is running. This is provided at `/health` which will return a 200 response containing “OK”
if the webserver is running.
If the load balancer is inserting `X-Forwarded-For/X-Forwarded-Proto` headers, you should set
`ENABLE_PROXY_FIX = True` in the superset config file (`superset_config.py`) to extract and use the
headers.
In case the reverse proxy is used for providing SSL encryption, an explicit definition of the
`X-Forwarded-Proto` may be required. For the Apache webserver this can be set as follows:
```
RequestHeader set X-Forwarded-Proto "https"
```
## Configuring the application root
*Please be advised that this feature is in BETA.*
Superset supports running the application under a non-root path. The root path
prefix can be specified in one of two ways:
- Setting the `SUPERSET_APP_ROOT` environment variable to the desired prefix.
- Customizing the [Flask entrypoint](https://github.com/apache/superset/blob/master/superset/app.py#L29)
by passing the `superset_app_root` variable.
Note, the prefix should start with a `/`.
### Customizing the Flask entrypoint
To configure a prefix, e.g `/analytics`, pass the `superset_app_root` argument to
`create_app` when calling flask run either through the `FLASK_APP`
environment variable:
```sh
FLASK_APP="superset:create_app(superset_app_root='/analytics')"
```
or as part of the `--app` argument to `flask run`:
```sh
flask --app "superset.app:create_app(superset_app_root='/analytics')"
```
### Docker builds
The [docker compose](/docs/installation/docker-compose#configuring-further) developer
configuration includes an additional environmental variable,
[`SUPERSET_APP_ROOT`](https://github.com/apache/superset/blob/master/docker/.env),
to simplify the process of setting up a non-default root path across the services.
In `docker/.env-local` set `SUPERSET_APP_ROOT` to the desired prefix and then bring the
services up with `docker compose up --detach`.
## Custom OAuth2 Configuration
Superset is built on Flask-AppBuilder (FAB), which supports many providers out of the box
(GitHub, Twitter, LinkedIn, Google, Azure, etc). Beyond those, Superset can be configured to connect
with other OAuth2 Authorization Server implementations that support “code” authorization.
Make sure the pip package [`Authlib`](https://authlib.org/) is installed on the webserver.
First, configure authorization in Superset `superset_config.py`.
```python
from flask_appbuilder.security.manager import AUTH_OAUTH
# Set the authentication type to OAuth
AUTH_TYPE = AUTH_OAUTH
OAUTH_PROVIDERS = [
{ 'name':'egaSSO',
'token_key':'access_token', # Name of the token in the response of access_token_url
'icon':'fa-address-card', # Icon for the provider
'remote_app': {
'client_id':'myClientId', # Client Id (Identify Superset application)
'client_secret':'MySecret', # Secret for this Client Id (Identify Superset application)
'client_kwargs':{
'scope': 'read' # Scope for the Authorization
},
'access_token_method':'POST', # HTTP Method to call access_token_url
'access_token_params':{ # Additional parameters for calls to access_token_url
'client_id':'myClientId'
},
'jwks_uri':'https://myAuthorizationServe/adfs/discovery/keys', # may be required to generate token
'access_token_headers':{ # Additional headers for calls to access_token_url
'Authorization': 'Basic Base64EncodedClientIdAndSecret'
},
'api_base_url':'https://myAuthorizationServer/oauth2AuthorizationServer/',
'access_token_url':'https://myAuthorizationServer/oauth2AuthorizationServer/token',
'authorize_url':'https://myAuthorizationServer/oauth2AuthorizationServer/authorize'
}
}
]
# Will allow user self registration, allowing to create Flask users from Authorized User
AUTH_USER_REGISTRATION = True
# The default user self registration role
AUTH_USER_REGISTRATION_ROLE = "Public"
```
In case you want to assign the `Admin` role on new user registration, it can be assigned as follows:
```python
AUTH_USER_REGISTRATION_ROLE = "Admin"
```
If you encounter the [issue](https://github.com/apache/superset/issues/13243) of not being able to list users from the Superset main page settings, although a newly registered user has an `Admin` role, please re-run `superset init` to sync the required permissions. Below is the command to re-run `superset init` using docker compose.
```
docker-compose exec superset superset init
```
Then, create a `CustomSsoSecurityManager` that extends `SupersetSecurityManager` and overrides
`oauth_user_info`:
```python
import logging
from superset.security import SupersetSecurityManager
class CustomSsoSecurityManager(SupersetSecurityManager):
def oauth_user_info(self, provider, response=None):
logging.debug("Oauth2 provider: {0}.".format(provider))
if provider == 'egaSSO':
# As example, this line request a GET to base_url + '/' + userDetails with Bearer Authentication,
# and expects that authorization server checks the token, and response with user details
me = self.appbuilder.sm.oauth_remotes[provider].get('userDetails').data
logging.debug("user_data: {0}".format(me))
return { 'name' : me['name'], 'email' : me['email'], 'id' : me['user_name'], 'username' : me['user_name'], 'first_name':'', 'last_name':''}
...
```
This file must be located in the same directory as `superset_config.py` with the name
`custom_sso_security_manager.py`. Finally, add the following 2 lines to `superset_config.py`:
```
from custom_sso_security_manager import CustomSsoSecurityManager
CUSTOM_SECURITY_MANAGER = CustomSsoSecurityManager
```
**Notes**
- The redirect URL will be `https://<superset-webserver>/oauth-authorized/<provider-name>`
When configuring an OAuth2 authorization provider if needed. For instance, the redirect URL will
be `https://<superset-webserver>/oauth-authorized/egaSSO` for the above configuration.
- If an OAuth2 authorization server supports OpenID Connect 1.0, you could configure its configuration
document URL only without providing `api_base_url`, `access_token_url`, `authorize_url` and other
required options like user info endpoint, jwks uri etc. For instance:
```python
OAUTH_PROVIDERS = [
{ 'name':'egaSSO',
'token_key':'access_token', # Name of the token in the response of access_token_url
'icon':'fa-address-card', # Icon for the provider
'remote_app': {
'client_id':'myClientId', # Client Id (Identify Superset application)
'client_secret':'MySecret', # Secret for this Client Id (Identify Superset application)
'server_metadata_url': 'https://myAuthorizationServer/.well-known/openid-configuration'
}
}
]
```
### Keycloak-Specific Configuration using Flask-OIDC
If you are using Keycloak as OpenID Connect 1.0 Provider, the above configuration based on [`Authlib`](https://authlib.org/) might not work. In this case using [`Flask-OIDC`](https://pypi.org/project/flask-oidc/) is a viable option.
Make sure the pip package [`Flask-OIDC`](https://pypi.org/project/flask-oidc/) is installed on the webserver. This was successfully tested using version 2.2.0. This package requires [`Flask-OpenID`](https://pypi.org/project/Flask-OpenID/) as a dependency.
The following code defines a new security manager. Add it to a new file named `keycloak_security_manager.py`, placed in the same directory as your `superset_config.py` file.
```python
from flask_appbuilder.security.manager import AUTH_OID
from superset.security import SupersetSecurityManager
from flask_oidc import OpenIDConnect
from flask_appbuilder.security.views import AuthOIDView
from flask_login import login_user
from urllib.parse import quote
from flask_appbuilder.views import ModelView, SimpleFormView, expose
from flask import (
redirect,
request
)
import logging
class OIDCSecurityManager(SupersetSecurityManager):
def __init__(self, appbuilder):
super(OIDCSecurityManager, self).__init__(appbuilder)
if self.auth_type == AUTH_OID:
self.oid = OpenIDConnect(self.appbuilder.get_app)
self.authoidview = AuthOIDCView
class AuthOIDCView(AuthOIDView):
@expose('/login/', methods=['GET', 'POST'])
def login(self, flag=True):
sm = self.appbuilder.sm
oidc = sm.oid
@self.appbuilder.sm.oid.require_login
def handle_login():
user = sm.auth_user_oid(oidc.user_getfield('email'))
if user is None:
info = oidc.user_getinfo(['preferred_username', 'given_name', 'family_name', 'email'])
user = sm.add_user(info.get('preferred_username'), info.get('given_name'), info.get('family_name'),
info.get('email'), sm.find_role('Gamma'))
login_user(user, remember=False)
return redirect(self.appbuilder.get_url_for_index)
return handle_login()
@expose('/logout/', methods=['GET', 'POST'])
def logout(self):
oidc = self.appbuilder.sm.oid
oidc.logout()
super(AuthOIDCView, self).logout()
redirect_url = request.url_root.strip('/') + self.appbuilder.get_url_for_login
return redirect(
oidc.client_secrets.get('issuer') + '/protocol/openid-connect/logout?redirect_uri=' + quote(redirect_url))
```
Then add to your `superset_config.py` file:
```python
from keycloak_security_manager import OIDCSecurityManager
from flask_appbuilder.security.manager import AUTH_OID, AUTH_REMOTE_USER, AUTH_DB, AUTH_LDAP, AUTH_OAUTH
import os
AUTH_TYPE = AUTH_OID
SECRET_KEY: 'SomethingNotEntirelySecret'
OIDC_CLIENT_SECRETS = '/path/to/client_secret.json'
OIDC_ID_TOKEN_COOKIE_SECURE = False
OIDC_OPENID_REALM: '<myRealm>'
OIDC_INTROSPECTION_AUTH_METHOD: 'client_secret_post'
CUSTOM_SECURITY_MANAGER = OIDCSecurityManager
# Will allow user self registration, allowing to create Flask users from Authorized User
AUTH_USER_REGISTRATION = True
# The default user self registration role
AUTH_USER_REGISTRATION_ROLE = 'Public'
```
Store your client-specific OpenID information in a file called `client_secret.json`. Create this file in the same directory as `superset_config.py`:
```json
{
"<myOpenIDProvider>": {
"issuer": "https://<myKeycloakDomain>/realms/<myRealm>",
"auth_uri": "https://<myKeycloakDomain>/realms/<myRealm>/protocol/openid-connect/auth",
"client_id": "https://<myKeycloakDomain>",
"client_secret": "<myClientSecret>",
"redirect_uris": [
"https://<SupersetWebserver>/oauth-authorized/<myOpenIDProvider>"
],
"userinfo_uri": "https://<myKeycloakDomain>/realms/<myRealm>/protocol/openid-connect/userinfo",
"token_uri": "https://<myKeycloakDomain>/realms/<myRealm>/protocol/openid-connect/token",
"token_introspection_uri": "https://<myKeycloakDomain>/realms/<myRealm>/protocol/openid-connect/token/introspect"
}
}
```
## LDAP Authentication
FAB supports authenticating user credentials against an LDAP server.
To use LDAP you must install the [python-ldap](https://www.python-ldap.org/en/latest/installing.html) package.
See [FAB's LDAP documentation](https://flask-appbuilder.readthedocs.io/en/latest/security.html#authentication-ldap)
for details.
## Mapping LDAP or OAUTH groups to Superset roles
AUTH_ROLES_MAPPING in Flask-AppBuilder is a dictionary that maps from LDAP/OAUTH group names to FAB roles.
It is used to assign roles to users who authenticate using LDAP or OAuth.
### Mapping OAUTH groups to Superset roles
The following `AUTH_ROLES_MAPPING` dictionary would map the OAUTH group "superset_users" to the Superset roles "Gamma" as well as "Alpha", and the OAUTH group "superset_admins" to the Superset role "Admin".
```python
AUTH_ROLES_MAPPING = {
"superset_users": ["Gamma","Alpha"],
"superset_admins": ["Admin"],
}
```
### Mapping LDAP groups to Superset roles
The following `AUTH_ROLES_MAPPING` dictionary would map the LDAP DN "cn=superset_users,ou=groups,dc=example,dc=com" to the Superset roles "Gamma" as well as "Alpha", and the LDAP DN "cn=superset_admins,ou=groups,dc=example,dc=com" to the Superset role "Admin".
```python
AUTH_ROLES_MAPPING = {
"cn=superset_users,ou=groups,dc=example,dc=com": ["Gamma","Alpha"],
"cn=superset_admins,ou=groups,dc=example,dc=com": ["Admin"],
}
```
Note: This requires `AUTH_LDAP_SEARCH` to be set. For more details, please see the [FAB Security documentation](https://flask-appbuilder.readthedocs.io/en/latest/security.html).
### Syncing roles at login
You can also use the `AUTH_ROLES_SYNC_AT_LOGIN` configuration variable to control how often Flask-AppBuilder syncs the user's roles with the LDAP/OAUTH groups. If `AUTH_ROLES_SYNC_AT_LOGIN` is set to True, Flask-AppBuilder will sync the user's roles each time they log in. If `AUTH_ROLES_SYNC_AT_LOGIN` is set to False, Flask-AppBuilder will only sync the user's roles when they first register.
## Flask app Configuration Hook
`FLASK_APP_MUTATOR` is a configuration function that can be provided in your environment, receives
the app object and can alter it in any way. For example, add `FLASK_APP_MUTATOR` into your
`superset_config.py` to setup session cookie expiration time to 24 hours:
```python
from flask import session
from flask import Flask
def make_session_permanent():
'''
Enable maxAge for the cookie 'session'
'''
session.permanent = True
# Set up max age of session to 24 hours
PERMANENT_SESSION_LIFETIME = timedelta(hours=24)
def FLASK_APP_MUTATOR(app: Flask) -> None:
app.before_request_funcs.setdefault(None, []).append(make_session_permanent)
```
## Feature Flags
To support a diverse set of users, Superset has some features that are not enabled by default. For
example, some users have stronger security restrictions, while some others may not. So Superset
allows users to enable or disable some features by config. For feature owners, you can add optional
functionalities in Superset, but will be only affected by a subset of users.
You can enable or disable features with flag from `superset_config.py`:
```python
FEATURE_FLAGS = {
'PRESTO_EXPAND_DATA': False,
}
```
A current list of feature flags can be found in [RESOURCES/FEATURE_FLAGS.md](https://github.com/apache/superset/blob/master/RESOURCES/FEATURE_FLAGS.md).

View File

@@ -1,40 +0,0 @@
---
title: Country Map Tools
sidebar_position: 10
version: 1
---
import countriesData from '../../../data/countries.json';
# The Country Map Visualization
The Country Map visualization allows you to plot lightweight choropleth maps of
your countries by province, states, or other subdivision types. It does not rely
on any third-party map services but would require you to provide the
[ISO-3166-2](https://en.wikipedia.org/wiki/ISO_3166-2) codes of your country's
top-level subdivisions. Comparing to a province or state's full names, the ISO
code is less ambiguous and is unique to all regions in the world.
## Included Maps
The current list of countries can be found in the src
[legacy-plugin-chart-country-map/src/countries.ts](https://github.com/apache/superset/blob/master/superset-frontend/plugins/legacy-plugin-chart-country-map/src/countries.ts)
The Country Maps visualization already ships with the maps for the following countries:
<ul style={{columns: 3}}>
{countriesData.countries.map((country, index) => (
<li key={index}>{country}</li>
))}
</ul>
## Adding a New Country
To add a new country to the list, you'd have to edit files in
[@superset-ui/legacy-plugin-chart-country-map](https://github.com/apache/superset/tree/master/superset-frontend/plugins/legacy-plugin-chart-country-map).
1. Generate a new GeoJSON file for your country following the guide in [this Jupyter notebook](https://github.com/apache/superset/blob/master/superset-frontend/plugins/legacy-plugin-chart-country-map/scripts/Country%20Map%20GeoJSON%20Generator.ipynb).
2. Edit the countries list in [legacy-plugin-chart-country-map/src/countries.ts](https://github.com/apache/superset/blob/master/superset-frontend/plugins/legacy-plugin-chart-country-map/src/countries.ts).
3. Install superset-frontend dependencies: `cd superset-frontend && npm install`
4. Verify your countries in Superset plugins storybook: `npm run plugins:storybook`.
5. Build and install Superset from source code.

File diff suppressed because it is too large Load Diff

View File

@@ -1,62 +0,0 @@
---
title: Event Logging
sidebar_position: 9
version: 1
---
# Logging
## Event Logging
Superset by default logs special action events in its internal database (DBEventLogger). These logs can be accessed
on the UI by navigating to **Security > Action Log**. You can freely customize these logs by
implementing your own event log class.
**When custom log class is enabled DBEventLogger is disabled and logs
stop being populated in UI logs view.**
To achieve both, custom log class should extend built-in DBEventLogger log class.
Here's an example of a simple JSON-to-stdout class:
```python
def log(self, user_id, action, *args, **kwargs):
records = kwargs.get('records', list())
dashboard_id = kwargs.get('dashboard_id')
slice_id = kwargs.get('slice_id')
duration_ms = kwargs.get('duration_ms')
referrer = kwargs.get('referrer')
for record in records:
log = dict(
action=action,
json=record,
dashboard_id=dashboard_id,
slice_id=slice_id,
duration_ms=duration_ms,
referrer=referrer,
user_id=user_id
)
print(json.dumps(log))
```
End by updating your config to pass in an instance of the logger you want to use:
```
EVENT_LOGGER = JSONStdOutEventLogger()
```
## StatsD Logging
Superset can be configured to log events to [StatsD](https://github.com/statsd/statsd)
if desired. Most endpoints hit are logged as
well as key events like query start and end in SQL Lab.
To setup StatsD logging, its a matter of configuring the logger in your `superset_config.py`.
If not already present, you need to ensure that the `statsd`-package is installed in Superset's python environment.
```python
from superset.stats_logger import StatsdStatsLogger
STATS_LOGGER = StatsdStatsLogger(host='localhost', port=8125, prefix='superset')
```
Note that its also possible to implement your own logger by deriving
`superset.stats_logger.BaseStatsLogger`.

View File

@@ -1,126 +0,0 @@
---
title: Importing and Exporting Datasources
hide_title: true
sidebar_position: 11
version: 1
---
# Importing and Exporting Datasources
The superset cli allows you to import and export datasources from and to YAML. Datasources include
databases. The data is expected to be organized in the following hierarchy:
```text
├──databases
| ├──database_1
| | ├──table_1
| | | ├──columns
| | | | ├──column_1
| | | | ├──column_2
| | | | └──... (more columns)
| | | └──metrics
| | | ├──metric_1
| | | ├──metric_2
| | | └──... (more metrics)
| | └── ... (more tables)
| └── ... (more databases)
```
## Exporting Datasources to YAML
You can print your current datasources to stdout by running:
```bash
superset export_datasources
```
To save your datasources to a ZIP file run:
```bash
superset export_datasources -f <filename>
```
By default, default (null) values will be omitted. Use the -d flag to include them. If you want back
references to be included (e.g. a column to include the table id it belongs to) use the -b flag.
Alternatively, you can export datasources using the UI:
1. Open **Sources -> Databases** to export all tables associated to a single or multiple databases.
(**Tables** for one or more tables)
2. Select the items you would like to export.
3. Click **Actions -> Export** to YAML
4. If you want to import an item that you exported through the UI, you will need to nest it inside
its parent element, e.g. a database needs to be nested under databases a table needs to be nested
inside a database element.
In order to obtain an **exhaustive list of all fields** you can import using the YAML import run:
```bash
superset export_datasource_schema
```
As a reminder, you can use the `-b` flag to include back references.
## Importing Datasources
In order to import datasources from a ZIP file, run:
```bash
superset import_datasources -p <path / filename>
```
The optional username flag **-u** sets the user used for the datasource import. The default is 'admin'. Example:
```bash
superset import_datasources -p <path / filename> -u 'admin'
```
## Legacy Importing Datasources
### From older versions of Superset to current version
When using Superset version 4.x.x to import from an older version (2.x.x or 3.x.x) importing is supported as the command `legacy_import_datasources` and expects a JSON or directory of JSONs. The options are `-r` for recursive and `-u` for specifying a user. Example of legacy import without options:
```bash
superset legacy_import_datasources -p <path or filename>
```
### From older versions of Superset to older versions
When using an older Superset version (2.x.x & 3.x.x) of Superset, the command is `import_datasources`. ZIP and YAML files are supported and to switch between them the feature flag `VERSIONED_EXPORT` is used. When `VERSIONED_EXPORT` is `True`, `import_datasources` expects a ZIP file, otherwise YAML. Example:
```bash
superset import_datasources -p <path or filename>
```
When `VERSIONED_EXPORT` is `False`, if you supply a path all files ending with **yaml** or **yml** will be parsed. You can apply
additional flags (e.g. to search the supplied path recursively):
```bash
superset import_datasources -p <path> -r
```
The sync flag **-s** takes parameters in order to sync the supplied elements with your file. Be
careful this can delete the contents of your meta database. Example:
```bash
superset import_datasources -p <path / filename> -s columns,metrics
```
This will sync all metrics and columns for all datasources found in the `<path /filename>` in the
Superset meta database. This means columns and metrics not specified in YAML will be deleted. If you
would add tables to columns,metrics those would be synchronised as well.
If you dont supply the sync flag (**-s**) importing will only add and update (override) fields.
E.g. you can add a verbose_name to the column ds in the table random_time_series from the example
datasets by saving the following YAML to file and then running the **import_datasources** command.
```yaml
databases:
- database_name: main
tables:
- table_name: random_time_series
columns:
- column_name: ds
verbose_name: datetime
```

View File

@@ -1,78 +0,0 @@
---
title: Map Tiles
sidebar_position: 12
version: 1
---
# Map tiles
Superset uses OSM and Mapbox tiles by default. OSM is free but you still need setting your MAPBOX_API_KEY if you want to use mapbox maps.
## Setting map tiles
Map tiles can be set with `DECKGL_BASE_MAP` in your `superset_config.py` or `superset_config_docker.py`
For adding your own map tiles, you can use the following format.
```python
DECKGL_BASE_MAP = [
['tile://https://your_personal_url/{z}/{x}/{y}.png', 'MyTile']
]
```
Openstreetmap tiles url can be added without prefix.
```python
DECKGL_BASE_MAP = [
['https://c.tile.openstreetmap.org/{z}/{x}/{y}.png', 'OpenStreetMap']
]
```
Default values are:
```python
DECKGL_BASE_MAP = [
['https://tile.openstreetmap.org/{z}/{x}/{y}.png', 'Streets (OSM)'],
['https://tile.osm.ch/osm-swiss-style/{z}/{x}/{y}.png', 'Topography (OSM)'],
['mapbox://styles/mapbox/streets-v9', 'Streets'],
['mapbox://styles/mapbox/dark-v9', 'Dark'],
['mapbox://styles/mapbox/light-v9', 'Light'],
['mapbox://styles/mapbox/satellite-streets-v9', 'Satellite Streets'],
['mapbox://styles/mapbox/satellite-v9', 'Satellite'],
['mapbox://styles/mapbox/outdoors-v9', 'Outdoors'],
]
```
It is possible to set only mapbox by removing osm tiles and other way around.
:::warning
Setting `DECKGL_BASE_MAP` overwrite default values
:::
After defining your map tiles, set them in these variables:
- `CORS_OPTIONS`
- `connect-src` of `TALISMAN_CONFIG` and `TALISMAN_CONFIG_DEV` variables.
```python
ENABLE_CORS = True
CORS_OPTIONS: dict[Any, Any] = {
"origins": [
"https://tile.openstreetmap.org",
"https://tile.osm.ch",
"https://your_personal_url/{z}/{x}/{y}.png",
]
}
.
.
TALISMAN_CONFIG = {
"content_security_policy": {
...
"connect-src": [
"'self'",
"https://api.mapbox.com",
"https://events.mapbox.com",
"https://tile.openstreetmap.org",
"https://tile.osm.ch",
"https://your_personal_url/{z}/{x}/{y}.png",
],
...
}
```

View File

@@ -1,141 +0,0 @@
---
title: Network and Security Settings
sidebar_position: 7
version: 1
---
# Network and Security Settings
## CORS
:::note
In Superset versions prior to `5.x` you have to install to install `flask-cors` with `pip install flask-cors` to enable CORS support.
:::
The following keys in `superset_config.py` can be specified to configure CORS:
- `ENABLE_CORS`: Must be set to `True` in order to enable CORS
- `CORS_OPTIONS`: options passed to Flask-CORS
([documentation](https://flask-cors.readthedocs.io/en/latest/api.html#extension))
## HTTP headers
Note that Superset bundles [flask-talisman](https://pypi.org/project/talisman/)
Self-described as a small Flask extension that handles setting HTTP headers that can help
protect against a few common web application security issues.
## HTML Embedding of Dashboards and Charts
There are two ways to embed a dashboard: Using the [SDK](https://www.npmjs.com/package/@superset-ui/embedded-sdk) or embedding a direct link. Note that in the latter case everybody who knows the link is able to access the dashboard.
### Embedding a Public Direct Link to a Dashboard
This works by first changing the content security policy (CSP) of [flask-talisman](https://github.com/GoogleCloudPlatform/flask-talisman) to allow for certain domains to display Superset content. Then a dashboard can be made publicly accessible, i.e. **bypassing authentication**. Once made public, the dashboard's URL can be added to an iframe in another website's HTML code.
#### Changing flask-talisman CSP
Add to `superset_config.py` the entire `TALISMAN_CONFIG` section from `config.py` and include a `frame-ancestors` section:
```python
TALISMAN_ENABLED = True
TALISMAN_CONFIG = {
"content_security_policy": {
...
"frame-ancestors": ["*.my-domain.com", "*.another-domain.com"],
...
```
Restart Superset for this configuration change to take effect.
#### Making a Dashboard Public
1. Add the `'DASHBOARD_RBAC': True` [Feature Flag](https://github.com/apache/superset/blob/master/RESOURCES/FEATURE_FLAGS.md) to `superset_config.py`
2. Add the `Public` role to your dashboard as described [here](https://superset.apache.org/docs/using-superset/creating-your-first-dashboard/#manage-access-to-dashboards)
#### Embedding a Public Dashboard
Now anybody can directly access the dashboard's URL. You can embed it in an iframe like so:
```html
<iframe
width="600"
height="400"
seamless
frameBorder="0"
scrolling="no"
src="https://superset.my-domain.com/superset/dashboard/10/?standalone=1&height=400"
>
</iframe>
```
#### Embedding a Chart
A chart's embed code can be generated by going to a chart's edit view and then clicking at the top right on `...` > `Share` > `Embed code`
### Enabling Embedding via the SDK
Clicking on `...` next to `EDIT DASHBOARD` on the top right of the dashboard's overview page should yield a drop-down menu including the entry "Embed dashboard".
To enable this entry, add the following line to the `.env` file:
```text
SUPERSET_FEATURE_EMBEDDED_SUPERSET=true
```
## CSRF settings
Similarly, [flask-wtf](https://flask-wtf.readthedocs.io/en/0.15.x/config/) is used to manage
some CSRF configurations. If you need to exempt endpoints from CSRF (e.g. if you are
running a custom auth postback endpoint), you can add the endpoints to `WTF_CSRF_EXEMPT_LIST`:
## SSH Tunneling
1. Turn on feature flag
- Change [`SSH_TUNNELING`](https://github.com/apache/superset/blob/eb8386e3f0647df6d1bbde8b42073850796cc16f/superset/config.py#L489) to `True`
- If you want to add more security when establishing the tunnel we allow users to overwrite the `SSHTunnelManager` class [here](https://github.com/apache/superset/blob/eb8386e3f0647df6d1bbde8b42073850796cc16f/superset/config.py#L507)
- You can also set the [`SSH_TUNNEL_LOCAL_BIND_ADDRESS`](https://github.com/apache/superset/blob/eb8386e3f0647df6d1bbde8b42073850796cc16f/superset/config.py#L508) this the host address where the tunnel will be accessible on your VPC
2. Create database w/ ssh tunnel enabled
- With the feature flag enabled you should now see ssh tunnel toggle.
- Click the toggle to enable SSH tunneling and add your credentials accordingly.
- Superset allows for two different types of authentication (Basic + Private Key). These credentials should come from your service provider.
3. Verify data is flowing
- Once SSH tunneling has been enabled, go to SQL Lab and write a query to verify data is properly flowing.
## Domain Sharding
:::note
Domain Sharding is deprecated as of Superset 5.0.0, and will be removed in Superset 6.0.0. Please Enable HTTP2 to keep more open connections per domain.
:::
Chrome allows up to 6 open connections per domain at a time. When there are more than 6 slices in
dashboard, a lot of time fetch requests are queued up and wait for next available socket.
[PR 5039](https://github.com/apache/superset/pull/5039) adds domain sharding to Superset,
and this feature will be enabled by configuration only (by default Superset doesnt allow
cross-domain request).
Add the following setting in your `superset_config.py` file:
- `SUPERSET_WEBSERVER_DOMAINS`: list of allowed hostnames for domain sharding feature.
Please create your domain shards as subdomains of your main domain for authorization to
work properly on new domains. For Example:
- `SUPERSET_WEBSERVER_DOMAINS=['superset-1.mydomain.com','superset-2.mydomain.com','superset-3.mydomain.com','superset-4.mydomain.com']`
or add the following setting in your `superset_config.py` file if domain shards are not subdomains of main domain.
- `SESSION_COOKIE_DOMAIN = '.mydomain.com'`
## Middleware
Superset allows you to add your own middleware. To add your own middleware, update the
`ADDITIONAL_MIDDLEWARE` key in your `superset_config.py`. `ADDITIONAL_MIDDLEWARE` should be a list
of your additional middleware classes.
For example, to use `AUTH_REMOTE_USER` from behind a proxy server like nginx, you have to add a
simple middleware class to add the value of `HTTP_X_PROXY_REMOTE_USER` (or any other custom header
from the proxy) to Gunicorns `REMOTE_USER` environment variable.

View File

@@ -1,535 +0,0 @@
---
title: SQL Templating
hide_title: true
sidebar_position: 5
version: 1
---
# SQL Templating
## Jinja Templates
SQL Lab and Explore supports [Jinja templating](https://jinja.palletsprojects.com/en/2.11.x/) in queries.
To enable templating, the `ENABLE_TEMPLATE_PROCESSING` [feature flag](/docs/configuration/configuring-superset#feature-flags) needs to be enabled in
`superset_config.py`. When templating is enabled, python code can be embedded in virtual datasets and
in Custom SQL in the filter and metric controls in Explore. By default, the following variables are
made available in the Jinja context:
- `columns`: columns which to group by in the query
- `filter`: filters applied in the query
- `from_dttm`: start `datetime` value from the selected time range (`None` if undefined) (deprecated beginning in version 5.0, use `get_time_filter` instead)
- `to_dttm`: end `datetime` value from the selected time range (`None` if undefined). (deprecated beginning in version 5.0, use `get_time_filter` instead)
- `groupby`: columns which to group by in the query (deprecated)
- `metrics`: aggregate expressions in the query
- `row_limit`: row limit of the query
- `row_offset`: row offset of the query
- `table_columns`: columns available in the dataset
- `time_column`: temporal column of the query (`None` if undefined)
- `time_grain`: selected time grain (`None` if undefined)
For example, to add a time range to a virtual dataset, you can write the following:
```sql
SELECT *
FROM tbl
WHERE dttm_col > '{{ from_dttm }}' and dttm_col < '{{ to_dttm }}'
```
You can also use [Jinja's logic](https://jinja.palletsprojects.com/en/2.11.x/templates/#tests)
to make your query robust to clearing the timerange filter:
```sql
SELECT *
FROM tbl
WHERE (
{% if from_dttm is not none %}
dttm_col > '{{ from_dttm }}' AND
{% endif %}
{% if to_dttm is not none %}
dttm_col < '{{ to_dttm }}' AND
{% endif %}
1 = 1
)
```
The `1 = 1` at the end ensures a value is present for the `WHERE` clause even when
the time filter is not set. For many database engines, this could be replaced with `true`.
Note that the Jinja parameters are called within _double_ brackets in the query and with
_single_ brackets in the logic blocks.
To add custom functionality to the Jinja context, you need to overload the default Jinja
context in your environment by defining the `JINJA_CONTEXT_ADDONS` in your superset configuration
(`superset_config.py`). Objects referenced in this dictionary are made available for users to use
where the Jinja context is made available.
```python
JINJA_CONTEXT_ADDONS = {
'my_crazy_macro': lambda x: x*2,
}
```
Default values for jinja templates can be specified via `Parameters` menu in the SQL Lab user interface.
In the UI you can assign a set of parameters as JSON
```json
{
"my_table": "foo"
}
```
The parameters become available in your SQL (example: `SELECT * FROM {{ my_table }}` ) by using Jinja templating syntax.
SQL Lab template parameters are stored with the dataset as `TEMPLATE PARAMETERS`.
There is a special ``_filters`` parameter which can be used to test filters used in the jinja template.
```json
{
"_filters": [
{
"col": "action_type",
"op": "IN",
"val": ["sell", "buy"]
}
]
}
```
```sql
SELECT action, count(*) as times
FROM logs
WHERE action in {{ filter_values('action_type')|where_in }}
GROUP BY action
```
Note ``_filters`` is not stored with the dataset. It's only used within the SQL Lab UI.
Besides default Jinja templating, SQL lab also supports self-defined template processor by setting
the `CUSTOM_TEMPLATE_PROCESSORS` in your superset configuration. The values in this dictionary
overwrite the default Jinja template processors of the specified database engine. The example below
configures a custom presto template processor which implements its own logic of processing macro
template with regex parsing. It uses the `$` style macro instead of `{{ }}` style in Jinja
templating.
By configuring it with `CUSTOM_TEMPLATE_PROCESSORS`, a SQL template on a presto database is
processed by the custom one rather than the default one.
```python
def DATE(
ts: datetime, day_offset: SupportsInt = 0, hour_offset: SupportsInt = 0
) -> str:
"""Current day as a string."""
day_offset, hour_offset = int(day_offset), int(hour_offset)
offset_day = (ts + timedelta(days=day_offset, hours=hour_offset)).date()
return str(offset_day)
class CustomPrestoTemplateProcessor(PrestoTemplateProcessor):
"""A custom presto template processor."""
engine = "presto"
def process_template(self, sql: str, **kwargs) -> str:
"""Processes a sql template with $ style macro using regex."""
# Add custom macros functions.
macros = {
"DATE": partial(DATE, datetime.utcnow())
} # type: Dict[str, Any]
# Update with macros defined in context and kwargs.
macros.update(self.context)
macros.update(kwargs)
def replacer(match):
"""Expand $ style macros with corresponding function calls."""
macro_name, args_str = match.groups()
args = [a.strip() for a in args_str.split(",")]
if args == [""]:
args = []
f = macros[macro_name[1:]]
return f(*args)
macro_names = ["$" + name for name in macros.keys()]
pattern = r"(%s)\s*\(([^()]*)\)" % "|".join(map(re.escape, macro_names))
return re.sub(pattern, replacer, sql)
CUSTOM_TEMPLATE_PROCESSORS = {
CustomPrestoTemplateProcessor.engine: CustomPrestoTemplateProcessor
}
```
SQL Lab also includes a live query validation feature with pluggable backends. You can configure
which validation implementation is used with which database engine by adding a block like the
following to your configuration file:
```python
FEATURE_FLAGS = {
'SQL_VALIDATORS_BY_ENGINE': {
'presto': 'PrestoDBSQLValidator',
}
}
```
The available validators and names can be found in
[sql_validators](https://github.com/apache/superset/tree/master/superset/sql_validators).
## Available Macros
In this section, we'll walkthrough the pre-defined Jinja macros in Superset.
**Current Username**
The `{{ current_username() }}` macro returns the `username` of the currently logged in user.
If you have caching enabled in your Superset configuration, then by default the `username` value will be used
by Superset when calculating the cache key. A cache key is a unique identifier that determines if there's a
cache hit in the future and Superset can retrieve cached data.
You can disable the inclusion of the `username` value in the calculation of the
cache key by adding the following parameter to your Jinja code:
```python
{{ current_username(add_to_cache_keys=False) }}
```
**Current User ID**
The `{{ current_user_id() }}` macro returns the account ID of the currently logged in user.
If you have caching enabled in your Superset configuration, then by default the account `id` value will be used
by Superset when calculating the cache key. A cache key is a unique identifier that determines if there's a
cache hit in the future and Superset can retrieve cached data.
You can disable the inclusion of the account `id` value in the calculation of the
cache key by adding the following parameter to your Jinja code:
```python
{{ current_user_id(add_to_cache_keys=False) }}
```
**Current User Email**
The `{{ current_user_email() }}` macro returns the email address of the currently logged in user.
If you have caching enabled in your Superset configuration, then by default the email address value will be used
by Superset when calculating the cache key. A cache key is a unique identifier that determines if there's a
cache hit in the future and Superset can retrieve cached data.
You can disable the inclusion of the email value in the calculation of the
cache key by adding the following parameter to your Jinja code:
```python
{{ current_user_email(add_to_cache_keys=False) }}
```
**Current User Roles**
The `{{ current_user_roles() }}` macro returns an array of roles for the logged in user.
If you have caching enabled in your Superset configuration, then by default the roles value will be used
by Superset when calculating the cache key. A cache key is a unique identifier that determines if there's a
cache hit in the future and Superset can retrieve cached data.
You can disable the inclusion of the roles value in the calculation of the
cache key by adding the following parameter to your Jinja code:
```python
{{ current_user_roles(add_to_cache_keys=False) }}
```
You can json-stringify the array by adding `|tojson` to your Jinja code:
```python
{{ current_user_roles()|tojson }}
```
You can use the `|where_in` filter to use your roles in a SQL statement. For example, if `current_user_roles()` returns `['admin', 'viewer']`, the following template:
```python
SELECT * FROM users WHERE role IN {{ current_user_roles()|where_in }}
```
Will be rendered as:
```sql
SELECT * FROM users WHERE role IN ('admin', 'viewer')
```
**Current User RLS Rules**
The `{{ current_user_rls_rules() }}` macro returns an array of RLS rules applied to the current dataset for the logged in user.
If you have caching enabled in your Superset configuration, then the list of RLS Rules will be used
by Superset when calculating the cache key. A cache key is a unique identifier that determines if there's a
cache hit in the future and Superset can retrieve cached data.
**Custom URL Parameters**
The `{{ url_param('custom_variable') }}` macro lets you define arbitrary URL
parameters and reference them in your SQL code.
Here's a concrete example:
- You write the following query in SQL Lab:
```sql
SELECT count(*)
FROM ORDERS
WHERE country_code = '{{ url_param('countrycode') }}'
```
- You're hosting Superset at the domain www.example.com and you send your
coworker in Spain the following SQL Lab URL `www.example.com/superset/sqllab?countrycode=ES`
and your coworker in the USA the following SQL Lab URL `www.example.com/superset/sqllab?countrycode=US`
- For your coworker in Spain, the SQL Lab query will be rendered as:
```sql
SELECT count(*)
FROM ORDERS
WHERE country_code = 'ES'
```
- For your coworker in the USA, the SQL Lab query will be rendered as:
```sql
SELECT count(*)
FROM ORDERS
WHERE country_code = 'US'
```
**Explicitly Including Values in Cache Key**
The `{{ cache_key_wrapper() }}` function explicitly instructs Superset to add a value to the
accumulated list of values used in the calculation of the cache key.
This function is only needed when you want to wrap your own custom function return values
in the cache key. You can gain more context
[here](https://github.com/apache/superset/blob/efd70077014cbed62e493372d33a2af5237eaadf/superset/jinja_context.py#L133-L148).
Note that this function powers the caching of the `user_id` and `username` values
in the `current_user_id()` and `current_username()` function calls (if you have caching enabled).
**Filter Values**
You can retrieve the value for a specific filter as a list using `{{ filter_values() }}`.
This is useful if:
- You want to use a filter component to filter a query where the name of filter component column doesn't match the one in the select statement
- You want to have the ability to filter inside the main query for performance purposes
Here's a concrete example:
```sql
SELECT action, count(*) as times
FROM logs
WHERE
action in {{ filter_values('action_type')|where_in }}
GROUP BY action
```
There `where_in` filter converts the list of values from `filter_values('action_type')` into a string suitable for an `IN` expression.
**Filters for a Specific Column**
The `{{ get_filters() }}` macro returns the filters applied to a given column. In addition to
returning the values (similar to how `filter_values()` does), the `get_filters()` macro
returns the operator specified in the Explore UI.
This is useful if:
- You want to handle more than the IN operator in your SQL clause
- You want to handle generating custom SQL conditions for a filter
- You want to have the ability to filter inside the main query for speed purposes
Here's a concrete example:
```sql
WITH RECURSIVE
superiors(employee_id, manager_id, full_name, level, lineage) AS (
SELECT
employee_id,
manager_id,
full_name,
1 as level,
employee_id as lineage
FROM
employees
WHERE
1=1
{# Render a blank line #}
{%- for filter in get_filters('full_name', remove_filter=True) -%}
{%- if filter.get('op') == 'IN' -%}
AND
full_name IN {{ filter.get('val')|where_in }}
{%- endif -%}
{%- if filter.get('op') == 'LIKE' -%}
AND
full_name LIKE {{ "'" + filter.get('val') + "'" }}
{%- endif -%}
{%- endfor -%}
UNION ALL
SELECT
e.employee_id,
e.manager_id,
e.full_name,
s.level + 1 as level,
s.lineage
FROM
employees e,
superiors s
WHERE s.manager_id = e.employee_id
)
SELECT
employee_id, manager_id, full_name, level, lineage
FROM
superiors
order by lineage, level
```
**Time Filter**
The `{{ get_time_filter() }}` macro returns the time filter applied to a specific column. This is useful if you want
to handle time filters inside the virtual dataset, as by default the time filter is placed on the outer query. This can
considerably improve performance, as many databases and query engines are able to optimize the query better
if the temporal filter is placed on the inner query, as opposed to the outer query.
The macro takes the following parameters:
- `column`: Name of the temporal column. Leave undefined to reference the time range from a Dashboard Native Time Range
filter (when present).
- `default`: The default value to fall back to if the time filter is not present, or has the value `No filter`
- `target_type`: The target temporal type as recognized by the target database (e.g. `TIMESTAMP`, `DATE` or
`DATETIME`). If `column` is defined, the format will default to the type of the column. This is used to produce
the format of the `from_expr` and `to_expr` properties of the returned `TimeFilter` object.
- `strftime`: format using the `strftime` method of `datetime` for custom time formatting.
([see docs for valid format codes](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes)).
When defined `target_type` will be ignored.
- `remove_filter`: When set to true, mark the filter as processed, removing it from the outer query. Useful when a
filter should only apply to the inner query.
The return type has the following properties:
- `from_expr`: the start of the time filter (if any)
- `to_expr`: the end of the time filter (if any)
- `time_range`: The applied time range
Here's a concrete example using the `logs` table from the Superset metastore:
```
{% set time_filter = get_time_filter("dttm", remove_filter=True) %}
{% set from_expr = time_filter.from_expr %}
{% set to_expr = time_filter.to_expr %}
{% set time_range = time_filter.time_range %}
SELECT
*,
'{{ time_range }}' as time_range
FROM logs
{% if from_expr or to_expr %}WHERE 1 = 1
{% if from_expr %}AND dttm >= {{ from_expr }}{% endif %}
{% if to_expr %}AND dttm < {{ to_expr }}{% endif %}
{% endif %}
```
Assuming we are creating a table chart with a simple `COUNT(*)` as the metric with a time filter `Last week` on the
`dttm` column, this would render the following query on Postgres (note the formatting of the temporal filters, and
the absence of time filters on the outer query):
```
SELECT COUNT(*) AS count
FROM
(SELECT *,
'Last week' AS time_range
FROM public.logs
WHERE 1 = 1
AND dttm >= TO_TIMESTAMP('2024-08-27 00:00:00.000000', 'YYYY-MM-DD HH24:MI:SS.US')
AND dttm < TO_TIMESTAMP('2024-09-03 00:00:00.000000', 'YYYY-MM-DD HH24:MI:SS.US')) AS virtual_table
ORDER BY count DESC
LIMIT 1000;
```
When using the `default` parameter, the templated query can be simplified, as the endpoints will always be defined
(to use a fixed time range, you can also use something like `default="2024-08-27 : 2024-09-03"`)
```
{% set time_filter = get_time_filter("dttm", default="Last week", remove_filter=True) %}
SELECT
*,
'{{ time_filter.time_range }}' as time_range
FROM logs
WHERE
dttm >= {{ time_filter.from_expr }}
AND dttm < {{ time_filter.to_expr }}
```
**Datasets**
It's possible to query physical and virtual datasets using the `dataset` macro. This is useful if you've defined computed columns and metrics on your datasets, and want to reuse the definition in adhoc SQL Lab queries.
To use the macro, first you need to find the ID of the dataset. This can be done by going to the view showing all the datasets, hovering over the dataset you're interested in, and looking at its URL. For example, if the URL for a dataset is https://superset.example.org/explore/?dataset_type=table&dataset_id=42 its ID is 42.
Once you have the ID you can query it as if it were a table:
```sql
SELECT * FROM {{ dataset(42) }} LIMIT 10
```
If you want to select the metric definitions as well, in addition to the columns, you need to pass an additional keyword argument:
```sql
SELECT * FROM {{ dataset(42, include_metrics=True) }} LIMIT 10
```
Since metrics are aggregations, the resulting SQL expression will be grouped by all non-metric columns. You can specify a subset of columns to group by instead:
```sql
SELECT * FROM {{ dataset(42, include_metrics=True, columns=["ds", "category"]) }} LIMIT 10
```
**Metrics**
The `{{ metric('metric_key', dataset_id) }}` macro can be used to retrieve the metric SQL syntax from a dataset. This can be useful for different purposes:
- Override the metric label in the chart level
- Combine multiple metrics in a calculation
- Retrieve a metric syntax in SQL lab
- Re-use metrics across datasets
This macro avoids copy/paste, allowing users to centralize the metric definition in the dataset layer.
The `dataset_id` parameter is optional, and if not provided Superset will use the current dataset from context (for example, when using this macro in the Chart Builder, by default the `macro_key` will be searched in the dataset powering the chart).
The parameter can be used in SQL Lab, or when fetching a metric from another dataset.
## Available Filters
Superset supports [builtin filters from the Jinja2 templating package](https://jinja.palletsprojects.com/en/stable/templates/#builtin-filters). Custom filters have also been implemented:
**Where In**
Parses a list into a SQL-compatible statement. This is useful with macros that return an array (for example the `filter_values` macro):
```
Dashboard filter with "First", "Second" and "Third" options selected
{{ filter_values('column') }} => ["First", "Second", "Third"]
{{ filter_values('column')|where_in }} => ('First', 'Second', 'Third')
```
By default, this filter returns `()` (as a string) in case the value is null. The `default_to_none` parameter can be se to `True` to return null in this case:
```
Dashboard filter without any value applied
{{ filter_values('column') }} => ()
{{ filter_values('column')|where_in(default_to_none=True) }} => None
```
**To Datetime**
Loads a string as a `datetime` object. This is useful when performing date operations. For example:
```
{% set from_expr = get_time_filter("dttm", strftime="%Y-%m-%d").from_expr %}
{% set to_expr = get_time_filter("dttm", strftime="%Y-%m-%d").to_expr %}
{% if (to_expr|to_datetime(format="%Y-%m-%d") - from_expr|to_datetime(format="%Y-%m-%d")).days > 100 %}
do something
{% else %}
do something else
{% endif %}
```

View File

@@ -1,192 +0,0 @@
---
title: Theming
hide_title: true
sidebar_position: 12
version: 1
---
# Theming Superset
:::note
apache-superset>=6.0
:::
Superset now rides on **Ant Design v5's token-based theming**.
Every Antd token works, plus a handful of Superset-specific ones for charts and dashboard chrome.
## Managing Themes via UI
Superset includes a built-in **Theme Management** interface accessible from the admin menu under **Settings > Themes**.
### Creating a New Theme
1. Navigate to **Settings > Themes** in the Superset interface
2. Click **+ Theme** to create a new theme
3. Use the [Ant Design Theme Editor](https://ant.design/theme-editor) to design your theme:
- Design your palette, typography, and component overrides
- Open the `CONFIG` modal and copy the JSON configuration
4. Paste the JSON into the theme definition field in Superset
5. Give your theme a descriptive name and save
You can also extend with Superset-specific tokens (documented in the default theme object) before you import.
### System Theme Administration
When `ENABLE_UI_THEME_ADMINISTRATION = True` is configured, administrators can manage system-wide themes directly from the UI:
#### Setting System Themes
- **System Default Theme**: Click the sun icon on any theme to set it as the system-wide default
- **System Dark Theme**: Click the moon icon on any theme to set it as the system dark mode theme
- **Automatic OS Detection**: When both default and dark themes are set, Superset automatically detects and applies the appropriate theme based on OS preferences
#### Managing System Themes
- System themes are indicated with special badges in the theme list
- Only administrators with write permissions can modify system theme settings
- Removing a system theme designation reverts to configuration file defaults
### Applying Themes to Dashboards
Once created, themes can be applied to individual dashboards:
- Edit any dashboard and select your custom theme from the theme dropdown
- Each dashboard can have its own theme, allowing for branded or context-specific styling
## Configuration Options
### Python Configuration
Configure theme behavior via `superset_config.py`:
```python
# Enable UI-based theme administration for admins
ENABLE_UI_THEME_ADMINISTRATION = True
# Optional: Set initial default themes via configuration
# These can be overridden via the UI when ENABLE_UI_THEME_ADMINISTRATION = True
THEME_DEFAULT = {
"token": {
"colorPrimary": "#2893B3",
"colorSuccess": "#5ac189",
# ... your theme JSON configuration
}
}
# Optional: Dark theme configuration
THEME_DARK = {
"algorithm": "dark",
"token": {
"colorPrimary": "#2893B3",
# ... your dark theme overrides
}
}
# To force a single theme on all users, set THEME_DARK = None
# When both themes are defined (via UI or config):
# - Users can manually switch between themes
# - OS preference detection is automatically enabled
```
### Migration from Configuration to UI
When `ENABLE_UI_THEME_ADMINISTRATION = True`:
1. System themes set via the UI take precedence over configuration file settings
2. The UI shows which themes are currently set as system defaults
3. Administrators can change system themes without restarting Superset
4. Configuration file themes serve as fallbacks when no UI themes are set
### Copying Themes Between Systems
To export a theme for use in configuration files or another instance:
1. Navigate to **Settings > Themes** and click the export icon on your desired theme
2. Extract the JSON configuration from the exported YAML file
3. Use this JSON in your `superset_config.py` or import it into another Superset instance
## Theme Development Workflow
1. **Design**: Use the [Ant Design Theme Editor](https://ant.design/theme-editor) to iterate on your design
2. **Test**: Create themes in Superset's CRUD interface for testing
3. **Apply**: Assign themes to specific dashboards or configure instance-wide
4. **Iterate**: Modify theme JSON directly in the CRUD interface or re-import from the theme editor
## Custom Fonts
Superset supports custom fonts through runtime configuration, allowing you to use branded or custom typefaces without rebuilding the application.
### Configuring Custom Fonts
Add font URLs to your `superset_config.py`:
```python
# Load fonts from Google Fonts, Adobe Fonts, or self-hosted sources
CUSTOM_FONT_URLS = [
"https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600;700&display=swap",
"https://fonts.googleapis.com/css2?family=JetBrains+Mono:wght@400;500&display=swap",
]
# Update CSP to allow font sources
TALISMAN_CONFIG = {
"content_security_policy": {
"font-src": ["'self'", "https://fonts.googleapis.com", "https://fonts.gstatic.com"],
"style-src": ["'self'", "'unsafe-inline'", "https://fonts.googleapis.com"],
}
}
```
### Using Custom Fonts in Themes
Once configured, reference the fonts in your theme configuration:
```python
THEME_DEFAULT = {
"token": {
"fontFamily": "Inter, -apple-system, BlinkMacSystemFont, sans-serif",
"fontFamilyCode": "JetBrains Mono, Monaco, monospace",
# ... other theme tokens
}
}
```
Or in the CRUD interface theme JSON:
```json
{
"token": {
"fontFamily": "Inter, -apple-system, BlinkMacSystemFont, sans-serif",
"fontFamilyCode": "JetBrains Mono, Monaco, monospace"
}
}
```
### Font Sources
- **Google Fonts**: Free, CDN-hosted fonts with wide variety
- **Adobe Fonts**: Premium fonts (requires subscription and kit ID)
- **Self-hosted**: Place font files in `/static/assets/fonts/` and reference via CSS
This feature works with the stock Docker image - no custom build required!
## Advanced Features
- **System Themes**: Manage system-wide default and dark themes via UI or configuration
- **Per-Dashboard Theming**: Each dashboard can have its own visual identity
- **JSON Editor**: Edit theme configurations directly within Superset's interface
- **Custom Fonts**: Load external fonts via configuration without rebuilding
- **OS Dark Mode Detection**: Automatically switches themes based on system preferences
- **Theme Import/Export**: Share themes between instances via YAML files
## API Access
For programmatic theme management, Superset provides REST endpoints:
- `GET /api/v1/theme/` - List all themes
- `POST /api/v1/theme/` - Create a new theme
- `PUT /api/v1/theme/{id}` - Update a theme
- `DELETE /api/v1/theme/{id}` - Delete a theme
- `PUT /api/v1/theme/{id}/set_system_default` - Set as system default theme (admin only)
- `PUT /api/v1/theme/{id}/set_system_dark` - Set as system dark theme (admin only)
- `DELETE /api/v1/theme/unset_system_default` - Remove system default designation
- `DELETE /api/v1/theme/unset_system_dark` - Remove system dark designation
- `GET /api/v1/theme/export/` - Export themes as YAML
- `POST /api/v1/theme/import/` - Import themes from YAML
These endpoints require appropriate permissions and are subject to RBAC controls.

View File

@@ -1,50 +0,0 @@
---
title: Timezones
hide_title: true
sidebar_position: 6
version: 1
---
# Timezones
There are four distinct timezone components which relate to Apache Superset,
1. The timezone that the underlying data is encoded in.
2. The timezone of the database engine.
3. The timezone of the Apache Superset backend.
4. The timezone of the Apache Superset client.
where if a temporal field (`DATETIME`, `TIME`, `TIMESTAMP`, etc.) does not explicitly define a timezone it defaults to the underlying timezone of the component.
To help make the problem somewhat tractable—given that Apache Superset has no control on either how the data is ingested (1) or the timezone of the client (4)—from a consistency standpoint it is highly recommended that both (2) and (3) are configured to use the same timezone with a strong preference given to [UTC](https://en.wikipedia.org/wiki/Coordinated_Universal_Time) to ensure temporal fields without an explicit timestamp are not incorrectly coerced into the wrong timezone. Actually Apache Superset currently has implicit assumptions that timestamps are in UTC and thus configuring (3) to a non-UTC timezone could be problematic.
To strive for data consistency (regardless of the timezone of the client) the Apache Superset backend tries to ensure that any timestamp sent to the client has an explicit (or semi-explicit as in the case with [Epoch time](https://en.wikipedia.org/wiki/Unix_time) which is always in reference to UTC) timezone encoded within.
The challenge however lies with the slew of [database engines](/docs/configuration/databases#installing-drivers-in-docker-images) which Apache Superset supports and various inconsistencies between their [Python Database API (DB-API)](https://www.python.org/dev/peps/pep-0249/) implementations combined with the fact that we use [Pandas](https://pandas.pydata.org/) to read SQL into a DataFrame prior to serializing to JSON. Regrettably Pandas ignores the DB-API [type_code](https://www.python.org/dev/peps/pep-0249/#type-objects) relying by default on the underlying Python type returned by the DB-API. Currently only a subset of the supported database engines work correctly with Pandas, i.e., ensuring timestamps without an explicit timestamp are serializd to JSON with the server timezone, thus guaranteeing the client will display timestamps in a consistent manner irrespective of the client's timezone.
For example the following is a comparison of MySQL and Presto,
```python
import pandas as pd
from sqlalchemy import create_engine
pd.read_sql_query(
sql="SELECT TIMESTAMP('2022-01-01 00:00:00') AS ts",
con=create_engine("mysql://root@localhost:3360"),
).to_json()
pd.read_sql_query(
sql="SELECT TIMESTAMP '2022-01-01 00:00:00' AS ts",
con=create_engine("presto://localhost:8080"),
).to_json()
```
which outputs `{"ts":{"0":1640995200000}}` (which infers the UTC timezone per the Epoch time definition) and `{"ts":{"0":"2022-01-01 00:00:00.000"}}` (without an explicit timezone) respectively and thus are treated differently in JavaScript:
```js
new Date(1640995200000)
> Sat Jan 01 2022 13:00:00 GMT+1300 (New Zealand Daylight Time)
new Date("2022-01-01 00:00:00.000")
> Sat Jan 01 2022 00:00:00 GMT+1300 (New Zealand Daylight Time)
```

View File

@@ -1,138 +0,0 @@
---
title: Contributing to Superset
sidebar_position: 1
version: 1
---
# Contributing to Superset
Superset is an [Apache Software foundation](https://www.apache.org/theapacheway/index.html) project.
The core contributors (or committers) to Superset communicate primarily in the following channels (
which can be joined by anyone):
- [Mailing list](https://lists.apache.org/list.html?dev@superset.apache.org)
- [Apache Superset Slack community](http://bit.ly/join-superset-slack)
- [GitHub issues](https://github.com/apache/superset/issues)
- [GitHub pull requests](https://github.com/apache/superset/pulls)
- [GitHub discussions](https://github.com/apache/superset/discussions)
- [Superset Community Calendar](https://superset.apache.org/community)
More references:
- [Superset Wiki (code guidelines and additional resources)](https://github.com/apache/superset/wiki)
## Orientation
Here's a list of repositories that contain Superset-related packages:
- [apache/superset](https://github.com/apache/superset)
is the main repository containing the `apache_superset` Python package
distributed on
[pypi](https://pypi.org/project/apache_superset/). This repository
also includes Superset's main TypeScript/JavaScript bundles and react apps under
the [superset-frontend](https://github.com/apache/superset/tree/master/superset-frontend)
folder.
- [github.com/apache-superset](https://github.com/apache-superset) is the
GitHub organization under which we manage Superset-related
small tools, forks and Superset-related experimental ideas.
## Types of Contributions
### Report Bug
The best way to report a bug is to file an issue on GitHub. Please include:
- Your operating system name and version.
- Superset version.
- Detailed steps to reproduce the bug.
- Any details about your local setup that might be helpful in troubleshooting.
When posting Python stack traces, please quote them using
[Markdown blocks](https://help.github.com/articles/creating-and-highlighting-code-blocks/).
_Please note that feature requests opened as GitHub Issues will be moved to Discussions._
### Submit Ideas or Feature Requests
The best way is to start an ["Ideas" Discussion thread](https://github.com/apache/superset/discussions/categories/ideas) on GitHub:
- Explain in detail how it would work.
- Keep the scope as narrow as possible, to make it easier to implement.
- Remember that this is a volunteer-driven project, and that your contributions are as welcome as anyone's :)
To propose large features or major changes to codebase, and help usher in those changes, please create a **Superset Improvement Proposal (SIP)**. See template from [SIP-0](https://github.com/apache/superset/issues/5602)
### Fix Bugs
Look through the GitHub issues. Issues tagged with `#bug` are
open to whoever wants to implement them.
### Implement Features
Look through the GitHub issues. Issues tagged with
`#feature` are open to whoever wants to implement them.
### Improve Documentation
Superset could always use better documentation,
whether as part of the official Superset docs,
in docstrings, `docs/*.rst` or even on the web as blog posts or
articles. See [Documentation](/docs/contributing/howtos#contributing-to-documentation) for more details.
### Add Translations
If you are proficient in a non-English language, you can help translate
text strings from Superset's UI. You can jump into the existing
language dictionaries at
`superset/translations/<language_code>/LC_MESSAGES/messages.po`, or
even create a dictionary for a new language altogether.
See [Translating](howtos#contributing-translations) for more details.
### Ask Questions
There is a dedicated [`apache-superset` tag](https://stackoverflow.com/questions/tagged/apache-superset) on [StackOverflow](https://stackoverflow.com/). Please use it when asking questions.
## Types of Contributors
Following the project governance model of the Apache Software Foundation (ASF), Apache Superset has a specific set of contributor roles:
### PMC Member
A Project Management Committee (PMC) member is a person who has been elected by the PMC to help manage the project. PMC members are responsible for the overall health of the project, including community development, release management, and project governance. PMC members are also responsible for the technical direction of the project.
For more information about Apache Project PMCs, please refer to https://www.apache.org/foundation/governance/pmcs.html
### Committer
A committer is a person who has been elected by the PMC to have write access (commit access) to the code repository. They can modify the code, documentation, and website and accept contributions from others.
The official list of committers and PMC members can be found [here](https://projects.apache.org/committee.html?superset).
### Contributor
A contributor is a person who has contributed to the project in any way, including but not limited to code, tests, documentation, issues, and discussions.
> You can also review the Superset project's guidelines for PMC member promotion here: https://github.com/apache/superset/wiki/Guidelines-for-promoting-Superset-Committers-to-the-Superset-PMC
### Security Team
The security team is a selected subset of PMC members, committers and non-committers who are responsible for handling security issues.
New members of the security team are selected by the PMC members in a vote. You can request to be added to the team by sending a message to private@superset.apache.org. However, the team should be small and focused on solving security issues, so the requests will be evaluated on a case-by-case basis and the team size will be kept relatively small, limited to only actively security-focused contributors.
This security team must follow the [ASF vulnerability handling process](https://apache.org/security/committers.html#asf-project-security-for-committers).
Each new security issue is tracked as a JIRA ticket on the [ASF's JIRA Superset security project](https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=588&projectKey=SUPERSETSEC)
Security team members must:
- Have an [ICLA](https://www.apache.org/licenses/contributor-agreements.html) signed with Apache Software Foundation.
- Not reveal information about pending and unfixed security issues to anyone (including their employers) unless specifically authorised by the security team members, e.g., if the security team agrees that diagnosing and solving an issue requires the involvement of external experts.
A release manager, the contributor overseeing the release of a specific version of Apache Superset, is by default a member of the security team. However, they are not expected to be active in assessing, discussing, and fixing security issues.
Security team members should also follow these general expectations:
- Actively participate in assessing, discussing, fixing, and releasing security issues in Superset.
- Avoid discussing security fixes in public forums. Pull request (PR) descriptions should not contain any information about security issues. The corresponding JIRA ticket should contain a link to the PR.
- Security team members who contribute to a fix may be listed as remediation developers in the CVE report, along with their job affiliation (if they choose to include it).

File diff suppressed because it is too large Load Diff

View File

@@ -1,254 +0,0 @@
---
title: Guidelines
sidebar_position: 2
version: 1
---
## Pull Request Guidelines
A philosophy we would like to strongly encourage is
> Before creating a PR, create an issue.
The purpose is to separate problem from possible solutions.
**Bug fixes:** If youre only fixing a small bug, its fine to submit a pull request right away but we highly recommend filing an issue detailing what youre fixing. This is helpful in case we dont accept that specific fix but want to keep track of the issue. Please keep in mind that the project maintainers reserve the rights to accept or reject incoming PRs, so it is better to separate the issue and the code to fix it from each other. In some cases, project maintainers may request you to create a separate issue from PR before proceeding.
**Refactor:** For small refactors, it can be a standalone PR itself detailing what you are refactoring and why. If there are concerns, project maintainers may request you to create a `#SIP` for the PR before proceeding.
**Feature/Large changes:** If you intend to change the public API, or make any non-trivial changes to the implementation, we require you to file a new issue as `#SIP` (Superset Improvement Proposal). This lets us reach an agreement on your proposal before you put significant effort into it. You are welcome to submit a PR along with the SIP (sometimes necessary for demonstration), but we will not review/merge the code until the SIP is approved.
In general, small PRs are always easier to review than large PRs. The best practice is to break your work into smaller independent PRs and refer to the same issue. This will greatly reduce turnaround time.
If you wish to share your work which is not ready to merge yet, create a [Draft PR](https://github.blog/2019-02-14-introducing-draft-pull-requests/). This will enable maintainers and the CI runner to prioritize mature PR's.
Finally, never submit a PR that will put master branch in broken state. If the PR is part of multiple PRs to complete a large feature and cannot work on its own, you can create a feature branch and merge all related PRs into the feature branch before creating a PR from feature branch to master.
### Protocol
#### Authoring
- Fill in all sections of the PR template.
- Title the PR with one of the following semantic prefixes (inspired by [Karma](http://karma-runner.github.io/0.10/dev/git-commit-msg.html])):
- `feat` (new feature)
- `fix` (bug fix)
- `docs` (changes to the documentation)
- `style` (formatting, missing semi colons, etc; no application logic change)
- `refactor` (refactoring code)
- `test` (adding missing tests, refactoring tests; no application logic change)
- `chore` (updating tasks etc; no application logic change)
- `perf` (performance-related change)
- `build` (build tooling, Docker configuration change)
- `ci` (test runner, GitHub Actions workflow changes)
- `other` (changes that don't correspond to the above -- should be rare!)
- Examples:
- `feat: export charts as ZIP files`
- `perf(api): improve API info performance`
- `fix(chart-api): cached-indicator always shows value is cached`
- Add prefix `[WIP]` to title if not ready for review (WIP = work-in-progress). We recommend creating a PR with `[WIP]` first and remove it once you have passed CI test and read through your code changes at least once.
- If you believe your PR contributes a potentially breaking change, put a `!` after the semantic prefix but before the colon in the PR title, like so: `feat!: Added foo functionality to bar`
- **Screenshots/GIFs:** Changes to user interface require before/after screenshots, or GIF for interactions
- Recommended capture tools ([Kap](https://getkap.co/), [LICEcap](https://www.cockos.com/licecap/), [Skitch](https://download.cnet.com/Skitch/3000-13455_4-189876.html))
- If no screenshot is provided, the committers will mark the PR with `need:screenshot` label and will not review until screenshot is provided.
- **Dependencies:** Be careful about adding new dependency and avoid unnecessary dependencies.
- For Python, include it in `pyproject.toml` denoting any specific restrictions and
in `requirements.txt` pinned to a specific version which ensures that the application
build is deterministic.
- For TypeScript/JavaScript, include new libraries in `package.json`
- **Tests:** The pull request should include tests, either as doctests, unit tests, or both. Make sure to resolve all errors and test failures. See [Testing](/docs/contributing/howtos#testing) for how to run tests.
- **Documentation:** If the pull request adds functionality, the docs should be updated as part of the same PR.
- **CI:** Reviewers will not review the code until all CI tests are passed. Sometimes there can be flaky tests. You can close and open PR to re-run CI test. Please report if the issue persists. After the CI fix has been deployed to `master`, please rebase your PR.
- **Code coverage:** Please ensure that code coverage does not decrease.
- Remove `[WIP]` when ready for review. Please note that it may be merged soon after approved so please make sure the PR is ready to merge and do not expect more time for post-approval edits.
- If the PR was not ready for review and inactive for > 30 days, we will close it due to inactivity. The author is welcome to re-open and update.
#### Reviewing
- Use constructive tone when writing reviews.
- If there are changes required, state clearly what needs to be done before the PR can be approved.
- If you are asked to update your pull request with some changes there's no need to create a new one. Push your changes to the same branch.
- The committers reserve the right to reject any PR and in some cases may request the author to file an issue.
#### Test Environments
- Members of the Apache GitHub org can launch an ephemeral test environment directly on a pull request by creating a comment containing (only) the command `/testenv up`.
- Note that org membership must be public in order for this validation to function properly.
- Feature flags may be set for a test environment by specifying the flag name (prefixed with `FEATURE_`) and value after the command.
- Format: `/testenv up FEATURE_<feature flag name>=true|false`
- Example: `/testenv up FEATURE_DASHBOARD_NATIVE_FILTERS=true`
- Multiple feature flags may be set in single command, separated by whitespace
- A comment will be created by the workflow script with the address and login information for the ephemeral environment.
- Test environments may be created once the Docker build CI workflow for the PR has completed successfully.
- Test environments do not currently update automatically when new commits are added to a pull request.
- Test environments do not currently support async workers, though this is planned.
- Running test environments will be shutdown upon closing the pull request.
#### Merging
- At least one approval is required for merging a PR.
- PR is usually left open for at least 24 hours before merging.
- After the PR is merged, [close the corresponding issue(s)](https://help.github.com/articles/closing-issues-using-keywords/).
#### Post-merge Responsibility
- Project maintainers may contact the PR author if new issues are introduced by the PR.
- Project maintainers may revert your changes if a critical issue is found, such as breaking master branch CI.
## Managing Issues and PRs
To handle issues and PRs that are coming in, committers read issues/PRs and flag them with labels to categorize and help contributors spot where to take actions, as contributors usually have different expertises.
Triaging goals
- **For issues:** Categorize, screen issues, flag required actions from authors.
- **For PRs:** Categorize, flag required actions from authors. If PR is ready for review, flag required actions from reviewers.
First, add **Category labels (a.k.a. hash labels)**. Every issue/PR must have one hash label (except spam entry). Labels that begin with `#` defines issue/PR type:
| Label | for Issue | for PR |
| --------------- | ----------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
| `#bug` | Bug report | Bug fix |
| `#code-quality` | Describe problem with code, architecture or productivity | Refactor, tests, tooling |
| `#feature` | New feature request | New feature implementation |
| `#refine` | Propose improvement such as adjusting padding or refining UI style, excluding new features, bug fixes, and refactoring. | Implementation of improvement such as adjusting padding or refining UI style, excluding new features, bug fixes, and refactoring. |
| `#doc` | Documentation | Documentation |
| `#question` | Troubleshooting: Installation, Running locally, Ask how to do something. Can be changed to `#bug` later. | N/A |
| `#SIP` | Superset Improvement Proposal | N/A |
| `#ASF` | Tasks related to Apache Software Foundation policy | Tasks related to Apache Software Foundation policy |
Then add other types of labels as appropriate.
- **Descriptive labels (a.k.a. dot labels):** These labels that begin with `.` describe the details of the issue/PR, such as `.ui`, `.js`, `.install`, `.backend`, etc. Each issue/PR can have zero or more dot labels.
- **Need labels:** These labels have pattern `need:xxx`, which describe the work required to progress, such as `need:rebase`, `need:update`, `need:screenshot`.
- **Risk labels:** These labels have pattern `risk:xxx`, which describe the potential risk on adopting the work, such as `risk:db-migration`. The intention was to better understand the impact and create awareness for PRs that need more rigorous testing.
- **Status labels:** These labels describe the status (`abandoned`, `wontfix`, `cant-reproduce`, etc.) Issue/PRs that are rejected or closed without completion should have one or more status labels.
- **Version labels:** These have the pattern `vx.x` such as `v0.28`. Version labels on issues describe the version the bug was reported on. Version labels on PR describe the first release that will include the PR.
Committers may also update title to reflect the issue/PR content if the author-provided title is not descriptive enough.
If the PR passes CI tests and does not have any `need:` labels, it is ready for review, add label `review` and/or `design-review`.
If an issue/PR has been inactive for >=30 days, it will be closed. If it does not have any status label, add `inactive`.
When creating a PR, if you're aiming to have it included in a specific release, please tag it with the version label. For example, to have a PR considered for inclusion in Superset 1.1 use the label `v1.1`.
## Revert Guidelines
Reverting changes that are causing issues in the master branch is a normal and expected part of the development process. In an open source community, the ramifications of a change cannot always be fully understood. With that in mind, here are some considerations to keep in mind when considering a revert:
- **Availability of the PR author:** If the original PR author or the engineer who merged the code is highly available and can provide a fix in a reasonable time frame, this would counter-indicate reverting.
- **Severity of the issue:** How severe is the problem on master? Is it keeping the project from moving forward? Is there user impact? What percentage of users will experience a problem?
- **Size of the change being reverted:** Reverting a single small PR is a much lower-risk proposition than reverting a massive, multi-PR change.
- **Age of the change being reverted:** Reverting a recently-merged PR will be more acceptable than reverting an older PR. A bug discovered in an older PR is unlikely to be causing widespread serious issues.
- **Risk inherent in reverting:** Will the reversion break critical functionality? Is the medicine more dangerous than the disease?
- **Difficulty of crafting a fix:** In the case of issues with a clear solution, it may be preferable to implement and merge a fix rather than a revert.
Should you decide that reverting is desirable, it is the responsibility of the Contributor performing the revert to:
- **Contact the interested parties:** The PR's author and the engineer who merged the work should both be contacted and informed of the revert.
- **Provide concise reproduction steps:** Ensure that the issue can be clearly understood and duplicated by the original author of the PR.
- **Put the revert through code review:** The revert must be approved by another committer.
## Design Guidelines
### Capitalization guidelines
#### Sentence case
Use sentence-case capitalization for everything in the UI (except these \*\*).
Sentence case is predominantly lowercase. Capitalize only the initial character of the first word, and other words that require capitalization, like:
- **Proper nouns.** Objects in the product _are not_ considered proper nouns e.g. dashboards, charts, saved queries etc. Proprietary feature names eg. SQL Lab, Preset Manager _are_ considered proper nouns
- **Acronyms** (e.g. CSS, HTML)
- When referring to **UI labels that are themselves capitalized** from sentence case (e.g. page titles - Dashboards page, Charts page, Saved queries page, etc.)
- User input that is reflected in the UI. E.g. a user-named a dashboard tab
**Sentence case vs. Title case:**
Title case: "A Dog Takes a Walk in Paris"
Sentence case: "A dog takes a walk in Paris"
**Why sentence case?**
- Its generally accepted as the quickest to read
- Its the easiest form to distinguish between common and proper nouns
#### How to refer to UI elements
When writing about a UI element, use the same capitalization as used in the UI.
For example, if an input field is labeled “Name” then you refer to this as the “Name input field”. Similarly, if a button has the label “Save” in it, then it is correct to refer to the “Save button”.
Where a product page is titled “Settings”, you refer to this in writing as follows:
“Edit your personal information on the Settings page”.
Often a product page will have the same title as the objects it contains. In this case, refer to the page as it appears in the UI, and the objects as common nouns:
- Upload a dashboard on the Dashboards page
- Go to Dashboards
- View dashboard
- View all dashboards
- Upload CSS templates on the CSS templates page
- Queries that you save will appear on the Saved queries page
- Create custom queries in SQL Lab then create dashboards
#### \*\*Exceptions to sentence case
- Input labels, buttons and UI tabs are all caps
- User input values (e.g. column names, SQL Lab tab names) should be in their original case
## Programming Language Conventions
### Python
Parameters in the `config.py` (which are accessible via the Flask app.config dictionary) are
assumed to always be defined and thus should be accessed directly via,
```python
blueprints = app.config["BLUEPRINTS"]
```
rather than,
```python
blueprints = app.config.get("BLUEPRINTS")
```
or similar as the later will cause typing issues. The former is of type `List[Callable]`
whereas the later is of type `Optional[List[Callable]]`.
#### Typing / Types Hints
To ensure clarity, consistency, all readability, _all_ new functions should use
[type hints](https://docs.python.org/3/library/typing.html) and include a
docstring.
Note per [PEP-484](https://www.python.org/dev/peps/pep-0484/#exceptions) no
syntax for listing explicitly raised exceptions is proposed and thus the
recommendation is to put this information in a docstring, i.e.,
```python
import math
from typing import Union
def sqrt(x: Union[float, int]) -> Union[float, int]:
"""
Return the square root of x.
:param x: A number
:returns: The square root of the given number
:raises ValueError: If the number is negative
"""
return math.sqrt(x)
```
### TypeScript
TypeScript is fully supported and is the recommended language for writing all new frontend
components. When modifying existing functions/components, migrating to TypeScript is
appreciated, but not required. Examples of migrating functions/components to TypeScript can be
found in [#9162](https://github.com/apache/superset/pull/9162) and [#9180](https://github.com/apache/superset/pull/9180).

View File

@@ -1,634 +0,0 @@
---
title: Development How-tos
hide_title: true
sidebar_position: 4
version: 1
---
# Development How-tos
## Contributing to Documentation
The latest documentation and tutorial are available at https://superset.apache.org/.
The documentation site is built using [Docusaurus 3](https://docusaurus.io/), a modern
static website generator, the source for which resides in `./docs`.
### Local Development
To set up a local development environment with hot reloading for the documentation site:
```shell
cd docs
yarn install # Installs NPM dependencies
yarn start # Starts development server at http://localhost:3000
```
### Build
To create and serve a production build of the documentation site:
```shell
yarn build
yarn serve
```
### Deployment
Commits to `master` trigger a rebuild and redeploy of the documentation site. Submit pull requests that modify the documentation with the `docs:` prefix.
## Creating Visualization Plugins
Visualizations in Superset are implemented in JavaScript or TypeScript. Superset
comes preinstalled with several visualizations types (hereafter "viz plugins") that
can be found under the `superset-frontend/plugins` directory. Viz plugins are added to
the application in the `superset-frontend/src/visualizations/presets/MainPreset.js`.
The Superset project is always happy to review proposals for new high quality viz
plugins. However, for highly custom viz types it is recommended to maintain a fork
of Superset, and add the custom built viz plugins by hand.
**Note:** Additional community-generated resources about creating and deploying custom visualization plugins can be found on the [Superset Wiki](https://github.com/apache/superset/wiki/Community-Resource-Library#creating-custom-data-visualizations)
### Prerequisites
In order to create a new viz plugin, you need the following:
- Run MacOS or Linux (Windows is not officially supported, but may work)
- Node.js 16
- npm 7 or 8
A general familiarity with [React](https://reactjs.org/) and the npm/Node system is
also recommended.
### Creating a simple Hello World viz plugin
To get started, you need the Superset Yeoman Generator. It is recommended to use the
version of the template that ships with the version of Superset you are using. This
can be installed by doing the following:
```bash
npm i -g yo
cd superset-frontend/packages/generator-superset
npm i
npm link
```
After this you can proceed to create your viz plugin. Create a new directory for your
viz plugin with the prefix `superset-plugin-chart` and run the Yeoman generator:
```bash
mkdir /tmp/superset-plugin-chart-hello-world
cd /tmp/superset-plugin-chart-hello-world
```
Initialize the viz plugin:
```bash
yo @superset-ui/superset
```
After that the generator will ask a few questions (the defaults should be fine):
```bash
$ yo @superset-ui/superset
_-----_ ╭──────────────────────────╮
| | │ Welcome to the │
|--(o)--| │ generator-superset │
`---------´ │ generator! │
( _´U`_ ) ╰──────────────────────────╯
/___A___\ /
| ~ |
__'.___.'__
´ ` |° ´ Y `
? Package name: superset-plugin-chart-hello-world
? Description: Hello World
? What type of chart would you like? Time-series chart
create package.json
create .gitignore
create babel.config.js
create jest.config.js
create README.md
create tsconfig.json
create src/index.ts
create src/plugin/buildQuery.ts
create src/plugin/controlPanel.ts
create src/plugin/index.ts
create src/plugin/transformProps.ts
create src/types.ts
create src/SupersetPluginChartHelloWorld.tsx
create test/index.test.ts
create test/__mocks__/mockExportString.js
create test/plugin/buildQuery.test.ts
create test/plugin/transformProps.test.ts
create types/external.d.ts
create src/images/thumbnail.png
```
To build the viz plugin, run the following commands:
```bash
npm i --force
npm run build
```
Alternatively, to run the viz plugin in development mode (=rebuilding whenever changes
are made), start the dev server with the following command:
```bash
npm run dev
```
To add the package to Superset, go to the `superset-frontend` subdirectory in your
Superset source folder run
```bash
npm i -S /tmp/superset-plugin-chart-hello-world
```
If you publish your package to npm, you can naturally install directly from there, too.
After this edit the `superset-frontend/src/visualizations/presets/MainPreset.js`
and make the following changes:
```js
import { SupersetPluginChartHelloWorld } from 'superset-plugin-chart-hello-world';
```
to import the viz plugin and later add the following to the array that's passed to the
`plugins` property:
```js
new SupersetPluginChartHelloWorld().configure({ key: 'ext-hello-world' }),
```
After that the viz plugin should show up when you run Superset, e.g. the development
server:
```bash
npm run dev-server
```
## Testing
### Python Testing
`pytest`, backend by docker-compose is how we recommend running tests locally.
For a more complex test matrix (against different database backends, python versions, ...) you
can rely on our GitHub Actions by simply opening a draft pull request.
Note that the test environment uses a temporary directory for defining the
SQLite databases which will be cleared each time before the group of test
commands are invoked.
There is also a utility script included in the Superset codebase to run python integration tests. The [readme can be
found here](https://github.com/apache/superset/tree/master/scripts/tests)
To run all integration tests for example, run this script from the root directory:
```bash
scripts/tests/run.sh
```
You can run unit tests found in './tests/unit_tests' for example with pytest. It is a simple way to run an isolated test that doesn't need any database setup
```bash
pytest ./link_to_test.py
```
#### Testing with local Presto connections
If you happen to change db engine spec for Presto/Trino, you can run a local Presto cluster with Docker:
```bash
docker run -p 15433:15433 starburstdata/presto:350-e.6
```
Then update `SUPERSET__SQLALCHEMY_EXAMPLES_URI` to point to local Presto cluster:
```bash
export SUPERSET__SQLALCHEMY_EXAMPLES_URI=presto://localhost:15433/memory/default
```
### Frontend Testing
We use [Jest](https://jestjs.io/) and [Enzyme](https://airbnb.io/enzyme/) to test TypeScript/JavaScript. Tests can be run with:
```bash
cd superset-frontend
npm run test
```
To run a single test file:
```bash
npm run test -- path/to/file.js
```
### E2E Integration Testing
For E2E testing, we recommend that you use a `docker compose` backend
```bash
CYPRESS_CONFIG=true docker compose up --build
```
`docker compose` will get to work and expose a Cypress-ready Superset app.
This app uses a different database schema (`superset_cypress`) to keep it isolated from
your other dev environmen(s)t, a specific set of examples, and a set of configurations that
aligns with the expectations within the end-to-end tests. Also note that it's served on a
different port than the default port for the backend (`8088`).
Now in another terminal, let's get ready to execute some Cypress commands. First, tell cypress
to connect to the Cypress-ready Superset backend.
```
CYPRESS_BASE_URL=http://localhost:8081
```
```bash
# superset-frontend/cypress-base is the base folder for everything Cypress-related
# It's essentially its own npm app, with its own dependencies, configurations and utilities
cd superset-frontend/cypress-base
npm install
# use interactive mode to run tests, while keeping memory usage contained
# this will fire up an interactive Cypress UI
# as you alter the code, the tests will re-run automatically, and you can visualize each
# and every step for debugging purposes
npx cypress open --config numTestsKeptInMemory=5
# to run the test suite on the command line using chrome (same as CI)
npm run cypress-run-chrome
# run tests from a specific file
npm run cypress-run-chrome -- --spec cypress/e2e/explore/link.test.ts
# run specific file with video capture
npm run cypress-run-chrome -- --spec cypress/e2e/dashboard/index.test.js --config video=true
# to open the cypress ui
npm run cypress-debug
```
See [`superset-frontend/cypress_build.sh`](https://github.com/apache/superset/blob/master/superset-frontend/cypress_build.sh).
As an alternative you can use docker compose environment for testing:
Make sure you have added below line to your /etc/hosts file:
`127.0.0.1 db`
If you already have launched Docker environment please use the following command to assure a fresh database instance:
`docker compose down -v`
Launch environment:
`CYPRESS_CONFIG=true docker compose up`
It will serve backend and frontend on port 8088.
Run Cypress tests:
```bash
cd cypress-base
npm install
npm run cypress open
```
### Debugging Server App
Follow these instructions to debug the Flask app running inside a docker container.
First add the following to the ./docker-compose.yaml file
```diff
superset:
env_file: docker/.env
image: *superset-image
container_name: superset_app
command: ["/app/docker/docker-bootstrap.sh", "app"]
restart: unless-stopped
+ cap_add:
+ - SYS_PTRACE
ports:
- 8088:8088
+ - 5678:5678
user: "root"
depends_on: *superset-depends-on
volumes: *superset-volumes
environment:
CYPRESS_CONFIG: "${CYPRESS_CONFIG}"
```
Start Superset as usual
```bash
docker compose up
```
Install the required libraries and packages to the docker container
Enter the superset_app container
```bash
docker exec -it superset_app /bin/bash
root@39ce8cf9d6ab:/app#
```
Run the following commands inside the container
```bash
apt update
apt install -y gdb
apt install -y net-tools
pip install debugpy
```
Find the PID for the Flask process. Make sure to use the first PID. The Flask app will re-spawn a sub-process every time you change any of the python code. So it's important to use the first PID.
```bash
ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 14:09 ? 00:00:00 bash /app/docker/docker-bootstrap.sh app
root 6 1 4 14:09 ? 00:00:04 /usr/local/bin/python /usr/bin/flask run -p 8088 --with-threads --reload --debugger --host=0.0.0.0
root 10 6 7 14:09 ? 00:00:07 /usr/local/bin/python /usr/bin/flask run -p 8088 --with-threads --reload --debugger --host=0.0.0.0
```
Inject debugpy into the running Flask process. In this case PID 6.
```bash
python3 -m debugpy --listen 0.0.0.0:5678 --pid 6
```
Verify that debugpy is listening on port 5678
```bash
netstat -tunap
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:5678 0.0.0.0:* LISTEN 462/python
tcp 0 0 0.0.0.0:8088 0.0.0.0:* LISTEN 6/python
```
You are now ready to attach a debugger to the process. Using VSCode you can configure a launch configuration file .vscode/launch.json like so.
```json
{
"version": "0.2.0",
"configurations": [
{
"name": "Attach to Superset App in Docker Container",
"type": "python",
"request": "attach",
"connect": {
"host": "127.0.0.1",
"port": 5678
},
"pathMappings": [
{
"localRoot": "${workspaceFolder}",
"remoteRoot": "/app"
}
]
},
]
}
```
VSCode will not stop on breakpoints right away. We've attached to PID 6 however it does not yet know of any sub-processes. In order to "wakeup" the debugger you need to modify a python file. This will trigger Flask to reload the code and create a new sub-process. This new sub-process will be detected by VSCode and breakpoints will be activated.
### Debugging Server App in Kubernetes Environment
To debug Flask running in POD inside a kubernetes cluster, you'll need to make sure the pod runs as root and is granted the `SYS_TRACE` capability. These settings should not be used in production environments.
```yaml
securityContext:
capabilities:
add: ["SYS_PTRACE"]
```
See [set capabilities for a container](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-capabilities-for-a-container) for more details.
Once the pod is running as root and has the `SYS_PTRACE` capability it will be able to debug the Flask app.
You can follow the same instructions as in `docker compose`. Enter the pod and install the required library and packages: gdb, netstat and debugpy.
Often in a Kubernetes environment nodes are not addressable from outside the cluster. VSCode will thus be unable to remotely connect to port 5678 on a Kubernetes node. In order to do this you need to create a tunnel that port forwards 5678 to your local machine.
```bash
kubectl port-forward pod/superset-<some random id> 5678:5678
```
You can now launch your VSCode debugger with the same config as above. VSCode will connect to to 127.0.0.1:5678 which is forwarded by kubectl to your remote kubernetes POD.
### Storybook
Superset includes a [Storybook](https://storybook.js.org/) to preview the layout/styling of various Superset components, and variations thereof. To open and view the Storybook:
```bash
cd superset-frontend
npm run storybook
```
When contributing new React components to Superset, please try to add a Story alongside the component's `jsx/tsx` file.
## Contributing Translations
We use [Flask-Babel](https://python-babel.github.io/flask-babel/) to translate Superset.
In Python files, we use the following
[translation functions](https://python-babel.github.io/flask-babel/#using-translations)
from `Flask-Babel`:
- `gettext` and `lazy_gettext` (usually aliased to `_`): for translating singular
strings.
- `ngettext`: for translating strings that might become plural.
```python
from flask_babel import lazy_gettext as _
```
then wrap the translatable strings with it, e.g. `_('Translate me')`.
During extraction, string literals passed to `_` will be added to the
generated `.po` file for each language for later translation.
At runtime, the `_` function will return the translation of the given
string for the current language, or the given string itself
if no translation is available.
In TypeScript/JavaScript, the technique is similar:
we import `t` (simple translation), `tn` (translation containing a number).
```javascript
import { t, tn } from "@superset-ui/translation";
```
### Enabling language selection
Add the `LANGUAGES` variable to your `superset_config.py`. Having more than one
option inside will add a language selection dropdown to the UI on the right side
of the navigation bar.
```python
LANGUAGES = {
'en': {'flag': 'us', 'name': 'English'},
'fr': {'flag': 'fr', 'name': 'French'},
'zh': {'flag': 'cn', 'name': 'Chinese'},
}
```
### Creating a new language dictionary
First check if the language code for your target language already exists. Check if the
[two letter ISO 639-1 code](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes)
for your target language already exists in the `superset/translations` directory:
```bash
ls superset/translations | grep -E "^[a-z]{2}\/"
```
If your language already has a preexisting translation, skip to the next section
The following languages are already supported by Flask AppBuilder, and will make it
easier to translate the application to your target language:
[Flask AppBuilder i18n documentation](https://flask-appbuilder.readthedocs.io/en/latest/i18n.html)
To create a dictionary for a new language, first make sure the necessary dependencies are installed:
```bash
pip install -r superset/translations/requirements.txt
```
Then run the following, where `LANGUAGE_CODE` is replaced with the language code for your target
language:
```bash
pybabel init -i superset/translations/messages.pot -d superset/translations -l LANGUAGE_CODE
```
For instance, to add a translation for Finnish (language code `fi`), run the following:
```bash
pybabel init -i superset/translations/messages.pot -d superset/translations -l fi
```
### Extracting new strings for translation
Periodically, when working on translations, we need to extract the strings from both the
backend and the frontend to compile a list of all strings to be translated. It doesn't
happen automatically and is a required step to gather the strings and get them into the
`.po` files where they can be translated, so that they can then be compiled.
This script does just that:
```bash
./scripts/translations/babel_update.sh
```
### Updating language files
Run the following command to update the language files with the new extracted strings.
```bash
pybabel update -i superset/translations/messages.pot -d superset/translations --ignore-obsolete
```
You can then translate the strings gathered in files located under
`superset/translation`, where there's one folder per language. You can use [Poedit](https://poedit.net/features)
to translate the `po` file more conveniently.
Here is [a tutorial](https://web.archive.org/web/20220517065036/https://wiki.lxde.org/en/Translate_*.po_files_with_Poedit).
To perform the translation on MacOS, you can install `poedit` via Homebrew:
```bash
brew install poedit
```
After this, just start the `poedit` application and open the `messages.po` file. In the
case of the Finnish translation, this would be `superset/translations/fi/LC_MESSAGES/messages.po`.
### Applying translations
To make the translations available on the frontend, we need to convert the PO file into
a collection of JSON files. To convert all PO files to formatted JSON files you can use
the `build-translation` script
```bash
# Install dependencies if you haven't already
cd superset-frontend/ && npm ci
# Compile translations for the frontend
npm run build-translation
```
Finally, for the translations to take effect we need to compile translation catalogs into
binary MO files for the backend using `pybabel`.
```bash
# inside the project root
pybabel compile -d superset/translations
```
## Linting
### Python
We use [ruff](https://github.com/astral-sh/ruff) for linting which can be invoked via:
```
# auto-reformat using ruff
ruff format
# lint check with ruff
ruff check
# lint fix with ruff
ruff check --fix
```
Ruff configuration is located in our
(pyproject.toml)[https://github.com/apache/superset/blob/master/pyproject.toml] file
All this is configured to run in pre-commit hooks, which we encourage you to setup
with `pre-commit install`
### TypeScript
```bash
cd superset-frontend
npm ci
# run eslint checks
npm run eslint -- .
# run tsc (typescript) checks
npm run type
```
If using the eslint extension with vscode, put the following in your workspace `settings.json` file:
```json
"eslint.workingDirectories": [
"superset-frontend"
]
```
## GitHub Ephemeral Environments
On any given pull request on GitHub, it's possible to create a temporary environment/deployment
by simply adding the label `testenv-up` to the PR. Once you add the `testenv-up` label, a
GitHub Action will be triggered that will:
- build a docker image
- deploy it in EC2 (sponsored by the folks at [Preset](https://preset.io))
- write a comment on the PR with a link to the ephemeral environment
For more advanced use cases, it's possible to set a feature flag on the PR body, which will
take effect on the ephemeral environment. For example, if you want to set the `TAGGING_SYSTEM`
feature flag to `true`, you can add the following line to the PR body/description:
```
FEATURE_TAGGING_SYSTEM=true
```
Simarly, it's possible to disable feature flags with:
```
FEATURE_TAGGING_SYSTEM=false
```

View File

@@ -1,55 +0,0 @@
---
sidebar_position: 6
version: 1
---
# Miscellaneous
## Reporting a Security Vulnerability
Please report security vulnerabilities to private@superset.apache.org.
In the event a community member discovers a security flaw in Superset, it is important to follow the [Apache Security Guidelines](https://www.apache.org/security/committers.html) and release a fix as quickly as possible before public disclosure. Reporting security vulnerabilities through the usual GitHub Issues channel is not ideal as it will publicize the flaw before a fix can be applied.
## SQL Lab Async
It's possible to configure a local database to operate in `async` mode,
to work on `async` related features.
To do this, you'll need to:
- Add an additional database entry. We recommend you copy the connection
string from the database labeled `main`, and then enable `SQL Lab` and the
features you want to use. Don't forget to check the `Async` box
- Configure a results backend, here's a local `FileSystemCache` example,
not recommended for production,
but perfect for testing (stores cache in `/tmp`)
```python
from flask_caching.backends.filesystemcache import FileSystemCache
RESULTS_BACKEND = FileSystemCache('/tmp/sqllab')
```
- Start up a celery worker
```shell script
celery --app=superset.tasks.celery_app:app worker -O fair
```
Note that:
- for changes that affect the worker logic, you'll have to
restart the `celery worker` process for the changes to be reflected.
- The message queue used is a `sqlite` database using the `SQLAlchemy`
experimental broker. Ok for testing, but not recommended in production
- In some cases, you may want to create a context that is more aligned
to your production environment, and use the similar broker as well as
results backend configuration
## Async Chart Queries
It's possible to configure database queries for charts to operate in `async` mode. This is especially useful for dashboards with many charts that may otherwise be affected by browser connection limits. To enable async queries for dashboards and Explore, the following dependencies are required:
- Redis 5.0+ (the feature utilizes [Redis Streams](https://redis.io/topics/streams-intro))
- Cache backends enabled via the `CACHE_CONFIG` and `DATA_CACHE_CONFIG` config settings
- Celery workers configured and running to process async tasks

View File

@@ -1,104 +0,0 @@
---
sidebar_position: 5
version: 1
---
import InteractiveSVG from '../../../src/components/InteractiveERDSVG';
import Mermaid from '@theme/Mermaid';
# Resources
## High Level Architecture
<div style={{ maxWidth: "600px", margin: "0 auto", marginLeft: 0, marginRight: "auto" }}>
```mermaid
flowchart TD
%% Top Level
LB["<b>Load Balancer(s)</b><br/>(optional)"]
LB -.-> WebServers
%% Web Servers
subgraph WebServers ["<b>Web Server(s)</b>"]
WS1["<b>Frontend</b><br/>(React, AntD, ECharts, AGGrid)"]
WS2["<b>Backend</b><br/>(Python, Flask, SQLAlchemy, Pandas, ...)"]
end
%% Infra
subgraph InfraServices ["<b>Infra</b>"]
DB[("<b>Metadata Database</b><br/>(Postgres / MySQL)")]
subgraph Caching ["<b>Caching Subservices<br/></b>(Redis, memcache, S3, ...)"]
direction LR
DummySpace[" "]:::invisible
QueryCache["<b>Query Results Cache</b><br/>(Accelerated Dashboards)"]
CsvCache["<b>CSV Exports Cache</b>"]
ThumbnailCache["<b>Thumbnails Cache</b>"]
AlertImageCache["<b>Alert/Report Images Cache</b>"]
QueryCache -- " " --> CsvCache
linkStyle 1 stroke:transparent;
ThumbnailCache -- " " --> AlertImageCache
linkStyle 2 stroke:transparent;
end
Broker(("<b>Message Queue</b><br/>(Redis / RabbitMQ / SQS)"))
end
AsyncBackend["<b>Async Workers (Celery)</b><br>required for Alerts & Reports, thumbnails, CSV exports, long-running workloads, ..."]
%% External DBs
subgraph ExternalDatabases ["<b>Analytics Databases</b>"]
direction LR
BigQuery[(BigQuery)]
Snowflake[(Snowflake)]
Redshift[(Redshift)]
Postgres[(Postgres)]
Postgres[(... any ...)]
end
%% Connections
LB -.-> WebServers
WebServers --> DB
WebServers -.-> Caching
WebServers -.-> Broker
WebServers -.-> ExternalDatabases
Broker -.-> AsyncBackend
AsyncBackend -.-> ExternalDatabases
AsyncBackend -.-> Caching
%% Legend styling
classDef requiredNode stroke-width:2px,stroke:black;
class Required requiredNode;
class Optional optionalNode;
%% Hide real arrow
linkStyle 0 stroke:transparent;
%% Styling
classDef optionalNode stroke-dasharray: 5 5, opacity:0.9;
class LB optionalNode;
class Caching optionalNode;
class AsyncBackend optionalNode;
class Broker optionalNode;
class QueryCache optionalNode;
class CsvCache optionalNode;
class ThumbnailCache optionalNode;
class AlertImageCache optionalNode;
class Celery optionalNode;
classDef invisible fill:transparent,stroke:transparent;
```
</div>
## Entity-Relationship Diagram
Here is our interactive ERD:
<InteractiveSVG />
<br />
[Download the .svg](https://github.com/apache/superset/tree/master/docs/static/img/erd.svg)

Some files were not shown because too many files have changed in this diff Show More