Files
superset2/AGENTS.md
2026-01-28 21:33:01 -08:00

270 lines
12 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# LLM Context Guide for Apache Superset
Apache Superset is a data visualization platform with Flask/Python backend and React/TypeScript frontend.
## ⚠️ CRITICAL: Always Run Pre-commit Before Pushing
**ALWAYS run `pre-commit run --all-files` before pushing commits.** CI will fail if pre-commit checks don't pass. This is non-negotiable.
```bash
# Stage your changes first
git add .
# Run pre-commit on all files
pre-commit run --all-files
# If there are auto-fixes, stage them and commit
git add .
git commit --amend # or new commit
```
Common pre-commit failures:
- **Formatting** - black, prettier, eslint will auto-fix
- **Type errors** - mypy failures need manual fixes
- **Linting** - ruff, pylint issues need manual fixes
## ⚠️ CRITICAL: Ongoing Refactors (What NOT to Do)
**These migrations are actively happening - avoid deprecated patterns:**
### Frontend Modernization
- **NO `any` types** - Use proper TypeScript types
- **NO JavaScript files** - Convert to TypeScript (.ts/.tsx)
- **Use @superset-ui/core** - Don't import Ant Design directly, prefer Ant Design component wrappers from @superset-ui/core/components
- **Use antd theming tokens** - Prefer antd tokens over legacy theming tokens
- **Avoid custom css and styles** - Follow antd best practices and avoid styling and custom CSS whenever possible
### Testing Strategy Migration
- **Prefer unit tests** over integration tests
- **Prefer integration tests** over end-to-end tests
- **Use Playwright for E2E tests** - Migrating from Cypress
- **Cypress is deprecated** - Will be removed once migration is completed
- **Use Jest + React Testing Library** for component testing
- **Use `test()` instead of `describe()`** - Follow [avoid nesting when testing](https://kentcdodds.com/blog/avoid-nesting-when-youre-testing) principles
### Backend Type Safety
- **Add type hints** - All new Python code needs proper typing
- **MyPy compliance** - Run `pre-commit run mypy` to validate
- **SQLAlchemy typing** - Use proper model annotations
### UUID Migration
- **Prefer UUIDs over auto-incrementing IDs** - New models should use UUID primary keys
- **External API exposure** - Use UUIDs in public APIs instead of internal integer IDs
- **Existing models** - Add UUID fields alongside integer IDs for gradual migration
## Key Directories
```
superset/
├── superset/ # Python backend (Flask, SQLAlchemy)
│ ├── views/api/ # REST API endpoints
│ ├── models/ # Database models
│ └── connectors/ # Database connections
├── superset-frontend/src/ # React TypeScript frontend
│ ├── components/ # Reusable components
│ ├── explore/ # Chart builder
│ ├── dashboard/ # Dashboard interface
│ └── SqlLab/ # SQL editor
├── superset-frontend/packages/
│ └── superset-ui-core/ # UI component library (USE THIS)
├── tests/ # Python/integration tests
├── docs/ # Documentation (UPDATE FOR CHANGES)
└── UPDATING.md # Breaking changes log
```
## Code Standards
### TypeScript Frontend
- **Avoid `any` types** - Use proper TypeScript, reuse existing types
- **Functional components** with hooks
- **@superset-ui/core** for UI components (not direct antd)
- **Jest** for testing (NO Enzyme)
- **Redux** for global state where it exists, hooks for local
### Python Backend
- **Type hints required** for all new code
- **MyPy compliant** - run `pre-commit run mypy`
- **SQLAlchemy models** with proper typing
- **pytest** for testing
### Apache License Headers
- **New files require ASF license headers** - When creating new code files, include the standard Apache Software Foundation license header
- **LLM instruction files are excluded** - Files like AGENTS.md, CLAUDE.md, etc. are in `.rat-excludes` to avoid header token overhead
### Code Comments
- **Avoid time-specific language** - Don't use words like "now", "currently", "today" in code comments as they become outdated
- **Write timeless comments** - Comments should remain accurate regardless of when they're read
## Documentation Requirements
- **docs/**: Update for any user-facing changes
- **UPDATING.md**: Add breaking changes here
- **Docstrings**: Required for new functions/classes
## Developer Portal: Storybook-to-MDX Documentation
The Developer Portal auto-generates MDX documentation from Storybook stories. **Stories are the single source of truth.**
### Core Philosophy
- **Fix issues in the STORY, not the generator** - When something doesn't render correctly, update the story file first
- **Generator should be lightweight** - It extracts and passes through data; avoid special cases
- **Stories define everything** - Props, controls, galleries, examples all come from story metadata
### Story Requirements for Docs Generation
- Use `export default { title: '...' }` (inline), not `const meta = ...; export default meta;`
- Name interactive stories `Interactive${ComponentName}` (e.g., `InteractiveButton`)
- Define `args` for default prop values
- Define `argTypes` at the story level (not meta level) with control types and descriptions
- Use `parameters.docs.gallery` for size×style variant grids
- Use `parameters.docs.sampleChildren` for components that need children
- Use `parameters.docs.liveExample` for custom live code blocks
- Use `parameters.docs.staticProps` for complex object props that can't be parsed inline
### Generator Location
- Script: `docs/scripts/generate-superset-components.mjs`
- Wrapper: `docs/src/components/StorybookWrapper.jsx`
- Output: `docs/developer_portal/components/`
## Architecture Patterns
### Security & Features
- **RBAC**: Role-based access via Flask-AppBuilder
- **Feature flags**: Control feature rollouts
- **Row-level security**: SQL-based data access control
## Test Utilities
### Python Test Helpers
- **`SupersetTestCase`** - Base class in `tests/integration_tests/base_tests.py`
- **`@with_config`** - Config mocking decorator
- **`@with_feature_flags`** - Feature flag testing
- **`login_as()`, `login_as_admin()`** - Authentication helpers
- **`create_dashboard()`, `create_slice()`** - Data setup utilities
### TypeScript Test Helpers
- **`superset-frontend/spec/helpers/testing-library.tsx`** - Custom render() with providers
- **`createWrapper()`** - Redux/Router/Theme wrapper
- **`selectOption()`** - Select component helper
- **React Testing Library** - NO Enzyme (removed)
### Test Database Patterns
- **Mock patterns**: Use `MagicMock()` for config objects, avoid `AsyncMock` for synchronous code
- **API tests**: Update expected columns when adding new model fields
### Running Tests
```bash
# Frontend
npm run test # All tests
npm run test -- filename.test.tsx # Single file
# E2E Tests (Playwright - NEW)
npm run playwright:test # All Playwright tests
npm run playwright:ui # Interactive UI mode
npm run playwright:headed # See browser during tests
npx playwright test tests/auth/login.spec.ts # Single file
npm run playwright:debug tests/auth/login.spec.ts # Debug specific file
# E2E Tests (Cypress - DEPRECATED)
cd superset-frontend/cypress-base
npm run cypress-run-chrome # All Cypress tests (headless)
npm run cypress-debug # Interactive Cypress UI
# Backend
pytest # All tests
pytest tests/unit_tests/specific_test.py # Single file
pytest tests/unit_tests/ # Directory
# If pytest fails with database/setup issues, ask the user to run test environment setup
```
## Environment Validation
**Quick Setup Check (run this first):**
```bash
# Verify Superset is running
curl -f http://localhost:8088/health || echo "❌ Setup required - see https://superset.apache.org/docs/contributing/development#working-with-llms"
```
**If health checks fail:**
"It appears you aren't set up properly. Please refer to the [Working with LLMs](https://superset.apache.org/docs/contributing/development#working-with-llms) section in the development docs for setup instructions."
**Key Project Files:**
- `superset-frontend/package.json` - Frontend build scripts (`npm run dev` on port 9000, `npm run test`, `npm run lint`)
- `pyproject.toml` - Python tooling (ruff, mypy configs)
- `requirements/` folder - Python dependencies (base.txt, development.txt)
## SQLAlchemy Query Best Practices
- **Use negation operator**: `~Model.field` instead of `== False` to avoid ruff E712 errors
- **Example**: `~Model.is_active` instead of `Model.is_active == False`
## Pull Request Guidelines
**When creating pull requests:**
1. **Read the current PR template**: Always check `.github/PULL_REQUEST_TEMPLATE.md` for the latest format
2. **Use the template sections**: Include all sections from the template (SUMMARY, BEFORE/AFTER, TESTING INSTRUCTIONS, ADDITIONAL INFORMATION)
3. **Follow PR title conventions**: Use [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/)
- Format: `type(scope): description`
- Example: `fix(dashboard): load charts correctly`
- Types: `fix`, `feat`, `docs`, `style`, `refactor`, `perf`, `test`, `chore`
**Important**: Always reference the actual template file at `.github/PULL_REQUEST_TEMPLATE.md` instead of using cached content, as the template may be updated over time.
## Pre-commit Validation
**Use pre-commit hooks for quality validation:**
```bash
# Install hooks
pre-commit install
# IMPORTANT: Stage your changes first!
git add . # Pre-commit only checks staged files
# Quick validation (faster than --all-files)
pre-commit run # Staged files only
pre-commit run mypy # Python type checking
pre-commit run prettier # Code formatting
pre-commit run eslint # Frontend linting
```
**Important pre-commit usage notes:**
- **Stage files first**: Run `git add .` before `pre-commit run` to check only changed files (much faster)
- **Virtual environment**: Activate your Python virtual environment before running pre-commit
```bash
# Common virtual environment locations (yours may differ):
source .venv/bin/activate # if using .venv
source venv/bin/activate # if using venv
source ~/venvs/superset/bin/activate # if using a central location
```
If you get a "command not found" error, ask the user which virtual environment to activate
- **Auto-fixes**: Some hooks auto-fix issues (e.g., trailing whitespace). Re-run after fixes are applied
## Common File Patterns
### API Structure
- **`/api.py`** - REST endpoints with decorators and OpenAPI docstrings
- **`/schemas.py`** - Marshmallow validation schemas for OpenAPI spec
- **`/commands/`** - Business logic classes with @transaction() decorators
- **`/models/`** - SQLAlchemy database models
- **OpenAPI docs**: Auto-generated at `/swagger/v1` from docstrings and schemas
### Migration Files
- **Location**: `superset/migrations/versions/`
- **Naming**: `YYYY-MM-DD_HH-MM_hash_description.py`
- **Utilities**: Use helpers from `superset.migrations.shared.utils` for database compatibility
- **Pattern**: Import utilities instead of raw SQLAlchemy operations
## Platform-Specific Instructions
- **[CLAUDE.md](CLAUDE.md)** - For Claude/Anthropic tools
- **[.github/copilot-instructions.md](.github/copilot-instructions.md)** - For GitHub Copilot
- **[GEMINI.md](GEMINI.md)** - For Google Gemini tools
- **[GPT.md](GPT.md)** - For OpenAI/ChatGPT tools
- **[.cursor/rules/dev-standard.mdc](.cursor/rules/dev-standard.mdc)** - For Cursor editor
---
**LLM Note**: This codebase is actively modernizing toward full TypeScript and type safety. Always run `pre-commit run` to validate changes. Follow the ongoing refactors section to avoid deprecated patterns.