Mirror of https://github.com/apache/superset.git, synced 2026-04-20 00:24:38 +00:00
feat(examples): Modernize example data loading with Parquet and YAML configs (#36538)
Co-authored-by: Claude <noreply@anthropic.com>
35
UPDATING.md
@@ -24,6 +24,41 @@ assists people when migrating to a new version.
## Next

### Example Data Loading Improvements

#### New Directory Structure
Examples are now organized by name with data and configs co-located:
```
superset/examples/
├── _shared/              # Shared database & metadata configs
├── birth_names/          # Each example is self-contained
│   ├── data.parquet      # Dataset (Parquet format)
│   ├── dataset.yaml      # Dataset metadata
│   ├── dashboard.yaml    # Dashboard config (optional)
│   └── charts/           # Chart configs (optional)
└── ...
```

#### Simplified Parquet-based Loading
- Auto-discovery: create `superset/examples/my_dataset/data.parquet` to add a new example
- Parquet is an Apache columnar file format: files are compressed (~27% smaller) and carry a self-describing schema
- YAML configs define datasets, charts, and dashboards declaratively
- Removed Python-based data generation from individual example files
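The auto-discovery rule described above can be illustrated with a minimal sketch. This is not Superset's actual loader: `discover_examples` is a hypothetical helper, and skipping directories whose names start with `_` (such as `_shared/`) is an assumption made for this illustration. The demo files are empty placeholders, not real Parquet data.

```python
import tempfile
from pathlib import Path


def discover_examples(examples_root: Path) -> dict[str, Path]:
    """Map each example name to the path of its data.parquet file.

    Hypothetical helper: a directory counts as an example only if it
    contains data.parquet. Skipping "_"-prefixed directories is an
    assumption of this sketch, not confirmed Superset behavior.
    """
    found = {}
    for child in sorted(examples_root.iterdir()):
        if not child.is_dir() or child.name.startswith("_"):
            continue
        data_file = child / "data.parquet"
        if data_file.is_file():
            found[child.name] = data_file
    return found


# Build a throwaway tree that mimics the layout shown above.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("_shared", "birth_names", "my_dataset", "empty_dir"):
        (root / name).mkdir()
    # Empty placeholder files; real examples would hold Parquet data.
    (root / "birth_names" / "data.parquet").touch()
    (root / "my_dataset" / "data.parquet").touch()

    names = list(discover_examples(root))

print(names)  # ['birth_names', 'my_dataset']
```

In this sketch, adding a new example requires no code change: dropping a `data.parquet` file into a new directory is enough for it to be picked up.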

#### Test Data Reorganization
- Moved `big_data.py` to `superset/cli/test_loaders.py`, which better reflects its purpose as a test utility
- Fixed inverted logic for the `--load-test-data` flag (it now correctly includes `.test.yaml` files when the flag is set)
- Clarified CLI flags:
  - `--force` / `-f`: Force reload even if tables exist
  - `--only-metadata` / `-m`: Create table metadata without loading data
  - `--load-test-data` / `-t`: Include test dashboards and `.test.yaml` configs
  - `--load-big-data` / `-b`: Generate synthetic stress-test data
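The corrected `--load-test-data` behavior can be sketched as a simple filter. This is an illustration under stated assumptions, not the code from the commit: `select_config_files` and the sample file name `dashboard.test.yaml` are hypothetical.

```python
def select_config_files(filenames, load_test_data=False):
    """Return the YAML config files that should be imported.

    Hypothetical helper: files ending in ".test.yaml" are kept only
    when load_test_data is True; other ".yaml" files are always kept.
    """
    selected = []
    for name in filenames:
        if not name.endswith(".yaml"):
            continue  # not a config file (for example, data.parquet)
        is_test_config = name.endswith(".test.yaml")
        if is_test_config and not load_test_data:
            continue  # test configs are skipped unless the flag is set
        selected.append(name)
    return selected


# Sample file names; "dashboard.test.yaml" is made up for this example.
files = ["data.parquet", "dataset.yaml", "dashboard.yaml", "dashboard.test.yaml"]

without_flag = select_config_files(files)
with_flag = select_config_files(files, load_test_data=True)

print(without_flag)  # ['dataset.yaml', 'dashboard.yaml']
print(with_flag)     # ['dataset.yaml', 'dashboard.yaml', 'dashboard.test.yaml']
```

An inverted check (skipping test configs when the flag is set) would produce the opposite result, which matches the bug described above.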

#### Bug Fixes
- Fixed NumPy array serialization for PostgreSQL (complex types are now converted to JSON strings)
- Fixed `KeyError` for the `allow_csv_upload` field in database configs (the field is now optional with a default)
- Fixed test data loading logic that was incorrectly filtering files
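The serialization fix in the first item can be sketched roughly as follows. This is an assumption about the general approach rather than the code from the commit: `to_db_safe` is a hypothetical helper, and it detects array-like values by their `tolist()` method (which NumPy arrays provide) so that the sketch runs without NumPy installed.

```python
import json


def to_db_safe(value):
    """Return scalars unchanged and complex values as JSON strings.

    Hypothetical helper: array-like objects are first converted to
    plain Python containers via tolist(), then containers are
    serialized with json.dumps.
    """
    if hasattr(value, "tolist"):
        value = value.tolist()
    if isinstance(value, (list, tuple, dict)):
        return json.dumps(value)
    return value


class FakeArray:
    """Stand-in for a NumPy array, used only for this demo."""

    def __init__(self, items):
        self._items = list(items)

    def tolist(self):
        return list(self._items)


print(to_db_safe(FakeArray([1, 2, 3])))  # [1, 2, 3]
print(to_db_safe({"a": [1]}))            # {"a": [1]}
print(to_db_safe("plain text"))          # plain text
```

Storing the JSON string instead of the raw array gives the database driver a value it can bind to a text column.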

### MCP Service

The MCP (Model Context Protocol) service enables AI assistants and automation tools to interact programmatically with Superset.