superset2/superset-frontend/plugins/plugin-chart-country-map/scripts
Evan Rusackas 989ed61f34 feat(country-map): build script — composite_maps transform (5/5 transforms)
Implements the fifth and final transform from the notebook audit.
A composite combines a base country's Admin 1 features with:
- base_repositions (with optional `group: true` for grouped transforms
  like Paris + petite couronne treated as one body)
- additions (features pulled from sibling countries' Admin 1, with
  optional dissolve, drop_parts, reposition, and attribute set)

Verified on France-with-Overseas:
  france_overseas: 108 features → composite_france_overseas_ukr.geo.json
                                  (322,058 bytes)

108 = 101 FRA admin1 departments + 7 additions (Polynésie française,
Terres australes et antarctiques françaises, Wallis-et-Futuna,
Nouvelle-Calédonie, Saint-Pierre-et-Miquelon, Saint-Martin,
Saint-Barthélémy).

Bug fix during implementation: composites pull additions from the
Admin 1 subdivisions of sibling countries, not from Admin 0 (Windward
Islands is a PYF Admin 1 subdivision, not an Admin 0 country). The
initial implementation looked in Admin 0, matched nothing, and warned
about 0 features. Fixed by sourcing from base_admin1 (the global
Admin 1 dataset, which contains all countries' subdivisions).
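
A minimal sketch of why the lookup source matters. The helper name
`select_additions`, the property keys (`adm0_a3`, `name`), and the toy
records are assumptions for illustration, not the build script's code:

```python
import warnings


def select_additions(features, adm0_a3, names):
    """Pick features matching a country code and a set of subdivision names."""
    wanted = set(names)
    picked = [
        f for f in features
        if f["properties"].get("adm0_a3") == adm0_a3
        and f["properties"].get("name") in wanted
    ]
    if not picked:
        # Mirrors the symptom described above: a warning about 0 features.
        warnings.warn(f"addition {adm0_a3}/{sorted(wanted)} matched 0 features")
    return picked


# Toy stand-ins for the two datasets (geometry omitted, values illustrative).
admin0 = [{"properties": {"adm0_a3": "PYF", "name": "French Polynesia"}}]
admin1 = [{"properties": {"adm0_a3": "PYF", "name": "Windward Islands"}}]

# Admin 0 holds whole countries, so a subdivision name matches nothing.
from_admin0 = select_additions(admin0, "PYF", ["Windward Islands"])
# The global Admin 1 dataset holds the subdivision itself.
from_admin1 = select_additions(admin1, "PYF", ["Windward Islands"])
```

Here `from_admin0` is empty (and triggers the warning), while
`from_admin1` contains the one matching subdivision.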

New helpers:
- _drop_parts(geom, indices) — drop sub-polygon indices from MultiPolygon
- _translate_and_scale_with_pivot — explicit pivot (vs feature centroid),
  used for `group: true` transforms
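
A rough sketch of what the two helpers might look like when operating
on plain GeoJSON geometry dicts. The function bodies, the parameter
list of the pivot helper (`pivot`, `dx`, `dy`, `scale`), and the order
of operations (scale first, then translate) are assumptions; the real
helpers may use a geometry library instead:

```python
def drop_parts(geom, indices):
    """Return a copy of a MultiPolygon without the polygons at `indices`."""
    if geom["type"] != "MultiPolygon":
        return geom  # nothing to drop from other geometry types
    skip = set(indices)
    kept = [poly for i, poly in enumerate(geom["coordinates"]) if i not in skip]
    return {"type": "MultiPolygon", "coordinates": kept}


def translate_and_scale_with_pivot(geom, pivot, dx, dy, scale):
    """Scale every vertex about `pivot`, then shift it by (dx, dy)."""
    px, py = pivot

    def move(pt):
        x, y = pt[0], pt[1]
        return [px + (x - px) * scale + dx, py + (y - py) * scale + dy]

    def walk(node):
        # A position is a list starting with a number; anything else nests deeper.
        if node and isinstance(node[0], (int, float)):
            return move(node)
        return [walk(child) for child in node]

    return {"type": geom["type"], "coordinates": walk(geom["coordinates"])}


square = [[[0.0, 0.0], [2.0, 0.0], [2.0, 2.0], [0.0, 2.0], [0.0, 0.0]]]
far = [[[10.0, 10.0], [11.0, 10.0], [11.0, 11.0], [10.0, 10.0]]]
multi = {"type": "MultiPolygon", "coordinates": [square, far]}

trimmed = drop_parts(multi, [1])  # keep only the square
moved = translate_and_scale_with_pivot(
    trimmed, pivot=(0.0, 0.0), dx=5.0, dy=-1.0, scale=0.5
)
```

Passing the same explicit pivot for every feature in a group keeps
their relative layout intact, whereas scaling each feature about its
own centroid would pull the group apart.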

==== Build pipeline status ====

All 5 declarative transforms implemented and verified:
  ✓ name_overrides         (19 updates per Admin 1 build)
  ✓ flying_islands         (12 reposition + 5 bbox drop)
  ✓ territory_assignments  (4 features added: TWN/HKG/MAC/ALD)
  ✓ regional_aggregations  (4 region sets: TUR/FRA/ITA/PHL)
  ✓ composite_maps         (1 composite: france_overseas)

Current outputs (UA worldview):
  ukr_admin0.geo.json                       2.1 MB   249 features
  ukr_admin1.geo.json                        15 MB  4595 features
  regional_TUR_nuts_1_ukr.geo.json           23 KB    12 regions
  regional_FRA_regions_ukr.geo.json          32 KB    18 regions
  regional_ITA_regions_ukr.geo.json          32 KB    20 regions
  regional_PHL_regions_ukr.geo.json          32 KB    17 regions
  composite_france_overseas_ukr.geo.json    322 KB   108 features

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-12 16:34:43 -07:00

Country Map data pipeline

This directory contains the build pipeline that turns upstream Natural Earth data into the GeoJSON files consumed by @superset-ui/plugin-chart-country-map.

It replaces the legacy scripts/Country Map GeoJSON Generator.ipynb notebook. See SIP_DRAFT.md in the parent directory for the full design rationale.

Layout

scripts/
  build.sh                   # one-shot reproducible build
  README.md                  # this file
  config/                    # declarative YAML — handles ~95% of fixes
    name_overrides.yaml      # typos, deprecated ISO codes, admin renames
    flying_islands.yaml      # repositioning + bbox drops for far-flung territories
    territory_assignments.yaml   # add features from sibling Admin 0 records
    regional_aggregations.yaml   # dissolve Admin 1 into administrative regions
    composite_maps.yaml      # multi-country composites (e.g. France-with-Overseas)
  procedural/                # escape hatch — handles the rare 5%
    README.md                # when to use, when not
    NN_<descriptive_name>.py # one focused script per genuine edge case
  output/                    # gitignored — build artifacts

Operating principles

  • Default tool: declarative YAML. Most touchups are renames, repositions, dissolves, or filters, all of which YAML can express. Diffs are small, conflicts localize cleanly to one entry, and contributors can submit a "fix typo X" change as a one-line PR.
  • Escape hatch: procedural/ directory of small, named, single-purpose Python scripts for the rare cases YAML can't express cleanly. Each script has a header comment explaining why it's not in YAML. See procedural/README.md for the bar.
  • Build is reproducible from a pinned NE version. build.sh records the NE git SHA it consumed; outputs are deterministic given inputs.
  • CI regenerates on schema change and opens a PR if outputs differ. Maintainers review the cartographic diff in legible GeoJSON, not opaque notebook JSON.
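
The reproducibility and CI-diff principles above can be sketched as follows. This is an illustration only: the manifest layout, the function names, and the placeholder SHA `"0000000"` are assumptions, and build.sh may record the NE SHA and compare outputs differently.

```python
import hashlib
import json


def canonical_geojson(feature_collection):
    """Serialize with sorted keys and fixed separators so equal content gives equal bytes."""
    return json.dumps(
        feature_collection, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def build_manifest(ne_sha, outputs):
    """Pair the pinned NE SHA with a SHA-256 digest of each output's canonical bytes."""
    return {
        "natural_earth_sha": ne_sha,
        "outputs": {
            name: hashlib.sha256(canonical_geojson(fc)).hexdigest()
            for name, fc in sorted(outputs.items())
        },
    }


fc_a = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"name": "X", "iso": "XX"}, "geometry": None}
    ],
}
# Same content as fc_a, but with keys inserted in a different order.
fc_b = {
    "features": [
        {"geometry": None, "properties": {"iso": "XX", "name": "X"}, "type": "Feature"}
    ],
    "type": "FeatureCollection",
}

m1 = build_manifest("0000000", {"example.geo.json": fc_a})
m2 = build_manifest("0000000", {"example.geo.json": fc_b})
outputs_differ = m1 != m2  # a CI job would open a PR only when this is True
```

Because serialization is canonical, the two manifests are equal even though the dicts were built in different key order, so a comparison of this kind flags only real content changes.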

Workflow for adding a fix

  1. Identify the upstream NE issue (wrong name, missing territory, etc.).
  2. Try YAML first. Add the smallest possible entry to the appropriate config file with a description field explaining the fix.
  3. If YAML can't express it cleanly, add a numbered script in procedural/ with a header comment explaining why YAML didn't fit.
  4. Run build.sh locally and verify that the output GeoJSON looks right.
  5. Open a PR. The reviewer sees the YAML diff (or new procedural script) plus the regenerated GeoJSON.
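
To make step 2 concrete, here is a hedged sketch of how a single override entry could be applied to features. The entry schema (`match`, `set`, `description`), the function name, and the sample data are hypothetical; the actual keys in name_overrides.yaml may differ. The entry is written as the dict a YAML loader would produce.

```python
# Hypothetical override entry, shown as the dict a YAML loader would return.
override = {
    "match": {"adm0_a3": "XXX", "name": "Exmaple Province"},
    "set": {"name": "Example Province"},
    "description": "Fix typo in upstream NE name.",
}


def apply_name_overrides(features, overrides):
    """Return (new_features, update_count) without mutating the input features."""
    updated = 0
    result = []
    for feature in features:
        props = dict(feature["properties"])  # copy so the input stays untouched
        for entry in overrides:
            if all(props.get(k) == v for k, v in entry["match"].items()):
                props.update(entry["set"])
                updated += 1
        result.append({**feature, "properties": props})
    return result, updated


features = [
    {"type": "Feature", "properties": {"adm0_a3": "XXX", "name": "Exmaple Province"}, "geometry": None},
    {"type": "Feature", "properties": {"adm0_a3": "XXX", "name": "Other"}, "geometry": None},
]
fixed, count = apply_name_overrides(features, [override])
```

Returning the update count alongside the features gives the build a cheap sanity check: an entry that matches nothing usually means the upstream name changed or the entry has a typo of its own.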

See also

  • SIP_DRAFT.md (parent dir) — design rationale, notebook audit, obsolescence check
  • procedural/README.md — when to use the escape hatch