feat(country-map): scaffold scripts/ dir with YAML config schemas

First-pass schemas for the build pipeline's declarative config layer.
Each schema is documented inline + populated with concrete entries
ported from the legacy notebook's audited touchups (those that the
obsolescence check determined still need to ship).

scripts/
├── README.md                 — pipeline overview, layout, workflow
├── config/
│   ├── name_overrides.yaml         — France typos, ISO codes; PHL renames
│   ├── flying_islands.yaml         — USA/NOR/PRT/ESP/FRA repositions; NLD/GBR drops
│   ├── territory_assignments.yaml  — China + SARs; Finland + Åland
│   ├── regional_aggregations.yaml  — Turkey NUTS-1; FRA/ITA/PHL regions
│   └── composite_maps.yaml         — France-with-Overseas
└── procedural/
    └── README.md             — escape-hatch rules + skeleton (currently empty)

All five YAML files parse cleanly (validated with PyYAML).

Schema design choices:
- Every entry has a `description:` field. Forces honest documentation
  of why each fix exists; reviewers can scan rationale at a glance.
- Match semantics: simple AND-of-conditions; supports `{ in: [...] }`
  for value-set matching.
- composite_maps and territory_assignments share the "pull feature
  from sibling Admin 0" primitive; build script can implement once.
- composite_maps.yaml has a TODO marker for SPM offsets — notebook
  cell 63 was truncated in the audit; will backfill during build
  script implementation.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Evan Rusackas
2026-05-12 15:56:04 -07:00
parent 1fffd9d832
commit 1eb48e94fc
7 changed files with 628 additions and 0 deletions


@@ -0,0 +1,43 @@
# Country Map data pipeline
This directory contains the build pipeline that turns upstream Natural Earth data into the GeoJSON files consumed by `@superset-ui/plugin-chart-country-map`.
It replaces the legacy `scripts/Country Map GeoJSON Generator.ipynb` notebook. See `SIP_DRAFT.md` in the parent directory for the full design rationale.
## Layout
```
scripts/
  build.sh                        # one-shot reproducible build
  README.md                       # this file
  config/                         # declarative YAML — handles ~95% of fixes
    name_overrides.yaml           # typos, deprecated ISO codes, admin renames
    flying_islands.yaml           # repositioning + bbox drops for far-flung territories
    territory_assignments.yaml    # add features from sibling Admin 0 records
    regional_aggregations.yaml    # dissolve Admin 1 into administrative regions
    composite_maps.yaml           # multi-country composites (e.g. France-with-Overseas)
  procedural/                     # escape hatch — handles the rare 5%
    README.md                     # when to use, when not
    NN_<descriptive_name>.py      # one focused script per genuine edge case
  output/                         # gitignored — build artifacts
```
## Operating principles
- **Default tool: declarative YAML.** Most touchups are renames, repositions, dissolves, or filters, all of which are expressible in YAML. Diffs are small, conflicts localize cleanly to one entry, and contributors can submit "fix typo X" as a one-line PR.
- **Escape hatch: `procedural/` directory** of small, named, single-purpose Python scripts for the rare cases YAML can't express cleanly. Each script has a header comment explaining *why* it's not in YAML. See `procedural/README.md` for the bar.
- **Build is reproducible from a pinned NE version.** `build.sh` records the NE git SHA it consumed; outputs are deterministic given inputs.
- **CI regenerates on schema change** and opens a PR if outputs differ. Maintainers review the cartographic diff in legible GeoJSON, not opaque notebook JSON.
## Workflow for adding a fix
1. Identify the upstream NE issue (wrong name, missing territory, etc.).
2. **Try YAML first.** Add the smallest possible entry to the appropriate config file with a `description` field explaining the fix.
3. If YAML can't express it cleanly, add a numbered script in `procedural/` with a header comment explaining why YAML didn't fit.
4. Run `build.sh` locally and verify that the output GeoJSON looks right.
5. Open a PR. The reviewer sees the YAML diff (or new procedural script) plus the regenerated GeoJSON.
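The `description` requirement in step 2 can be checked mechanically before a PR is opened. The sketch below is illustrative only: `missing_descriptions` is a hypothetical helper, not part of the pipeline, and it assumes the config has already been parsed into plain dicts and lists (for example with PyYAML's `yaml.safe_load`). It treats every mapping that sits directly inside a list as an "entry".

```python
from typing import Any, Iterator


def missing_descriptions(node: Any, path: str = "") -> Iterator[str]:
    """Yield the path of every list entry (a mapping) without a non-empty `description`."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from missing_descriptions(value, f"{path}/{key}")
    elif isinstance(node, list):
        for index, item in enumerate(node):
            item_path = f"{path}[{index}]"
            if isinstance(item, dict):
                text = item.get("description")
                if not (isinstance(text, str) and text.strip()):
                    yield item_path
            yield from missing_descriptions(item, item_path)


# Parsed form of a tiny name_overrides-style config.
# The second entry is deliberately missing its description.
config = {
    "overrides": [
        {
            "description": "Fix typo in a department name",
            "match": {"adm0_a3": "FRA", "name": "Haute-Rhin"},
            "set": {"name": "Haut-Rhin"},
        },
        {"match": {"adm0_a3": "FRA", "name": "X"}, "set": {"name": "Y"}},
    ]
}
problems = list(missing_descriptions(config))
```

Here `problems` names the second override, so a pre-commit hook or CI step could fail on a non-empty result.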
## See also
- `SIP_DRAFT.md` (parent dir) — design rationale, notebook audit, obsolescence check
- `procedural/README.md` — when to use the escape hatch


@@ -0,0 +1,130 @@
# Multi-country composite maps that pull features from several Admin 0
# records into a single GeoJSON output, repositioning each into insets.
#
# The canonical example is "France with Overseas" — mainland France plus
# 5 DROM departments (already part of FRA Admin 0) plus territories from
# 5 separate Admin 0 records (PYF, ATF, WLF, NCL, SPM) all repositioned
# around the mainland into one composite map.
#
# Build script reads each composite definition and:
#   1. Loads the base country at the requested admin level
#   2. Applies base_repositions to specified features
#   3. For each addition: pulls the feature from another Admin 0 record,
#      optionally drops sub-polygons by index, repositions, dissolves
#   4. Outputs a single GeoJSON keyed by the composite's identifier
#   5. Plugin exposes it in the country picker with `display_name`
#
# Schema:
#   composites:
#     <composite_id>:
#       description: human-readable
#       display_name: text shown in UI dropdown
#       admin_level: 0 | 1            # which level the composite represents
#       base:
#         adm0_a3: <ISO3>
#       base_repositions:             # optional; applied to base features
#         - description: human-readable
#           match: { name: <feature name>, ... }
#           offset: [x, y]
#           scale: <number>           # optional
#           drop_parts: [<int>, ...]  # optional; drop specific sub-polygon indices
#       additions:                    # features pulled from other Admin 0 records
#         - description: human-readable
#           from:
#             adm0_a3: <ISO3 source>
#           match: { ... }            # which feature(s) to pull
#           dissolve: true            # optional; merge all matched features
#           drop_parts: [<int>, ...]  # optional
#           reposition:
#             offset: [x, y]
#             scale: <number>         # optional
composites:
  # -------------------------------------------------------------------
  # France with Overseas — mainland France + DROMs + sister Admin 0
  # territories under French sovereignty, all repositioned into a single
  # frame. Most complex composite; templates the schema for others.
  # -------------------------------------------------------------------
  france_overseas:
    description: |
      Mainland France plus all overseas territories (DROMs + COMs +
      Polynesia + Southern Lands + Wallis-Futuna + New Caledonia +
      Saint-Pierre-et-Miquelon) shown in one composite map.
    display_name: "France (with overseas)"
    admin_level: 1
    base:
      adm0_a3: FRA
    base_repositions:
      # The 5 overseas DROMs (départements et régions d'outre-mer) — already
      # part of FRA Admin 0 in NE. Repositioned aggressively for layout.
      - description: Reposition Guadeloupe near mainland
        match: { name: Guadeloupe }
        offset: [53.2, 29.0]
        scale: 1.5
      - description: Reposition Martinique near mainland
        match: { name: Martinique }
        offset: [52.8, 27.5]
        scale: 1.5
      - description: Reposition French Guiana (shrunk — it's vast)
        match: { name: "Guyane française" }
        offset: [45.0, 35.5]
        scale: 0.3
      - description: Reposition La Réunion
        match: { name: "La Réunion" }
        offset: [-58.2, 60.5]
        scale: 1.5
      - description: Reposition Mayotte
        match: { name: Mayotte }
        offset: [-50.5, 52.2]
        scale: 2.0
    additions:
      # French Polynesia — only the Windward Islands (Tahiti area), and
      # we drop the second sub-polygon to avoid visual conflict with
      # Corsica when laid out.
      - description: Add Tahiti (Windward Islands) from French Polynesia, drop Rimatuu sub-polygon
        from:
          adm0_a3: PYF
        match: { name: "Windward Islands" }
        drop_parts: [1]
        reposition:
          offset: [158.2, 57.3]
          scale: 2.0
      # French Southern and Antarctic Lands — Kerguelen Islands only.
      - description: Add Archipel des Kerguelen from French Southern Lands
        from:
          adm0_a3: ATF
        match: { name: "Archipel des Kerguelen" }
        reposition:
          offset: [-63.5, 88.5]
          scale: 0.9
      # Wallis and Futuna — dissolve Alo + Uvea into one shape.
      - description: Add Wallis and Futuna (dissolved)
        from:
          adm0_a3: WLF
        dissolve: true
        reposition:
          offset: [170, 52.5]
          scale: 4.0
      # New Caledonia — dissolve all subdivisions.
      - description: Add New Caledonia (dissolved)
        from:
          adm0_a3: NCL
        dissolve: true
        reposition:
          offset: [-165.5, 60.4]
          scale: 0.4
      # Saint-Pierre and Miquelon — dissolved
      # NOTE: notebook cell 63 was truncated in the audit; offsets TBD
      # during build script implementation. Placeholder values below.
      - description: Add Saint-Pierre and Miquelon (dissolved)
        from:
          adm0_a3: SPM
        dissolve: true
        reposition:
          offset: [0, 0]  # TODO: extract from full notebook cell
          scale: 1.0
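The `drop_parts` field used for the Windward Islands entry can be illustrated with a few lines of Python. This is a sketch under assumptions rather than the build script's behaviour: `drop_parts` here is a hypothetical function, the indices are assumed to be zero-based (consistent with `[1]` being described as "the second sub-polygon"), and they are assumed to follow the order of polygons in a GeoJSON `MultiPolygon`.

```python
import copy


def drop_parts(feature: dict, indices: list) -> dict:
    """Return a copy of a MultiPolygon feature without the sub-polygons at `indices`."""
    if feature["geometry"]["type"] != "MultiPolygon":
        raise ValueError("drop_parts only applies to MultiPolygon geometries")
    unwanted = set(indices)
    result = copy.deepcopy(feature)  # leave the caller's feature untouched
    polygons = result["geometry"]["coordinates"]
    result["geometry"]["coordinates"] = [
        polygon for index, polygon in enumerate(polygons) if index not in unwanted
    ]
    return result


def square(x: float, y: float) -> list:
    """One polygon (a single closed ring) used as toy geometry."""
    return [[[x, y], [x + 1, y], [x + 1, y + 1], [x, y + 1], [x, y]]]


# Three toy islands; the real geometries come from Natural Earth.
islands = {
    "type": "Feature",
    "properties": {"name": "Windward Islands"},
    "geometry": {"type": "MultiPolygon", "coordinates": [square(0, 0), square(5, 5), square(9, 9)]},
}
trimmed = drop_parts(islands, [1])
```

Index-based dropping is brittle if upstream reorders polygons, so a build script might also want to log the bounding box of each dropped part for review.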


@@ -0,0 +1,137 @@
# Per-country handling of far-flung territories.
#
# Two operations per country, both build-time:
#   - repositions: territories moved into insets near the mainland
#   - drop_outside_bbox: territories outside this bbox dropped entirely
#
# At chart-render time, the "show flying islands" toggle controls which
# repositioned/dropped territories are visible:
#   - ON (default): the chart's renderer applies the repositions defined
#     here, OR uses the composite_projection if specified (preferred when
#     available — see geoAlbersUsa for example).
#   - OFF: territories matched by EITHER `repositions` OR
#     `drop_outside_bbox` are filtered out of the rendered feature set.
#
# Schema:
#   countries:
#     <ISO3>:
#       composite_projection: <d3 projection name>  # optional, takes
#           # precedence over repositions when the renderer supports it
#       repositions:
#         - description: human-readable why
#           match: { name: <feature name>, ... }
#           offset: [x, y]        # required; degrees
#           scale: <number>       # optional; default 1.0
#           simplify: <number>    # optional; mapshaper -simplify factor
#       drop_outside_bbox:
#         description: human-readable why
#         nw: [lon, lat]          # northwest corner
#         se: [lon, lat]          # southeast corner
#
# Match semantics: same as name_overrides.yaml — all conditions AND'd.
# `match.name: { in: [a, b, c] }` matches any of a, b, c.
countries:
  # -------------------------------------------------------------------
  # USA — Hawaii, Alaska
  # D3's geoAlbersUsa handles these natively at render time. We still
  # ship build-time repositions as a fallback for renderers that don't
  # support composite projections.
  # -------------------------------------------------------------------
  USA:
    composite_projection: geoAlbersUsa
    repositions:
      - description: Bring Hawaii in alongside California
        match: { name: Hawaii }
        offset: [51, 5.5]
      - description: Shrink and reposition Alaska below the lower 48
        match: { name: Alaska }
        offset: [35, -34]
        scale: 0.35
  # -------------------------------------------------------------------
  # Norway — Svalbard
  # -------------------------------------------------------------------
  NOR:
    repositions:
      - description: Bring Svalbard in closer to mainland Norway
        match: { name: Svalbard }
        offset: [-12, -8]
        scale: 0.5
  # -------------------------------------------------------------------
  # Portugal — Atlantic islands
  # -------------------------------------------------------------------
  PRT:
    repositions:
      - description: Pull the Azores closer to mainland
        match: { name: Azores }
        offset: [11, 0]
      - description: Pull Madeira closer; small extra simplify because dense coastline
        match: { name: Madeira }
        offset: [6, 2]
        simplify: 0.015
  # -------------------------------------------------------------------
  # Spain — Canary Islands
  # -------------------------------------------------------------------
  ESP:
    repositions:
      - description: Bring Canary Islands closer to mainland Spain
        match:
          name: { in: ["Las Palmas", "Santa Cruz de Tenerife"] }
        offset: [3, 7]
  # -------------------------------------------------------------------
  # France — Overseas DROMs (Départements et régions d'outre-mer)
  # For the full France-with-Overseas composite (incl. PYF, ATF, WLF,
  # NCL, SPM), see composite_maps.yaml.
  # -------------------------------------------------------------------
  FRA:
    composite_projection: geoConicConformalFrance  # d3-composite-projections name; used only if available in renderer
    repositions:
      - description: Reposition Guadeloupe near mainland France
        match: { name: Guadeloupe }
        offset: [57.4, 25.4]
        scale: 1.5
      - description: Reposition Martinique near mainland France
        match: { name: Martinique }
        offset: [58.4, 27.1]
        scale: 1.5
      - description: Reposition French Guiana (shrunk significantly; it is by far the largest DROM, though much smaller than mainland France)
        match: { name: "Guyane française" }
        offset: [52, 37.7]
        scale: 0.35
      - description: Reposition La Réunion
        match: { name: "La Réunion" }
        offset: [-55, 62.8]
        scale: 1.5
      - description: Reposition Mayotte
        match: { name: Mayotte }
        offset: [-43, 54.3]
        scale: 1.5
  # -------------------------------------------------------------------
  # Netherlands — drop Caribbean territories
  # The Caribbean Netherlands (Bonaire, Sint Eustatius, Saba) and the
  # constituent countries (Aruba, Curaçao, Sint Maarten) are far from
  # mainland NL. The notebook drops them rather than repositioning;
  # we preserve that editorial choice.
  # -------------------------------------------------------------------
  NLD:
    drop_outside_bbox:
      description: Drop Caribbean Netherlands & ABC islands; keep mainland + Frisian islands
      nw: [-20, 60]
      se: [20, 20]
  # -------------------------------------------------------------------
  # United Kingdom — drop British Overseas Territories
  # Same pattern as NLD — drop (don't reposition) territories far from
  # the British Isles.
  # -------------------------------------------------------------------
  GBR:
    drop_outside_bbox:
      description: Drop British Overseas Territories; keep British Isles
      nw: [-10, 60]
      se: [20, 20]
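The two build-time operations described in this file's header can be sketched in plain Python over GeoJSON coordinate arrays. This is an illustration under stated assumptions, not the pipeline's implementation: `reposition` and `inside_bbox` are hypothetical names, scaling is assumed to happen about the centre of the feature's bounding box before the offset (in degrees) is added, and a feature is assumed to survive `drop_outside_bbox` only when every one of its positions lies inside the box.

```python
from typing import Any, Callable, Iterator


def _positions(coords: Any) -> Iterator[list]:
    """Yield every [x, y] position in an arbitrarily nested coordinate array."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords
    else:
        for child in coords:
            yield from _positions(child)


def _map_positions(coords: Any, fn: Callable[[list], list]) -> Any:
    """Rebuild the same nesting with `fn` applied to each position."""
    if coords and isinstance(coords[0], (int, float)):
        return fn(coords)
    return [_map_positions(child, fn) for child in coords]


def reposition(geometry: dict, offset: list, scale: float = 1.0) -> dict:
    """Scale about the bounding-box centre (an assumption), then translate by `offset`."""
    xs, ys = zip(*((p[0], p[1]) for p in _positions(geometry["coordinates"])))
    cx, cy = (min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2
    dx, dy = offset

    def move(p: list) -> list:
        return [cx + (p[0] - cx) * scale + dx, cy + (p[1] - cy) * scale + dy]

    return {**geometry, "coordinates": _map_positions(geometry["coordinates"], move)}


def inside_bbox(geometry: dict, nw: list, se: list) -> bool:
    """True when all positions fall within the box given by nw=[lon, lat], se=[lon, lat]."""
    west, north = nw
    east, south = se
    return all(
        west <= p[0] <= east and south <= p[1] <= north
        for p in _positions(geometry["coordinates"])
    )


# Toy geometries with illustrative coordinates only.
square = {"type": "Polygon", "coordinates": [[[0, 0], [2, 0], [2, 2], [0, 2], [0, 0]]]}
moved = reposition(square, offset=[10, 5], scale=0.5)
near = {"type": "Point", "coordinates": [5.0, 52.0]}    # roughly mainland NL
far = {"type": "Point", "coordinates": [-68.0, 12.0]}   # roughly the Caribbean
keep_near = inside_bbox(near, nw=[-20, 60], se=[20, 20])
keep_far = inside_bbox(far, nw=[-20, 60], se=[20, 20])
```

Offsets in degrees are only meaningful for unprojected lon/lat data, which is one reason the header prefers a composite projection where the renderer has one.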


@@ -0,0 +1,81 @@
# Per-feature attribute corrections to Natural Earth data.
#
# Use when NE has a wrong value for a specific feature: typos, outdated
# administrative names, deprecated ISO codes, etc.
# For one-off geometry fixes, use procedural/ scripts instead.
#
# Schema:
#   overrides:
#     - description: Human-readable why this override exists (REQUIRED)
#       match:
#         adm0_a3: <ISO3 country code>   # required: scope to one country
#         <field>: <value>               # one or more match conditions
#       set:
#         <field>: <value>               # one or more fields to set
#     [...]
#
# Match semantics: ALL conditions must match (logical AND). Overrides
# apply to both Admin 0 and Admin 1 features unless the match conditions
# restrict the scope further.
#
# Tracking: each override should be revisited periodically against
# upstream NE — many of these become obsolete when NE catches up.
overrides:
  # -------------------------------------------------------------------
  # France — typos in NE attribute table (NE 5.x still ships these)
  # -------------------------------------------------------------------
  - description: Fix typo "Seien-et-Marne" → "Seine-et-Marne"
    match: { adm0_a3: FRA, name: "Seien-et-Marne" }
    set: { name: "Seine-et-Marne" }
  - description: Fix typo "Haute-Rhin" → "Haut-Rhin"
    match: { adm0_a3: FRA, name: "Haute-Rhin" }
    set: { name: "Haut-Rhin" }
  # -------------------------------------------------------------------
  # France — update ISO 3166-2 codes to current values
  # NE still uses pre-2016 region codes; map them to current standard.
  # -------------------------------------------------------------------
  - description: Paris uses ISO 3166-2 code FR-75C as of 2016 (NE has FR-75)
    match: { adm0_a3: FRA, iso_3166_2: "FR-75" }
    set: { iso_3166_2: "FR-75C" }
  - description: Guadeloupe is FR-971 in current ISO (NE has FR-GP)
    match: { adm0_a3: FRA, iso_3166_2: "FR-GP" }
    set: { iso_3166_2: "FR-971" }
  - description: Martinique is FR-972 in current ISO (NE has FR-MQ)
    match: { adm0_a3: FRA, iso_3166_2: "FR-MQ" }
    set: { iso_3166_2: "FR-972" }
  - description: French Guiana is FR-973 in current ISO (NE has FR-GF)
    match: { adm0_a3: FRA, iso_3166_2: "FR-GF" }
    set: { iso_3166_2: "FR-973" }
  - description: La Réunion is FR-974 in current ISO (NE has FR-RE)
    match: { adm0_a3: FRA, iso_3166_2: "FR-RE" }
    set: { iso_3166_2: "FR-974" }
  - description: Mayotte is FR-976 in current ISO (NE has FR-YT)
    match: { adm0_a3: FRA, iso_3166_2: "FR-YT" }
    set: { iso_3166_2: "FR-976" }
  # -------------------------------------------------------------------
  # Philippines — administrative renames
  # -------------------------------------------------------------------
  - description: Region XIII is named Caraga; NE labels it "Dinagat Islands", which is one of its provinces
    match: { adm0_a3: PHL, region: "Dinagat Islands (Region XIII)" }
    set: { region: "Caraga Administrative Region (Region XIII)" }
  - description: ARMM reorganized as BARMM under the Bangsamoro Organic Law (2018-2019)
    match: { adm0_a3: PHL, region: "Autonomous Region in Muslim Mindanao (ARMM)" }
    set: { region: "Bangsamoro Autonomous Region in Muslim Mindanao (BARMM)" }
  # -------------------------------------------------------------------
  # NOT included here — handled by other mechanisms:
  #   - Vietnam diacritics → use NE's NAME_VI field via name_language=vi
  #   - Crimea/Sevastopol → handled by NE _ukr worldview selection
  #   - China + SARs → see territory_assignments.yaml
  #   - Finland + Åland → see territory_assignments.yaml
  #   - France-with-Overseas → see composite_maps.yaml
  # -------------------------------------------------------------------
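The match semantics described in this file's header (all conditions AND'd, plus the `{ in: [...] }` form used by the other configs) fit in a few lines of Python. The sketch below is a possible reading, not the build script itself: `matches` and `apply_overrides` are hypothetical names, and features are assumed to be GeoJSON-style dicts with a `properties` mapping.

```python
def matches(properties: dict, match: dict) -> bool:
    """AND of all conditions; a {"in": [...]} condition accepts any listed value."""
    for field, expected in match.items():
        actual = properties.get(field)
        if isinstance(expected, dict) and "in" in expected:
            if actual not in expected["in"]:
                return False
        elif actual != expected:
            return False
    return True


def apply_overrides(features: list, overrides: list) -> list:
    """Apply each override's `set` in place; return how many features each one touched."""
    hits = []
    for override in overrides:
        count = 0
        for feature in features:
            if matches(feature["properties"], override["match"]):
                feature["properties"].update(override["set"])
                count += 1
        hits.append(count)
    return hits


features = [
    {"properties": {"adm0_a3": "FRA", "name": "Seien-et-Marne"}},
    {"properties": {"adm0_a3": "FRA", "name": "Yvelines"}},
]
overrides = [
    {
        "description": "Fix typo in Seine-et-Marne",
        "match": {"adm0_a3": "FRA", "name": "Seien-et-Marne"},
        "set": {"name": "Seine-et-Marne"},
    },
    {
        "description": "Already fixed upstream in this toy data",
        "match": {"adm0_a3": "FRA", "name": "Haute-Rhin"},
        "set": {"name": "Haut-Rhin"},
    },
]
hits = apply_overrides(features, overrides)
```

An override with zero hits is a candidate for the periodic obsolescence review mentioned in the header, so returning per-override counts makes that review cheap.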


@@ -0,0 +1,94 @@
# Dissolve Admin 1 features into coarser administrative regions.
#
# Some countries have a meaningful intermediate level between Admin 0
# (country) and Admin 1 (provinces/states/departments). Examples:
# - Turkey: NUTS-1 statistical regions (12 regions from 81 provinces)
# - France: 18 administrative regions dissolved from 101 departments
# - Italy: 20 regions dissolved from 110 provinces
# - Philippines: 17 regions dissolved from 118 provinces+cities
#
# For each defined region set, the build script:
#   1. Loads the country's Admin 1 features
#   2. Dissolves features by the mapping below
#   3. Outputs a new GeoJSON keyed by `<country>_<set_name>`
#   4. Plugin exposes it as a third "admin level" option in the UI:
#      "Admin 0 (countries) / Admin 1 (subdivisions) / Aggregated regions"
#
# Schema:
#   countries:
#     <ISO3>:
#       region_sets:
#         <set_name>:                  # arbitrary identifier
#           description: human-readable
#           display_name: text shown in UI dropdown
#           grouping_field: <field>    # field on Admin 1 features used to group
#           # OR
#           explicit_mapping:          # explicit region_code → member ISO 3166-2 codes
#             <region_code>:
#               name: <display name>
#               members: [<iso_3166_2>, ...]
countries:
  # -------------------------------------------------------------------
  # Turkey — NUTS-1 statistical regions
  # Hand-coded mapping of 81 provinces → 12 regions per Eurostat NUTS-1
  # classification adapted for Türkiye.
  # -------------------------------------------------------------------
  TUR:
    region_sets:
      nuts_1:
        description: Eurostat NUTS-1 statistical regions for Türkiye
        display_name: "Türkiye (NUTS-1 regions)"
        explicit_mapping:
          TR1: { name: "İstanbul", members: [TR-34] }
          TR2: { name: "Batı Marmara", members: [TR-59, TR-22, TR-39, TR-10, TR-17] }
          TR3: { name: "Ege", members: [TR-35, TR-09, TR-20, TR-48, TR-45, TR-03, TR-43, TR-64] }
          TR4: { name: "Doğu Marmara", members: [TR-16, TR-26, TR-11, TR-41, TR-54, TR-81, TR-14, TR-77] }
          TR5: { name: "Batı Anadolu", members: [TR-06, TR-42, TR-70] }
          TR6: { name: "Akdeniz", members: [TR-07, TR-32, TR-15, TR-01, TR-33, TR-31, TR-46, TR-80] }
          TR7: { name: "Orta Anadolu", members: [TR-71, TR-68, TR-51, TR-50, TR-40, TR-38, TR-58, TR-66] }
          TR8: { name: "Batı Karadeniz", members: [TR-67, TR-78, TR-74, TR-37, TR-18, TR-57, TR-55, TR-60, TR-19, TR-05] }
          TR9: { name: "Doğu Karadeniz", members: [TR-61, TR-52, TR-28, TR-53, TR-08, TR-29] }
          TRA: { name: "Kuzeydoğu Anadolu", members: [TR-25, TR-24, TR-69, TR-04, TR-36, TR-76, TR-75] }
          TRB: { name: "Ortadoğu Anadolu", members: [TR-44, TR-23, TR-12, TR-62, TR-65, TR-49, TR-13, TR-30] }
          TRC: { name: "Güneydoğu Anadolu", members: [TR-27, TR-02, TR-79, TR-63, TR-21, TR-47, TR-72, TR-73, TR-56] }
  # -------------------------------------------------------------------
  # France — 18 administrative regions (since 2016 reform)
  # Use NE's `region_cod` field to group departments. After name fixes
  # in name_overrides.yaml, the codes should align with the 2016 reform.
  # -------------------------------------------------------------------
  FRA:
    region_sets:
      regions:
        description: French administrative regions (post-2016 reform)
        display_name: "France (regions)"
        grouping_field: region_cod
  # -------------------------------------------------------------------
  # Italy — 20 regions
  # -------------------------------------------------------------------
  ITA:
    region_sets:
      regions:
        description: Italian administrative regions
        display_name: "Italy (regions)"
        grouping_field: region_cod
  # -------------------------------------------------------------------
  # Philippines — 17 regions (after Caraga / BARMM renames)
  # -------------------------------------------------------------------
  PHL:
    region_sets:
      regions:
        description: Philippine administrative regions
        display_name: "Philippines (regions)"
        grouping_field: region
  # -------------------------------------------------------------------
  # Future candidates (not yet enabled — verify NE field availability):
  #   - DEU: Bundesländer aggregation if NE provides Kreise as Admin 1
  #   - GBR: NUTS-1 regions (England + Wales + Scotland + NI subdivisions)
  #   - USA: BEA regions, Census divisions
  # -------------------------------------------------------------------
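The grouping half of the dissolve step for an `explicit_mapping` can be sketched without any geometry library. The code below is a hedged illustration: `invert_mapping` and `group_features` are hypothetical helpers, features are assumed to carry an `iso_3166_2` property, and the actual polygon union is left to whatever tool the build script ends up using.

```python
from collections import defaultdict


def invert_mapping(explicit_mapping: dict) -> dict:
    """Turn {region_code: {name, members}} into {iso_3166_2: region_code}."""
    member_to_region = {}
    for region_code, region in explicit_mapping.items():
        for member in region["members"]:
            if member in member_to_region:
                raise ValueError(
                    f"{member} is listed in both {member_to_region[member]} and {region_code}"
                )
            member_to_region[member] = region_code
    return member_to_region


def group_features(features: list, member_to_region: dict) -> tuple:
    """Bucket Admin 1 features by region; also report codes the mapping does not cover."""
    groups = defaultdict(list)
    unassigned = []
    for feature in features:
        code = feature["properties"].get("iso_3166_2")
        if code in member_to_region:
            groups[member_to_region[code]].append(feature)
        else:
            unassigned.append(code)
    return dict(groups), unassigned


# Two regions copied from the Turkey mapping; TR-99 is a made-up code for the demo.
mapping = {
    "TR1": {"name": "İstanbul", "members": ["TR-34"]},
    "TR5": {"name": "Batı Anadolu", "members": ["TR-06", "TR-42", "TR-70"]},
}
features = [
    {"properties": {"iso_3166_2": code}} for code in ["TR-34", "TR-06", "TR-42", "TR-99"]
]
groups, unassigned = group_features(features, invert_mapping(mapping))
```

Rejecting duplicate members and surfacing unassigned codes catches the two most likely hand-coding mistakes before any geometry is merged.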


@@ -0,0 +1,81 @@
# Pull features from sibling Admin 0 records and add them to a country's
# Admin 1 view, optionally with a renamed iso_3166_2 code and translated
# names.
#
# Use when NE classifies a territory as a separate Admin 0 record but,
# for the purposes of a particular country's Admin 1 chart, it should
# appear inside that country. Common cases:
# - China + Taiwan/HK/Macau (NE has each as separate Admin 0)
# - Finland + Åland (NE has Åland separate; missing from FIN admin 1)
#
# This is NOT a tool for "moving" disputed territories between countries
# — for that, use NE worldview selection (e.g., the _ukr worldview moves
# Crimea from RUS to UKR for free, no config needed).
#
# Schema:
#   countries:
#     <ISO3 destination country>:
#       additions:
#         - description: human-readable why
#           from:
#             adm0_a3: <ISO3 source country>
#           match:
#             name_en: <feature name>        # or other matchers
#           set:
#             iso_3166_2: <new code>         # set when added
#             name: <override display name>  # optional
#             name_<lang>: <translation>     # optional, per language
#         [...]
#
# Match semantics: same as other configs.
countries:
  # -------------------------------------------------------------------
  # China — add Taiwan and the Special Administrative Regions
  # NE keeps Taiwan (TWN), Hong Kong (HKG), and Macau (MAC) as separate
  # Admin 0 records. For the China subdivision view we re-attach them
  # using the numeric codes CN-71/91/92 (ISO 3166-2:CN now lists the
  # alphabetic forms CN-TW/HK/MO), with Chinese names from the official
  # translations.
  # -------------------------------------------------------------------
  CHN:
    additions:
      - description: Add Taiwan as China subdivision CN-71
        from:
          adm0_a3: TWN
        match: { name_en: Taiwan }
        set:
          iso_3166_2: CN-71
          name_zh: 中国台湾
      - description: Add Hong Kong SAR as CN-91
        from:
          adm0_a3: HKG
        match: { name_en: Hong Kong }
        set:
          iso_3166_2: CN-91
          name_zh: 香港特别行政区
      - description: Add Macau SAR as CN-92
        from:
          adm0_a3: MAC
        match: { name_en: Macau }
        set:
          iso_3166_2: CN-92
          name_zh: 澳门特别行政区
  # -------------------------------------------------------------------
  # Finland — add Åland
  # NE has Åland as a separate Admin 0 record (ALA) and it is missing
  # from the FIN admin1 dataset. Re-attach it as FI-01 with the Finnish
  # name "Ahvenanmaan maakunta".
  # -------------------------------------------------------------------
  FIN:
    additions:
      - description: Add Åland as Finland subdivision FI-01
        from:
          adm0_a3: ALA
        match: { name_en: Åland }
        set:
          iso_3166_2: FI-01
          name_fi: Ahvenanmaan maakunta
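The "pull a feature from a sibling Admin 0 record" primitive can be sketched as follows. Everything here is an assumption made for illustration: `pull_additions` is a hypothetical helper, Admin 0 features are assumed to expose `adm0_a3` and `name_en` properties, matching is plain equality, and the helper accepts `match` either beside or inside `from` because that detail belongs to the real build script.

```python
import copy


def pull_additions(admin0_features: list, additions: list) -> list:
    """Copy matching features out of their source Admin 0 record and apply `set`."""
    pulled = []
    for addition in additions:
        source = addition["from"]["adm0_a3"]
        # Accept `match` at either nesting level; default to "match everything".
        wanted = addition.get("match") or addition["from"].get("match") or {}
        for feature in admin0_features:
            props = feature["properties"]
            if props.get("adm0_a3") != source:
                continue
            if all(props.get(field) == value for field, value in wanted.items()):
                clone = copy.deepcopy(feature)  # never mutate the source record
                clone["properties"].update(addition.get("set", {}))
                pulled.append(clone)
    return pulled


# Toy Admin 0 data; real features come from Natural Earth and carry geometry.
admin0 = [
    {"properties": {"adm0_a3": "ALA", "name_en": "Åland"}, "geometry": None},
    {"properties": {"adm0_a3": "SWE", "name_en": "Sweden"}, "geometry": None},
]
fin_additions = [
    {
        "description": "Add Åland as Finland subdivision FI-01",
        "from": {"adm0_a3": "ALA"},
        "match": {"name_en": "Åland"},
        "set": {"iso_3166_2": "FI-01", "name_fi": "Ahvenanmaan maakunta"},
    }
]
extra = pull_additions(admin0, fin_additions)
```

Because the composite maps use the same primitive, a shared helper of this shape could serve both configs, with repositioning and dissolving layered on top for composites.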


@@ -0,0 +1,62 @@
# Procedural escape hatch
Small, named, single-purpose Python scripts for the rare cases where declarative YAML in `../config/` can't cleanly express a fix.
## When to put a script here
Use this directory when **all** of the following are true:
- You've tried to express the fix in YAML and the resulting schema is awkward, ambiguous, or requires a one-off type to be added
- The fix is small (typically <50 lines of code, single conceptual operation)
- The fix is tied to a *specific feature* in the data (not a generalizable transform)
## When NOT to put a script here
If any of the following apply, the fix belongs in `../config/` instead:
- It's a typo, rename, or attribute correction → `name_overrides.yaml`
- It's a reposition or bbox drop of a known territory → `flying_islands.yaml`
- It's adding a feature from another country → `territory_assignments.yaml`
- It's dissolving Admin 1 into a coarser admin level → `regional_aggregations.yaml`
- It's a multi-country composite → `composite_maps.yaml`
If the same kind of operation surfaces here twice, that's a signal to extend a YAML schema rather than ship a third script.
## Script conventions
- **Filename:** `NN_<descriptive_snake_case>.py`. The numeric prefix sets execution order; the name documents intent.
- **Header comment:** required. Must explain *what* the script does AND *why* this couldn't be expressed in YAML. If the "why" is weak, push it back into YAML.
- **Interface:** each script defines `def apply(geo: dict) -> dict` taking a parsed GeoJSON FeatureCollection and returning the modified one. The build orchestrator handles I/O.
- **No side effects** other than the returned data: no network calls, no file writes, and no `print` to stdout (log to `sys.stderr` if needed).
- **Pure function over GeoJSON.** Don't import shapely/geopandas unless the operation truly needs polygon math; many fixes are just attribute mutations.
## Skeleton
```python
"""
NN_descriptive_name.py
======================

WHAT: One-sentence summary of what this script does to the data.

WHY:  One-paragraph explanation of why this couldn't be expressed in
      ../config/<some_yaml>.yaml. If you find yourself writing
      "because I didn't want to add a field to the schema", push the
      fix into the YAML schema instead.

UPSTREAM TRACKING: link to NE issue / community discussion / blog post
      explaining the underlying source of the problem, so future
      maintainers can re-evaluate when upstream catches up.
"""
import sys


def apply(geo: dict) -> dict:
    # ... mutate features ...
    return geo
```
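The conventions above imply an orchestrator that discovers scripts by filename and chains their `apply` functions. A minimal sketch of that idea follows; it is not the build orchestrator itself. `load_procedural_steps` and `run_procedural` are hypothetical names, and the two-digit `NN_` prefix is an assumption about how the numbering will be written.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_procedural_steps(directory: Path) -> list:
    """Import every NN_*.py in `directory`, sorted by filename, and collect apply()."""
    steps = []
    for path in sorted(directory.glob("[0-9][0-9]_*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        steps.append((path.name, module.apply))
    return steps


def run_procedural(geo: dict, directory: Path) -> dict:
    """Thread a FeatureCollection through each script in filename order."""
    for name, apply in load_procedural_steps(directory):
        geo = apply(geo)
        if not isinstance(geo, dict) or geo.get("type") != "FeatureCollection":
            raise TypeError(f"{name}: apply() must return a GeoJSON FeatureCollection")
    return geo


# Demo with two throwaway scripts written in reverse order to show the sorting.
# The "order" member exists only so the demo can record which script ran when.
SCRIPT = "def apply(geo):\n    geo['order'].append({label!r})\n    return geo\n"
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "02_second.py").write_text(SCRIPT.format(label="second"))
    (folder / "01_first.py").write_text(SCRIPT.format(label="first"))
    (folder / "README.md").write_text("ignored: does not match the NN_ pattern")
    result = run_procedural(
        {"type": "FeatureCollection", "features": [], "order": []}, folder
    )
```

Checking the return type after every step keeps a misbehaving script from silently corrupting the input of the next one.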
## Currently empty
There are no procedural scripts yet. The audit suggested the France-with-Overseas Windward Islands sub-polygon drop *might* warrant one, but `composite_maps.yaml` already has a `drop_parts` field that covers it. We'll add scripts here only if/when a genuine edge case proves YAML can't express it.