superset2/superset-frontend/plugins/plugin-chart-country-map/scripts/config/name_overrides.yaml
Evan Rusackas 1eb48e94fc feat(country-map): scaffold scripts/ dir with YAML config schemas
First-pass schemas for the build pipeline's declarative config layer.
Each schema is documented inline + populated with concrete entries
ported from the legacy notebook's audited touchups (those that the
obsolescence check determined still need to ship).

scripts/
├── README.md                 — pipeline overview, layout, workflow
├── config/
│   ├── name_overrides.yaml         — France typos, ISO codes; PHL renames
│   ├── flying_islands.yaml         — USA/NOR/PRT/ESP/FRA repositions; NLD/GBR drops
│   ├── territory_assignments.yaml  — China + SARs; Finland + Åland
│   ├── regional_aggregations.yaml  — Turkey NUTS-1; FRA/ITA/PHL regions
│   └── composite_maps.yaml         — France-with-Overseas
└── procedural/
    └── README.md             — escape-hatch rules + skeleton (currently empty)

All five YAML files parse cleanly (validated with PyYAML).

Schema design choices:
- Every entry has a `description:` field. Forces honest documentation
  of why each fix exists; reviewers can scan rationale at a glance.
- Match semantics: simple AND-of-conditions; supports `{ in: [...] }`
  for value-set matching.
- composite_maps and territory_assignments share the "pull feature
  from sibling Admin 0" primitive; build script can implement once.
- composite_maps.yaml has a TODO marker for SPM offsets — notebook
  cell 63 was truncated in the audit; will backfill during build
  script implementation.
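
The match semantics described above can be sketched as follows. This is a minimal illustration, not the build script itself: `condition_matches` and `feature_matches` are hypothetical names, and reading `{ in: [...] }` as "attribute value is one of the listed values" is an assumption based on the phrase "value-set matching".

```python
from typing import Any, Mapping


def condition_matches(actual: Any, expected: Any) -> bool:
    """Check one feature attribute against one match condition.

    Assumption: a mapping with an "in" key denotes a value set;
    anything else is compared for plain equality.
    """
    if isinstance(expected, Mapping) and "in" in expected:
        return actual in expected["in"]
    return actual == expected


def feature_matches(properties: Mapping[str, Any], match: Mapping[str, Any]) -> bool:
    """AND-of-conditions: every key in `match` must be satisfied."""
    return all(
        condition_matches(properties.get(field), expected)
        for field, expected in match.items()
    )


# Illustrative attributes only, not copied from Natural Earth.
props = {"adm0_a3": "FRA", "iso_3166_2": "FR-GP", "name": "Guadeloupe"}

print(feature_matches(props, {"adm0_a3": "FRA", "iso_3166_2": {"in": ["FR-GP", "FR-MQ"]}}))  # True
print(feature_matches(props, {"adm0_a3": "FRA", "name": "Martinique"}))  # False
```

Keeping the per-condition check separate from the AND loop means further condition forms could be added in one place without touching the callers.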

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-12 15:56:04 -07:00

82 lines
3.7 KiB
YAML

# Per-feature attribute corrections to Natural Earth data.
#
# Use when NE has a wrong value for a specific feature: typos, outdated
# administrative names, deprecated ISO codes, etc.
# For one-off geometry fixes, use procedural/ scripts instead.
#
# Schema:
#   overrides:
#     - description: Human-readable reason this override exists (REQUIRED)
#       match:
#         adm0_a3: <ISO3 country code>   # required: scope to one country
#         <field>: <value>               # one or more match conditions
#       set:
#         <field>: <value>               # one or more fields to set
#     [...]
#
# Match semantics: ALL conditions must match (logical AND). Overrides apply
# to both Admin 0 and Admin 1 features unless the scope is restricted further.
#
# Tracking: each override should be revisited periodically against
# upstream NE — many of these become obsolete when NE catches up.
overrides:
  # -------------------------------------------------------------------
  # France — typos in NE attribute table (NE 5.x still ships these)
  # -------------------------------------------------------------------
  - description: Fix typo "Seien-et-Marne" → "Seine-et-Marne"
    match: { adm0_a3: FRA, name: "Seien-et-Marne" }
    set: { name: "Seine-et-Marne" }
  - description: Fix typo "Haute-Rhin" → "Haut-Rhin"
    match: { adm0_a3: FRA, name: "Haute-Rhin" }
    set: { name: "Haut-Rhin" }
  # -------------------------------------------------------------------
  # France — update ISO 3166-2 codes to current values
  # NE still uses pre-2016 region codes; map them to current standard.
  # -------------------------------------------------------------------
  - description: Paris uses ISO 3166-2 code FR-75C as of 2016 (NE has FR-75)
    match: { adm0_a3: FRA, iso_3166_2: "FR-75" }
    set: { iso_3166_2: "FR-75C" }
  - description: Guadeloupe is FR-971 in current ISO (NE has FR-GP)
    match: { adm0_a3: FRA, iso_3166_2: "FR-GP" }
    set: { iso_3166_2: "FR-971" }
  - description: Martinique is FR-972 in current ISO (NE has FR-MQ)
    match: { adm0_a3: FRA, iso_3166_2: "FR-MQ" }
    set: { iso_3166_2: "FR-972" }
  - description: French Guiana is FR-973 in current ISO (NE has FR-GF)
    match: { adm0_a3: FRA, iso_3166_2: "FR-GF" }
    set: { iso_3166_2: "FR-973" }
  - description: La Réunion is FR-974 in current ISO (NE has FR-RE)
    match: { adm0_a3: FRA, iso_3166_2: "FR-RE" }
    set: { iso_3166_2: "FR-974" }
  - description: Mayotte is FR-976 in current ISO (NE has FR-YT)
    match: { adm0_a3: FRA, iso_3166_2: "FR-YT" }
    set: { iso_3166_2: "FR-976" }
  # -------------------------------------------------------------------
  # Philippines — administrative renames
  # -------------------------------------------------------------------
  - description: Region XIII is named "Caraga" (NE still says "Dinagat Islands")
    match: { adm0_a3: PHL, region: "Dinagat Islands (Region XIII)" }
    set: { region: "Caraga Administrative Region (Region XIII)" }
  - description: ARMM reorganized as BARMM under the Bangsamoro Organic Law (2018-2019)
    match: { adm0_a3: PHL, region: "Autonomous Region in Muslim Mindanao (ARMM)" }
    set: { region: "Bangsamoro Autonomous Region in Muslim Mindanao (BARMM)" }
  # -------------------------------------------------------------------
  # NOT included here — handled by other mechanisms:
  #   - Vietnam diacritics    → use NE's NAME_VI field via name_language=vi
  #   - Crimea/Sevastopol     → handled by NE _ukr worldview selection
  #   - China + SARs          → see territory_assignments.yaml
  #   - Finland + Åland       → see territory_assignments.yaml
  #   - France-with-Overseas  → see composite_maps.yaml
  # -------------------------------------------------------------------
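
The header comment in this file asks that each override be revisited against upstream NE, since entries become obsolete once NE corrects its data. One way a build script might apply these entries and surface stale ones is sketched below. Everything here is an assumption for illustration: `apply_overrides` is a hypothetical helper, features are represented as plain dicts of attributes, and only equality conditions are handled.

```python
from __future__ import annotations

from typing import Any


def apply_overrides(
    features: list[dict[str, Any]],
    overrides: list[dict[str, Any]],
) -> list[str]:
    """Apply overrides in place and return descriptions of unused ones.

    An override that matches no feature is a candidate for removal,
    because upstream data may already contain the corrected value.
    """
    unused: list[str] = []
    for override in overrides:
        hits = 0
        for props in features:
            # Logical AND: every match condition must equal the attribute.
            if all(props.get(key) == value for key, value in override["match"].items()):
                props.update(override["set"])
                hits += 1
        if hits == 0:
            unused.append(override["description"])
    return unused


# Illustrative features: the first still carries a typo, the second is already correct.
features = [
    {"adm0_a3": "FRA", "name": "Seien-et-Marne"},
    {"adm0_a3": "FRA", "name": "Haut-Rhin"},
]
overrides = [
    {
        "description": "Fix typo Seien-et-Marne",
        "match": {"adm0_a3": "FRA", "name": "Seien-et-Marne"},
        "set": {"name": "Seine-et-Marne"},
    },
    {
        "description": "Fix typo Haute-Rhin",
        "match": {"adm0_a3": "FRA", "name": "Haute-Rhin"},
        "set": {"name": "Haut-Rhin"},
    },
]

stale = apply_overrides(features, overrides)
print(features[0]["name"])  # Seine-et-Marne
print(stale)                # ['Fix typo Haute-Rhin']
```

Returning the unused descriptions, rather than failing on them, lets a pipeline log obsolete overrides as warnings while still producing output.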