mirror of
https://github.com/apache/superset.git
synced 2026-05-29 11:45:16 +00:00
feat(country-map): scaffold scripts/ dir with YAML config schemas
First-pass schemas for the build pipeline's declarative config layer.
Each schema is documented inline + populated with concrete entries
ported from the legacy notebook's audited touchups (those that the
obsolescence check determined still need to ship).
scripts/
├── README.md — pipeline overview, layout, workflow
├── config/
│ ├── name_overrides.yaml — France typos, ISO codes; PHL renames
│ ├── flying_islands.yaml — USA/NOR/PRT/ESP/FRA repositions; NLD/GBR drops
│ ├── territory_assignments.yaml — China + SARs; Finland + Åland
│ ├── regional_aggregations.yaml — Turkey NUTS-1; FRA/ITA/PHL regions
│ └── composite_maps.yaml — France-with-Overseas
└── procedural/
└── README.md — escape-hatch rules + skeleton (currently empty)
All five YAML files parse cleanly (validated with PyYAML).
Schema design choices:
- Every entry has a `description:` field. Forces honest documentation
of why each fix exists; reviewers can scan rationale at a glance.
- Match semantics: simple AND-of-conditions; supports `{ in: [...] }`
for value-set matching.
- composite_maps and territory_assignments share the "pull feature
from sibling Admin 0" primitive; build script can implement once.
- composite_maps.yaml has a TODO marker for SPM offsets — notebook
cell 63 was truncated in the audit; will backfill during build
script implementation.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
scripts/README.md
@@ -0,0 +1,43 @@
# Country Map data pipeline

This directory contains the build pipeline that turns upstream Natural Earth data into the GeoJSON files consumed by `@superset-ui/plugin-chart-country-map`.

It replaces the legacy `scripts/Country Map GeoJSON Generator.ipynb` notebook. See `SIP_DRAFT.md` in the parent directory for the full design rationale.

## Layout

```
scripts/
  build.sh                        # one-shot reproducible build
  README.md                       # this file
  config/                         # declarative YAML; handles ~95% of fixes
    name_overrides.yaml           # typos, deprecated ISO codes, admin renames
    flying_islands.yaml           # repositioning + bbox drops for far-flung territories
    territory_assignments.yaml    # add features from sibling Admin 0 records
    regional_aggregations.yaml    # dissolve Admin 1 into administrative regions
    composite_maps.yaml           # multi-country composites (e.g. France-with-Overseas)
  procedural/                     # escape hatch; handles the rare 5%
    README.md                     # when to use, when not
    NN_<descriptive_name>.py      # one focused script per genuine edge case
  output/                         # gitignored; build artifacts
```

## Operating principles

- **Default tool: declarative YAML.** Most touchups are renames, repositions, dissolves, or filters, all of which YAML can express. Diffs are small, conflicts localize cleanly to one entry, and contributors can submit "fix typo X" as a one-line PR.
- **Escape hatch: `procedural/` directory** of small, named, single-purpose Python scripts for the rare cases YAML can't express cleanly. Each script has a header comment explaining *why* it's not in YAML. See `procedural/README.md` for the bar.
- **Build is reproducible from a pinned NE version.** `build.sh` records the Natural Earth (NE) git SHA it consumed; outputs are deterministic given inputs.
- **CI regenerates on schema change** and opens a PR if outputs differ. Maintainers review the cartographic diff in legible GeoJSON, not opaque notebook JSON.

## Workflow for adding a fix

1. Identify the upstream NE issue (wrong name, missing territory, etc.).
2. **Try YAML first.** Add the smallest possible entry to the appropriate config file, with a `description` field explaining the fix.
3. If YAML can't express it cleanly, add a numbered script in `procedural/` with a header comment explaining why YAML didn't fit.
4. Run `build.sh` locally and verify that the output GeoJSON looks right.
5. Open a PR. The reviewer sees the YAML diff (or the new procedural script) plus the regenerated GeoJSON.

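
Step 2 asks for a `description` on every entry. The check below is a minimal sketch of how that rule could be enforced before a build. It assumes the config files have already been parsed into plain dicts and lists, and it treats any mapping that carries a `match` or `from` key as an entry. The function name `missing_descriptions` and that definition of "entry" are illustrative assumptions, not part of the pipeline.

```python
def missing_descriptions(node, path="$"):
    """Return the paths of entries that lack a non-empty `description`.

    Assumption: an "entry" is any mapping with a `match` or `from` key.
    The real build script may define entries differently.
    """
    problems = []
    if isinstance(node, dict):
        is_entry = "match" in node or "from" in node
        if is_entry and not str(node.get("description", "")).strip():
            problems.append(path)
        for key, value in node.items():
            # The matcher, source selector, and `set` payload are not entries.
            if key in ("match", "from", "set"):
                continue
            problems.extend(missing_descriptions(value, f"{path}.{key}"))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            problems.extend(missing_descriptions(item, f"{path}[{index}]"))
    return problems


# Hypothetical parsed config: the second override has no description.
config = {
    "overrides": [
        {"description": "Fix typo", "match": {"adm0_a3": "FRA"}, "set": {"name": "X"}},
        {"match": {"adm0_a3": "PHL"}, "set": {"region": "Y"}},
    ]
}
print(missing_descriptions(config))  # ['$.overrides[1]']
```

Reporting paths rather than raising on the first failure lets a contributor fix every missing rationale in one pass.
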

## See also

- `SIP_DRAFT.md` (parent dir): design rationale, notebook audit, obsolescence check
- `procedural/README.md`: when to use the escape hatch
scripts/config/composite_maps.yaml
@@ -0,0 +1,130 @@
|
||||
# Multi-country composite maps that pull features from several Admin 0
|
||||
# records into a single GeoJSON output, repositioning each into insets.
|
||||
#
|
||||
# The canonical example is "France with Overseas" — mainland France plus
|
||||
# 5 DROM departments (already part of FRA Admin 0) plus territories from
|
||||
# 5 separate Admin 0 records (PYF, ATF, WLF, NCL, SPM) all repositioned
|
||||
# around the mainland into one composite map.
|
||||
#
|
||||
# Build script reads each composite definition and:
|
||||
# 1. Loads the base country at the requested admin level
|
||||
# 2. Applies base_repositions to specified features
|
||||
# 3. For each addition: pulls the feature from another Admin 0 record,
|
||||
# optionally drops sub-polygons by index, repositions, dissolves
|
||||
# 4. Outputs a single GeoJSON keyed by the composite's identifier
|
||||
# 5. Plugin exposes it in the country picker with `display_name`
|
||||
#
|
||||
# Schema:
|
||||
# composites:
|
||||
# <composite_id>:
|
||||
# description: human-readable
|
||||
# display_name: text shown in UI dropdown
|
||||
# admin_level: 0 | 1 # which level the composite represents
|
||||
# base:
|
||||
# adm0_a3: <ISO3>
|
||||
# base_repositions: # optional; applied to base features
|
||||
# - description: human-readable
|
||||
# match: { name: <feature name>, ... }
|
||||
# offset: [x, y]
|
||||
# scale: <number> # optional
|
||||
# drop_parts: [<int>, ...] # optional; drop specific sub-polygon indices
|
||||
# additions: # features pulled from other Admin 0 records
|
||||
# - description: human-readable
|
||||
# from:
|
||||
# adm0_a3: <ISO3 source>
|
||||
# match: { ... } # which feature(s) to pull
|
||||
# dissolve: true # optional; merge all matched features
|
||||
# drop_parts: [<int>, ...] # optional
|
||||
# reposition:
|
||||
# offset: [x, y]
|
||||
# scale: <number> # optional
|
||||
|
||||
composites:
|
||||
|
||||
# -------------------------------------------------------------------
|
||||
# France with Overseas — mainland France + DROMs + sister Admin 0
|
||||
# territories under French sovereignty, all repositioned into a single
|
||||
# frame. Most complex composite; templates the schema for others.
|
||||
# -------------------------------------------------------------------
|
||||
france_overseas:
|
||||
description: |
|
||||
Mainland France plus all overseas territories (DROMs + COMs +
|
||||
Polynesia + Southern Lands + Wallis-Futuna + New Caledonia +
|
||||
Saint-Pierre-et-Miquelon) shown in one composite map.
|
||||
display_name: "France (with overseas)"
|
||||
admin_level: 1
|
||||
base:
|
||||
adm0_a3: FRA
|
||||
base_repositions:
|
||||
# The 5 overseas DROMs (départements et régions d'outre-mer) — already
|
||||
# part of FRA Admin 0 in NE. Repositioned aggressively for layout.
|
||||
- description: Reposition Guadeloupe near mainland
|
||||
match: { name: Guadeloupe }
|
||||
offset: [53.2, 29.0]
|
||||
scale: 1.5
|
||||
- description: Reposition Martinique near mainland
|
||||
match: { name: Martinique }
|
||||
offset: [52.8, 27.5]
|
||||
scale: 1.5
|
||||
- description: Reposition French Guiana (shrunk — it's vast)
|
||||
match: { name: "Guyane française" }
|
||||
offset: [45.0, 35.5]
|
||||
scale: 0.3
|
||||
- description: Reposition La Réunion
|
||||
match: { name: "La Réunion" }
|
||||
offset: [-58.2, 60.5]
|
||||
scale: 1.5
|
||||
- description: Reposition Mayotte
|
||||
match: { name: Mayotte }
|
||||
offset: [-50.5, 52.2]
|
||||
scale: 2.0
|
||||
additions:
|
||||
# French Polynesia — only the Windward Islands (Tahiti area), and
|
||||
# we drop the second sub-polygon to avoid visual conflict with
|
||||
# Corsica when laid out.
|
||||
- description: Add Tahiti (Windward Islands) from French Polynesia, drop Rimatuu sub-polygon
|
||||
from:
|
||||
adm0_a3: PYF
|
||||
match: { name: "Windward Islands" }
|
||||
drop_parts: [1]
|
||||
reposition:
|
||||
offset: [158.2, 57.3]
|
||||
scale: 2.0
|
||||
|
||||
# French Southern and Antarctic Lands — Kerguelen Islands only.
|
||||
- description: Add Archipel des Kerguelen from French Southern Lands
|
||||
from:
|
||||
adm0_a3: ATF
|
||||
match: { name: "Archipel des Kerguelen" }
|
||||
reposition:
|
||||
offset: [-63.5, 88.5]
|
||||
scale: 0.9
|
||||
|
||||
# Wallis and Futuna — dissolve Alo + Uvea into one shape.
|
||||
- description: Add Wallis and Futuna (dissolved)
|
||||
from:
|
||||
adm0_a3: WLF
|
||||
dissolve: true
|
||||
reposition:
|
||||
offset: [170, 52.5]
|
||||
scale: 4.0
|
||||
|
||||
# New Caledonia — dissolve all subdivisions.
|
||||
- description: Add New Caledonia (dissolved)
|
||||
from:
|
||||
adm0_a3: NCL
|
||||
dissolve: true
|
||||
reposition:
|
||||
offset: [-165.5, 60.4]
|
||||
scale: 0.4
|
||||
|
||||
# Saint-Pierre and Miquelon — dissolved
|
||||
# NOTE: notebook cell 63 was truncated in the audit; offsets TBD
|
||||
# during build script implementation. Placeholder values below.
|
||||
- description: Add Saint-Pierre and Miquelon (dissolved)
|
||||
from:
|
||||
adm0_a3: SPM
|
||||
dissolve: true
|
||||
reposition:
|
||||
offset: [0, 0] # TODO: extract from full notebook cell
|
||||
scale: 1.0
|
||||
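
The `offset`, `scale`, and `drop_parts` fields used throughout this file could be applied to a GeoJSON geometry roughly as follows. This is a sketch under stated assumptions, since the build script is not part of this commit: `drop_parts` is taken to index the polygon list of a MultiPolygon (zero-based), scaling is done about the center of the remaining parts' bounding box, and `offset` is added afterwards in degrees. The function name `reposition` is hypothetical.

```python
def reposition(geometry, offset, scale=1.0, drop_parts=()):
    """Drop sub-polygons by index, then scale and translate what remains.

    Assumptions (not confirmed by the config): zero-based part indices,
    scaling about the bounding-box center, offset applied last.
    """
    if geometry["type"] == "Polygon":
        polygons = [geometry["coordinates"]]
    elif geometry["type"] == "MultiPolygon":
        polygons = geometry["coordinates"]
    else:
        raise ValueError(f"unsupported geometry type: {geometry['type']}")

    dropped = set(drop_parts)
    kept = [poly for index, poly in enumerate(polygons) if index not in dropped]
    if not kept:
        raise ValueError("drop_parts removed every polygon")

    # Center of the bounding box of everything that is kept.
    points = [pt for poly in kept for ring in poly for pt in ring]
    xs = [pt[0] for pt in points]
    ys = [pt[1] for pt in points]
    cx = (min(xs) + max(xs)) / 2
    cy = (min(ys) + max(ys)) / 2
    dx, dy = offset

    def move(pt):
        return [cx + (pt[0] - cx) * scale + dx, cy + (pt[1] - cy) * scale + dy]

    moved = [[[move(pt) for pt in ring] for ring in poly] for poly in kept]
    return {"type": "MultiPolygon", "coordinates": moved}


# Hypothetical two-part geometry; the second part is dropped.
islands = {
    "type": "MultiPolygon",
    "coordinates": [
        [[[0, 0], [2, 0], [2, 2], [0, 2], [0, 0]]],
        [[[50, 50], [51, 50], [51, 51], [50, 50]]],
    ],
}
moved = reposition(islands, offset=[10, 5], scale=0.5, drop_parts=[1])
print(moved["coordinates"][0][0][0])  # [10.5, 5.5]
```

Scaling about a shared center keeps the relative layout of an archipelago intact, which scaling each part about its own center would not.
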
scripts/config/flying_islands.yaml
@@ -0,0 +1,137 @@
|
||||
# Per-country handling of far-flung territories.
|
||||
#
|
||||
# Two operations per country, both build-time:
|
||||
# - repositions: territories moved into insets near the mainland
|
||||
# - drop_outside_bbox: territories outside this bbox dropped entirely
|
||||
#
|
||||
# At chart-render time, the "show flying islands" toggle controls which
|
||||
# repositioned/dropped territories are visible:
|
||||
# - ON (default): the chart's renderer applies the repositions defined
|
||||
# here, OR uses the composite_projection if specified (preferred when
|
||||
# available — see geoAlbersUsa for example).
|
||||
# - OFF: territories matched by EITHER `repositions` OR
|
||||
# `drop_outside_bbox` are filtered out of the rendered feature set.
|
||||
#
|
||||
# Schema:
|
||||
# countries:
|
||||
# <ISO3>:
|
||||
# composite_projection: <d3 projection name> # optional, takes
|
||||
# # precedence over repositions when the renderer supports it
|
||||
# repositions:
|
||||
# - description: human-readable why
|
||||
# match: { name: <feature name>, ... }
|
||||
# offset: [x, y] # required; degrees
|
||||
# scale: <number> # optional; default 1.0
|
||||
# simplify: <number> # optional; mapshaper -simplify factor
|
||||
# drop_outside_bbox:
|
||||
# description: human-readable why
|
||||
# nw: [lon, lat] # northwest corner
|
||||
# se: [lon, lat] # southeast corner
|
||||
#
|
||||
# Match semantics: same as name_overrides.yaml — all conditions AND'd.
|
||||
# `match.name: { in: [a, b, c] }` matches any of a, b, c.
|
||||
|
||||
countries:
|
||||
|
||||
# -------------------------------------------------------------------
|
||||
# USA — Hawaii, Alaska
|
||||
# D3's geoAlbersUsa handles these natively at render time. We still
|
||||
# ship build-time repositions as a fallback for renderers that don't
|
||||
# support composite projections.
|
||||
# -------------------------------------------------------------------
|
||||
USA:
|
||||
composite_projection: geoAlbersUsa
|
||||
repositions:
|
||||
- description: Bring Hawaii in alongside California
|
||||
match: { name: Hawaii }
|
||||
offset: [51, 5.5]
|
||||
- description: Shrink and reposition Alaska below the lower 48
|
||||
match: { name: Alaska }
|
||||
offset: [35, -34]
|
||||
scale: 0.35
|
||||
|
||||
# -------------------------------------------------------------------
|
||||
# Norway — Svalbard
|
||||
# -------------------------------------------------------------------
|
||||
NOR:
|
||||
repositions:
|
||||
- description: Bring Svalbard in closer to mainland Norway
|
||||
match: { name: Svalbard }
|
||||
offset: [-12, -8]
|
||||
scale: 0.5
|
||||
|
||||
# -------------------------------------------------------------------
|
||||
# Portugal — Atlantic islands
|
||||
# -------------------------------------------------------------------
|
||||
PRT:
|
||||
repositions:
|
||||
- description: Pull the Azores closer to mainland
|
||||
match: { name: Azores }
|
||||
offset: [11, 0]
|
||||
- description: Pull Madeira closer; small extra simplify because dense coastline
|
||||
match: { name: Madeira }
|
||||
offset: [6, 2]
|
||||
simplify: 0.015
|
||||
|
||||
# -------------------------------------------------------------------
|
||||
# Spain — Canary Islands
|
||||
# -------------------------------------------------------------------
|
||||
ESP:
|
||||
repositions:
|
||||
- description: Bring Canary Islands closer to mainland Spain
|
||||
match:
|
||||
name: { in: ["Las Palmas", "Santa Cruz de Tenerife"] }
|
||||
offset: [3, 7]
|
||||
|
||||
# -------------------------------------------------------------------
|
||||
# France — Overseas DROMs (Départements et régions d'outre-mer)
|
||||
# For the full France-with-Overseas composite (incl. PYF, ATF, WLF,
|
||||
# NCL, SPM), see composite_maps.yaml.
|
||||
# -------------------------------------------------------------------
|
||||
FRA:
|
||||
composite_projection: geoConicEqualAreaFrance # if available in renderer
|
||||
repositions:
|
||||
- description: Reposition Guadeloupe near mainland France
|
||||
match: { name: Guadeloupe }
|
||||
offset: [57.4, 25.4]
|
||||
scale: 1.5
|
||||
- description: Reposition Martinique near mainland France
|
||||
match: { name: Martinique }
|
||||
offset: [58.4, 27.1]
|
||||
scale: 1.5
|
||||
- description: Reposition French Guiana (shrunk significantly; it is far larger than the other DROMs)
|
||||
match: { name: "Guyane française" }
|
||||
offset: [52, 37.7]
|
||||
scale: 0.35
|
||||
- description: Reposition La Réunion
|
||||
match: { name: "La Réunion" }
|
||||
offset: [-55, 62.8]
|
||||
scale: 1.5
|
||||
- description: Reposition Mayotte
|
||||
match: { name: Mayotte }
|
||||
offset: [-43, 54.3]
|
||||
scale: 1.5
|
||||
|
||||
# -------------------------------------------------------------------
|
||||
# Netherlands — drop Caribbean territories
|
||||
# The Caribbean Netherlands (Bonaire, Sint Eustatius, Saba) and the
|
||||
# constituent countries (Aruba, Curaçao, Sint Maarten) are far from
|
||||
# mainland NL. The notebook drops them rather than repositioning;
|
||||
# we preserve that editorial choice.
|
||||
# -------------------------------------------------------------------
|
||||
NLD:
|
||||
drop_outside_bbox:
|
||||
description: Drop Caribbean Netherlands & ABC islands; keep mainland + Frisian islands
|
||||
nw: [-20, 60]
|
||||
se: [20, 20]
|
||||
|
||||
# -------------------------------------------------------------------
|
||||
# United Kingdom — drop British Overseas Territories
|
||||
# Same pattern as NLD — drop (don't reposition) territories far from
|
||||
# the British Isles.
|
||||
# -------------------------------------------------------------------
|
||||
GBR:
|
||||
drop_outside_bbox:
|
||||
description: Drop British Overseas Territories; keep British Isles
|
||||
nw: [-10, 60]
|
||||
se: [20, 20]
|
||||
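
A `drop_outside_bbox` rule, with its `nw` and `se` corners given as `[lon, lat]`, might be evaluated as in the sketch below. The config does not say whether the test applies per feature or per sub-polygon, or which points are tested. This example assumes it works per polygon part and keeps a part only when every vertex of its exterior ring lies inside the box. The function name is hypothetical.

```python
def drop_outside_bbox(geometry, nw, se):
    """Keep only the polygon parts that lie fully inside the bbox.

    Assumptions: per-part evaluation, exterior ring only, inclusive bounds.
    """
    west, north = nw
    east, south = se

    if geometry["type"] == "Polygon":
        polygons = [geometry["coordinates"]]
    elif geometry["type"] == "MultiPolygon":
        polygons = geometry["coordinates"]
    else:
        raise ValueError(f"unsupported geometry type: {geometry['type']}")

    def inside(polygon):
        exterior = polygon[0]  # GeoJSON lists the exterior ring first
        return all(
            west <= pt[0] <= east and south <= pt[1] <= north for pt in exterior
        )

    kept = [poly for poly in polygons if inside(poly)]
    return {"type": "MultiPolygon", "coordinates": kept}


# Simplified stand-in shapes: one European part, one Caribbean part.
nld = {
    "type": "MultiPolygon",
    "coordinates": [
        [[[4, 51], [6, 51], [6, 53], [4, 53], [4, 51]]],
        [[[-68.4, 12.0], [-68.2, 12.0], [-68.2, 12.3], [-68.4, 12.0]]],
    ],
}
kept = drop_outside_bbox(nld, nw=[-20, 60], se=[20, 20])
print(len(kept["coordinates"]))  # 1
```

With the NLD box from this file, the Caribbean part falls west of longitude -20 and is dropped while the European part survives.
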
scripts/config/name_overrides.yaml
@@ -0,0 +1,81 @@
|
||||
# Per-feature attribute corrections to Natural Earth data.
|
||||
#
|
||||
# Use when NE has a wrong value for a specific feature: typos, outdated
|
||||
# administrative names, deprecated ISO codes, etc.
|
||||
# For one-off geometry fixes, use procedural/ scripts instead.
|
||||
#
|
||||
# Schema:
|
||||
# overrides:
|
||||
# - description: Human-readable why this override exists (REQUIRED)
|
||||
# match:
|
||||
# adm0_a3: <ISO3 country code> # required: scope to one country
|
||||
# <field>: <value> # one or more match conditions
|
||||
# set:
|
||||
# <field>: <value> # one or more fields to set
|
||||
# [...]
|
||||
#
|
||||
# Match semantics: ALL conditions must match (logical AND). Overrides apply to
|
||||
# both Admin 0 and Admin 1 features unless scope is restricted further.
|
||||
#
|
||||
# Tracking: each override should be revisited periodically against
|
||||
# upstream NE — many of these become obsolete when NE catches up.
|
||||
|
||||
overrides:
|
||||
# -------------------------------------------------------------------
|
||||
# France — typos in NE attribute table (NE 5.x still ships these)
|
||||
# -------------------------------------------------------------------
|
||||
- description: Fix typo "Seien-et-Marne" → "Seine-et-Marne"
|
||||
match: { adm0_a3: FRA, name: "Seien-et-Marne" }
|
||||
set: { name: "Seine-et-Marne" }
|
||||
|
||||
- description: Fix typo "Haute-Rhin" → "Haut-Rhin"
|
||||
match: { adm0_a3: FRA, name: "Haute-Rhin" }
|
||||
set: { name: "Haut-Rhin" }
|
||||
|
||||
# -------------------------------------------------------------------
|
||||
# France — update ISO 3166-2 codes to current values
|
||||
# NE still uses pre-2016 region codes; map them to current standard.
|
||||
# -------------------------------------------------------------------
|
||||
- description: Paris uses ISO 3166-2 code FR-75C as of 2016 (NE has FR-75)
|
||||
match: { adm0_a3: FRA, iso_3166_2: "FR-75" }
|
||||
set: { iso_3166_2: "FR-75C" }
|
||||
|
||||
- description: Guadeloupe is FR-971 in current ISO (NE has FR-GP)
|
||||
match: { adm0_a3: FRA, iso_3166_2: "FR-GP" }
|
||||
set: { iso_3166_2: "FR-971" }
|
||||
|
||||
- description: Martinique is FR-972 in current ISO (NE has FR-MQ)
|
||||
match: { adm0_a3: FRA, iso_3166_2: "FR-MQ" }
|
||||
set: { iso_3166_2: "FR-972" }
|
||||
|
||||
- description: French Guiana is FR-973 in current ISO (NE has FR-GF)
|
||||
match: { adm0_a3: FRA, iso_3166_2: "FR-GF" }
|
||||
set: { iso_3166_2: "FR-973" }
|
||||
|
||||
- description: La Réunion is FR-974 in current ISO (NE has FR-RE)
|
||||
match: { adm0_a3: FRA, iso_3166_2: "FR-RE" }
|
||||
set: { iso_3166_2: "FR-974" }
|
||||
|
||||
- description: Mayotte is FR-976 in current ISO (NE has FR-YT)
|
||||
match: { adm0_a3: FRA, iso_3166_2: "FR-YT" }
|
||||
set: { iso_3166_2: "FR-976" }
|
||||
|
||||
# -------------------------------------------------------------------
|
||||
# Philippines — administrative renames
|
||||
# -------------------------------------------------------------------
|
||||
- description: Region XIII is named "Caraga" (established 1995); NE labels it "Dinagat Islands"
|
||||
match: { adm0_a3: PHL, region: "Dinagat Islands (Region XIII)" }
|
||||
set: { region: "Caraga Administrative Region (Region XIII)" }
|
||||
|
||||
- description: ARMM reorganized as BARMM under the Bangsamoro Organic Law (2018-2019)
|
||||
match: { adm0_a3: PHL, region: "Autonomous Region in Muslim Mindanao (ARMM)" }
|
||||
set: { region: "Bangsamoro Autonomous Region in Muslim Mindanao (BARMM)" }
|
||||
|
||||
# -------------------------------------------------------------------
|
||||
# NOT included here — handled by other mechanisms:
|
||||
# - Vietnam diacritics → use NE's NAME_VI field via name_language=vi
|
||||
# - Crimea/Sevastopol → handled by NE _ukr worldview selection
|
||||
# - China + SARs → see territory_assignments.yaml
|
||||
# - Finland + Åland → see territory_assignments.yaml
|
||||
# - France-with-Overseas → see composite_maps.yaml
|
||||
# -------------------------------------------------------------------
|
||||
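
The match-and-set semantics described in this file (all conditions AND'd, with `{ in: [...] }` for value sets as used in `flying_islands.yaml`) can be sketched in a few lines. The function names are hypothetical, and the sketch assumes the config field names are exactly the property keys on each GeoJSON feature.

```python
def matches(properties, match):
    """True when every condition in `match` holds (logical AND)."""
    for field, expected in match.items():
        actual = properties.get(field)
        if isinstance(expected, dict) and "in" in expected:
            # Value-set condition: { in: [a, b, c] } matches any listed value.
            if actual not in expected["in"]:
                return False
        elif actual != expected:
            return False
    return True


def apply_overrides(features, overrides):
    """Apply each override's `set` to every feature its `match` selects.

    Mutates the feature properties in place and returns the same list.
    """
    for override in overrides:
        for feature in features:
            properties = feature["properties"]
            if matches(properties, override["match"]):
                properties.update(override["set"])
    return features


features = [
    {"type": "Feature", "properties": {"adm0_a3": "FRA", "name": "Seien-et-Marne"}},
    {"type": "Feature", "properties": {"adm0_a3": "FRA", "name": "Oise"}},
]
overrides = [
    {
        "description": 'Fix typo "Seien-et-Marne"',
        "match": {"adm0_a3": "FRA", "name": "Seien-et-Marne"},
        "set": {"name": "Seine-et-Marne"},
    }
]
apply_overrides(features, overrides)
print(features[0]["properties"]["name"])  # Seine-et-Marne
```

Overrides run in file order, so a later entry sees the values written by an earlier one. Whether the real build script relies on that ordering is not stated in the config.
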
scripts/config/regional_aggregations.yaml
@@ -0,0 +1,94 @@
|
||||
# Dissolve Admin 1 features into coarser administrative regions.
|
||||
#
|
||||
# Some countries have a meaningful intermediate level between Admin 0
|
||||
# (country) and Admin 1 (provinces/states/departments). Examples:
|
||||
# - Turkey: NUTS-1 statistical regions (12 regions from 81 provinces)
|
||||
# - France: 18 administrative regions dissolved from 101 departments
|
||||
# - Italy: 20 regions dissolved from 110 provinces
|
||||
# - Philippines: 17 regions dissolved from 118 provinces+cities
|
||||
#
|
||||
# For each defined region set, the build script:
|
||||
# 1. Loads the country's Admin 1 features
|
||||
# 2. Dissolves features by the mapping below
|
||||
# 3. Outputs a new GeoJSON keyed by `<country>_<set_name>`
|
||||
# 4. Plugin exposes it as a third "admin level" option in the UI:
|
||||
# "Admin 0 (countries) / Admin 1 (subdivisions) / Aggregated regions"
|
||||
#
|
||||
# Schema:
|
||||
# countries:
|
||||
# <ISO3>:
|
||||
# region_sets:
|
||||
# <set_name>: # arbitrary identifier
|
||||
# description: human-readable
|
||||
# display_name: text shown in UI dropdown
|
||||
# grouping_field: <field> # field on Admin 1 features used to group
|
||||
# # OR
|
||||
# explicit_mapping: # explicit ISO → region_code dict
|
||||
# <region_code>:
|
||||
# name: <display name>
|
||||
# members: [<iso_3166_2>, ...]
|
||||
|
||||
countries:
|
||||
|
||||
# -------------------------------------------------------------------
|
||||
# Turkey — NUTS-1 statistical regions
|
||||
# Hand-coded mapping of 81 provinces → 12 regions per Eurostat NUTS-1
|
||||
# classification adapted for Türkiye.
|
||||
# -------------------------------------------------------------------
|
||||
TUR:
|
||||
region_sets:
|
||||
nuts_1:
|
||||
description: Eurostat NUTS-1 statistical regions for Türkiye
|
||||
display_name: "Türkiye (NUTS-1 regions)"
|
||||
explicit_mapping:
|
||||
TR1: { name: "İstanbul", members: [TR-34] }
|
||||
TR2: { name: "Batı Marmara", members: [TR-59, TR-22, TR-39, TR-10, TR-17] }
|
||||
TR3: { name: "Ege", members: [TR-35, TR-09, TR-20, TR-48, TR-45, TR-03, TR-43, TR-64] }
|
||||
TR4: { name: "Doğu Marmara", members: [TR-16, TR-26, TR-11, TR-41, TR-54, TR-81, TR-14, TR-77] }
|
||||
TR5: { name: "Batı Anadolu", members: [TR-06, TR-42, TR-70] }
|
||||
TR6: { name: "Akdeniz", members: [TR-07, TR-32, TR-15, TR-01, TR-33, TR-31, TR-46, TR-80] }
|
||||
TR7: { name: "Orta Anadolu", members: [TR-71, TR-68, TR-51, TR-50, TR-40, TR-38, TR-58, TR-66] }
|
||||
TR8: { name: "Batı Karadeniz", members: [TR-67, TR-78, TR-74, TR-37, TR-18, TR-57, TR-55, TR-60, TR-19, TR-05] }
|
||||
TR9: { name: "Doğu Karadeniz", members: [TR-61, TR-52, TR-28, TR-53, TR-08, TR-29] }
|
||||
TRA: { name: "Kuzeydoğu Anadolu", members: [TR-25, TR-24, TR-69, TR-04, TR-36, TR-76, TR-75] }
|
||||
TRB: { name: "Ortadoğu Anadolu", members: [TR-44, TR-23, TR-12, TR-62, TR-65, TR-49, TR-13, TR-30] }
|
||||
TRC: { name: "Güneydoğu Anadolu", members: [TR-27, TR-02, TR-79, TR-63, TR-21, TR-47, TR-72, TR-73, TR-56] }
|
||||
|
||||
# -------------------------------------------------------------------
|
||||
# France — 18 administrative regions (since 2016 reform)
|
||||
# Use NE's `region_cod` field to group departments. After name fixes
|
||||
# in name_overrides.yaml, the codes should align with the 2016 reform.
|
||||
# -------------------------------------------------------------------
|
||||
FRA:
|
||||
region_sets:
|
||||
regions:
|
||||
description: French administrative regions (post-2016 reform)
|
||||
display_name: "France (regions)"
|
||||
grouping_field: region_cod
|
||||
|
||||
# -------------------------------------------------------------------
|
||||
# Italy — 20 regions
|
||||
# -------------------------------------------------------------------
|
||||
ITA:
|
||||
region_sets:
|
||||
regions:
|
||||
description: Italian administrative regions
|
||||
display_name: "Italy (regions)"
|
||||
grouping_field: region_cod
|
||||
|
||||
# -------------------------------------------------------------------
|
||||
# Philippines — 17 regions (after Caraga / BARMM renames)
|
||||
# -------------------------------------------------------------------
|
||||
PHL:
|
||||
region_sets:
|
||||
regions:
|
||||
description: Philippine administrative regions
|
||||
display_name: "Philippines (regions)"
|
||||
grouping_field: region
|
||||
|
||||
# -------------------------------------------------------------------
|
||||
# Future candidates (not yet enabled — verify NE field availability):
|
||||
# - DEU: Bundesländer aggregation if NE provides Kreise as Admin 1
|
||||
# - GBR: NUTS-1 regions (England + Wales + Scotland + NI subdivisions)
|
||||
# - USA: BEA regions, Census divisions
|
||||
# -------------------------------------------------------------------
|
||||
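
Before any geometry is dissolved, an `explicit_mapping` has to be inverted into a member-to-region lookup, and that is a convenient place to catch a province listed twice. The sketch below covers only that bookkeeping; the geometric union itself would need a geometry library and is left out. The function names and the `code_field` parameter are assumptions.

```python
def member_to_region(explicit_mapping):
    """Invert {region_code: {members: [...]}} into {member: region_code}."""
    lookup = {}
    for region_code, region in explicit_mapping.items():
        for member in region["members"]:
            if member in lookup:
                raise ValueError(
                    f"{member} is listed in both {lookup[member]} and {region_code}"
                )
            lookup[member] = region_code
    return lookup


def group_by_region(features, lookup, code_field="iso_3166_2"):
    """Bucket Admin 1 features by region; raises KeyError on unmapped codes."""
    groups = {}
    for feature in features:
        code = feature["properties"][code_field]
        groups.setdefault(lookup[code], []).append(feature)
    return groups


# Two regions copied from the TUR nuts_1 set above.
mapping = {
    "TR1": {"name": "İstanbul", "members": ["TR-34"]},
    "TR5": {"name": "Batı Anadolu", "members": ["TR-06", "TR-42", "TR-70"]},
}
lookup = member_to_region(mapping)
features = [
    {"type": "Feature", "properties": {"iso_3166_2": "TR-06"}},
    {"type": "Feature", "properties": {"iso_3166_2": "TR-34"}},
    {"type": "Feature", "properties": {"iso_3166_2": "TR-42"}},
]
groups = group_by_region(features, lookup)
print({code: len(members) for code, members in groups.items()})
```

Failing loudly on duplicates and unmapped codes matters for hand-coded tables such as the 81-province Turkish mapping, where a single miskeyed code would otherwise produce a silently wrong region.
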
scripts/config/territory_assignments.yaml
@@ -0,0 +1,81 @@
|
||||
# Pull features from sibling Admin 0 records and add them to a country's
|
||||
# Admin 1 view, optionally with a renamed iso_3166_2 code and translated
|
||||
# names.
|
||||
#
|
||||
# Use when NE classifies a territory as a separate Admin 0 record but,
|
||||
# for the purposes of a particular country's Admin 1 chart, it should
|
||||
# appear inside that country. Common cases:
|
||||
# - China + Taiwan/HK/Macau (NE has each as separate Admin 0)
|
||||
# - Finland + Åland (NE has Åland separate; missing from FIN admin 1)
|
||||
#
|
||||
# This is NOT a tool for "moving" disputed territories between countries
|
||||
# — for that, use NE worldview selection (e.g., the _ukr worldview moves
|
||||
# Crimea from RUS to UKR for free, no config needed).
|
||||
#
|
||||
# Schema:
|
||||
# countries:
|
||||
# <ISO3 destination country>:
|
||||
# additions:
|
||||
# - description: human-readable why
|
||||
# from:
|
||||
# adm0_a3: <ISO3 source country>
|
||||
# match:
|
||||
# name_en: <feature name> # or other matchers
|
||||
# set:
|
||||
# iso_3166_2: <new code> # set when added
|
||||
# name: <override display name> # optional
|
||||
# name_<lang>: <translation> # optional, per language
|
||||
# [...]
|
||||
#
|
||||
# Match semantics: same as other configs.
|
||||
|
||||
countries:
|
||||
|
||||
# -------------------------------------------------------------------
|
||||
# China — add Special Administrative Regions
|
||||
# NE keeps Taiwan (TWN), Hong Kong (HKG), and Macau (MAC) as separate
|
||||
# Admin 0 records. For the China subdivision view we re-attach them
|
||||
# using the official ISO 3166-2 codes (CN-71/91/92), with Chinese
|
||||
# names from the official translations.
|
||||
# -------------------------------------------------------------------
|
||||
CHN:
|
||||
additions:
|
||||
- description: Add Taiwan as China subdivision CN-71
|
||||
from:
|
||||
adm0_a3: TWN
|
||||
match: { name_en: Taiwan }
|
||||
set:
|
||||
iso_3166_2: CN-71
|
||||
name_zh: 中国台湾
|
||||
|
||||
- description: Add Hong Kong SAR as CN-91
|
||||
from:
|
||||
adm0_a3: HKG
|
||||
match: { name_en: Hong Kong }
|
||||
set:
|
||||
iso_3166_2: CN-91
|
||||
name_zh: 香港特别行政区
|
||||
|
||||
- description: Add Macau SAR as CN-92
|
||||
from:
|
||||
adm0_a3: MAC
|
||||
match: { name_en: Macau }
|
||||
set:
|
||||
iso_3166_2: CN-92
|
||||
name_zh: 澳门特别行政区
|
||||
|
||||
# -------------------------------------------------------------------
|
||||
# Finland — add Åland
|
||||
# NE has Åland as a separate Admin 0 record (ALA) and it is missing
|
||||
# from the FIN admin1 dataset. Re-attach it as FI-01 with the Finnish
|
||||
# name "Ahvenanmaan maakunta".
|
||||
# -------------------------------------------------------------------
|
||||
FIN:
|
||||
additions:
|
||||
- description: Add Åland as Finland subdivision FI-01
|
||||
from:
|
||||
adm0_a3: ALA
|
||||
match: { name_en: Åland }
|
||||
set:
|
||||
iso_3166_2: FI-01
|
||||
name_fi: Ahvenanmaan maakunta
|
||||
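
The "pull a feature from a sibling Admin 0 record" primitive that this file shares with `composite_maps.yaml` could look like the sketch below. It assumes every feature carries an `adm0_a3` property and that the source country, matcher, and `set` values are passed in explicitly, so it does not depend on how those keys are nested in the YAML. The function name `pull_features` is hypothetical.

```python
import copy


def pull_features(source_features, source_adm0, match, set_values):
    """Copy matching features from one Admin 0 record and apply `set`.

    The source features are left untouched; clones are returned.
    """
    pulled = []
    for feature in source_features:
        properties = feature["properties"]
        if properties.get("adm0_a3") != source_adm0:
            continue
        if all(properties.get(field) == value for field, value in match.items()):
            clone = copy.deepcopy(feature)
            clone["properties"].update(set_values)
            pulled.append(clone)
    return pulled


# Simplified stand-in data: geometry omitted for brevity.
admin0 = [
    {"type": "Feature", "properties": {"adm0_a3": "ALA", "name_en": "Åland"}},
    {"type": "Feature", "properties": {"adm0_a3": "SWE", "name_en": "Sweden"}},
]
finland_admin1 = [
    {"type": "Feature", "properties": {"adm0_a3": "FIN", "iso_3166_2": "FI-18"}},
]
finland_admin1 += pull_features(
    admin0,
    source_adm0="ALA",
    match={"name_en": "Åland"},
    set_values={"iso_3166_2": "FI-01", "name_fi": "Ahvenanmaan maakunta"},
)
print([f["properties"]["iso_3166_2"] for f in finland_admin1])  # ['FI-18', 'FI-01']
```

Cloning before mutation means the same source record can feed several destinations without one assignment leaking into another.
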
scripts/procedural/README.md
@@ -0,0 +1,62 @@
# Procedural escape hatch

Small, named, single-purpose Python scripts for the rare cases where declarative YAML in `../config/` can't cleanly express a fix.

## When to put a script here

Use this directory when **all** of the following are true:

- You've tried to express the fix in YAML and the resulting schema is awkward, ambiguous, or requires a one-off type to be added
- The fix is small (typically <50 lines of code, single conceptual operation)
- The fix is tied to a *specific feature* in the data (not a generalizable transform)

## When NOT to put a script here

If any of the following apply, the fix belongs in `../config/` instead:

- It's a typo, rename, or attribute correction → `name_overrides.yaml`
- It's a reposition or bbox drop of a known territory → `flying_islands.yaml`
- It's adding a feature from another country → `territory_assignments.yaml`
- It's dissolving Admin 1 into a coarser admin level → `regional_aggregations.yaml`
- It's a multi-country composite → `composite_maps.yaml`

If the same kind of operation surfaces here twice, that's a signal to extend a YAML schema rather than ship a third script.

## Script conventions

- **Filename:** `NN_<descriptive_snake_case>.py`. The numeric prefix sets execution order; the name documents intent.
- **Header comment:** required. Must explain *what* the script does AND *why* this couldn't be expressed in YAML. If the "why" is weak, push it back into YAML.
- **Interface:** each script defines `def apply(geo: dict) -> dict` taking a parsed GeoJSON FeatureCollection and returning the modified one. The build orchestrator handles I/O.
- **No side effects** other than the returned data: no network calls, no file writes, and no `print` output other than log messages written to `sys.stderr`.
- **Pure function over GeoJSON.** Don't import shapely/geopandas unless the operation truly needs polygon math; many fixes are just attribute mutations.

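
The filename and interface conventions above imply a small amount of orchestration: order the scripts by numeric prefix, then thread the FeatureCollection through each `apply`. The orchestrator is not part of this commit, so the following is only a sketch. The names `ordered_scripts` and `run_pipeline`, the filename regex, and the example filenames are all assumptions.

```python
import re

# NN_<descriptive_snake_case>.py, where NN is a numeric prefix.
SCRIPT_NAME = re.compile(r"^(\d+)_[a-z0-9_]+\.py$")


def ordered_scripts(filenames):
    """Return procedural script names sorted by their numeric prefix."""
    numbered = []
    for name in filenames:
        found = SCRIPT_NAME.match(name)
        if found:  # README.md and other files are ignored
            numbered.append((int(found.group(1)), name))
    return [name for _, name in sorted(numbered)]


def run_pipeline(geo, apply_functions):
    """Feed the FeatureCollection through each script's `apply` in order."""
    for apply in apply_functions:
        geo = apply(geo)
        if not isinstance(geo, dict) or geo.get("type") != "FeatureCollection":
            raise TypeError(f"{apply.__name__} must return a FeatureCollection dict")
    return geo


def tag_reviewed(geo):
    """Stand-in for a script's apply(); marks every feature as reviewed."""
    for feature in geo["features"]:
        feature["properties"]["reviewed"] = True
    return geo


names = ["10_merge_islets.py", "README.md", "02_fix_ring.py"]
print(ordered_scripts(names))  # ['02_fix_ring.py', '10_merge_islets.py']

collection = {"type": "FeatureCollection", "features": [{"properties": {}}]}
result = run_pipeline(collection, [tag_reviewed])
print(result["features"][0]["properties"])  # {'reviewed': True}
```

Sorting on the parsed integer rather than the raw string keeps `10_...` after `02_...` even if someone forgets to zero-pad a prefix.
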

## Skeleton

```python
"""
NN_descriptive_name.py
======================

WHAT: One-sentence summary of what this script does to the data.

WHY: One-paragraph explanation of why this couldn't be expressed in
    ../config/<some_yaml>.yaml. If you find yourself writing
    "because I didn't want to add a field to the schema", push the
    fix into the YAML schema instead.

UPSTREAM TRACKING: link to NE issue / community discussion / blog post
    explaining the underlying source of the problem, so future
    maintainers can re-evaluate when upstream catches up.
"""

import sys


def apply(geo: dict) -> dict:
    # ... mutate features ...
    return geo
```

## Currently empty

There are no procedural scripts yet. The audit suggested the France-with-Overseas Windward Islands sub-polygon drop *might* warrant one, but `composite_maps.yaml` already has a `drop_parts` field that covers it. We'll add scripts here only if/when a genuine edge case proves YAML can't express it.