Status: pass
Metadata registry only; no source is approved for model training yet.
Sources: 22 total; 8 priority-one; 0 model-ready.
| Priority | Source | Category | Kind | Local status | First action | Leakage rule |
|---|---|---|---|---|---|---|
| 1 | Local Government Directory | admin_code_registry | dated_snapshot | not_downloaded | Create dated LGD extract metadata and compare code coverage across boundary sources. | Metadata/backbone only; not a target or benchmark signal. |
| 1 | GeoParquet | artifact_format | stable_format | not_required_yet | Standardize future H3 feature tables and coverage probes. | Storage format only; no modeling signal. |
| 1 | Google Open Buildings | building_footprints | dated_snapshot | referenced_by_prior_art | Cross-check with ramSeraph and Microsoft footprints for disagreement cells. | Physical-structure proxy only; never household truth by itself. |
| 1 | Microsoft Global ML Building Footprints | building_footprints | dated_snapshot | referenced_by_prior_art | Compare disagreement cells against Google/ramSeraph building sources. | Physical-structure proxy only; combine with population and land-use evidence. |
| 1 | ESA WorldCover | landcover_masks | fixed_release | not_downloaded | Fill the current missing WorldCover raster gap with source metadata and AOI probes. | Land cover can suppress impossible cells; it is not income or household truth. |
| 1 | Official Census India PCA and houselisting | official_denominator | fixed_official_release | not_downloaded | Prefer official PCA/houselisting extracts over scraped mirrors for denominator QA. | Independent source; age must be corrected with non-GeoIQ calibration or internal outcomes. |
| 1 | GHSL / Global Human Settlement Layer | population_builtup | fixed_release | not_downloaded | Register release metadata, then compare against WorldPop/Census/building denominators. | May be used as an independent feature; never selected by GeoIQ benchmark fit. |
| 1 | DuckDB Spatial | processing_tool | stable_tool | not_required_yet | Use for registry/probe scripts when dependencies are available. | Engineering tool only; no demand signal. |
| 2 | PMTiles | artifact_format | stable_format | not_required_yet | Use after AOI layers exist and need visual review. | Visualization format only; no modeling signal. |
| 2 | Google Open Buildings 2.5D Temporal | building_footprints | dated_snapshot | not_downloaded | Probe only after static footprint coverage QA is in place. | Use time slices fixed before evaluation; no post-outcome leakage in time holdouts. |
| 2 | Dynamic World V1 | landcover_masks | dated_image_window | not_downloaded | Use only fixed AOI/date windows after WorldCover probes are stable. | Dated independent raster only; do not tune date windows against GeoIQ TAM. |
| 2 | Planetary Computer STAC | raster_access | catalog_access_pattern | not_downloaded | Use for raster provenance manifests, not as a feature by itself. | Catalog metadata only; imagery dates must respect time holdouts. |
| 2 | OpenStreetMap via OSMnx/SRAI | roads_pois_places | dated_extract | partially_supported_by_srai_prior_art | Replace ad hoc live pulls with versioned extracts plus source completeness flags. | Sparse OSM coverage may reflect mapping bias rather than low opportunity. |
| 2 | Overture Maps | roads_pois_places | monthly_release | not_downloaded | Use named releases for H3 POI/road counts and OSM coverage comparison. | Independent features only; release must predate any time-holdout outcome window. |
| 2 | ohsome API | source_coverage_qa | dated_query_output | not_downloaded | Add completeness features before interpreting OSM sparse cells. | Coverage QA only; should explain confidence, not become demand truth. |
| 3 | M-Lab | connectivity_proxy | dated_aggregate | not_downloaded | Use city-level aggregates only if Ookla/internal comparison shows value. | Measurement availability is biased; treat as diagnostic until validated. |
| 3 | Ookla Open Data | connectivity_proxy | quarterly_snapshot | not_downloaded | Probe on selected AOIs before adding any national feature. | External proxy only; not a substitute for internal serviceability or capacity. |
| 3 | OpenCelliD | connectivity_proxy | dated_snapshot | not_downloaded | Review license/account constraints, then run only AOI coverage probes. | Infrastructure presence does not imply serviceability for this product. |
| 4 | Clay Foundation Model | foundation_model_reference | model_reference | not_downloaded | Defer until transparent satellite features are stable. | No benchmark-driven feature selection against GeoIQ. |
| 4 | IBM/NASA Prithvi | foundation_model_reference | model_reference | not_downloaded | Defer until transparent raster/building features leave clear residual gaps. | No fine-tuning or model selection on GeoIQ labels. |
| 4 | AllenAI SatlasPretrain | foundation_model_reference | model_reference | not_downloaded | Use only as a benchmark against TorchGeo/transparent raster features. | No direct or indirect GeoIQ-label tuning. |
| 4 | TerraTorch | foundation_model_tooling | stable_tool | not_required_yet | Defer until the project has stable raster labels and compute budget. | Advanced tooling only; no GeoIQ-tuned remote-sensing model. |
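
The summary line at the top can be recomputed from the table rows rather than maintained by hand. The following is a minimal sketch, not the project's actual tooling: the `Source` class, its field names, and the `model_ready` flag are hypothetical, introduced only for illustration. The registry itself does not say how model readiness is recorded, only that no source currently has it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """One registry row (hypothetical shape, reduced to the fields used here)."""
    name: str
    priority: int
    local_status: str
    model_ready: bool = False  # assumed flag; no source is approved yet


def summarize(sources):
    """Build the 'Sources: ...' summary line from registry rows."""
    total = len(sources)
    priority_one = sum(1 for s in sources if s.priority == 1)
    model_ready = sum(1 for s in sources if s.model_ready)
    return (
        f"Sources: {total} total; "
        f"{priority_one} priority-one; "
        f"{model_ready} model-ready."
    )


# Three rows copied from the table above; the full registry has 22.
sample = [
    Source("Local Government Directory", 1, "not_downloaded"),
    Source("PMTiles", 2, "not_required_yet"),
    Source("M-Lab", 3, "not_downloaded"),
]

summary = summarize(sample)
print(summary)
```

Run over all 22 rows, a check like this should reproduce the counts reported at the top (22 total, 8 priority-one, 0 model-ready), which makes a mismatch between the summary and the table easy to catch.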