A reader for the operator who has to decide what belongs in the model.
A standalone research map for the local prior-art corpus. It separates population, welfare, POI, road, land-mask, graph, and business-reality layers so GeoIQ-style vendor TAM remains a benchmark label rather than a production input.
The practical stack is not one model. It is a controlled sequence of independent denominator, welfare, activity, serviceability, graph context, and internal outcome layers.
Name the layers before choosing tools.
A repo is useful only after it is tied to a specific layer and validation gate.
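This report keeps the repository rule intact: do not present optimistic metrics without leakage checks. The GeoHG-style numbers worth reporting come from spatial-block and city holdouts; random cross-validation scores are kept only as a diagnostic warning, because neighboring cells landing on both sides of a random split inflate them.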
Separate diagnostic metrics from defensible methodology claims.
Any prior-art adoption must preserve the no-vendor-TAM-feature policy.
Vendor TAM is a benchmark label for evaluation and distillation research. It is not allowed as a production score feature, neighbor aggregate, target-derived rank, or tuning signal.
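The policy above can be partly enforced with a name-based guard that runs before any production feature table is built. The following is a minimal sketch, not the repository's actual check: the pattern list and both function names are hypothetical, and a name check only catches features that are labeled honestly, so it complements rather than replaces a lineage review.

```python
import re

# Hypothetical name fragments that would suggest vendor TAM, a neighbor
# aggregate of it, or a target-derived rank has entered the feature list.
FORBIDDEN_PATTERNS = [
    r"vendor[_\-]?tam",
    r"geoiq",
    r"tam[_\-]?(rank|neighbor|neighbour|agg|mean|score)",
]


def find_vendor_tam_leaks(feature_names):
    """Return the feature names that match any forbidden pattern."""
    compiled = [re.compile(p, re.IGNORECASE) for p in FORBIDDEN_PATTERNS]
    return [
        name
        for name in feature_names
        if any(rx.search(name) for rx in compiled)
    ]


def assert_no_vendor_tam(feature_names):
    """Raise if any production feature name looks vendor-TAM derived."""
    leaks = find_vendor_tam_leaks(feature_names)
    if leaks:
        raise ValueError(f"vendor TAM leakage in production features: {leaks}")
```

Calling `assert_no_vendor_tam` on the column list of a production feature table fails fast, while evaluation code remains free to join the vendor label separately.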
The existing scatter plot is linked below when the HTML is opened from the outputs directory.
Each family below names the prior-art repos that matter, the signal it contributes, and the validation target that must pass before it becomes more than research context.
Convert repo names into auditable workstreams.
Every later repo section should point back to at least one family here.
| Feature family | Repos | Signals | Why it matters | Validation target |
|---|---|---|---|---|
| population household denominator | prs-eth/Popcorn, yashveeeeeeer/india-geodata, devdatalab/shrug-public, pigshell/india-census-2011 | WorldPop population, census/SHRUG reconciliation, built-up occupancy, household density | TAM cannot be right if the denominator is wrong; this is the first reality anchor. | district/city household and population rollups, then internal serviceable households |
| buildings and settlement structure | ramSeraph/indian_buildings, yashveeeeeeer/india-geodata, prs-eth/Popcorn | building_count, built_area_share, building_density, occupancy_proxy, settlement_compactness | Separates dense residential demand from empty land, industrial sheds, and underbuilt cells. | audited buildings/households and installable address density |
| satellite welfare affluence | AIandGlobalDevelopmentLab/EOML-for-India, amangupt01/Village_Development_Model, mani-shailesh/satimage, torchgeo/torchgeo | roof/material proxy, lighting proxy, drinking-water proxy, Landsat/Sentinel embeddings, village development score | Maps visible welfare and housing quality into the target-income/affordability band. | Census/SECC/NFHS/SHRUG welfare indicators and internal ARPU/plan affordability |
| nightlights and economic activity | amangupt01/Village_Development_Model, yashveeeeeeer/india-geodata, devdatalab/shrug-public | VIIRS mean, VIIRS trend, nightlight blob score, commercial activity proxy | Captures economic intensity and electrification that pure population misses. | income/development proxies, leads, installs, ARPU, and retained revenue |
| roads accessibility serviceability | kraina-ai/srai, Calychas/highway2vec, yashveeeeeeer/india-geodata, ramSeraph/indian_land_features | road_length_by_class, distance_to_major_road, road_embedding, travel_friction, non-serviceable terrain | Converts theoretical demand into reachable/serviceable demand. | serviceability pass/fail, failed installs, partner travel time, network coverage |
| poi urban function | kraina-ai/srai, kraina-ai/hex2vec, PaddlePaddle/PaddleSpatial, yashveeeeeeer/india-geodata | POI counts by category, Hex2Vec/ContextualCount embeddings, schools/healthcare/markets, urban function vector | Distinguishes residential <10 LPA (lakh rupees per annum) opportunity cells from purely commercial, industrial, or sparse cells. | lead density, conversion, install density, and local ops knowledge |
| land use exclusions and risk | ramSeraph/indian_land_features, yashveeeeeeer/india-geodata | water_share, forest_share, mining_share, industrial_land_share, flood/coastal risk, slope/elevation | Prevents false TAM where households cannot or should not exist. | manual QA, failed installs, low-address-density cells, and land/water masks |
| heterogeneous graph and spatial context | CityMind-Lab/GeoHG, kraina-ai/srai, seai-lab/TorchSpatial, wherobots/GeoTorchAI | neighbor context, semantic similarity, land-cover hypernodes, POI hypernodes, location encoders | Replaces leaky neighbor-TAM interpolation with independent neighbor/context features. | spatial-block holdout against vendor TAM and real business outcomes |
| internal business reality | firm internal data, not GitHub | leads, installs, retained installs, gross margin, CAC, failed installs, serviceability checks, capacity | This is the actual business reality layer; without it, confidence is only proxy confidence. | time-based and spatial-block holdout outcomes |
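The first validation target in the table, district/city household rollups, can be illustrated with a small check that sums cell-level estimates per district and compares them with a census total. This is a sketch under stated assumptions: the function name, the input shapes, and the 10% tolerance are hypothetical and would need to be set from the actual reconciliation policy.

```python
from collections import defaultdict


def rollup_gaps(cells, census_households, tolerance=0.10):
    """Compare summed cell estimates with census household totals.

    cells: iterable of (district_code, households_est) pairs.
    census_households: dict of district_code -> census household count.
    Returns dict of district_code -> (estimated_total, relative_error, passes).
    """
    totals = defaultdict(float)
    for district, households_est in cells:
        totals[district] += households_est

    report = {}
    for district, census_total in census_households.items():
        if census_total <= 0:
            raise ValueError(f"non-positive census total for {district}")
        estimated = totals.get(district, 0.0)
        relative_error = (estimated - census_total) / census_total
        report[district] = (
            estimated,
            relative_error,
            abs(relative_error) <= tolerance,
        )
    return report
```

A district that fails this gate points at a denominator problem that no downstream welfare or graph layer can repair.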
Each repo chapter below answers what the prior art does, what it consumes, what it emits, how it maps to TAM, and what must be validated before use.
Turns geographic raster/vector features into a heterogeneous graph with area nodes, semantic entity hypernodes, POI hypernodes, and spatial adjacency.
Decide whether this repo supplies a production input, a challenger, or context only.
Promote only the outputs that survive source versioning, leakage checks, and outcome validation.
Estimates high-resolution population maps from Sentinel-1 and Sentinel-2 imagery with a small number of coarse census counts.
Downloads and processes OSM, Overture, GTFS, vector data, microregions, and embeddings.
Learns context-aware embeddings for H3 hexagons from OpenStreetMap tags.
Represents OSM microregions through road network characteristics.
Provides PyTorch-native geospatial datasets, samplers, transforms, trainers, and pretrained models.
Combines PyTorch models with Apache Sedona preprocessing for raster imagery and spatiotemporal non-imagery data.
Implements a framework and benchmark suite for spatial representation learning.
Provides spatial-temporal data mining functions for transfer learning, time-series prediction, and region profiling.
Local path: prior art/PaddlePaddle_PaddleSpatial
Upstream: PaddlePaddle/PaddleSpatial
Reproduces research on measuring Indian living standards and health indicators from satellite images.
Local path: prior art/AIandGlobalDevelopmentLab_EOML-for-India
Upstream: AIandGlobalDevelopmentLab/EOML-for-India
Downloads Indian state imagery through Google Earth Engine and cuts it into fixed village image crops.
Local path: prior art/amangupt01_Village_Development_Model
Upstream: amangupt01/Village_Development_Model
Predicts developmental parameters from satellite images using a multi-task fully convolutional model.
Contains notebooks and thesis material for satellite-based socioeconomic indicator estimation.
Local path: prior art/ArmaanBhullar_Socio-Economic-Indicators-using-Satellite-Imagery
Upstream: ArmaanBhullar/Socio-Economic-Indicators-using-Satellite-Imagery
Tracks releases and issues for SHRUG, the India rural-urban socioeconomic geography platform.
Aggregates India administrative, census, environment, water, infrastructure, buildings, healthcare, education, urban, postal, police, and remote-sensing layers.
Local path: prior art/yashveeeeeeer_india-geodata
Upstream: yashveeeeeeer/india-geodata
Publishes Indian administrative boundary releases for states, districts, subdistricts, blocks, panchayats, villages, habitations, urban areas, forests, postal, police, constituencies, and historical districts.
Local path: prior art/ramSeraph_indian_admin_boundaries
Upstream: ramSeraph/indian_admin_boundaries
Collects Indian building footprint releases from urban sources, Google Open Buildings 2023, and Microsoft Buildings.
Local path: prior art/ramSeraph_indian_buildings
Upstream: ramSeraph/indian_buildings
Publishes Indian land-use, urban land-use, soil health, geomorphology, geology, forests, mining, groundwater prospects, elevation, and land degradation layers.
Local path: prior art/ramSeraph_indian_land_features
Upstream: ramSeraph/indian_land_features
Serves India vector/raster tiles from PMTiles and cloud-optimized geospatial files.
Collects municipal spatial data scraped from city websites and other sources.
Local path: prior art/datameet_Municipal_Spatial_Data
Upstream: datameet/Municipal_Spatial_Data
Collects community-contributed Indian village boundaries for selected states.
Local path: prior art/datameet_indian_village_boundaries
Upstream: datameet/indian_village_boundaries
Contains scraped Census 2011 PCA and houselisting-style tables.
Local path: prior art/pigshell_india-census-2011
Upstream: pigshell/india-census-2011
Provides GeoJSON pincode boundary extents for several major Indian cities.
Provides a workflow for seeding, editing, and downloading India pincode boundaries as GeoJSON.
Contains India, state, city, and other shapefile/GeoJSON/KML resources.
Local path: prior art/HindustanTimesLabs_shapefiles
Upstream: HindustanTimesLabs/shapefiles
Provides simple GeoJSON boundaries for Indian states, districts, and taluks.
Attempts to convert government ward/boundary PDFs and images into clean GeoJSON/KML data.
These stable upstream sources and tooling references are not automatically model-ready; each needs version, extract date, license, CRS, coverage, and leakage-safe validation before promotion.
Separate source discovery from production readiness.
Treat every source as a candidate until custody and coverage checks pass.
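The custody checks named above (version, extract date, license, CRS, coverage, and leakage-safe validation) can be tracked per source in a small manifest record. The sketch below is illustrative only: the `SourceManifest` class, its field names, and the two status strings are assumptions, not an existing schema in this project.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourceManifest:
    """Hypothetical custody record for one upstream source."""

    name: str
    version: Optional[str] = None
    extract_date: Optional[str] = None  # ISO date string, e.g. "2024-01-31"
    license: Optional[str] = None
    crs: Optional[str] = None  # e.g. "EPSG:4326"
    coverage_checked: bool = False
    leakage_validated: bool = False

    def missing_checks(self):
        """List every custody item that is still unresolved."""
        missing = [
            field_name
            for field_name in ("version", "extract_date", "license", "crs")
            if not getattr(self, field_name)
        ]
        if not self.coverage_checked:
            missing.append("coverage")
        if not self.leakage_validated:
            missing.append("leakage_safe_validation")
        return missing

    @property
    def status(self):
        """A source stays a candidate until nothing is missing."""
        return "promotable" if not self.missing_checks() else "candidate"
```

Keeping the default status at `candidate` mirrors the rule that discovery alone never implies production readiness.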
Independent population, built-up, and settlement benchmark.
Official population, household, housing, and rural/urban reconciliation.
Admin-code crosswalks across Census, SHRUG, boundaries, and internal systems.
Built-up, water, vegetation, crop, and non-residential masks.
Time-windowed land-cover probabilities for built-up and exclusion masks.
Building count, built area, settlement morphology, and residential-density evidence.
Independent building-footprint source for denominator redistribution and QA.
Versioned places, buildings, roads, addresses, and base features.
Roads, POIs, network topology, and accessibility features.
OSM history and completeness statistics so sparse mapping is not mistaken for sparse demand.
Quarterly broadband/mobile performance tiles as weak connectivity context.
Broadband measurement aggregates for independent connectivity residual analysis.
Cell tower locations as optional mobile infrastructure proxy.
Reproducible raster catalog item IDs and query manifests.
Fast local spatial SQL for feature tables, probes, and crosswalks.
Columnar spatial artifacts with CRS and metadata.
Compact distribution of QA map layers without a tile server.
Foundation model embeddings for land cover and environmental context.
General satellite embeddings for welfare or settlement morphology residuals.
Remote-sensing pretrained models for buildings, roads, land use, and visual priors.
Training and evaluation framework for geospatial foundation models.
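One item in the list above, OSM completeness, exists so that sparse mapping is not mistaken for sparse demand. A minimal sketch of that guardrail follows. It assumes completeness is approximated as the ratio of OSM-mapped buildings to an independent footprint count, which is only one possible definition; the function name, labels, and 0.5 threshold are hypothetical.

```python
def classify_sparse_cell(poi_count, osm_buildings, reference_buildings,
                         min_completeness=0.5):
    """Label a cell so that a low POI count is read correctly.

    reference_buildings is assumed to come from an independent
    building-footprint source, not from OSM itself.
    """
    if reference_buildings <= 0:
        # Nothing to compare against, so completeness is unknown.
        return "no_reference"
    completeness = osm_buildings / reference_buildings
    if completeness < min_completeness:
        # OSM is under-mapped here: treat POI features as missing, not zero.
        return "map_gap"
    return "sparse_demand" if poi_count == 0 else "mapped"
```

Cells labeled `map_gap` would have their OSM-derived features set to missing rather than zero, so a model cannot learn that unmapped areas lack demand.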
Use this as the implementation order. It keeps the business formula auditable and avoids jumping straight from prior art to a black-box TAM score.
Turn prior-art reading into a build sequence.
Start with denominator and custody before adding embeddings or graph context.
| Step | Layer | Prior art | Score field | Gate |
|---|---|---|---|---|
| 1 | Denominator | WorldPop, Census, SHRUG, POPCORN, buildings | households_est | district/city rollups |
| 2 | Residential confidence | buildings, land use, WorldCover, land masks | residential_confidence | manual QA plus failed-install checks |
| 3 | Income/affluence proxy | EOML, Village Development, satimage, nightlights | income_band_prob | Census/SHRUG/NFHS/internal ARPU |
| 4 | Serviceability | roads, terrain, partner/network, failed installs | serviceable_prob | ops serviceability outcomes |
| 5 | Acquirability | leads, installs, conversion, competition, channel coverage | acquirable_prob | time and spatial holdout |
| 6 | Graph context | GeoHG, SRAI, Hex2Vec, Highway2Vec | context features | spatial-block and city holdout |
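The score fields in steps 1 to 5 can be chained so that every intermediate quantity stays visible. The report does not state the business formula explicitly, so the sketch below assumes a simple multiplicative funnel in which each probability is treated as conditionally independent of the others; that assumption, the `CellScores` class, and the stage names are illustrative. Step 6 is left out because the table lists it as context features rather than a multiplier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellScores:
    """Score fields from steps 1 to 5 for a single cell."""

    households_est: float
    residential_confidence: float
    income_band_prob: float
    serviceable_prob: float
    acquirable_prob: float


def funnel(cell):
    """Return every funnel stage so each multiplication can be audited."""
    for name in ("residential_confidence", "income_band_prob",
                 "serviceable_prob", "acquirable_prob"):
        value = getattr(cell, name)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")

    residential = cell.households_est * cell.residential_confidence
    target_income = residential * cell.income_band_prob
    serviceable = target_income * cell.serviceable_prob
    acquirable = serviceable * cell.acquirable_prob
    return {
        "households_est": cell.households_est,
        "residential_households": residential,
        "target_income_households": target_income,
        "serviceable_households": serviceable,
        "acquirable_households": acquirable,
    }
```

Returning all stages rather than one final number lets each gate in the table be checked against its own validation target.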
This dense visual appendix maps each prior-art item to feature families and repeats the gates that keep the work audit-safe.
Make the long tail scannable without flattening it into prose.
Read the tiles as prompts for validation, not as readiness scores.
The floating lingo panel is convenient on screen; this chapter keeps the same vocabulary available when the report is printed or archived.
Keep the report's repeated terms unambiguous.
When a model artifact uses one of these terms, it should use the same meaning.
| Term | Plain-English meaning |
|---|---|
| Vendor TAM | A benchmark or evaluation label only; never a production feature. |
| Spatial holdout | A validation split that withholds geographic areas, not random rows. |
| City holdout | A validation split that tests whether the model survives unseen cities. |
| Denominator | The population, household, or building base before ability-to-buy adjustments. |
| Residential confidence | Evidence that a cell contains usable residential demand rather than empty built-up area. |
| Serviceability | Whether the business can feasibly reach or install in the cell. |
| Acquirability | Whether demand can be won through channels, price, and competition. |
| OSM completeness | A guardrail that separates missing map data from missing demand. |
| GeoHG | Prior art that builds a heterogeneous graph over area, semantic, and POI context. |
| Distillation | A research step that approximates benchmark labels without allowing those labels into production features. |
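The spatial holdout and city holdout terms above both reduce to splitting by group instead of by row. The sketch below shows one way to do that with a plain latitude/longitude grid. The 0.1-degree block size, the function names, and the row layout are assumptions; a real pipeline might block on H3 cells or administrative units instead.

```python
import math


def spatial_block_id(lat, lon, block_deg=0.1):
    """Snap a point to a coarse grid block; rows in one block stay together."""
    return (math.floor(lat / block_deg), math.floor(lon / block_deg))


def split_by_group(rows, group_of, holdout_groups):
    """Send every row whose group is held out to test, the rest to train."""
    train, test = [], []
    for row in rows:
        if group_of(row) in holdout_groups:
            test.append(row)
        else:
            train.append(row)
    return train, test


def city_holdout(rows, holdout_cities):
    """City holdout: withhold whole cities."""
    return split_by_group(rows, lambda r: r["city"], set(holdout_cities))


def spatial_block_holdout(rows, holdout_blocks, block_deg=0.1):
    """Spatial holdout: withhold whole grid blocks."""
    return split_by_group(
        rows,
        lambda r: spatial_block_id(r["lat"], r["lon"], block_deg),
        set(holdout_blocks),
    )
```

Because assignment is by block or city, neighboring cells cannot land on both sides of the split, which is the failure mode that makes random row splits look better than they are.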