⚠ Model-based forecast — NOT an official warning. Validated F1 = 0.878, 53h lead time (Sentinel-1 SAR, 2 seasons). Read limitations

Risk Level

L1 Safe (P < 0.10)
L2 Low (0.10–0.40)
L3 Moderate (0.40–0.85)
L4 High (≥ 0.85)

Dashboard

Cells total
L3+ cells
Peak P (district)
Peak time

Sentinel-1 Validation — 2024 + 2025 seasons

Precision: 88.7%
Recall: 86.9%
F1 score: 0.878
Mean lead time: 53 h

Event-level: the model warned (mean_prob ≥ 0.10) in the 72 h before a SAR-confirmed flood. 99 flood events across 115 Sentinel-1 overpasses · 2025 out-of-sample F1 = 0.885.

Communes — peak risk

Commune | Cells | Peak level | Peak P | % area L3+

Window breakdown

Window | L4 High cells | L3 Moderate cells

Run metadata

Loading…

Methodology

1. Overview

This engine produces a 72-hour probabilistic flood forecast on a 250 m grid (1,956 cells) covering Dangkor District, Phnom Penh. It runs every 6 hours on GitHub Actions using only free/open data. Predictions are physics-based, with no machine learning in the current release (v0.2), and combine a meteorological driver, a hydrological response model with Monte Carlo uncertainty, and regional amplifier signals.

Grid: 250 m · Cells: 1,956 · Communes: 13 · Horizon: 72 h · Timestep: 6 h · Ensemble: N = 30 · Cadence: every 6 h

2. Four-Layer Architecture

  1. Layer 1 — Meteorological Driver. Multi-model rainfall forecasts (ECMWF IFS, GFS, ICON) pulled from Open-Meteo for 9 sample points on a 3×3 grid across the district, then interpolated to the 250 m grid via Inverse-Distance Weighting (IDW, power = 2).
  2. Layer 2 — Hydrological Response. SCS Curve Number runoff per cell, accumulated along a static flow-routing surface derived from Copernicus DEM (GLO-30) + pysheds D8. A Monte Carlo ensemble of N = 30 scenarios perturbs rainfall, curve number, and drainage capacity to produce a flood probability.
  3. Layer 3 — Regional Signals. Three external signals provide confidence context: (1) Forecast verification — compares yesterday's Open-Meteo forecast against the observed archive to score model accuracy; (2) JAXA GSMaP NRT — independent satellite rainfall from the joint NASA/JAXA GPM constellation, 0.1° hourly with ~4h latency; (3) GloFAS — Copernicus river discharge forecast (stub, pending integration). JAXA and forecast-verification are displayed as confidence indicators. GloFAS, when active, boosts the regional amplifier to 1.3.
  4. Layer 4 — Fusion & Classification. Physics probability is multiplied by a regional amplifier, clipped to [0, 1], then binned into four risk levels.

3. Data Sources

Source | Layer | Use
Open-Meteo (ECMWF IFS, GFS, ICON) | 1 | 72 h rainfall forecast, multi-model ensemble
Open-Meteo Archive | 1, 3 | 7-day antecedent rainfall for AMC scaling (Layer 1); forecast verification signal, yesterday's forecast vs. observed archive (Layer 3)
Copernicus DEM GLO-30 (AWS S3) | 2 | Elevation, flow direction, flow accumulation, HAND
ESA WorldCover 2021 (COG /vsicurl) | 2 | Land-cover class → Curve Number
GADM level-2 & level-3 | Static | Dangkor boundary + 13 commune polygons
Open-Meteo Archive | 3 | Forecast verification signal: compares yesterday's forecast to observed archive (not raw satellite)
JAXA GSMaP NRT | 3 | Independent satellite rainfall confirmation: joint NASA/JAXA GPM constellation, 0.1° hourly, ~4 h latency. Live when JAXA G-Portal credentials are configured.
GloFAS (Copernicus CDS) | 3 | River discharge forecast: stub (pending Copernicus CDS integration)

4. Layer 1 — Rainfall (IDW Interpolation)

Sample-point forecasts are interpolated to grid cell i using:

$$P_i = \frac{\sum_k w_{ik}\, P_k}{\sum_k w_{ik}}, \qquad w_{ik} = \frac{1}{d_{ik}^{2}}$$

where $d_{ik}$ is the Euclidean distance from cell $i$ to sample point $k$ in UTM zone 48N (EPSG:32648). Hourly values are aggregated into 6-hour timesteps by summation.
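
The interpolation and the 6-hour aggregation can be sketched in plain Python as follows. This is a minimal illustration, not the engine's code: the function names are hypothetical, coordinates are assumed to be projected metres, and a cell that coincides with a sample point is assumed to take that point's value directly.

```python
import math


def idw_interpolate(cell_xy, sample_xy, sample_values, power=2.0):
    """Inverse-distance-weighted estimate at one cell.

    cell_xy       -- (x, y) of the cell centre in projected metres
    sample_xy     -- list of (x, y) sample-point coordinates
    sample_values -- rainfall value at each sample point
    """
    num = 0.0
    den = 0.0
    for (sx, sy), value in zip(sample_xy, sample_values):
        d = math.hypot(cell_xy[0] - sx, cell_xy[1] - sy)
        if d == 0.0:
            # Assumption: a cell sitting on a sample point takes its value.
            return value
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def to_six_hour_totals(hourly_mm):
    """Sum hourly rainfall (mm) into consecutive 6-hour totals."""
    return [sum(hourly_mm[i:i + 6]) for i in range(0, len(hourly_mm), 6)]
```

A cell midway between two sample points reporting 10 mm and 20 mm receives equal weights and therefore 15 mm.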

5. Layer 2 — SCS Curve Number Runoff

The NRCS (SCS) Curve Number method (TR-55, 1986) converts rainfall P (mm) to direct runoff Q (mm) for each cell and timestep:

$$S = \frac{25400}{CN} - 254 \quad \text{(mm)}$$
$$I_a = 0.2\, S$$
$$Q = \begin{cases} \dfrac{(P - I_a)^2}{P - I_a + S} & \text{if } P > I_a \\ 0 & \text{otherwise} \end{cases}$$

Curve Number is assigned per cell from ESA WorldCover × hydrologic soil group (default group D — clay floodplain for Dangkor).

AMC (Antecedent Moisture Condition) scales CN based on 7-day prior rainfall:

  • Dry (< 12.7 mm): factor 0.85
  • Normal (12.7–38.1 mm): factor 1.00
  • Wet (> 38.1 mm): factor 1.15
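
A minimal sketch of the runoff calculation with AMC scaling is shown here. The function names are hypothetical, and capping the scaled Curve Number at 100 is an assumption added so that S never becomes negative for high-CN cells in wet conditions; the source does not say how the engine handles that case.

```python
def amc_factor(antecedent_7day_mm):
    """Curve Number scaling factor from 7-day antecedent rainfall (mm)."""
    if antecedent_7day_mm < 12.7:
        return 0.85  # dry
    if antecedent_7day_mm > 38.1:
        return 1.15  # wet
    return 1.00      # normal


def scs_runoff_mm(p_mm, cn, antecedent_7day_mm):
    """Direct runoff Q (mm) from rainfall P (mm) via the SCS Curve Number method."""
    # Assumption: cap the scaled CN at 100 to keep S non-negative.
    cn_adj = min(cn * amc_factor(antecedent_7day_mm), 100.0)
    s = 25400.0 / cn_adj - 254.0  # potential maximum retention (mm)
    ia = 0.2 * s                  # initial abstraction (mm)
    if p_mm <= ia:
        return 0.0
    return (p_mm - ia) ** 2 / (p_mm - ia + s)
```

For example, with CN = 80 and normal antecedent moisture, 50 mm of rain gives S = 63.5 mm, Ia = 12.7 mm, and roughly 13.8 mm of runoff.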

6. Flow Routing & Accumulation

Per-scenario re-routing is avoided by pre-computing a static weight surface once (pysheds: DEM fill → D8 flow direction → flow accumulation). Accumulated runoff volume at each cell is:

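$$V_{\text{acc}} = V_{\text{local}} \cdot \bigl(1 + \ln(1 + A_{\text{flow}})\bigr)$$

where $V_{\text{local}} = Q \cdot \text{cell\_area} / 1000$ (m³, with Q in mm and cell_area in m²) and $A_{\text{flow}}$ is the upstream contributing cell count.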
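
As a rough illustration, the accumulation formula translates directly into a few lines of Python. The function names and the `CELL_AREA_M2` constant are hypothetical; the constant simply assumes square 250 m cells.

```python
import math

CELL_AREA_M2 = 250.0 * 250.0  # assumption: square 250 m grid cells


def local_volume_m3(q_mm, cell_area_m2=CELL_AREA_M2):
    """Convert runoff depth (mm) over one cell into a volume (m³)."""
    return q_mm * cell_area_m2 / 1000.0


def accumulated_volume_m3(q_mm, upstream_cells, cell_area_m2=CELL_AREA_M2):
    """Scale local volume by the static log-weighted flow accumulation."""
    weight = 1.0 + math.log1p(upstream_cells)  # 1 + ln(1 + A_flow)
    return local_volume_m3(q_mm, cell_area_m2) * weight
```

A headwater cell with no upstream contributors has weight 1, so its accumulated volume equals its local volume; 10 mm of runoff on such a cell corresponds to 625 m³.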

7. HAND & Inundation

HAND (Height Above Nearest Drainage) is computed by tracing each cell downstream along D8 until a drainage cell (flow accumulation ≥ 100) is reached; the vertical difference is the HAND value.

A cell is flagged as flooded in a timestep if either condition holds:

  1. HAND < 2.0 m AND $V_{\text{acc}}$ > drainage_capacity (500 m³)
  2. Cumulative local runoff > local_storage (500 m³) — bathtub fallback
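
The HAND trace and the two flooding conditions above might look like this in simplified form. It is a sketch under stated assumptions: cells are referenced by arbitrary ids, `downstream` is a hypothetical mapping from each cell to the cell it drains into under D8, and returning `None` when no drainage cell is reachable is an illustrative choice rather than documented behaviour.

```python
def hand_for_cell(cell, downstream, elevation, flow_acc, drainage_threshold=100):
    """Trace D8 downstream from `cell` to the first drainage cell.

    Returns the elevation difference (m), or None if the trace ends at an
    outlet or loops before reaching a cell with enough flow accumulation.
    """
    current = cell
    seen = set()
    while flow_acc[current] < drainage_threshold:
        seen.add(current)
        current = downstream.get(current)
        if current is None or current in seen:
            return None
    return elevation[cell] - elevation[current]


def is_flooded(hand_m, v_acc_m3, cumulative_local_m3,
               hand_threshold_m=2.0, drainage_capacity_m3=500.0,
               local_storage_m3=500.0):
    """Apply the two flooding conditions to one cell and timestep."""
    routed = hand_m < hand_threshold_m and v_acc_m3 > drainage_capacity_m3
    bathtub = cumulative_local_m3 > local_storage_m3  # bathtub fallback
    return routed or bathtub
```

A cell that is itself a drainage cell has HAND = 0 and so always passes the height test; whether it floods then depends only on the volume thresholds.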

8. Monte Carlo Ensemble (N = 30)

Each scenario samples one forecast model and perturbs three inputs:

  • Rainfall: lognormal, σ ≈ 10 %
  • Curve Number: normal, σ ≈ 10 %
  • Drainage capacity: normal, σ ≈ 20 %

$$P_{\text{physics}} = \frac{\#\,\text{scenarios flooded}}{N}$$
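
One way to picture the ensemble loop is the sketch below. Several details are assumptions rather than documented behaviour: the rainfall perturbation is applied as a multiplicative lognormal factor with median 1, each σ is read as a relative standard deviation, the sampled Curve Number is clamped to [1, 100], and the `is_flooded` callback stands in for the full per-cell hydrology.

```python
import random


def flood_probability(model_rain_mm, cn, drainage_capacity_m3, is_flooded,
                      n=30, seed=0):
    """Fraction of perturbed scenarios in which `is_flooded` returns True.

    model_rain_mm -- dict of forecast-model name -> rainfall (mm)
    is_flooded    -- callable(rain_mm, cn, capacity_m3) -> bool (hypothetical)
    """
    rng = random.Random(seed)
    models = sorted(model_rain_mm)
    flooded = 0
    for _ in range(n):
        # Pick one forecast model, then perturb its rainfall (sigma ~ 10 %).
        rain = model_rain_mm[rng.choice(models)] * rng.lognormvariate(0.0, 0.10)
        # Curve Number: normal, sigma ~ 10 %, clamped to a valid range.
        cn_s = min(max(rng.gauss(cn, 0.10 * cn), 1.0), 100.0)
        # Drainage capacity: normal, sigma ~ 20 %, kept non-negative.
        cap = max(rng.gauss(drainage_capacity_m3, 0.20 * drainage_capacity_m3), 0.0)
        if is_flooded(rain, cn_s, cap):
            flooded += 1
    return flooded / n
```

With N = 30 the resulting probability can only take values in steps of 1/30, which is worth keeping in mind when reading the class boundaries.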

9. Layer 4 — Fusion & Classification

$$P_{\text{final}} = \operatorname{clip}\bigl(P_{\text{physics}} \cdot \alpha_{\text{regional}},\ 0,\ 1\bigr)$$

$\alpha_{\text{regional}}$ = 1.0 by default, boosted to 1.3 when GloFAS indicates active flood conditions. In v0.2, JAXA GSMaP and forecast-verification signals are displayed as confidence indicators only and do not alter $\alpha_{\text{regional}}$; active amplification is reserved for v1.0 pending re-validation.

Level | Probability | Label
L1 | P < 0.10 | Safe
L2 | 0.10 ≤ P < 0.40 | Low
L3 | 0.40 ≤ P < 0.85 | Moderate
L4 | P ≥ 0.85 | High
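
The fusion step and the class boundaries can be expressed compactly. The function names and the `glofas_active` flag are hypothetical; the amplifier values and thresholds are the ones stated in this section.

```python
def fuse(p_physics, glofas_active=False):
    """Multiply by the regional amplifier and clip to [0, 1]."""
    alpha = 1.3 if glofas_active else 1.0
    return min(max(p_physics * alpha, 0.0), 1.0)


def risk_level(p):
    """Bin a final probability into one of the four risk levels."""
    if p < 0.10:
        return "L1 Safe"
    if p < 0.40:
        return "L2 Low"
    if p < 0.85:
        return "L3 Moderate"
    return "L4 High"
```

For instance, a physics probability of 0.70 is L3 Moderate on its own, but with the GloFAS boost it becomes about 0.91 and is classified L4 High.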

10. Commune Aggregation

Each grid cell is spatially joined to one of Dangkor's 13 sangkats (GADM level-3). For each commune:

$$P_{\text{commune}} = \max_{c \,\in\, \text{commune}} P_{\text{cell},\,c}$$

plus the percent of commune area at L3+ and L4, used for public-facing messaging and Telegram alerts.
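
A simplified version of the aggregation, taking the spatial join as already done, could look like this. The function and field names are hypothetical, and because all cells are assumed to have equal area, the share of cells stands in for the share of commune area.

```python
from collections import defaultdict


def aggregate_communes(cells):
    """Summarise per-cell probabilities by commune.

    cells -- iterable of (commune_name, p_final) pairs
    """
    grouped = defaultdict(list)
    for commune, p in cells:
        grouped[commune].append(p)

    summary = {}
    for commune, probs in grouped.items():
        n = len(probs)
        summary[commune] = {
            "cells": n,
            "peak_p": max(probs),
            # Assumption: equal-area cells, so cell share equals area share.
            "pct_l3_plus": 100.0 * sum(p >= 0.40 for p in probs) / n,
            "pct_l4": 100.0 * sum(p >= 0.85 for p in probs) / n,
        }
    return summary
```

Taking the maximum means a single high-risk cell sets the commune's headline probability, which is why the area percentages are reported alongside it.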

11. Sentinel-1 SAR Validation

The engine was validated against two full monsoon seasons (2024 + 2025) of Copernicus Sentinel-1 SAR imagery processed in Google Earth Engine. Each S1 overpass was converted to a per-cell flood flag using VV-band backscatter change detection (−3 dB anomaly threshold vs. dry-season baseline, JRC permanent-water mask applied). This produced 224,940 cell-image observations across 115 overpasses and 1,956 grid cells.
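
The per-cell labelling rule can be illustrated outside Earth Engine with a plain-Python sketch. It assumes that a flood shows up as a backscatter drop of at least 3 dB relative to the dry-season baseline, and it uses the >2% of district cells criterion from the validation table to decide whether an overpass counts as a flood event; the function names are hypothetical.

```python
def sar_flood_flag(vv_db, baseline_db, permanent_water, threshold_db=-3.0):
    """Flag a cell as flooded when VV drops by 3 dB or more vs. baseline."""
    if permanent_water:
        return False  # masked out: permanent water is not a flood
    return (vv_db - baseline_db) <= threshold_db


def is_flood_event(cell_flags, min_fraction=0.02):
    """An overpass counts as a flood event if >2% of cells are flagged."""
    return sum(cell_flags) / len(cell_flags) > min_fraction
```

A cell at −18 dB against a −14 dB baseline is flagged (anomaly of −4 dB), while a cell at −15 dB is not.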

Metric choice: Cell-level instantaneous accuracy is the wrong measure for a 72-hour early-warning system, because the model predicts a 72-hour peak probability while SAR captures one instantaneous snapshot. The correct question is: "On days when SAR confirmed significant flooding, did the model issue a warning in the 72 hours before?"

Metric | Value | Note
S1 flood events | 99 / 115 images | >2% of district cells flooded
Events warned ✓ | 86 / 99 | Model warned within the 72 h before the event
Precision | 88.7% | Warnings that matched real floods
Recall | 86.9% | Real floods that got a warning
F1 score | 0.878 | Harmonic mean of precision and recall
Mean lead time | 53 h | Median 60 h, min 12 h
False alarms | 11 / full season | ≈ 1 per 10 days during monsoon

The 2025 season was fully out-of-sample (no thresholds were tuned on it) and returned F1 = 0.885, which suggests the model generalises across monsoon years. Warn threshold: district mean_prob ≥ 0.10 in any run within the 72 h before the overpass.
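
The headline scores can be reproduced from the event counts reported in this section. The sketch below assumes precision is computed as warned events divided by warned events plus false alarms, which is consistent with the published figures; the function name is hypothetical.

```python
def event_scores(events, warned, false_alarms):
    """Event-level precision, recall, and F1 from raw counts."""
    precision = warned / (warned + false_alarms)
    recall = warned / events
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


precision, recall, f1 = event_scores(events=99, warned=86, false_alarms=11)
print(f"{precision:.1%} {recall:.1%} {f1:.3f}")  # 88.7% 86.9% 0.878
```

The 13 missed events (99 − 86) are what hold recall below precision.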

12. Limitations

  • No live river-gauge assimilation (Mekong/Bassac levels enter only via GloFAS, which is still a stub in v0.2).
  • Flow routing uses a static weight surface, not dynamic hydraulics — no backwater or levee-overtopping physics.
  • Urban drainage infrastructure is represented only by a lumped capacity parameter.
  • 13 missed events (of 99) cluster in brief early-season convective storms, where short bursts of intense rainfall are not reflected in the 72 h average signal.
  • Forecasts are probabilistic and model-based — this is NOT an official warning.

See LIMITATIONS.md for the full list.

Engine v0.2 · Open data · Open source · GitHub

Telegram Alerts — Coming Soon

Subscribe to automatic flood alerts for up to 5 locations of your choice — home, workplace, family, school. Pick any point on the map (Google Maps integration) and receive a Telegram message whenever the forecast risk at that location reaches your chosen threshold.

Up to 5 locations

Pin any address inside Dangkor via Google Maps Autocomplete.

Per-user threshold

Alert at L2, L3, or only L4. You choose the sensitivity.

Commune-aware messages

Human-readable alerts: "L3 Moderate risk in Stueng Mean Chey — peak in 18h".

EN / ខ្មែរ

Full bilingual support (English + Khmer).

Roadmap

  1. Historical database — Supabase/Postgres storing every 6h run since May 2024 (368 dates, 2 full monsoon seasons)
  2. Sentinel-1 SAR validation — 115 satellite images, 99 flood events validated. F1 = 0.878, mean lead time 53h.
  3. Alert dispatcher built — district mean_prob ≥ 0.10 triggers warning. Reads live runs every 6h. Telegram integration ready pending bot deployment.
  4. ⏳ Telegram bot on Railway with /subscribe, /mylocations, /threshold
  5. ⏳ Google Places Autocomplete for location entry

Want early access? Star the GitHub repo.