Dangkor Flood Forecast

Risk Level

L1 Safe (P < 0.10)

L2 Low (0.10–0.40)

L3 Moderate (0.40–0.85)

L4 High (≥ 0.85)

Sentinel-1 Validation — 2024 + 2025 seasons

Precision: 88.7% · Recall: 86.9% · F1 score: 0.878 · Mean lead time: 53 h

Event-level: the model warned (district mean_prob ≥ 0.10) within the 72 h before a SAR-confirmed flood. 99 flood events across 115 Sentinel-1 overpasses · 2025 out-of-sample F1 = 0.885.

Methodology

1. Overview

This engine produces a 72-hour probabilistic flood forecast on a 250 m grid (1,956 cells) covering Dangkor District, Phnom Penh. It runs every 6 hours on GitHub Actions using only free/open data. Predictions are physics-based — no machine learning in v1 — and combine a meteorological driver, a hydrological response model with Monte Carlo uncertainty, and regional amplifier signals.

Grid: 250 m · Cells: 1,956 · Communes: 13 · Horizon: 72 h · Timestep: 6 h · Ensemble: N = 30 · Cadence: every 6 h

2. Four-Layer Architecture

Layer 1 — Meteorological Driver. Multi-model rainfall forecasts (ECMWF IFS, GFS, ICON) pulled from Open-Meteo for 9 sample points on a 3×3 grid across the district, then interpolated to the 250 m grid via Inverse-Distance Weighting (IDW, power = 2).
Layer 2 — Hydrological Response. SCS Curve Number runoff per cell, accumulated along a static flow-routing surface derived from Copernicus DEM (GLO-30) + pysheds D8. A Monte Carlo ensemble of N = 30 scenarios perturbs rainfall, curve number, and drainage capacity to produce a flood probability.
Layer 3 — Regional Signals. Three external signals provide confidence context: (1) Forecast verification — compares yesterday's Open-Meteo forecast against the observed archive to score model accuracy; (2) JAXA GSMaP NRT — independent satellite rainfall from the joint NASA/JAXA GPM constellation, 0.1° hourly with ~4h latency; (3) GloFAS — Copernicus river discharge forecast (stub, pending integration). JAXA and forecast-verification are displayed as confidence indicators. GloFAS, when active, boosts the regional amplifier to 1.3.
Layer 4 — Fusion & Classification. Physics probability is multiplied by a regional amplifier, clipped to [0, 1], then binned into four risk levels.

3. Data Sources

Source	Layer	Use
Open-Meteo (ECMWF IFS, GFS, ICON)	1	72 h rainfall forecast, multi-model ensemble
Open-Meteo Archive	1, 3	7-day antecedent rainfall for AMC scaling (Layer 1); forecast verification signal comparing yesterday's forecast with the observed archive, not raw satellite data (Layer 3)
Copernicus DEM GLO-30 (AWS S3)	2	Elevation, flow direction, flow accumulation, HAND
ESA WorldCover 2021 (COG /vsicurl)	2	Land-cover class → Curve Number
GADM level-2 & level-3	Static	Dangkor boundary + 13 commune polygons
JAXA GSMaP NRT	3	Independent satellite rainfall confirmation — joint NASA/JAXA GPM constellation, 0.1° hourly, ~4h latency. Live when JAXA G-Portal credentials are configured.
GloFAS (Copernicus CDS)	3	River discharge forecast — stub (pending Copernicus CDS integration)

4. Layer 1 — Rainfall (IDW Interpolation)

Sample-point forecasts are interpolated to grid cell i using:

P_i = ( Σ_k w_ik · P_k ) / ( Σ_k w_ik ), w_ik = 1 / d_ik²

where d_ik is the Euclidean distance from cell i to sample point k in UTM zone 48N (EPSG:32648). Hourly values are aggregated into 6-hour timesteps by summation.
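
The interpolation and the 6-hour aggregation can be sketched as follows. This is a minimal illustration, assuming coordinates are already projected to EPSG:32648; the function names and the handling of a cell that coincides with a sample point (d = 0) are assumptions, not taken from the engine.

```python
import math


def idw_interpolate(cell_xy, sample_xy, sample_values, power=2.0):
    """Inverse-distance-weighted estimate at one grid cell.

    cell_xy and each entry of sample_xy are (easting, northing) pairs in
    metres, assumed to be in UTM zone 48N already.
    """
    num = 0.0
    den = 0.0
    for (sx, sy), value in zip(sample_xy, sample_values):
        d = math.hypot(cell_xy[0] - sx, cell_xy[1] - sy)
        if d == 0.0:
            # Assumption: a cell sitting exactly on a sample point takes
            # that point's value (the 1/d^2 weight is undefined there).
            return value
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def to_six_hourly(hourly_mm):
    """Sum hourly rainfall (mm) into consecutive 6-hour totals."""
    return [sum(hourly_mm[i:i + 6]) for i in range(0, len(hourly_mm), 6)]
```

With power = 2, a sample point twice as far away carries a quarter of the weight, so the nearest of the 9 sample points dominates each cell's estimate.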

5. Layer 2 — SCS Curve Number Runoff

The NRCS (SCS) Curve Number method (TR-55, 1986) converts rainfall P (mm) to direct runoff Q (mm) for each cell and timestep:

S = 25400 / CN − 254 (mm)

I_a = 0.2 · S

Q = (P − I_a)² / (P − I_a + S) if P > I_a, else Q = 0

Curve Number is assigned per cell from ESA WorldCover × hydrologic soil group (default group D — clay floodplain for Dangkor).
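
A direct transcription of the three equations above into Python might look like this. The function name and the input validation are illustrative assumptions; the CN value of 80 in the usage note is a made-up example, not a value from the engine's lookup table.

```python
def scs_runoff_mm(p_mm, cn):
    """Direct runoff Q (mm) from rainfall P (mm), SCS Curve Number method."""
    if not 0 < cn <= 100:
        raise ValueError("CN must be in (0, 100]")
    s = 25400.0 / cn - 254.0   # potential maximum retention S (mm)
    ia = 0.2 * s               # initial abstraction I_a (mm)
    if p_mm <= ia:
        return 0.0             # all rainfall is absorbed before runoff starts
    return (p_mm - ia) ** 2 / (p_mm - ia + s)
```

For example, with a hypothetical CN of 80, S = 63.5 mm and I_a = 12.7 mm, so 50 mm of rain yields roughly 13.8 mm of runoff, while 10 mm yields none.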

AMC (Antecedent Moisture Condition) scales CN based on 7-day prior rainfall:

Dry (< 12.7 mm): factor 0.85
Normal (12.7–38.1 mm): factor 1.00
Wet (> 38.1 mm): factor 1.15
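
The AMC lookup can be sketched as below. The cap at CN = 100 is an assumption added here so that the retention S = 25400 / CN − 254 never goes negative when a high CN is scaled by 1.15; the source does not state how the engine handles that case.

```python
def amc_factor(antecedent_mm):
    """CN scaling factor from 7-day antecedent rainfall (mm)."""
    if antecedent_mm < 12.7:
        return 0.85   # dry
    if antecedent_mm <= 38.1:
        return 1.00   # normal
    return 1.15       # wet


def adjust_cn(cn, antecedent_mm, cn_max=100.0):
    """Scale CN by the AMC factor.

    Assumption: the result is capped at cn_max to keep S >= 0.
    """
    return min(cn * amc_factor(antecedent_mm), cn_max)
```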

6. Flow Routing & Accumulation

Per-scenario re-routing is avoided by pre-computing a static weight surface once (pysheds: DEM fill → D8 flow direction → flow accumulation). Accumulated runoff volume at each cell is:

V_acc = V_local · (1 + ln(1 + A_flow))

where V_local = Q · cell_area / 1000 (m³) and A_flow is the upstream contributing cell count.
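
As a worked illustration of the accumulation formula, assuming square 250 m cells (62,500 m² each); the function names are hypothetical:

```python
import math

CELL_SIZE_M = 250.0  # grid resolution stated in the overview


def local_volume_m3(q_mm, cell_area_m2=CELL_SIZE_M ** 2):
    """Runoff depth Q (mm) over one cell, converted to volume (m^3)."""
    return q_mm * cell_area_m2 / 1000.0


def accumulated_volume_m3(q_mm, upstream_cells):
    """V_acc = V_local * (1 + ln(1 + A_flow))."""
    return local_volume_m3(q_mm) * (1.0 + math.log1p(upstream_cells))
```

A headwater cell (A_flow = 0) keeps exactly its local volume, since ln(1) = 0. The logarithm damps the influence of very large contributing areas, so 10 mm of runoff gives 625 m³ locally and about 3,500 m³ at a cell with 99 upstream cells.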

7. HAND & Inundation

HAND (Height Above Nearest Drainage) is computed by tracing each cell downstream along D8 until a drainage cell (flow accumulation ≥ 100) is reached; the vertical difference is the HAND value.
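
The downstream trace can be sketched on a flattened grid. This is a simplified stand-in rather than the pysheds-based implementation: `downstream` is a hypothetical list mapping each cell index to its D8 receiver (or None at the grid edge), and returning None for paths that never reach drainage is an assumption.

```python
def hand_for_cell(start, elevation, downstream, flow_acc, drainage_threshold=100):
    """Height of cell `start` above the first drainage cell on its D8 path.

    elevation, downstream and flow_acc are parallel lists indexed by cell.
    """
    cell = start
    visited = set()
    while cell is not None and cell not in visited:
        if flow_acc[cell] >= drainage_threshold:
            return elevation[start] - elevation[cell]
        visited.add(cell)          # guard against loops in the flow grid
        cell = downstream[cell]
    return None  # left the grid (or looped) before reaching a drainage cell
```

A cell that is itself a drainage cell gets HAND = 0, which is why channel cells are the first to satisfy the HAND < 2.0 m condition.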

A cell is flagged as flooded in a timestep if either condition holds:

HAND < 2.0 m AND V_acc > drainage_capacity (500 m³)
Cumulative local runoff > local_storage (500 m³) — bathtub fallback
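
The two conditions combine into a single boolean test per cell and timestep. A minimal sketch, with hypothetical names and the thresholds quoted above as defaults:

```python
HAND_MAX_M = 2.0
DRAINAGE_CAPACITY_M3 = 500.0
LOCAL_STORAGE_M3 = 500.0


def is_flooded(hand_m, v_acc_m3, cumulative_local_m3,
               drainage_capacity_m3=DRAINAGE_CAPACITY_M3,
               local_storage_m3=LOCAL_STORAGE_M3):
    """True if either the routed or the bathtub condition holds."""
    routed = hand_m < HAND_MAX_M and v_acc_m3 > drainage_capacity_m3
    bathtub = cumulative_local_m3 > local_storage_m3
    return routed or bathtub
```

Keeping `drainage_capacity_m3` as a parameter lets each ensemble scenario pass in its own perturbed capacity.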

8. Monte Carlo Ensemble (N = 30)

Each scenario samples one forecast model and perturbs the inputs:

Rainfall: lognormal, σ ≈ 10 %
Curve Number: normal, σ ≈ 10 %
Drainage capacity: normal, σ ≈ 20 %

P_physics = (# scenarios flooded) / N
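
One way to picture the ensemble loop for a single cell is shown below. Several details are assumptions: σ is read as a relative standard deviation, the lognormal multiplier is centred on a median of 1, perturbed values are clamped to valid ranges, and `flood_fn` is a hypothetical callback standing in for the runoff, routing and inundation steps.

```python
import random


def flood_probability(model_rain_mm, cn, capacity_m3, flood_fn, n=30, seed=0):
    """Fraction of n perturbed scenarios in which flood_fn reports flooding.

    model_rain_mm: one rainfall total per forecast model (e.g. IFS, GFS, ICON).
    flood_fn(rain_mm, cn, capacity_m3) -> bool is supplied by the caller.
    """
    rng = random.Random(seed)
    flooded = 0
    for _ in range(n):
        base_rain = rng.choice(model_rain_mm)                # sample one model
        rain = base_rain * rng.lognormvariate(0.0, 0.10)     # ~10 % spread
        cn_s = min(max(rng.gauss(cn, 0.10 * cn), 1.0), 100.0)
        cap_s = max(rng.gauss(capacity_m3, 0.20 * capacity_m3), 0.0)
        if flood_fn(rain, cn_s, cap_s):
            flooded += 1
    return flooded / n
```

With N = 30 the resulting probability is quantised in steps of 1/30 (about 0.033), which is fine relative to the width of the four risk bins.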

9. Layer 4 — Fusion & Classification

P_final = clip( P_physics · α_regional, 0, 1 )

α_regional = 1.0 by default, boosted to 1.3 when GloFAS indicates active flood conditions. In v0.2, JAXA GSMaP and forecast-verification signals are displayed as confidence indicators only and do not alter α_regional — active amplification is reserved for v1.0 pending re-validation.

Level	Probability	Label
L1	P < 0.10	Safe
L2	0.10 ≤ P < 0.40	Low
L3	0.40 ≤ P < 0.85	Moderate
L4	P ≥ 0.85	High
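
The fusion and binning steps are small enough to sketch in full. The function names and the boolean `glofas_active` flag are illustrative assumptions; the thresholds come from the table above.

```python
def fuse(p_physics, glofas_active=False):
    """P_final = clip(P_physics * alpha_regional, 0, 1)."""
    alpha = 1.3 if glofas_active else 1.0
    return min(max(p_physics * alpha, 0.0), 1.0)


def risk_level(p):
    """Map a final probability to its risk level code."""
    if p < 0.10:
        return "L1"  # Safe
    if p < 0.40:
        return "L2"  # Low
    if p < 0.85:
        return "L3"  # Moderate
    return "L4"      # High
```

For instance, a physics probability of 0.70 is L3 on its own but becomes 0.91, and therefore L4, when the GloFAS amplifier is active.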

10. Commune Aggregation

Each grid cell is spatially joined to one of Dangkor's 13 sangkats (GADM level-3). For each commune:

P_commune = max(P_cell for all cells in commune)

plus the percent of commune area at L3+ and L4 — used for public-facing messaging and Telegram alerts.
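
A sketch of the aggregation, assuming the spatial join has already produced one (commune, P_final) pair per cell and that all cells have equal area, so the share of cells equals the share of area. The function name and output keys are hypothetical.

```python
from collections import defaultdict


def aggregate_by_commune(cells):
    """cells: iterable of (commune_name, p_final) pairs, one per grid cell."""
    grouped = defaultdict(list)
    for commune, p in cells:
        grouped[commune].append(p)

    summary = {}
    for commune, probs in grouped.items():
        n = len(probs)
        summary[commune] = {
            "cells": n,
            "peak_p": max(probs),
            # Equal-area cells assumed: cell share stands in for area share.
            "pct_l3_plus": 100.0 * sum(p >= 0.40 for p in probs) / n,
            "pct_l4": 100.0 * sum(p >= 0.85 for p in probs) / n,
        }
    return summary
```

Because the commune value is a maximum, a single high-probability cell sets the commune's peak level; the area percentages give the complementary picture of how widespread the risk is.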

11. Sentinel-1 SAR Validation

The engine was validated against two full monsoon seasons (2024 + 2025) of Copernicus Sentinel-1 SAR imagery processed in Google Earth Engine. Each S1 overpass was converted to a per-cell flood flag using VV-band backscatter change detection (−3 dB anomaly threshold vs. dry-season baseline, JRC permanent-water mask applied). This produced 224,940 cell-image observations across 115 overpasses and 1,956 grid cells.
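
The per-cell flagging and the event definition can be illustrated in plain Python, independent of the Earth Engine processing. Treating a change of exactly −3 dB as flooded is an assumption, as are the function names; the 2 % event threshold is the one listed in the metrics table.

```python
def sar_flood_flags(vv_db, baseline_db, permanent_water, threshold_db=-3.0):
    """Per-cell flood flags from VV backscatter change versus a dry baseline.

    All three inputs are parallel sequences, one entry per grid cell.
    """
    return [
        (vv - base) <= threshold_db and not water
        for vv, base, water in zip(vv_db, baseline_db, permanent_water)
    ]


def is_flood_event(flags, min_fraction=0.02):
    """An overpass counts as a flood event if >2% of cells are flagged."""
    return sum(flags) / len(flags) > min_fraction
```

Open water scatters the radar signal away from the sensor, so a drop in backscatter relative to the dry-season baseline is the flood signature; the permanent-water mask keeps ponds and canals from being counted.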

Metric choice: Cell-level instantaneous accuracy is the wrong measure for a 72-hour early-warning system, because the model predicts a 72-hour peak probability while SAR captures one instantaneous snapshot. The correct question is: "On days when SAR confirmed significant flooding, did the model issue a warning in the 72 hours before?"

Metric	Value	Note
S1 flood events	99 / 115 images	>2% of district cells flooded
Events warned	86 / 99	Model warned within the 72 h before the overpass
Precision	88.7%	Warnings that matched real floods
Recall	86.9%	Real floods that got a warning
F1 score	0.878	Harmonic mean
Mean lead time	53 h	Median 60 h, min 12 h
False alarms	11 / full season	≈ 1 per 10 days during monsoon

The 2025 season was fully out-of-sample (no thresholds were tuned on it) and returned F1 = 0.885, which suggests the model generalises across monsoon years. Warn threshold: district mean_prob ≥ 0.10 in any run within the 72 h before the overpass.
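
The headline figures can be reproduced from the counts in the table, assuming the 11 false alarms are the false positives behind the reported precision (86 warned events, 13 missed, 11 false alarms):

```python
def event_metrics(true_pos, false_neg, false_pos):
    """Precision, recall and F1 from event-level counts."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


precision, recall, f1 = event_metrics(true_pos=86, false_neg=13, false_pos=11)
print(f"precision={precision:.1%} recall={recall:.1%} F1={f1:.3f}")
# prints: precision=88.7% recall=86.9% F1=0.878
```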

12. Limitations

No live river-gauge assimilation (Mekong/Bassac levels enter only via GloFAS, stubbed in v1).
Flow routing uses a static weight surface, not dynamic hydraulics — no backwater or levee-overtopping physics.
Urban drainage infrastructure is represented only by a lumped capacity parameter.
The 13 missed events (of 99) cluster in brief early-season convective storms, where short-lived rainfall intensity exceeds what the 72 h average signal captures.
Forecasts are probabilistic and model-based — this is NOT an official warning.

See LIMITATIONS.md for the full list.

Telegram Alerts — Coming Soon

Subscribe to automatic flood alerts for up to 5 locations of your choice — home, workplace, family, school. Pick any point on the map (Google Maps integration) and receive a Telegram message whenever the forecast risk at that location reaches your chosen threshold.

Up to 5 locations

Pin any address inside Dangkor via Google Maps Autocomplete.

Per-user threshold

Alert at L2, L3, or only L4. You choose the sensitivity.

Commune-aware messages

Human-readable alerts: "L3 Moderate risk in Stueng Mean Chey — peak in 18h".

EN / ខ្មែរ

Full bilingual support (English + Khmer).

Roadmap

✅ Historical database — Supabase/Postgres storing every 6h run since May 2024 (368 dates, 2 full monsoon seasons)
✅ Sentinel-1 SAR validation — 115 satellite images, 99 flood events validated. F1 = 0.878, mean lead time 53h.
✅ Alert dispatcher built — district mean_prob ≥ 0.10 triggers warning. Reads live runs every 6h. Telegram integration ready pending bot deployment.
⏳ Telegram bot on Railway with /subscribe, /mylocations, /threshold
⏳ Google Places Autocomplete for location entry

Want early access? Star the GitHub repo.