Executive Summary
FloodCat — what it is, how it works, susceptibility map
Catalonia flood intelligence
32,108 km² · 7.8M people · 8 validated events · 138 historical episodes
What is FloodCat?
The project: FloodCat is an end-to-end flood-intelligence platform for Catalonia. It fuses four decades of historical flood records with Sentinel-1 SAR and Sentinel-2 optical imagery, a terrain-based susceptibility map, and a Random Forest impact model — turning raw geoscience data into decisions for insurers, civil protection, and farmers.
Three datasets, three roles:
- 8 validated modern events (2019–2021) — Sentinel-1/2 satellite-confirmed floods. Used to train the houses-affected regressor and to validate the FloodPotential susceptibility map (59% mean agreement between satellite-detected flood extents and our High/Medium zones).
- 82 AGORA / INUNGAMA episodes (1996–2020) — insurance-grade economic loss records. Used to train the economic-loss regressor (€ per event).
- 56 DesInventar episodes (1981–2003) — UNDRR-hosted disaster inventory recording houses and people affected. Used together with the 8 modern events to expand the houses-affected training set to 64 rows.
The model: a scikit-learn Random Forest (200 trees, 20 features). Rainfall intensity, antecedent soil moisture, and FloodPotential exposure are the dominant predictors. Detection is delegated to IBM/NASA's Prithvi-EO-2.0 foundation model (Sentinel-2) and adaptive SAR thresholding (Sentinel-1).
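As a rough illustration of what "adaptive SAR thresholding" can look like, the sketch below picks a water/land cut-off from the backscatter histogram itself using Otsu's method. Otsu is an assumption here (the source does not name the algorithm), and the backscatter values, the function name `otsu_threshold`, and the dB levels are all synthetic or hypothetical.

```python
import numpy as np

def otsu_threshold(values: np.ndarray, bins: int = 256) -> float:
    """Return the histogram threshold that maximises between-class variance."""
    hist, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    weights = hist / hist.sum()
    w0 = np.cumsum(weights)              # probability of the darker class
    w1 = 1.0 - w0                        # probability of the brighter class
    mu_cum = np.cumsum(weights * centers)
    mu_total = mu_cum[-1]
    between = np.zeros_like(w0)
    valid = (w0 > 1e-12) & (w1 > 1e-12)  # skip bins where one class is empty
    between[valid] = (mu_total * w0[valid] - mu_cum[valid]) ** 2 / (w0[valid] * w1[valid])
    return float(centers[np.argmax(between)])

# Synthetic Sentinel-1 backscatter in dB (assumed values, not project data):
# open water is dark (low backscatter), land is brighter.
rng = np.random.default_rng(0)
water = rng.normal(-20.0, 1.5, 2000)
land = rng.normal(-8.0, 2.0, 8000)
backscatter_db = np.concatenate([water, land])

threshold = otsu_threshold(backscatter_db)
flood_mask = backscatter_db < threshold  # True where a pixel looks like water
```

Because the threshold is derived per scene rather than fixed, it can follow changes in incidence angle or surface roughness between acquisitions, which is the usual motivation for an adaptive approach.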
Use the sidebar: Farmers to subscribe a parcel · Insurance for loss curves & impact estimation · Civil Protection for seasonality & climate trend · Technical for satellite detection & model performance.
Validation (8 events): 59%
Episodes (40 yrs): 138
Economic loss: €1.78B
.pkl Random Forest models (feature importances, 64 Catalonia training rows, FloodPotential zones) imported directly from the project repository.
Zone 1 — High
Floods first — lowest FloodOrder quartile
Zone 2 — Medium
Second quartile of FloodOrder
Zone 3 — Low
Third quartile
Zone 4 — Very Low
Highest FloodOrder quartile
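The four zones above can be sketched as a quartile binning of FloodOrder. This is a minimal sketch under assumptions: the source does not say whether quartiles are computed over all of Catalonia or per basin, and `zone_from_flood_order` and the sample values are hypothetical.

```python
import numpy as np

def zone_from_flood_order(flood_order: np.ndarray) -> np.ndarray:
    """Map FloodOrder values to zones 1 (lowest quartile) to 4 (highest quartile)."""
    q1, q2, q3 = np.quantile(flood_order, [0.25, 0.5, 0.75])
    # np.digitize yields 0..3 for the four quartile bins; shift to 1..4.
    return np.digitize(flood_order, [q1, q2, q3], right=True) + 1

# Hypothetical FloodOrder values for eight city blocks (lower = floods earlier).
flood_order = np.arange(1, 9)
zones = zone_from_flood_order(flood_order)
```

With this binning, each zone holds roughly a quarter of the blocks, so the grade is relative to the other blocks in the same ranking rather than an absolute depth or probability.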
A map of where water naturally wants to go. We take a detailed elevation model of the land (3D terrain, ~30 m resolution), simulate how rain would flow downhill, and highlight the urban areas where it would pool first.
Each city block gets a risk grade from 1 (floods first) to 4 (floods last). This is based purely on the shape of the land — it doesn't know about past floods.
Random Forest · 200 trees · 20 features. Three regressors: AffectedHouses (64 events), AffectedPeople (64), EconomicLoss (126 AGORA).
Rainfall intensity and FloodPotential exposure dominate Gini importance.
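The configuration described above (200 trees, 20 features, 64 training events) can be approximated with scikit-learn as follows. The data are synthetic and the target is constructed so that the first two columns dominate; the real feature names, hyperparameters beyond the tree count, and training rows are not given in the source. Note that scikit-learn's `feature_importances_` is the impurity-based importance commonly called Gini importance; for a regressor the impurity is squared error.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)
n_events, n_features = 64, 20  # sizes taken from the text; values are synthetic

X = rng.normal(size=(n_events, n_features))
# Assumed toy target: column 0 and column 1 stand in for the dominant predictors.
y = 50 * X[:, 0] + 30 * X[:, 1] + rng.normal(scale=5, size=n_events)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X, y)

importances = model.feature_importances_  # impurity-based, sums to 1
ranking = np.argsort(importances)[::-1]   # feature indices, most important first
```

With only 64 rows, impurity-based importances can be unstable, so a permutation-based check on held-out events would be a reasonable complement.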
① Feature extraction (20 vars)
② RF training (64 + 126 events)
③ Impact prediction
④ Sentinel-2 + Sentinel-1 detection
⑤ Validation against FloodPotential — 58.6%
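One plausible reading of the agreement figure in step ⑤ is the share of satellite-detected flood pixels that fall inside High or Medium zones, averaged over events. The sketch below computes that share for a single event; the definition, the function name `zone_agreement`, and the tiny rasters are assumptions for illustration, not the project's actual metric or data.

```python
import numpy as np

def zone_agreement(flood_mask: np.ndarray, zones: np.ndarray, hit_zones=(1, 2)) -> float:
    """Fraction of flooded pixels located in the given FloodPotential zones."""
    flooded = flood_mask.astype(bool)
    if not flooded.any():
        return float("nan")  # no detected flood pixels, agreement undefined
    return float(np.isin(zones[flooded], hit_zones).mean())

# Hypothetical 3x4 rasters: zone grades 1..4 and a binary flood detection mask.
zones = np.array([[1, 1, 2, 3],
                  [1, 2, 3, 4],
                  [2, 3, 4, 4]])
flood = np.array([[1, 1, 1, 0],
                  [1, 0, 1, 0],
                  [0, 0, 0, 0]])

agreement = zone_agreement(flood, zones)
```

Averaging this value across the validated events would give a mean agreement comparable in form to the reported percentage.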