Technical

Satellite detection + Random Forest model performance

Two technical pillars: (1) Satellite detection — for each historical event we generate a flood mask from Sentinel-2 (Prithvi-EO-2.0, IBM/NASA) on clear-sky days and Sentinel-1 SAR for cloudy storms. (2) Impact regression — a Random Forest trained on 64 Catalonia events with 20 hazard + exposure features.

Key result: 58.6% of detected flood pixels (mean over the 8 validated events) fall inside our High/Medium FloodPotential zones. The susceptibility map is therefore broadly consistent with what the satellites observed, which supports using the model prospectively.

How to read this page: the detection block shows per-event sensor scores; the model block shows real Gini feature-importance values exported from the trained .pkl files, benchmark metrics against Prithvi/SAR baselines, and a confusion matrix for pixel-level detection.

Results computed from the FloodCat algorithm and trained .pkl Random Forest models (feature importances, 64 Catalonia training rows, FloodPotential zones) imported directly from the project repository.

Satellite detection

Sentinel-2 optical (Prithvi-EO-2.0) for clear sky · Sentinel-1 SAR for cloudy storm events

Sentinel-2 mean

56.3%

5 clear-sky events

Sentinel-1 mean

62.3%

3 Storm Gloria events

Overall

58.6%

High/Medium zone agreement

Validation by event
% of detected flood pixels in High/Medium FloodPotential zones
Series: Sentinel-2 (Prithvi) · Sentinel-1 SAR
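The per-event score is the share of detected flood pixels that land in High or Medium zones. A minimal sketch of that calculation, assuming the flood mask and the FloodPotential raster are already on the same grid; the class codes and the function name `zone_agreement` are hypothetical and chosen only for illustration:

```python
import numpy as np

# Hypothetical class codes for the FloodPotential raster (assumed for this sketch).
VERY_LOW, LOW, MEDIUM, HIGH = 1, 2, 3, 4

def zone_agreement(flood_mask, flood_potential):
    """Percent of detected flood pixels that lie in High or Medium zones."""
    if flood_mask.shape != flood_potential.shape:
        raise ValueError("rasters must be aligned to the same grid")
    flooded = flood_mask.astype(bool)
    n_flooded = int(flooded.sum())
    if n_flooded == 0:
        return float("nan")  # nothing detected, so agreement is undefined
    in_zone = np.isin(flood_potential, (MEDIUM, HIGH))
    return 100.0 * int((flooded & in_zone).sum()) / n_flooded

# Toy 3x3 event: four flooded pixels, three of them in High/Medium zones.
flood_mask = np.array([[1, 1, 0],
                       [0, 1, 0],
                       [0, 1, 0]])
flood_potential = np.array([[HIGH, MEDIUM, LOW],
                            [LOW, VERY_LOW, LOW],
                            [VERY_LOW, HIGH, MEDIUM]])
agreement = zone_agreement(flood_mask, flood_potential)
print(f"{agreement:.1f}%")  # 75.0%
```

Returning NaN for an event with no detected flood pixels keeps empty detections out of a per-sensor mean instead of counting them as 0%.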
Sentinel-2 — Prithvi-EO-2.0
IBM/NASA 300M-parameter geospatial foundation model fine-tuned on Sen1Floods11

Optical 10 m flood mask. Cloud masking via Scene Classification Layer when contamination > 5%.

  • Bands: B02, B03, B04, B08, B11, B12
  • Output: binary water/no-water raster aligned to FloodPotential
  • Best on clear-sky daylight passes (5-day revisit)
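The 5% cloud-masking rule can be sketched as follows. This is an illustration, not the project's code: it assumes the Scene Classification Layer (delivered at 20 m) has already been resampled to the 10 m mask grid, that "contamination" means the share of pixels in SCL classes 3, 8, 9 and 10, and that masked pixels are written as a hypothetical nodata value of 255.

```python
import numpy as np

# Sentinel-2 L2A SCL codes counted as contamination in this sketch (an assumption):
# 3 = cloud shadow, 8 = cloud medium probability,
# 9 = cloud high probability, 10 = thin cirrus.
CLOUD_CLASSES = (3, 8, 9, 10)
CONTAMINATION_LIMIT = 0.05  # the 5% trigger quoted above

def mask_clouds(water, scl, nodata=255):
    """Return (masked_water, contamination_fraction)."""
    cloudy = np.isin(scl, CLOUD_CLASSES)
    contamination = float(cloudy.mean())
    masked = water.copy()
    if contamination > CONTAMINATION_LIMIT:
        masked[cloudy] = nodata  # drop cloudy pixels from the flood mask
    return masked, contamination

water = np.array([[1, 0, 0, 1],
                  [0, 1, 0, 0]], dtype=np.uint8)
scl = np.array([[6, 4, 9, 9],
                [4, 6, 5, 4]], dtype=np.uint8)
masked, contamination = mask_clouds(water, scl)
print(contamination)    # 0.25
print(masked.tolist())  # [[1, 0, 255, 255], [0, 1, 0, 0]]
```

Marking cloudy pixels as nodata rather than "dry" keeps them out of both the numerator and the denominator of any later agreement score.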
Sentinel-1 — SAR
Adaptive VV backscatter thresholding (Copernicus EMS style)

VV+VH GRD imagery, speckle-filtered, threshold = mean − 2.0 × std. Works through clouds — used for Storm Gloria.

  • Polarisations: VV (primary), VH (secondary)
  • Pre-processing: refined Lee filter, terrain correction
  • Limit: cannot distinguish flood vs permanent water without pre-flood reference
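The thresholding rule quoted above (threshold = mean − 2.0 × std) can be sketched in a few lines. The sketch assumes VV backscatter in dB that is already speckle-filtered and terrain-corrected, scene-wide statistics rather than per-tile ones, and that pixels below the threshold are the water candidates; the function name `sar_water_mask` is hypothetical.

```python
import numpy as np

def sar_water_mask(vv_db, k=2.0):
    """Flag pixels whose VV backscatter (dB) is below mean - k * std."""
    valid = np.isfinite(vv_db)
    values = vv_db[valid]
    threshold = float(values.mean() - k * values.std())
    water = valid & (vv_db < threshold)  # smooth open water returns little signal
    return water, threshold

# Toy scene: land at about -8 dB with two dark pixels at -22 dB.
vv = np.full((5, 5), -8.0)
vv[0, 0] = vv[0, 1] = -22.0
water, threshold = sar_water_mask(vv)
print(int(water.sum()))  # 2
```

Because the rule only finds dark pixels, rivers and reservoirs are flagged too, which is the limitation listed above: separating them from flood water needs a pre-flood reference image.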

Random Forest — model performance

Real Gini importances exported from the trained .pkl files in the FloodCat repo

Algorithm

RandomForestRegressor

scikit-learn 1.5

Trees

100

ensemble size

Features

20

HZR + EXP per event

Training rows

64

Catalonia events

Feature importance — Economic Loss model
From Catalonia_EconomicLoss.pkl
Legend: Rainfall (HZR_mm) · Soil moisture · Exposure (FloodPotential)
In-browser fit (R²)
Ridge regression on the same 20 features

R² = 20.5% (houses) · 28.8% (people)
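For readers unfamiliar with the baseline, here is a sketch of a ridge fit and its R². It runs on synthetic data with the same shape as the training table (64 rows, 20 features); the penalty `alpha=1.0` is an assumption, and the score is computed in-sample, which may not match how the figures above were obtained.

```python
import numpy as np

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge on centered data; the intercept is not penalized."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    gram = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    coef = np.linalg.solve(gram, Xc.T @ yc)
    intercept = y_mean - x_mean @ coef
    return coef, intercept

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = float(np.sum((y_true - y_pred) ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 20))  # synthetic stand-in, not the Catalonia data
y = 3.0 * X[:, 0] + rng.normal(scale=4.0, size=64)  # weak signal plus noise
coef, intercept = ridge_fit(X, y, alpha=1.0)
r2 = r2_score(y, X @ coef + intercept)
print(f"R² = {100 * r2:.1f}%")
```

With only 64 rows and 20 features, an in-sample R² tends to be optimistic, so a held-out or cross-validated score is the safer number to compare across models.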

Detection benchmark — IoU / F1 / Precision / Recall
Per-event mean over 8 validation samples (82 AGORA episodes available)
Confusion matrix (Prithvi-EO-2.0)
Pixel-level on 500 sampled tiles
| | Pred Flood | Pred Dry |
|---|---|---|
| Actual Flood | 142 | 28 |
| Actual Dry | 19 | 311 |

Accuracy

90.6%

Precision

88.2%

Recall

83.5%

F1

85.8%
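The four scores follow directly from the confusion-matrix counts (142 true positives, 28 false negatives, 19 false positives, 311 true negatives). A short check, with `detection_metrics` as a hypothetical helper name:

```python
def detection_metrics(tp, fn, fp, tn):
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }

metrics = detection_metrics(tp=142, fn=28, fp=19, tn=311)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
# accuracy: 90.6%
# precision: 88.2%
# recall: 83.5%
# f1: 85.8%
```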

Training learning curve
Train vs out-of-bag fit during retraining
Top-15 features — AffectedHouses model
| # | Feature | Group | Importance |
|---|---|---|---|
| 1 | HZR_mm-0-to-7-days | rain | 0.2922 |
| 2 | HZR_mm-7-to-14-days | rain | 0.2499 |
| 3 | HZR_mm-14-to-21-days | rain | 0.1729 |
| 4 | EXP_FPVeryLowFloodPotential_average-pop-dens | exposure | 0.1107 |
| 5 | EXP_FPVeryLowFloodPotential_UrbanHa | exposure | 0.0609 |
| 6 | HZR_mm-21-to-28-days | rain | 0.0598 |
| 7 | EXP_FPVeryLowFloodPotential_sq2Buildings | exposure | 0.0536 |
| 8 | HZR_SM-0-to-7-days | soil | 0.0000 |
| 9 | HZR_SM-7-to-14-days | soil | 0.0000 |
| 10 | HZR_SM-14-to-21-days | soil | 0.0000 |
| 11 | HZR_SM-21-to-28-days | soil | 0.0000 |
| 12 | EXP_FPHighFloodPotential_average-pop-dens | exposure | 0.0000 |
| 13 | EXP_FPMediumFloodPotential_average-pop-dens | exposure | 0.0000 |
| 14 | EXP_FPLowFloodPotential_average-pop-dens | exposure | 0.0000 |
| 15 | EXP_FPHighFloodPotential_sq2Buildings | exposure | 0.0000 |

Why are several rows 0.0000? In the 64 training events, the soil-moisture (HZR_SM) columns and three of the four FloodPotential bands (High / Medium / Low) are constant across every row: only the "VeryLow" band carries non-zero values. A Random Forest cannot split on a zero-variance feature, so those features correctly receive an importance of 0. Adding areas of interest (AOIs) in the other potential bands and ingesting the ERA5 soil-moisture series should give those columns the variance the model needs to use them.
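A quick way to spot such columns before training is to look for features that take a single value in every row. The sketch below uses made-up rows and a hypothetical helper `constant_columns`; only the feature names come from the table above.

```python
def constant_columns(rows, feature_names):
    """Return the names of features that take a single value in every row."""
    constant = []
    for j, name in enumerate(feature_names):
        if len({row[j] for row in rows}) <= 1:
            constant.append(name)
    return constant

feature_names = [
    "HZR_mm-0-to-7-days",
    "HZR_SM-0-to-7-days",
    "EXP_FPHighFloodPotential_average-pop-dens",
]
# Made-up rows for illustration only; not values from the training set.
rows = [
    (42.0, 0.0, 0.0),
    (110.5, 0.0, 0.0),
    (7.3, 0.0, 0.0),
]
flat = constant_columns(rows, feature_names)
print(flat)
# ['HZR_SM-0-to-7-days', 'EXP_FPHighFloodPotential_average-pop-dens']
```

Running a check like this after each data refresh would show whether the newly ingested soil-moisture and exposure columns actually vary across events.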