From Frontiers in Marine Science | New and Recent Articles

Binary reformulation for marine debris detection in Sentinel-2 imagery: an empirical study on extreme class imbalance using the first benchmarks on combined MARIDA and MADOS datasets

Introduction

Marine debris detection from satellite imagery is challenged by two major factors: extreme class imbalance, with debris pixels accounting for less than 0.01% of image content, and the need for robust generalization across diverse geographic and temporal domains for operational deployment. Although existing methods often report strong within-dataset performance, cross-dataset generalization, where models trained on one dataset are applied to entirely different geographic regions, remains insufficiently investigated.

Methods

To address this limitation, we conducted rigorous bidirectional cross-dataset validation experiments using the MARIDA and MADOS datasets. The problem was reformulated as a binary segmentation task and addressed using a standard U-Net architecture combined with a composite imbalance-aware loss and a rarity-aware sampling strategy. Two experimental settings were considered: training on MARIDA and testing on MADOS, and training on MADOS and testing on MARIDA.

Results

The experiments revealed asymmetric cross-dataset generalization. Models trained on the geographically diverse MADOS dataset achieved an F1-score of 0.890 when tested on MARIDA, corresponding to only a 1.25% decrease from the within-dataset baseline of 0.901. In contrast, models trained on MARIDA achieved an F1-score of 0.833 on MADOS, representing a 7.55% decrease. The average cross-dataset degradation was 4.38%, which is substantially lower than the typical 10% to 25% performance drops reported in remote sensing domain-shift scenarios. Despite comparable patch counts (2,529 for MADOS versus 2,173 for MARIDA), the superior transferability of MADOS-trained models indicates that geographic diversity across globally distributed tiles is more beneficial than exhaustive annotation within concentrated regions.
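The degradation percentages quoted in the Results can be recomputed from the rounded F1-scores. The sketch below assumes that each percentage is a relative decrease against the single within-dataset baseline of 0.901; the variable and function names are illustrative and do not come from the article.

```python
# F1-scores as reported in the abstract (rounded to three decimals).
BASELINE_F1 = 0.901  # within-dataset baseline
CROSS_F1 = {
    "MADOS->MARIDA": 0.890,  # trained on MADOS, tested on MARIDA
    "MARIDA->MADOS": 0.833,  # trained on MARIDA, tested on MADOS
}


def relative_drop_percent(baseline: float, cross: float) -> float:
    """Relative decrease of `cross` with respect to `baseline`, in percent."""
    return (baseline - cross) / baseline * 100.0


# Assumption: both directions are compared against the same baseline.
drops = {
    name: relative_drop_percent(BASELINE_F1, f1)
    for name, f1 in CROSS_F1.items()
}
mean_drop = sum(drops.values()) / len(drops)

for name, drop in drops.items():
    print(f"{name}: {drop:.2f}% below baseline")
print(f"mean: {mean_drop:.2f}%")
```

Under this assumption the MARIDA-to-MADOS drop and the mean come out at about 7.55% and 4.38%, matching the reported values. The MADOS-to-MARIDA drop comes out at about 1.22%, slightly below the reported 1.25%, which may reflect rounding of the underlying scores.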
Moreover, the MADOS-to-MARIDA cross-dataset F1-score of 0.890 exceeded MAP-Mapper's within-dataset F1-score of 0.880 and closely approached MariNeXt's reported performance of 0.891.

Discussion

These findings show that careful data formulation and training design can enable standard architectures to achieve strong cross-domain performance under extreme class imbalance, approaching or even surpassing more specialized models in realistic deployment conditions. The results provide practical guidance for operational marine debris monitoring systems: spatially stratified sampling across diverse marine environments should be prioritized, F1-scores in the range of 0.86 to 0.89 can be expected when deploying on previously unseen regions without fine-tuning, and a two-stage strategy should be considered in which models are first trained on geographically diverse data and then optionally adapted for region-specific applications. To the best of our knowledge, this is the first systematic cross-dataset validation study involving both MARIDA and MADOS, demonstrating that binary reformulation supports generalization-preserving marine debris detection across geographic and temporal domain shifts.
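The abstract names a composite imbalance-aware loss and a rarity-aware sampling strategy but does not specify either. The following is a minimal sketch of what such components might look like, assuming a soft Dice term combined with a focal term and a sampler that oversamples patches containing any debris pixels. Every function name, weight, and hyperparameter here (`gamma`, `alpha`, `boost`, the 0.5/0.5 mix) is hypothetical and is not taken from the article.

```python
import math
import random


def focal_loss(probs, targets, gamma=2.0, alpha=0.75, eps=1e-7):
    """Mean binary focal loss over flat lists of probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)         # clamp to avoid log(0)
        p_t = p if y == 1 else 1.0 - p          # probability given to the true class
        a_t = alpha if y == 1 else 1.0 - alpha  # class weight (debris favored)
        total += -a_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 - (2 * intersection + s) / (sum_p + sum_y + s)."""
    intersection = sum(p * y for p, y in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def composite_loss(probs, targets, w_dice=0.5, w_focal=0.5):
    """Weighted sum of the Dice and focal terms (assumed composition)."""
    return w_dice * dice_loss(probs, targets) + w_focal * focal_loss(probs, targets)


def rarity_weights(debris_fractions, boost=5.0):
    """Normalized sampling weights; patches with any debris get `boost` times more."""
    raw = [boost if fraction > 0.0 else 1.0 for fraction in debris_fractions]
    total = sum(raw)
    return [r / total for r in raw]


# Toy usage: four pixels, one of which is debris.
targets = [1, 0, 0, 0]
good_loss = composite_loss([0.9, 0.1, 0.1, 0.05], targets)
bad_loss = composite_loss([0.1, 0.9, 0.8, 0.7], targets)

# Toy usage: five patches, two of which contain a trace of debris.
patch_debris_fraction = [0.0, 0.0, 0.002, 0.0, 0.0001]
weights = rarity_weights(patch_debris_fraction)
rng = random.Random(0)  # seeded for repeatability
batch_indices = rng.choices(range(len(weights)), weights=weights, k=8)
```

The Dice term is insensitive to the large number of true-negative water pixels, while the focal term down-weights easy pixels, which is why such pairings are common under heavy imbalance. In a real training loop these would typically be tensor operations inside a deep learning framework rather than Python lists.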