Ensemble-Based Progressive YOLOv11 Framework for Quantitative Analysis of Riverine Environments

Saomyaraj Jha; Anubhav Jain; Vivek Kumar; Dhruv Singhal; Partha Pratim Roy; Alireza Alaei

doi:10.1007/978-981-95-4395-3_27

Conference proceeding

Pattern Recognition and Computer Vision, Vol.16174(Part 1), pp.387-401

Lecture Notes in Computer Science

8th Asian Conference on Pattern Recognition, ACPR 2025 (Gold Coast, Australia, 10/11/2025–13/11/2025)

11/11/2025

DOI: https://doi.org/10.1007/978-981-95-4395-3_27

Keywords: River Litter Detection; YOLOv11; Progressive Training; Ensemble Learning; Test-Time Augmentation; StrongSORT

Abstract

Millions of tonnes of plastic waste from rivers and coastal communities enter the seas and oceans every year, so monitoring and reducing riverine pollution is vital for environmental sustainability and public health. Because manual surveys and monitoring strategies are costly and ineffective, automated computer vision solutions have emerged in recent years. However, detecting and counting floating litter in real time is challenging due to water surface reflections, occlusions, and lighting variations. This paper introduces a three-phase progressive training strategy using YOLOv11-based object detection algorithms. Phase 1, Progressive Model Training, trains YOLOv11 variants sequentially in order of increasing capacity, which allows the simpler architectures to converge early before the models scale up. Phase 2, Ensembling, captures the features of multiple variants and combines them. Phase 3, Test-Time Augmentation, evaluates each model at four resolutions and selects the best scale for each image to maximize inference accuracy. StrongSORT, with its refined appearance embeddings and motion-consistency associations, is integrated into the pipeline on top of the YOLO detections to enhance per-object tracking and counting in live video, maintaining consistent object IDs even under partial occlusion and rapid water flow. Experimental results on our manually collected and annotated dataset show that the best model selected from our ensemble achieves a mAP@0.5 of 0.939, outperforming the individual baseline variants.
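The per-image scale selection and the track-based counting described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state how the "best" scale is chosen or which four resolutions are used, so the mean-confidence criterion, the `SCALES` values, and the names `Detection`, `best_scale`, and `count_unique_tracks` are all assumptions made for illustration. The `detect` callable stands in for any detector, and the `(track_id, label)` pairs stand in for the output of a tracker such as StrongSORT.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, Iterable, List, Sequence, Tuple


@dataclass(frozen=True)
class Detection:
    """One detected object (hypothetical container, not from the paper)."""
    label: str
    confidence: float
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2)


# Placeholder resolutions: the abstract says four scales but not which ones.
SCALES: Tuple[int, ...] = (480, 640, 800, 960)


def score(detections: Sequence[Detection]) -> float:
    """Assumed selection criterion: mean confidence, 0.0 if nothing is found."""
    return mean(d.confidence for d in detections) if detections else 0.0


def best_scale(
    image: object,
    detect: Callable[[object, int], List[Detection]],
    scales: Sequence[int] = SCALES,
) -> Tuple[int, List[Detection]]:
    """Run the detector at every scale and keep the highest-scoring result."""
    results = {s: detect(image, s) for s in scales}
    chosen = max(scales, key=lambda s: score(results[s]))
    return chosen, results[chosen]


def count_unique_tracks(
    frames: Iterable[Sequence[Tuple[int, str]]],
) -> Dict[str, int]:
    """Count objects per label, counting each track ID only once."""
    first_label: Dict[int, str] = {}
    for frame in frames:
        for track_id, label in frame:
            first_label.setdefault(track_id, label)
    counts: Dict[str, int] = {}
    for label in first_label.values():
        counts[label] = counts.get(label, 0) + 1
    return counts
```

Counting by unique track ID rather than by per-frame detections is what prevents an object that stays in view for many frames from being counted repeatedly, which is why consistent IDs under occlusion matter for the counting task.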

Details

Title: Ensemble-Based Progressive YOLOv11 Framework for Quantitative Analysis of Riverine Environments
Creators: Saomyaraj Jha - Pandit Deendayal Energy University (India)
Anubhav Jain - College of Engineering Roorkee (India)
Vivek Kumar - National Institute of Technology Raipur
Dhruv Singhal - S.D. College of Engineering and Technology (India)
Partha Pratim Roy - Indian Institute of Technology Roorkee
Alireza Alaei - Southern Cross University
Contributors: Christian Wallraven (Editor) - Korea University
Ran He (Editor) - Chinese Academy of Sciences
Brian Lovell (Editor) - The University of Queensland
Prithwi Chakraborty (Editor) - Southern Cross University
Publication Details: Pattern Recognition and Computer Vision, Vol.16174(Part 1), pp.387-401
Conference: 8th Asian Conference on Pattern Recognition, ACPR 2025 (Gold Coast, Australia, 10/11/2025–13/11/2025)
Series: Lecture Notes in Computer Science
Publisher: Springer Nature Singapore; Singapore
Grant note: This work was partially supported by the Southern Cross University Deputy Vice Chancellor (Research) Seed Grant, project number GLC: 31930. We gratefully thank Southern Cross University for their support.
Identifiers: 991013328720002368
Academic Unit: Information Technology; Faculty of Science and Engineering
Language: English
Resource Type: Conference proceeding
