DAILY WACV Sunday
Oral Presentation

ORCA: Object Recognition and Comprehension for Archiving Marine Species

Yuk Kwan (David) Wong (left) is a master’s student, and Ziqiang Zheng (center) is a postdoctoral researcher at the Hong Kong University of Science and Technology, working with Kit Yeung (right). Their paper explores how computer vision systems might better support marine science by addressing gaps in existing datasets and task design for ocean research. They speak to us ahead of their oral and poster presentation this afternoon.

Artificial intelligence has achieved strong results across many domains, but its performance is less consistent in specialized scientific settings such as marine research. In response, the team approached the problem from a data perspective rather than focusing primarily on model design. They identified two major limitations in the datasets commonly used for marine applications.

The first issue concerns geographic diversity. Many marine datasets are collected in specific locations and reflect only a narrow slice of ocean environments. “Usually when biologists collect data, they have a dataset of the Red Sea or Hong Kong, for example,” David tells us. “They’re limited in terms of diversity.” As a result, models trained on those datasets often struggle to generalize beyond the environments in which the data were collected.

To address this, the ORCA dataset was designed to cover a much broader range of marine species and environments. The dataset contains more than 14,000 images spanning hundreds of species, with tens of thousands of bounding-box annotations and expert-verified captions describing individual organisms. By expanding both species