Biodiversity Image Quality Metadata Augments Convolutional Neural Network Classification of Fish Species

Anuj Karpatne 

Abstract

Biodiversity image repositories are crucial sources for training machine learning approaches to support biological research. Metadata about object (e.g., image) quality is a putatively important prerequisite to selecting samples for these experiments. This paper reports on a study demonstrating the importance of image quality metadata for a species classification experiment involving a corpus of 1935 fish specimen images that were annotated with 22 metadata quality properties. When used by a convolutional neural network approach to species identification, a small subset of high-quality images produced an F1 score of 0.41, compared to 0.35 for a taxonomically matched subset of low-quality images. Using the full corpus of images revealed that image quality differed between correctly classified and misclassified images. We found anatomical feature visibility was the most important quality feature for classification accuracy. We suggest biodiversity image repositories consider adopting a minimal set of image quality metadata to support machine learning.
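The abstract compares the two image subsets by F1 but does not say how the score was averaged across species. As an illustration only, the sketch below assumes macro-averaged F1 (the unweighted mean of per-class F1 scores). The function name `macro_f1` and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Macro-averaged F1: unweighted mean of per-class F1 scores.

    Assumption: macro averaging. The abstract does not specify
    which F1 variant the authors reported.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels, not data from the paper.
y_true = ["species_a", "species_a", "species_b", "species_b", "species_c", "species_c"]
y_pred = ["species_a", "species_b", "species_b", "species_b", "species_c", "species_a"]
score = macro_f1(y_true, y_pred)
```

Under this assumption, the same function would be applied separately to the predictions for the high-quality and low-quality subsets, and the two scores compared.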

Publication Details

Date of publication: March 17, 2021

Conference: Springer Metadata and Semantic Research

Page number(s): 3-12

Publication Note: Jeremy Leipzig, Yasin Bakis, Xiaojun Wang, Mohannad Elhamod, Kelly Diamond, Wasila M. Dahdul, Anuj Karpatne, A. Murat Maga, Paula M. Mabee, Henry L. Bart, Jane Greenberg: Biodiversity Image Quality Metadata Augments Convolutional Neural Network Classification of Fish Species. MTSR 2020: 3-12