Computational Ecology and Software, 2024, 14(1): 30-47

Article

Performance comparison of object detectors in detecting wildlife animals in camera-trap images vis-a-vis their performance on MS COCO

Frank G. Kilima¹, Shubi Kaijage¹, Edith Luhanga¹, Colin Torney²
¹School of Computational and Communications Sciences and Engineering (CoCSE), Nelson Mandela African Institution of Science and Technology (NM-AIST), Tanzania
²School of Mathematics and Statistics, University of Glasgow, UK

Received 6 June 2023; Accepted 10 July 2023; Published online 31 August 2023; Published 1 March 2024
IAEES

Abstract
Manual analysis of the large amounts of wildlife image data collected through camera trapping is time-consuming, laborious, expensive, and prone to error and bias. It also constrains the geographical area, size and duration of wildlife studies, and delays decision making. Recent advances in deep learning (DL) have provided promising solutions for automating such analyses and have addressed many of these drawbacks. However, the existence of many object detectors with varied performance on different datasets may complicate the choice of an appropriate object detector for a particular task. This study used an experimental approach to achieve two goals: first, to compare the predictive performance and processing speed of Faster R-CNN and the Single Shot MultiBox Detector (SSD) across four feature extractors, namely ResNet50, ResNet101, ResNet152 and Inception ResNet, on MS COCO vis-a-vis their performance on a camera-trap image dataset (CTID) of 11,019 images; second, to assess and compare the performance of the same object detectors when trained on the same CTID. We found that the object detectors show a smaller range in mean average precision (mAP) on our CTID (2.72%) than on the MS COCO dataset (8.4%), and that detectors' performance on MS COCO does not match their performance on CTID. We also found that all detectors attained similar predictive performance (accuracy) for each animal class on CTID, and that they all performed well in detecting hyena, giraffe, warthog, lion and guineafowl, but poorly on baboon, buffalo and wildebeest. Lastly, detectors performed better on some small-bodied species (guineafowl and hartebeest) than on large-bodied species (zebra and elephant, respectively), despite the large-bodied species having more training data than the small-bodied ones. In particular, all object detectors performed worse on zebra, which had the most training data (1351 images), than on hyena (931), giraffe (920), warthog (979), lion (862), guineafowl (931), hartebeest (479) and elephant (744). Similarly, all detectors performed worse on elephant than on hartebeest.

Keywords: camera-trap image dataset; convolutional neural network; deep learning; MS COCO; object detection.

International Academy of Ecology and Environmental Sciences. E-mail: office@iaees.org
Copyright © 2009-2026 International Academy of Ecology and Environmental Sciences. All rights reserved.