An evaluation of the efficiency of passive acoustic monitoring in detecting deer and primates in comparison with camera traps

Enari, Hiroto, Enari, Haruka S., Okuda, Kei, Maruyama, Tetsuya, Okuda, Kana N.
Ecological indicators 2019 v.98 pp. 753-762
Cervus nippon, Macaca, acoustics, automation, cameras, deer, environmental indicators, monitoring, plasticity, population density, population distribution, population dynamics, screening, uncertainty, vocalization, Japan
In recent years, camera traps have rapidly become popular for the large-scale monitoring of wildlife distribution and population; however, we should not ignore the uncertainty regarding the reliability of camera-based monitoring by inexperienced data gatherers. This study introduces passive acoustic monitoring (PAM) as an easier technique for monitoring terrestrial mammals that uses the sound cues that they produce. To validate the efficacy of PAM, we quantitatively compared the detection areas and rates between sound cues (from PAM) and visual cues (from camera traps) of two mammals—the sika deer Cervus nippon and the Japanese macaque Macaca fuscata—across seven study sites in eastern Japan with different population densities. To collect sound cues, we set up multiple autonomous recording units at the sites and continuously recorded ambient sounds, following a pre-determined schedule. The total recording time reached 9081 h for deer and 8235 h for macaques. We then built sound recognizers to automatically detect eight target call types from the recorded data. To collect visual cues, we also set multiple camera traps at the same sites and for the same observation periods. 
The key findings were as follows: (1) the fully automated procedures, which used only the recognizers to detect sound cues, produced numerous false positive detections when the call type exhibited vocal plasticity and variation; (2) the semi-automated procedures, which added a step to validate the automated detections by manual screening, greatly improved the detectability and recall rates for half of the target calls, reaching >0.70; (3) when using the semi-automated procedures, the frequencies of deer and macaque detections per trap-day derived from the sound cues were in most cases approximately dozens of times and several times higher, respectively, than those derived from the visual cues; (4) the main advantage of PAM may be its superior detection areas, which were 100–7000 times wider than those of camera traps; and (5) the successful recognition of different call types for each species could broaden the use of PAM in ways that are not possible with camera traps. In addition to information on the presence or absence of species, PAM could provide socio-behavioral data (i.e., the frequencies and types of inter-individual vocal communications) that could help clarify population dynamics and group composition.
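The metrics named in the findings above (recall after manual screening, and detections per trap-day) can be illustrated with a minimal Python sketch. This is not the authors' code: the class, function names, and numbers are hypothetical, and it assumes that recall is measured against a manually annotated reference sample covering the same recordings as the recognizer output.

```python
from dataclasses import dataclass


@dataclass
class ScreeningResult:
    """Counts for one call type; all field names are hypothetical."""
    automated_detections: int  # events flagged by the sound recognizer
    confirmed_detections: int  # flagged events confirmed by manual screening
    reference_calls: int       # target calls present in the annotated sample


def precision(r: ScreeningResult) -> float:
    """Share of automated detections that manual screening confirmed."""
    if r.automated_detections == 0:
        return 0.0
    return r.confirmed_detections / r.automated_detections


def recall(r: ScreeningResult) -> float:
    """Share of reference calls recovered after manual screening."""
    if r.reference_calls == 0:
        return 0.0
    return r.confirmed_detections / r.reference_calls


def detections_per_trap_day(detections: int, trap_days: float) -> float:
    """Detection frequency normalized by sampling effort."""
    if trap_days <= 0:
        raise ValueError("trap_days must be positive")
    return detections / trap_days


# Illustrative numbers only; they are not taken from the study.
example = ScreeningResult(
    automated_detections=200,
    confirmed_detections=150,
    reference_calls=180,
)

if __name__ == "__main__":
    print(f"precision: {precision(example):.2f}")
    print(f"recall: {recall(example):.2f}")
    print(f"detections per trap-day: {detections_per_trap_day(150, 30):.2f}")
```

Normalizing by trap-days puts acoustic and camera detections on the same effort scale, which is what allows the two cue types to be compared as a ratio.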