Introduction

Among the many definitions of recursion (Martins, 2012), the view that it represents the capacity to iterate a signal within a self-similar signal has crossed centuries and disciplines, from von Humboldt (1836) and Hockett (1960), to Mandelbrot (1980) and Chomsky (2010); from fractals in mathematics (Mandelbrot, 1980) to generative grammars in linguistics (Chomsky, 2010). Across varying terminologies, the common denominator is that to re-curse (that is, to ‘re-invoke’) or re-iterate is an algorithmic “hack” to produce infinite signal states from a finite signal set. Although classically associated with syntax (Chomsky, 2010; Idsardi et al., 2018), recursive signal production and the resulting self-embedded structures have also been recognised in phonology (Bennett, 2018; Elfner, 2015; Kabak and Revithiadou, 2009; Nasukawa, 2015, 2020; Vogel, 2012) and (verbal and non-verbal) music (Jackendoff, 2009; Koelsch et al., 2013; Martins et al., 2017; Sharma and Chimalakonda, 2018). Given that language and music are uniquely human, this hints at a neuro-cognitive or neuro-computational transformation in the human brain that enabled the emergence of multiple open-ended communication systems in the human lineage, but seemingly, none other (Hauser et al., 2002). The absence of vocal structures in (nonhuman) primates that could help inform incipient or transitional recursive states, namely within the hominid family, has led some scholars to question altogether whether recursion was the result of natural selection (Berwick and Chomsky, 2019; Bolhuis and Wynne, 2009) and to rebut the role of shared ancestry and evolution as an incremental path-dependent process (cf. de Boer et al., 2020; Jackendoff and Pinker, 2005; Kershenbaum et al., 2014; Martins and Boeckx, 2019), favouring instead sudden “hopeful monster” mutant scenarios.

Decades-long debates on the evolution of recursion have ensued, carved around the successes and limitations of empirical comparative animal research. For example, perception and processing of syntax-like vocal combinatorics have been identified in some bird (Engesser et al., 2019, 2016; Gentner et al., 2006; Liao et al., 2022; Suzuki et al., 2016, 2017) and primate species (Jiang et al., 2018; Wang et al., 2015; Watson et al., 2020), but the interpretation of these results has received various criticisms (Bolhuis et al., 2018; Bowling and Fitch, 2015; Corballis and Corballis, 2014; Rawski et al., 2021). For instance, these animal studies have, thus far, almost exclusively focused on perception (cf. Ferrigno et al., 2020; Liao et al., 2022), that is, on animals recognizing or responding to recursive signal structures; however, the operations that process recursive signals are not necessarily recursive themselves (Corballis and Corballis, 2014; Miyagawa, 2021). More critically, most research has focused on laboratory animals in artificial test settings and/or on animals presented with synthetic stimuli after dedicated human training (Gentner et al., 2006; Jiang et al., 2018; Liao et al., 2022; Watson et al., 2020), including the few exceptions based on production instead of perception (Ferrigno et al., 2020; Liao et al., 2022). This opens comparative research to two vital drawbacks. First, results are silent about evolutionary precursors and processes. Settings and stimuli have not been those to which species adapted over evolutionary time, and most species tested thus far are too distantly related to humans to allow inferences about the emergence of recursion among hominids in the first place. That animals may be able to recognise recursive structures or engage in non-vocal recursive action after undergoing fixed training protocols with human-designed stimuli will never per se indicate whether recursion has been selected in the wild, including in human ancestors.

Second, results have been slanted by the dominant classic theory that, counter-intuitively, disfavours a gradual evolutionary scenario for recursion and instead conceives it as a punctuated event (Berwick and Chomsky, 2019; Bolhuis et al., 2015; Bolhuis and Wynne, 2009). Namely, experimental stimuli have consisted of artificial recursive signal sequences organized along a single temporal scale (though not structurally linear), similar to how Merge and syntax operate. However, recursive signal structures can also unfold in other manners, such as across nested temporal scales and in the absence of semantics (Fitch, 2017a), as in music. These alternatives remain unprobed and untested.

An approach to the production of recursive signals, complementary to perceptual studies, is thus desirable. Data from wild primates in particular may help infer signal patterns that were recursive in some degree or kind in an extinct past and subsequently moulded into the recursive structures observed today in modern humans. By virtue of their own primitive nature, proto-recursive structures did not likely fall within modern-day classifications, and thus, will often fail to be predicted based on assumptions guided by full-fledged language (Kershenbaum et al., 2014; Miyagawa, 2021). To this end, a structural approach is particularly advantageous. First, no prior assumptions are required about species’ cognitive capacities. These are directly inferred from how signal sequences are organised. For example, Chomsky’s definition of recursion (Chomsky, 2010) can generate non-self-embedded signal structures, but these would be operationally undetectable amongst other signal combinations. Second, no prior assumptions are required about signal meaning. There are no certain parallels with semantic content and word meaning in animals, but analyses of signal patterning allow similarities to be identified between non-semantic (nonhuman) and semantic (human) combinatoric systems (Lipkind et al., 2013; Sainburg et al., 2019). Therefore, the search for recursion can be made in the absence of meaning-based operations, such as Merge, and more generally, semantics and syntax. Third, no prior assumptions are required about signal function. Under punctuated or gradual evolutionary hypotheses, ancestral signal function (whether cooperative, competitive or otherwise) is expected to have derived from, or been leveraged by, its proto-recursive structure. Otherwise, once present and regardless of its origin process, recursion would not have become fixed among human ancestral populations. A structural approach therefore opens the field to untapped signal diversity in nature and to yet unrecognised bona fide combinatoric states within the human clade.

Here, undertaking an explorative structural approach to recursion, we provide evidence for recursive self-embedded vocal patterning in a (nonhuman) great ape, namely, in the long calls of flanged orangutan males in the wild. We conducted precise rhythm analyses (De Gregorio et al., 2021; Roeske et al., 2020) of 66 long call audio recordings produced by 10 orangutans (Pongo pygmaeus wurmbii) across approximately 2510 observation hours at Tuanan, Central Kalimantan, Indonesian Borneo. We identified 5 different element types that comprise the structural building blocks of long calls in the wild (Hardus et al., 2009; Lameira and Wich, 2008), of which the primary type is the full pulse (Fig. 1A). Full pulses do not, however, always exhibit uninterrupted vocal production throughout a long call [as during a long call’s climax (Spillmann et al., 2010)] but can break up into 4 different “sub-pulse” element types: (i) grumble sub-pulses [quick succession of staccato calls that typically constitute the first build-up pulses of long calls (Hardus et al., 2009)], (ii) sub-pulse transitory elements, (iii) pulse bodies (typically constituting pulses before and/or after climax pulses) and (iv) bubble sub-pulses (quick succession of staccato calls that typically constitute the last tail-off pulses of long calls) (Fig. 1A). We characterised the rhythmicity of long calls’ full pulses and sub-pulses to determine whether orangutan long calls present a re-iterated structure across different hierarchical strata. We extracted inter-onset-intervals (IOIs; i.e., the time difference between the onset of a vocal element and that of the preceding one, tk) from 8993 vocal long call elements (Fig. 1A): 1930 full pulses (1916 after filtering for 0.025<tk<5s), 757 grumble sub-pulses (731), 1068 sub-pulse transitory elements (374), 816 pulse bodies (11) and 4422 bubble sub-pulses (4193). From the extracted IOIs, we calculated rhythmic ratios by dividing each IOI by the sum of its own duration and the duration of the following interval. We then computed the distribution of these ratios to ascertain whether the rhythm of long call full pulses and sub-pulses presented natural categories, following published protocols (De Gregorio et al., 2021; Roeske et al., 2020) (Fig. 1B, C, D).
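The interval and ratio computation described above can be sketched in a few lines. The following Python is an illustrative sketch only and not the pipeline used for the reported analyses (which were run in R); the function names and the example onset times are hypothetical. It assumes that onsets are given in seconds, are sorted in time, and belong to elements of the same type.

```python
from typing import List


def inter_onset_intervals(onsets: List[float]) -> List[float]:
    """Return t_k: the time between each element onset and the preceding one (s)."""
    return [later - earlier for earlier, later in zip(onsets, onsets[1:])]


def rhythm_ratios(intervals: List[float]) -> List[float]:
    """Return r_k = t_k / (t_k + t_{k+1}) for every pair of adjacent intervals."""
    return [a / (a + b) for a, b in zip(intervals, intervals[1:])]


# Hypothetical onset times (s) for four consecutive elements of one type.
example_onsets = [0.0, 1.0, 2.0, 3.5]
example_intervals = inter_onset_intervals(example_onsets)
example_ratios = rhythm_ratios(example_intervals)
print(example_intervals)
print(example_ratios)
```

Two equal adjacent intervals give a ratio of 0.5, the value taken here to mark isochrony; a shorter interval followed by a longer one gives a ratio below 0.5.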

Figure 1. Organization and rhythmic features of orangutans’ long calls.

(A) On top, the spectrogram of a Full Pulse and its organization in Sub-Pulses (e.g., Grumble sub-pulses). Below are the spectrograms of the three other sub-element types: Sub-pulse transitory elements, Pulse bodies and Bubble sub-pulses. Bars on the top of each spectrogram schematically quantify durations of inter-onset intervals (tk): dark green denotes the higher-level of organization (Full pulse). Orange (in the inset) and light green (bottom right) denote the lower-level organization (sub-pulse element types).

(B) Probability density function showing the distributions of the inter-onset-intervals (tk) for each of the long call element types.

(C) The distributions on the left show rhythm ratios (rk) per element type as calculated on 12 flanged males for a total of 1915 Full-pulses and 5309 sub-pulses. Solid sections of the curves indicate on-isochrony rk values; striped sections indicate off-isochrony rk values. A solid white line indicates the 0.5 rk value corresponding to isochrony. White dotted lines denote the on-isochrony peak value extracted from the probability density function. On the right, a bar plot for each element type shows the percentage of observations (rk) falling within the on-isochrony boundaries (solid bars) or the off-isochrony boundaries (striped bars). The number of on-isochrony rk is significantly larger (GLMM, Full vs Null: Chisq=2717.543, p<0.001) than the number of off-isochrony rk for all long call element types (Full pulse: t-ratio=-25.164, p<0.001; Bubble sub-pulse: t-ratio=-30.694, p<0.001; Grumble sub-pulse: t-ratio=-14.526, p<0.001; Sub-pulse transitory element: t-ratio=-3.148, p<0.001). Pulse body showed no rk values falling within either the on- or the off-isochrony boundaries.

(D) Distribution of a variable calculated as the ratio between the tk of a sub-pulse and the tk of the corresponding higher level of organization, the Full Pulse. We report the peak value of the curve (0.046) and tested the significance of the extent of the central quartiles, which was significantly smaller than peripheral quartiles (Wilcoxon signed-rank test: W=2272, p<0.001).

Results

The probability density function of orangutan full pulses showed one peak (rk=0.493) in close vicinity to a theoretically pure isochronic rhythm, that is, full pulses were regularly paced at a 1:1 ratio, following a constant tempo along the long call (Fig. 1C). Our model (GLMM, full model vs null model: Chisq=298.2876, df=7, p<0.001; see Supplementary Materials) showed that pulse type, range of the curve (on- or off-isochrony), and their interaction had a significant effect on the count of rk values. In particular, the isochronous peak of full pulses was significant (t.ratio=-15.957, p<0.0001), that is, the number of rk values falling inside the on-isochrony range was significantly higher than the number falling inside the off-isochrony range (Fig. 1C). Critically, three (of the four) orangutan sub-pulse element types, namely grumble sub-pulses, sub-pulse transitory elements and bubble sub-pulses, also showed significant peaks (grumble sub-pulses: t.ratio=-5.940, p<0.0001; sub-pulse transitory elements: t.ratio=-4.048, p=0.0001; bubble sub-pulses: t.ratio=-10.640, p<0.0001) around pure isochrony (peak rk: grumble sub-pulses=0.501; sub-pulse transitory elements=0.495; bubble sub-pulses=0.502; Fig. 1C). That is, sub-pulses were regularly paced within regularly paced full pulses, denoting isochrony within isochrony (Fig. 1C) at different average tempi (mean tk (sd): full pulses=1.696 (0.508); grumble sub-pulses=0.118 (0.111); sub-pulse transitory elements=0.239 (0.468); bubble sub-pulses=0.186 (0.292); Fig. 1B). Overall, sub-pulses’ tk was equivalent to 0.046 of that of the full pulses comprising them (Fig. 1D), which put sub-pulses at an approximate ratio of 1:20 relative to full pulses, the smallest categorical temporal rhythmic interval registered thus far in a vertebrate (De Gregorio et al., 2021; Roeske et al., 2020).
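The relation between a sub-pulse interval and the interval of its enclosing full pulse can be illustrated with a short sketch. This Python is a hedged illustration, not the analysis code behind Fig. 1D; the function names and the example intervals are hypothetical, and only the peak value of 0.046 is taken from the text.

```python
from typing import List


def nested_tempo_ratios(sub_pulse_tks: List[float], full_pulse_tk: float) -> List[float]:
    """Ratio of each sub-pulse t_k to the t_k of the full pulse that contains it."""
    if full_pulse_tk <= 0:
        raise ValueError("full_pulse_tk must be positive")
    return [tk / full_pulse_tk for tk in sub_pulse_tks]


def as_one_to_n(ratio: float) -> float:
    """Express a ratio r as the n in '1:n', i.e. n = 1 / r."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return 1.0 / ratio


# Hypothetical sub-pulse intervals (s) nested in a hypothetical 2 s full-pulse interval.
print(nested_tempo_ratios([0.08, 0.10], 2.0))

# Reported peak of the distribution in Fig. 1D, expressed as 1:n.
print(round(as_one_to_n(0.046), 1))
```

Under this reading, a peak of 0.046 corresponds to roughly one sub-pulse interval per 21.7 units of full-pulse interval, which is in the neighbourhood of the approximate 1:20 figure quoted above.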
Permuted discriminant function analyses (Mundry and Sommer, 2007) (crossed, in order to control for individual variation) in R (R Core Team, 2013) based on seven acoustic measures extracted from grumble, transitory and bubble sub-pulses confirmed that these indeed represented distinct sub-pulse categories: the percentage of correctly classified selected cases (62.7%) was significantly higher (p=0.001) than expected (37%).

Discussion

Orangutan long call nested rhythmic patterns reveal self-similar embedded isochrony in the vocal production of a wild great ape, notably, with two discernible structural strata – the full- and sub-pulse level – and three non-exclusive rhythmic arrangements in the form of [isochronyA [isochronya,b,c]].

Human and nonhuman great apes have similar auditory capacities (Quam et al., 2015). There are no identified skeletal differences in inner ear anatomy that could suggest significantly distinct sound sensitivity, resolution or activation thresholds in the time domain (Quam et al., 2015). Humans perceive an acoustic pulse as a continuous pitch, instead of a rhythm, at rates higher than 30 Hz (i.e., 30 beats per second). Long call sub-pulses exhibited average rhythms at ∼9.263 (3.994) Hz [i.e., tk=0.184 (0.303)s]. Therefore, hominid ear anatomy offers strong confidence that orangutans, like humans (and other great apes), perceive sub-pulse rhythmic motifs as such, i.e., as a train of signals, instead of one uninterrupted signal. Assuming otherwise would imply that auditory time-resolution differs by more than one order of magnitude between humans and other great apes in the absence of obvious anatomical culprits.
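The comparison between sub-pulse tempo and the 30 Hz threshold can be checked with simple arithmetic. The sketch below is illustrative only: it converts the mean sub-pulse intervals reported in the Results into rates and compares them with the threshold quoted above. The constant and function names are hypothetical.

```python
# Threshold quoted in the text: above roughly 30 Hz a pulse train is heard as pitch.
PITCH_THRESHOLD_HZ = 30.0

# Mean inter-onset intervals (s) per sub-pulse type, as reported in the Results.
MEAN_TK_SECONDS = {
    "grumble sub-pulse": 0.118,
    "sub-pulse transitory element": 0.239,
    "bubble sub-pulse": 0.186,
}


def rate_hz(tk_seconds: float) -> float:
    """Convert an inter-onset interval in seconds to an event rate in Hz."""
    if tk_seconds <= 0:
        raise ValueError("tk_seconds must be positive")
    return 1.0 / tk_seconds


def below_threshold(mean_tk: dict, threshold_hz: float = PITCH_THRESHOLD_HZ) -> dict:
    """Map each element type to True if its implied rate is below the threshold."""
    return {name: rate_hz(tk) < threshold_hz for name, tk in mean_tk.items()}


for name, tk in MEAN_TK_SECONDS.items():
    print(f"{name}: {rate_hz(tk):.2f} Hz")
print(below_threshold(MEAN_TK_SECONDS))
```

The rate implied by a mean interval need not equal the mean of per-interval rates, so these figures are not expected to reproduce the averaged rate quoted above exactly; in this sketch all three implied rates nonetheless fall well below 30 Hz.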

The simultaneous occurrence of non-exclusive recursive patterns excludes the likelihood that orangutans concatenate long calls and their subunits in linear structure without any recursive processes. To generate the observed vocal motifs as linear structure, three independent neuro-computational procedures would need to run in parallel to generate distinct isochronic rhythms at the sub-pulse level, whilst being indistinguishable, transposable and/or interchangeable at the pulse level without interference, which would be unlikely, if theoretically possible at all. Non-exclusive simultaneous recursive patterns also help exclude the probability that recursion was primarily a by-product of anatomical constraints, such as breath length, heartbeat and other physiological rhythmic processes or movements (Pouw et al., 2020). Such processes could generate various simultaneous rhythmic patterns; however, these would be expected to be in the form of harmonics of the same base “carrier” rhythm. Yet, the three observed rhythmic arrangements at the sub-pulse level were not related to the pulse level by any small integer ratios. Together, this strongly suggests that the observed recursive self-embedded motifs are most likely generated by a recursive neurological procedure or recursion algorithm.

Recursive self-embedded vocal motifs in orangutans indicate that vocal recursion among hominids is not exclusive to human vocal and cognitive systems. This is not to suggest that they exhibit all properties that recursion exhibits in modern language-able humans. Such an expectation would be unreasonable, as it would imply that no evolution occurred in >10 million years since the split between the orangutan and human phylogenetic lineages. Thus, any apparent differences with recursion in today’s syntax, phonology, or music do not invalidate the probability that the reported vocal motifs represent an ancient, or potentially ancestral, state for the ensuing evolution of recursion along the human clade.

Recursion and fractal phenomena are prevalent across the universe. From celestial and planetary movement to the splitting of tree branches and river deltas, and the morphology of bacterial colonies, patterns within self-similar patterns are the norm, not the exception. This makes the seeming singularity of human recursion amongst vertebrate signals all the more enigmatic. Our findings indicate that ancient vocal patterns organized across nested structural strata were likely present in ancestral hominids. Recursive vocal production likely predated the evolutionary emergence of (spoken) language within the Hominid family and along the human lineage. This data-driven possibility poses new evolutionary trajectories and timelines distinct from the classic theoretical notions that the advent of recursion was a saltational all-or-nothing event that took place only recently in modern humans (Berwick and Chomsky, 2019). Gradual cumulative evolutionary scenarios for the emergence and evolution of recursion (de Boer et al., 2020; Martins and Boeckx, 2019) can and should be considered and investigated, with the advantage that living primate models in general, and natural hominid behaviour in particular, offer superior empirical and comparative validity beyond purely theoretical considerations (Lameira et al., 2021). A combination of both approaches will likely pay the highest heuristic dividends. For example, future research may implement playback experiments designed to clarify whether or how great apes cognitively represent natural recursive self-embedded motifs as informed by previous designs and theoretical considerations (Corballis and Corballis, 2014; Engesser et al., 2016; Watson et al., 2020).

Implications for the evolution of recursion and cognition

The presence of recursive self-embedded vocal motifs in orangutans carries four major implications for the evolution of recursion and cognition. First, much ink has been spilled on the topic, yet the possibility of self-embedded isochrony, or alternatively, of non-exclusive self-embedded patterns occurring within the same signal sequence, has never been formulated or conjectured as a possible state of recursive signalling, be it in vertebrates, mammals, primates or otherwise, extant or extinct. This suggests that controversy may have thus far been underpinned by data-poor circumstances, namely, a lack of rigorous and detailed knowledge about call combinations in wild primates in general, and great apes in particular. The presence of recursive vocal patterns in a wild great ape in the absence of syntax, phonology or music helps prevent prior assumptions that recursion precursors ought, or should be expected, to operate as modern-day recursion. Orangutan recursive vocal patterns provide the first data point and open a new charter for possible incipient or transitional states within the hominid family that, whilst inherently distinct from modern-day recursion, were nonetheless homologous to it and fully functional in their own right. The open discussion about what kind of properties would or could make a structure proto-recursive or not will be essential to move the state-of-knowledge past antithetical, jointly exhaustive or mutually exclusive options of what recursion is, how it emerged and evolved.

Second, primate loud calls are functionally analogous to, for instance, bird and whale song. Accordingly, sophisticated signalling in distantly related species does not undermine the value of great apes as living models for the study of language and its underpinning neuro-motoric operations, as often claimed (Berwick and Chomsky, 2019; Bolhuis et al., 2018; Bolhuis and Wynne, 2009; Lattenkamp and Vernes, 2018; Suzuki et al., 2018; Vernes et al., 2021) based on the false assumption that great ape vocal research has been at a standstill for the last half-century (cf. Belyk and Brown, 2017; Bianchi et al., 2016; Crockford et al., 2004; Hopkins et al., 2007; Lameira, 2017; Lameira et al., 2016, 2015, 2013; Lameira and Shumaker, 2019; Pereira et al., 2020; Russell et al., 2013; Staes et al., 2017; Taglialatela et al., 2012; Watson et al., 2015; Wich et al., 2009, 2012). Our findings invite results across lineages to be tested with primates in order to explicitly assess their evolutionary relevance within the human clade. This will help avoid prematurely mistaking absence of evidence for evidence of absence in primates.

Third, our findings suggest that, despite criticism, recursive perceptual capacities identified in primates (Watson et al., 2020) may be factual and likely evolved to subserve vocal signals in the species’ natural repertoire (though potentially yet unrecognized). This invites renewed interest in, and re-analysis of, primate signalling behaviour in the wild (Gabrić, 2021). However, findings also show that it may be too hasty to discuss whether perceptual capacities in primates or birds are equivalent to those engaged in syntax (Watson et al., 2020) or phonology (Rawski et al., 2021). Such classifications may be putting the proverbial cart before the horse; they are based on untested assumptions (e.g., that syntax and phonology evolved as separate “modules”, that one attained modern form before the other, et cetera) that may not have applied to proto-recursive ancestors (Kershenbaum et al., 2014; Miyagawa, 2021).

Fourth, given that isochrony universally governs music and that recursion is a feature of music, findings suggest a possible evolutionary link between great ape loud calls and vocal music. Loud calling is an archetypal trait in primates (Wich and Nunn, 2002) but is absent in modern humans. Our findings suggest this may not be coincidental. Great ape loud calling may have preceded and subsequently transmuted into modern recursive vocal structures in humans. Given their conspicuousness, loud calls represent one of the most studied aspects of primate vocal behaviour (Wich and Nunn, 2002), but their rhythmic patterns have seldom been characterized with precision (Clink et al., 2020; De Gregorio et al., 2021; Gamba et al., 2016). Besides our analyses, there are remarkably few confirmed cases of isochrony in great apes (but see Raimondi et al., 2023), yet the behaviours that have been rhythmically measured with accuracy have been implicated in the evolution of percussion (Fuhrmann et al., 2015) and musical expression (Dufour et al., 2015; Hattori and Tomonaga, 2020), such as social entrainment in chimpanzees in connection with the origin of dance (Lameira et al., 2019) [a capacity once also assumed to be neurologically impossible in great apes (Fitch, 2017b; Patel, 2014)]. This opens the possibility that recursive vocal production and supporting neural procedures were first and foremost a feature of proto-musical expression in human ancestors, later recruited and “re-engineered” for the generation of linguistic combinatorics.

Future studies outlining the distribution of isochrony across primate (vocal) behaviour offer promising new paths to empirically assess the evolution of recursive signal structures in music and language and will help move the needle forward on one of the most tantalizing riddles in the evolution of language and cognition. These crucial data and insights will materialise if, as stewards of our planetary co-habitants, humankind secures the survival of these species and the preservation of their natural wild habitat (Estrada et al., 2022, 2017; Laurance, 2013; Laurance et al., 2012).

Methods and Materials

Study site

We conducted our research at the Tuanan Research Station (2°09′S; 114°26′E), Central Kalimantan, Indonesia. Long calls were opportunistically recorded from identified flanged males (Pongo pygmaeus wurmbii) using a Marantz Analogue Recorder PMD222 in combination with a Sennheiser Microphone ME 64 or a Sony Digital Recorder TCD-D100 in combination with a Sony Microphone ECM-M907.

Acoustic data extraction

Audio recordings were transferred to a computer with a sampling rate of 44.1 kHz. Seven acoustic measures were extracted directly from the spectrogram window (window type: Hann; 3 dB filter bandwidth: 124 Hz; grid frequency resolution: 2.69 Hz; grid time resolution: 256 samples) by manually drawing a selection encompassing the complete long call (sub)pulse from onset to offset, using Raven interactive sound analysis software (version 1.5, Cornell Lab of Ornithology). These parameters were duration (s), peak frequency (Hz), peak time, peak frequency contour average slope (Hz), peak frequency contour maximum slope (Hz), average entropy (Hz) and signal-to-noise ratio (NIST quick method). Please see the software’s documentation for a full description of the parameters (https://ravensoundsoftware.com/knowledge-base/pitch-tracking-frequency-contour-measurements/). Acoustic data extraction complemented the classification of long call elements, at both the pulse and sub-pulse levels, which was based on close visual and auditory inspection of spectrograms and on elements’ distinctiveness from each other as well as from the remaining catalogued orangutan call repertoire (Hardus et al., 2009) (see also supplementary audio files). Of these parameters, duration and peak frequency in particular have been shown to be resilient across recording settings (Lameira et al., 2013) and to adequately represent variation in the time and frequency axes (Lameira et al., 2017).

Rhythm data analyses

Inter-onset-intervals (IOIs = tk) were calculated only from the begin time (s) of each full-pulse and sub-pulse long call element, using Raven interactive sound analysis software, as explained above. tk was calculated only from subsequent (full/sub) pulse elements of the same type. Ratio values (rk) were calculated as tk/(tk+tk+1). Following the methodology of Roeske et al. (2020) and De Gregorio et al. (2021), to assess the significance of the peaks around isochrony (corresponding to the 0.5 rk value), we counted the number of rks falling inside on-isochrony ranges (0.440 < rk < 0.555) and off-isochrony ranges (0.400 < rk < 0.440 and 0.555 < rk < 0.600), symmetrically falling at the right and left sides of 1:1 ratios (0.5 rk value). We tested the count of on-isochrony rks versus the count of off-isochrony rks, per pulse type, with a GLMM for negative-binomial family distributions, using the glmmTMB R package. In particular, we built a full model with the count of rk values as the response variable and the pulse type in interaction with the range the observation fell in (on- or off-isochrony) as predictors. We added an offset weighting the rk count based on the width of the bin. The individual contribution was set as a random factor. We built a null model comprising only the offset and the random intercepts. We checked the number of residuals of the full and null models, and compared the two models with a likelihood ratio test (Anova with “Chisq” argument). We calculated p-values for each predictor using the R summary function and performed pairwise comparisons for each level of the explanatory variables with the emmeans R package, adjusting all p-values with Bonferroni correction. We checked normality, homogeneity (via a function provided by R. Mundry), and number of the residuals. We checked for overdispersion with the performance R package (Lüdecke et al., 2020). Graphic visualization was prepared using R (R Core Team, 2013) packages ggplot2 (Wickham, 2009) and ggridges (Wilke, 2022). Data reshaping and organization were managed with the dplyr and tidyr R packages.
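The counting step that feeds the model can be sketched as follows. This Python is an illustrative sketch rather than the R code used for the reported analyses; the function and key names are hypothetical. It uses the on- and off-isochrony ranges stated above with strict inequalities, so a ratio falling exactly on a boundary is counted in neither range. Expressing the bin-width offset as the natural logarithm of the bin width is an assumption made here for illustration (a common choice for count models with a log link) and is not confirmed by the text.

```python
import math
from typing import Dict, Iterable

# Ranges as stated in the text.
ON_RANGE = (0.440, 0.555)
OFF_RANGES = ((0.400, 0.440), (0.555, 0.600))


def count_between(ratios: Iterable[float], low: float, high: float) -> int:
    """Count ratios strictly inside the open interval (low, high)."""
    return sum(1 for r in ratios if low < r < high)


def isochrony_counts(ratios: Iterable[float]) -> Dict[str, float]:
    """Count on- and off-isochrony ratios and report each range's total width."""
    ratios = list(ratios)
    on_count = count_between(ratios, *ON_RANGE)
    off_count = sum(count_between(ratios, low, high) for low, high in OFF_RANGES)
    on_width = ON_RANGE[1] - ON_RANGE[0]
    off_width = sum(high - low for low, high in OFF_RANGES)
    return {
        "on_count": on_count,
        "off_count": off_count,
        "on_width": on_width,
        "off_width": off_width,
        # Assumed form of the offset: log of the bin width.
        "on_log_width": math.log(on_width),
        "off_log_width": math.log(off_width),
    }


# Hypothetical rhythm ratios for one element type.
example = [0.50, 0.49, 0.52, 0.42, 0.58, 0.30, 0.70]
print(isochrony_counts(example))
```

Because the on-isochrony range (width 0.115) is wider than the two off-isochrony ranges combined (width 0.085), raw counts are not directly comparable, which is why a width-based offset is needed when the counts are modelled.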

Acoustic data analyses

Permuted discriminant function analysis with cross classification was performed using R and a function provided by Roger Mundry (Mundry and Sommer, 2007). The script was: pdfa.res=pDFA.crossed(test.fac="Sub-pulse-type", contr.fac="Individual.ID", variables=c("Delta.Time", "Peak.Freq", "Peak.Time", "PFC.Avg.Slope", "PFC.Max.Slope", "Avg.Entropy", "SNR.NIST.Quick"), n.to.sel=NULL, n.sel=100, n.perm=1000, pdfa.data=xdata). These analyses assured that long call elements, at the pulse and sub-pulse level, indeed represented biologically distinct categories.

Acknowledgements

We thank the Indonesian Ministry of Research and Technology, the Indonesian Ministry of Environment and Forestry, the Indonesian Ministry of Home Affairs, the Directorate General of Natural Resources and Ecosystem Conservation and the former Directorate General of Forest Protection and Nature Conservation for authorization to carry out research in Indonesia; the Universitas National for supporting the project and acting as sponsors and counter-partners; the Bornean Orangutan Survival Foundation and the MAWAS Programme in Palangkaraya for their support and permission to stay and work in the MAWAS Reserve. A.R.L. was supported by the UK Research & Innovation, Future Leaders Fellowship grant agreement number MR/T04229X/1.

Author contributions

A.R.L. conceived and designed the study. A.R.L. and M.E.H. collected data. A.R.L., A.R., T.R. and M.G. analysed data. A.R.L., M.E.H., A.R., T.R. and M.G. wrote the paper.

Competing interests

The authors declare no competing interests.

Correspondence and requests for materials should be addressed to Adriano R. Lameira.

Supplementary Materials