Abstract
Recursive procedures that allow placing a vocal signal inside another of similar kind provide a neuro-computational blueprint for syntax and phonology in spoken language and human song. There are, however, no known vocal patterns among nonhuman primates arranged in self-embedded combinations that evince vocal recursion or potential incipient forms and neuro-procedures thereof, suggesting a neuro-cognitive transformation exclusive to humans. Here, we uncover that wild flanged male orangutan long calls show two hierarchical strata, wherein rhythmically isochronous call sequences are nested within self-similar isochronous call sequences. Remarkably, three unrelated recursive motifs occurred simultaneously in long calls, refuting that motifs resulted from three parallel linear procedures or that motifs were simple anatomical artifacts of bodily constraints. Findings represent a case of recursive hominid vocal production in the absence of syntax, semantics, phonology or music. Second-order combinatorics, ‘sequences within sequences’, involving hierarchically organized and cyclically structured vocal sounds in ancient hominids may have preluded the evolution of recursion in modern language-able humans.
Introduction
Among the many definitions of recursion (Martins, 2012), the view that it represents the capacity to iterate a signal within a self-similar signal has crossed centuries and disciplines, from von Humboldt (1836) and Hockett (1960), to Mandelbrot (1980) and Chomsky (2010); from fractals in mathematics (Mandelbrot, 1980) to generative grammars in linguistics (Chomsky, 2010). Across varying terminologies, the common denominator across fields is that to re-curse (that is, to ‘re-invoke’) or re-iterate is an algorithmic “hack” to produce infinite signal states from a finite signal set. Although classically associated with syntax (Chomsky, 2010; Idsardi et al., 2018), recursive signal production and the resulting self-embedded structures have also been recognised in phonology (Bennett, 2018; Elfner, 2015; Kabak and Revithiadou, 2009; Nasukawa, 2015, 2020; Vogel, 2012) and (verbal and non-verbal) music (Jackendoff, 2009; Koelsch et al., 2013; Martins et al., 2017; Sharma and Chimalakonda, 2018). Given that language and music are uniquely human, this hints at a neuro-cognitive or neuro-computational transformation in the human brain that enabled the emergence of multiple open-ended communication systems in the human lineage, but seemingly, none other (Hauser et al., 2002). The absence of vocal structures in (nonhuman) primates that could help inform incipient or transitional recursive states, namely within the hominid family, has led some scholars to question altogether whether recursion was the result of natural selection (Berwick and Chomsky, 2019; Bolhuis and Wynne, 2009) and to rebut the role of shared ancestry and evolution as an incremental path-dependent process (cf. de Boer et al., 2020; Jackendoff and Pinker, 2005; Kershenbaum et al., 2014; Martins and Boeckx, 2019), favouring instead sudden “hopeful monster” mutant scenarios.
Decades-long debates on the evolution of recursion have ensued, carved around the successes and limitations of empirical comparative animal research. For example, perception and processing of syntax-like vocal combinatorics have been identified in some bird (Engesser et al., 2019, 2016; Gentner et al., 2006; Liao et al., 2022; Suzuki et al., 2016, 2017) and primate species (Jiang et al., 2018; Wang et al., 2015; Watson et al., 2020), but the interpretation of these results has received various criticisms (Bolhuis et al., 2018; Bowling and Fitch, 2015; Corballis and Corballis, 2014; Rawski et al., 2021). In particular, these animal studies have, thus far, almost exclusively focused on perception (cf. Ferrigno et al., 2020; Liao et al., 2022), that is, on animals recognizing or responding to recursive signal structures; however, the operations that process recursive signals are not necessarily recursive themselves (Corballis and Corballis, 2014; Miyagawa, 2021). Critically, most research has focused on laboratory animals in artificial test settings and/or animals that were presented with synthetic stimuli after dedicated human training (Gentner et al., 2006; Jiang et al., 2018; Liao et al., 2022; Watson et al., 2020), including the few exceptions based on production instead of perception (Ferrigno et al., 2020; Liao et al., 2022). This exposes comparative research to two major drawbacks. First, results are mute about evolutionary precursors and processes. Settings and stimuli have not been those to which species adapted over evolutionary time, with most species tested thus far being too distantly related to humans to allow inferences about the emergence of recursion among hominids in the first place. That animals may be able to recognise recursive structures or engage in non-vocal recursive action after undergoing fixed training protocols with human-designed stimuli will never per se indicate whether recursion has been selected for in the wild, including in human ancestors.
Second, results have been slanted by the dominant classic theory that, counter-intuitively, disfavours a gradual evolutionary scenario for recursion and instead conceives it as a punctuated event (Berwick and Chomsky, 2019; Bolhuis et al., 2015; Bolhuis and Wynne, 2009). Namely, experimental stimuli have consisted of artificial recursive signal sequences organized along a single temporal scale (though not structurally linear), similar to how Merge and syntax operate. However, recursive signal structures can also unfold in other manners, such as across nested temporal scales and in the absence of semantics (Fitch, 2017a), as in music. These alternatives remain untested.
An approach to the production of recursive signals, complementary to perceptual studies, is thus desirable. Data from wild primates in particular may help infer signal patterns that were recursive in some degree or kind in an extinct past and subsequently moulded into the recursive structures observed today in modern humans. By virtue of their own primitive nature, proto-recursive structures did not likely fall within modern-day classifications, and thus will often fail to be predicted based on assumptions guided by full-fledged language (Kershenbaum et al., 2014; Miyagawa, 2021). To this end, a structural approach is particularly advantageous. First, no prior assumptions are required about species’ cognitive capacities. These are directly inferred from how signal sequences are organised. For example, Chomsky’s definition of recursion (Chomsky, 2010) can generate non-self-embedded signal structures, but these would be operationally undetectable amongst other signal combinations. Second, no prior assumptions are required about signal meaning. There are no certain parallels with semantic content and word meaning in animals, but analyses of signal patterning allow similarities to be identified between non-semantic (nonhuman) and semantic (human) combinatoric systems (Lipkind et al., 2013; Sainburg et al., 2019). Therefore, the search for recursion can be made in the absence of meaning-based operations, such as Merge, and more generally, semantics and syntax. Third, no prior assumptions are required about signal function. Under punctuated or gradual evolutionary hypotheses, ancestral signal function (whether cooperative, competitive or otherwise) is expected to have derived from, or been leveraged by, its proto-recursive structure. Otherwise, once present and regardless of its origin, recursion would not have become fixed among human ancestral populations.
A structural approach opens, therefore, the field to untapped signal diversity in nature and yet unrecognised bona fide combinatoric states within the human clade.
Here, undertaking an explorative structural approach to recursion, we provide evidence for recursive self-embedded vocal patterning in a (nonhuman) great ape, namely, in the long calls of flanged orangutan males in the wild. We conducted precise rhythm analyses (De Gregorio et al., 2021; Roeske et al., 2020) of 66 long call audio recordings produced by 10 orangutans (Pongo pygmaeus wurmbii) across approximately 2510 observation hours at Tuanan, Central Kalimantan, Indonesian Borneo. We identified 5 different element types that comprise the structural building blocks of long calls in the wild (Hardus et al., 2009; Lameira and Wich, 2008), of which the primary type is the full pulse (Fig. 1A). Full pulses do not, however, always exhibit uninterrupted vocal production throughout a long call [as during a long call’s climax (Spillmann et al., 2010)] but can break up into 4 different “sub-pulse” element types: (i) grumble sub-pulses [quick succession of staccato calls that typically constitute the first build-up pulses of long calls (Hardus et al., 2009)], (ii) sub-pulse transitory elements, (iii) pulse bodies (typically constituting pulses before and/or after climax pulses) and (iv) bubble sub-pulses (quick succession of staccato calls that typically constitute the last tail-off pulses of long calls) (Fig. 1A). We characterised the rhythmicity of long calls’ full and sub-pulses to determine whether orangutan long calls present a re-iterated structure across different hierarchical strata. We extracted inter-onset intervals (IOIs, denoted tk; i.e. the time difference between the onset of a vocal element and the onset of the preceding one) from 8993 vocal long call elements (Fig. 1A): 1930 full pulses (1916 after filtering for 0.025<tk<5s), 757 grumble sub-pulses (731), 1068 sub-pulse transitory elements (374), 816 pulse bodies (11) and 4422 bubble sub-pulses (4193).
From the extracted IOIs, we calculated their rhythmic ratio by dividing each IOI by its duration plus the duration of the following interval. We then computed the distribution of these ratios to ascertain whether the rhythm of long call full and sub-pulses presented natural categories, following published protocols (De Gregorio et al., 2021; Roeske et al., 2020) (Fig. 1B, C, D).
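The ratio described above can be illustrated with a minimal Python sketch. It is not the authors' analysis code (the study's analyses were run in R); the function names and the onset times are hypothetical and chosen only to show how tk and rk follow from a list of element onsets of the same type.

```python
from typing import List


def inter_onset_intervals(onsets: List[float]) -> List[float]:
    """Return t_k: the time between each element's onset and the preceding onset."""
    return [later - earlier for earlier, later in zip(onsets, onsets[1:])]


def rhythm_ratios(iois: List[float]) -> List[float]:
    """Return r_k = t_k / (t_k + t_{k+1}) for every pair of adjacent intervals."""
    return [t / (t + t_next) for t, t_next in zip(iois, iois[1:])]


# Hypothetical onset times (in seconds) of same-type elements in one long call.
onsets = [0.0, 1.0, 2.0, 3.5]

iois = inter_onset_intervals(onsets)  # [1.0, 1.0, 1.5]
ratios = rhythm_ratios(iois)          # [0.5, 0.4]
```

Two equal adjacent intervals give rk = 0.5 (isochrony); a longer following interval pushes rk below 0.5, and a shorter one pushes it above.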

Figure 1. Organization and rhythmic features of orangutans’ long calls.
(A) On top, the spectrogram of a Full Pulse and its organization in Sub-Pulses (e.g., Grumble sub-pulses). Below are the spectrograms of the three other sub-element types: Sub-pulse transitory elements, Pulse bodies and Bubble sub-pulses. Bars on the top of each spectrogram schematically quantify durations of inter-onset intervals (tk): dark green denotes the higher-level of organization (Full pulse). Orange (in the inset) and light green (bottom right) denote the lower-level organization (sub-pulse element types).
(B) Probability density function showing the distributions of the inter-onset-intervals (tk) for each of the long call element types.
(C) The distributions on the left show rhythm ratios (rk) per element type as calculated on 12 flanged males for a total of 1915 Full-pulses and 5309 sub-pulses. Solid sections of the curves indicate on-isochrony rk values; striped sections indicate off-isochrony rk values. A solid white line indicates the 0.5 rk value corresponding to isochrony. White dotted lines denote the on-isochrony peak value extracted from the probability density function. On the right, a bar plot for each element type shows the percentage of observations (rk) falling within the on-isochrony boundaries (solid bars) or within the off-isochrony boundaries (striped bars). The number of on-isochrony rk values is significantly larger (GLMM, Full vs Null: Chisq=2717.543, p<0.001) than the number of off-isochrony rk values for all long call element types (Full pulse: t-ratio=-25.164, p<0.001; Bubble sub-pulse: t-ratio=-30.694, p<0.001; Grumble sub-pulse: t-ratio=-14.526, p<0.001; Sub-pulse transitory element: t-ratio=-3.148, p<0.001). Pulse body showed no rk values falling within the on- or off-isochrony boundaries.
(D) Distribution of a variable calculated as the ratio between the tk of a sub-pulse and the tk of the corresponding higher level of organization, the Full Pulse. We report the peak value of the curve (0.046) and tested the significance of the extent of the central quartiles, which was significantly smaller than peripheral quartiles (Wilcoxon signed-rank test: W=2272, p<0.001).
Results
The probability density function of orangutan full pulses showed one peak (rk=0.493) in close vicinity to a theoretically pure isochronous rhythm, that is, full pulses were regularly paced at a 1:1 ratio, following a constant tempo along the long call (Fig. 1C). Our model (GLMM, full model vs null model: Chisq=298.2876, df=7, p<0.001; see Supplementary Materials) showed that pulse type, range of the curve (on- or off-isochrony), and their interaction had a significant effect on the count of rk values. In particular, full pulses’ isochronous peak was significant (t.ratio=-15.957, p<0.0001), that is, the number of rk values falling inside the on-isochrony range was significantly higher than the number of rk values falling inside the off-isochrony range (Fig. 1C). Critically, three (of the four) orangutan sub-pulse element types, namely grumble sub-pulses, sub-pulse transitory elements and bubble sub-pulses, also showed significant peaks (grumble sub-pulses: t.ratio=-5.940, p<0.0001; sub-pulse transitory elements: t.ratio=-4.048, p=0.0001; bubble sub-pulses: t.ratio=-10.640, p<0.0001) around pure isochrony (peak rk: grumble sub-pulses=0.501; sub-pulse transitory elements=0.495; bubble sub-pulses=0.502; Fig. 1C). That is, sub-pulses were regularly paced within regularly paced full pulses, denoting isochrony within isochrony (Fig. 1C) at different average tempi (mean tk (sd): full pulses=1.696 (0.508); grumble sub-pulses=0.118 (0.111); sub-pulse transitory elements=0.239 (0.468); bubble sub-pulses=0.186 (0.292); Fig. 1B). Overall, sub-pulses’ tk was equivalent to 0.046 of that of their comprising full pulses (Fig. 1D), which put sub-pulses at an approximate ratio of 1:20 relative to full pulses, the smallest categorical temporal rhythmic interval registered thus far in a vertebrate (De Gregorio et al., 2021; Roeske et al., 2020).
Permuted discriminant function analyses (Mundry and Sommer, 2007) (crossed, in order to control for individual variation) in R (R Core Team, 2013), based on seven acoustic measures extracted from grumble, transitory and bubble sub-pulses, confirmed that these indeed represented distinct sub-pulse categories: the percentage of correctly classified selected cases (62.7%) was significantly higher (p=0.001) than expected (37%).
Discussion
Orangutan long call nested rhythmic patterns reveal self-similar embedded isochrony in the vocal production of a wild great ape, notably, with two discernible structural strata – the full- and sub-pulse level – and three non-exclusive rhythmic arrangements in the form of [isochronyA [isochronya,b,c]].
Human and nonhuman great apes have similar auditory capacities (Quam et al., 2015). There are no identified skeletal differences in inner ear anatomy that could suggest significantly distinct sound sensitivity, resolution or activation thresholds in the time domain (Quam et al., 2015). Humans perceive an acoustic pulse as a continuous pitch, instead of a rhythm, at rates higher than 30 Hz (i.e., 30 beats per second). Long call sub-pulses exhibited average rhythms at ∼9.263 (3.994) Hz [i.e., tk=0.184 (0.303)s]. Therefore, hominid ear anatomy offers strong confidence that orangutans, like humans (and other great apes), perceive sub-pulse rhythmic motifs as such, i.e., as a train of signals, instead of one uninterrupted signal. Assuming otherwise would imply that auditory time-resolution differs by more than one order of magnitude between humans and other great apes in the absence of obvious anatomical culprits.
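The mean rate and the mean interval quoted above are not simple reciprocals of one another (1/0.184 s is roughly 5.4 Hz). One possible reading, which is an assumption and not stated in the text, is that the rate was computed for each interval and then averaged; the mean of per-interval rates is never smaller than the reciprocal of the mean interval. The Python sketch below illustrates this with made-up intervals; the function names and values are hypothetical.

```python
from statistics import mean

PITCH_THRESHOLD_HZ = 30.0  # rate above which pulses are heard as pitch (from the text)


def mean_of_rates_hz(intervals):
    """Average the per-interval rates 1 / t_k (assumed reading of the reported Hz value)."""
    return mean(1.0 / t for t in intervals)


def rate_of_mean_interval_hz(intervals):
    """Take the reciprocal of the average interval instead."""
    return 1.0 / mean(intervals)


# Hypothetical sub-pulse inter-onset intervals in seconds (not data from the study).
intervals = [0.05, 0.10, 0.40]

per_interval = mean_of_rates_hz(intervals)        # (20 + 10 + 2.5) / 3 Hz
from_mean = rate_of_mean_interval_hz(intervals)   # 3 / 0.55 Hz
heard_as_rhythm = per_interval < PITCH_THRESHOLD_HZ
```

Under either way of summarising tempo, the sub-pulse rates reported in the text sit well below the 30 Hz threshold.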
The simultaneous occurrence of non-exclusive recursive patterns excludes the likelihood that orangutans concatenate long calls and their subunits in linear structure without any recursive processes. To generate the observed vocal motifs as linear structure, three independent neuro-computational procedures would need to run in parallel to generate distinct isochronous rhythms at the sub-pulse level, whilst being indistinguishable, transposable and/or interchangeable at the pulse level without interference, which would be unlikely, if theoretically possible at all. Non-exclusive simultaneous recursive patterns also help exclude the probability that recursion was the primary by-product of anatomical constraints, such as breath length, heartbeat and other physiological rhythmic processes or movements (Pouw et al., 2020). Such processes could generate various simultaneous rhythmic patterns; however, these would be expected to be in the form of harmonics of the same base “carrier” rhythm. Yet, the three observed rhythmic arrangements at the sub-pulse level were not related to the pulse level by any small integer ratios. Together, this strongly suggests that the observed recursive self-embedded motifs are most likely generated by a recursive neurological procedure or recursion algorithm.
Recursive self-embedded vocal motifs in orangutans indicate that vocal recursion among hominids is not exclusive to human vocal and cognitive systems. This is not to suggest that these motifs exhibit all the properties that recursion exhibits in modern language-able humans. Such an expectation would be unreasonable, as it would imply that no evolution occurred in >10 million years since the split between the orangutan and human phylogenetic lineages. Thus, any apparent differences with recursion in today’s syntax, phonology, or music do not invalidate the probability that the reported vocal motifs represent an ancient, or potential ancestral, state for the ensuing evolution of recursion along the human clade.
Recursion and fractal phenomena are prevalent across the universe. From celestial and planetary movement to the splitting of tree branches and river deltas, and the morphology of bacterial colonies, patterns within self-similar patterns are the norm, not the exception. This makes the seeming singularity of human recursion amongst vertebrate signals all the more enigmatic. Our findings indicate that ancient vocal patterns organized across nested structural strata were likely present in ancestral hominids. Recursive vocal production likely predated the evolutionary emergence of (spoken) language within the Hominid family and along the human lineage. This data-driven possibility poses new evolutionary trajectories and timelines distinct from the classic theoretical notions that the advent of recursion was a saltational all-or-nothing event that took place only recently in modern humans (Berwick and Chomsky, 2019). Gradual cumulative evolutionary scenarios for the emergence and evolution of recursion (de Boer et al., 2020; Martins and Boeckx, 2019) can and should be considered and investigated, with the advantage that living primate models in general, and natural hominid behaviour in particular, offer superior empirical and comparative validity beyond purely theoretical considerations (Lameira et al., 2021). A combination of both approaches will likely pay the highest heuristic dividends. For example, future research may implement playback experiments designed to clarify whether or how great apes cognitively represent natural recursive self-embedded motifs as informed by previous designs and theoretical considerations (Corballis and Corballis, 2014; Engesser et al., 2016; Watson et al., 2020).
Implications for the evolution of recursion and cognition
The presence of recursive self-embedded vocal motifs in orangutans carries four major implications for the evolution of recursion and cognition. First, much ink has been spilled on the topic, yet the possibility of self-embedded isochrony, or alternatively, non-exclusive self-embedded patterns occurring within the same signal sequence, has on no account been formulated or conjectured as a possible state of recursive signalling, be it in vertebrates, mammals, primates or otherwise, extant or extinct. This suggests that controversy may have thus far been underpinned by data-poor circumstances, namely, lack of rigorous and detailed knowledge about call combinations in wild primates in general, and great apes in particular. The presence of recursive vocal patterns in a wild great ape in the absence of syntax, phonology or music helps prevent prior assumptions that recursion precursors ought to, or should be expected to, operate as modern-day recursion does. Orangutan recursive vocal patterns provide the first data point and open a new chapter for possible incipient or transitional states within the hominid family that, whilst inherently distinct from modern-day recursion, were nonetheless homologous to it and fully functional in their own right. The open discussion about what kind of properties would or could make a structure proto-recursive or not will be essential to move the state of knowledge past antithetical, jointly exhaustive or mutually exclusive options of what recursion is, how it emerged and evolved.
Second, primate loud calls are functionally analogous to, for instance, bird and whale song. Accordingly, sophisticated signalling in distantly related species does not undermine the value of great apes as living models for the study of language and its underpinning neuro-motoric operations, as often claimed (Berwick and Chomsky, 2019; Bolhuis et al., 2018; Bolhuis and Wynne, 2009; Lattenkamp and Vernes, 2018; Suzuki et al., 2018; Vernes et al., 2021) based on the false assumption that great ape vocal research has been at a standstill for the last half-century (cf. Belyk and Brown, 2017; Bianchi et al., 2016; Crockford et al., 2004; Hopkins et al., 2007; Lameira, 2017; Lameira et al., 2016, 2015, 2013; Lameira and Shumaker, 2019; Pereira et al., 2020; Russell et al., 2013; Staes et al., 2017; Taglialatela et al., 2012; Watson et al., 2015; Wich et al., 2009, 2012). Our findings invite results across lineages to be tested with primates in order to explicitly assess their evolutionary relevance within the human clade. This will help avoid prematurely interpreting absence of evidence as evidence of absence in primates.
Third, our findings suggest that, despite criticism, recursive perceptual capacities identified in primates (Watson et al., 2020) may be factual and likely evolved to subserve vocal signals in the species’ natural repertoire (though potentially yet unrecognized). This invites renewed interest in, and re-analysis of, primate signalling behaviour in the wild (Gabrić, 2021). However, findings also show that it may be too hasty to discuss whether perceptual capacities in primates or birds are equivalent to those engaged in syntax (Watson et al., 2020) or phonology (Rawski et al., 2021). Such classifications may be putting the proverbial cart before the horse; they are based on untested assumptions (e.g., that syntax and phonology evolved as separate “modules”, that one attained modern form before the other, et cetera) that may not have applied to proto-recursive ancestors (Kershenbaum et al., 2014; Miyagawa, 2021).
Fourth, given that isochrony universally governs music and that recursion is a feature of music, findings suggest a possible evolutionary link between great ape loud calls and vocal music. Loud calling is an archetypal trait in primates (Wich and Nunn, 2002) but is absent in modern humans. Our findings suggest this may not be coincidental. Great ape loud calling may have preceded and subsequently transmuted into modern recursive vocal structures in humans. Given their conspicuousness, loud calls represent one of the most studied aspects of primate vocal behaviour (Wich and Nunn, 2002), but their rhythmic patterns have seldom been characterized with precision (Clink et al., 2020; De Gregorio et al., 2021; Gamba et al., 2016). Besides our analyses, there are remarkably few confirmed cases of isochrony in great apes (but see Raimondi et al., 2023). Nonetheless, the behaviours that have been rhythmically measured with accuracy have been implicated in the evolution of percussion (Fuhrmann et al., 2015) and musical expression (Dufour et al., 2015; Hattori and Tomonaga, 2020), such as social entrainment in chimpanzees in connection with the origin of dance (Lameira et al., 2019) [a capacity once also assumed to be neurologically impossible in great apes (Fitch, 2017b; Patel, 2014)]. This opens the possibility that recursive vocal production and supporting neural procedures were first and foremost a feature of proto-musical expression in human ancestors, later recruited and “re-engineered” for the generation of linguistic combinatorics.
Future studies outlining the distribution of isochrony across primate (vocal) behaviour offer promising new paths to empirically assess the evolution of recursive signal structures in music and language and will help move the needle forward on one of the most tantalizing riddles in the evolution of language and cognition. These crucial data and insights will materialise if, as stewards of our planetary co-habitants, humankind secures the survival of these species and the preservation of their natural wild habitat (Estrada et al., 2022, 2017; Laurance, 2013; Laurance et al., 2012).
Methods and Materials
Study site
We conducted our research at the Tuanan Research Station (2°09′S; 114°26′E), Central Kalimantan, Indonesia. Long calls were opportunistically recorded from identified flanged males (Pongo pygmaeus wurmbii) using a Marantz Analogue Recorder PMD222 in combination with a Sennheiser Microphone ME 64 or a Sony Digital Recorder TCD-D100 in combination with a Sony Microphone ECM-M907.
Acoustic data extraction
Audio recordings were transferred to a computer at a sampling rate of 44.1 kHz. Seven acoustic measures were extracted directly from the spectrogram window (window type: Hann; 3 dB filter bandwidth: 124 Hz; grid frequency resolution: 2.69 Hz; grid time resolution: 256 samples) by manually drawing a selection encompassing the complete long call (sub)pulse from onset to offset, using Raven interactive sound analysis software (version 1.5, Cornell Lab of Ornithology). These parameters were duration (s), peak frequency (Hz), peak time, peak frequency contour average slope (Hz), peak frequency contour maximum slope (Hz), average entropy (Hz) and signal-to-noise ratio (NIST quick method). See the software’s documentation for a full description of the parameters (https://ravensoundsoftware.com/knowledge-base/pitch-tracking-frequency-contour-measurements/). Acoustic data extraction complemented the classification of long call elements, at both the pulse and sub-pulse levels, which was based on close visual and auditory inspection of spectrograms and on elements’ distinctiveness from each other as well as from the remaining catalogued orangutan call repertoire (Hardus et al., 2009) (see also supplementary audio files). Of these parameters, duration and peak frequency in particular have been shown to be resilient across recording settings (Lameira et al., 2013) and to adequately represent variation in the time and frequency axes (Lameira et al., 2017).
Rhythm data analyses
Inter-onset intervals (IOIs = tk) were calculated only from the begin time (s) of each full- and sub-pulse long call element using Raven interactive sound analysis software, as explained above. tk was calculated only between consecutive (full/sub) pulse elements of the same type. Ratio values (rk) were calculated as r_k = t_k / (t_k + t_{k+1}), that is, each interval divided by the sum of itself and the following interval. Following the methodology of Roeske et al. (2020) and De Gregorio et al. (2021), to assess the significance of the peaks around isochrony (corresponding to an rk value of 0.5), we counted the number of rk values falling inside on-isochrony ranges (0.440 < rk < 0.555) and off-isochrony ranges (0.400 < rk < 0.440 and 0.555 < rk < 0.600), symmetrically falling at the right and left sides of 1:1 ratios (0.5 rk value). We tested the count of on-isochrony rk values versus the count of off-isochrony rk values, per pulse type, with a GLMM for negative-binomial family distributions, using the glmmTMB R library. In particular, we built a full model with the count of rk values as the response variable and the pulse type in interaction with the range the observation fell in (on- or off-isochrony) as predictors. We added an offset weighting the rk count based on the width of the bin. The individual contribution was set as a random factor. We built a null model comprising only the offset and the random intercepts. We checked the number of residuals of the full and null models, and compared the two models with a likelihood ratio test (Anova with “Chisq” argument). We calculated p-values for each predictor using the R summary function and performed pairwise comparisons for each level of the explanatory variables with the emmeans R package, adjusting all p-values with Bonferroni correction. We checked normality, homogeneity (via a function provided by R. Mundry), and number of the residuals. We checked for overdispersion with the performance R package (Lüdecke et al., 2020). Graphic visualization was prepared using the R (R Core Team, 2013) packages ggplot2 (Wickham, 2009) and ggridges (Wilke, 2022). Data reshaping and organization were managed with the dplyr and tidyr R packages.
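The counting step that feeds the model can be sketched in Python as follows. This is an illustrative sketch only, not the authors' R pipeline: it uses the on- and off-isochrony boundaries quoted above, but the function names, the example ratios and the simple division by bin width (standing in for the model offset) are assumptions made here for illustration. It does not fit a GLMM.

```python
# Boundaries quoted in the text; comparisons are strict, as written there.
ON_RANGE = (0.440, 0.555)
OFF_RANGES = [(0.400, 0.440), (0.555, 0.600)]


def count_in(ratios, low, high):
    """Count r_k values with low < r_k < high."""
    return sum(1 for r in ratios if low < r < high)


def isochrony_counts(ratios):
    """Return raw and width-normalised counts for the on- and off-isochrony ranges."""
    on = count_in(ratios, *ON_RANGE)
    off = sum(count_in(ratios, low, high) for low, high in OFF_RANGES)
    on_width = ON_RANGE[1] - ON_RANGE[0]
    off_width = sum(high - low for low, high in OFF_RANGES)
    return {
        "on": on,
        "off": off,
        # Dividing by bin width mimics the role of the offset (assumption for illustration).
        "on_per_width": on / on_width,
        "off_per_width": off / off_width,
    }


# Hypothetical r_k values for one element type (not data from the study).
example_ratios = [0.50, 0.49, 0.51, 0.47, 0.42, 0.58, 0.30]
counts = isochrony_counts(example_ratios)
```

In this toy input, four values fall in the on-isochrony range, two in the off-isochrony ranges, and one (0.30) in neither. Because the on-isochrony range (width 0.115) is wider than the two off-isochrony ranges combined (width 0.085), comparing raw counts alone would favour the on-isochrony range, which is why some width correction is needed.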
Acoustic data analyses
Permuted discriminant function analysis with cross classification was performed using R and a function provided by Roger Mundry (Mundry and Sommer, 2007). The script was: pdfa.res=pDFA.crossed(test.fac="Sub-pulse-type", contr.fac="Individual.ID", variables=c("Delta.Time", "Peak.Freq", "Peak.Time", "PFC.Avg.Slope", "PFC.Max.Slope", "Avg.Entropy", "SNR.NIST.Quick"), n.to.sel=NULL, n.sel=100, n.perm=1000, pdfa.data=xdata). These analyses confirmed that long call elements, at the pulse and sub-pulse level, indeed represented biologically distinct categories.
Acknowledgements
We thank the Indonesian Ministry of Research and Technology, the Indonesian Ministry of Environment and Forestry, the Indonesian Ministry of Home Affairs, the Directorate General of Natural Resources and Ecosystem Conservation and the former Directorate General of Forest Protection and Nature Conservation for authorization to carry out research in Indonesia; the Universitas Nasional for supporting the project and acting as sponsors and counterparts; the Bornean Orangutan Survival Foundation and the MAWAS Programme in Palangkaraya for their support and permission to stay and work in the MAWAS Reserve. A.R.L. was supported by the UK Research & Innovation, Future Leaders Fellowship grant agreement number MR/T04229X/1.
Competing interests
The authors declare no competing interests.
Correspondence and requests for materials should be addressed to Adriano R. Lameira.
Supplementary Materials

References
- The origins of the vocal brain in humans. Neurosci Biobehav Rev 77:177–193. https://doi.org/10.1016/j.neubiorev.2017.03.014
- Recursive prosodic words in Kaqchikel (Mayan). Glossa: a journal of general linguistics 3:67. https://doi.org/10.5334/gjgl.550
- All or nothing: No half-Merge and the evolution of syntax. PLoS Biol 17:e3000539. https://doi.org/10.1371/journal.pbio.3000539
- Neocortical grey matter distribution underlying voluntary, flexible vocalizations in chimpanzees. Scientific reports 6:34733. https://doi.org/10.1038/srep34733
- Meaningful syntactic structure in songbird vocalizations? PLoS biology 16:e2005157. https://doi.org/10.1371/journal.pbio.2005157
- The slings and arrows of comparative linguistics. PLoS Biol 16:e3000019. https://doi.org/10.1371/journal.pbio.3000019
- Language: UG or Not to Be, That Is the Question. PLoS Biol 13:e1002063. https://doi.org/10.1371/journal.pbio.1002063
- Can evolution explain how minds work? Nature 458:832–833. https://doi.org/10.1038/458832a
- Do Animal Communication Systems Have Phonemes? Trends in Cognitive Sciences 19:555–557. https://doi.org/10.1016/j.tics.2015.08.011
- Some simple evo devo theses: how true might they be for language? In: The Evolution of Human Language. Cambridge: Cambridge University Press. pp. 45–62. https://doi.org/10.1017/CBO9780511817755.003
- Vocal individuality and rhythm in male and female duet contributions of a nonhuman primate. Current Zoology 66:173–186. https://doi.org/10.1093/cz/zoz035
- The Recursive Mind: The Origins of Human Language, Thought, and Civilization - Updated Edition. Princeton University Press. https://doi.org/10.1515/9781400851492
- Wild Chimpanzees Produce Group-Specific Calls: a Case for Vocal Learning? Ethology 110:221–243. https://doi.org/10.1111/j.1439-0310.2004.00968.x
- Evolutionary Dynamics Do Not Motivate a Single-Mutant Theory of Human Language. Sci Rep 10:451. https://doi.org/10.1038/s41598-019-57235-8
- Categorical rhythms in a singing primate. Current Biology 31:R1379–R1380. https://doi.org/10.1016/j.cub.2021.09.032
- Chimpanzee drumming: a spontaneous performance with characteristics of human musical drumming. Scientific reports 5:11320. https://doi.org/10.1038/srep11320
- Recursion in prosodic phrasing: evidence from Connemara Irish. Nat Lang Linguist Theory 33:1169–1208. https://doi.org/10.1007/s11049-014-9281-5
- Chestnut-crowned babbler calls are composed of meaningless shared building blocks. PNAS 201819513. https://doi.org/10.1073/pnas.1819513116
- Meaningful call combinations and compositional processing in the southern pied babbler. Proceedings of the National Academy of Sciences of the United States of America 201600970. https://doi.org/10.1073/pnas.1600970113
- Global importance of Indigenous Peoples, their lands, and knowledge systems for saving the world’s primates from extinction. Sci Adv 8:eabn2927. https://doi.org/10.1126/sciadv.abn2927
- Impending extinction crisis of the world’s primates: Why primates matter. e1600946. https://doi.org/10.1126/sciadv.1600946
- Recursive sequence generation in monkeys, children, U.S. adults, and native Amazonians. Sci Adv 6:eaaz1002. https://doi.org/10.1126/sciadv.aaz1002
- Dendrophilia and the Evolution of Syntax. In: Origins of Human Language: Continuities and Discontinuities with Nonhuman Primates. Berlin: Peter Lang. pp. 305–328.
- Empirical approaches to the study of language evolution. Psychonomic Bulletin & Review 24:1–31. https://doi.org/10.3758/s13423-017-1236-5
- Synchrony and motor mimicking in chimpanzee observational learning. Sci Rep-uk 4:srep05283. https://doi.org/10.1038/srep05283
- Overlooked evidence for semantic compositionality and signal reduction in wild chimpanzees (Pan troglodytes). Anim Cogn. https://doi.org/10.1007/s10071-021-01584-3
- The Indris Have Got Rhythm! Timing and Pitch Variation Of A Primate Song Examined Between Sexes And Age Classes. Frontiers in neuroscience 10:249. https://doi.org/10.3389/fnins.2016.00249
- Recursive syntactic pattern learning by songbirds. Nature 440:1204–1207. https://doi.org/10.1038/nature04675
- A description of the orangutan’s vocal and sound repertoire, with a focus on geographic variation. New York: Oxford University Press. pp. 49–60.
- Rhythmic swaying induced by sound in chimpanzees (Pan troglodytes). Proc Natl Acad Sci USA 117:936–942. https://doi.org/10.1073/pnas.1910318116
- The faculty of language: what is it, who has it, and how did it evolve? Science (New York, NY) 298:1569–1579. https://doi.org/10.1126/science.298.5598.1569
- The origin of speech. Scientific American 203:89–96.
- Chimpanzees differentially produce novel vocalizations to capture the attention of a human. Animal Behaviour 73:281–286.
- Why Is Phonology Different? No Recursion. In: Language, Syntax, and the Natural Sciences. Cambridge University Press. pp. 212–223.
- Parallels and Nonparallels between Language and Music. Music Perception 26:195–204. https://doi.org/10.1525/mp.2009.26.3.195
- The nature of the language faculty and its implications for evolution of language (Reply to Fitch, Hauser, and Chomsky). Cognition 97:211–225. https://doi.org/10.1016/j.cognition.2005.04.006
- Production of Supra-regular Spatial Sequences by Macaque Monkeys. Current Biology 28:1851–1859. https://doi.org/10.1016/j.cub.2018.04.047
- An interface approach to prosodic word recursion. In: Grijzenhout J, Kabak Baris, editors. Phonological Domains. Interface Explorations. Berlin, New York: Mouton de Gruyter. pp. 105–134. https://doi.org/10.1515/9783110219234.2.105
- Animal vocal sequences: not the Markov chains we thought they were. Proceedings Biological sciences / The Royal Society 281:20141370. https://doi.org/10.1098/rspb.2014.1370
- Processing of hierarchical syntactic structure in music. Proceedings of the National Academy of Sciences of the United States of America 110:15443–15448. https://doi.org/10.1073/pnas.1300272110
- Bidding evidence for primate vocal learning and the cultural substrates for speech evolution. Neuroscience & Biobehavioral Reviews 83:429–439. https://doi.org/10.1016/j.neubiorev.2017.09.021
- Orangutan information broadcast via consonant-like and vowel-like calls breaches mathematical models of linguistic evolution. Biol Lett 17:20210302. https://doi.org/10.1098/rsbl.2021.0302
- Coupled whole-body rhythmic entrainment between two chimpanzees. Sci Rep 9:18914. https://doi.org/10.1038/s41598-019-55360-y
- Speech-like rhythm in a voiced and voiceless orangutan call. PloS one 10:e116136. https://doi.org/10.1371/journal.pone.0116136
- Orangutan (Pongo spp.) whistling and implications for the emergence of an open-ended call repertoire: A replication and extension. Journal of the Acoustical Society of America 134:1–11. https://doi.org/10.1121/1.4817929
- Vocal fold control beyond the species-specific repertoire in an orang-utan. Scientific reports 6:30315. https://doi.org/10.1038/srep30315
- Orangutans show active voicing through a membranophone. Sci Rep 9:12289. https://doi.org/10.1038/s41598-019-48760-7
- Protoconsonants were information-dense via identical bioacoustic tags to proto-vowels. Nature Human Behaviour 1:0044. https://doi.org/10.1038/s41562-017-0044
- Orangutan Long Call Degradation and Individuality Over Distance: A Playback Approach. International Journal of Primatology 29:615–625. https://doi.org/10.1007/s10764-008-9253-x
- Vocal learning: a language-relevant trait in need of a broad cross-species approach. Curr Opin Behav Sci. https://doi.org/10.1016/j.cobeha.2018.04.007
- Does research help to safeguard protected areas? Trends in Ecology & Evolution 28:261–266. https://doi.org/10.1016/j.tree.2013.01.017
- Averting biodiversity collapse in tropical forest protected areas. Nature (advance online). https://doi.org/10.1038/nature11318
- Recursive sequence generation in crows. Sci Adv 8:eabq3356. https://doi.org/10.1126/sciadv.abq3356
- Stepwise acquisition of vocal combinatorial capacity in songbirds and human infants. Nature 498:104–108. https://doi.org/10.1038/nature12173
- Fractal aspects of the iteration of z → λz(1−z) for complex λ and z. Annals of the New York Academy of Sciences 357:249–259. https://doi.org/10.1111/j.1749-6632.1980.tb29690.x
- Distinctive signatures of recursion. Philosophical Transactions of the Royal Society B: Biological Sciences 367:2055–2064. https://doi.org/10.1098/rstb.2012.0097
- Cognitive representation of “musical fractals”: Processing hierarchy and recursion in the auditory domain. Cognition 161:31–45. https://doi.org/10.1016/j.cognition.2017.01.001
- Language evolution and complexity considerations: The no half-Merge fallacy. PLoS Biol 17:e3000389. https://doi.org/10.1371/journal.pbio.3000389
- Revisiting Fitch and Hauser’s Observation That Tamarin Monkeys Can Learn Combinations Based on Finite-State Grammar. Front Psychol 12:772291. https://doi.org/10.3389/fpsyg.2021.772291
- Discriminant function analysis with nonindependent data: consequences and an alternative. Animal Behaviour 74:965–976. https://doi.org/10.1016/j.anbehav.2006.12.028
- Morpheme-internal recursion in phonology. Studies in generative grammar. Berlin; Boston: De Gruyter Mouton.
- Recursion in the lexical structure of morphemes. In: Representing Structure in Phonology and Syntax. De Gruyter. pp. 211–238. https://doi.org/10.1515/9781501502224-009
- The evolutionary biology of musical rhythm: was Darwin wrong? PLoS biology 12:e1001821. https://doi.org/10.1371/journal.pbio.1001821
- Chimpanzee lip-smacks confirm primate continuity for speech-rhythm evolution. Biol Lett 16:20200232. https://doi.org/10.1098/rsbl.2020.0232
- Acoustic information about upper limb movement in voicing. Proc Natl Acad Sci USA 202004163. https://doi.org/10.1073/pnas.2004163117
- Early hominin auditory capacities. Sci Adv 1:e1500355. https://doi.org/10.1126/sciadv.1500355
- Isochrony and rhythmic interaction in ape duetting. Proc R Soc B 290:20222244. https://doi.org/10.1098/rspb.2022.2244
- Comment on “Nonadjacent dependency processing in monkeys, apes, and humans.” Sci Adv 7:eabg0455. https://doi.org/10.1126/sciadv.abg0455
- Categorical Rhythms Are Shared between Songbirds and Humans. Current Biology 30:3544–3555. https://doi.org/10.1016/j.cub.2020.06.072
- Vocal learning of a communicative signal in captive chimpanzees, Pan troglodytes. Brain and Language 127:520–525. https://doi.org/10.1016/j.bandl.2013.09.009
- Parallels in the sequential organization of birdsong and human speech. Nat Commun 10:1–11. https://doi.org/10.1038/s41467-019-11605-y
- Learning Recursion from Music and Music from Recursion. In: 2018 IEEE 18th International Conference on Advanced Learning Technologies (ICALT). Mumbai: IEEE. pp. 257–261. https://doi.org/10.1109/ICALT.2018.00066
- Acoustic properties of long calls given by flanged male orang-utans (Pongo pygmaeus wurmbii) reflect both individual identity and context. Ethology 116:385–395.
- FOXP2 variation in great ape populations offers insight into the evolution of communication skills. Scientific reports 7:258. https://doi.org/10.1038/s41598-017-16844-x
- Experimental evidence for compositional syntax in bird calls. Nature communications.
- Call combinations in birds and the evolution of compositional syntax. PLoS Biol 16:e2006532. https://doi.org/10.1371/journal.pbio.2006532
- Wild Birds Use an Ordering Rule to Decode Novel Call Sequences. Current Biology 27:2331–2336. https://doi.org/10.1016/j.cub.2017.06.031
- Social learning of a communicative signal in captive chimpanzees. Biology letters 8:498–501. https://doi.org/10.1098/rsbl.2012.0113
- R: A language and environment for statistical computing.
- Vocal learning in animals and humans. Phil Trans R Soc B 376:20200234. https://doi.org/10.1098/rstb.2020.0234
- Recursion in phonology? In: Phonological Explorations. Berlin, Boston: De Gruyter. https://doi.org/10.1515/9783110295177.41
- Über die Verschiedenheit des Menschlichen Sprachbaues und ihren Einfluss auf die geistige Entwickelung des Menschengeschlechts. Berlin: Königlichen Akademie der Wissenschaften.
- Representation of Numerical and Sequential Patterns in Macaque and Human Brains. Current Biology 25:1966–1974. https://doi.org/10.1016/j.cub.2015.06.035
- Nonadjacent dependency processing in monkeys, apes, and humans. Sci Adv 6:eabb0725. https://doi.org/10.1126/sciadv.abb0725
- Vocal Learning in the Functionally Referential Food Grunts of Chimpanzees. Current Biology 25:495–499. https://doi.org/10.1016/j.cub.2014.12.032
- A case of spontaneous acquisition of a human sound by an orangutan. Primates 50:56–64. https://doi.org/10.1007/s10329-008-0117-y
- Call cultures in orang-utans? PloS one 7:e36180. https://doi.org/10.1371/journal.pone.0036180
- Do male “long-distance calls” function in mate defense? A comparative study of long-distance calls in primates. Behavioral Ecology and Sociobiology 52:474–484. https://doi.org/10.1007/s00265-002-0541-8
- ggplot2: Elegant Graphics for Data Analysis. New York: Springer-Verlag.
Article and author information
Cite all versions
You can cite all versions using the DOI https://doi.org/10.7554/eLife.88348. This DOI represents all versions, and will always resolve to the latest one.
Copyright
© 2023, Lameira et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics
- Views: 2,331
- Downloads: 230
- Citations: 22
Views, downloads and citations are aggregated across all versions of this paper published by eLife.