Abstract
We provide quantitative evidence suggesting social learning in sperm whales across sociocultural boundaries, using acoustic data from the Pacific and Atlantic Oceans. Traditionally, sperm whale populations are categorized into clans based on their vocal repertoire: the rhythmically patterned click sequences (codas) that they use. Among these codas, identity codas function as symbolic markers for each clan, accounting for 35-60% of codas they produce. We introduce a computational method to model whale speech, which encodes rhythmic microvariations within codas, capturing their vocal style. We find that vocal style-clans closely align with repertoire-clans. However, contrary to vocal repertoire, we show that sympatry increases vocal style similarity between clans for non-identity codas, i.e. most codas, suggesting social learning across cultural boundaries. More broadly, this subcoda structure model offers a framework for comparing communication systems in other species, with potential implications for deeper understanding of vocal and cultural transmission within animal societies.
Introduction
Cultural transmission is defined as the transmission of information or behaviors between individuals of the same species by means of social learning [1]. While humans represent a benchmark of such capacity, cultural transmission has been observed in a wide variety of animals, including cetaceans [2, 3], songbirds [4], non-human primates [5], and insects [6]. It typically takes one of three forms: vertical transmission, from adult kin to young kin; oblique transmission, from unrelated adults to young; or horizontal transmission, from peer to peer [7].
When animals have the capacity for social learning, group-specific differences can arise and remain stable when they become distinguishable by symbolic markers: arbitrary group-identity signals that are recognizable by both members of the group itself and by members of other groups [8, 9]. In humans, symbolic markers can take a myriad of forms, ranging from visible signs, such as tattoos or garments, to communication cues or signals, such as idiomatic sentences or accents [8, 9, 10]. In animals, however, quantitative evidence of symbolic markers is remarkably scarce, one exception being recent results on the use of identity codas in sperm whale social communication [11].
Sperm whales live in multi-tiered societies and have a complex vocal communication system [13]. They communicate through rhythmic patterns (codas) of short broadband sounds (clicks), which have traditionally been classified into a finite set of coda types based on the total number of clicks, their rhythm, and their tempo [14, 15] (Fig. 1A). For example, the 4-regular (4R2) type refers to a pattern of four evenly spaced clicks, whereas the 1+3 type refers to one click followed by a longer pause and then three clicks in quick succession. Coda types are thus standardized rhythmic patterns, but individual vocalizations of a given coda type exhibit micro-variations around that pattern.
The set of vocalized coda types (coda usage) combined with how frequently each is vocalized (coda frequency) makes up a vocal repertoire (Fig. 1B). For example, the 4R2 coda is used by many sperm whales, but other coda types are more specific in their usage or frequency to certain groups of sperm whales. While there is evidence of individual variation in vocal repertoires [16, 17, 18], sperm whales belonging to the same social unit (a stable, matrilineally based group of whales) share a common vocal repertoire that is stable across years [17, 18, 19]. Social units that share substantial parts of their repertoire are said to be part of the same vocal clan [20, 21]. There is clear social segregation between members of different clans, even when living in sympatry, and thus clans mark a higher level of social organization, which appears to be defined on the basis of cultural vocal markers [20, 21, 11] (see Table 1 for a summary of the key concepts).
The clan-specific and frequent usage of certain coda types, termed identity codas [15, 11], aligns with the expectations for symbolic markers of group membership [9]. Furthermore, quantitative evidence that sperm whales themselves use identity codas as such markers has recently emerged: the more two clans overlap in geographic space, the more different their identity coda usage is [11]. This is consistent with computational models [9] of the evolution of symbolic marking, which predict that differences between cultural norms will be starkest when inter-group interactions are more common (e.g., in boundary or overlap regions).
All remaining coda types have been referred to as non-identity (non-ID) codas and constitute a very large fraction of sperm whales’ total number of coda utterances. In fact, non-ID codas account for more than 6 out of 10 emitted codas (see SM Section 1.1 for the counts per clan and per coda type). This raises the question: if ID codas are used as clan identity signals, what can be said about the remaining 65% of codas?
Here, we introduce a novel descriptive framework that focuses on the subcoda structure, that is, the rhythmic micro-variations of intervals between clicks within codas (Fig. 1C). This framework, formally encoded in what we call a “subcoda tree”, captures how codas are uttered: a vocal style. We find that variations in this vocal style, even for a single coda type, identify an individual’s social unit and clan, effectively fingerprinting vocal repertoires. With this, we add a new dimension with respect to previous approaches based on which codas are said—vocal repertoires. Thus, we propose a new concept of vocal identity of sperm whales that comprises both vocal style and vocal repertoire.
By applying our modeling framework to acoustic data from the Atlantic and Pacific Oceans, we obtain two main results. First, we partition sperm whale populations into vocal style-defined clans, which we find to recapitulate the previously defined vocal repertoire clans. This confirms that our method does capture meaningful speech characteristics. Second, and crucially, we find that the vocal style of non-ID codas is more similar for more sympatric clans, i.e. clans whose territory overlaps more spatially. In contrast, we do not find an effect of sympatry on the similarity of vocal styles when studying only ID codas. This suggests that geographic overlap induces vocal styles to become more similar between clans, without jeopardizing each clan’s acoustic identity signals. Our results strengthen previous results on the use of ID codas as symbolic markers, while supporting cultural transmission and social learning of vocalizations among whales of different clans, as predicted by theoretical models [22].
Results
Subcoda structure captures variability in sperm whale communication
We model the internal structure of codas, in terms of rhythmic variations at the level of clicks, by using variable length Markov chains (VLMCs). Our analytical pipeline is illustrated in Fig. 1C. We build each VLMC in two main steps. We first convert codas, naturally represented as sequences of continuous, absolute, inter-click intervals (ICIs), to sequences of discrete ICIs (dICIs), by discretizing time into bins. In this way, each dICI represents a narrow range of possible ICI values. The bins have a fixed width (or resolution) δt and thus implicitly correspond to the temporal resolution of our representation (see Methods for details on the optimal choice of δt). Note that although ICIs have units of time (seconds), dICIs are (unit-less) symbols (e.g. A, B, C, etc.), representing multiples of δt (and so the smaller δt, the more symbols there are). For example, the shortest ICIs will be mapped to the symbol A whereas longer ones will be mapped to symbols further down the alphabet. Hence, each coda (a sequence of ICIs) is mapped to a sequence of discrete symbols (a sequence of dICIs). The second part of the pipeline focuses on modeling the internal structure of codas in terms of dICI sequences. Essentially, we want to estimate transition probabilities from a dICI sub-sequence to the next dICI (Fig. 1C). A standard way would be to describe this using k-order Markov chain models, which encode information on previous sub-sequences up to k steps in the past of the sequence. However, it is possible that different sequences of dICIs contain different amounts of information or memory regarding potential next dICIs. This is akin to what happens with words (e.g., a word beginning with “re” can continue in more ways than one starting with “zy”). To account for this possibility while also retaining only the most compressed statistical representation of how codas are structured in terms of dICIs, we employ VLMCs.
VLMCs are generalizations of standard (fixed-memory) Markov chains that allow sub-sequences of dICIs of variable lengths. Longer sequences are kept only if they are significantly more informative in predicting the next dICI than their shorter sub-sequences, yielding an optimally compressed representation (see Methods for details on model fitting and selection, including the optimal choice of δt). Furthermore, VLMCs naturally have a tree structure (see Fig. 1C), because of the natural order between sequences and their sub-sequences. In particular, each node represents a sub-sequence of dICIs, and is equipped with a probability distribution of transitions to the next dICIs. The origin node corresponds to the empty sequence, leaf nodes correspond to the longest sequences, and all nodes forming the branch in between correspond to the sub-sequences of that leaf node. Thus, we call VLMCs fitted to coda ICI data subcoda trees.
Note that dICI sequences encode rhythmic variations within codas. Indeed, a coda type is a standard rhythmic pattern that can be realized with variations in its ICIs and thus in its dICIs too. For example, the 4R2 coda type can be vocalized as BCC but also as CBB (in a representation with, say, 26 symbols). In that sense, subcoda trees, through the dICI sequences that they contain and their transition probabilities, capture information about a vocal style. Four more features of subcoda trees are noteworthy: (i) because the method’s input is a set of codas, we can build subcoda trees for repertoires corresponding to different social scales, from individual sperm whales, to social units, all the way up to vocal clans; (ii) the difference between subcoda trees can be measured using a probabilistic distance (see Methods), which we can use to compare subcoda trees across sperm whale clans; (iii) certain features of the vocal style can be quantified via metrics on the subcoda tree: for example, we can define a complexity of the vocal style measured by an entropy on the tree; and (iv) subcoda trees can also be used as generative models, to create new synthetic codas in the form of dICI sequences to train downstream machine learning models.
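To make point (iv) concrete, the following is a minimal Python sketch of how a fitted tree could be used to sample a synthetic coda. It is an illustration under stated assumptions rather than the procedure used in this study: the dictionary layout of `tree` (context tuple mapped to next-symbol probabilities), the helper names `longest_context` and `sample_coda`, the integer end-of-coda symbol, and the toy tree are all hypothetical.

```python
import random

def longest_context(history, tree, max_depth=10):
    """Return the longest suffix of `history` that is a node of the tree."""
    for k in range(min(len(history), max_depth), -1, -1):
        context = tuple(history[len(history) - k:])
        if context in tree:
            return context
    return ()  # the root (empty context) is assumed to be in the tree

def sample_coda(tree, end_symbol, rng, max_clicks=40):
    """Sample dICI symbols one at a time until the end symbol is drawn."""
    coda = []
    while len(coda) < max_clicks:
        dist = tree[longest_context(coda, tree)]
        symbols, probs = zip(*dist.items())
        x = rng.choices(symbols, weights=probs, k=1)[0]
        coda.append(x)
        if x == end_symbol:
            break
    return coda

# Hypothetical toy tree over integer dICI symbols; 20 marks the end of a coda.
toy_tree = {
    (): {4: 1.0},
    (4,): {3: 1.0},
    (3,): {20: 1.0},
}
print(sample_coda(toy_tree, end_symbol=20, rng=random.Random(0)))
```

Because each step conditions on the longest context present in the tree, the sampler uses long memory only where the fitted model retained it, which mirrors the variable-length idea described above.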
Vocal style recovers vocal clan structure
The information about vocal style contained in subcoda trees is sufficient to recover the social structure of sperm whales (social units and clans). We show this in two ways. First, we analyze a dataset from sperm whales in Dominica (Dominica dataset) [21]. This dataset has rich annotations (coda type annotations, identity of recorded whales, social relations of recorded whales), which makes it particularly useful for validation. Specifically, the sperm whales in the Dominica dataset are divided into two well-known social clans, each composed of several social units with their own specific vocal repertoires, and can thus be defined as two different vocal clans. For each social unit in this dataset, we aggregate the individual whales’ coda samples and build a subcoda tree. Computing the distance between these trees (see Methods), we find that the distances between social units within the same clan are significantly smaller than between clans (Fig. 2A). We also find that an agglomerative clustering (average linkage, see Methods for details) on the distance between the subcoda trees correctly clusters social units into their respective clans (Fig. 2B). Without a priori knowledge of the clan memberships, we used vocal style to recover the existing classification of social units into two clans, which was previously done based on similarity between vocal repertoires (i.e., coda types and usage) [21].
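As an illustration of the clustering step, here is a small pure-Python sketch of average-linkage agglomerative clustering on a precomputed distance matrix. The function name `average_linkage` and the toy distances are hypothetical, and the matrix is assumed to be symmetric; in practice a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="average"` would typically be used instead.

```python
def average_linkage(dist, n_clusters):
    """Merge the two clusters with the smallest mean pairwise distance
    until only `n_clusters` remain. `dist` is a symmetric square matrix."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(dist[i][j] for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical distances between four social units: 0 and 1 are close,
# 2 and 3 are close, and the two pairs are far apart.
toy_dist = [
    [0.0, 0.1, 0.9, 0.9],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.2],
    [0.9, 0.9, 0.2, 0.0],
]
print(average_linkage(toy_dist, n_clusters=2))
```

Cutting the hierarchy at two clusters corresponds to asking whether the two known clans emerge without supplying clan labels.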
Second, we find that the subcoda structure of synthetic codas, generated from subcoda trees fitted on real data, closely reproduces that of real codas. To do this, we first train a simple classifier to assign codas to one of the two vocal clans, based on coda type. Variations of the same classifier, trained on the same real data, have been shown to discriminate between individual whales, social units, and clans with high accuracy [12]. We train the classifier on real codas, and then test it on both real and synthetic ones. The synthetic codas were generated using the subcoda tree of each clan, with a number of codas similar to that of the original dataset for a fair comparison (see Methods for details). We find that synthetic codas are correctly classified into their clans with an accuracy close (~ 85%) to that obtained on the real data (~ 90%, see Supplementary Materials Section 4).
Motivated by these results, we extend our analysis to a much larger dataset from the Pacific Ocean (Pacific dataset) [11]. This dataset is more sparsely annotated because of the breadth of its spatial coverage. We restricted our analyses to a well-sampled subset (n = 57 coda samples) of the full Pacific dataset (see Methods for details). Coda samples are only labeled by the spatial position at which they were recorded, but no information is available about the identity of the vocalizing sperm whales (see Methods for details). In fact, each repertoire likely contains codas from multiple individuals of a single clan. It has recently been shown that these coda samples can be divided into seven vocal clans based on their coda usage [11]. We use those clans as a benchmark for the following analysis.
Since there is no social unit-level information for this dataset, we fit a subcoda tree for each repertoire (i.e., all of the codas recorded on a single day in a single region). Trees are significantly more similar for coda samples belonging to the same vocal clan than for those belonging to different vocal clans (Fig. 3A). We also find that clustering coda samples based on vocal style returns a dendrogram that closely matches the one obtained from coda usage in [11] (Fig. 3B). The major exception we find is the Short clan (red), so named because member whales produce short codas with very few clicks, for which anomalous results were previously reported as well [11]. In our case, this is due to the Short clan being less well localized in the space of trees, while the other clans have well-defined centroids (see Supplementary Materials Fig. 9 for a low-dimensional representation of the subcoda tree metric space).
Therefore, we find that sperm whale vocal clans in the Atlantic Ocean (Caribbean Sea) and Pacific Ocean can be identified by a vocal identity that encompasses both clan-specific vocal repertoire [21, 24, 20, 11] and vocal style as defined in this work.
Clan sympatry impacts vocal style of non-ID codas only
While interesting, the fact that both vocal repertoires and vocal styles discriminate between clans might imply that considering both could be redundant for vocal identity. However, we find that this is not the case when we consider the functional role of ID versus non-ID codas.
More precisely, different clans can share significant portions of their total range, overlapping across large swaths of ocean. Such sympatric clans exhibit a decreasing similarity of their ID coda usage with increasing clan overlap [11]. This means that the more two clans overlap in space, the more dissimilar their vocal repertoires are in terms of ID coda types and their usage frequency. This is consistent with the idea that ID codas are used as symbolic markers to delineate cultural boundaries between social groups [11, 9]. In contrast, non-ID coda usage does not show any relationship to clan overlap.
We find the exact opposite effect when considering vocal style. The similarity in vocal style for ID codas across clans does not depend on the level of clan overlap (Fig. 4A). In contrast, the similarity in vocal style for non-ID codas displays a clear and significant increase (i.e., decreasing subcoda tree-distance) as clan spatial overlap increases (Fig. 4B). In the Supplementary Materials (see Section 2.4.2), we show that the same results hold at the single coda type level, in addition to the whole clan level, along with an analysis of the confidence intervals. These results imply that the internal structure of codas is more similar for groups that likely spend more time in the same space, akin to accents aligning in human populations that share the same territory [25, 26]. This also highlights the complementarity of vocal repertoire and style: the trends are different precisely because the two concepts describe different aspects of whale speech.
Discussion
We have presented a general method for modeling animal communication systems and their complexity based on VLMCs. In the context of sperm whales, this new method allows the extraction of subcoda trees, which succinctly describe the internal temporal structure of codas. Previous work on the structure of sperm whale communication has largely focused on supra-level coda analyses: for example, by classifying codas into types, quantifying how often different types are used, and distinguishing between individual whales, social units, or clans based on those counts [18, 27]. Here, we adopted a more fine-scale approach by investigating potential structure within codas. To do so, we used VLMCs to model the transition probability of observing a specific ICI given the previous ones. A VLMC, or here a subcoda tree, encodes all those probabilities but only for dICI sequences that are informative—other sequences are automatically discarded. As such, a subcoda tree is a statistically validated representation of the internal memory structure of codas at the level of sequences of clicks. It contains information about important rhythmic variations and transitions between them: a vocal style.
Using such representations, we propose a novel concept of vocal identity for sperm whales composed of vocal repertoire (what they say) and vocal style (how they say it), the latter being captured by our framework. We find that: (i) vocal styles vary between social units and clans, and can be used to distinguish them; (ii) the similarity of clan vocal styles for non-ID codas increases with increasing spatial overlap, while no change occurs for ID codas; and (iii) social learning across symbolic cultural boundaries most parsimoniously explains the observed trends.
Vocal style recovers hierarchical social structure
Using the Dominica dataset, sperm whales had previously been divided into two vocal clans, based on their vocal repertoires and observed social interactions [21]. In our study, comparing the vocal styles of those same whales led to the same assignment of social units to two vocal clans. Similarly, for the Pacific dataset, clustering based on vocal styles yielded clans that were in good agreement with those previously defined based on vocal repertoires [11] (see Supplementary Materials for an extended comparison). The difference between the two partitions was mainly due to the Short clan, which was more spread out in subcoda tree space than the other clans, causing overlap with other clans that showed less variability. This variability could be linked to the fact that Short clan whales typically make codas with very few (e.g., three or four) clicks, leading to subcoda trees with very few nodes. In Ref. [11], the authors observe a similar lack of uniformity in coda usage of the Short clan.
Identity and non-identity codas show different trends
For ID codas, we show that the similarity between clan vocal styles is not affected by spatial overlap, while it has recently been shown that the similarity between clan vocal repertoires decreases with overlap [11]. This means that spatial overlap does not affect how whales produce ID codas (in terms of their fine-scale rhythmic structure; our results) but does affect how often they produce them. In contrast, for non-ID codas, we show that the similarity between vocal styles increases with spatial overlap between two clans, while no change was observed for vocal repertoires in previous work on the same dataset. In other words, increasing spatial overlap is correlated with more similar fine-scale rhythmic structure of non-ID codas produced by whales from different clans (our results), but does not affect how often non-ID codas are produced. Our study thus supports and nuances the results of Hersh et al. [11]. We provide further support for selection acting to produce unambiguous, recognizable identity signals in the ID codas. However, ID codas only account for 35% of the total vocalizations; the remaining 65% of codas have traditionally been lumped into a catch-all category (i.e., non-ID codas) and their function remains enigmatic (these numbers are an average over the Pacific clans, and go up to 93% for non-ID codas when counting number of coda types instead of number of codas emitted, see SM 1.1 for details). We could still discriminate among clans using non-ID coda vocal style; however, the increased similarity of non-ID coda vocal styles between clans with greater spatial overlap, as demonstrated here, suggests that non-ID codas are likely vocal cues and not identity signals like the ID codas. Accordingly, vocal repertoire and vocal style capture different and complementary information on sperm whale communication, and should be considered in tandem in future studies.
Evidence for social learning across cultural boundaries
There are several potential mechanisms driving the similarity in non-ID coda vocal styles—but not ID coda vocal styles—across spatially overlapped clans: environmental variation, genetics, and/or social learning.
Local adaptation to specific ecological conditions can lead to geographic variation in acoustic signals [28]. If environmental pressures alone were responsible for the trends we observe in sperm whales, this would imply that (i) more spatially overlapped clans experience more similar environments, (ii) non-ID coda vocal style is impacted by or dependent on environmental parameters, and (iii) ID coda vocal style is not impacted by/dependent on environmental parameters. Although the first point is somewhat intuitive, to date there is no evidence that coda production systematically varies with environment. In fact, clans are recognizable across ocean basins, making local adaptation an unlikely driver of the observed trend in non-ID coda vocal style.
If genetic relatedness were responsible, this would imply that (i) more spatially overlapped clans are more genetically related, (ii) non-ID coda vocal styles are genetically inherited, and (iii) ID coda vocal styles are not genetically inherited. If all three requirements were met, then the observed similarity in non-ID coda vocal styles for more spatially overlapped clans could be due to genetic determination under a general isolation by distance structure. However, research to date suggests this scenario is unlikely. Rendell et al. [29] found little evidence to support genetics as an explanation of differences in vocal dialects among clans in the Pacific Ocean. Furthermore, Alexander et al. [30] found that regional genetic differentiation in the Pacific Ocean is very low: while social group is important for explaining both mitochondrial and nuclear DNA variance, geographic region is not. This contrasts with results from the Indian Ocean, where region was the strongest predictor of mitochondrial DNA variance. Given that gene flow in sperm whales is largely male-mediated and that mitochondrial DNA haplotypes are broadly shared across the Pacific Ocean, it is unlikely that coda dialects are genetically determined [30, 31]. Agent-based models grounded in empirical data from Pacific Ocean sperm whales further support coda usage as socially learned, not genetically inherited [22]. To fully rule out a genetic explanation for our results, the analyses in [29] could be replicated for ID coda usage and non-ID coda usage separately. This would shed light on whether certain coda types are genetically inherited vs. socially learned, as has been suggested for some humpback whale (Megaptera novaeangliae) vocalizations [32].
The most parsimonious explanation for the observed similarity of non-ID coda vocal styles of clans with increasing spatial overlap is social learning across clan boundaries. This is remarkable, given that sperm whales belonging to different clans have rarely been observed physically interacting at sea [3]. However, that does not preclude the possibility that they are within acoustic range of each other [33] and that cross-cultural social learning opportunities arise. This explanation is compatible with (and bolsters) past work suggesting that ID and non-ID codas function differently in sperm whale communication, and further suggests that they experience different evolutionary pressures [22]. Whether social learning has facilitated stochastic (i.e., cultural drift) or deterministic (i.e., cultural selection) processes is more difficult to determine, and it is unclear whether the observed non-ID coda vocal style alignment has been neutral or adaptive [34, 28]. Importantly, these findings suggest that vocal learning in sperm whales may not be limited to vertical transmission from related adults to young kin, but that horizontal and/or oblique social learning from outside the natal social unit might also be occurring.
Vocal identity in sperm whales is thus consistent with both cultural selection on ID codas to maintain discrete signals for vocal recognition in sympatry, and social learning between clans leading to a vocal style more similar to that of other whales with which they are in acoustic contact more frequently. This highlights a more complex system of transmission in which clan identity is maintained through selection, while gradual change over time may occur within and across clans for vocalizations which do not function in social recognition and thus may create similar vocal styles.
Future directions
Our results can be expanded in multiple ways in future work. The first, and the simplest conceptually, would be to conduct the present analysis on a larger dataset. More codas would improve the quality of the statistical analyses and ensure that all codas are represented in realistic proportions for each clan. Moreover, longitudinal datasets might provide direct evidence to discriminate between the social learning hypothesis and competing ones (e.g. drift in vocal style). Similarly, confirmations could emerge from large scale genetic datasets addressing the issues of phylogenetic relatedness (or lack thereof) in clans that are closer in vocal style distance. Such datasets do not exist at present, but efforts towards automated and semi-automated collection techniques are underway (e.g. Project CETI [35]). Second, from a methodological perspective, we could add spectral information (in terms of acoustic frequencies) to the temporal information currently used. Although sperm whale acoustic communication seems mostly based on rhythm, spectral features of individual clicks may convey additional information. This possibility could be incorporated into our method by labeling the dICIs according to the frequency content of the associated click (or by extending the available “alphabet” for the VLMC). Third, it would be interesting to investigate in more detail the function of non-ID codas. Indeed, even though ID codas were only recently formally named for the first time, they have been the primary focus of sperm whale coda research for decades. As previously mentioned, non-ID codas are a catch-all category for anything that is not an ID coda, but that does not mean that all non-ID codas function in the same way. To start to unveil their function, we need to consider the context (behavioral, environmental, etc.) in which different non-ID codas are produced [36]. 
The pattern we documented may or may not apply to all non-ID codas, but it is at least strong enough that we detect the relationship with clan spatial overlap when collectively considering all non-ID codas.
Methods
Acoustic data
In social situations, sperm whales acoustically communicate through short bursts of clicks with recognizable patterns based on rhythm and tempo referred to as codas. Codas are generally represented as sequences of ICIs, equivalent to a time series of click onsets.
We analyzed two datasets in the present study. The Dominica dataset contains 8719 annotated codas recorded in the Atlantic Ocean off the island of Dominica between 2005 and 2019. The codas come from 12 social units grouped into two vocal clans (EC1 and EC2). The Pacific dataset was collected between 1978 and 2017 at 23 locations in the Pacific Ocean (the recording methods are available in the supplementary materials of [11]). The codas were divided into coda samples according to their recording day, and each coda sample was assigned a single vocal clan inferred in [11]. For the clan-level analysis (Fig. 4), all coda samples were used to compute the subcoda trees (23555 codas). However, for the analysis at the level of coda samples (Fig. 3), we discarded coda samples with fewer than 200 codas to ensure reliable statistical inference, resulting in a final count of 57 coda samples (17046 codas) for the Pacific.
Representation of sperm whale communication as discrete inter-click intervals
As a preliminary step, we discretized the (continuous) ICI values into bins of width δt seconds. In other words, we represented the continuum of ICI values by a finite set of discrete ICIs (dICIs) based on the duration of the ICI. The bin width δt controls the temporal resolution of the representation: a higher value of δt implies a coarser representation with fewer dICIs. We also imposed an upper bound tmax: any ICI value greater than that was truncated to tmax. This ensured that the set of dICIs was finite. Note that although ICIs have units of time (seconds), dICIs are unitless (they represent time intervals). The resulting representation of ICIs as dICIs is a discrete random variable defined as
$$X = \left\lfloor \frac{\min(\mathrm{ICI},\, t_{\max})}{\delta t} \right\rfloor,$$
which takes values in the finite set $\chi = \{0, 1, \dots, t_{\max}/\delta t\}$. We represented the sequences of ICIs by sequences of dICIs from that finite set. Note that any ICI value above tmax is mapped to the dICI $t_{\max}/\delta t$ and therefore represents the end of a coda. We set tmax = 1 s (longer than any ICI) and δt = 0.05 s throughout the analysis (see Supplementary Materials section 3.3.2 for justification of this choice and section 3.4.3 for an analysis on the influence of this parameter).
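The discretization step can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the function names `discretize_coda` and `to_letters` are hypothetical, and binning by taking the floor of ICI/δt (with integer symbols starting at 0, rendered as letters starting at A) is an assumption about the exact convention.

```python
import math

def discretize_coda(icis, dt=0.05, t_max=1.0):
    """Map inter-click intervals (seconds) to integer dICI symbols.
    Assumes floor-based binning; values at or above t_max map to the
    last symbol, which marks the end of a coda."""
    end_symbol = round(t_max / dt)  # 20 for dt = 0.05 and t_max = 1
    symbols = []
    for ici in icis:
        if ici >= t_max:
            symbols.append(end_symbol)
        else:
            # Small tolerance so that a value sitting on a bin edge
            # (e.g. 0.15) is not pushed down by floating-point rounding.
            symbols.append(math.floor(ici / dt + 1e-9))
    return symbols

def to_letters(symbols):
    """Render integer symbols as letters (0 -> A, 1 -> B, ...)."""
    return "".join(chr(ord("A") + s) for s in symbols)

coda = [0.21, 0.19, 0.22]  # a hypothetical near-regular four-click coda
print(discretize_coda(coda))              # [4, 3, 4]
print(to_letters(discretize_coda(coda)))  # EDE
```

The toy coda shows how small timing differences around one rhythm (about 0.2 s between clicks) land in neighbouring bins, which is the kind of micro-variation the dICI sequences are meant to retain.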
Variable length Markov chains
We then modeled these dICI sequences using variable length Markov chains (VLMCs). VLMCs provide the large memory advantage of higher-order Markov chains when needed, without the drawback of having too many unnecessary parameters in the model.
Fitting a VLMC is the process of deciding how much memory is necessary to model specific sequences. The criterion for making this decision is the following: longer sequences are discarded if their distribution of transition probabilities is similar to that of shorter subsequences. This process is often called context tree estimation and consists of two steps.
The first step is to consider WD, the set of all sequences of maximum length D (which we set to 10), and to assign the following probability distribution qw to each sequence w:
$$q_w(x) = P\left(X_{t+1} = x \mid X_{t-|w|+1} \dots X_{t} = w\right),$$
that is, the probability of observing a state x ∈ χ given the sequence w.
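A simple way to obtain these distributions is to count, for every context up to length D, which symbol follows it. The sketch below does this with relative frequencies; the function name `transition_distributions`, the tuple encoding of contexts (oldest symbol first), and the toy sequences are illustrative assumptions, and no smoothing is applied.

```python
from collections import Counter, defaultdict

def transition_distributions(sequences, max_depth=10):
    """Estimate q_w(x) as the relative frequency of symbol x following
    context w, for every context w of length 0..max_depth seen in the data."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i, x in enumerate(seq):
            # k = 0 gives the empty context (the root of the tree).
            for k in range(min(i, max_depth) + 1):
                counts[tuple(seq[i - k:i])][x] += 1
    return {
        w: {x: c / sum(cnt.values()) for x, c in cnt.items()}
        for w, cnt in counts.items()
    }

# Two hypothetical dICI sequences over the symbols 1 and 2.
q = transition_distributions([[1, 2, 1, 2], [1, 2, 2]])
print(q[(1,)])  # distribution of the symbol that follows a single 1
print(q[(2,)])  # distribution of the symbol that follows a single 2
```

In this toy data a 1 is always followed by a 2, whereas a 2 is followed by either symbol equally often, so the two contexts carry different amounts of predictive information.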
The second step is to prune the sequences that do not add information. Take two sequences u, w ∈ WD, one being the suffix of the other, w = σu. The information gained Hw by considering the longer sequence can be measured with a weighted Kullback-Leibler (KL) divergence DKL [37]. The longer memory sequence w is kept only if the information gain is greater than some threshold K [38, 39]:
$$H_w = N(w)\, D_{KL}(q_w \,\|\, q_u) > K,$$
where N(w) denotes the number of occurrences of sequence w in the data. Sequences that satisfy this condition are called contexts and sequences that do not are discarded. A VLMC can be defined as the set of these contexts w and their associated probability distribution qw (see Supplementary Materials section 3.1 for details).
A VLMC can be visualized as a tree by representing each context w by a node and setting the root node as the context of length zero. Contexts that are subsequences of each other are then part of the same branches, which end with the longest contexts.
Quantitative Comparison of VLMCs
If two VLMC models T1 and T2 are built over the same finite set of dICIs χ, there exists a map ϕ1 : WD → T1 that maps any sequence of elements of χ into the longest sequence present in T1, and similarly for T2. This map also induces a map between the probability distributions of T1 and T2. Given two distributions over the same set χ, we can measure how different they are with the KL divergence. Therefore, it is possible to define a dissimilarity between T1 and T2 by considering the average KL divergence over all sequences of T1 and their map ϕ1(T2) ⊆ T1
Refer to the Supplementary Materials section 3.4 for a more detailed explanation.
This results in a dissimilarity measure that captures not just the difference in emission distribution but also the structural differences of the associated context trees. When comparing the distribution of distances in Fig. 2A and Fig. 3A we performed a Kolmogorov-Smirnov test to test if the distances between social units/coda samples of the same clan and distances between social units/coda samples of different clans had come from the same distribution. For every pair, we can reject the hypothesis of the distances coming from the same distribution with 95% confidence.
Hierarchical Clustering of VLMCs
The dendrograms in Fig. 2B and Fig. 3B were obtained by hierarchical clustering using average linkage on the set of subcoda trees (VLMCs). Since the distance is not symmetric, for agglomerative clustering we considered the symmetric distance:
Measuring clan overlap
We used the clan spatial overlap values from [11]. Briefly, given two clans A and B, and the coda samples associated to them, the amount of geographical overlap of A in B was measured as the fraction of coda samples belonging to clan A that were recorded within 1000 kilometers of at least one repertoire of clan B. One thousand kilometers is the approximate annual home range span of sperm whales in the eastern tropical Pacific [40, 41].
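The overlap measure described above can be sketched as follows. This is only an illustration: it assumes each coda sample is summarized by one (latitude, longitude) position and that distances are great-circle distances. The overlap values used in this study were taken from [11], not computed with this code, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def clan_overlap(samples_a, samples_b, radius_km=1000.0):
    """Fraction of clan A positions within radius_km of any clan B position."""
    if not samples_a:
        raise ValueError("clan A has no samples")
    near = sum(
        any(haversine_km(a, b) <= radius_km for b in samples_b)
        for a in samples_a
    )
    return near / len(samples_a)

# Hypothetical positions: two of three clan A samples lie near clan B.
clan_a = [(0.0, 0.0), (0.0, 5.0), (0.0, 40.0)]
clan_b = [(0.0, 1.0)]
print(clan_overlap(clan_a, clan_b), clan_overlap(clan_b, clan_a))
```

Note that the measure is not symmetric: the overlap of A in B generally differs from the overlap of B in A.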
Statistical Testing
In Fig. 2 and Fig. 3 we compare the distributions of distances between subcoda trees of coda samples/social units of the same clan (within) and of different clans (between). The purpose is to assess whether these distributions originate from the same underlying population. We employ both the Kolmogorov-Smirnov test and the t-test. The observed p-values were well below 0.01 for all clans. This allows us to confidently reject the hypothesis that within-clan and between-clan distances follow the same distribution, i.e. that vocal style does not differ between clans. For more information, see the Supplementary Materials, section 3.4.1.
To assess the existence of a relationship between clan overlap and vocal style similarity, we applied an ordinary least squares (OLS) linear regression model. We show the resulting p-values of the OLS fit at the bottom left of each plot of Fig. 4 along with the observed r² value. To assess whether there is a true difference between the two cases, we also bootstrapped the linear regression calculation to obtain 95% confidence intervals for the slopes of the fits. The intervals contain both negative and positive values in the ID case, but only negative slope values in the non-ID case, thus confirming our interpretation.
Acknowledgements
This study was funded by Project CETI via grants from Dalio Philanthropies and Ocean X; Sea Grape Foundation; Rosamund Zander/Hansjorg Wyss, Chris Anderson/Jacqueline Novogratz through The Audacious Project: a collaborative funding initiative housed at TED. TAH was supported by Max Planck Group Leader funding to Andrea Ravignani of the Max Planck Institute for Psycholinguistics. The Dominica coda dataset originates from The Dominica Sperm Whale Project which was supported by a FNU fellowship for the Danish Council for Independent Research supplemented by a Sapere Aude Research Talent Award, a Carlsberg Foundation expedition grant, a grant from Focused on Nature, two Explorer Grants from the National Geographic Society (all to SG), and supplementary grants from the Arizona Center for Nature Conservation, Quarters For Conservation, the Dansk Akustisks Selskab, Oticon Foundation, and the Dansk Tennis Fond. Further funding was provided by Discovery and Equipment grants from the Natural Sciences and Engineering Research Council of Canada to Hal Whitehead (Dalhousie University) and a FNU large frame grant and a Villum Foundation Grant to Peter Madsen (Aarhus University). The publicly accessible Pacific Ocean sperm whale coda dataset we used in this study emanates from the Global Coda Dialect Project, a consortium of scientists conducting sperm whale acoustics research worldwide. Members of the consortium who contributed to the Pacific Ocean dataset include: Luke Rendell, Mauricío Cantor, Lindy Weilgart, Masao Amano, Steve M. Dawson, Elisabeth Slooten, Christopher M. Johnson, Iain Kerr, Roger Payne, Andy Rogan, Ricardo Antunes, Olive Andrews, Elizabeth L. Ferguson, Cory Ann Hom-Weaver, Thomas F. Norris, Yvonne M. Barkley, Karlina P. Merkens, Erin M. Oleson, Thomas Doniol-Valcroze, James F. Pilkington, Jonathan Gordon, Manuel Fernandes, Marta Guerra, Leigh Hickmott and Hal Whitehead. 
We are grateful to Scott Baker and Alana Alexander for answering questions about sperm whale genetics.
Supplementary Materials
Evidence of social learning across symbolic cultural barriers in sperm whales
1 Data and Preprocessing
Sperm whales communicate vocally via clicks: short bursts of sound emitted in sequence. These clicks are combined into recognizable patterns called codas. Clicking sperm whales were recorded with a hydrophone, and clicks were detected in the resulting audio files by human experts. The data that we used consists of time sequences of inter-click intervals (ICIs), i.e. the times between two consecutive detected clicks; this is equivalent to having a time series of click onsets.
We used two datasets: the Dominica and Pacific datasets. The Dominica dataset contains 8719 annotated codas recorded in the Atlantic Ocean near Dominica. The codas come from 12 social units grouped into two vocal clans (EC1 and EC2). The Pacific dataset consists of around 23555 codas recorded between 1978 and 2017 in 23 Pacific Ocean locations. The codas were divided into coda samples according to their recording day, and each coda sample was assigned a single vocal clan as inferred in [11]. For the clan-level analysis, all coda samples were used to compute the VLMC models. For the analysis at the coda sample level, however, we discarded coda samples with fewer than 200 codas to allow for reliable statistical inference, resulting in a final count of 57 coda samples (17046 codas) for the Pacific.
To model the ICI sequences in the datasets, we represented the ICIs by a finite set of symbols (or states), in three main steps (Fig. S1). First, we denoted by X a continuous random variable that represents an ICI, measured in seconds.
Second, we imposed an upper bound tmax on the values taken by X. This ensures that we are modeling with a finite number of states, which matters especially in situations where the maximum number of states has to be known. We set this value to 1 second to be sure that it is longer than any ICI.
Third, we discretized the values of the ICIs into a set of bins, akin to a histogram. We denote by δt the width of these bins in seconds and call it the temporal resolution of the representation. Formally, we define
$$X_d = \left\lfloor \frac{\min(X,\, t_{max})}{\delta t} \right\rfloor .$$
By construction, this defines a discrete random variable that takes values in the finite set $\chi = \{0, 1, \ldots, t_{max}/\delta t\}$. Note that each element of this set represents a range of ICI values of length δt seconds. Any ICI value above tmax is mapped to the symbol tmax/δt, which thus represents the end of a coda. We set the upper bound to tmax = 1 second and δt = 0.05 seconds (see Section 3.3 for details about the choice of values).
We then modeled sperm whale communication as sequences (Xi)i∈ℕ of these discrete symbols. For clarity, we denote the elements of χ by letters of the Latin alphabet: A, B, and so on. We also refer to χ as an alphabet, and to its elements as symbols or states, interchangeably.
1.1 ID and non-ID codas
Some coda types are considered ID for some clans but non-ID for others. The Pacific dataset [11] has annotations specifying this label (ID or non-ID) for each coda and each clan. For each clan, we counted the total number of codas that are ID and the total number of codas that are not (see Table S1).
We also provide similar statistics for the count of different coda types (see Table S2).
2 The role of memory
A coda is a series of clicks emitted in fast succession. Codas have traditionally been identified by practitioners as building blocks of sperm whale communication. Whether or not codas themselves are composed of series of smaller collections of clicks is, however, an open question. We asked: are the codas the smallest such blocks, or is there structure at a scale shorter than codas but longer than the individual clicks that constitute them? In order to answer this, we modeled sperm whale communication as higher-order Markov chains, that is, Markov chains with a memory h larger than or equal to one (but of fixed length); see Section 2 for details. In other words, we assume that the probability of observing an ICI within a given range (here referred to as a symbol or state) depends on the h previous states. For a given h, we fit this Markov model to the data by estimating the transition probabilities from each sequence of h states to any other state.
The results are summarized in Fig. S2, for a range of memory values h, and for two temporal resolutions for the binning of the ICIs. We note that there is a bifurcation between two different behaviors around h ≈ 3: transition probabilities go from very low (approximately random transitions) to very large (almost deterministic transitions). Indeed, for h < 3, these probabilities are very low [Fig. S2(a)] and all similar [Fig. S2(b)]. All possible next states are equally likely, but not very likely: this indicates underfitting. On the contrary, for h > 3, the transition probabilities are all close to one and all similar. Moreover, only a few of them are non-zero [Fig. S2(c)]. Given a sequence longer than three, only one next state can be observed: this indicates overfitting. Finally, for h ≈ 3, transition probabilities are heterogeneous: their average is between 0 and 1, their variance exhibits a peak, and more than one state can potentially be observed next. Moreover, the Akaike Information Criterion (AIC) displays a minimum around that value of the memory, which indicates that it provides a good trade-off between variance of the data explained and the number of parameters needed for the model.
This transition around h ≈ 3 suggests that there is structure at that level of memory, which is shorter than most coda types. This motivates our search for structure within codas. However, fixed-memory Markov Chains do not allow for different configurations to have different levels of memory, which leads to variable length Markov Chains (described in the next section).
3 Variable Length Markov Chains
Some states can be predicted with more or less memory of past states than others. This observation is the base motivation for introducing variable length Markov Chains (VLMCs), which go beyond the fixed-memory limit of traditional Markov chains. Take for example a state X2 that has the same probability of occurring knowing the last two states (x0x1 ∈ χ²) or only the last state (x1 ∈ χ¹):
$$P(X_2 \mid x_0 x_1) = P(X_2 \mid x_1).$$
In this case, a shorter memory (h = 1) is sufficient and we do not need a longer one (h = 2).
In practice, VLMCs bypass the necessity of having (n − 1)n^h parameters, where n = |χ| is the number of symbols, by allowing states to have unequal lengths (memory). Smaller lengths are preferred whenever the additional memory does not significantly change the distribution of transitions to the next possible states.
3.1 Building a VLMC
For a Markov model of fixed order h, the set of possible states χh is composed of all possible sequences of length h. For a VLMC, however, states can be sequences of arbitrary length. The set of possible states is thus a subset of the set of all sequences that can be built from the alphabet χ, including the empty sequence χ0 = ∅. Let W denote this set.
In this project, because codas are typically constructed from a small number of clicks, we only consider finite length sequences (see [?] for non-finite VLMCs). In practice, we choose a maximum memory allowed D, which we set to D = 10, much larger than the typical coda length.
Fitting a VLMC is the process of finding some subset L ⊆ WD where the elements satisfy the condition: shorter states are preferred if their distribution of transition probabilities is similar to their longer length equivalents. This is generally called context tree estimation in the literature [?].
Probabilising the tree
We start with WD for some D, which we take to be equal to 10. To each element w ∈ WD we assign a probability distribution qw over the set χ, defined as the probability of observing a state x ∈ χ given a previous sequence w:
$$q_w(x) = \hat{P}(x \mid w),$$
where $\hat{P}$ denotes the maximum likelihood estimate, computed as
$$\hat{P}(x \mid w) = \frac{N(wx)}{N(w)},$$
where N(w) is the number of occurrences of the sequence w.
Pruning the tree
Given two sequences u, w ∈ WD, we say that u is a suffix of w if w = σu for some other sequence σ of length ≥ 1. If σ ∈ χ we say that u is a parent of w. That is, u is a parent of w if u is a suffix of w and w is longer by only one letter.
In an intuitive way, u is a parent of w if w “looks into the past” one step further than u. At the core of the VLMC is measuring the information gain in using the longer memory w instead of its shorter memory parent u. If this information gain is not sufficient, then we discard the longer memory w. We measure this information gain with a weighted Kullback-Leibler (KL) divergence DKL [37]:
$$H_w = N(w)\, D_{KL}(q_w \,\|\, q_u) = N(w) \sum_{x \in \chi} q_w(x) \log \frac{q_w(x)}{q_u(x)}.$$
The longer memory sequence w is kept if and only if the information gain is greater than some threshold K [38, 39]. Refer to Section 3.3 for a discussion on the value of K. The set of all sequences that respect the above threshold are called contexts, and denoted by T:
$$T = \{\, w \in W_D : H_w > K \,\}.$$
A VLMC is the Markov model with the set of states:
and with transition probabilities defined by qw for w ∈ T.
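The two steps (probabilising and pruning) can be sketched in Python. This is a simplified illustration rather than the implementation used for the analysis: each context is tested independently against its parent, whereas full context tree estimation procedures such as the one in [38] prune bottom-up and retain ancestors of kept contexts. The function name `fit_vlmc`, the toy sequence, and the default values of `max_depth` and `threshold` are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

def fit_vlmc(sequences, max_depth=3, threshold=1.0):
    """Return {context: {symbol: prob}} for contexts passing the pruning test.

    Contexts are tuples ordered oldest-to-newest; the empty tuple is the root.
    """
    # Step 1: count how often each symbol follows each context (length <= max_depth).
    counts = defaultdict(Counter)
    for seq in sequences:
        for i, symbol in enumerate(seq):
            for depth in range(min(i, max_depth) + 1):
                counts[tuple(seq[i - depth:i])][symbol] += 1

    def distribution(context):
        total = sum(counts[context].values())
        return {s: c / total for s, c in counts[context].items()}

    # Step 2: keep a context only if its gain over its parent exceeds the threshold.
    tree = {(): distribution(())}
    for context in sorted(counts, key=len):
        if not context:
            continue
        parent = context[1:]  # drop the oldest symbol
        q_w, q_u = distribution(context), distribution(parent)
        n_w = sum(counts[context].values())
        gain = n_w * sum(p * math.log(p / q_u[s]) for s, p in q_w.items())
        if gain > threshold:
            tree[context] = q_w
    return tree

# Toy sequence in which two symbols of memory make the next symbol deterministic.
toy_tree = fit_vlmc([list("AABB" * 10)])
print(sorted(toy_tree))
```

In this toy example, one symbol of memory adds almost no information over the root, two symbols make the next symbol deterministic, and a third adds nothing, so only the root and the length-two contexts survive.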
3.2 Model Selection
When modeling a process (Xi)i∈ℕ with a Markov model, the memory length h controls the trade-off between complexity and error. Higher memory values tend to result in models that generalize poorly. On the other hand, lower values of h fail to capture the patterns, resulting in a uniformly random model.
To choose an appropriate value of h, it is common to employ some statistic that measures the trade-off between precision in prediction and the number of parameters. A model with high predictability and a low number of parameters is favored. There is a wide range of metrics [?] and one of the most widely used is the AIC [?, ?]:
$$\mathrm{AIC} = 2k - 2\ln \hat{L},$$
where $\hat{L}$ is the maximum likelihood of the sequence (Xi)i∈ℕ given the Markov model Mh with memory length h and k parameters (transition probabilities). The best model is indicated by the lowest AIC.
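As an illustration of this model selection step, the sketch below fits fixed-order Markov chains by maximum likelihood and computes the standard AIC, 2k − 2 ln(L), with k = (n − 1)n^h free transition probabilities for an alphabet of n symbols. It is a minimal example under simplifying assumptions: the first h symbols of each sequence are conditioned on rather than modeled, and the function name `markov_aic` and the toy sequence are hypothetical.

```python
import math
from collections import Counter, defaultdict

def markov_aic(sequences, h, alphabet_size):
    """AIC of an order-h Markov chain fitted by maximum likelihood."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(h, len(seq)):
            counts[tuple(seq[i - h:i])][seq[i]] += 1
    # Maximized log-likelihood: sum of N(wx) * log(N(wx) / N(w)) over transitions.
    log_lik = 0.0
    for nexts in counts.values():
        total = sum(nexts.values())
        log_lik += sum(c * math.log(c / total) for c in nexts.values())
    k = (alphabet_size - 1) * alphabet_size ** h  # free transition probabilities
    return 2 * k - 2 * log_lik

toy_sequence = list("AABB" * 10)
aics = {h: markov_aic([toy_sequence], h, alphabet_size=2) for h in range(4)}
best_h = min(aics, key=aics.get)
print(aics, best_h)
```

For this toy sequence the AIC is minimized at h = 2: shorter memories underfit, while longer ones pay for extra parameters without improving the likelihood.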
3.3 Parameter sensitivity
3.3.1 Information gain threshold K
The threshold value K represents the minimum information gain necessary to increment the memory of a given context by one. This value ultimately influences the depth and shape of the VLMC model. Low thresholds result in deep trees with many parameters and are prone to overfitting, whereas large threshold values cause trees to not expand past low values of memory and potentially fail to capture statistical dependencies (Fig. S4).
The best value of K should be the one that outputs the optimal VLMC model (i.e., the one that minimizes the AIC).
However, this search is done over the entire set of possible context trees. The problem of estimating the optimal context tree is an ongoing research area although many good methods have been proposed [39].
Recall that a context is kept only if its information gain exceeds some threshold K. This information gain is an expression of differences of deviances and, as such, asymptotically follows a χ² distribution [38] with |χ| − 1 degrees of freedom. We can therefore set the threshold K to represent a quantile of a χ² distribution. We use the 0.95 quantile, meaning we keep a child only if its information gain exceeds the value of K corresponding to that quantile.
3.3.2 Temporal resolution δt
The temporal resolution δt denotes the scale at which we discretize the continuous, absolute ICI values into bins. A small value might provide differentiation between clicks, but also burdens the VLMC models by increasing the number of parameters and states.
To select the most appropriate resolution parameter, one might be tempted to compare the AIC obtained from different VLMC models extracted from data at different resolutions. However, that is not possible since models fit on different data are not really comparable. Imagine the extreme case with a time resolution so large that all clicks are mapped to the same discrete symbol: any model fit on this data would achieve an optimal AIC value.
In our case, we compare the AIC obtained from the VLMC model with the AIC obtained from a fixed length Markov model of order 0 fitted to the same aggregated data. The intuition behind this approach is that the zeroth-order Markov model represents how “easy” it is to predict the data. The best resolution is the one that maximizes the difference between the AIC of the zeroth-order Markov model and that of our fitted VLMC (Fig. S5).
3.4 Quantitative comparison of VLMCs
The KL divergence is one of the most widely used methods for measuring statistical dissimilarity between two distributions, mostly due to its connections to information theory. Being a generative model, a VLMC is not a single probability distribution but a set of distributions qw, one for each context w ∈ T.
Given two VLMC models T1 and T2 over the same alphabet χ, let q and p be their associated sets of transition distributions, respectively. We define the divergence dKL(T1, T2) between them as the average KL divergence between the associated transition distributions.
However, there may not be a one-to-one map between qw and pw. In fact, more often than not T1 and T2 have a different number of contexts. As such, for every w ∈ T1 we associate u ∈ T2 where u is the longest suffix of w that belongs to T2. This results in a dissimilarity measure that captures not just the difference in emission distribution but also the structural differences of the associated context trees:
Divergence
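Given two VLMCs (T1, (qw)w∈T1) and (T2, (pw)w∈T2) built over the finite alphabet χ, we define the distance:
$$d_{KL}(T_1, T_2) = \frac{1}{|T_1|} \sum_{w \in T_1} D_{KL}\left(q_w \,\|\, p_{\phi_2(w)}\right),$$
where ϕ2(w) denotes the longest suffix of w ∈ T1 that belongs to T2.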
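A minimal Python sketch of this comparison is given below. It assumes each VLMC is stored as a dictionary from contexts (tuples, oldest symbol first, with the empty tuple as root) to transition distributions, that the average over the contexts of T1 is unweighted, and that symbols missing from the second distribution are floored at a small `eps`. These representation choices and the function names are illustrative assumptions.

```python
import math

def longest_suffix(context, tree):
    """Longest suffix of `context` (a tuple) that is a key of `tree`."""
    for start in range(len(context) + 1):
        if context[start:] in tree:
            return context[start:]
    raise KeyError("tree has no root context ()")

def kl(q, p, eps=1e-12):
    """KL divergence D(q || p); eps is an assumed floor for symbols missing in p."""
    return sum(qx * math.log(qx / max(p.get(x, 0.0), eps))
               for x, qx in q.items() if qx > 0)

def vlmc_divergence(tree1, tree2):
    """Mean over contexts w of tree1 of KL(q_w || p_u), u = longest suffix of w in tree2."""
    total = sum(kl(q, tree2[longest_suffix(w, tree2)]) for w, q in tree1.items())
    return total / len(tree1)

# Two toy trees over the alphabet {A, B}; the first has one extra context.
tree_1 = {(): {"A": 0.5, "B": 0.5}, ("A",): {"A": 0.9, "B": 0.1}}
tree_2 = {(): {"A": 0.5, "B": 0.5}}
d_12 = vlmc_divergence(tree_1, tree_2)
d_21 = vlmc_divergence(tree_2, tree_1)
print(d_12, d_21)
```

The example makes the asymmetry visible: the extra context of the first tree contributes to d(T1, T2) but not to d(T2, T1), which is why a symmetrized version is needed for clustering.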
3.4.1 Statistical testing on distribution of the distances
We fit a VLMC model on coda samples/social units from different clans in both the Pacific and Dominica datasets. For each clan we compute the distances between all VLMC models belonging to that clan (within) and between the models of that clan and those belonging to other clans (between) (Fig. S7). For each pair (within/between), we tested whether the two distributions came from the same population. We employed both the Kolmogorov-Smirnov test and the t-test. We also measured the effect size using Cohen’s method. The resulting statistics for both datasets can be found in the tables below.
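For readers who want to reproduce this kind of comparison, the sketch below computes the two-sample Kolmogorov-Smirnov statistic and Cohen's d from two lists of distances. It is an illustration only: p-values are omitted, Cohen's d is assumed to use the pooled standard deviation, and the input values are made up for the example rather than taken from the datasets.

```python
import math

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def cohens_d(a, b):
    """Cohen's d using a pooled standard deviation (an assumed convention)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    pooled = math.sqrt(((len(a) - 1) * va + (len(b) - 1) * vb)
                       / (len(a) + len(b) - 2))
    return (ma - mb) / pooled

# Made-up within-clan and between-clan distances, fully separated.
within = [0.1, 0.2, 0.3]
between = [0.6, 0.7, 0.8]
print(ks_statistic(within, between), cohens_d(within, between))
```

A KS statistic of 1 means the two empirical distributions do not overlap at all, and a large negative d indicates that within-clan distances are much smaller than between-clan distances.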
3.4.2 Non-ID results by coda type
In this section we repeat the approach of comparing the geographical clan overlap with the VLMC distance on non-ID codas. In contrast to the main text, we segment each set of codas by coda type and report the slope, the p-value of the Pearson correlation, and the p-value of the Spearman correlation (Table S4). Just as in the main text, when segmenting by coda type we observe that the vast majority of correlations are negative, i.e., geographically overlapping clans have more similar communication. However, although most correlations are negative, only a small portion are significant. It is important to take into consideration that the amount of data used to fit each VLMC model is considerably reduced given the extra segmentation. Furthermore, coda types that were uttered by only two clans were omitted, as it is always possible to draw a line between two points and a linear analysis therefore makes little sense.
3.4.3 Stability under different resolutions
We also show that our results about the effect of sympatry on non-ID coda vocal styles hold for different values of the time resolution. That is, the preprocessing and method parameters have little to no effect on the fundamental results of our approach. In Fig. S9 we repeat the analysis of the main text: we compare the geographical overlap with the distance between the VLMCs of the Pacific clans on both non-ID and ID codas.
We observe that our results hold regardless of the time resolution used in the method (for discretizing the continuous ICIs into discrete ICIs). That is, there is never a significant correlation between clan overlap and VLMC distance for ID codas, and there is always a negative correlation between clan overlap and VLMC distance for non-ID codas.
3.5 Dependence on Coda Type
In this section we provide results highlighting the lack of correlation between VLMC (subcoda tree) similarity and coda type distribution. First, rhythmic variations in how each coda type is constructed are present and are indicative of the clan (Fig. S10 and Fig. S11). For example, the way clan EC1 vocalizes codas of type 8R is significantly different from the way clan EC2 does (Fig. S11).
In fact, one can fit a VLMC on each coda type and compare these VLMCs (segmented by coda type and clan) both between elements of the same clan and between elements of different clans. We observe a statistically significant difference between the distances of VLMCs from different clans and those of VLMCs from the same clan (Fig. S10). This indicates that whales vocalize different coda types in a clan-distinctive manner, which also points to an independence between vocal style and coda type distribution.
3.5.1 Comparing Dendrograms
Using the distance between the VLMC trees it is possible to create a hierarchical plot of the coda samples. One can find it beneficial to compare our resulting hierarchical plot with the clan labels from [11] where the authors group the whales by coda type usage and divide the Pacific clans into the aforementioned 7 clans. However, comparing a dendrogram with a realized set of labels is not trivial. On the other hand, an effective comparison of two sets of labels can be achieved using the Adjusted Rand Score, or other entropy based metrics. The Adjusted Rand Score has a value of 0.0 for random labeling (independent of number of clusters) and 1.0 for clusters that match perfectly. The lowest possible score is −0.5 for exceptionally disparate clusterings.
To obtain two sets of labels, we progressively cut the dendrogram obtained from the VLMCs and compared the resulting labels with the clan labels from [11]. At each cut we calculated the Adjusted Rand Score (results in Fig. S12). We observed a maximum value of 0.5. This not only reiterates the concordance with the pre-existing vocal usage clans but also emphasizes that vocal style is capturing new information at a different, lower, scale.
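For reference, the Adjusted Rand Score between two labelings can be computed from pair counts alone. The sketch below is a self-contained Python illustration of the standard formula; it does not reproduce the dendrogram-cutting step, and the label lists are toy examples rather than the clan labels from [11].

```python
from collections import Counter
from math import comb

def adjusted_rand_score(labels_a, labels_b):
    """Adjusted Rand Score between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two items")
    # Pairs of items placed together in both labelings, in A only, and in B only.
    pairs_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    pairs_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = pairs_a * pairs_b / comb(n, 2)
    max_index = (pairs_a + pairs_b) / 2
    if max_index == expected:  # degenerate case, e.g. one cluster in both labelings
        return 1.0
    return (pairs_both - expected) / (max_index - expected)

same_partition = adjusted_rand_score([0, 0, 1, 1], [1, 1, 0, 0])
discordant = adjusted_rand_score([0, 0, 1, 1], [0, 1, 0, 1])
print(same_partition, discordant)
```

The first call shows that the score depends only on the partition and not on the label names; the second reaches the lower bound of −0.5 mentioned above.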
3.6 Confidence Interval on relation between non-ID coda style and clan overlap
A confidence interval for the slope of the fits in Figure 4 can be obtained by subsampling the data (1000 times) and running the same linear regression analysis on the subsampled data. From the resulting distribution of regression slopes we observe that the 95% confidence interval for the non-ID scenario contains only negative slope values, while in the ID case the confidence interval contains both negative and positive values (Fig. S13).
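The resampling procedure can be sketched as follows. This is a hedged illustration: it assumes resampling of (overlap, distance) pairs with replacement and a percentile interval, which may differ in detail from the subsampling scheme actually used. The function names and the toy data are hypothetical.

```python
import random

def ols_slope(xs, ys):
    """Slope of the ordinary least squares line through (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values are all identical")
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx

def bootstrap_slope_ci(xs, ys, n_resamples=1000, level=0.95, seed=0):
    """Percentile confidence interval for the OLS slope from resampled pairs."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    while len(slopes) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            slopes.append(ols_slope([xs[i] for i in idx], [ys[i] for i in idx]))
        except ValueError:
            continue  # skip degenerate resamples containing a single x value
    slopes.sort()
    lower = slopes[int((1 - level) / 2 * n_resamples)]
    upper = slopes[int((1 + level) / 2 * n_resamples) - 1]
    return lower, upper

# Toy data with a clear negative trend and small alternating noise.
overlap = [i / 10 for i in range(11)]
distance = [1.0 - 0.5 * x + 0.01 * (-1) ** i for i, x in enumerate(overlap)]
ci_low, ci_high = bootstrap_slope_ci(overlap, distance)
print(ci_low, ci_high)
```

An interval that lies entirely below zero, as in this toy case, corresponds to the situation reported for non-ID codas; an interval straddling zero corresponds to the ID case.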
4 Classification of synthetic codas
The fixed-length Markov chains described above lack flexibility: a model with large memory h captures longer dependencies but requires estimating an ever-increasing number of parameters. For this reason, we then used VLMCs, which combine the best of both worlds by determining the optimal memory needed for each transition individually. Essentially, we keep a transition probability with a longer memory P(Xi|xjxk) only if it changes the distribution sufficiently compared to a short memory one P(Xi|xj).
A VLMC can be naturally visualized as a tree, where the concept of order arises from the fact that shorter memory contexts are subsequences of longer ones. In Fig. S14A and B, we show examples of two VLMCs computed from data from a single sperm whale each. The visual structure of these trees can be related to the actual information-theoretic structure of these sperm whales’ communication. Indeed, the root node is represented in orange, and nodes that are depth h in the tree (that is h edges away from the root node) represent context (or sequences) of memory h. To verify that the structure we observe actually contains information, we need to compare it to the structure of a null model. To do this, we took the same ICI time series used to build the tree from Fig. S14A, and randomly shuffled its ICIs. This way, all temporal information is lost. This results in a tree that has no structure, as shown in Fig. S14C. The VLMC-indicated structure can thus be interpreted as coming from the sperm whale communication.
Having confirmed that our VLMCs capture some communication structure, we ask: What and how much structure does it capture? To answer this, we took advantage of two facts. First, the VLMCs can be used to generate new codas by generating sequences of states that correspond to ICIs. Indeed, like for any Markov model, we can start from the empty sequence, and start adding suffixes with probabilities defined by the model (see Methods). In other words, we can generate synthetic data. Second, in [12] the authors present an LSTM-based classifier capable of assigning a coda to a specific clan with over 90% accuracy. We trained it on the original ICI data used to build our trees, achieving similar accuracy as shown by the black curve in Fig. S15. To verify how much information our VLMCs capture, we used that trained classifier on the synthetic codas generated with our trees. Remarkably, it classified the generated data with between 70 and 80% accuracy, depending on the temporal resolution δt (blue in Fig. S15). The fairly small difference in accuracy between the real and synthetic data indicates that a large part of the communication structure captured by the classifier in the real data is also captured by our VLMC models.
References
- [1] Culture in whales and dolphins. Behav. Brain Sci 24:309–324
- [2] The extension of biology through culture. Proc. Natl. Acad. Sci 114:7775–7781
- [3] The cultural lives of whales and dolphins. University of Chicago Press
- [4] The cultural transmission of bird song. Trends Ecol. Evol 1:94–97
- [5] Primate archaeology reveals cultural transmission in wild chimpanzees (Pan troglodytes verus). Philos. Trans. R. Soc. B 370
- [6] Associative mechanisms allow for social learning and cultural transmission of string pulling in an insect. PLoS Biol 14
- [7] Cultural transmission and evolution: A quantitative approach. Princeton University Press
- [8] Introduction. Ethnic Groups and Boundaries. Little Brow
- [9] Shared norms and the evolution of ethnic markers. Curr. Anthropol 44:122–130
- [10] The evolution of ethnic markers. Cult Anthropol 2:65–79
- [11] Evidence from sperm whale clans of symbolic marking in non-human cultures. Proc. Natl. Acad. Sci 119
- [12] Deep Machine Learning Techniques for the Detection and Classification of Sperm Whale Bioacoustics. Sci. Rep 9
- [13] Sperm whale, the largest toothed creature on earth. Ethology and behavioral ecology odontocetes, Springer: 261–280
- [14] Sperm whale codas. J. Acoust. Soc. Am 62
- [15] Using identity calls to detect structure in acoustic datasets. Methods Ecol. Evol 12:1668–1678
- [16] Coda communication by sperm whales (Physeter macrocephalus) off the Galapagos Islands. Can. J. Zool 71:744–752
- [17] Individual vocal production in a sperm whale (Physeter macrocephalus) social unit. Mar. Mammal Sci 27:149–166
- [18] Individual, unit and vocal clan level identity cues in sperm whale codas. R. Soc. Open Sci 3
- [19] Spatial and temporal variation in sperm whale coda vocalizations: stable usage and local dialects. Anim. Behav 70:191–198
- [20] Vocal clans in sperm whales (Physeter macrocephalus). Proc. Royal Soc. B 270:225–231
- [21] Socially segregated, sympatric sperm whale clans in the Atlantic Ocean. R. Soc. Open Sci 3
- [22] Multilevel animal societies can emerge from cultural transmission. Nat. Commun 6
- [23] Physeter clicks. Whales, dolphins, and porpoises: 510–527
- [24] Ocean nomads or island specialists? Culturally driven habitat partitioning contrasts in scale between geographically isolated sperm whale populations. R. Soc. Open Sci 9
- [25] The evolution of tag-based cooperation in humans: The case for accent. Curr. Anthropol 53:588–616
- [26] The origins and psychology of human cooperation. Annu. Rev. Psychol 72:207–240
- [27] Sperm whale codas may encode individuality as well as clan identity. J. Acoust. Soc. Am 139:2860–2869
- [28] Geographic variation in the acoustic traits of greater horseshoe bats: testing the importance of drift and ecological selection in evolutionary processes. PLoS One 8
- [29] Can genetic differences explain vocal dialect variation in sperm whales, Physeter macrocephalus? Behav. Genet 42:332–343
- [30] What influences the worldwide genetic structure of sperm whales (Physeter macrocephalus)? Mol. Ecol 25:2754–2772
- [31] Sex-biased dispersal in sperm whales: contrasting mitochondrial and nuclear genetic structure of global populations. Proceedings of the Royal Society of London. Series B: Biological Sciences 266:347–354
- [32] Allopatric humpback whales of differing generations share call types between foraging and wintering grounds. Sci. Rep 11
- [33] Male sperm whale (Physeter macrocephalus) acoustics in a high-latitude habitat: implications for echolocation and communication. Behav. Ecol. Sociobiol 53:31–41
- [34] Dialect change in resident killer whales: implications for vocal learning and cultural transmission. Anim. Behav 60:629–638
- [35] Towards understanding the communication in sperm whales. iScience 25
- [36] Male sperm whale (Physeter macrocephalus) coda production and coda-type usage depend on the presence of conspecifics and the behavioural context. Can. J. Zool 86:62–75
- [37] On information and sufficiency. Ann. Math. Stat 22:79–86
- [38] Variable length Markov chains: methodology, computing, and software. J. Comput. Graph. Stat 13:435–455
- [39] Context tree selection and linguistic rhythm retrieval from written texts. Ann. Appl. Stat 6:186–209
- [40] Analysis of animal movement using opportunistic individual identifications: application to sperm whales. Ecology 82:1417–1432
- [41] Movements of sperm whales in the tropical Pacific. Marine Ecology Progress Series 361:291–300
Copyright
© 2024, Leitao et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.