Characteristics of vocalizations emitted by Wistar rats during delay fear conditioning with ten aversive foot-shocks.

A – some rats produced aversive 22-kHz vocalizations with typical features (i.e. constant frequency of <32 kHz and duration of >300 ms; both values marked as dotted lines); example emission from one rat. B – some rats produced 44-kHz vocalizations with constant frequency of >32 kHz and long duration (>150 ms); example emission from one rat. C – rats which emitted aversive vocalizations during the fear session produced 50-kHz vocalizations during the appetitive playback session the following day (full data published in ref. 1); representative data from the same rat as in A. D – the onset of long 22-kHz alarm calls typically occurred after the first shock stimulus (vertical dotted lines mark the times of shock deliveries in DE); note the gradual rise in peak frequency, not exceeding 32 kHz (horizontal dotted line in DE); data from the same rat as in AC. E – in rats that emitted 44-kHz calls, the onset was usually delayed until after several foot-shocks; note the gradual rise in peak frequency of both long 22-kHz and 44-kHz vocalizations throughout training (comp. Fig. S1DE); data from the same rat as in B. F – the call rate of long 22-kHz calls was higher than that of 44-kHz calls (*p < 0.05, **p < 0.01, ***p < 0.001; Mann-Whitney) and followed a different time-course: the maximum number of 22-kHz calls occurred at ITI-3 (higher than at ITI-1, 2, 5-10; p levels from <0.0001 to 0.0265), whereas the number of 44-kHz calls was higher at ITI-5-10 (5.8 ± 2.9) than at ITI-1-4 (0.5 ± 0.2; p < 0.0001; all Wilcoxon); the numbers of ITI (inter-trial intervals) correspond to the numbers of previous foot-shocks; values are means ± SEM. G – long 44-kHz vocalizations had a higher incidence rate (14.1%) than short 22-kHz (8.9%) and 50-kHz calls (7.9%); values are calculated for the sum of all vocalizations obtained during entire training sessions (there were fewer 50-kHz calls, i.e. 4.9%, when vocalizations prior to the first shock were not included). A-E: dots reflect specified single rat values. FG: n = 29; other results from these rats have been published previously (refs. 1, 2).
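
The thresholds named in this legend (32 kHz for peak frequency, 300 ms for long 22-kHz calls, 150 ms for 44-kHz calls) can be read as a simple decision rule. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the constant-frequency criterion is ignored, every call at or above 32 kHz that is not longer than 150 ms is treated as a 50-kHz call, and the handling of values lying exactly on a threshold is a guess. It is not the authors' classification procedure.

```python
from collections import Counter

FREQ_THRESHOLD_KHZ = 32.0  # boundary between 22-kHz and higher-frequency calls (legend)
LONG_22_MIN_MS = 300.0     # long 22-kHz calls last longer than 300 ms (legend)
LONG_44_MIN_MS = 150.0     # 44-kHz calls last longer than 150 ms (legend)


def classify_call(peak_khz, duration_ms):
    """Assign a call type from peak frequency and duration only (assumed rule)."""
    if peak_khz < FREQ_THRESHOLD_KHZ:
        return "long 22-kHz" if duration_ms > LONG_22_MIN_MS else "short 22-kHz"
    # Assumption: high-frequency calls that are not long are counted as 50-kHz calls.
    return "44-kHz" if duration_ms > LONG_44_MIN_MS else "50-kHz"


def incidence_rates(calls):
    """Return the percentage share of each call type among (peak_khz, duration_ms) pairs."""
    counts = Counter(classify_call(peak, dur) for peak, dur in calls)
    total = sum(counts.values())
    return {call_type: 100.0 * n / total for call_type, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical calls; the durations are invented for illustration.
    example = [(24.4, 800.0), (24.4, 100.0), (42.4, 500.0), (55.0, 30.0)]
    for call_type, share in incidence_rates(example).items():
        print(f"{call_type}: {share:.1f}%")
```

Keeping the thresholds as named constants makes it easy to test how sensitive the incidence rates in panel G would be to a different duration cut-off.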

Five subtypes (B-F) of high frequency 44-kHz aversive vocalizations.

A – standard aversive 22-kHz vocalization with peak frequency <32 kHz (peak frequency = 24.4 kHz). 44-kHz aversive vocalization subtypes: B – flat (constant frequency call; peak frequency = 42.4 kHz), C – step up (peak frequency = 39.5 kHz), D – step down (peak frequency = 52.2 kHz), E – insert (peak frequency = 38.5 kHz), F – complex (peak frequency = 46.3 kHz). G – percentage share of 44-kHz call-subtypes in all cases of detected 44-kHz vocalizations.

Clustering of ultrasonic vocalizations from fear conditioning sessions using two independent methods.

A – DBSCAN algorithm (ε = 0.14) clustering of vocalizations from all fear conditioning experiments (Wistar rats, n = 138; SHR, n = 80); silhouette coefficient = 0.198. Two clusters emerged: a cluster of green dots, n = 77,243 (average peak frequency and duration deemed redundant owing to the high generality of this cluster), and a cluster of red dots, n = 5,646 (average peak frequency = 43,826.6 Hz, average duration = 0.524 s). Some calls were not assigned to any cluster, i.e. outlier vocalizations (black dots, n = 4,139). BC – clustering by k-means algorithm and visualization of calls emitted by rats (n = 26) during trace and delay fear conditioning training; total number of calls n = 44,859; see also Fig. S3. B – topological plot of ultrasonic calls using UMAP embedding; particular agglomerations of calls are labeled with their type or subtype. C – spectrogram images from DeepSqueak software superposed over plot B; colors denote clusters from unsupervised clustering, with the number of clusters set using elbow optimization (max number = 4); two clusters emerged.
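
For readers unfamiliar with the silhouette coefficient quoted in panel A, the sketch below computes it from scratch for a labeled set of points. It is a minimal illustration with assumptions: the function name is hypothetical, Euclidean distance is used, and points carrying the noise label are excluded from the score. The legend does not state how outliers or feature scaling were handled in the original analysis.

```python
import math
from collections import defaultdict


def silhouette_coefficient(points, labels, noise_label=-1):
    """Mean silhouette over all clustered points (noise points skipped, by assumption)."""
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        if label != noise_label:
            clusters[label].append(point)
    if len(clusters) < 2:
        raise ValueError("the silhouette coefficient needs at least two clusters")

    scores = []
    for label, members in clusters.items():
        for point in members:
            if len(members) == 1:
                scores.append(0.0)  # convention for single-point clusters
                continue
            # a: mean distance to the other members of the same cluster
            a = sum(math.dist(point, q) for q in members) / (len(members) - 1)
            # b: smallest mean distance to the members of any other cluster
            b = min(
                sum(math.dist(point, q) for q in other) / len(other)
                for other_label, other in clusters.items()
                if other_label != label
            )
            denom = max(a, b)
            scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical two-feature points (e.g. scaled peak frequency and duration).
    pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0), (5.0, 5.0)]
    labs = [0, 0, 1, 1, -1]  # -1 marks an unassigned (outlier) point
    print(round(silhouette_coefficient(pts, labs), 3))
```

Values near 1 indicate compact, well-separated clusters, while values near 0 indicate overlapping ones, so a coefficient of 0.198 suggests clusters that are only weakly separated.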

Physiological and behavioral response to playback of 44-kHz calls (vs. 50-kHz and 22-kHz calls) presented from a speaker to naïve Wistar rats.

A – heart rate (HR); B – the number of emitted vocalizations. AB – gray sections correspond to the 10-s-long ultrasonic playback. Each point is a mean for a 10-s-long time-interval with SEM. CD – properties of 50-kHz vocalizations emitted in response to ultrasonic playback, i.e. number of calls (C) and duration (D) calculated from the 0-120 s range. A – 50-kHz playback resulted in an HR increase (playback time-interval vs. 10-30 s time-interval, p = 0.0007), while the presentation of the aversive playbacks resulted in an HR decrease, both in the case of 22-kHz (p < 0.0001) and 44-kHz playback (p = 0.0014; average from -30 to -10 s time-intervals, i.e. “before”, vs. playback interval; all Wilcoxon). This resulted in different HR values following different playbacks, especially at the +10 s (p = 0.0097 for 50-kHz vs. 22-kHz playback; p = 0.0275 for 50-kHz vs. 44-kHz playback) and +20 s time-intervals (p = 0.0068 and p = 0.0097, respectively; all Mann-Whitney). B – 50-kHz playback also resulted in a rise in evoked vocalizations (before vs. 10-30 s time-interval, p = 0.0002, Wilcoxon), as was the case with 44-kHz playback (p = 0.0176 in the respective comparison), while no rise was observed following 22-kHz playback (p = 0.1777). However, since the increase in vocalization was robust in the case of 50-kHz playback, the number of emitted vocalizations was higher than after 22-kHz playback (e.g. p < 0.0001 during the 0-30 s time-intervals) as well as after 44-kHz playback (e.g. p < 0.0001 during the 0-10 s time-intervals, both Mann-Whitney). Finally, when the increases in the number of emitted ultrasonic calls in comparison with the “before” intervals were analyzed, there was a difference following 44-kHz vs. 22-kHz playbacks during the 30 s and 40 s time-intervals (p = 0.0420 and 0.0430, respectively, Wilcoxon). C – During the 2 min following the onset of the playbacks, rats emitted more ultrasonic calls during and after 50-kHz playback in comparison with 22-kHz (p < 0.0001) and 44-kHz (p = 0.0011) playbacks.
The difference between the effects of 22-kHz and 44-kHz playbacks was not significant (p = 0.2725, comp. Fig. S4F; all Mann-Whitney). D – Ultrasonic 50-kHz calls emitted in response to playback differed in their duration, i.e. they were longer to 50-kHz (p = 0.0004) and 44-kHz (p = 0.0273, both Mann-Whitney) playbacks than to 22-kHz playback. * 50-kHz vs. 44-kHz, $ 50-kHz vs. 22-kHz, # 22-kHz vs. 44-kHz; one character (*, $ or #), p < 0.05; two, p < 0.01; three, p < 0.001; Mann-Whitney (AB) or Wilcoxon (CD). Values are means ± SEM, n = 13-16.
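
The marker scheme at the end of this legend (one character for p < 0.05, two for p < 0.01, three for p < 0.001, with a different symbol for each comparison) can be expressed as a small lookup. The sketch below is only an illustration; the function and dictionary names are hypothetical, and the p values in the example are those quoted in the legend for panels C and D.

```python
# Symbols assigned to each pairwise comparison, as listed in the legend.
COMPARISON_SYMBOLS = {
    ("50-kHz", "44-kHz"): "*",
    ("50-kHz", "22-kHz"): "$",
    ("22-kHz", "44-kHz"): "#",
}


def significance_marker(p_value, symbol):
    """Repeat the symbol one to three times by p-value tier; empty if p >= 0.05."""
    if p_value < 0.001:
        return symbol * 3
    if p_value < 0.01:
        return symbol * 2
    if p_value < 0.05:
        return symbol
    return ""


if __name__ == "__main__":
    # Panel C: 50-kHz vs. 44-kHz playback, p = 0.0011
    print(significance_marker(0.0011, COMPARISON_SYMBOLS[("50-kHz", "44-kHz")]))  # **
    # Panel D: 50-kHz vs. 22-kHz playback, p = 0.0004
    print(significance_marker(0.0004, COMPARISON_SYMBOLS[("50-kHz", "22-kHz")]))  # $$$
    # Panel D: 22-kHz vs. 44-kHz playback, p = 0.0273
    print(significance_marker(0.0273, COMPARISON_SYMBOLS[("22-kHz", "44-kHz")]))  # #
```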

Variations of call frequency shown in relation to call duration (ABC) and over ten subsequent aversive trials (DE) in Wistar rats.

ABC – Vocalizations plotted in relation to peak frequency (x axis) and duration (y axis). Each point corresponds to one vocalization. The vertical dotted line marks the threshold value (32 kHz) between 22-kHz and 50-kHz calls. The horizontal dotted line marks the threshold value (300 ms) between short and long 22-kHz calls (ref. 3). The rat identifier is given in the lower right corner; the number after the dash indicates the number of conditioning trials. A – examples from four rats which emitted typical long 22-kHz calls (no 44-kHz calls). B – examples from four rats which emitted typical long 22-kHz vocalizations, with a few long 22-kHz calls crossing the 32 kHz threshold. C – eight sample rats which emitted typical long 22-kHz vocalizations and atypical high-frequency aversive calls forming a distinct 44-kHz group. DE – frequencies of 22-kHz and 44-kHz vocalizations in Wistar rats over ten aversive trials: only rats that emitted 44-kHz calls in at least seven ITI are plotted (D); all Wistar rats which received 10 conditioning trials (E). The horizontal dotted line marks the threshold value (32 kHz) between 22-kHz and 50-kHz calls. The numbers of inter-trial intervals (ITI) correspond to the numbers of the previous stimuli. D – peak frequency in subsequent ITI rose gradually (for long 22-kHz calls: p < 0.0001, Friedman, p = 0.0039, Wilcoxon; for 44-kHz calls: p = 0.0155, Friedman, p = 0.0977, Wilcoxon). E – peak frequency of long 22-kHz calls in subsequent ITI rose gradually (p < 0.0001, Friedman; p = 0.0005, Wilcoxon); this could not be determined for 44-kHz calls due to the low n. Values are means ± SEM; D: n = 9, E: n = 46.
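
The "means ± SEM" values plotted per ITI can be reproduced from per-rat peak frequencies as sketched below. This is a minimal illustration with assumptions: the function name is hypothetical, the input values are invented, and the SEM is taken as the sample standard deviation (n - 1 denominator) divided by the square root of n, which the legend does not specify.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, SEM), with SEM = sample SD / sqrt(n); needs at least two values."""
    n = len(values)
    if n < 2:
        raise ValueError("SEM requires at least two values")
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem


if __name__ == "__main__":
    # Hypothetical per-rat peak frequencies (kHz) of long 22-kHz calls, keyed by ITI number.
    peak_by_iti = {
        1: [23.1, 24.0, 22.8],
        5: [25.2, 26.1, 24.9],
        10: [27.0, 28.3, 26.4],
    }
    for iti, peaks in peak_by_iti.items():
        mean, sem = mean_sem(peaks)
        print(f"ITI-{iti}: {mean:.1f} ± {sem:.1f} kHz")
```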

Non-typical 44-kHz aversive vocalizations.

A, B – constant-frequency calls with very high peak frequency (A, peak frequency = 62.9 kHz; B, peak frequency = 65.9 kHz, start peak frequency = 78.1 kHz). C, D – harmonic aversive vocalizations, in which the element at the fundamental frequency (F0, the lowest frequency of the vocalization) does not have the maximum amplitude, i.e. the peak frequency is determined from the higher call component (C, F0 = 27.8 kHz, peak frequency = 55.6 kHz; D, F0 = 40 kHz, peak frequency = 81.5 kHz). E, F – vocalizations of prominent duration but with modulated frequency (E, peak frequency = 69.3 kHz; F, peak frequency = 39.0 kHz). A, G – constant-frequency calls from SHR (G, flat 44-kHz call, peak frequency = 42.4 kHz).

Clustering of ultrasonic vocalizations from rats emitting 44-kHz calls using UMAP projection and k-means.

A – topological plot of ultrasonic calls using UMAP embedding, from rats emitting 44-kHz vocalizations during trace and delay fear conditioning training (n = 26); total number of calls n = 40,084; spectrogram miniatures point to the general location from which they originated. B – comparison of unsupervised k-means clustering with different maximum possible numbers of clusters using elbow optimization (different clusters denoted by colors), done by DeepSqueak software and superposed over the UMAP topological plot. The number at the bottom left of each miniature denotes the maximum possible number of clusters set for elbow optimization; the number at the bottom right denotes the resulting number of clusters after elbow optimization.

Behavioral response to playback of 44-kHz calls (vs. 50-kHz and 22-kHz calls).

AB – rats with implanted heart-rate transmitters (comp. Fig. 4), Wistar, n = 13-16; C-G – rats without transmitters, Sprague-Dawley, n = 15; AC – distance traveled; BD – time spent in the speaker’s half of the cage; the dotted horizontal line marks the 50% chance value for time in a side of the cage; E – number of emitted vocalizations; A-E – gray sections correspond to the 10-s-long ultrasonic presentation, each point is a mean for a 10-s-long time-interval with SEM. FG – properties of 50-kHz vocalizations emitted in response to ultrasonic playback, i.e. number of calls (F) and duration (G) in the 0-120 s range. A-D – playback presentation resulted in increased motor activity, especially in the case of 50-kHz and 44-kHz playback. Also, all kinds of playback resulted in increased time spent in the half of the cage next to the speaker. E – 50-kHz playback resulted in a rise in the number of evoked vocalizations (average from -30 to -10 s time-intervals, i.e. “before”, vs. 10-30 s time-interval, p = 0.0010), as was the case with 44-kHz playback (p = 0.0142), while no rise was observed following 22-kHz playback (p = 0.2271, all Wilcoxon). However, since the increase in vocalization was robust in the case of 50-kHz playback, the number of emitted vocalizations was higher than both after 22-kHz playback (e.g. p < 0.01 during the 0-20 s time-intervals) and after 44-kHz playback (p = 0.0172, 0 s time-interval, all Mann-Whitney). Finally, when the increases in the number of emitted ultrasonic calls in comparison with the “before” intervals were analyzed, there was a difference following 44-kHz vs. 22-kHz playbacks during the 40 s time-interval (p = 0.0017, Wilcoxon, comp. Fig. 4B).
F – During the 2 min following the onset of the playbacks, the rats emitted more ultrasonic calls during and after 50-kHz playback in comparison with 22-kHz (p = 0.0002) and 44-kHz (p = 0.0067) playbacks; also, the rats emitted more ultrasonic calls during and after 44-kHz playback in comparison with 22-kHz playback (p = 0.0369; comp. Fig. 4C; all Wilcoxon). G – Ultrasonic 50-kHz calls emitted in response also differed in their duration, i.e. they were shorter in response to 22-kHz (p = 0.0195) and 44-kHz (p = 0.0039) playbacks than to 50-kHz playback. The difference between the effects of 22-kHz and 44-kHz playbacks was not significant (p = 0.5469, comp. Fig. 4D; all Wilcoxon). * 50-kHz vs. 44-kHz, $ 50-kHz vs. 22-kHz, # 22-kHz vs. 44-kHz; one character (*, $ or #), p < 0.05; two, p < 0.01; three, p < 0.001; Mann-Whitney (AB) or Wilcoxon (CD). Values are means ± SEM.