Auditory tasks such as understanding speech and listening or dancing to music involve continuous tracking of events in time, and rely on our ability to allocate and adjust attention to the rhythmic cues within complex auditory signals. However, listeners’ tracking of and attention to rhythmic cues can fail when the signal is temporally disorganized1, or when we get older2. These failures of attention result in, for example, reduced speech comprehension2 as well as diminished ability to solve the “cocktail party problem”3. However, speech perception4 and production of musical sequences are improved when stimuli are presented at specific rates5, 6, indicating that these abilities might be “restored” under certain conditions. Here, we aimed to understand factors that facilitate and impede auditory rhythm processing from two different perspectives: those factors that arise from properties of the external world (the stimulus) and those that stem from individual differences (the perceiver). Specifically, we tested how the rate of the stimulus and the rhythmic context in which a stimulus was presented affected its perception and production, and how temporal adaptation abilities change with advancing age. We found (1) a range of rates specific to each individual that yielded best performance, and (2) deteriorating performance under the challenge of switching between stimulus rates that was further amplified by age.

Two main theoretical approaches explain how we perceive time and rhythm. A timekeeper account proposes that the duration between two events is represented by the count of accumulated pulses that are generated by an internal pacemaker7. An oscillator account proposes that biological systems possess internal oscillations, i.e., rhythms, that adjust their phase and period to the temporal regularities in the external signal. The synchronization between the internal and external rhythms, termed entrainment, is the underlying mechanism for the perception of time and rhythm.

Here, we adopted the entrainment perspective, and focused on two key properties of internal oscillators: their preferred rate and their flexibility. Preferred rate is defined as the default rate of an oscillator8 or group of nested oscillators9 in the absence of any input. Preferred rate is also referred to as the natural frequency or eigenfrequency in different literatures. Oscillators accomplish synchronization to periodicities in the external signal better when the signal’s rate is similar to the oscillator’s preferred rate (or harmonics of the preferred rate10) than when it is dissimilar11. The range of rates around the oscillator’s preferred rate within which synchronization is possible is referred to as the entrainment region12. Theoretically, knowing the preferred rate of an individual’s internal oscillator would allow us to predict the rates at which they would most successfully interact in a real-world listening situation.

One common method to estimate preferred rate is the spontaneous motor tapping (SMT) task, where participants are asked to tap their finger12–14 or a drumstick8 on a desk or a sensor at a “comfortable rate”. The preferred rate estimate, spontaneous motor tempo, is the mean or median of the intervals between the individual taps. SMT estimates tend to cluster around 500-600 ms for adults12. One potential shortcoming of using SMT as a direct measure of an internal oscillator’s preferred rate is that SMT is a “preference” measure, meaning that it is generated and measured in the absence of any interaction with the environment. Although this is indeed the definition of preferred rate, a stronger test of the degree to which SMT reflects the preferred rate of an internal oscillator would be to observe successful synchronization within – but not outside of – an entrainment region. SMT does predict timing preference and performance in other tasks: participants tend to prefer stimulus rates (i.e., preferred perceptual tempo, PPT12) closer to their SMT12, drift back to their SMT during continuation tapping in synchronization-continuation paradigms5, 15, and over- and underproduce stimuli that are faster and slower than their SMT, respectively5, 6. However, in paradigms that involve comparison of individuals’ rate preferences12 and tapping performance5, 16 across stimulus rates, stimulus conditions are tailored to individuals’ SMT and are low in number. This results in a resolution that is too poor to observe an entrainment region, and often confounds SMT with the global mean stimulus rate in the experiment17. We have previously proposed a synchronization-continuation paradigm where individuals’ tapping behavior on a finely-sampled broad range of stimulus rates was assessed. We estimated preferred rate as the stimulus rate with minimum tapping errors during continuation tapping18.
However, estimating preferred rates based on a tapping paradigm cannot disentangle preferred rates of an auditory oscillator, a motor oscillator, or a coupled oscillatory system whose preferred rate would be influenced by the preferences and coupling strengths of its components1. Thus, here we applied the fine rate sampling to a perceptual paradigm (Experiment 1), estimated preferred rates in perceptual and motor versions of the paradigm with same stimulus rate conditions (Experiment 2), and compared the estimates to individuals’ SMT and PPT (Experiment 2).

We define flexibility as the internal oscillator’s ability to adapt to rate changes in the external sound signal18. The logic is as follows: upon encountering a new rate, the oscillator gradually updates its phase and period to each upcoming interval. From a dynamical systems perspective, flexibility can be conceptualized as a complement to “stiffness”, and might be quantified based on the presence of hysteresis, which refers to a system’s tendency to stay in a previous state despite changes in stimulus parameters19. An inflexible oscillator would exhibit hysteresis and continue to respond in a way that reflects the properties of previously entrained stimuli. A fully flexible oscillator would not exhibit hysteresis as it would completely update its phase and period to the new stimulus, resulting in no discrepancy between the current stimulus and its internal representation. Thus, the extent to which timing performance would be affected by stimulus history is inversely related to the underlying oscillator’s flexibility.

Prior research reveals effects of preceding context (also referred to as serial dependence20, 21 and carryover effects22) on timing behavior in tasks with and without a motor synchronization component. Within individual trials of synchronized tapping paradigms, changes in stimulus rate (period perturbation), and stimulus onset times (phase perturbation) result in increased asynchronies between stimulus and tap onsets, more so for phase than period perturbations23, 24, and more so for sequences that speed up than those that slow down16, 24. Across trials, tapping rate in each trial is biased towards the previous trial’s stimulus rate18, 21. Temporal judgments in the absence of motor synchronization are also affected by the properties of stimuli presented in the preceding trial22, 25, 26 and throughout the experiment25, 27. This suggests that temporal judgments are affected by both local and global temporal context. The majority of studies that have revealed individual differences in proneness to history effects20, 28 have not aimed to explicitly estimate the extent and source of these individual differences, or have done so in shorter temporal contexts, using different operational definitions of flexibility than the one used here6. Finally, similar to methods proposed to estimate preferred rate5, 6, 12, 15, 18, 29, previous attempts to measure flexibility6, 16, 18 involved motor responses. Thus, we presented the same stimulus history to participants in two tasks, one with and one without the motor demands of synchronize-continue tapping. This design allowed us to assess the effects of the same predictor (trial-to-trial rate change) on performance in different tasks, and thereby to perform systematic comparisons of oscillator flexibility across perceptual and motor domains.

From the perceiver’s side, we chose to focus on how properties of internal oscillators change with advancing age. Studies assessing aging in rhythmic motor tasks reveal slowing of SMT12, 30, as well as the fastest rate that an individual can produce31. (See Ref. 32 for a review of aging in sensorimotor tasks.) For tasks without a rhythmic motor component, older individuals perform worse than younger individuals in temporal-order judgments33, discrimination and reproduction of time intervals34, and gap detection35. Older adults also tend to prefer slower stimulus rates12, which manifests in a breakdown in understanding fast speech. These results suggest a decline in timing abilities in aging, as well as an overall slowing of the rates that older adults’ internal oscillators can synchronize with. Here, we hypothesized a reduction in oscillator flexibility with age. Neural entrainment to external auditory signals is aberrant36–38, and less responsive to top-down attention in older than younger adults39. Moreover, older adults exhibit reduced neural adaptation40 and sensory gating41, suggesting an age-related decline in neural inhibition40 that leads to a reduced capacity of the auditory system to adapt based on context. Thus, we hypothesized that older adults would exhibit stronger hysteresis than younger adults.

The aim of the current study was to estimate individuals’ preferred rate and flexibility in rhythmic tasks with and without a motor synchronization component, and in both preference and performance contexts: here, preference refers to SMT and PPT, whereas performance refers to tasks that require listeners to either synchronize with or make a perceptual judgment about rhythmic stimuli. Moreover, we aimed to assess how internal oscillator properties, specifically oscillator flexibility, change with advancing age.

We conducted two experiments. The main goal of Experiment 1 was to develop methods to estimate preferred rate and flexibility in a paradigm without a motor synchronization component, as a complement to our recent tapping study18. The task was a duration discrimination paradigm where participants compared the duration of a single comparison interval to the duration of intervals making up a standard stimulus. We assessed the effect of stimulus history on responses by comparing performance across two sessions with the same finely-sampled pool of stimulus rates, one where we maximized and the other where we minimized the amount of rate change across trials. Experiment 2 involved shorter versions of the duration discrimination (Experiment 1) and paced tapping18 tasks with matched stimulus rates and histories, unpaced tapping tasks including SMT, and two tasks where individuals’ rate preferences (PPT) were measured.

In line with the preferred period hypothesis12, if SMT captures the preferred rate of common mechanisms underlying rhythm perception and production, we should see better performance around an individual’s SMT, as has previously been observed for motor tasks5, 6, 12, 15, 42. However, we did not necessarily expect a one-to-one correspondence between preferred rate estimates across tasks with and without a motor component, as individual differences in motor contributions to synchronization abilities are well documented43.

We hypothesized that larger trial-to-trial changes in stimulus rate would lead to poorer performance due to hysteresis, in that both tapping and duration-discrimination responses should reflect the properties of the preceding stimuli. Thus, we expected that larger changes between consecutive trials’ stimulus rates should decrease discrimination accuracy and increase tapping errors. We expected that the strength of these effects – the degree of inflexibility – should increase with age.

Experiment 1

Methods

Participants

Participants (N = 31) were recruited from the participant pool of Max Planck Institute for Empirical Aesthetics laboratories in Frankfurt, Germany. Written informed consent was obtained from all participants. The procedure was approved by the Ethics Council of the Max Planck Society and the Research Ethics Board at Toronto Metropolitan University in accordance with the Declaration of Helsinki. Out of 31 (age: M = 33, SD = 11) individuals who were recruited for the study, 27 participants (age: M = 33, SD = 12) completed both sessions. Upon completion of each session, participants received 7 euros for every 30 minutes of their participation (21 euros per session on average). Two participants volunteered to complete the study without compensation. Prior to the experimental sessions, participants completed an online survey. All participants self-reported normal hearing and proficiency in English.

Procedure

The study consisted of an online background survey that participants completed at home, and then two experimental sessions. During the in-lab experimental sessions, participants completed two types of tasks. The first was a series of unpaced tapping tasks, consisting of SMT44 and a ‘forced’ motor tempo (FMT) task, the latter of which was used to assess the range of free tapping rates within the participants’ motor abilities. The main task was duration discrimination, where participants judged whether a comparison interval was ‘shorter’ or ‘longer’ than the intervals making up a standard sequence. Details of all tasks are provided below. Sessions were separated by 4-19 days. A single session started with the SMT and FMT tasks. Participants then set the sound volume to a level that they found comfortable for completing the task. Then, participants were presented with instructions on a computer screen that explained the main task with text and figures. A practice block, simulating the duration discrimination task, followed the instructions (details below). All instructions were in English. Once participants indicated that they understood the task, the main task blocks were initiated. Finally, unpaced tapping tasks were repeated in the same order. Participants were debriefed upon their request, only after the second session. An individual session lasted 90 minutes on average.

Duration discrimination task

The main task was a duration discrimination paradigm, where participants judged whether a comparison interval was longer or shorter than the intervals making up an isochronous standard sequence, by pressing either the L (longer) or S (shorter) key on a computer keyboard. The task procedure is illustrated in Figure 1. In each experimental session, 400 trials of this task were presented, each consisting of a combination of the three main independent variables: IOI, DEV, and ΔIOI. We explain each of these variables in detail in the next paragraphs.

Figure 1. Design of the duration discrimination task in Experiment 1. Each trial consisted of an isochronous standard sequence of five sounds (four intervals), followed by silence and another pair of sounds. The comparison duration was either shorter or longer than the standard intervals and took on one of ten values (DEV) that were proportional to the inter-onset interval (IOI) between tones making up the standard sequence. The task was to press the S or L key to indicate whether the comparison interval was shorter or longer than the standard IOI. Over the course of 400 unique trials of a single session, IOI ranged from 200 to 998 ms. In linear-order sessions, IOI increased in each trial in the first 200 trials and decreased in the other half of the trials (or vice versa, counterbalanced across participants) in steps of 4 ms. In random-order sessions, change in stimulus rate between a given trial n and immediately preceding trial n-1 (ΔIOI) was maximized, and the distribution of ΔIOI ranged from -778 ms to +770 ms.

Stimuli were made up of 50-ms woodblock sounds; first, an isochronous standard sequence and then a comparison interval, separated by a silent gap. The interval between the 5 woodblock sounds making up the ‘standard’ isochronous stimulus sequence is referred to as IOI. Each trial’s IOI was drawn (without replacement) from a pool of all possible stimulus rates, linearly spaced between 200 and 998 ms in 2-ms steps (400 values in total). The silent interval between the last stimulus onset of the standard sequence and the first stimulus onset of the comparison pair was 6 times the standard IOI.

The comparison interval on each trial was longer or shorter than the standard IOI. DEV refers to the magnitude of the comparison interval’s deviation from the standard IOI. DEV took on one of ten levels, which were proportional to IOI: ± 2%, 7%, 11%, 16%, 20%. Each DEV level was presented 40 times in each session. Since IOI was unique on each trial, IOI and DEV were not fully crossed factors. Instead, the IOI dimension was divided into 40 bins, each consisting of 10 consecutive IOIs. The 10 DEV levels were randomly assigned to the 10 IOI values in each bin. The correspondence between IOI and DEV pairs was unique for each participant.
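The binning scheme described above can be illustrated with a short Python sketch (the authors report using Matlab; this is not their code). The function names, the seed argument, and the formula comparison = IOI × (1 + DEV) are assumptions made for illustration, the last one being one plausible reading of "proportional to IOI".

```python
import random

# The ten DEV levels from the text, expressed as signed proportions of the IOI.
DEV_LEVELS = [-0.20, -0.16, -0.11, -0.07, -0.02, 0.02, 0.07, 0.11, 0.16, 0.20]


def assign_dev_to_ioi(seed=None):
    """Return a dict mapping each of the 400 IOIs (ms) to one DEV level.

    The IOI pool is cut into 40 bins of 10 consecutive IOIs, and the ten DEV
    levels are shuffled independently within each bin (hypothetical helper).
    """
    rng = random.Random(seed)
    iois = list(range(200, 1000, 2))  # 200, 202, ..., 998
    assignment = {}
    for start in range(0, len(iois), 10):
        levels = DEV_LEVELS[:]
        rng.shuffle(levels)
        for ioi, dev in zip(iois[start:start + 10], levels):
            assignment[ioi] = dev
    return assignment


def comparison_interval(ioi_ms, dev):
    """Comparison interval in ms, assuming comparison = IOI * (1 + DEV)."""
    return ioi_ms * (1.0 + dev)
```

Using a different seed per participant would give each participant a unique IOI-to-DEV correspondence while keeping every DEV level at exactly 40 presentations per session.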

The way IOI changed from trial to trial differed between experimental sessions. Change in IOI between consecutive trials was referred to as ΔIOI. In one session, the ‘linear-order’ session, ΔIOI was always ±4 ms. In one half of the session, ΔIOI was fixed at +4 ms. That is, IOI was 200 ms in the first trial, 204 ms in the second, etc. In the other half of the session, ΔIOI was fixed at –4 ms. On the first trial, IOI was 998 ms, 994 in the second, etc. The starting point, 200 ms or 998 ms (in fast-start and slow-start conditions, respectively) was counterbalanced across participants.
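As a rough Python sketch of the linear-order trajectory, the two halves can be generated as an ascending sweep from 200 ms and a descending sweep from 998 ms, each in 4-ms steps. The function name and the fast_start flag are hypothetical. Note that under this reconstruction the step at the junction between the two halves is 2 ms rather than 4 ms, a detail the text does not spell out.

```python
def linear_order_iois(fast_start=True):
    """Return the 400-trial IOI sequence (ms) for a linear-order session.

    Assumed reconstruction: one half sweeps upward from 200 ms, the other
    half sweeps downward from 998 ms, both in 4-ms steps.
    """
    ascending = list(range(200, 1000, 4))   # 200, 204, ..., 996 (200 trials)
    descending = list(range(998, 200, -4))  # 998, 994, ..., 202 (200 trials)
    if fast_start:
        return ascending + descending
    return descending + ascending
```

The two sweeps interleave, so together they visit every value of the 2-ms-spaced IOI pool exactly once.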

In the other session, the ‘random-order’ session, ΔIOI was maximized, and the direction of the change (i.e., whether a trial was faster or slower than the previous) alternated on every trial. That is, if the stimulus IOI on one trial was faster than the previous (-ΔIOI), it would be slower (+ΔIOI) in the following trial, and vice versa. Note that stimulus IOI was stable within the standard sequence, and only changed between trials. Session order, that is, whether a participant experienced the linear-order or random-order session first, was counterbalanced across participants. An example trajectory of stimulus IOI within linear-order and random-order sessions across trials is illustrated in Figure 1.

In each session, participants completed 407 trials, presented in 8 blocks with 50 trials in the first block, and 51 trials in the remaining 7 blocks. Except for the first block, the first trial of each block repeated the IOI that was presented as the last trial of the preceding block; this enabled preservation of the between-trial histories across blocks between which participants were allowed to take short breaks. Before the main task, participants were instructed about the task, and practiced the task for at least 6 trials. Instructions included two example trials with IOI of 500 ms, one with DEV of +.3 and another with DEV of -.3, illustrating ‘comparison longer’ and ‘comparison shorter’ conditions, respectively. DEV was fixed at -.2 in half of the practice trials, and at +.2 in the other half. Two practice trials each were presented at fast, medium, and slow IOIs; randomly selected from ranges of [300 - 500 ms], [501 - 700 ms] and [701 - 900 ms], respectively. If participants failed on more than 3 of the first 6 practice trials, they completed another round of 6 practice trials. Both example and practice trials were randomly ordered within their respective blocks in each session.

The dependent variables were accuracy and bias. Accuracy coded whether a response on a trial was correct or not (1 = correct, 0 = incorrect). Bias, on the other hand, could take on one of three values per trial: in the case of a correct response, bias was 0. If the comparison interval in a trial was longer, and the participant’s response was ‘shorter’, bias in that trial was -1. Similarly, if participant’s response was ‘longer’ in a trial where comparison interval was shorter, bias was +1.
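The coding of the two dependent variables can be written compactly as follows. This is a Python sketch with hypothetical argument names; it assumes the response is recorded as the string 'longer' or 'shorter' and that DEV is signed (positive when the comparison interval is longer).

```python
def code_trial(dev, response):
    """Return (accuracy, bias) for one trial.

    dev:      signed deviation of the comparison interval (> 0 means longer).
    response: 'longer' or 'shorter' (assumed encoding of the L and S keys).
    """
    correct_response = "longer" if dev > 0 else "shorter"
    accuracy = 1 if response == correct_response else 0
    if accuracy == 1:
        bias = 0
    elif response == "shorter":
        bias = -1  # comparison was longer, but judged shorter
    else:
        bias = 1   # comparison was shorter, but judged longer
    return accuracy, bias
```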

Unpaced tapping tasks

Unpaced tapping tasks consisted of a single SMT trial and two FMT trials, one each to estimate the ‘slowest’ and ‘fastest’ rates at which participants could maintain steady tapping. The unpaced tasks were repeated in the same order before and after completion of the duration discrimination task in both sessions. In the SMT task, participants were instructed to ‘tap on the desk at a rate that is comfortable to maintain’. In the FMT tasks, the instruction was ‘tap at the slowest rate that is comfortable to maintain’ (FMT-slowest) and to ‘tap at the fastest rate that is comfortable to maintain’ (FMT-fastest). Participants tapped for 30 seconds in the SMT task and FMT-fastest task, and 45 seconds in the FMT-slowest task. For all unpaced tapping tasks, the dependent measures were tapping rate (median of the produced intervals) and coefficient of variation.
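A minimal Python sketch of the two dependent measures is given below. It assumes tap onsets are available as timestamps in seconds and that the coefficient of variation is the sample standard deviation of the produced intervals divided by their mean; the text does not state which variant was used.

```python
import statistics


def tapping_summary(tap_times_s):
    """Return (tapping rate, coefficient of variation) for one tapping trial.

    Tapping rate is the median inter-tap interval; CV is assumed to be
    sample SD / mean of the inter-tap intervals.
    """
    itis = [b - a for a, b in zip(tap_times_s, tap_times_s[1:])]
    rate = statistics.median(itis)
    cv = statistics.stdev(itis) / statistics.mean(itis)
    return rate, cv
```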

Apparatus

Stimuli were generated and presented on a Windows desktop computer, using the Psychophysics Toolbox extensions45, 46 for Matlab. Auditory stimuli were presented via Beyerdynamics 880 pro headphones. The audio signal was presented and recorded by an RME Fireface UC soundcard. All instructions were presented on an ASUS VG24QE LCD screen. Keypress responses for the duration discrimination task were collected on a USB keyboard. Tapping responses for the unpaced tapping tasks were recorded via a Schaller Oyster S/P contact microphone at a sampling rate of 44100 Hz. The contact microphone was attached on the right half of the desk by default. Prior to the sessions, participants were asked to specify if they would like the microphone to be moved to the left half of the desk. None of the participants requested a relocation of the microphone.

Background survey

Prior to the first experimental session, participants completed an online survey. The survey consisted of two parts: the first part included questions about participants’ demographics, language skills, hearing abilities, and psychological disorders. The second part was ‘The Goldsmiths Musical Sophistication Index’, ‘Gold-MSI’47. The survey language was English by default, with an option to change the language to German. One question in the Gold-MSI was removed from the analyses due to contrasting Likert coding between the different languages in which the survey was completed.

Analysis

Data cleaning and exclusion criteria

The raw format of the tapping data was audio, since tapping responses were collected by a microphone. Individual taps were extracted from the audio files after visual inspection of the soundwave of each trial to set the noise floor for the recording on that trial. All peaks that exceeded the noise floor were retained. Inter-tap intervals (ITIs) were calculated as the difference between neighboring taps’ timestamps. We developed an automated procedure that detects and removes single-trial ITI outliers while accounting for drift that may have occurred within tapping trials. The script first marked the ITIs whose deviation from the median ITI exceeded 3x the median absolute deviation (MAD) of all ITIs in the respective trial. Then, it fitted a linear regression to the unmarked ITIs as a function of tap count. Finally, it removed any ITI that was smaller than half or larger than 1.5 times the predicted ITI.
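The three-step outlier procedure can be sketched in Python as below. This is an illustrative reading of the description, not the authors' script: it assumes the final half/1.5× criterion is applied to every ITI in the trial (marked or not), and that the regression is ordinary least squares on tap index.

```python
import statistics


def clean_itis(itis):
    """Remove single-trial ITI outliers while allowing for linear drift."""
    med = statistics.median(itis)
    mad = statistics.median([abs(v - med) for v in itis])

    # Step 1: mark ITIs deviating from the median by more than 3 x MAD.
    unmarked = [i for i, v in enumerate(itis) if abs(v - med) <= 3 * mad]

    # Step 2: least-squares line through the unmarked ITIs vs. tap index.
    xs = unmarked
    ys = [itis[i] for i in xs]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    intercept = my - slope * mx

    # Step 3: keep ITIs between 0.5 and 1.5 times the predicted ITI.
    cleaned = []
    for i, v in enumerate(itis):
        predicted = intercept + slope * i
        if 0.5 * predicted <= v <= 1.5 * predicted:
            cleaned.append(v)
    return cleaned
```

Fitting the line only to unmarked ITIs keeps a missed or doubled tap from distorting the drift estimate that is then used as the reference for removal.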

Exclusion criteria for the main task were (1) a decrease in accuracy with increasing absolute DEV, and (2) chance-level performance for both deviation directions (trials where the comparison interval was shorter, and those where it was longer). To assess the first criterion at the participant level, we fitted separate models to each individual’s single-session data where accuracy was predicted by absolute deviation of the comparison interval for either shorter (|-DEV|) or longer (|+DEV|) comparison conditions. The models were fitted using Matlab’s fitglm function, with the response variable distribution specified as ‘binomial’, and link function specified as ‘logit’, since the response variable, accuracy, was binary. Next, we compared the slopes (β) obtained from the separate models where either |-DEV| or |+DEV| predicted accuracy against zero, using one-tailed one-sample t-tests. All participants had positive slopes for both directions in both session types, indicating that the probability of a correct response increased with |DEV| in all conditions. To test for chance-level performance, for each session type, we split all trials into negative and positive DEV conditions and compared each group of trials’ accuracy against a mean of .5, using one-sample t-tests. Results showed that none of the participants had chance-level performance for both deviation directions.

Finally, before applying group-level statistics such as t-tests and correlations, any datapoint that fell outside of the interquartile range was excluded from the respective distributions.

Preferred rate estimates

We conceptualized individuals’ preferred rates as the stimulus rates where duration-discrimination accuracy was highest. To estimate preferred rate on an individual basis, we smoothed response accuracy across the stimulus-rate (IOI) dimension for each session type, and obtained the IOI where accuracy was maximum. For smoothing of binary accuracy data, we used the smoothdata function in Matlab. By default, this function outputs the moving average of the neighboring data points within a specified window size. Here, we used ‘gaussian’ as the method for smoothing, which calculates the Gaussian-weighted moving average over each window. Both moving-average and Gaussian smoothing are forms of convolution, where each data point in a given window (number of elements) is multiplied by the specified array of numbers, namely, the ‘mask’48. In the moving-average method, the mask is flat, giving a weight of 1 to each element. The Gaussian-weighted moving average gives higher weights to the midpoint of the window, which enhances the fluctuations in the data that are the focus of the current analysis.

As we were interested in a single-point maximum accuracy for each individual and session, we optimized the window size for each session type such that the smoothed data revealed a single global maximum. Starting from a window size of 10 elements, for each window size, we recorded the IOIs with the maximum accuracy value in each dataset. An illustration of the optimization for an example participant’s dataset is shown in Supplementary Figure S1A. For small windows, smoothed data included multiple IOI values where accuracy was 1, especially in the linear-order sessions. The optimization procedure revealed that, to obtain a single global maximum for each individual’s dataset, accuracy should be smoothed by windows of 26 elements in the random-order sessions and 48 elements in linear-order sessions, as shown in Supplementary Figure S1B. To equalize the smoothing across the variables of accuracy and IOI, we also smoothed IOI with the same window size. Estimates of preferred rate were taken as the smoothed IOI that yielded maximum accuracy.
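The smoothing and window-size search can be approximated in Python as follows. This sketch is not a reimplementation of Matlab's smoothdata: it assumes a symmetric window of window // 2 elements on each side, a Gaussian standard deviation of one fifth of the window size, and renormalized weights at the edges, all of which are illustrative choices. The function names and the stop limit are likewise hypothetical, and the search here runs on a single dataset, whereas the text reports one window per session type.

```python
import math


def gaussian_smooth(values, window):
    """Gaussian-weighted moving average with truncated, renormalized edges."""
    half = window // 2
    sigma = window / 5.0  # assumed width of the Gaussian mask
    n = len(values)
    out = []
    for i in range(n):
        idx = range(max(0, i - half), min(n, i + half + 1))
        weights = [math.exp(-0.5 * ((j - i) / sigma) ** 2) for j in idx]
        total = sum(weights)
        out.append(sum(w * values[j] for w, j in zip(weights, idx)) / total)
    return out


def preferred_rate_candidates(iois, accuracy, window):
    """Smoothed IOIs at which smoothed accuracy reaches its maximum."""
    order = sorted(range(len(iois)), key=lambda k: iois[k])
    x = [iois[k] for k in order]
    y = [accuracy[k] for k in order]
    sm_acc = gaussian_smooth(y, window)
    sm_ioi = gaussian_smooth(x, window)  # IOI smoothed with the same window
    peak = max(sm_acc)
    return [sm_ioi[i] for i, a in enumerate(sm_acc) if a == peak]


def smallest_single_peak_window(iois, accuracy, start=10, stop=100):
    """Smallest window (from `start`) giving a single global maximum."""
    for window in range(start, stop + 1):
        if len(preferred_rate_candidates(iois, accuracy, window)) == 1:
            return window
    return None
```

With small windows, any run of correct responses longer than the window yields several points with smoothed accuracy of exactly 1, which is why the search widens the window until the maximum is unique.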

To compare the preferred rate estimates between session types, we first conducted a paired-samples t-test. Then, we assessed the correspondence between the estimates. However, conventional correlation methods are not able to capture possible harmonic relationships between variables. Thus, we used a permutation test18 that accounted for the harmonic structure in the data, in addition to the assessment of one-to-one correspondence between the datapoints. The test first calculates the perpendicular distance of each data point to the closest line among the y = x, y = 2*x and y = x/2 theoretical lines (referred to as residuals here, as in Ref. 18); the sum of these residuals quantifies how much the datapoints deviate from a total harmonic correspondence. Then, the test shuffles the Y-axis values with respect to the X-axis values 1000 times and calculates summed residuals for each permutation. The p-value is the proportion of permuted summed residuals that are smaller than the value computed from the original data.
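A compact Python sketch of this permutation test is given below. It follows the description in the text (perpendicular distance to the nearest of the three lines, 1000 shuffles, p as the proportion of permuted sums below the observed sum); the function names and the fixed seed are assumptions added for reproducibility.

```python
import math
import random

# Slopes of the theoretical lines y = x, y = 2*x and y = x/2.
SLOPES = (1.0, 2.0, 0.5)


def summed_residuals(xs, ys):
    """Sum over points of the perpendicular distance to the closest line."""
    total = 0.0
    for x, y in zip(xs, ys):
        # Distance from (x, y) to the line y = m*x is |m*x - y| / sqrt(m^2 + 1).
        total += min(abs(m * x - y) / math.sqrt(m * m + 1.0) for m in SLOPES)
    return total


def harmonic_permutation_test(xs, ys, n_perm=1000, seed=0):
    """Return (observed summed residual, permutation p-value)."""
    rng = random.Random(seed)
    observed = summed_residuals(xs, ys)
    shuffled = list(ys)
    smaller = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if summed_residuals(xs, shuffled) < observed:
            smaller += 1
    return observed, smaller / n_perm
```

A small p-value here means that few random pairings of the two sets of estimates lie as close to the 1:1, 2:1, or 1:2 lines as the observed pairing does.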

Flexibility estimates

We hypothesized that larger trial-to-trial changes in stimulus rate would reduce accuracy. To test this hypothesis, we first compared participants’ average accuracy between session types, using a paired-sample t-test. Then, we assessed the effect of absolute rate change (|±ΔIOI|) on accuracy for each individual. To do so, we fitted generalized linear models to each participant’s random-order session data, and obtained slopes (β) that quantified the strength of the |±ΔIOI| effect for each participant. The models were fitted using Matlab’s fitglm function, with the distribution of the response variable specified as ‘binomial’, and link function specified as ‘logit’, since the response variable accuracy was binary. We also fitted separate models for trials where the stimulus was faster or slower than the previous trial’s stimulus, thus the predictor was either |-ΔIOI| or |+ΔIOI|, respectively. The model formula is shown in Equation (1). Next, using one-tailed one-sample t-tests, we tested whether models’ β were smaller than zero, which would confirm a decrease in accuracy as a function of |-ΔIOI| or |+ΔIOI|. The resulting β values, which quantified individuals’ ability to adapt to changes in stimulus rate from one trial to the next, served as our single-individual estimate of oscillator flexibility.

logit(P(Y = 1)) = β0 + β·X    (1)

where Y is binary accuracy, X is the amount of rate change in trials that were faster than previous (|-ΔIOI|) or in trials that were slower (|+ΔIOI|), β0 is the intercept, and β is the slope.
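For readers without access to Matlab's fitglm, the logistic model described above (binary accuracy, logit link, one predictor) can be fitted with a few Newton-Raphson iterations. The Python sketch below is a generic maximum-likelihood fit under those assumptions, not the authors' code; the function name, iteration limit, and tolerance are illustrative.

```python
import math


def fit_logistic(x, y, n_iter=25, tol=1e-10):
    """Fit logit(P(y = 1)) = b0 + b1 * x by Newton-Raphson; return (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            # Gradient of the log-likelihood.
            g0 += yi - p
            g1 += (yi - p) * xi
            # Entries of X'WX (the negative Hessian).
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            break
        # Newton step: solve the 2 x 2 system (X'WX) d = g.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1
```

The slope b1 plays the role of β in the text: a more negative value means accuracy falls off more steeply with rate change. When the predictor is in milliseconds, rescaling it (for example to seconds) before fitting may help numerical stability.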

Finally, to investigate whether responses were affected by the previous trial’s stimulus, we computed participants’ average bias in trials where stimulus was faster than the previous one (|-ΔIOI|), and in trials where it was slower (|+ΔIOI|). Then, we compared the distribution of average bias values against zero, using one-sample t-tests. Non-zero positive bias indicated that participants incorrectly responded as ‘comparison interval was longer’ in trials where comparison interval was in fact shorter than the standard interval; and non-zero negative bias indicated the opposite.

Results

We first assessed whether accuracy increased with increasing DEV. Comparison of the distribution of slopes (β) against zero showed that for both DEV directions, β were greater than zero. Descriptive and inferential statistics are shown in Table 1. Next, we compared participants’ average accuracies from ‘comparison shorter’ (|-DEV|) and ‘comparison longer’ (|+DEV|) conditions. Although average accuracy from the latter conditions was numerically higher in both sessions, the differences were not significant in either session.

Table 1. Descriptive statistics and test results for comparison of β estimates against null distributions in Experiment 1 analyses.

Preferred rate estimates

We expected that accuracy should depend on IOI differently for each participant, and estimated individuals’ preferred rate as the IOI where smoothed accuracy was maximum. Between-session comparisons showed that estimates did not significantly differ between sessions (p = .129). When we directly compared preferred rate estimates from the two session types (Fig. 2A), we found that for most participants, the estimates were numerically close to each other. Interestingly, for some participants, estimates from one session were close to double or half of those from the other session, suggesting a harmonic relationship between the estimates. We applied a permutation test that accounted for the harmonic structure of the data, and found a significant relationship between estimates from the two session types (p = .008, Fig. 2A).

Figure 2. Main findings of Experiment 1. A Left: Each circle represents a single participant’s estimate from random-order session (x axis) and linear-order session (y axis). The histograms across the axes show the distributions of estimates for each session type. The dotted lines represent 1:2 and 2:1 ratio between the axes, and the straight line represents one-to-one correspondence. Right: permutation test results. The distribution of summed residuals (distance of data points to the closest y=x, y=2*x and y=x/2 lines) of shuffled data over 1000 iterations, and the summed residual from original data (dashed line) which fell below .008 of the permutation distribution. B Average accuracy from random-order (left, green) and linear-order (right, blue) sessions. C Effects of |±ΔIOI| on responses in Experiment 1 duration discrimination paradigm. Left: Flexibility estimates. Each circle represents an individual’s slope (β) obtained from logistic models, fitted separately to conditions where |-ΔIOI| (left, green) or |+ΔIOI| (right, blue) predicted accuracy, with greater values (arrow’s direction) indicating better oscillator flexibility. The distributions of β from both conditions were smaller than zero, indicating a negative effect of between-trial absolute rate change on accuracy. Right: Participants’ average bias from |-ΔIOI| (left, green), and |+ΔIOI| (right, blue) conditions. Box plots in B and C panels show median (black vertical line), 25th and 75th percentiles (box edges) and extreme datapoints (whiskers).

Flexibility estimates

Average accuracy (Figure 2B) was higher in linear-order (M = 0.834, SD = 0.039) sessions, where |±ΔIOI| was predictable and always small (±4 ms), than in random-order (M = 0.695, SD = 0.072) sessions, where |±ΔIOI| was unpredictable and often large (t(24) = 12.5964, p < .001). β from model fits were significantly smaller than zero for both |-ΔIOI| and |+ΔIOI| conditions, and we found no significant differences between β from the former and latter conditions, showing that the probability of giving a correct response decreased with the amount of rate change across trials, regardless of whether a stimulus was faster or slower than the previous trial. Descriptive and inferential statistics are provided in Table 1. The distributions of β from individual fits are shown in Figure 2C. To investigate the source of the negative relationship between |±ΔIOI| and accuracy, we analyzed how rate change affected bias. In both session types, participants’ average bias from faster-than-previous (|-ΔIOI|) conditions was significantly smaller than zero (random-order session: M = -0.179, SD = 0.144, t(26) = -6.4487, p < .001; linear-order session: M = -0.065, SD = 0.078, t(26) = -4.3159, p < .001); and average bias from slower-than-previous (|+ΔIOI|) conditions was significantly greater than zero (random-order session: M = 0.195, SD = 0.096, t(26) = 10.5406, p < .001; linear-order session: M = 0.063, SD = 0.046, t(23) = 6.6472, p < .001), as shown in Figure 2C. These results indicate that participants perceived longer comparison intervals as shorter on trials where the stimulus was faster than on the previous trial, and vice versa on trials where the stimulus was slower.

Unpaced tapping

Individuals completed a series of unpaced tapping tasks at the beginning and at the end of each session. Here, we focused on tapping rate from the spontaneous motor tempo (SMT) task. We first compared individuals’ SMT before and after sessions. For both random- and linear-order sessions, SMT from before and after the session correlated and were not significantly different. Given the consistency of the measure, we averaged participants’ SMT within sessions and compared the mean SMT across session types. We found a strong correlation between tapping rates from the random- and linear-order sessions. Test results of the unpaced tapping analyses are provided in Table 2.

Descriptive statistics of unpaced tapping measures in first and second experiments, and test results for pairwise comparisons.

Discussion

The results of Experiment 1 showed that discrimination accuracy systematically increased with the difference between standard and comparison intervals (DEV), and decreased with the difference in stimulus rate between consecutive trials (|±ΔIOI|). Accuracy showed a nonlinear relationship with IOI: we observed improved accuracy at an individual-specific range of stimulus rates and, in some cases, at their (sub)harmonics.

For most participants, preferred rate estimates from the two session types were similar, and for some participants, estimates from random-order sessions were close to double the estimates from the linear-order sessions (see Figure 2A). The correspondence between estimates from the two session types shows the reliability of the paradigm and the robustness of the methods we developed for preferred rate estimation, since we were able to obtain similar estimates in repeated measurements and under conditions with major differences in stimulus history and task difficulty. The current findings support three key predictions of the entrainment account. First, similar estimates of preferred rate under different temporal contexts and repeated measurements suggest improved timing abilities in situations with smaller detuning between the oscillator’s preferred rate and the stimulus rate11. Second, that the estimates from the more challenging random-order session were narrower while preserving the correspondence to those from other conditions indicates that the internal oscillators were able to adaptively10, 49 entrain to the range of rates around their preferred rate, i.e., their entrainment region12. Finally, the harmonic relationship between the estimates from the two session types suggests the oscillator’s ability to respond to multiple nested rates, either due to the circular nature of oscillators49 or through the involvement of multiple nested oscillators in rhythmic entrainment9.

Two sets of results confirmed the presence of history effects on timing performance. Accuracy was lower in random-order sessions, where absolute rate change (|±ΔIOI|) was maximum, than in linear-order sessions where it was minimum. Moreover, accuracy in random-order sessions decreased as rate change increased. The difference in discrimination accuracy between sessions cannot be attributed merely to the effects of the global context, given that the global context was identical across session types. If the duration representations were drawn towards the mean of the rates presented in the session (‘the central tendency effect50’), accuracy would be similar between the sessions with identical global means. Instead, we observed a drastic decrease in accuracy in the random-order session, which suggests a stronger influence of local than global context in the current paradigm. The analyses of bias confirmed this explanation by showing that internal duration representations on a given trial were biased towards the previous stimulus rate. Interestingly, rate change across trials affected bias even when it was small and fixed.

Experiment 2

Methods

Participants

Thirty-two participants were recruited from the participant pool of the Max Planck Institute for Empirical Aesthetics laboratories. The procedure was approved by the Ethics Council of the Max Planck Society and the Research Ethics Board at Toronto Metropolitan University and was in accordance with the Declaration of Helsinki. Participants signed an informed consent form prior to the session and received 21 euros on average as compensation after completing the session. Prior to the experimental sessions, they also completed an online survey. We targeted a uniform age distribution (M = 50, SD = 17): within the range of 20-80 years of age, we recruited 5 or 6 participants from each 10-year age bin.

Procedure

The study consisted of an online background survey; a series of unpaced tapping tasks including the SMT; two PPT tasks; and duration discrimination and paced tapping tasks. Participants’ hearing thresholds were measured using standard pure-tone audiometry. The experiment procedure is illustrated in Figure 3A. Details of all tasks are provided below.

Experiment 2 (A) timeline, and illustrations of the duration discrimination (B), paced tapping (C), slider (D) and keypress (E) tasks.

Participants completed an online survey prior to the session. The lab session started with the SMT and FMT tasks, respectively. Then, participants were asked to set the sound volume to be used in the auditory tasks throughout the experiment using a slider that they clicked with a mouse. The experiment proceeded with the slider PPT task, the keypress PPT task, then the duration discrimination and paced tapping tasks; and finally, with repetitions of the SMT, FMT and slider tasks. The order of the keypress, duration discrimination and paced tapping tasks was pseudo-randomized for each participant and all 6 order combinations were counterbalanced. Prior to each task, participants were presented with instructions on the screen. Between each task, participants were allowed to have short breaks. Upon completion of the experiment, participants were moved to another booth in the laboratory room to complete a pure-tone audiometry measurement. An individual session including audiometry lasted 90 minutes on average. Instructions were in German.

Duration discrimination task

The stimuli for the duration discrimination task were the same as in Experiment 1. The conditions differed from Experiment 1 random-order sessions in three aspects: here, the pool of stimulus rates was linearly spaced between 200 and 1000 ms in 10-ms steps, the comparison interval deviated from the standard IOI by a fixed amount of DEV=±13%, and there were two repetitions of each stimulus rate. For determining the spacing for IOI, we performed a bootstrapping analysis on data from our previous study, from which the current paced tapping paradigm was adapted18. The analysis revealed that 10 ms was the optimum step size that produced similar values to the original preferred rate estimates, while also preserving the between-session harmonic correlation. Details of the bootstrapping analysis are provided in Supplementary Materials.

We selected the fixed deviation for comparison intervals as follows. First, we estimated thresholds for negative and positive deviations from Experiment 1. To do so, for each participant’s (N=27) random-order session data, we averaged the accuracy at each deviation level, separately for negative and positive deviations. We fitted psychometric curves to the mean values and obtained the deviation amount that yielded 75% predicted accuracy from the fitted curve. From the resulting distributions of thresholds for negative and positive deviations, we removed outliers by excluding any value that exceeded 3x the median absolute deviation (MAD) of all threshold values in the respective distribution. Finally, we took the mean threshold value across participants and deviation directions. We then piloted the task on a small sample to confirm that a deviation of 13% was appropriate for the duration discrimination task in Experiment 2, i.e., that it yielded an accuracy of approximately 75%.
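The outlier-exclusion and averaging steps described above can be sketched as follows. This is an illustrative Python sketch, not the original analysis script: it assumes that "exceeding 3x the MAD" refers to the absolute distance of a threshold from the median of its distribution and that the MAD is unscaled. The function names are hypothetical.

```python
from statistics import mean, median

def mad_filter(values, k=3.0):
    """Keep values whose absolute distance from the median is at most
    k times the (unscaled) median absolute deviation."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    return [v for v in values if abs(v - med) <= k * mad]

def mean_threshold(neg_thresholds, pos_thresholds, k=3.0):
    """Mean threshold across participants and deviation directions,
    after removing outliers separately within each distribution."""
    kept = mad_filter(neg_thresholds, k) + mad_filter(pos_thresholds, k)
    return mean(kept)
```

Applied to the per-participant 75% thresholds for negative and positive deviations, `mean_threshold` would return a single deviation value of the kind that was then piloted for Experiment 2.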

The task (Fig. 3B) consisted of two blocks with complementary DEV conditions. Participants were presented with all 81 stimulus rates in the same order in each block. However, if the comparison interval for a given stimulus rate was longer in the first block, it was shorter in the second, and vice versa. As in Experiment 1 random-order sessions, the change in IOI between consecutive trials (ΔIOI) was maximized, and the direction of the change alternated on every trial. For each participant, we generated a unique stimulus order which was constant across the blocks and was also used in the paced tapping task.
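One simple way to obtain a trial order with large rate changes whose direction alternates on every trial is to interleave the slower and faster halves of the stimulus pool. The Python sketch below illustrates this heuristic only; it is an assumption made for illustration and does not reproduce the procedure used to generate the actual stimulus orders, nor does it guarantee that ΔIOI is maximal.

```python
import random

def alternating_order(iois, seed=0):
    """Return a trial order in which the direction of the IOI change
    alternates on every trial (assumes all IOIs are distinct)."""
    rng = random.Random(seed)          # per-participant seed gives a unique order
    ordered = sorted(iois)
    half = len(ordered) // 2
    low, high = ordered[:half], ordered[half:]   # 'high' gets the extra item if odd
    rng.shuffle(low)
    rng.shuffle(high)
    order = []
    for i in range(len(ordered)):
        source = high if i % 2 == 0 else low     # slow, fast, slow, fast, ...
        order.append(source.pop())
    return order
```

For the 81 rates between 200 and 1000 ms, `alternating_order(range(200, 1001, 10), seed=participant_id)` (with a hypothetical `participant_id`) would yield one order that could be reused across both blocks and across tasks.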

The instructions of the task included two example trials, and participants practiced the task for at least 6 trials. The properties and the procedure of the example and practice trials were identical to those in Experiment 1.

Paced tapping task

The task (Fig. 3C) was a shorter version of the synchronization-continuation paradigm we developed in a previous study18. On each trial, participants were presented with an isochronous stimulus sequence of 5 sounds, followed by silence. Sound stimuli were the woodblock samples used in Experiment 1. Participants were instructed to start tapping to the stimulus as soon as possible, and to continue tapping at the same rate once the sounds ceased, until the end of the trial, which was signaled by a change in the screen color. For each participant, the stimulus rates as well as their order were identical to those generated for the duration discrimination task. In these matched stimulus conditions, IOI ranged from 200 ms to 1000 ms in 10-ms steps. Allowed duration for continuation tapping was 6 times the stimulus IOI for fast (IOI < 300 ms) stimuli, and 7 times the IOI for slow (IOI > 300 ms) stimuli. Prior to the task, participants completed 6 practice trials, with specifications described in Ref.18.

Unpaced tapping tasks

The procedures for the spontaneous motor tempo (SMT) and ‘forced’ motor tempo (FMT) tasks were identical to those in Experiment 1.

Slider task

The slider task was a PPT task where participants dynamically adjusted the rate of stimulus sequences comprising the same woodblock samples used in Experiment 1. Each trial started with an isochronous stimulus sequence, and participants were presented with the instructions at the top of the screen. A horizontal slider (Fig. 3D) was displayed with labeled endpoints “schnell” (fast) and “langsam” (slow). Moving the mouse changed the indicator of the slider, marked in red; and each left-click produced an isochronous stimulus sequence with the selected rate. A right mouse click saved the final rate and terminated the trial. Participants completed two blocks of 8 trials of the task. In each block, the start-rate of the stimulus sequence was 200 ms in half of the trials and 1000 ms in the other half. The location of the labels also differed between trials, and the “fast” label was on the left end in half of the trials, and vice versa in the other half. Label locations and start-rates were counterbalanced within each block, and their combinations were ordered randomly.

Keypress task

The keypress task was also a PPT task where participants indicated their preferred rates by stopping stimulus sequences with dynamically changing rates. Stimulus samples making up the sequences were the woodblock samples used in Experiment 1. Each trial started with a stimulus sequence, and participants were presented with the instruction text at the top of the screen, and a dynamic figure in the middle of the screen that indicated the time left to respond. If no response was given during the stimulus, the trial was repeated. Stimuli started fast (IOI = 200 ms) in half of the trials and slow (IOI = 1000 ms) in the other half, and the IOI increased or decreased by 10 ms in each interval, depending on the start-rate. That is, the stimulus got slower in each interval on fast-start trials, and faster in each interval on slow-start trials. Participants completed 6 trials of the keypress task. The order of the stimulus conditions was randomized. Figure 3E illustrates a fast-start condition of the keypress task.
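To make the trial structure concrete, the Python sketch below steps through the changing sequence and returns the IOI of the interval in which a key press falls. It is a sketch under stated assumptions: the press time is measured from the first sound onset, the sequence spans the full 200 to 1000 ms range, and the function name and signature are hypothetical.

```python
def ioi_at_keypress(press_time_ms, start_ioi=200, step=10, lo=200, hi=1000):
    """Return the IOI (ms) of the interval during which the key press
    occurred, or None if the press came after the sequence ended."""
    direction = 1 if start_ioi == lo else -1   # fast-start slows down, slow-start speeds up
    ioi, elapsed = start_ioi, 0
    while lo <= ioi <= hi:
        if press_time_ms < elapsed + ioi:
            return ioi
        elapsed += ioi
        ioi += direction * step
    return None  # no response during the stimulus: the trial would be repeated
```

For example, on a fast-start trial the first three intervals last 200, 210 and 220 ms, so a press 450 ms after the first onset falls in the 220-ms interval.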

Design

The stimulus IOIs presented in all tasks that involved an auditory stimulus ranged from 200 ms to 1000 ms. Thus, IOI was an independent variable, on which rate preferences and performance were assessed and compared across tasks. The order of stimulus IOI, and thus ΔIOI, was matched between the duration discrimination and paced tapping tasks, from which the independent variables |+ΔIOI| and |-ΔIOI| were derived. Other independent variables were DEV direction (i.e., whether the comparison interval was shorter or longer than the standard) in the duration discrimination task; repetition for the SMT, FMT and slider tasks; and start rate for the slider and keypress tasks.

Dependent variables were the tapping rate in SMT and FMT, selected rate in slider and keypress, accuracy and bias in duration discrimination, and signed or absolute values of tempo-matching-errors, TME, in paced tapping tasks.

Apparatus

Apparatus for the presentation of sound stimuli, and collection of tapping and keyboard responses were identical to those of Experiment 1. Additionally, participants used a mouse for giving responses in the slider task, and for setting the desired sound volume. The background survey was a German translation of the survey used in Experiment 1. We conducted Experiment 2 in German given that the participant sample consisted of older individuals who were less likely to fluently speak English than the mostly-student sample we recruited in Experiment 1.

Analysis

Data cleaning and exclusion criteria

As Experiment 2 involved multiple tasks, participants were excluded from only the respective tasks where their performance met the exclusion criteria.

The duration discrimination task in Experiment 2 had two exclusion criteria: (1) chance-level performance in both DEV directions, as in Experiment 1 and (2) ceiling performance in overall response accuracy (average accuracy > .95). Two participants were excluded based on the first criterion; one participant was excluded based on the second.

On the trial level, the paced tapping task had two exclusion criteria: first, any ITI that was smaller than half or bigger than 1.8 times the stimulus IOI was excluded. From the remaining ITIs, outliers were detected by the script described in data cleaning and exclusion criteria for unpaced tapping tasks under the Experiment 1 Methods section. On the participant level, the criteria were incompatibility between stimulus rate and tapping rate, and a low average number of tapping intervals. To test the first criterion, we fitted models to overall task data where the tapping rate (i.e., the median of all intervals in a given trial after trial-level data cleaning) was predicted by stimulus IOI and obtained slopes. Two participants were excluded as they had slopes smaller than .5. One participant was excluded based on the second criterion, as the average number of intervals they produced across trials was smaller than 7.
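The first trial-level criterion and the participant-level slope criterion can be sketched in Python as follows. This is a minimal illustration, not the study's code: it assumes an ordinary least-squares line for the unspecified "models", omits the second outlier-detection step, and uses hypothetical function names.

```python
from statistics import median

def clean_itis(itis, ioi):
    """Drop inter-tap intervals shorter than half or longer than 1.8 times the IOI."""
    return [iti for iti in itis if 0.5 * ioi <= iti <= 1.8 * ioi]

def tapping_rate(itis, ioi):
    """Median of the retained intervals in one trial (None if nothing remains)."""
    kept = clean_itis(itis, ioi)
    return median(kept) if kept else None

def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def follows_stimulus(iois, rates, min_slope=0.5):
    """Participant-level check: tapping rate should scale with stimulus IOI."""
    return ols_slope(iois, rates) >= min_slope
```

A participant whose tapping rate tracks the stimulus closely yields a slope near 1, whereas a participant who taps at a fixed rate regardless of the stimulus yields a slope near 0 and would be excluded.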

The data cleaning procedure for the unpaced tapping tasks was identical to that described for Experiment 1. In the slider task, we recorded whether participants listened to the different stimulus rates by clicking on different locations on the slider. The exclusion criterion was failing to test the stimulus rates (i.e., producing no mouse click) on more than 75% of the trials, which suggested that the participant did not engage with the task. One participant was excluded from the slider task based on this criterion. From the remaining participants’ data, any trial without a mouse click was removed from further analyses. No exclusion criterion was defined for the keypress task.

Finally, before applying group-level statistics such as t-tests and correlations, any datapoint that fell outside of the interquartile range was excluded from the respective distributions.

Outcome measures

The outcome measures from the duration discrimination task were accuracy and bias. Response coding was the same as in Experiment 1. Since the duration discrimination task in Experiment 2 included two repetitions of each IOI (presented in different blocks with different DEV directions), accuracy and bias were averaged across IOI repetitions.

For each trial in the paced tapping task, we calculated the tempo-matching error, TME, following the analysis in our previous study18. As shown in Equation (2), TME was the difference between tapping rate (median inter-tap interval of all taps in a trial) and stimulus IOI, normalized by stimulus IOI.

$$\mathrm{TME}_k = \frac{\operatorname{median}\left(\mathrm{ITI}_{k,1}, \ldots, \mathrm{ITI}_{k,n}\right) - \mathrm{IOI}_k}{\mathrm{IOI}_k} \qquad (2)$$

where k is the trial index and n is the maximum number of intervals in a single trial.

A positive TME indicated that the tapping rate was slower than stimulus rate, and a negative TME indicated that it was faster.
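As a worked illustration of this measure, the short Python sketch below computes TME for a single trial from its cleaned inter-tap intervals. The function name is hypothetical, and the intervals are assumed to have already passed trial-level data cleaning.

```python
from statistics import median

def tempo_matching_error(itis, ioi):
    """Signed, proportional tempo-matching error for one trial:
    (median inter-tap interval - stimulus IOI) / stimulus IOI."""
    return (median(itis) - ioi) / ioi
```

For a 500-ms stimulus, intervals of 520, 500 and 540 ms give a median of 520 ms and a TME of 0.04 (tapping 4% slower than the stimulus), whereas intervals of 480, 450 and 470 ms give a TME of -0.06 (tapping 6% faster).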

For the unpaced tasks, the outcome measure from each trial was the tapping rate, calculated as the median ITI after trial-level data cleaning. From each trial of the SMT task, we also obtained the coefficient of variation (CV), calculated as the standard deviation of all intervals divided by their mean. We further compared SMT across repetitions of the same task throughout the experiment using Pearson correlations and paired-samples t-tests.

The slider task had two start-rate conditions and two repetitions throughout the experiment (before and after main tasks). The dependent measure for each trial was the median of all final responses. We assessed the main effects and interactions of start-rate and repetition on slider responses across participants, using a repeated measures analysis of variance. We calculated the rate preference on each trial of the keypress task as the presented stimulus’ rate at the time of the keypress. The summary measure for each start-rate was the median of all rate preferences in trials with the same start-rate.

Preferred rate measures

Experiment 2 involved various tasks by which we aimed to estimate individuals’ preferred rate. For the SMT task, we estimated preferred rate as median tapping rate; for the slider and keypress tasks (PPT), we averaged participants’ indicated preference across conditions and repetitions. For both the duration discrimination and paced tapping tasks, we estimated preferred rate as the stimulus IOI yielding peak performance as follows.

Best-performance rates in the duration discrimination task were calculated by smoothing accuracy as a function of stimulus rate, as in Experiment 1. After excluding the study-specific outliers on the participant level, for each participant, we smoothed accuracy using the ‘gaussian’ method of the smoothdata function in MATLAB. Following the optimization procedure used in Experiment 1, we assessed the window size that revealed a single-point maximum accuracy for each participant. The optimum window was 13 elements, which was used to smooth both the accuracy and IOI values in each participant’s dataset.
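The smoothing-and-peak procedure can be approximated in Python as below. This sketch is not a reimplementation of MATLAB's smoothdata: it assumes a Gaussian-weighted moving average with a standard deviation of one fifth of the window width and truncated windows at the edges, and it assumes that trials are sorted by IOI. Function names are hypothetical.

```python
import math

def gaussian_smooth(values, window=13):
    """Gaussian-weighted moving average; windows are truncated at the edges.
    (Assumption: sigma = window / 5.)"""
    half = window // 2
    sigma = window / 5.0
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        weights = [math.exp(-0.5 * ((j - i) / sigma) ** 2) for j in range(lo, hi)]
        total = sum(weights)
        smoothed.append(sum(w * values[j] for w, j in zip(weights, range(lo, hi))) / total)
    return smoothed

def preferred_rate(iois, accuracy, window=13):
    """IOI at which smoothed accuracy is maximal (first maximum if tied)."""
    smoothed = gaussian_smooth(accuracy, window)
    best = max(range(len(smoothed)), key=smoothed.__getitem__)
    return iois[best]
```

For the paced tapping task, the same functions could be applied to the negated |TME| values, so that the maximum of the smoothed curve corresponds to the rate with the smallest tempo-matching error.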

The dependent measure in paced tapping task was TME, which was a signed, proportional error measure. Best-performance rates in this task were the conditions where participants tapped with the least errors, quantified by the absolute TME, |TME|. Since the paced tapping task shared the stimulus rate conditions with duration discrimination task, we used the optimum window size obtained for the duration-discrimination task for smoothing |TME| so that the estimates would be maximally comparable across tasks.

Flexibility measures

Experiment 1 in the current study and the findings of our previous study18 showed robust effects of stimulus history on rhythm perception and production. As in those analyses, flexibility in Experiment 2 was defined as the ability to adapt to changes in rhythmic context.

In the duration discrimination task, we assessed flexibility by fitting logistic models to each participant’s data where accuracy was predicted either by |-ΔIOI| or |+ΔIOI|, as in Experiment 1. A negative slope obtained from the models indicated that the probability of giving a correct response decreased as |±ΔIOI| increased. Similarly, in the paced tapping task, we fitted linear models where |TME| was predicted either by |-ΔIOI| or |+ΔIOI|. A positive slope from the models indicated that the absolute tempo-matching error increased with |±ΔIOI|. However, as a final step, we inverted the sign of the slopes obtained from the paced tapping task so that, in both tasks, more negative beta estimates indicated less flexibility.
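A per-participant logistic fit of this kind can be sketched in plain Python with a Newton-Raphson update, as shown below. The sketch is illustrative rather than the fitting routine used in the study: it assumes a model with an intercept and a single predictor, requires data without perfect separation, and uses hypothetical function names.

```python
import math

def logistic_slope(x, y, n_iter=50):
    """Fit P(correct) = 1 / (1 + exp(-(b0 + b1 * x))) by Newton-Raphson
    and return the slope b1. x: |delta IOI| per trial; y: 1 = correct, 0 = error."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p              # gradient of the log-likelihood
            g1 += (yi - p) * xi
            h00 += w                  # elements of X'WX
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:
            break                     # ill-conditioned, e.g. separated data
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return b1

def tapping_flexibility(tme_slope):
    """Sign-invert a |TME| slope so that more negative means less flexible."""
    return -tme_slope
```

Rescaling the predictor (for example, expressing |ΔIOI| in seconds rather than milliseconds) keeps the exponent in a numerically safe range without changing the sign of the slope.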

We tested the hypothesis of a decrease in oscillator flexibility with advancing age by correlating age and slopes from each |±ΔIOI| condition (flexibility estimates) in the duration discrimination and paced tapping tasks (Pearson correlation, one-tailed). Since these analyses involved multiple comparisons, we controlled for the false discovery rate (FDR), using the Benjamini–Hochberg method51, 52. To test whether overall performance decreased with age, we ran another series of correlations between age and average accuracy in the duration discrimination task, and average |TME| in the paced tapping task, and FDR-corrected the p-values.
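For readers unfamiliar with the correction, the Benjamini–Hochberg adjustment can be written in a few lines of Python. This is a generic sketch of the standard step-up procedure rather than the code used for the reported analyses; the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

With four tests, for instance, the smallest raw p-value is multiplied by 4, the second smallest by 2, and so on, with each adjusted value capped by the adjusted value of the next larger p-value; a correlation is then declared significant if its adjusted p-value is below the chosen FDR level.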

Additionally, we explored the relationship between individuals’ age and preferred rate estimates, by separate correlation analyses between age and preferred rate estimated from each condition and measurement of the slider and keypress (PPT) tasks, and preferred rate estimates from duration discrimination and paced tapping tasks. Since we defined no hypothesis for preferred rate and age relationships, we used two-tailed Pearson correlation and no correction.

Results

Unpaced tapping

Tapping rates from ‘fastest’ and ‘slowest’ FMT trials showed no difference between pre- and post-session measurements, and were additionally correlated across repeated measurements. Given the consistency of the measures, rates from each FMT task from the first and second measurements were averaged for further analyses. Tapping rates from the SMT task were also correlated across measurements. However, rates from the second measurement were significantly slower than those from the first measurement. SMT CV did not correlate across measurements (p = .072), and CV from the second measurement (M = 0.070, SD = 0.033) was significantly higher (t(26) = -2.5116, p = 0.019) than CV from the first measurement (M = 0.055, SD = 0.023). The results of the pairwise comparisons between tapping rates from all unpaced tapping tasks across measurements are provided in Table 2.

Preferred rate estimates

Individuals’ PPT was measured by the slider and keypress tasks. In the slider task, rate preferences from the same start-rate conditions were significantly correlated and showed no systematic differences across repeated measurements. Within the first measurement block, rates from slow-start conditions (M = 0.732, SD = 0.165) were slower than those from fast-start conditions (M = 0.658, SD = 0.167) (t(25) = -2.109, p = 0.045), although they were significantly correlated (r(24) = 0.691, p < .001). Rate preferences from the second measurement showed no difference between the start-rate conditions (p = .709) and were significantly correlated (r(27) = 0.521, p = 0.004). A repeated-measures ANOVA revealed no main effects of start-rate (p = 0.169) or repetition (p = 0.865), and no interaction (p = 0.067). In the keypress task, rate preferences from the fast-start condition (M = 0.467, SD = 0.092) were significantly faster than those from the slow-start condition (M = 0.840, SD = 0.111) (t(28) = -13.8046, p < 0.001), and we found no correlation between rate preferences across conditions (p = .803). The distributions of rate preferences from separate conditions of the slider and keypress tasks are shown in Figure 4A.

Preferred rate estimates from both the duration discrimination and paced tapping tasks, measured by the stimulus rates with best performance, correlated significantly with SMT. Moreover, we found no significant differences between estimates from either task and SMT. However, estimates did not correlate between the duration discrimination and paced tapping tasks, and were slower (t(26) = -2.7817, p = 0.0099) in the latter (M = 0.641, SD = 0.173) than in the former task (M = 0.541, SD = 0.175). In Figure 4B, estimates from the two performance tasks and SMT (first measurement) are illustrated. In general, estimates from both the paced and unpaced tapping tasks were slower than those from the duration discrimination task.
However, the nonparallel nature of the lines that connect single-participant preferred rates for each task (Fig. 4B, left) indicates that the amount of “slowing” in the tapping tasks relative to the discrimination task varied across individuals. We reasoned that if the degree of slowing for each individual arises from a common source for both tasks, which we will call ‘the motor component’, the differences between estimates for the discrimination versus both tapping tasks should be consistent. We quantified the contribution of the motor component to preferred rates in each tapping task by subtracting the duration discrimination task estimates, which yielded two difference scores (paced tapping – duration discrimination and SMT – duration discrimination). These difference scores were significantly positively correlated, confirming that each individual had a consistent motor component contribution that slowed their preferred rate estimate in different tapping tasks in a similar manner.
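The difference-score analysis can be illustrated with the following Python sketch, which computes the two scores per participant and their Pearson correlation. Variable and function names are hypothetical, and the sketch omits the outlier exclusion and significance testing applied in the reported analysis.

```python
import math

def difference_scores(discrimination, paced, smt):
    """Per-participant difference scores relative to the discrimination task."""
    paced_diff = [p - d for p, d in zip(paced, discrimination)]
    smt_diff = [s - d for s, d in zip(smt, discrimination)]
    return paced_diff, smt_diff

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)
```

Each input list would contain one preferred rate estimate per participant, in the same participant order, so that `pearson_r(*difference_scores(discrimination, paced, smt))` quantifies how consistently the two tapping tasks slow the estimate relative to the discrimination task.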

Results of Experiment 2 preferred rate analyses. A Top: Estimates of preferred rate from each task condition. Box plots show median (black vertical line), 25th and 75th percentiles (box edges) and remaining data range (whiskers). The horizontal dashed lines represent the minimum and maximum stimulus rates presented in the experiment. Bottom: Pairwise correlations between preferred rates across tasks. For the slider and key-press tasks, boxes are colored to indicate fast-start (blue) and slow-start (pink) conditions. Correlations and p-values are reported for significant correlations only. B Relationship between the preferred rate estimates from the paced tapping, duration discrimination, and SMT (first measurement) tasks. Left: participants’ estimates from the three tasks. Each circle represents an individual’s preferred rate estimate, connected by lines between the tasks. Both circles and lines are color-sorted by individuals’ SMT, ranging from magenta (fast) to blue (slow). Right: correlation between the difference scores. Each circle represents a single participant’s difference score, namely, how different the estimates from SMT (x axis) and paced tapping (y axis) tasks were than those from the duration discrimination task. Straight black line represents the regression line, dashed lines represent 95% confidence intervals.

Rate preferences in the slider task correlated with SMT only in fast-start conditions from the first measurement, and in slow-start conditions from the second measurement. Rate preferences from the keypress task only correlated with those from slider task conditions (i.e., within PPT tasks), but not with any SMT measurement or estimates from the performance tasks.

Flexibility estimates

We hypothesized negative effects of stimulus history on performance in both perceptual and motor tasks. We found similar effects of stimulus history in both tasks. β obtained from the separate models quantifying the effect of |-ΔIOI| and |+ΔIOI| on accuracy in the duration discrimination task were both significantly smaller than zero, indicating that accuracy decreased as |±ΔIOI| increased, both in trials where the stimulus was faster and in trials where it was slower than the previous one (Fig. 5A). In the paced tapping task, β from models where |TME| was predicted either by |+ΔIOI| or |-ΔIOI| were significantly greater than zero, indicating that tempo-matching errors increased as a function of |±ΔIOI| (Fig. 5B). Paired-samples t-tests revealed no significant differences between the strength of the effect of |-ΔIOI| vs |+ΔIOI| in either task. However, β from models where |+ΔIOI| predicted |TME| were numerically smaller, and significantly more variable, than those from models where |-ΔIOI| predicted |TME|; the difference in variability was assessed using a Brown-Forsythe test (F(1,54) = 5.86718, p = .019). Descriptive statistics and test results for comparison of β estimates against zero are provided in Table 3.

Results of Experiment 2 flexibility analyses. A-D Effects of between-trial absolute rate-change on performance in Experiment 2 duration discrimination (A, C) and paced tapping (B, D) tasks. In A-B, each circle represents an individual’s slope (β) obtained from models, fitted separately to conditions where |-ΔIOI| (left, green) or |+ΔIOI| (right, blue) predicted accuracy in the duration discrimination (A) or |TME| in the paced tapping (B) task. In C-D, box plots show average bias in the duration discrimination (C) and average TME in the paced tapping (D) tasks in |-ΔIOI| (left, green) and |+ΔIOI| (right, blue) conditions. In all panels, box plots show median (black vertical line), 25th and 75th percentiles (box edges) and extreme datapoints (whiskers). E-F Correlations between individuals’ age and the flexibility estimates from duration discrimination (E) and paced tapping (F) tasks. Straight black lines represent the regression line, dashed lines represent 95% confidence intervals. Histograms above each plot show the distribution of participants’ age after outlier corrections.

Descriptive statistics and test results for comparison of Beta estimates against null distributions in Experiment 2 analyses.

To investigate the direction of history effects on performance, we compared perceptual and motor biases in trials with negative and positive rate change. In conditions where the stimulus on the current trial was faster than the previous one, average bias (M = -0.166, SD = 0.094) was significantly smaller than zero (t(28) = -9.4985, p < .001, Fig. 5C); and average TME (M = 0.014, SD = 0.021) was greater than zero (t(27) = 10.587, p < .001, Fig. 5D). The opposite was the case in conditions with slower-than-previous stimulus, as average bias (M = 0.217, SD = 0.108) was greater (t(27) = 10.587, p < .001, Fig. 5C) and average TME (M = -0.013, SD = 0.018) was smaller (t(26) = -3.7556, p < .001, Fig 5D) than zero.

In the duration discrimination task, we also assessed the differences in responses to shorter versus longer comparison intervals as an indicator of how individuals responded to phase perturbations, by comparing accuracy in trials with |-DEV| and |+DEV|. Participants’ average accuracy in the latter conditions (M = 0.746, SD = 0.070) was higher (t(25) = -2.5536, p = 0.017) than that in the former conditions (M = 0.694, SD = 0.116).

Age-related changes in oscillator flexibility

One of the main goals of Experiment 2 was to compare the estimates of preferred rate and flexibility across individuals to assess the age-related changes in oscillator properties. We recruited our participant sample to have a flat age distribution, with participants ranging in age from 20 to 76 years old.

The results revealed significant correlations (FDR-corrected for multiple comparisons) only between individuals’ age and flexibility estimates from |-ΔIOI| conditions. β from logistic fits where |-ΔIOI| predicted accuracy in the duration discrimination task correlated negatively with age (r(27) = -0.525, p = 0.002, Fig. 5E). Similarly, we found a significant negative correlation between age and the inverted β from models where |-ΔIOI| predicted |TME| (r(24) = -0.389, p = 0.025, Fig. 5F). The findings indicate that the ability to adapt to faster-than-previous rates decreased with increasing age.
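The logic of correlating flexibility estimates with age and then correcting for multiple comparisons can be sketched as follows. The text states only that p-values were FDR-corrected; the Benjamini-Hochberg procedure used here is an assumption, as are the function names and the toy p-values.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy p-values (not from the study), one per age-flexibility correlation.
toy_pvals = [0.01, 0.04, 0.03, 0.5]
toy_adjusted = benjamini_hochberg(toy_pvals)
```

A correlation would then be reported as significant only if its adjusted p-value falls below the chosen threshold.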

Discussion

The results of Experiment 2 revealed correspondences between preferred rate measures from various tasks, and effects of stimulus history on performance that were stronger for older individuals. This is consistent with previous research assessing tapping behavior at stimulus rates near to or far from individuals’ SMT. During synchronization to6 or continuation of5, 12, 42 a rhythmic stimulus, individuals overproduce stimulus rates that are faster, and underproduce those that are slower, than their SMT. During continuation tapping, produced intervals have also been shown to drift back towards individuals’ SMT5, 15. However, these previous paradigms have generally used a rough sampling of stimulus rates (e.g., 3)12, 15, 42, or those that predefine conditions around SMT5, 6. Here, we used a wide and finely-sampled range of stimulus rates that were unrelated to individuals’ SMT. Thus, that we found SMT to be the anchor rate with optimal rhythmic performance further supports the idea that perception and production of rhythms are governed by a common mechanism which responds similarly to a range of stimulus rates across various tasks. Most work comparing individuals’ timing performance across stimulus rates with respect to their SMT has made use of paradigms that involve a rhythmic motor component. The current study is the first that compared individuals’ duration discrimination abilities across intervals of a rhythmic stimulus with respect to their SMT.

Preferred rates from the preference tasks with and without a rhythmic motor component (SMT and PPT, respectively) were more similar than preferred rate estimates from performance tasks (duration discrimination and paced tapping) with and without rhythmic movement. Rate preferences from the same start-rate conditions of the slider task showed strong correspondence across repeated measurements. Interestingly, rates from the fast-start conditions showed the strongest correlation across measurements, and with SMT. We interpret this difference between the fast- and slow-start conditions as being in line with the scalar property of time perception53, in that absolute timing is generally more accurate for faster rates and shorter intervals. Moreover, this interpretation is supported by similar findings of increased discrepancy between SMT and PPT at slow, as compared to fast, stimulus rates54. Preferred rates from the keypress task showed large differences between start-rate conditions, although rates from slow-start trials were correlated with those from most slider task conditions. Given that the keypress task involved no dynamical adjustment of stimulus rate, preferences may have been constrained to a smaller range of stimulus rates around the start rate; nonetheless, individual differences were still observable, and preferred rates were still consistent with those measured in the other PPT (slider) task.

Analyses focused on flexibility revealed that both duration discrimination and paced tapping performance were worse when the rate change from one trial to the next was large, regardless of the direction of the change (i.e., whether the stimulus was faster or slower than the previous one). In cases where the stimulus in a given trial was faster than the previous, slower stimulus, participants tended to perceive longer comparison intervals as shorter and tap slower than the stimulus. In the opposite cases, they tended to perceive shorter comparison intervals as longer and tap faster than the stimulus. Thus, non-zero biases and signed tapping errors observed in response to rate changes suggest that internal representations and behavior in a given trial reflected the properties of the preceding trial; we will return to this point in the General Discussion. These findings are mostly in line with findings of Experiment 1 (current study) and those from our previous tapping study18, and further emphasize the presence of history effects on timing performance. The finding of signed tapping errors supports the idea that oscillators gradually adjust their phase and period to a newly encountered stimulus, resulting in a discrepancy between the stimulus interval and oscillator period during synchronization to a rhythmic stimulus10, 24, 49. However, in our previous study18, tapping performance was especially affected when stimulus rates were faster than the preceding trial. In that study, |TME| was calculated from only synchronization tapping for the flexibility analysis. Here, we calculated |TME| from all taps from both the synchronization and continuation segments of each trial due to the lower number of trials. That is, in our previous study, we focused only on the first produced intervals on each trial, whereas here we included intervals that were produced after participants had a longer period to adapt to the new stimulus rate.

A critical finding from the current study was that flexibility, estimated from the strength of the effect of |-ΔIOI| on performance in both tasks with and without a motor component, decreased with age. Reduced performance in timing tasks for ageing individuals is a common finding across perceptual33, 34, 39 and motor31, 32 tasks. However, overall timing performance measures, namely, task averages of duration discrimination accuracy and tapping errors showed no systematic relationships with individuals’ age, suggesting that age-related changes in rhythm perception might be specific to adaptive mechanisms rather than general timing abilities.

In addition to focusing on deviations in stimulus rate between trials, we also assessed how participants responded to within-trial deviations, that is, how much the comparison interval deviated from the stimulus IOI. As in Experiment 1, accuracy was marginally higher in conditions with longer compared to shorter comparison intervals; unlike in Experiment 1, the difference was significant here. That this difference reached significance only in the current study may be due to the age of the participant sample, given the finding that adapting to faster, but not slower, stimuli was more challenging for older individuals.

Of note is that the paradigm in Experiment 2 was derived from two multi-session experiments through a series of reliability and bootstrapping analyses. The longer versions of the duration discrimination (Experiment 1, current study) and paced tapping (synchronization-continuation paradigm in Ref.18) involved around 400 trials in each of the two sessions, between which the estimates of preferred rate and flexibility were also consistent. Thus, the current paradigm can be used to assess internal oscillator properties in clinical settings or with participant samples where concerns for task difficulty or fatigue may arise.

General Discussion

The goal of the current set of studies was to highlight factors that impact auditory rhythm processing. To this end, we ran two experiments, investigating the interplay between the properties of the external world (the stimulus) and the individual responding to the stimulus (the perceiver). Adopting an entrainment perspective which considers internal oscillators as the underlying mechanism for rhythm processing7, 55, we aimed to capture this interplay by characterizing the properties of internal oscillators, and to assess how they change with advancing age. Specifically, we estimated oscillators’ preferred rates and flexibility for each individual in perceptual and motor tasks, assessed the relationship between rate preferences and optimal rates for timing performance, and tested the hypothesis that oscillator flexibility diminishes as we age.

Experiment 1 was a perceptual paradigm, where individuals’ ability to discriminate between stimulus intervals over a wide range of finely-sampled stimulus rates was assessed in two temporal contexts: one that required rapid temporal adaptation, challenging oscillator flexibility, and one without such requirement. In Experiment 2, we combined shorter versions of the duration discrimination paradigm (Experiment 1) and a paced tapping paradigm (adapted from Ref.18), using matching stimulus conditions. Experiment 2 also involved a common measure of preferred rate, the ‘spontaneous motor tempo’ (SMT) task, and two ‘preferred perceptual tempo’ (PPT) tasks (slider, keypress) where individuals’ rate preferences were assessed. From the performance paradigms, we estimated preferred rate as the stimulus rates with best performance, indexed by maximum accuracy in the duration discrimination tasks, and minimum tempo-matching errors in the paced tapping task. We defined flexibility as the ability to adapt to changes in stimulus rate, which was inversely related to how much single-trial performance was affected by trial-to-trial changes in stimulus rate.
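The definition of flexibility as inversely related to the effect of trial-to-trial rate change on performance can be illustrated with a minimal sketch. It assumes an ordinary least-squares line relating |ΔIOI| to |TME| for one participant and reads the flexibility index as the sign-flipped slope; the model family, the variable names, and the toy data are assumptions, not the study's actual analysis.

```python
def ols_slope(x, y):
    """Least-squares slope of y regressed on x (with an intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Hypothetical single-participant data: absolute rate change (ms) per trial
# and the absolute tempo-matching error on that trial.
abs_delta_ioi = [0, 100, 200, 300]
abs_tme = [0.01, 0.02, 0.03, 0.04]

beta = ols_slope(abs_delta_ioi, abs_tme)  # larger beta: stronger history effect
flexibility = -beta                       # assumed sign flip: higher = more flexible
```

Under this reading, a participant whose errors grow steeply with |ΔIOI| receives a low flexibility value, and a participant whose errors barely depend on |ΔIOI| receives a value near zero.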

Preferred rate estimates

In the rhythmic entrainment literature, preferred rate is generally estimated by SMT. However, two main aspects of the SMT task motivated us to question its explanatory power for predicting individuals’ perceptual abilities in real-world listening situations. First, given that the task involves periodic motor actions, the relative contributions of an internal timekeeper versus constraints or resonances of an individual’s motor system to the produced tapping rate cannot be separated. Second, SMT is a preference measure, since it measures the rate at which individuals prefer to tap, without introducing any interaction with a stimulus. Although there is evidence for positive relationships between SMT and rates yielding best timing abilities in paced tapping tasks5, 6, 15, rate preferences obtained from the SMT task may not necessarily predict how individuals would perform at other auditory tasks, especially those that do not involve periodic motor actions. Here, we aimed to bridge this gap and understand the potential predictive power of SMT for perceptual performance situations with higher ecological validity, by directly comparing SMT to ‘performance’ measures of preferred rate both with and without a motor component.

The results of Experiment 2 revealed that the stimulus rates for which individuals showed better timing performance were indeed correlated with SMT. However, we did not find one-to-one correspondences between SMT and preferred rate estimates from the performance tasks, and estimates were not correlated across the performance tasks. SMT was more variable across participants than preferred rates estimated from either of the performance tasks, and preferred rates estimated from tasks involving a motor component (SMT, paced tapping) tended to be slower than those estimated from the duration discrimination task. We discuss two possible primary dimensions along which these tasks differ and how these might preclude directly predicting performance on one task based on the rate preference for another: involvement of the motor system, and indicating preference versus interacting with an environmental rhythm.

Both the unpaced (SMT) and paced tapping tasks required rhythmic motor responses, as compared to the duration discrimination task where perceptual judgments were assessed. We found that preferred rate estimates from both motor tasks were slower than those obtained via duration discrimination. Interestingly, we found that the degree of ‘slowing down’ in the motor compared to the discrimination tasks was consistent within an individual: the degree of slowing from discrimination to SMT was correlated with the degree of slowing from discrimination to paced tapping. This suggests that the contribution of the ‘motor component’ to preferred rate is individually specific and quantifiable. This finding is in line with the proposal that perception and production of rhythms are governed by a system of multiple coupled oscillators1, 43, with the observed preferred rate in any task being jointly influenced by the preferred rate of a perceptual (in this case, auditory) oscillator, the preferred rate of a motor oscillator, and the coupling strength between these two nodes. Indeed, similar discrepancies between preferred rates of auditory and motor oscillators were observed in speech comprehension and were attributed to individual differences in auditory-motor coupling56. Under this assumption, we propose that the differences between preferred rate estimates from tasks with and without tapping (motor) responses, i.e., the degree of slowing when the motor component is added, will increase with the difference in eigenfrequencies of the perceptual and motor oscillators (their detuning), and decrease with increasing coupling strength.

The other difference between the tasks by which preferred rate was estimated was the requirement to interact with a stimulus rhythm in the performance tasks, whereas the SMT and PPT tasks only involved indicating a preference. Jones and McAuley argue that in the presence of a stimulus, the preferred rate can be ‘pushed around’ by the temporal context, given that the oscillators are adaptive and can perform within their entrainment regions25. Results of Experiment 1 confirmed this prediction by revealing an effect of temporal context on preferred rate: the distribution of estimates from the temporally-challenging condition was narrower than that from the condition that required minimal temporal adaptation. Thus, stimulus presentation in the Experiment 2 duration discrimination and paced tapping tasks, as opposed to the SMT task, may have contributed to the differences in preferred rate estimates. Additionally, in the paced tapping task, participants synchronized to the stimulus, which has been shown to improve tapping precision57, 58 and perceptual judgments59, 60, and thus may have contributed to the estimate differences.

Flexibility estimates

Another goal of the current study was to investigate the circumstances that negatively impacted timing abilities. Specifically, we focused on trial-to-trial changes in stimulus rate, and to what extent individuals were able to adapt to such changes, which was our definition of oscillator flexibility. In line with previous literature which reveals effects of stimulus history on perceptual22, 25-27 and motor16, 18, 21, 23, 24 responses, results of the current study showed that performance in duration discrimination and paced tapping tasks decreased as trial-to-trial changes in stimulus rate increased. Moreover, single-trial responses were biased such that they reflected the properties of the stimulus from the preceding trial. This set of findings is in line with predictions of oscillator models49. In a changing rhythmic context, the oscillator adapts to the newly encountered stimulus rate by gradually updating its phase and period10. The extent and time course of adaptation, however, will depend on the oscillator’s flexibility, which might be modeled via error correction parameters in commonly used models of interval timing10, 49 or synchronized tapping61. An inflexible oscillator’s period would adjust more slowly to a new rate, and so would continue to reflect the previously entrained rate, due to hysteresis. For the duration discrimination task, any comparison interval that is shorter than the oscillator’s period would be classified as ‘shorter’, and vice versa, regardless of whether the interval was indeed shorter than the intervals making up the standard, isochronous rhythm. This means that when the previous trial was faster than the current one, the oscillator period would be relatively short, and participants would be biased to judge comparisons as “longer”. Conversely, when the previous trial was slower than the current one, the oscillator period would be relatively long, and “shorter” responses would be more likely. 
The analysis of bias indicated that this was exactly the case for the current data. Similarly, tapping rates gradually updated from the preceding stimulus rate to the current one, resulting in tempo-matching errors in the direction of the previous stimulus rate. That is, when the previous trial was faster than the current one, tapping rates would underestimate the stimulus rate, and when the previous trial was slower than the current one, tapping rates would overestimate the stimulus rate. Again, the TME analysis confirmed this to be the case.
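The hysteresis argument for the duration discrimination task can be made concrete with a toy simulation. The sketch below assumes a simple linear error-correction rule in which the oscillator period moves a fixed fraction `alpha` of the way towards the stimulus IOI on every event; this rule, the parameter value, and the function names are illustrative assumptions, not the models cited in the text.

```python
def adapt_period(start_period, stimulus_ioi, n_events, alpha=0.3):
    """Period after n_events of gradual adaptation towards stimulus_ioi (ms)."""
    period = start_period
    for _ in range(n_events):
        # Correct a fraction alpha of the remaining period error per event.
        period += alpha * (stimulus_ioi - period)
    return period


def judge(comparison_interval, period):
    """Classify the comparison relative to the oscillator period, not the IOI."""
    return "shorter" if comparison_interval < period else "longer"


# Previous trial slower (IOI = 600 ms), current trial faster (IOI = 400 ms).
lagging_period = adapt_period(600, 400, n_events=5)
# A comparison that is objectively longer than the 400 ms standard.
response_after_slow = judge(410, lagging_period)

# Previous trial faster (IOI = 400 ms), current trial slower (IOI = 600 ms).
leading_period = adapt_period(400, 600, n_events=5)
# A comparison that is objectively shorter than the 600 ms standard.
response_after_fast = judge(590, leading_period)
```

With a small `alpha` (an inflexible oscillator) the period is still longer than 400 ms after five events, so a 410 ms comparison is classified as "shorter"; in the reverse case the period is still shorter than 600 ms, so a 590 ms comparison is classified as "longer". Raising `alpha` shrinks both biases.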

Age-related changes in oscillator flexibility

A critical finding of the current study was an age-related decline in a specific ability: temporal adaptation to faster-than-previous stimuli. In trials where the stimulus was faster than the previous one, accuracy in the duration discrimination task decreased, and tempo-matching errors in the paced tapping task increased, as a function of the amount of rate difference between trials, more so for older individuals.

The timing literature reveals age-related changes in time perception, such as a decrease in the accuracy of temporal estimates62, and slower tapping rates in spontaneous12, 30, 63 or forced31 unpaced tapping tasks. These changes are generally attributed to slowing of the internal time-keeper mechanisms30, 33 or a reduction of attentional resources64. Moreover, studies comparing older and younger individuals’ preferences and performances in paced tapping paradigms reveal mixed results32. In the current study, we did not observe age-related changes in overall performance measures such as perceptual accuracy or tapping errors, and contrary to previous work we did not find a slowing of preferred rate no matter how it was estimated. Instead, our findings point to age-related changes in adaptive mechanisms underlying temporal processing. Studies assessing temporal adaptation abilities show that older individuals adapt their movements to temporal perturbations more slowly and less efficiently than younger individuals65, 66 and with less error correction67. We observed an age-related decline in temporal adaptation during both perception of and synchronization with auditory stimuli, suggesting a common source that affected the two means of responding.

Previous work has revealed age-related differences in neural entrainment to auditory rhythms. Most studies have focused on neural entrainment to amplitude modulated sounds, of which metronomic stimuli like those we used here are a special case, and found that older adults entrain more strongly and in a more stereotyped (less flexible) way36-38. A similar pattern was observed for entrainment to the amplitude envelope of speech68, 69. A mixed pattern of results has been reported for frequency modulated sounds; however, the existing data suggest that these differences might depend on parameters such as modulation rate and depth39, 70, which we will not further address here. Moreover, older adults show less neural adaptation than younger adults in temporal contexts where stimulus rate changes gradually and predictably37. Another functional difference between younger and older brains that is potentially relevant here concerns “neural noise”. Variability in brain activity as measured in the BOLD signal using functional magnetic resonance imaging is higher in younger than older brains, again suggesting inflexible and stereotyped neural activity in older brains. Indeed, neural noise is associated with faster and more consistent performance across a variety of cognitive tasks71, 72. Similarly, 1/f noise measured with EEG, associated with predictive processing in a lexical task, was lower for older than younger individuals73. Taken together, these results suggest that poorer performance in temporal tasks that involve prediction and adaptation might reflect less flexible, overly stereotyped neural responses in older adults. This might indicate a loss of flexibility in the generating oscillator(s).

An interesting aspect of the current findings was that adaptation to faster, but not slower stimulus rates was more difficult for older individuals. Oscillator models predict this asymmetry, with increased tapping asynchronies to speeding up compared to slowing down stimuli due to the ‘period adaptation function’ of the oscillator24. This was the case for the paced tapping paradigm (current study), as the effect of rate change on tapping errors was smaller and significantly more variable when stimuli slowed down as opposed to sped up, paralleling previous findings18. In the duration discrimination tasks, although the magnitude of the effect of rate change was similar for both rate-change directions, only adaptation to faster stimuli worsened with age. Though evidence shows reduced adaptation to time-compressed74 or artificially speeded2 speech in older individuals, further research is needed to address the sources of adaptation to fast versus slow stimuli in ageing.

Individual differences in internal oscillator properties

One advantage of the current approach is its focus on individual variability. Previous work on rhythm perception and production, as well as aging, has largely used traditional statistical approaches involving group or condition comparisons of central tendency measures. In these cases, variability is attributed to measurement error or noise. In the current work, we opted to view variability as potentially attributable to individual differences in internal oscillator properties that may in future work be shown to have predictive power for successful outcomes in real-world listening situations. This focus on individual differences revealed several novel findings that would otherwise not have been accessible. First, we found correspondence between the rates at which individuals prefer to tap their finger, prefer to listen, and perform perceptual and motor tasks most accurately, all pointing to preferred rates of potentially coupled perceptual and motor internal oscillatory systems. Second, we observed harmonic relationships between the preferred rates estimated from the duration discrimination paradigm under two different temporal contexts (Experiment 1); this is in line with the assumption that oscillators are capable of entraining to multiple stimulus rates within a temporal hierarchy49, 75, and further strengthens our choice to adopt an entrainment approach here. Finally, we found that oscillator flexibility decreased with age; this finding is supported by evidence from neural entrainment research and adds to the narrative regarding the effects of ageing on the auditory system.

The pared-down versions of the duration discrimination and paced tapping paradigms described in Experiment 2 were carefully designed based on analysis of their correspondence between Experiment 1 and our previous tapping study18 in terms of their main results. That is, we designed the Experiment 2 tasks to be the streamlined versions that would yield the same main results as their longer counterparts. The reasons for minimizing the duration of the tasks were (1) it allowed us to test and compare perception and production in a within-participant manner in a single session, and (2) it improved suitability for testing older adults, who we did not want to subject to an overly long or multi-session experiment. That the results of Experiment 2 replicated those from Experiment 1 and Ref.18 independently confirmed the robustness of the designs. Thus, we would propose that these minimized designs could be used in a more diagnostic capacity in future work to measure and test predictions about internal oscillator properties of older adults or a clinical population of interest.

Conclusion

To summarize, we developed a paradigm to estimate individuals’ internal oscillator properties. Performance in both duration discrimination and synchronized tapping tasks was best at a range of stimulus rates that was specific to each individual – their preferred rate – and was broadly consistent with preferred rates estimated from preference tasks (SMT and PPT). One important departure from this consistency was that involving a motor requirement slowed preferred rates, and we were able to quantify the contribution of this motor component, which was consistent within individuals across tasks. Performance decreased as a function of change in stimulus rate between consecutive trials. The extent to which individuals were able to adapt to the changes – oscillator flexibility – decreased with age, in accordance with research on neural entrainment and neural noise. Overall, the findings support the hypothesis that an oscillatory system with a stable preferred rate underlies perception and production of rhythms, and that this system loses its ability to flexibly adapt to changes in the external rhythmic context as we age.

Supplemental Materials

Illustration of the optimization procedure and parameter choices for smoothing accuracy in Experiment 1. A Bottom: An example participant’s linear-order session dataset. Each color represents an output of the smoothing function that uses a window size, ranging from 10 (yellow) to 50 (dark blue). Top: The number of maximum values on the smoothed accuracy for each window size. B Participants’ average number of curve maxima for random-order (blue) and linear-order (magenta) sessions. Arrows show the optimized window sizes for the session types, where each individual’s dataset had only one curve maximum (dashed line).
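The window-size selection illustrated in this figure can be sketched as a search for the smallest smoothing window that leaves a single maximum in the smoothed curve. The sketch assumes an unweighted moving average and a strict local-maximum count; the smoothing function actually used, the function names, and the toy curve are assumptions for illustration only.

```python
def moving_average(values, window):
    """Unweighted moving average computed over full windows only."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def count_local_maxima(curve):
    """Count interior points strictly greater than both neighbours."""
    return sum(1 for i in range(1, len(curve) - 1)
               if curve[i - 1] < curve[i] > curve[i + 1])


def smallest_single_peak_window(values, windows):
    """Smallest window whose smoothed curve has exactly one maximum."""
    for window in sorted(windows):
        if count_local_maxima(moving_average(values, window)) == 1:
            return window
    return None  # no candidate window yields a single maximum


# Toy curve with three bumps; heavier smoothing merges them into one.
toy_curve = [0.0, 0.2, 0.0, 0.3, 0.0, 0.2, 0.0]
best_window = smallest_single_peak_window(toy_curve, windows=(1, 3, 5))
```

In the procedure shown in the figure the criterion is applied across participants, so a group-level version would pick the smallest window for which every individual dataset yields one maximum.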

Details of the bootstrapping analyses for Experiment 2

The experiment in Kaya & Henry (2022) was a longer version of the paced tapping paradigm from Experiment 2 (current study). The IOIs of the isochronous stimulus sequences were sampled from a range of 200 ms to 1000 ms with a step size of 2 ms, and varied from trial to trial. We estimated up to 3 preferred rates for each individual by fitting curves to continuation tapping tempo-matching errors (|TMEcontinuation|), and obtaining the IOIs at the curves’ local minima. Estimates from the two identical sessions that each participant completed showed strong correspondence and harmonic relationships, as measured by the permutation test described in the Experiment 1 methods section.

For the bootstrapping analysis, we first downsampled each participant’s single-session data, with each even step size between 4 and 20 ms. That is, for the respective step size, we kept only trials whose IOI fell on the grid obtained by repeatedly adding the step size to the smallest IOI (200 ms) up to the largest IOI (1000 ms) (e.g., trials with IOI = 200, 204, 208 ms and so on, for a step size of 4 ms). On each downsampled dataset, we performed the preferred rate estimation procedure we used in the experiment analyses. To identify the optimum step size that would still represent the experiment’s findings, we assessed the correspondences between (1) preferred rate estimates from the original and downsampled datasets for each session and (2) estimates from downsampled datasets between sessions. In both steps, the correspondence between estimates was quantified by their harmonic difference (i.e., the sum of the datapoints’ Euclidean distances to the closest line among the y=x, y=2x and y=x/2 lines). A smaller difference value indicated that the estimates subject to comparison were similar, or close to doubles or halves of each other. Harmonic differences obtained from the first and second steps of the bootstrapping analysis are shown in Supplementary Figure S2a and S2b, respectively. Together, the bootstrapping analyses showed that the average harmonic difference between estimates from original versus downsampled datasets was smallest at the step size of 10 ms, where the harmonic difference between downsampled sessions’ estimates was also small.
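The two building blocks of this analysis, downsampling by step size and the harmonic difference, can be sketched as follows. The trial representation (a list of dictionaries with an `"ioi"` key) and the function names are hypothetical; the distance of a point (x, y) to a line y = mx is computed with the standard formula |mx - y| / sqrt(m^2 + 1).

```python
import math


def downsample(trials, step, start=200, stop=1000):
    """Keep trials whose IOI lies on the grid start, start + step, ..., stop."""
    return [trial for trial in trials
            if start <= trial["ioi"] <= stop
            and (trial["ioi"] - start) % step == 0]


def harmonic_difference(x_estimates, y_estimates):
    """Sum of each point's distance to the nearest of y=x, y=2x, y=x/2."""
    slopes = (1.0, 2.0, 0.5)
    total = 0.0
    for x, y in zip(x_estimates, y_estimates):
        total += min(abs(m * x - y) / math.sqrt(m * m + 1.0) for m in slopes)
    return total


# Toy session sampled every 2 ms between 200 and 220 ms.
toy_trials = [{"ioi": ioi} for ioi in range(200, 221, 2)]
every_4_ms = downsample(toy_trials, step=4)

# Identical or doubled estimates contribute nothing to the difference.
perfect = harmonic_difference([400, 400], [400, 800])
near_identity = harmonic_difference([400], [410])
```

Because estimates that are doubles or halves of each other lie exactly on one of the three lines, they count as fully corresponding, which is what makes the measure sensitive to harmonic relationships rather than only to identity.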

Results of the bootstrapping analysis. In A, each circle shows harmonic difference between preferred rate estimates from the original and downsampled datasets for session 1 (blue) and session 2 (magenta) and their average (dotted black line) at the respective step size. In B, each circle shows harmonic difference between preferred rate estimates from the downsampled session 1 and session 2 datasets at the respective step size.