Fractal cycles of sleep: a new aperiodic activity-based definition of sleep cycles

  1. Radboud University Medical Centre, Donders Institute for Brain, Cognition and Behavior, Nijmegen, Netherlands
  2. Child Development Center and Children’s Research Center, University Children’s Hospital Zürich, University of Zürich, Zürich, Switzerland
  3. Department of Child and Adolescent Psychiatry and Psychotherapy, Psychiatric University Hospital Zurich, Zürich, Switzerland
  4. Max Planck Institute of Psychiatry, Munich, Germany
  5. Klinikum Ingolstadt, Centre of Mental Health, Ingolstadt, Germany
  6. Semmelweis University, Institute of Behavioural Sciences, Budapest, Hungary

Peer review process

Revised: This Reviewed Preprint has been revised by the authors in response to the previous round of peer review; the eLife assessment and the public reviews have been updated where necessary by the editors and peer reviewers.

Read more about eLife’s peer review process.

Editors

  • Reviewing Editor
    Adrien Peyrache
    McGill University, Montreal, Canada
  • Senior Editor
    Christian Büchel
    University Medical Center Hamburg-Eppendorf, Hamburg, Germany

Reviewer #1 (Public review):

In this study, Rosenblum et al introduce a novel and automatic way of calculating sleep cycles from human EEG. Previous results have shown that the slope of the non-oscillatory component of the power spectrum (called the aperiodic or fractal component) changes with sleep stage. Building on this, the authors present an algorithm that extracts the continuous-time fluctuations in the fractal slope and propose that peaks in this variable can be used to identify sleep cycle limits. Cycles defined in this way are termed "fractal cycles". The main focus of the article is a comparison of "fractal" and "classical" (ie defined manually based on the hypnogram) sleep cycles in numerous datasets.

The manuscript amply illustrates through examples the strong overlap between fractal and classical cycle identification. Accordingly, a high percentage (81%) can be matched one-to-one between methods and sleep cycle duration is well correlated (around R = 0.5). Moreover, the methods track certain global changes in sleep structure in different populations: shorter cycles in children and longer cycles in patients medicated with REM-suppressing anti-depressants. Finally, a major strength of the results is that they show similar agreement between fractal and classical sleep cycle length in 5 different data sets, showing that it is robust to changes in recording settings and methods.

The match between fractal and classical cycles is not one-to-one. For example, the fractal method identifies a correlation between age and cycle duration in adults that is not apparent with the classical method.
The difference between the fractal and classical methods appear to be linked to the uncertain definition of sleep cycles since they are tied to when exactly the cycle begins/ends and whether or not to count cycles during fractured sleep architecture at sleep onset. Moreover, the discrepancies between the two are on the order of that found between classical cycles defined manually or via an automatic algorithm.

Overall the fractal cycle is an attractive method to study sleep architecture since it dispenses with time-consuming and potentially subjective manual identification of sleep cycles. However, given its difference from the classical method, it is unlikely that fractal scoring will be able to replace classical scoring directly. By providing a complementary quantification, it will likely contribute to refining the definition of sleep cycles that is currently ambiguous in certain cases. Moreover, it has the potential to be applied to animal studies which rarely deal with sleep cycle structure.

Reviewer #2 (Public review):

Summary:

This study focused on using strictly the slope of the power spectral density (PSD) to perform automated sleep scoring and evaluation of the durations of sleep cycles. The method appears to work well because the slope of the PSD is highest during slow-wave sleep, and lowest during waking and REM sleep. Therefore, when smoothed and analyzed across time,there are cyclical variations in the slope of the PSD, fit using an IRASA (Irregularly resampled auto-spectral analysis) algorithm proposed by Wen & Liu (2016).

Strengths:

The main novelty of the study is that the non-fractal (oscillatory) components of the PSD that are more typically used during sleep scoring can be essentially ignored because the key information is already contained within the fractal (slope) component. The authors show that for the most part, results are fairly consistent between this and conventional sleep scoring, but in some cases show disagreements that may be scientifically interesting.

Weaknesses:

The previous weaknesses were well-addressed by the authors in the revised manuscript. I will note that from the fractal cycle perspective, waking and REM sleep are not very dissimilar. Combining these states underlies some of the key results of this study.

Author response:

The following is the authors’ response to the original reviews.

Reviewer #1 (Public Review):

Weaknesses:

The match between fractal and classical cycles is not one-to-one. For example, the fractal method identifies a correlation between age and cycle duration in adults that is not apparent with the classical method. This raises the question as to whether differences are due to one method being more reliable than another or whether they are also identifying different underlying biological differences. It is not clear for example whether the agreement between the two methods is better or worse than between two human scorers, which generally serve as a gold standard to validate novel methods. The authors provide some insight into differences between the methods that could account for differences in results. However, given that the fractal method is automatic it would be important to clearly identify criteria for recordings in which it will produce similar results to the classical method.

Thank you for these insightful suggestions. In the revised Manuscript, we have added a number of additional analyses that provide a quantitative comparison between the classical and fractal cycle approaches aiming to identify the source of the discrepancies between classical and fractal cycle durations. Likewise, we assessed the intra-fractal and intra-classical method reliability as outlined below.

Reviewer #1 (Recommendations For The Authors):

One of the challenges in interpreting the results of the manuscript is understanding whether the differences between the two methods are due to a genuine difference in what these two methods are quantifying or simply noise/variability in each method. If the authors could provide some more insight into this, it would be a great help in assessing their findings and I think bolster the applicability of their method.

(1) Method reliability: The manuscript clearly shows that cycle length is robustly correlated between fractal and classical in multiple datasets, however, it is hard to assign a meaningful interpretation to the correlation value (ie R = 0.5) without some reference point. This could be provided by looking at the intra-method correlation of cycle lengths. In the case of classical scoring, inter-scorer results could be compared, if the R-value here is significantly higher than 0.5 it would suggest genuine differences between the methods. In the case of fractal scoring, inter-electrode results could be compared / results with slight changes to the peak prominence threshold or smoothing window.

In the revised Manuscript, we performed the following analyses to show the intra-method reliability:

a) Classical cycle reliability: For the revised Manuscript, an additional scorer has independently defined classical sleep cycles for all datasets and marked sleep cycles with skipped REM sleep. Likewise, we have performed automatic sleep cycle detection using the R “SleepCycles” package by Blume & Cajochen (2021). We have added a new Table S8 to Supplementary Material 2 that shows the averaged cycle durations and cycle numbers obtained by the two human scorers and automatic algorithm as well as the inter-scorer rate agreement. We have added a new sheet named “Classical method reliability” that reports classical cycle durations for each participant and each dataset as defined by two human scorers and the algorithm To the Supplementary Excel file.

We found that the correlation coefficients between two human scorers ranged from 0.69 to 0.91 (in literature, r’s > 0.7 are defined as strong scores) in different datasets, thus being higher than correlation coefficients between fractal and classical cycle durations, which in turn ranged from 0.41 to 0.55 (r’s in the range of 0.3 – 0.7 are considered moderate scores). The correlation coefficients between human raters and the automatic algorithm showed remarkably lower coefficients ranging from 0.30 to 0.69 (moderate scores) in different datasets, thus lying within the range of the correlation coefficients between fractal and classical cycle durations. This analysis is reported in Supplementary Material 2, section ”Intra-classical method reliability” and Table S8.

b) Fractal cycle reliability: In the revised Supplementary Material 2 of our Manuscript, we assessed the intra-fractal method reliability, we correlated between the durations of fractal cycles calculated as defined in the main text, i.e., using a minimum peak prominence of 0.94 z and smoothing window of 101 thirty-second epochs, with those calculated using a minimum peak prominence ranging from 0.86 to 1.20 z with a step size of 0.04 z and smoothing windows ranging from 81 to 121 thirty-second epochs with a step size of 10 epochs (Table S7). We found that fractal cycle durations calculated using adjacent minimum peak prominence (i.e., those that differed by 0.04 z) showed r’s > 0.92, while those calculated using adjacent smoothing windows (i.e., those that differed by 10 epochs) showed r’s > 0.84. In addition, we correlated fractal cycle durations defined using different channels and found that the correlation coefficients ranged between 0.66 – 0.67 (Table S1). Thus, most of the correlations performed to assess intra-fractal method reliability showed correlation coefficients (r > 0.6) higher than those obtained to assess inter-method reliability (r = 0.41 – 0.55), i.e., correlations between fractal and classical cycle. This analysis is reported in Supplementary Material 2, section ”Intra-fractal method reliability” and Table S7. Likewise, we have added a new sheet named “Fractal method reliability” that reports the actual values for the abovementioned parameters to the Supplementary Excel file. For a discussion on potential sources of differences, see below.

(2) Origin of method differences: The authors outline a few possible sources of discrepancies between the two methods (peak vs REM end, skipped REM cycle detection...) but do not quantify these contributions. It would be interesting to identify some factors that could predict for either a given night of sleep or dataset whether it is likely to show a strong or weak agreement between methods. This could be achieved by correlating measures of the proposed differences ("peak flatness", fractal cycle depth, or proportion of skipped REM cycles) with the mismatch between the two methods.

In the revised Manuscript, we have quantified a few possible sources of discrepancies between the durations of fractal vs classical cycles and added a new section named “Sources of fractal and classical cycle mismatches” to the Results as well as new Tables 5 and S10 (Supplementary Material 2). Namely, we correlated the difference in classical vs fractal sleep cycle durations on the one side, and either the amplitude of fractal descent/ascent (to reflect fractal cycle depth), duration of cycles with skipped REM sleep/TST, duration of wake after sleep onset/TST or the REM episode length of a given cycle (to reflect peak flatness) on the other side. We found that a higher difference in classical vs fractal cycle duration was associated with a higher proportion of wake after sleep onset (r = 0.226, p = 0.001), shallower fractal descents (r = 0.15, p = 0.002) and longer REM episodes (r = 0.358, p < 0.001, n = 417 cycles, Table S10 in Supplementary Material 2). The rest of the assessed parameters showed no significant correlations (Table S10). We have added a new sheet named “Fractal-classical mismatch” that reports the actual values for the abovementioned parameters to the Supplementary Excel file.

(3) Skipped REM cycles: the authors underline that the fractal method identified skipped REM cycles. It seems likely that manual identification of skipped REM cycles is particularly challenging (ie we would expect this to be a particular source of error between two human scorers). If this is indeed the case, it would be interesting to discuss, since it would highlight an advantage of their methodology that they already point out (l644).

In the revised Manuscript, we have added the inter-scorer rate agreement regarding cycles with skipped REM sleep, which was equal to 61%, which is 32% lower than the performance of our fractal cycle algorithm (93%). These findings are now reported in the “Skipped cycles” section of the Results and in Table S9 of Supplementary Material 2. We also discuss them in Discussion:

“Our algorithm detected skipped cycles in 93% of cases while the hypnogram-based agreement on the presence/absence of skipped cycles between two independent human raters was 61% only; thus, 32% lower. We deduce that the fractal cycle algorithm detected skipped cycles since a lightening of sleep that replaces a REM episode in skipped cycles is often expressed as a local peak in fractal time series.”
Discussion, section “Fractal and classical cycles comparison”, paragraph 5.

Minor comments:

- In the subjects where the number of fractal and classical cycles did not match, how large was the difference (ie just one extra cycle or more)? Correlating cycle numbers could be one way to quantify this.

In the revised Manuscript, we have reported the required information for the participants with no one-to-one match (46% of all participants) as follows:

“In the remaining 46% of the participants, the difference between the fractal and classical cycle numbers ranged from -2 to 2 with the average of -0.23 ± 1.23 cycle. This subgroup had 4.6 ± 1.2 fractal cycles per participant, while the number of classical cycles was 4.9 ± 0.7 cycles per participant. The correlation coefficient between the fractal and classical cycle numbers was 0.280 (p = 0.006) and between the cycle durations – 0.278 (p=0.006).” Results, section “Correspondence between fractal and classical cycles”, last paragraph.

- When discussing the skipped REM cycles (l467), the authors explain: "For simplicity and between-subject consistency, we included in the analysis only the first cycles". I'm not sure I understood this, could they clarify to which analysis they are referring to?

In the revised Manuscript, we performed this analysis twice: using first cycles and using all cycles and therefore have rephrased this as follows:

_“We tested whether the fractal cycle algorithm can detect skipped cycles, i.e., the cycles where an anticipated REM episode is skipped (possibly due to too high homeostatic pressure). We performed this analysis twice. First, we counted all skipped cycles (except the last cycles of a night, which might lack REM episode for other reasons, e.g., a participant had/was woken up). Second, we counted only the first classical cycles (i.e., the first cycle out of the 4 – 6 cycles that each participant had per night, Fig. 3 A – B) as these cy_cles coincide with the highest NREM pressure. An additional reason to disregard skipped cycles observed later during the night was our aim to achieve higher between-subject consistency as later skipped cycles were observed in only a small number of participants.” Results, section “Skipped cycles”, first paragraph.

- The inclusion of all the hypnograms as a supplementary is a great idea to give the reader concrete intuition of the data. If the limits of the sleep cycles for both methods could be added it would be very useful.

Supplementary Material 1 has been updated such that each graph has a mark showing the onsets of fractal and classical sleep cycles, including classical cycles with skipped REM sleep.

- The difference in cycle duration between adults and children seems stronger / more reliable for the fractal cycle method, particularly in the histogram (Figure 3C). Is this difference statistically significant?

In the revised Manuscript, we have added the Multivariate Analysis of Variance to compare F-values, partial R-squared and eta squared. The findings are as follows:

“To compare the fractal approach with the classical one, we performed a Multivariate Analysis of Variance with fractal and classical cycle durations as dependent variables, the group as an independent variable and the age as a covariate. We found that fractal cycle durations showed higher F-values (F(1, 43) = 4.5 vs F(1, 43) = 3.1), adjusted R squared (0.138 vs 0.089) and effect sizes (partial eta squared 0.18 vs 0.13) than classical cycle durations.” Results, Fractal cycles in children and adolescents, paragraph 3.

There have been some recent efforts to define sleep cycles in an automatic way using machine learning approaches. It could be interesting to mention these in the discussion and highlight their relevance to the general endeavour of automatizing the sleep cycle identification process.

In the Discussion of the revised Manuscript, we have added the section on the existing automatic sleep cycle definition algorithms:

“Even though recently, there has been a significant surge in sleep analysis incorporating various machine learning techniques and deep neural network architectures, we should stress that this research line mainly focused on the automatic classification of sleep stages and disorders almost ignoring the area of sleep cycles. Here, as a reference method, we used one of the very few available algorithms for sleep cycle detection (Blume & Cajochen, 2021). We found that automatically identified classical sleep cycles only moderately correlated with those detected by human raters (r’s = 0.3 – 0.7 in different datasets). These coefficients lay within the range of the coefficients between fractal and classical cycle durations (r = 0.41 – 0.55, moderate) and outside the range of the coefficients between classical cycle durations detected by two human scorers (r’s = 0.7 – 0.9, strong, Supplementary Material 2, Table S8).” Discussion, section “Fractal and classical cycles comparison”, paragraph 4.

Reviewer #2 (Public Review):

One weakness of the study, from my perspective, was that the IRASA fits to the data (e.g. the PSD, such as in Figure 1B), were not illustrated. One cannot get a sense of whether or not the algorithm is based entirely on the fractal component or whether the oscillatory component of the PSD also influences the slope calculations. This should be better illustrated, but I assume the fits are quite good.

Thank you for this suggestion. In the revised Manuscript, we have added a new figure (Fig.S1 E, Supplementary Material 2), illustrating the goodness of fit of the data as assessed by the IRASA method.

The cycles detected using IRASA are called fractal cycles. I appreciate the use of a simple term for this, but I am also concerned whether it could be potentially misleading? The term suggests there is something fractal about the cycle, whereas it's really just that the fractal component of the PSD is used to detect the cycle. A more appropriate term could be "fractal-detected cycles" or "fractal-based cycle" perhaps?

We agree that these cycles are not fractal per se. In the Introduction, when we mention them for the first time, we name them “fractal activity-based cycles of sleep” and immediately after that add “or fractal cycles for short”. In the revised version, we renewed this abbreviation with each new major section and in Abstract. Nevertheless, given that the term “fractal cycles” is used 88 times, after those “reminders”, we used the short name again to facilitate readability. We hope that this will highlight that the cycles are not fractal per se and thus reduce the possible confusion while keeping the manuscript short.

The study performs various comparisons of the durations of sleep cycles evaluated by the IRASA-based algorithm vs. conventional sleep scoring. One concern I had was that it appears cycles were simply identified by their order (first, second, etc.) but were not otherwise matched. This is problematic because, as evident from examples such as Figure 3B, sometimes one cycle conventionally scored is matched onto two fractal-based cycles. In the case of the Figure 3B example, it would be more appropriate to compare the duration of conventional cycle 5 vs. fractal cycle 7, rather than 5 vs. 5, as it appears is currently being performed.

In cases where the number of fractal cycles differed from the number of classical cycles (from 34 to 55% in different datasets as in the case of Fig.3B), we did not perform one-to-one matching of cycles. Instead, we averaged the duration of the fractal and classical cycles over each participant and only then correlated between them (Fig.2C). For a subset of the participants (45 – 66% of the participants in different datasets) with a one-to-one match between the fractal and classical cycles, we performed an additional correlation without averaging, i.e., we correlated the durations of individual fractal and classical cycles (Fig.4S of Supplementary Material 2). This is stated in the Methods, section Statistical analysis, paragraph 2.

There are a few statements in the discussion that I felt were either not well-supported. L629: about the "little biological foundation" of categorical definitions, e.g. for REM sleep or wake? I cannot agree with this statement as written. Also about "the gradual nature of typical biological processes". Surely the action potential is not gradual and there are many other examples of all-or-none biological events.

In the revised Manuscript, we have removed these statements from both Introduction and Discussion.

The authors appear to acknowledge a key point, which is that their methods do not discriminate between awake and REM periods. Thus their algorithm essentially detected cycles of slow-wave sleep alternating with wake/REM. Judging by the examples provided this appears to account for both the correspondence between fractal-based and conventional cycles, as well as their disagreements during the early part of the sleep cycle. While this point is acknowledged in the discussion section around L686. I am surprised that the authors then argue against this correspondence on L695. I did not find the "not-a-number" controls to be convincing. No examples were provided of such cycles, and it's hard to understand how positive z-values of the slopes are possible without the presence of some wake unless N1 stages are sufficient to provide a detected cycle (in which case, then the argument still holds except that its alterations between slow-wave sleep and N1 that could be what drives the detection).

In the revised Manuscript, we have removed the “NaN analysis” from both Results and Discussion. We have replaced it with the correlation between the difference between the durations of the classical and fractal cycles and proportion of wake after sleep onset. The finding is as follows:

“A larger difference between the durations of the classical and fractal cycles was associated with a higher proportion of wake after sleep onset in 3/5 datasets as well as in the merged dataset (Supplementary Material 2, Table S10).” Results, section “Fractal cycles and wake after sleep onset”, last two sentences. This is also discussed in Discussion, section “Fractal cycles and age”, paragraph 1, last sentence.

To me, it seems important to make clear whether the paper is proposing a different definition of cycles that could be easily detected without considering fractals or spectral slopes, but simply adjusting what one calls the onset/offset of a cycle, or whether there is something fundamentally important about measuring the PSD slope. The paper seems to be suggesting the latter but my sense from the results is that it's rather the former.

Thank you for this important comment. Overall, our paper suggests that the fractal approach might reflect the cycling nature of sleep in a more precise and sensitive way than classical hypnograms. Importantly, neither fractal nor classical methods can shed light on the mechanism underlying sleep cycle generation due to their correlational approach. Despite this, the advantages of fractal over classical methods mentioned in our Manuscript are as follows:

(1) Fractal cycles are based on a real-valued metric with known neurophysiological functional significance, which introduces a biological foundation and a more gradual impression of nocturnal changes compared to the abrupt changes that are inherent to hypnograms that use a rather arbitrary assigned categorical value (e.g., wake=0, REM=-1, N1=-2, N2=-3 and SWS=-4, Fig.2 A).

(2) Fractal cycle computation is automatic and thus objective, whereas classical sleep cycle detection is usually based on the visual inspection of hypnograms, which is time-consuming, subjective and error-prone. Few automatic algorithms are available for sleep cycle detection, which only moderately correlated with classical cycles detected by human raters (r’s = 0.3 – 0.7 in different datasets here).

(3) Defining the precise end of a classical sleep cycle with skipped REM sleep that is common in children, adolescents and young adults using a hypnogram is often difficult and arbitrary. The fractal cycle algorithm could detect such cycles in 93% of cases while the hypnogram-based agreement on the presence/absence of skipped cycles between two independent human raters was 61% only; thus, 32% lower.

(4) The fractal analysis showed a stronger effect size, higher F-value and R-squared than the classical analysis for the cycle duration comparison in children and adolescents vs young adults. The first and second fractal cycles were significantly shorter in the pediatric compared to the adult group, whereas the classical approach could not detect this difference.

(5) Fractal – but not classical – cycle durations correlated with the age of adult participants.

These bullets are now summarized in Table 5 that has been added to the Discussion of the revised manuscript.

  1. Howard Hughes Medical Institute
  2. Wellcome Trust
  3. Max-Planck-Gesellschaft
  4. Knut and Alice Wallenberg Foundation