7 figures, 1 table and 3 additional files

Figures

Figure 1
Development and validation of models for predicting melatonin phase from transcriptomic samples.

mRNA abundance and melatonin data from 53 participants collected in four conditions (panels c, d, g and h) were partitioned into two groups: a training set (329 mRNA samples from 26 participants) to …

https://doi.org/10.7554/eLife.20214.002
Figure 2 with 4 supplements
Rhythmicity in the conditions and identification of molecular timetable genes.

(a–d) Square of correlation value (r²) vs. rank of correlation as a measure of overall 24 hr rhythmicity in the transcriptome, separately and across conditions. For each transcript, the correlation …

https://doi.org/10.7554/eLife.20214.003
Figure 2—source data 1

Source data file for generating panels in Figure 2.

https://doi.org/10.7554/eLife.20214.004
Figure 2—figure supplement 1
Identification of correlation cutoff threshold for the construction of a molecular timetable.

(a) Effect of cutoff threshold on performance of molecular timetable models constructed using genes meeting a particular threshold. Performance assessed by R² values following …

https://doi.org/10.7554/eLife.20214.005
Figure 2—figure supplement 2
mRNA abundance profiles for genes belonging to the phase marker gene list generated by our implementation of the molecular timetable model.

Data are shown separately for the four conditions. Line colors correspond to those in Figure 1 of the main text; Light blue: ‘sleep in phase with melatonin’, Dark blue: ‘sleep out of phase with …

https://doi.org/10.7554/eLife.20214.006
Figure 2—figure supplement 3
mRNA abundance profiles for genes belonging to the phase marker gene list published in Lech et al. (2016).

Data are shown separately for the four conditions. Line colors correspond to those in Figure 1 of the main text; Light blue: ‘sleep in phase with melatonin’, Dark blue: ‘sleep out of phase with …

https://doi.org/10.7554/eLife.20214.007
Figure 2—figure supplement 4
mRNA abundance profiles for genes belonging to the phase marker gene list published in Hughey et al. (2016).

Data are shown separately for the four conditions. Line colors correspond to those in Figure 1 of the main text; Light blue: ‘sleep in phase with melatonin’, Dark blue: ‘sleep out of phase with …

https://doi.org/10.7554/eLife.20214.008
Figure 3 with 5 supplements
Performance of one-sample models derived from the training set when used to predict circadian phase in the validation set.

(a, d and g) Predicted circadian phase of a blood sample vs. the observed melatonin phase for each sample in the validation set for the one-sample molecular timetable model (a), the one-sample …

https://doi.org/10.7554/eLife.20214.009
Figure 3—source data 1

Source data file for generating panels in Figure 3.

https://doi.org/10.7554/eLife.20214.010
Figure 3—figure supplement 1
Performance of one-sample molecular timetable models when used to predict the circadian phase of samples in the validation set.

Line colors correspond to those in Figure 1 of the main text; Light blue: ‘sleep in phase with melatonin’, Dark blue: ‘sleep out of phase with melatonin’, Light green: ‘total sleep deprivation, no …

https://doi.org/10.7554/eLife.20214.011
Figure 3—figure supplement 2
Parameter selection for constructing a Zeitzeiger-based model.

To build a Zeitzeiger predictor, given a training data set, there are two parameters that need to be optimized: ‘sumabsv’, which controls the number of features to be used, and ‘nSPC’, which …

https://doi.org/10.7554/eLife.20214.012
Figure 3—figure supplement 3
Phase profiles of genes in the phase marker lists for the molecular timetable and Zeitzeiger models.

Grey lines indicate the circadian phase of a gene’s maxima when Z-scored, where the number of grey lines equals the number of genes (features) used to construct the model, either the molecular …

https://doi.org/10.7554/eLife.20214.013
Figure 3—figure supplement 4
Selection of number of abundance features and latent factors and pseudo time-course of latent factor scores for Partial Least Squares Regression (PLSR)-based models.

(a) Leave-one-participant-out cross-validation performance of one-sample PLSR models applied to the training set when using different combinations of n mRNA abundance features and T latent factors. (b) …

https://doi.org/10.7554/eLife.20214.014
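
The leave-one-participant-out scheme named in the caption of Figure 3—figure supplement 4 can be illustrated with a minimal sketch. It assumes only that each sample carries a participant identifier; the function name and the example identifiers are hypothetical, and the sketch shows the splitting step alone, not the PLSR fitting or the authors' implementation.

```python
from collections import defaultdict


def leave_one_participant_out(participant_ids):
    """Yield (held_out_participant, train_indices, test_indices).

    All samples from one participant are held out together, so no
    participant contributes to both the training and the test fold.
    """
    by_participant = defaultdict(list)
    for index, pid in enumerate(participant_ids):
        by_participant[pid].append(index)

    for held_out, test_indices in by_participant.items():
        train_indices = [
            i
            for pid, indices in by_participant.items()
            if pid != held_out
            for i in indices
        ]
        yield held_out, train_indices, test_indices


# Hypothetical example: five samples from three participants.
example_ids = ["p1", "p1", "p2", "p3", "p3"]
example_splits = list(leave_one_participant_out(example_ids))
```

Grouping by participant rather than by sample keeps repeated samples from the same person out of the training fold, which is the point of this form of cross-validation.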
Figure 3—figure supplement 5
Comparison of the number of samples used as input vs. accuracy for each circadian phase prediction method.

Difference in model performance, as measured by the proportion of predictions vs. cumulative error, when using a given number of samples (one sample, two consecutive samples, three consecutive …

https://doi.org/10.7554/eLife.20214.015
Figure 4
Performance of two-sample differential mRNA abundance-based models when used to predict the circadian phase in the validation set.

(a and d) Predicted circadian phase of a blood sample vs. the observed circadian melatonin phase for each sample in the validation set for the two-sample differential molecular timetable model (a) …

https://doi.org/10.7554/eLife.20214.017
Figure 4—source data 1

Source data file for generating panels in Figure 4.

https://doi.org/10.7554/eLife.20214.018
Figure 5 with 1 supplement
Circos plot for visual comparison of lists of biomarkers forming the generated circadian phase prediction models.

Circular tracks from outside in: (1) Name of model being compared (Molecular timetable, Zeitzeiger, PLSR one-sample, PLSR two-sample differential), where all models are constructed from the same …

https://doi.org/10.7554/eLife.20214.019
Figure 5—figure supplement 1
A glucocorticoid-driven network links many of the top-ranked PLSR genes.

Twelve of the top-25 ranked genes in the PLSR one-sample and PLSR two-sample differential models (indicated by a red asterisk) are linked in a network that is driven by glucocorticoid signaling. …

https://doi.org/10.7554/eLife.20214.020

Tables

Table 1

Performance of trained models when used to predict the circadian phase of samples in the validation set. NS indicates not significant.

https://doi.org/10.7554/eLife.20214.016
| Model | Average error (minutes) | Standard deviation of error (hours:minutes) | Circadian variation of error (P-value of ANOVA) | Proportion of samples with ≤2 hr error | R² of predicted vs observed phase |
|---|---|---|---|---|---|
| Genes from (Lech et al., 2016) - one sample | 15 | 5:32 | NS | 28% | 0.28 |
| Genes from (Hughey et al., 2016) - one sample | −2 | 5:23 | <0.01 | 30% | 0.32 |
| Timetable - one sample | 9 | 4:38 | <0.01 | 40% | 0.49 |
| Zeitzeiger - one sample | −0.4 | 4:44 | NS | 36% | 0.47 |
| Partial Least Squares Regression - one sample | −18 | 3:17 | NS | 54% | 0.74 |
| Genes from (Lech et al., 2016) - two samples | 20 | 4:05 | 0.05 | 35% | 0.60 |
| Genes from (Hughey et al., 2016) - two samples | −0.65 | 3:58 | 0.03 | 41% | 0.63 |
| Timetable - two samples | 11 | 3:38 | <0.01 | 43% | 0.69 |
| Zeitzeiger - two samples | −2 | 3:36 | NS | 47% | 0.69 |
| Partial Least Squares Regression - two samples | −16 | 2:39 | NS | 62% | 0.83 |
| Genes from (Lech et al., 2016) - three samples | 24 | 3:21 | 0.05 | 45% | 0.73 |
| Genes from (Hughey et al., 2016) - three samples | 8 | 3:19 | NS | 47% | 0.74 |
| Timetable - three samples | −3 | 2:46 | <0.01 | 51% | 0.82 |
| Zeitzeiger - three samples | 4 | 3:03 | NS | 49% | 0.78 |
| Partial Least Squares Regression - three samples | −11 | 2:15 | NS | 71% | 0.88 |
| Timetable - Differential two samples | −33 | 2:28 | NS | 71% | 0.78 |
| Partial Least Squares Regression - Differential two samples | −18 | 1:41 | NS | 82% | 0.90 |
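
The error summaries reported in Table 1 (average error, standard deviation of error, and proportion of samples with ≤2 hr error) can be illustrated with a minimal sketch. It assumes that the error is the signed difference between predicted and observed phase, wrapped onto a 24 hr clock, and that the sample standard deviation is used. Both are assumptions made for illustration, as are the function names and example values; the exact computation is not stated in this section.

```python
import math


def wrapped_error_hours(predicted, observed, period=24.0):
    """Signed error (predicted - observed) wrapped into [-period/2, period/2)."""
    half = period / 2.0
    return (predicted - observed + half) % period - half


def summarize_errors(predicted, observed, tolerance_hours=2.0):
    """Return mean error (minutes), SD of error (hours) and the
    proportion of samples whose absolute error is within the tolerance."""
    errors = [wrapped_error_hours(p, o) for p, o in zip(predicted, observed)]
    n = len(errors)
    mean_error = sum(errors) / n
    # Sample standard deviation (n - 1 denominator); an assumption.
    sd = math.sqrt(sum((e - mean_error) ** 2 for e in errors) / (n - 1))
    within = sum(abs(e) <= tolerance_hours for e in errors) / n
    return {
        "mean_error_minutes": mean_error * 60.0,
        "sd_hours": sd,
        "proportion_within": within,
    }


# Hypothetical phases in hours; errors are +2, -2, 0 and +5 hr.
example_summary = summarize_errors(
    predicted=[1.0, 23.0, 12.0, 6.0],
    observed=[23.0, 1.0, 12.0, 1.0],
)
```

Wrapping matters because a prediction of 01:00 against an observation of 23:00 is a 2 hr error, not a 22 hr error.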

Additional files

Supplementary file 1

Comparison of phase marker lists.

(A) Correlation r values and relative rank of correlation value for genes in phase marker lists. Maximum correlation for a gene is based on the maximum correlation between the temporal profile of a feature targeting that gene and a cosine wave. Temporal profiles were constructed independently for each condition and across all conditions. Rank of a gene is based on the distribution of maximum r values for a specific condition. Columns in the file: (A) Probe name; (B) Gene Symbol (or probe name if no gene is assigned); (C) Binary values identifying a gene as present (1) or absent (0) in the list of genes forming the molecular timetable model generated here; (D) Binary values identifying a gene as present (1) or absent (0) in the list of genes forming the model of Lech et al. (2016); (E) Binary values identifying a gene as present (1) or absent (0) in the list of genes forming the model of Hughey et al. (2016); (F) Binary values identifying a gene as present (1) or absent (0) in the list of genes forming the Zeitzeiger model generated here; (G) The maximum correlation r value of a gene across all four conditions used in this study; (H) The maximum correlation r value of a gene in the condition ‘sleep in phase with melatonin’; (I) The maximum correlation r value of a gene in the condition ‘sleep out of phase with melatonin’; (J) The maximum correlation r value of a gene in the condition ‘total sleep deprivation, no prior sleep debt’; (K) The maximum correlation r value of a gene in the condition ‘total sleep deprivation, prior sleep debt’; (L), (M), (N), (O) and (P) provide the ranking of the correlation r value in the corresponding condition(s) of columns (G), (H), (I), (J) and (K), respectively.
(B) Comparison of gene lists derived from different phase marker models and/or analyses. The file includes genes identified in at least one of the gene lists discussed in this work (as indicated by the key within the file). A value of 1 indicates presence in the list; a value of 0 indicates absence.
(C) Features (probes) and corresponding gene symbols for the one-sample PLSR model.
(D) Features (probes) and corresponding gene symbols for the two-sample differential PLSR model.

https://doi.org/10.7554/eLife.20214.021
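
The "maximum correlation between the temporal profile of a feature and a cosine wave" described for Supplementary file 1 can be illustrated with a minimal sketch. It assumes a 24 hr period, a Pearson correlation, and a search over candidate phases in 1 hr steps; these choices, the function names and the example profile are assumptions for illustration and are not taken from the source.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def max_cosine_correlation(times_h, abundance, period=24.0, phase_step_h=1.0):
    """Return (best_r, best_phase_h): the highest correlation between the
    profile and a cosine wave, over candidate peak phases in [0, period)."""
    best_r, best_phase = -2.0, 0.0
    phase = 0.0
    while phase < period:
        cosine = [math.cos(2.0 * math.pi * (t - phase) / period) for t in times_h]
        r = pearson_r(abundance, cosine)
        if r > best_r:
            best_r, best_phase = r, phase
        phase += phase_step_h
    return best_r, best_phase


# Hypothetical profile sampled every 3 hr, peaking 6 hr after time zero.
example_times = [0.0, 3.0, 6.0, 9.0, 12.0, 15.0, 18.0, 21.0]
example_profile = [5.0 + 2.0 * math.cos(2.0 * math.pi * (t - 6.0) / 24.0)
                   for t in example_times]
example_r, example_phase = max_cosine_correlation(example_times, example_profile)
```

Ranking genes by the resulting maximum r within a condition would then give the rank columns described for the file.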
Supplementary file 2

Results table for functional enrichment analysis of feature lists and latent factors for both the one-sample and two-sample differential PLSR-based models.

Outputs of functional enrichment analysis performed with the WebGestalt tool.

https://doi.org/10.7554/eLife.20214.022
Supplementary file 3

Demographic information for the participants within the training and validation data sets.

https://doi.org/10.7554/eLife.20214.023
