Data-driven causal analysis of observational biological time series

  1. Alex Eric Yuan (corresponding author)
  2. Wenying Shou (corresponding author)
  1. Molecular and Cellular Biology PhD program, University of Washington, United States
  2. Basic Sciences Division, Fred Hutchinson Cancer Research Center, United States
  3. Centre for Life’s Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, United Kingdom
25 figures, 1 video, 1 table and 1 additional file

Figures

Causality.

(A) Definition. If a perturbation in X can result in a change in future values of Y, then X causes Y. This definition does not require that any perturbation in X will perturb Y. For example, …

Two independent temporal processes can appear significantly correlated when compared to an inappropriate null model.

(A) Densities of independent yeast and bacteria cultures growing exponentially are correlated. (B, C) Correlation between time series of two independent island populations can appear significant if …
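
The effect in panel A can be reproduced with a small simulation. The sketch below is illustrative only: the growth rates, noise levels, and the choice of a time-shuffling permutation test as the "inappropriate null model" are assumptions, not values or methods taken from the article. Correlation is computed on log densities so that each growth curve is a noisy straight line.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(20)  # hypothetical sampling times

# Two cultures that grow exponentially and independently of each other
# (rates and noise levels are arbitrary illustrative choices).
yeast = 1e3 * np.exp(0.30 * t) * rng.lognormal(0.0, 0.1, t.size)
bacteria = 1e4 * np.exp(0.45 * t) * rng.lognormal(0.0, 0.1, t.size)

log_yeast, log_bacteria = np.log(yeast), np.log(bacteria)
r = np.corrcoef(log_yeast, log_bacteria)[0, 1]

# Inappropriate null: shuffling time points treats them as exchangeable (IID),
# which ignores the trend shared by any two growing cultures.
n_perm = 2000
null_r = np.array([
    np.corrcoef(rng.permutation(log_yeast), log_bacteria)[0, 1]
    for _ in range(n_perm)
])
p_naive = (1 + np.sum(np.abs(null_r) >= abs(r))) / (1 + n_perm)

print(f"correlation = {r:.3f}, naive permutation p-value = {p_naive:.4f}")
```

The naive test reports a very small p-value even though the two cultures never interact, because the shuffled series no longer share the time trend that both originals have.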

Causality versus Granger causality.

(A) Granger causality is designed to reveal direct causes, not indirect causes. Although X causes Z, X does not Granger-cause Z because with the history of Y available, the history of X no …
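
As a rough illustration of the idea that X Granger-causes Y when the history of X improves prediction of Y beyond what the history of Y already provides, the sketch below compares two least-squares autoregressions with an F statistic. It is a minimal bivariate version under assumed linear dynamics. The function names `lagged` and `granger_f`, the lag order, and the simulated coefficients are all hypothetical, and the conditioning on a third variable discussed in the caption is not included.

```python
import numpy as np

def lagged(series, lags):
    """Columns are series(t-1), ..., series(t-lags) for t = lags, ..., n-1."""
    n = len(series)
    return np.column_stack([series[lags - k : n - k] for k in range(1, lags + 1)])

def granger_f(x, y, lags=1):
    """F statistic for 'x Granger-causes y' in a linear autoregressive model."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    target = y[lags:]
    ones = np.ones((target.size, 1))
    restricted = np.hstack([ones, lagged(y, lags)])    # history of y only
    full = np.hstack([restricted, lagged(x, lags)])    # plus history of x

    def rss(design):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        return resid @ resid

    rss_r, rss_f = rss(restricted), rss(full)
    dof = target.size - full.shape[1]
    return ((rss_r - rss_f) / lags) / (rss_f / dof)

# Simulated example in which x drives y but y does not drive x.
rng = np.random.default_rng(0)
n = 500
x, y = np.zeros(n), np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + rng.normal()
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.normal()

f_xy = granger_f(x, y)  # expected to be large
f_yx = granger_f(y, x)  # expected to be small
print(f_xy, f_yx)
```

A large F statistic means that adding the history of x substantially reduces the prediction error for y. Turning it into a p-value requires the F distribution with `lags` and `dof` degrees of freedom.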

SSR causal methods look for a continuous map from the delay space of a causee to the causer, and this approach becomes more difficult in the presence of noise.

(A) A toy 5-variable linear system. (B) Time series. The delay vector [Z(t),Z(t-τ),Z(t-2τ)] (shown as three red dots) can be represented as a single point in the 3-dimensional Z delay space (C, red dot). (C) We then …
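
The delay vector construction in panel B can be sketched as follows. The helper name `delay_vectors` and its default arguments are hypothetical; each row of the returned array is one point [Z(t), Z(t-τ), Z(t-2τ)] in the delay space.

```python
import numpy as np

def delay_vectors(z, dim=3, tau=1):
    """Return rows [z(t), z(t - tau), ..., z(t - (dim - 1) * tau)] for every valid t."""
    z = np.asarray(z, dtype=float)
    start = (dim - 1) * tau  # first t with a complete history
    if start >= z.size:
        raise ValueError("time series too short for this dim and tau")
    return np.column_stack(
        [z[start - k * tau : z.size - k * tau] for k in range(dim)]
    )

# Toy input: the integers 0..9 with tau = 2, so the first delay vector is [4, 2, 0].
points = delay_vectors(np.arange(10), dim=3, tau=2)
print(points.shape)
print(points[0], points[-1])
```

Each row can then be plotted as a single point in a three-dimensional delay space, as in panel C.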

Failure modes associated with SSR-based causal discovery.

Top row: Nonreverting continuous dynamics may lead SSR to infer causality where there is none. This example consists of two time series: a wavy linear increase and a parabolic trajectory. Although …

Illustration of the convergent cross mapping (CCM) procedure for testing whether X causes Y.

(A) Computing cross map skill. Consider the point X(T) denoted by the red dot (“actual X(T)” in ①), which we want to predict from Y delay vectors. We first look up the contemporaneous Y delay vector [Y(T),Y(T-1),Y(T-2)]
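
A simplified version of the cross map skill computation might look like the sketch below. It assumes a common simplex-style scheme (the `dim + 1` nearest neighbors in the Y delay space, with exponentially decaying weights), which may differ in detail from the procedure in the figure. The function name `cross_map_skill` is hypothetical, and the convergence check (skill increasing with the amount of data) is omitted.

```python
import numpy as np

def cross_map_skill(x, y, dim=3, tau=1):
    """Correlation between x(T) and its estimate from neighbors in the y delay space."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    start = (dim - 1) * tau
    # Rows are the delay vectors [y(T), y(T - tau), ..., y(T - (dim - 1) * tau)].
    emb = np.column_stack([y[start - k * tau : y.size - k * tau] for k in range(dim)])
    target = x[start:]  # x values contemporaneous with each delay vector

    dist = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a point may not be its own neighbor

    k = dim + 1
    nn = np.argsort(dist, axis=1)[:, :k]           # indices of the k nearest neighbors
    d = np.take_along_axis(dist, nn, axis=1)
    w = np.exp(-d / np.maximum(d[:, :1], 1e-12))   # closer neighbors get larger weights
    w /= w.sum(axis=1, keepdims=True)

    predicted = (w * target[nn]).sum(axis=1)
    return np.corrcoef(predicted, target)[0, 1]

# Toy check: x is a lagged copy of y, so x(T) is a function of the y delay vector.
t = np.arange(300)
y = np.sin(0.3 * t)
x = np.roll(y, 1)  # x(T) = y(T - 1) for T >= 1
skill = cross_map_skill(x, y)

# Control: independent noise cannot be recovered from the y delay space.
noise = np.random.default_rng(0).normal(size=t.size)
skill_noise = cross_map_skill(noise, y)
print(skill, skill_noise)
```

In the convention used here, a high skill for estimating X from the Y delay space is read as evidence that X causes Y.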

Performance of Granger causality and convergent cross mapping in a toy model with noise.

(A) The effect of a time point’s process noise, but not its measurement noise, propagates to subsequent time points. (B) We simulated a two-species community. The process noise terms ϵp1(t) and ϵp2(t), as …

Appendix 1—figure 1
Joint distribution, marginal distributions, and dependence between two random variables.

(A) A scatterplot of data associated with random variables Xi and Xj represents a ‘joint distribution’ (black). Histograms for data associated with Xi and for data associated with Xj represent …

Appendix 1—figure 2
Examples of random variables that are identically distributed or not identically distributed, and independent or not independent.

In the top row, Xi and Xj are identically distributed (projections of the scatter plot on both axes would have the same shape, as in Appendix 1—figure 1A). Note that in the top row of the rightmost …

Appendix 1—figure 3
Measurements taken from a mixed population may still be IID, as long as sampling is independent and random.

Consider a study in which physical activity is measured from a mixed population of low-activity male mice and high-activity female mice. For simplicity, suppose that the study uses only two mice. To …

Appendix 1—figure 4
When multiple identical and independent trials are available, the significance of a correlation between time series within a trial can be assessed by comparing it to correlations between trials.

(A) A thought experiment in which yeast and bacteria are grown in the same test tube, but follow independent dynamics. We imagine collecting growth curves from 25 independent replicate trials. (B) …
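
One way to sketch the between-trial comparison is shown below. The function name `between_trial_pvalue`, the use of the absolute correlation as the test statistic, and the simulated data are assumptions made for illustration. The between-trial correlations pair X from one trial with Y from a different trial, so they can only reflect shared dynamics, not interaction within a trial.

```python
import numpy as np

def between_trial_pvalue(x_trials, y_trials, trial=0):
    """Compare the within-trial correlation to all between-trial correlations."""
    x_trials = np.asarray(x_trials, float)
    y_trials = np.asarray(y_trials, float)
    n = x_trials.shape[0]
    # corr[i, j] = correlation between X from trial i and Y from trial j.
    corr = np.corrcoef(x_trials, y_trials)[:n, n:]
    within = corr[trial, trial]
    between = corr[~np.eye(n, dtype=bool)]  # all pairs with i != j
    p = (1 + np.sum(np.abs(between) >= abs(within))) / (1 + between.size)
    return within, p

rng = np.random.default_rng(0)
n_trials, n_time = 25, 50
t = np.arange(n_time)

def growth_curves():
    """Hypothetical logistic growth curves with trial-specific rate and midpoint."""
    rate = rng.uniform(0.2, 0.4, (n_trials, 1))
    mid = rng.uniform(20, 30, (n_trials, 1))
    return 1.0 / (1.0 + np.exp(-rate * (t - mid)))

# Independent case: yeast and bacteria grow without affecting each other.
r_indep, p_indep = between_trial_pvalue(growth_curves(), growth_curves())

# Dependent case (for contrast): Y tracks X within each trial.
x_dep = rng.normal(size=(n_trials, n_time))
y_dep = x_dep + 0.1 * rng.normal(size=(n_trials, n_time))
r_dep, p_dep = between_trial_pvalue(x_dep, y_dep)
print(r_indep, p_indep, r_dep, p_dep)
```

In the independent case the within-trial correlation is drawn from the same distribution as the between-trial correlations, so it has no systematic tendency to stand out. In the dependent case it exceeds every between-trial value.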

Appendix 2—figure 1
Violation of faithfulness condition due to precise cancellation of causal effects.

Although X has a direct causal effect on W, we assume here that this is exactly canceled out by an opposing influence via the indirect path X→Y→W. Thus, although the Markov condition does not …

Appendix 2—figure 2
Probability distributions alone can specify causal structure to varying degrees of resolution.

Consider a system of three and only three random variables X, Y and Z. Between each pair of variables, there are three possible unidirectional relationships: causation in one direction, causation …

Appendix 2—figure 3
Selection bias creates the false impression of dependence.

(A) DAG depicting the assumed causal relationship between math scores, writing scores, and admission to a certain college. (B) Math and writing scores in a fictitious student population are …
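
The effect can be illustrated with simulated scores. In the sketch below, the score distributions and the admission rule (a threshold on the sum of the two scores) are hypothetical choices, not taken from the figure. The scores are generated independently, and dependence appears only after conditioning on admission.

```python
import numpy as np

rng = np.random.default_rng(0)
n_students = 10_000

# Independent scores in the full (fictitious) student population.
math_score = rng.normal(500, 100, n_students)
writing_score = rng.normal(500, 100, n_students)

# Hypothetical admission rule: admit students whose combined score is high enough.
admitted = (math_score + writing_score) > 1100

r_all = np.corrcoef(math_score, writing_score)[0, 1]
r_admitted = np.corrcoef(math_score[admitted], writing_score[admitted])[0, 1]

print(f"all students: r = {r_all:.3f}")
print(f"admitted students only: r = {r_admitted:.3f}")
```

Among admitted students, a low math score implies a high writing score (otherwise the student would not have been admitted), which produces a negative correlation that is absent from the full population.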

Appendix 2—figure 4
Causal discovery approaches designed for directed acyclic graphs (DAGs) can be applied to time series from systems with feedback.

(i) Consider a mutualistic system where A and B represent the population sizes of two species that mutually facilitate each other’s growth. (ii) When the role of time is ignored, the causal graph …

Appendix 3—figure 1
Intuition for random phase surrogate data methods.

Random phase surrogate data methods generate Ysurr by representing Y as a sum of sine waves (1), randomly shifting the phases of the component sine waves (2), and summing up the shifted sine waves (3).
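
The three steps in the caption map directly onto a discrete Fourier transform. The sketch below is one minimal way to implement them with NumPy. The function name `random_phase_surrogate` is hypothetical, and refinements such as amplitude adjustment or endpoint matching are not included.

```python
import numpy as np

def random_phase_surrogate(y, rng):
    """Return a surrogate with the same amplitude spectrum as y but random phases."""
    y = np.asarray(y, dtype=float)
    n = y.size
    spectrum = np.fft.rfft(y)                           # (1) sine-wave components
    phases = rng.uniform(0.0, 2.0 * np.pi, spectrum.size)
    phases[0] = 0.0                                     # keep the mean unchanged
    if n % 2 == 0:
        phases[-1] = 0.0                                # Nyquist component must stay real
    shifted = spectrum * np.exp(1j * phases)            # (2) shift each phase at random
    return np.fft.irfft(shifted, n=n)                   # (3) sum the shifted waves

rng = np.random.default_rng(1)
y = np.cumsum(rng.normal(size=64))  # toy autocorrelated series
y_surr = random_phase_surrogate(y, rng)

same_spectrum = np.allclose(np.abs(np.fft.rfft(y)), np.abs(np.fft.rfft(y_surr)))
print(same_spectrum, np.isclose(y.mean(), y_surr.mean()))
```

Because only the phases change, the surrogate keeps the amplitude spectrum (and hence the autocorrelation) of Y while losing its particular alignment in time with any other series.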

Appendix 3—figure 2
Example of a covariance-stationary process.

(A) Ten replicate runs of the stochastic process described in Equation 4 with parameter choices a=0.6 and c=10. The noise term ϵt is a normal random variable with mean of zero and standard deviation of …

Appendix 3—figure 3
Whether a stochastic process is stationary depends on its entire ensemble of time series.

The top panel shows IID standard normal noise. The middle and bottom panels both show sinusoidal curves. Although an individual time series from the middle panel looks similar to that from the …

Appendix 3—figure 4
A many-variable deterministic system can be approximated as a stochastic system.

The position of a particle in a system of particles bouncing in a one-dimensional box is plotted over time. In each simulation, particles with radius 0 bounce around in a box with walls of infinite …

Appendix 4—figure 1
Continuity, smoothness, and the difficulty of evaluating the continuity or smoothness of a function with finite or noisy data.

(A) y is not a function of x because a single x value can correspond to more than one y value. Here, when we shade x with the value of y, we randomly choose the upper or the lower y value, …

Appendix 4—figure 2
Illustration of Takens’ theorem.

(A) We consider a 3-variable toy system in which X and Y causally influence Z, but Z does not influence X or Y. (B) Time series of the three variables. (C) We can represent the time series as …

Appendix 4—figure 3
Nonreverting continuous dynamics.

(A) Definition of nonreverting continuous dynamics. We call X ‘nonreverting’ if the delay space of X maps continuously to t (time). We call Y ‘continuous’ if Y(t) is a continuous function of t. If X is nonreverting and Y is continuous, then we say that the pair of time series (X,Y) has nonreverting continuous dynamics. (B) Examples. In each row, X and Y are causally independent. Leftmost column: Dynamics. Each red or blue dot (visible upon zooming in on some of the charts) represents a single time point. Second column: Looking for a continuous map from the delay vectors of X (the X delay space) to t, i.e. nonreverting X dynamics. Third column: Looking for a continuous map from t to Y by assessing whether Y values at nearby times are similar. Since the data occur at discrete times, the standard definition of continuity does not naturally apply, so ‘continuous Y’ really means ‘highly autocorrelated’. Fourth and final column: the presence or absence of ‘nonreverting continuous dynamics’. With nonreverting continuous dynamics, there is a continuous map from the X delay space to Y, and thus Y appears to cause X even though X and Y are causally independent.

Appendix 4—figure 4
Nonreverting continuous dynamics impair the ability of CCM to correctly infer causality.

Each row represents a system where Y does or does not causally influence X (Column 1). Column 2: Governing equations. Column 3: Checking for nonreverting continuous dynamics as in Appendix …

Appendix 4—figure 5
Comparison of visual continuity testing, cross map skill testing, and prediction lag testing in causal discovery.

Each row represents a two-variable or three-variable system where Y does or does not causally influence X. The leftmost column shows the equations and ground truth causality. The second column …

Appendix 4—figure 6
Parameters within a ‘pathological’ regime almost always cause the prediction lag test to erroneously reject a true causal link.

(A) System equations. For both ‘friendly’ and ‘pathological’ regimes, initial conditions X(1) and Y(1) were independently and randomly drawn from the uniform distribution between 0.01 and 0.99 (“Unif(0.01,0.99)”), …

Videos

Video 1
Video walkthrough.

Tables

Table 1
A comparison of three statistical causal discovery approaches.
Method | What does it mean if the method detects a link? | Implied causal statement | What are some possible failure modes?
Correlation | X and Y are statistically dependent. | X causes Y, Y causes X, or Z causes both. | Surrogate null model may make incorrect assumptions about the data-generating process.
Granger causality | The history of X contains unique information that is useful for predicting the future of Y. | X directly causes Y. | Hidden common cause; infrequent sampling; deterministic system (no process noise); excessive process noise; measurement noise
State space reconstruction | The delay space of X can be used to estimate Y. | Y causes X. | Nonreverting continuous dynamics; synchrony; integer multiple periods; pathological symmetry; measurement or process noise
