Computational and Systems Biology

CausalXtract: a flexible pipeline to extract causal effects from live-cell time-lapse imaging data

Franck Simon
Maria Colomba Comes
Tiziana Tocci
Louise Dupuis
Vincent Cabeli
Nikita Lagrange
Arianna Mencattini
Maria Carla Parrini
Eugenio Martinelli
Hervé Isambert

CNRS UMR168, Institut Curie, Université PSL, Sorbonne Université, Paris, France
Department of Electronic Engineering, University of Rome Tor Vergata, Rome, Italy
INSERM U830, Institut Curie, Université PSL, Paris, France

https://doi.org/10.7554/eLife.95485.1

Open access

Figures and data

CausalXtract pipeline.
a, Live-cell tumor ecosystem reconstituted ex vivo¹ using the tumor-on-chip technology (Methods). b, CausalXtract’s live-cell image feature extraction module (CellHunter+). The tracking of cancer and immune cells and of their mutual interactions is illustrated in Supplementary Movies 1-3, in the absence or presence of cell division and apoptosis events. c, CausalXtract’s temporal causal discovery module (tMIIC) learns a temporal causal network from the features extracted in (b). See Methods for CausalXtract’s implementation details and theoretical foundations. A step-by-step notebook of the CausalXtract pipeline is provided with the source code.

Application of CausalXtract to time-lapse images of tumor ecosystems reconstituted ex vivo¹.
a, Summary causal network inferred by CausalXtract. The underlying time-unfolded causal network is shown in Supplementary Fig. 7. Red (resp. blue) edges correspond to positive (resp. negative) associations. Bidirected dashed edges represent the effect of unobserved (latent) common causes. Annotations on edges correspond to time delays in time steps (1 ts = 2 min). The inferred network is largely robust to variations in sampling rate (δτ) and maximum lag (τ) (Supplementary Fig. 8). Here δτ = 7 ts and τ = 84 ts are chosen automatically by CausalXtract (Supplementary Fig. 8b). b, The CAF presence subnetwork highlighting the direct causal effects of CAFs on cancer cells. In particular, CausalXtract uncovers that CAFs directly inhibit cancer cell apoptosis independently from treatment, which has not been reported so far. c, The treatment subnetwork highlighting the direct causal effects of treatment on cancer cells. In particular, CausalXtract uncovers that treatment increases cancer cell perimeter, which has not been reported either. d, The eccentricity-area subnetwork highlighting multiple direct and possibly antagonistic time-lagged effects, notably between cell division and eccentricity and between cell apoptosis and area, as discussed in the main text.

Benchmark assessment of CausalXtract’s causal discovery module (tMIIC) using generated time series datasets.
a, Example of a 15-node causal network used to generate benchmark time series datasets based on linear combinations of contributions (Supplementary Table 1). Examples of temporal causal networks reconstructed by tMIIC based on 100, 1,000 or 10,000 simulated time steps. b, Running times and scores (Precision, Recall, Fscore) averaged over 10 datasets and compared to PC and PCMCI+ methods using different kernels (GPDC, KNN, ParCorr); tMIIC is on par with PC and PCMCI+ scores using GPDC and KNN kernels but runs orders of magnitude faster. Only the ParCorr kernel matches tMIIC running speed, but with significantly lower scores at large sample size (see Methods).
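The Precision, Recall and Fscore values reported in panel b compare an inferred edge set with the edges of the generating network. The sketch below shows one common way to compute such scores; the exact scoring convention used for the benchmarks (for instance how orientations and lags are counted) is described in Methods and is not reproduced here. The function name `edge_scores` and the `(source, target, lag)` edge encoding are illustrative assumptions.

```python
def edge_scores(true_edges, inferred_edges):
    """Precision, recall and F-score of an inferred edge set against a reference.

    Edges are hashable tuples; here (source, target, lag) is an assumed encoding.
    """
    true_set, inferred_set = set(true_edges), set(inferred_edges)
    true_positives = len(true_set & inferred_set)
    precision = true_positives / len(inferred_set) if inferred_set else 0.0
    recall = true_positives / len(true_set) if true_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    fscore = 2 * precision * recall / (precision + recall)
    return precision, recall, fscore


# Hypothetical toy example: 4 true edges, 3 inferred edges, 2 of them correct.
truth = {("X1", "X2", 1), ("X2", "X3", 0), ("X3", "X3", 2), ("X1", "X4", 1)}
inferred = {("X1", "X2", 1), ("X2", "X3", 0), ("X4", "X1", 1)}
precision, recall, fscore = edge_scores(truth, inferred)
```

With this encoding, an edge inferred with the wrong orientation (such as `("X4", "X1", 1)` above) counts as both a false positive and a missed true edge, which is a stricter choice than scoring the undirected skeleton only.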

CausalXtract insensitivity to an overestimated maximum lag τ.
a, Example of a temporal causal network model with a maximum lag τ = 2. Corresponding temporal causal networks inferred by CausalXtract’s causal discovery module (tMIIC), from 1,000 time step stationary time series (Supplementary Table 1), while assuming different maximum lags τ = 2, 5 or 10. b, Running times and scores (Precision, Recall, Fscore) of tMIIC temporal causal network reconstructions for τ = 2, 5 or 10, averaged over ten stationary time series of 10 to 10⁵ time steps. Overestimating the maximum lag τ has little impact on the reconstructed networks, as long as the time series are stationary, as demonstrated in Supplementary Fig. 3.

CausalXtract sensitivity to non-stationary variables.
a, Example of a temporal causal network model (τ = 2) with a low frequency periodic input (T = 100) applied to X8 and a time-linear trend applied to X13. Corresponding temporal causal networks inferred by tMIIC from 1,000 time step time series (Supplementary Table 1) including non-stationary inputs to X8 and X13. Increasing the maximum lag from τ = 2 to τ = 5 or 10 leads to the appearance of multiple self-loops, which result from the non-stationary dynamics of X8 and X13, whilst the rest of the network remains largely unaffected. b, Running times and scores (Precision, Recall, Fscore ignoring X8 and X13 self-loops) of tMIIC causal network reconstructions for τ = 2, 5 or 10, averaged over ten time series of 10 to 10⁵ time steps.

Benchmark assessment of CausalXtract’s causal discovery module (tMIIC) using more complex time series datasets.
a, Example of a 15-node causal network used to generate more complex benchmark time series datasets based on non-linear combinations of contributions (Supplementary Table 2). Examples of temporal causal networks reconstructed by tMIIC based on 100, 1,000 or 10,000 simulated time steps. b, Running times and scores (Precision, Recall, Fscore) averaged over 10 datasets and compared to PC and PCMCI+ methods using different kernels (GPDC, KNN, ParCorr); tMIIC outperforms both PC and PCMCI+ in terms of Recall and Fscore, while running orders of magnitude faster, except for the ParCorr kernel, which, however, leads to significantly lower scores at large sample size.

Time-unfolded causal network framework and relation to Granger-Schreiber temporal causality.
a, A vanishing Transfer Entropy, i.e., T_{Y→X} = I(X_t; Y_{t′<t} | X_{t′<t}) = 0, implies i) the absence of a (dashed) edge between X_t and any Y_{t′}, with t′ < t, and ii) if X_t is adjacent to Y_t, the presence of temporal (2-variable + time) v-structures, X_t → Y_t ← Y_{t′}, for all Y_{t′} adjacent to Y_t, with t′ < t (Methods, Theorem 1). These results can be readily extended to include the presence of other observed variables, Z_{t′≤t}, by redefining Transfer Entropy as T_{Y→X|Z} = I(X_t; Y_{t′<t} | X_{t′<t}, Z_{t′≤t}), which discards contributions from indirect paths through the other observed variables, Z_{t′≤t}. By contrast, the presence of a temporal (2-variable + time) v-structure, X_t → Y_t ← Y_{t′}, does not imply a vanishing Transfer Entropy, as long as there remains an edge between any Y_{t″} and X_t. This implies that Granger-Schreiber temporal causality is in fact too restrictive and may overlook actual causal effects, which can be uncovered by graph-based causal discovery methods like CausalXtract’s causal discovery module (tMIIC). Hence, CausalXtract’s time-unfolded network framework, combining graph-based and information-based approaches, sheds light on the common foundations of the seemingly unrelated graph-based causality and Granger-Schreiber temporal causality, while clarifying their actual differences and limitations.
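To make the notion of Transfer Entropy more concrete, the sketch below estimates Schreiber's transfer entropy from a source series to a target series as the conditional mutual information I(target_t; source_{t-1} | target_{t-1}), using plain frequency counts on discrete data. It is an illustration only, under strong simplifying assumptions (one-step histories, discrete values, plug-in estimation); it is not the estimator used by tMIIC, and the function name `transfer_entropy` is hypothetical.

```python
import math
import random
from collections import Counter


def transfer_entropy(source, target):
    """Plug-in estimate (in bits) of I(target_t ; source_{t-1} | target_{t-1})."""
    # Each sample is (target_t, target_{t-1}, source_{t-1}).
    samples = list(zip(target[1:], target[:-1], source[:-1]))
    n = len(samples)
    joint = Counter(samples)
    target_pair = Counter((x, x_prev) for x, x_prev, _ in samples)
    past_pair = Counter((x_prev, y_prev) for _, x_prev, y_prev in samples)
    target_past = Counter(x_prev for _, x_prev, _ in samples)
    te = 0.0
    for (x, x_prev, y_prev), count in joint.items():
        # p(x, y | z) / (p(x | z) p(y | z)) written with raw counts.
        ratio = count * target_past[x_prev] / (
            target_pair[(x, x_prev)] * past_pair[(x_prev, y_prev)]
        )
        te += (count / n) * math.log2(ratio)
    return te


# Toy example: X copies Y with a one-step delay, so Y drives X but not vice versa.
rng = random.Random(0)
y = [rng.randint(0, 1) for _ in range(5000)]
x = [0] + y[:-1]
te_y_to_x = transfer_entropy(y, x)  # close to 1 bit
te_x_to_y = transfer_entropy(x, y)  # close to 0 bit
```

In this toy case both views agree. The caption's point concerns situations where a single pairwise quantity of this kind is insufficient and the graph structure over time-lagged variables carries additional information.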

Time series of cellular features extracted from the tumor ecosystems.
Example of time series of cellular features extracted by CausalXtract’s feature extraction module (CellHunter+) from the tumor ecosystems analyzed in this study, Fig. 1a. It includes two experimental control parameters (i.e. treatment and CAF presence) and 15 cellular features extracted every 2 minutes over a period of two days. Continuous features are highlighted for one trajectory (traj.18), while categorical features are shown for all trajectories.

Time-unfolded causal network inferred by CausalXtract.
a, Time-unfolded causal network assuming stationary dynamics of cellular ecosystems, which implies translational time invariance of the inferred causal network. b, Only edges involving at least one contemporaneous variable (i.e. at time t) need to be tested for conditional independence by tMIIC; the remaining edges are then duplicated at all previous time steps before assigning orientations when time-lagged latent variables are taken into account (Fig. 1c). Variables retaining multiple self-loops with different time delays correspond to non-stationary variables in Supplementary Fig. 6, in agreement with benchmarks from simulated data including non-stationary variables (Supplementary Fig. 3).
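Under the stationarity assumption, every time window of a trajectory can be treated as one sample of the same time-unfolded network. The sketch below illustrates this idea by turning a set of time series into rows of contemporaneous and lagged values. How tMIIC builds its lagged dataset internally is not specified in this caption, so the function name `unfold_time_series`, its parameters and the `(name, lag)` column keys are assumptions made for illustration.

```python
def unfold_time_series(series, n_lags, lag_step):
    """Build lagged samples from equally long time series.

    series:   dict mapping a variable name to its list of values
    n_lags:   number of lagged layers (ν), in addition to the lag-0 layer
    lag_step: lag increment between layers (δτ), in time steps
    Returns a list of rows; each row maps (name, lag) to a value.
    """
    length = len(next(iter(series.values())))
    max_lag = n_lags * lag_step
    rows = []
    for t in range(max_lag, length):
        row = {}
        for name, values in series.items():
            for layer in range(n_lags + 1):
                lag = layer * lag_step
                row[(name, lag)] = values[t - lag]
        rows.append(row)
    return rows


# Hypothetical toy input: two features observed over 10 time steps.
toy = {"area": list(range(10)), "eccentricity": [10 * v for v in range(10)]}
rows = unfold_time_series(toy, n_lags=2, lag_step=3)
```

Each row then has (ν + 1) columns per variable, so the number of nodes in the time-unfolded network grows linearly with the number of layers.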

Robustness of CausalXtract’s temporal causal networks to variations in sampling rate.
Summary causal networks inferred by CausalXtract using different sampling rates (δτ). a, δτ = 8 ts and τ = 80 ts, in time step units (1 ts = 2 min). b, δτ = 7 ts, and τ = 84 ts, as chosen automatically by CausalXtract based on the average relaxation time across the 15 monitored variables, τ_R = 40 ts, which defines a maximum lag τ = 2 τ_R = 80 ts. Given a total number of (time-lagged and -unlagged) nodes, chosen to be around 200 nodes for computational efficiency, it leads to 13 temporal layers (ν + 1 = 200/15 ≃ 13) and a lag increment δτ = τ/ν ≃ 7 ts. This summary causal network corresponds to Fig. 2a. c, δτ = 5 ts and τ = 60 ts, corresponding to τ = ν · δτ with ν + 1 = 13 temporal layers, as in (b).
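The arithmetic behind the automatic choice in panel b can be retraced as follows. This is a minimal sketch of the calculation as stated in the caption (τ = 2 τ_R, about 200 nodes in total, 15 variables); the function name `choose_lag_parameters` and the use of rounding to the nearest integer are assumptions, not CausalXtract's actual code.

```python
def choose_lag_parameters(relaxation_time, n_variables, node_budget=200):
    """Return (n_layers, delta_tau, max_lag), with lags in time steps."""
    n_layers = round(node_budget / n_variables)  # ν + 1 temporal layers
    nu = n_layers - 1
    target_max_lag = 2 * relaxation_time  # τ = 2 τ_R
    delta_tau = max(1, round(target_max_lag / nu))  # δτ = τ/ν, rounded
    max_lag = nu * delta_tau  # τ actually spanned by the ν lagged layers
    return n_layers, delta_tau, max_lag


# Values from the caption: τ_R = 40 ts and 15 monitored variables.
n_layers, delta_tau, max_lag = choose_lag_parameters(40, 15)
minutes_per_step = 2  # 1 ts = 2 min
delta_tau_minutes = delta_tau * minutes_per_step
max_lag_minutes = max_lag * minutes_per_step
```

With these inputs the sketch gives 13 layers, δτ = 7 ts and τ = 84 ts, matching the values quoted in panel b; the maximum lag grows from 80 ts to 84 ts because δτ is rounded up from 80/12 ≈ 6.7 ts.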

15 nodes model.

15 nodes model with combinations.
