Quantifying how post-transcriptional noise and gene copy number variation bias transcriptional parameter inference from mRNA distributions

  1. Xiaoming Fu
  2. Heta P Patel
  3. Stefano Coppola
  4. Libin Xu
  5. Zhixing Cao (corresponding author)
  6. Tineke L Lenstra (corresponding author)
  7. Ramon Grima (corresponding author)
  1. Key Laboratory of Smart Manufacturing in Energy Chemical Process, East China University of Science and Technology, China
  2. School of Biological Sciences, University of Edinburgh, United Kingdom
  3. Center for Advanced Systems Understanding, Helmholtz-Zentrum Dresden-Rossendorf, Germany
  4. The Netherlands Cancer Institute, Oncode Institute, Division of Gene Regulation, Netherlands
15 figures, 16 tables and 1 additional file

Figures

Overview of the inference of transcriptional parameters from synthetic data.

(a) A schematic illustration of the telegraph model. (b) A schematic of the delay telegraph model. The double horizontal line for nascent mRNA removal indicates that this is a delayed reaction. (c) Illustration showing promoter switching between two states, Pol II binding to the promoter in the ON state and subsequently undergoing productive elongation. Note that the length of the nascent mRNA tail increases until Pol II terminates at the end of the gene. As Pol II travels through the 14 repeats of the PP7 loops, the intensity of the mRNA increases due to fluorescent probe binding to the mRNA; the intensity saturates as Pol II enters the GAL10 gene body. (d) Illustration of the algorithms to generate synthetic data and to perform inference from mature and nascent mRNA data. The green boxes are only applicable for the inference of the fluorescence signal intensity of nascent mRNAs; note that in nascent mRNA inference, the "RNA number" in the flow chart should be interpreted as the number of bound Pol II molecules on the gene. A large maximum iteration number Nmax (10^4) is chosen as the termination condition for the optimizer.
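As a rough illustration of panel (a), the telegraph model can be simulated with the Gillespie algorithm. The sketch below assumes the standard reaction scheme (OFF to ON at rate σon, ON to OFF at rate σoff, initiation at rate ρ in the ON state, first-order degradation at rate d); the function name, the initial condition and the parameter values are illustrative and are not taken from the article. It does not cover the delayed removal step of the delay telegraph model in panel (b).

```python
import random

def simulate_telegraph(sigma_on, sigma_off, rho, d, t_end, seed=0):
    """Gillespie (SSA) run of the telegraph model; returns the time-averaged mRNA count."""
    rng = random.Random(seed)
    t, gene_on, m = 0.0, False, 0   # start OFF with no mRNA (arbitrary choice)
    area = 0.0                      # running integral of m(t) dt
    while True:
        a_switch = sigma_off if gene_on else sigma_on   # promoter switching
        a_prod = rho if gene_on else 0.0                # initiation, only when ON
        a_deg = d * m                                   # first-order degradation
        a_total = a_switch + a_prod + a_deg
        dt = rng.expovariate(a_total)                   # waiting time to the next event
        if t + dt >= t_end:
            area += m * (t_end - t)
            break
        area += m * dt
        t += dt
        u = rng.random() * a_total                      # choose which reaction fires
        if u < a_switch:
            gene_on = not gene_on
        elif u < a_switch + a_prod:
            m += 1
        elif m > 0:
            m -= 1
    return area / t_end

# Illustrative parameters (hypothetical, in units of the degradation rate).
avg = simulate_telegraph(sigma_on=1.0, sigma_off=1.0, rho=20.0, d=1.0, t_end=2000.0, seed=1)
expected = 20.0 * 1.0 / (1.0 * (1.0 + 1.0))  # rho * sigma_on / (d * (sigma_on + sigma_off))
print(avg, expected)
```

For a long run, the time-averaged count should lie close to the steady-state mean ρσon/(d(σon+σoff)) of the telegraph model, which gives a quick sanity check on the simulation.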

Accuracy of the inferred kinetic parameters from synthetic mature and nascent mRNA data using the telegraph and delay telegraph model, respectively.

(a) 3D scatter plot showing the ratio of the mean relative error from nascent mRNA data (using the delay telegraph model, MREdelay) to the mean relative error from mature mRNA data (using the telegraph model, MREtele) for 789 independent parameter sets sampled on a grid. Red data points indicate parameter sets with lower relative errors for mature data compared to nascent data; blue data points indicate parameter sets with lower relative errors for nascent data compared to mature data. (b) Same data as (a) but shown as a function of the fraction of ON time, fON. For 61% of the parameters, the inference accuracy is higher when using nascent mRNA data. (c) Sampling from the same parameter space, we then add log-normal distributed noise (size 5%) to the initiation rate ρ (see text for details) to mimic external noise due to post-transcriptional processing that is only present in mature mRNA. Log10 of the ratio of the median relative error (MRE) using perturbed mature mRNA data to the MRE using nascent mRNA data is shown as a function of the true fraction of ON time, fON. For 64% of the parameters, the inference accuracy is higher when using nascent mRNA data. (d) The median relative error of each transcriptional parameter as a function of the fraction of ON time, using synthetic nascent mRNA data, synthetic mature mRNA data and synthetic mature mRNA data with external noise. Inference from nascent data is generally more accurate than inference from mature mRNA data.
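The error measures in this caption can be sketched as follows. This is a minimal illustration that assumes the relative error of a parameter is |estimate − truth| / truth and that a parameter set is summarised by the mean over its parameters; the exact definition used in the article is given in its main text and may differ. The function names and the numbers in the example are hypothetical.

```python
import math
from statistics import mean

def relative_errors(true_params, est_params):
    """Per-parameter relative error |estimate - truth| / truth."""
    return [abs(e - t) / t for t, e in zip(true_params, est_params)]

def log10_error_ratio(true_params, est_a, est_b):
    """log10 of (mean relative error of estimate A) / (mean relative error of estimate B)."""
    mre_a = mean(relative_errors(true_params, est_a))
    mre_b = mean(relative_errors(true_params, est_b))
    return math.log10(mre_a / mre_b)

# Hypothetical example: estimate B is ten times closer to the truth than estimate A.
truth = [2.0, 4.0]
est_a = [3.0, 6.0]   # relative errors 0.5 and 0.5
est_b = [2.1, 4.2]   # relative errors about 0.05 and 0.05
print(log10_error_ratio(truth, est_a, est_b))
```

A positive value means the second estimate is the more accurate one, which is how the sign of the plotted log-ratio would be read under this assumed definition.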

Inference results using four mature mRNA data sets with sample sizes of 2333, 6366, 4550 and 3163 cells, respectively.

(a) Representative smFISH image of a yeast cell with PP7-GAL10 RNAs labeled with Cy3 and the nucleus labeled with DAPI. (b) The DAPI and Cy3 signals were used to determine the nuclear and cellular mask, respectively. Detected and fitted spots are indicated in green. Mature RNA count distribution (pink) for segmentation method 1 with a best fit obtained from the telegraph model (gray curve). Scale bar is 5 μm. (c-d) The DAPI and Cy3 signals were used to determine the nuclear and cellular mask using a second independent segmentation tool (segmentation 2). Mature RNA count distribution (gray and cyan) with/without counting the transcription site (TS) for segmentation method 2 with a best fit obtained from the telegraph model (gray curves). (e) Bar graphs of inferred transcriptional parameters (merged mature RNA data) from fitting the distributions of the two segmentation methods (‘seg1’ and ‘seg2’) as well as the distribution of mature RNAs only (‘seg2 -TS’, which indicates the exclusion of one spot in each cell that represents the transcription site). The burst size was computed as ρ/σoff and the fraction of ON time as σon/(σon+σoff). Error bars indicate the standard deviation computed over the four datasets. (f) Distribution of the integrated DAPI intensity for each cell. The cyan line represents a bimodal Gaussian fit, with highlighted regions indicating the intensity-based classification of G1 and G2 cells. Distributions of the mature RNA count for all cells (merged) and cell-cycle-classified cells (G1 cells and G2 cells). (g) Tables and bar graphs of inferred parameters for merged and cell-cycle-specific data. Note that the transcriptional parameters σon, σoff, ρ are normalised by the degradation rate and hence dimensionless. For the cell-cycle-specific data, parameters were inferred per gene copy.
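The two derived quantities defined in panel (e) follow directly from the inferred rates. The short sketch below (function names are illustrative) applies the caption's formulas to the segmentation 2 estimates for Set 1 listed in Appendix 3, table 1 (σoff=3.27, σon=3.36, ρ=20.62).

```python
def burst_size(rho, sigma_off):
    """Burst size as defined in the caption: rho / sigma_off."""
    return rho / sigma_off

def fraction_on(sigma_on, sigma_off):
    """Fraction of ON time as defined in the caption: sigma_on / (sigma_on + sigma_off)."""
    return sigma_on / (sigma_on + sigma_off)

# Segmentation 2, Set 1 (values from the appendix table, normalised by the degradation rate).
sigma_off, sigma_on, rho = 3.27, 3.36, 20.62
print(round(burst_size(rho, sigma_off), 2))       # 6.31
print(round(fraction_on(sigma_on, sigma_off), 2)) # 0.51
```

Both rounded values agree with the burst size and fON reported for that row of the table.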

Inference from the normalized nascent mRNA distributions for merged and cell-cycle specific data.

(a) Normalized nascent mRNA distributions of merged cell-cycle data were obtained by normalizing the signal intensity of the transcription site (defined as the brightest spot in the cell) by the median signal intensity of the cytoplasmic spots (shown in orange, with a zoom-in depicted in the inset). In all four datasets, approximately 90% of the detected cytoplasmic spots fell in the range 0.5× median – 1.5× median (grey bar graph). The black line in the normalized distribution on the right represents the best fit with the delay telegraph model. (b) Nascent RNA distributions for cell-cycle-specific data. Black lines represent best fits with the delay telegraph model. (c) Bar graphs comparing the transcriptional parameters, burst size, fraction of ON time and Fano factor for cell-cycle-specific and merged data. Error bars indicate the standard deviation of the four datasets. (d) Normalized ACF plots of cell-cycle-specific and merged data. The ACF plots are generated by stochastic simulations using estimated parameters from merged and cell-cycle-specific nascent mRNA data for each of the four data sets; these were compared with the ACF measured directly using live-cell data in Donovan et al., 2019 (green line). (e) The sum of squared ACF residuals of merged and cell-cycle-specific data from each dataset (this is the sum of squared deviations between the measured and estimated normalised ACF, where the sum was calculated over all time points).
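The comparison in panels (d) and (e) can be sketched as below. This is an illustrative implementation only: it assumes the ACF is computed from a mean-subtracted trace and normalised by its lag-0 value, and that both ACFs are sampled at the same time points; the normalisation used in the article may differ. The function names and the toy trace are hypothetical.

```python
def normalized_acf(trace, max_lag):
    """Autocorrelation of a mean-subtracted trace, normalised so that lag 0 equals 1."""
    n = len(trace)
    mu = sum(trace) / n
    centered = [x - mu for x in trace]

    def cov(lag):
        # average product of the trace with itself shifted by `lag` samples
        return sum(centered[i] * centered[i + lag] for i in range(n - lag)) / (n - lag)

    c0 = cov(0)
    return [cov(lag) / c0 for lag in range(max_lag + 1)]

def sum_squared_residuals(acf_measured, acf_predicted):
    """Sum over time points of the squared deviation between two ACFs."""
    return sum((a - b) ** 2 for a, b in zip(acf_measured, acf_predicted))

# Toy example: a perfectly alternating trace is anti-correlated at lag 1.
acf = normalized_acf([1, 2, 1, 2, 1, 2], max_lag=1)
print(acf)                                        # [1.0, -1.0]
print(sum_squared_residuals(acf, [1.0, -0.5]))    # 0.25
```

A smaller sum of squared residuals indicates that the ACF predicted from the inferred parameters lies closer to the measured one.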

Inference results using the rejection method.

(a) Nascent RNA distributions for cell-cycle-specific and merged data. Black lines represent best fits with the delay telegraph model using the rejection method (only the distributions for dataset #1 with k=4 are shown). (b) Estimated transcriptional parameters, burst size, fraction of ON time and Fano factor (mean values and standard deviation error bars of the four datasets) obtained by rejecting the first k bins with k=1,2,3,4. The estimated parameters are listed in Appendix 4—table 3. (c) Normalized autocorrelation function (ACF) predicted by stochastic simulations using the estimated parameters (for k=4) for each of the four data sets versus that measured directly using live-cell data (green line). (d) The sum of squared residuals of the ACF of cell-cycle-specific data from each dataset without/with rejection when k=4.
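The curation step behind the rejection method, discarding the first k bins and renormalising, can be written in a few lines. The sketch below operates on a plain list of bin probabilities; the function name and the example histogram are hypothetical, and how the model distribution is treated during fitting is described in the article rather than here.

```python
def reject_first_bins(probs, k):
    """Drop the first k bins of a binned distribution and renormalise the rest to sum to 1."""
    kept = probs[k:]
    total = sum(kept)
    return [p / total for p in kept]

# Hypothetical four-bin histogram; rejecting the first bin (k=1).
print(reject_first_bins([0.5, 0.25, 0.125, 0.125], 1))  # [0.5, 0.25, 0.25]
```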

Inference results using the fusion method.

(a) Estimated burst size, fraction of ON time and Fano factor (mean values and standard deviation error bars of the four datasets) obtained by combining the first k bins with k=1,2,3,4. (b) Corresponding fitted distributions for G1 (top row) and G2 (bottom row) using the delay telegraph model with the fusion method (only the distributions for k=4 are shown). The magenta bar represents the combined bin 0–3 when k=4. (c) Normalised autocorrelation function (ACF) predicted by stochastic simulations using the estimated parameters (for k=4) for each of the four data sets versus that measured directly using live-cell data (green line). The sum of squared residuals of the ACF plots using cell-cycle-specific data without/with the fusion method when k=4. (d) Estimated parameters of cell-cycle-specific data and merged data of nascent mRNAs with the fusion method with k=4 (fusing bins 0–3). These correspond to the fitted distributions in (b). The elongation time τ is fixed to 0.785 min. See the inferred parameters in Appendix 4—table 4 for all other values of k.
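The fusion step, combining the first k bins into a single bin, can be sketched in the same way. The function name and the example histogram are hypothetical; the sketch only shows the histogram operation, not the full inference.

```python
def fuse_first_bins(probs, k):
    """Merge the first k bins of a binned distribution into one bin; later bins are unchanged."""
    return [sum(probs[:k])] + list(probs[k:])

# Hypothetical four-bin histogram.
print(fuse_first_bins([0.5, 0.25, 0.125, 0.125], 2))  # [0.75, 0.125, 0.125]
print(fuse_first_bins([0.5, 0.25, 0.125, 0.125], 1))  # unchanged: [0.5, 0.25, 0.125, 0.125]
```

With k=1 the distribution is left unchanged, which is consistent with the k=1 rows of the fusion-method table reproducing the estimates from the non-curated G1 and G2 data. With k=4 the bins 0 to 3 are merged, as in the magenta bar of panel (b).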

Appendix 1—figure 1
Comparing inference accuracy using synthetic nascent mRNA data and synthetic mature mRNA data with 10% external noise (log-normal distributed noise is added to the initiation rate ρ to mimic external noise due to post-transcriptional processing that is only present in mature mRNA).

(a) Ratio of the mean relative errors in the two types of data as a function of the true fraction of ON time, fON. For ≈91% (719/789) of the parameters, the inference accuracy is higher when using nascent mRNA data. (b) The median relative error of each transcriptional parameter as a function of the fraction of ON time using synthetic mature mRNA.

Appendix 1—figure 2
Inference with the telegraph model and delay telegraph model for six parameter sets.

(a) Estimates using the inference algorithm with the telegraph model (with no external noise) for six parameter sets. For both the ground truth and the estimated parameters, we fix the degradation rate d=1 min^-1. (b) Estimates using the inference algorithm with the delay telegraph model for six parameter sets. For both the ground truth and the estimated parameters, we fix the delay τ=0.5 min. (c) Distributions from synthetic mature mRNA data fitted using the telegraph model. (d) Distributions from synthetic nascent mRNA data fitted using the delay telegraph model.

Appendix 2—figure 1
Distributions and mean errors of the transcriptional parameter inference.

(a-d) Comparison of the stochastic properties of the delay telegraph model and the telegraph model. (a) Distributions of the nascent mRNA predicted by the delay telegraph model and the telegraph model for various values of τ. We fix the parameters (σoff,σon,ρ)=(0.6,0.03,1), which implies that a change of τ leads to a change in x=τ(σon+σoff) at constant y=(ρ/σoff)(σoff/(σoff+σon))^2. (b) Corresponding relative error R between the variances of the two models calculated as a function of τ using Equation (12). (c) Distributions of the nascent mRNA predicted by the delay telegraph model and the telegraph model for various values of ρ. We fix the parameters (σoff,σon,τ)=(0.6,0.03,100), which implies that a change of ρ leads to a change in y at constant x. (d) Corresponding relative error R between the variances of the two models calculated as a function of ρ using Equation (12). (e-f) Inference of transcriptional parameters using as input synthetic fluorescent signal data generated by SSA simulations of transcription and fluorescent tagging for 10^4 cells (see Methods section of the main text). (e) Mean relative error and normalised fitness score (fitness/number of samples) plot for 20 sets of numerical experiments. The inference is done in two different ways, using either the telegraph model (green) or the delay telegraph model (blue). (f) Distributions of total fluorescent intensity from synthetic data (red dots) fit using the inference algorithm with the telegraph model (dashed green) or the delay telegraph model (blue) for 6 different parameter sets. The insets show the relative errors in the estimates of the burst frequency (α) and of the burst size (β) calculated using Equation (13). Note that while both models provide a very good fit to the distribution from synthetic data, parameter estimation is far more accurate using the delay telegraph model. This is also reflected in (e), where we see low fitness scores for both models but a high mean relative error for estimates based on the telegraph model. The true and estimated parameters are shown in Appendix 2—table 1.
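The two combinations x and y used in panels (a-d) can be evaluated directly from the caption's definitions. The sketch below (the function name is illustrative) uses the fixed parameters quoted in the caption and shows that y does not depend on τ, while x does not depend on ρ.

```python
def x_and_y(sigma_off, sigma_on, rho, tau):
    """x = tau*(sigma_on + sigma_off), y = (rho/sigma_off)*(sigma_off/(sigma_off + sigma_on))**2."""
    x = tau * (sigma_on + sigma_off)
    y = (rho / sigma_off) * (sigma_off / (sigma_off + sigma_on)) ** 2
    return x, y

# Panels (a-b): (sigma_off, sigma_on, rho) = (0.6, 0.03, 1); varying tau changes x only.
for tau in (1.0, 10.0, 100.0):
    print(tau, x_and_y(0.6, 0.03, 1.0, tau))

# Panels (c-d): (sigma_off, sigma_on, tau) = (0.6, 0.03, 100); varying rho changes y only.
for rho in (1.0, 10.0):
    print(rho, x_and_y(0.6, 0.03, rho, 100.0))
```

For the parameters of panel (a), y is approximately 1.51 for every τ, and for the parameters of panel (c), x is approximately 63 for every ρ.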

Appendix 3—figure 1
Merged and cell-cycle specific mature mRNA count distributions.

(a) Merged mature mRNA count distribution (purple) under segmentation method 1 and with/without counting the transcription site (TS) under segmentation method 2 with a best fit obtained from the telegraph model (magenta curves). (b) Cell-cycle-specific mature mRNA count distribution (purple) under segmentation method 2 with a best fit obtained from the telegraph model (magenta curves).

Appendix 4—figure 1
Inference results using merged and cell-cycle specific nascent data.

Experimental distributions (purple) are fit using the delay telegraph model (magenta curves).

Appendix 4—figure 2
Inference results using cell-cycle specific data curated with the rejection method (only the distributions for k=4 are shown).

Corresponding fitted distributions (purple) for G1 (top row) and G2 (bottom row) using the delay telegraph model (magenta curves).

Appendix 5—figure 1
Autocorrelation functions of 10^4 simulated GAL10 intensity traces (solid blue lines).

The transcriptional parameters for G1 and G2 cells in the four sets of experimental data were obtained using the fusion method (see Figure 6d of the main manuscript). A linear fit (dashed black line) was subtracted to correct the ACFs for switching from G1 to G2 (solid cyan lines).

Appendix 6—figure 1
Left panel: Numerical instabilities due to the calculation of higher-order derivatives in the exact solution appear when the arithmetic precision is not very high (85).

Right panel: These instabilities disappear when the precision increases to 300. The exact solution with such high precision agrees well with the FSP method using the double-precision floating-point (Float64) type. The parameters are σoff=1.71, σon=5.82, ρ=53.74 and τ=0.56.

Appendix 7—figure 1
Blue curves (Method 1) show the fluorescent intensity distributions of the four experimental data sets after the classification of cells into G1 and G2 phases using the Fried/Baisch model.

Magenta curves (Method 2) show the same but using a bimodal Gaussian, as described in the main text.

Tables

Appendix 1—table 1
Mean and standard deviation of the parameters estimated from 10 independent synthetic datasets, generated for each parameter set in Appendix 1—figure 2.
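Parameter | Mean σoff | Mean σon | Mean ρ | Mean burst size | Mean fON | Std σoff | Std σon | Std ρ | Std burst size | Std fON
Set 1 | 31.85 | 2.69 | 15.68 | 0.53 | 0.10 | 19.07 | 0.22 | 7.68 | 0.07 | 0.05
Set 2 | 92.67 | 30.47 | 141.42 | 1.56 | 0.25 | 23.76 | 1.51 | 21.61 | 0.14 | 0.04
Set 3 | 9.94 | 8.01 | 63.35 | 6.39 | 0.45 | 0.72 | 0.19 | 1.94 | 0.25 | 0.01
Set 4 | 2.30 | 2.16 | 18.76 | 8.17 | 0.48 | 0.12 | 0.05 | 0.31 | 0.31 | 0.01
Set 5 | 2.26 | 5.80 | 12.57 | 5.59 | 0.72 | 0.19 | 0.30 | 0.11 | 0.41 | 0.01
Set 6 | 1.22 | 10.01 | 24.97 | 20.53 | 0.89 | 0.10 | 0.43 | 0.16 | 1.65 | 0.01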
Appendix 1—table 2
95% confidence intervals of the 12 parameter sets (shown in Appendix 1—figure 2a-b).
Parameter | Telegraph CI σoff | Telegraph CI σon | Telegraph CI ρ | Delay CI σoff | Delay CI σon | Delay CI ρ
Set 1 | (6.76, 300.00) | (3.53, 8.49) | (5.59, 107.67) | (13.80, 300.00) | (1.85, 2.69) | (9.94, 160.51)
Set 2 | (17.22, 190.38) | (14.56, 23.92) | (61.14, 250.51) | (103.80, 268.35) | (30.00, 35.38) | (161.54, 300.00)
Set 3 | (0.54, 0.59) | (0.41, 0.43) | (46.08, 47.15) | (6.83, 8.55) | (6.42, 7.04) | (58.28, 63.18)
Set 4 | (0.31, 0.35) | (0.46, 0.49) | (15.50, 15.91) | (1.64, 1.99) | (1.65, 1.85) | (17.92, 18.77)
Set 5 | (0.49, 0.85) | (5.82, 7.62) | (49.62, 50.92) | (1.56, 2.59) | (4.58, 5.85) | (12.08, 12.97)
Set 6 | (0.08, 0.16) | (2.46, 3.54) | (26.10, 26.54) | (0.73, 1.24) | (7.53, 9.75) | (24.32, 25.14)
Appendix 1—table 3
Table showing the relative error against profile likelihood error of 12 parameter sets (shown in Appendix 1—figure 2a and b).

See text for details.

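Abbreviations in the column headers: RE, relative error; PLE, profile likelihood error.

Parameter | Telegraph RE σoff | Telegraph RE σon | Telegraph RE ρ | Telegraph PLE σoff | Telegraph PLE σon | Telegraph PLE ρ | Delay RE σoff | Delay RE σon | Delay RE ρ | Delay PLE σoff | Delay PLE σon | Delay PLE ρ
Set 1 | 0.87 | 0.27 | 0.77 | 18.82 | 0.88 | 11.07 | 0.82 | 0.11 | 0.58 | 1.75 | 0.34 | 1.82
Set 2 | 0.04 | 0.04 | 1.44E-03 | 4.72 | 0.52 | 2.22 | 0.41 | 0.11 | 0.27 | 0.81 | 0.16 | 0.55
Set 3 | 0.08 | 0.01 | 0.01 | 0.09 | 0.06 | 0.02 | 0.20 | 0.15 | 0.03 | 0.22 | 0.09 | 0.08
Set 4 | 0.01 | 0.01 | 1.11E-03 | 0.14 | 0.07 | 0.03 | 0.22 | 0.18 | 0.02 | 0.20 | 0.11 | 0.05
Set 5 | 0.22 | 0.13 | 4.97E-03 | 0.55 | 0.27 | 0.03 | 0.30 | 0.23 | 0.03 | 0.64 | 0.28 | 0.07
Set 6 | 0.29 | 0.24 | 0.01 | 0.74 | 0.37 | 0.02 | 0.20 | 0.12 | 0.01 | 0.55 | 0.25 | 0.03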
Appendix 1—table 4
Effects of random perturbations on inference of parameters from mature mRNA data (using the telegraph model).
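Parameter | True σoff | True σon | True ρ | True burst size | True fON | Unperturbed σoff | Unperturbed σon | Unperturbed ρ | Unperturbed burst size | Unperturbed fON | –1/0/+1 perturbation σoff | –1/0/+1 perturbation σon | –1/0/+1 perturbation ρ | –1/0/+1 perturbation burst size | –1/0/+1 perturbation fON
Set 1 | 120.38 | 7.70 | 40.85 | 0.34 | 0.06 | 15.58 | 5.63 | 9.22 | 0.59 | 0.27 | 0.59 | 1.17 | 3.70 | 6.28 | 0.66
Set 2 | 35.48 | 17.39 | 85.05 | 2.40 | 0.33 | 36.73 | 18.03 | 85.17 | 2.32 | 0.33 | 24.13 | 15.79 | 70.89 | 2.94 | 0.40
Set 3 | 0.52 | 0.42 | 46.21 | 88.36 | 0.44 | 0.57 | 0.42 | 46.62 | 82.47 | 0.43 | 0.61 | 0.46 | 47.17 | 76.74 | 0.43
Set 4 | 0.32 | 0.47 | 15.71 | 48.65 | 0.59 | 0.33 | 0.47 | 15.70 | 48.01 | 0.59 | 0.39 | 0.54 | 16.09 | 41.17 | 0.58
Set 5 | 0.53 | 5.85 | 49.95 | 94.72 | 0.92 | 0.64 | 6.62 | 50.20 | 77.99 | 0.91 | 0.68 | 6.72 | 50.35 | 74.48 | 0.91
Set 6 | 0.16 | 3.89 | 26.45 | 169.81 | 0.96 | 0.11 | 2.94 | 26.30 | 238.76 | 0.96 | 0.13 | 3.06 | 26.42 | 203.20 | 0.96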
Appendix 1—table 5
Inference using the delay telegraph model from synthetic nascent fluorescent data, with and without perturbation by log-normal noise.
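Parameter | True σoff | True σon | True ρ | True burst size | True fON | Delay σoff | Delay σon | Delay ρ | Delay burst size | Delay fON | Perturbation σoff | Perturbation σon | Perturbation ρ | Perturbation burst size | Perturbation fON
Set 1 | 34.49 | 2.80 | 16.65 | 0.48 | 0.08 | 62.74 | 3.10 | 26.38 | 0.42 | 0.05 | 50.08 | 1.52 | 31.71 | 0.63 | 0.03
Set 2 | 135.68 | 32.98 | 179.74 | 1.32 | 0.20 | 80.42 | 29.44 | 131.35 | 1.63 | 0.27 | 233.96 | 30.74 | 300.00 | 1.28 | 0.12
Set 3 | 9.86 | 7.96 | 63.16 | 6.41 | 0.45 | 7.85 | 6.78 | 61.20 | 7.79 | 0.46 | 8.85 | 6.63 | 65.49 | 7.40 | 0.43
Set 4 | 2.26 | 2.14 | 18.64 | 8.26 | 0.49 | 1.77 | 1.76 | 18.28 | 10.36 | 0.50 | 1.90 | 1.64 | 18.65 | 9.83 | 0.46
Set 5 | 2.27 | 5.80 | 12.55 | 5.53 | 0.72 | 1.60 | 4.48 | 12.21 | 7.63 | 0.74 | 2.59 | 4.79 | 13.27 | 5.13 | 0.65
Set 6 | 1.18 | 9.94 | 24.92 | 21.15 | 0.89 | 0.94 | 8.74 | 24.74 | 26.37 | 0.90 | 1.84 | 10.15 | 26.10 | 14.18 | 0.85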
Appendix 2—table 1
Estimates using the inference algorithm with delay telegraph and telegraph models for the six parameter sets in Appendix 2—figure 1f.
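Parameter | True σoff | True σon | True ρ | True burst size | True fON | Delay σoff | Delay σon | Delay ρ | Delay burst size | Delay fON | Telegraph σoff | Telegraph σon | Telegraph ρ | Telegraph burst size | Telegraph fON
Set 1 | 1.05 | 8.20 | 57.99 | 55.09 | 0.89 | 0.94 | 7.19 | 57.92 | 61.87 | 0.88 | 0.60 | 4.10 | 59.51 | 99.42 | 0.87
Set 2 | 1.27 | 3.14 | 58.17 | 45.69 | 0.71 | 1.13 | 2.91 | 58.01 | 51.16 | 0.72 | 0.45 | 1.22 | 59.43 | 133.07 | 0.73
Set 3 | 2.27 | 5.80 | 12.55 | 5.53 | 0.72 | 1.60 | 4.48 | 12.21 | 7.63 | 0.74 | 0.59 | 1.71 | 12.09 | 20.61 | 0.74
Set 4 | 1.18 | 9.94 | 24.92 | 21.15 | 0.89 | 0.94 | 8.74 | 24.74 | 26.37 | 0.90 | 0.46 | 4.12 | 24.92 | 54.00 | 0.90
Set 5 | 2.26 | 2.14 | 18.64 | 8.26 | 0.49 | 1.77 | 1.76 | 18.28 | 10.36 | 0.50 | 0.54 | 0.58 | 17.60 | 32.59 | 0.52
Set 6 | 1.38 | 4.77 | 21.74 | 15.79 | 0.78 | 1.08 | 4.00 | 21.50 | 19.94 | 0.79 | 0.38 | 1.49 | 21.31 | 56.44 | 0.80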
Appendix 3—table 1
Inferred transcriptional parameters using merged mature mRNA data from segmentation 1, segmentation 2 and segmentation 2 without transcriptional site (TS).
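Parameter | seg 1 σoff | seg 1 σon | seg 1 ρ | seg 1 burst size | seg 1 fON | seg 2 σoff | seg 2 σon | seg 2 ρ | seg 2 burst size | seg 2 fON | seg 2 without TS σoff | seg 2 without TS σon | seg 2 without TS ρ | seg 2 without TS burst size | seg 2 without TS fON
Set 1 | 21.18 | 4.73 | 47.15 | 2.23 | 0.18 | 3.27 | 3.36 | 20.62 | 6.31 | 0.51 | 2.92 | 2.45 | 21.05 | 7.22 | 0.46
Set 2 | 19.53 | 5.44 | 40.11 | 2.05 | 0.22 | 3.22 | 3.87 | 19.17 | 5.95 | 0.55 | 2.46 | 2.68 | 18.35 | 7.46 | 0.52
Set 3 | 16.53 | 4.77 | 39.48 | 2.39 | 0.22 | 3.67 | 4.05 | 20.09 | 5.47 | 0.52 | 3.00 | 2.87 | 19.73 | 6.57 | 0.49
Set 4 | 28.72 | 5.16 | 50.00 | 1.74 | 0.15 | 5.79 | 4.58 | 23.04 | 3.98 | 0.44 | 5.27 | 3.27 | 24.33 | 4.61 | 0.38
Mean | 21.49 | 5.02 | 44.19 | 2.10 | 0.19 | 3.99 | 3.97 | 20.73 | 5.43 | 0.50 | 3.41 | 2.82 | 20.87 | 6.46 | 0.46
Std | 5.19 | 0.34 | 5.21 | 0.28 | 0.03 | 1.06 | 0.43 | 1.43 | 0.89 | 0.04 | 1.26 | 0.35 | 2.56 | 1.29 | 0.06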
Appendix 3—table 2
95% confidence intervals for the estimates from experimental mature mRNA data using segmentation 2.
mature | Set | σoff | σon | ρ | CI for σoff | CI for σon | CI for ρ
merged | Set 1 | 3.27 | 3.36 | 20.62 | (2.08, 6.08) | (2.81, 4.16) | (18.03, 25.97)
merged | Set 2 | 3.22 | 3.87 | 19.17 | (2.38, 4.69) | (3.41, 4.44) | (17.51, 21.62)
merged | Set 3 | 3.67 | 4.05 | 20.09 | (2.54, 5.92) | (3.51, 4.79) | (18.03, 23.67)
merged | Set 4 | 5.79 | 4.58 | 23.04 | (3.33, 13.33) | (3.79, 5.85) | (19.36, 33.51)
G1 | Set 1 | 0.69 | 2.76 | 10.71 | (0.27, 2.35) | (1.73, 4.96) | (9.77, 12.85)
G1 | Set 2 | 0.60 | 2.80 | 10.27 | (0.35, 1.24) | (2.13, 3.92) | (9.77, 11.15)
G1 | Set 3 | 0.88 | 3.41 | 10.33 | (0.38, 2.84) | (2.23, 5.77) | (9.58, 12.35)
G1 | Set 4 | 2.46 | 5.21 | 11.12 | (0.46, 300.00) | (2.68, 18.86) | (8.96, 216.67)
G2 | Set 1 | 0.42 | 0.99 | 9.45 | (0.22, 0.91) | (0.65, 1.48) | (8.64, 10.86)
G2 | Set 2 | 0.14 | 1.01 | 7.92 | (0.05, 0.36) | (0.56, 1.72) | (7.61, 8.46)
G2 | Set 3 | 0.18 | 1.24 | 7.91 | (0.05, 0.72) | (0.57, 2.59) | (7.49, 8.85)
G2 | Set 4 | 0.42 | 1.68 | 8.19 | (0.12, 2.07) | (0.76, 3.50) | (7.41, 10.28)
Appendix 3—table 3
Inferred transcriptional rate (normalized) per gene copy for the G2 cell cycle phase under the assumption that the two gene states are perfectly synchronized.
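mature | Set | σoff | σon | ρ | burst size | fON
G2 sync | Set 1 | 0.62 | 2.17 | 8.45 | 13.69 | 0.78
G2 sync | Set 2 | 0.37 | 3.26 | 7.72 | 21.05 | 0.90
G2 sync | Set 3 | 0.67 | 4.42 | 7.92 | 11.85 | 0.87
G2 sync | Set 4 | 0.55 | 3.45 | 7.55 | 13.62 | 0.86
G2 sync | Mean | 0.55 | 3.32 | 7.91 | 15.05 | 0.85
G2 sync | Std | 0.13 | 0.92 | 0.39 | 4.09 | 0.05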
Appendix 4—table 1
Estimated parameters from the non-curated distribution of the normalized intensity of the brightest nuclear spot (nascent mRNA data) constructed by merging all data or else specific to the cell cycle phases G1 and G2.

The elongation time τ is estimated to be 0.785 min, based on measurements of the elongation speed.

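nascent | Set | σoff | σon | ρ | burst size | fON
merged | Set 1 | 5.24 | 5.07 | 77.46 | 14.78 | 0.49
merged | Set 2 | 5.58 | 5.13 | 82.11 | 14.71 | 0.48
merged | Set 3 | 5.11 | 5.17 | 76.09 | 14.90 | 0.50
merged | Set 4 | 5.96 | 6.13 | 73.23 | 12.30 | 0.51
merged | Mean | 5.47 | 5.37 | 77.22 | 14.17 | 0.50
merged | Std | 0.38 | 0.50 | 3.70 | 1.25 | 0.01
G1 | Set 1 | 1.11 | 3.76 | 37.83 | 34.10 | 0.77
G1 | Set 2 | 1.53 | 3.94 | 41.33 | 27.06 | 0.72
G1 | Set 3 | 0.95 | 3.23 | 36.79 | 38.56 | 0.77
G1 | Set 4 | 1.28 | 3.76 | 36.04 | 28.09 | 0.75
G1 | Mean | 1.22 | 3.67 | 38.00 | 31.95 | 0.75
G1 | Std | 0.25 | 0.31 | 2.34 | 5.39 | 0.02
G2 | Set 1 | 0.74 | 1.69 | 35.00 | 47.30 | 0.70
G2 | Set 2 | 0.82 | 2.18 | 36.30 | 44.37 | 0.73
G2 | Set 3 | 0.91 | 2.19 | 34.54 | 37.90 | 0.71
G2 | Set 4 | 1.08 | 2.61 | 33.27 | 30.76 | 0.71
G2 | Mean | 0.89 | 2.17 | 34.78 | 40.08 | 0.71
G2 | Std | 0.15 | 0.38 | 1.25 | 7.35 | 0.01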
Appendix 4—table 2
95% confidence intervals for non-curated data estimated using the profile likelihood method.
nascent | Set | σoff | σon | ρ | burst size | fON | CI for σoff | CI for σon | CI for ρ
merged | Set 1 | 5.24 | 5.07 | 77.46 | 14.78 | 0.49 | (4.42, 6.23) | (4.68, 5.45) | (73.48, 82.92)
merged | Set 2 | 5.58 | 5.13 | 82.11 | 14.71 | 0.48 | (5.05, 6.15) | (4.97, 5.37) | (79.38, 85.53)
merged | Set 3 | 5.11 | 5.17 | 76.09 | 14.90 | 0.50 | (4.53, 5.71) | (4.98, 5.38) | (73.49, 79.64)
merged | Set 4 | 5.96 | 6.13 | 73.23 | 12.30 | 0.51 | (5.19, 7.22) | (5.87, 6.57) | (69.34, 78.37)
G1 | Set 1 | 1.11 | 3.76 | 37.83 | 34.10 | 0.77 | (0.77, 1.66) | (3.04, 4.63) | (36.26, 39.96)
G1 | Set 2 | 1.53 | 3.94 | 41.33 | 27.06 | 0.72 | (1.29, 1.88) | (3.64, 4.40) | (40.18, 42.85)
G1 | Set 3 | 0.95 | 3.23 | 36.79 | 38.56 | 0.77 | (0.77, 1.17) | (2.86, 3.60) | (36.05, 37.69)
G1 | Set 4 | 1.28 | 3.76 | 36.04 | 28.09 | 0.75 | (0.99, 1.73) | (3.18, 4.34) | (34.68, 37.75)
G2 | Set 1 | 0.74 | 1.69 | 35.00 | 47.30 | 0.70 | (0.54, 1.02) | (1.36, 2.08) | (33.64, 36.72)
G2 | Set 2 | 0.82 | 2.18 | 36.30 | 44.37 | 0.73 | (0.66, 1.07) | (1.85, 2.52) | (35.35, 37.40)
G2 | Set 3 | 0.91 | 2.19 | 34.54 | 37.90 | 0.71 | (0.70, 1.23) | (1.85, 2.61) | (33.38, 36.05)
G2 | Set 4 | 1.08 | 2.61 | 33.27 | 30.76 | 0.71 | (0.75, 1.71) | (2.00, 3.41) | (31.91, 35.40)
Appendix 4—table 3
(Rejection method) Estimated parameters by discarding the first k signal bins of the experimental distribution of the signal intensity (and renormalizing afterwards).

Inference is done for each of the four data sets. The elongation time is fixed to τ = 0.785 min.

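k | Phase | Set | σoff | σon | ρ | burst size | fON
1 | G1 | Set 1 | 1.06 | 4.33 | 36.62 | 34.63 | 0.80
1 | G1 | Set 2 | 1.45 | 4.42 | 39.84 | 27.51 | 0.75
1 | G1 | Set 3 | 1.04 | 3.84 | 36.66 | 35.19 | 0.79
1 | G1 | Set 4 | 1.27 | 4.12 | 35.41 | 27.78 | 0.76
1 | G1 | Mean | 1.21 | 4.18 | 37.13 | 31.28 | 0.78
1 | G1 | Std | 0.19 | 0.26 | 1.90 | 4.20 | 0.02
1 | G2 | Set 1 | 0.84 | 2.10 | 34.54 | 41.20 | 0.71
1 | G2 | Set 2 | 0.97 | 2.94 | 35.49 | 36.73 | 0.75
1 | G2 | Set 3 | 0.94 | 2.53 | 33.68 | 35.99 | 0.73
1 | G2 | Set 4 | 1.20 | 3.07 | 32.81 | 27.28 | 0.72
1 | G2 | Mean | 0.99 | 2.66 | 34.13 | 35.30 | 0.73
1 | G2 | Std | 0.15 | 0.44 | 1.15 | 5.82 | 0.02
2 | G1 | Set 1 | 1.58 | 6.33 | 37.77 | 23.89 | 0.80
2 | G1 | Set 2 | 1.93 | 5.93 | 40.86 | 21.16 | 0.75
2 | G1 | Set 3 | 1.28 | 5.15 | 37.07 | 29.07 | 0.80
2 | G1 | Set 4 | 1.55 | 5.29 | 35.92 | 23.10 | 0.77
2 | G1 | Mean | 1.59 | 5.67 | 37.91 | 24.30 | 0.78
2 | G1 | Std | 0.27 | 0.55 | 2.11 | 3.38 | 0.02
2 | G2 | Set 1 | 1.43 | 3.42 | 35.83 | 25.11 | 0.71
2 | G2 | Set 2 | 1.69 | 4.39 | 37.42 | 22.10 | 0.72
2 | G2 | Set 3 | 1.33 | 3.38 | 34.64 | 26.08 | 0.72
2 | G2 | Set 4 | 1.56 | 3.68 | 33.70 | 21.61 | 0.70
2 | G2 | Mean | 1.50 | 3.72 | 35.39 | 23.72 | 0.71
2 | G2 | Std | 0.16 | 0.47 | 1.61 | 2.20 | 0.01
3 | G1 | Set 1 | 2.66 | 9.03 | 39.73 | 14.92 | 0.77
3 | G1 | Set 2 | 2.55 | 7.41 | 42.04 | 16.46 | 0.74
3 | G1 | Set 3 | 1.64 | 6.67 | 37.69 | 22.92 | 0.80
3 | G1 | Set 4 | 2.22 | 7.32 | 37.02 | 16.65 | 0.77
3 | G1 | Mean | 2.27 | 7.61 | 39.12 | 17.73 | 0.77
3 | G1 | Std | 0.46 | 1.00 | 2.26 | 3.54 | 0.02
3 | G2 | Set 1 | 2.01 | 4.35 | 37.23 | 18.50 | 0.68
3 | G2 | Set 2 | 2.16 | 5.10 | 38.56 | 17.85 | 0.70
3 | G2 | Set 3 | 1.70 | 4.05 | 35.55 | 20.85 | 0.70
3 | G2 | Set 4 | 1.88 | 4.16 | 34.48 | 18.30 | 0.69
3 | G2 | Mean | 1.94 | 4.41 | 36.45 | 18.88 | 0.69
3 | G2 | Std | 0.19 | 0.48 | 1.80 | 1.34 | 0.01
4 | G1 | Set 1 | 6.10 | 14.18 | 44.41 | 7.28 | 0.70
4 | G1 | Set 2 | 3.72 | 9.57 | 43.96 | 11.81 | 0.72
4 | G1 | Set 3 | 2.02 | 7.99 | 38.25 | 18.89 | 0.80
4 | G1 | Set 4 | 2.75 | 8.60 | 37.78 | 13.76 | 0.76
4 | G1 | Mean | 3.65 | 10.09 | 41.10 | 12.94 | 0.74
4 | G1 | Std | 1.77 | 2.80 | 3.57 | 4.81 | 0.04
4 | G2 | Set 1 | 2.12 | 4.50 | 37.48 | 17.68 | 0.68
4 | G2 | Set 2 | 2.59 | 5.68 | 39.55 | 15.25 | 0.69
4 | G2 | Set 3 | 2.47 | 5.14 | 37.27 | 15.08 | 0.68
4 | G2 | Set 4 | 2.18 | 4.55 | 35.17 | 16.13 | 0.68
4 | G2 | Mean | 2.34 | 4.97 | 37.37 | 16.04 | 0.68
4 | G2 | Std | 0.23 | 0.56 | 1.79 | 1.19 | 0.01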
Appendix 4—table 4
(Fusion method) Estimated parameters by combining the first k signal bins of the experimental distribution of the signal intensity.

Inference is done for each of the four data sets. The elongation time is fixed to τ = 0.785 min.

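k | Phase | Set | σoff | σon | ρ | burst size | fON
1 | G1 | Set 1 | 1.11 | 3.76 | 37.83 | 34.10 | 0.77
1 | G1 | Set 2 | 1.53 | 3.94 | 41.33 | 27.06 | 0.72
1 | G1 | Set 3 | 0.95 | 3.23 | 36.79 | 38.56 | 0.77
1 | G1 | Set 4 | 1.28 | 3.76 | 36.04 | 28.09 | 0.75
1 | G1 | Mean | 1.22 | 3.67 | 38.00 | 31.95 | 0.75
1 | G1 | Std | 0.25 | 0.31 | 2.34 | 5.39 | 0.02
1 | G2 | Set 1 | 0.74 | 1.69 | 35.00 | 47.30 | 0.70
1 | G2 | Set 2 | 0.82 | 2.18 | 36.30 | 44.37 | 0.73
1 | G2 | Set 3 | 0.91 | 2.19 | 34.54 | 37.90 | 0.71
1 | G2 | Set 4 | 1.08 | 2.61 | 33.27 | 30.76 | 0.71
1 | G2 | Mean | 0.89 | 2.17 | 34.78 | 40.08 | 0.71
1 | G2 | Std | 0.15 | 0.38 | 1.25 | 7.35 | 0.01
2 | G1 | Set 1 | 0.78 | 2.99 | 36.65 | 46.99 | 0.79
2 | G1 | Set 2 | 1.14 | 3.27 | 40.01 | 35.06 | 0.74
2 | G1 | Set 3 | 0.71 | 2.58 | 36.00 | 51.00 | 0.79
2 | G1 | Set 4 | 0.96 | 3.09 | 35.08 | 36.52 | 0.76
2 | G1 | Mean | 0.90 | 2.98 | 36.94 | 42.39 | 0.77
2 | G1 | Std | 0.19 | 0.30 | 2.15 | 7.82 | 0.02
2 | G2 | Set 1 | 0.54 | 1.30 | 34.37 | 63.20 | 0.70
2 | G2 | Set 2 | 0.67 | 1.87 | 35.74 | 53.70 | 0.74
2 | G2 | Set 3 | 0.74 | 1.88 | 33.96 | 45.75 | 0.72
2 | G2 | Set 4 | 0.94 | 2.36 | 32.84 | 34.89 | 0.72
2 | G2 | Mean | 0.72 | 1.85 | 34.23 | 49.38 | 0.72
2 | G2 | Std | 0.17 | 0.44 | 1.20 | 12.01 | 0.01
3 | G1 | Set 1 | 0.71 | 2.79 | 36.38 | 51.39 | 0.80
3 | G1 | Set 2 | 1.07 | 3.14 | 39.77 | 37.01 | 0.75
3 | G1 | Set 3 | 0.63 | 2.35 | 35.74 | 56.84 | 0.79
3 | G1 | Set 4 | 0.80 | 2.71 | 34.58 | 43.09 | 0.77
3 | G1 | Mean | 0.80 | 2.75 | 36.62 | 47.08 | 0.78
3 | G1 | Std | 0.19 | 0.32 | 2.23 | 8.78 | 0.02
3 | G2 | Set 1 | 0.52 | 1.25 | 34.30 | 65.86 | 0.71
3 | G2 | Set 2 | 0.65 | 1.84 | 35.70 | 54.75 | 0.74
3 | G2 | Set 3 | 0.71 | 1.81 | 33.85 | 47.67 | 0.72
3 | G2 | Set 4 | 0.91 | 2.31 | 32.76 | 35.91 | 0.72
3 | G2 | Mean | 0.70 | 1.80 | 34.15 | 51.05 | 0.72
3 | G2 | Std | 0.16 | 0.43 | 1.22 | 12.57 | 0.01
4 | G1 | Set 1 | 0.69 | 2.73 | 36.30 | 52.85 | 0.80
4 | G1 | Set 2 | 1.05 | 3.09 | 39.69 | 37.71 | 0.75
4 | G1 | Set 3 | 0.63 | 2.35 | 35.74 | 56.78 | 0.79
4 | G1 | Set 4 | 0.83 | 2.79 | 34.68 | 41.57 | 0.77
4 | G1 | Mean | 0.80 | 2.74 | 36.60 | 47.23 | 0.78
4 | G1 | Std | 0.19 | 0.31 | 2.17 | 9.04 | 0.02
4 | G2 | Set 1 | 0.56 | 1.34 | 34.45 | 61.05 | 0.70
4 | G2 | Set 2 | 0.66 | 1.86 | 35.73 | 54.11 | 0.74
4 | G2 | Set 3 | 0.67 | 1.72 | 33.70 | 50.56 | 0.72
4 | G2 | Set 4 | 0.92 | 2.33 | 32.79 | 35.48 | 0.72
4 | G2 | Mean | 0.70 | 1.81 | 34.17 | 50.30 | 0.72
4 | G2 | Std | 0.15 | 0.41 | 1.24 | 10.80 | 0.01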
Appendix 4—table 5
Inference of the kinetic parameters (σoff,σon,ρ) using nascent data curated with the fusion method.

Inferred values and the corresponding 95% confidence intervals of G1 and G2 cell-cycle-specific data calculated using the profile likelihood method.

nascent | Set | σoff | σon | ρ | burst size | fON | CI for σoff | CI for σon | CI for ρ
G1 | Set 1 | 0.69 | 2.73 | 36.30 | 52.85 | 0.80 | (0.46, 1.05) | (2.12, 3.52) | (35.06, 37.91)
G1 | Set 2 | 1.05 | 3.09 | 39.69 | 37.71 | 0.75 | (0.86, 1.29) | (2.73, 3.47) | (38.77, 40.92)
G1 | Set 3 | 0.63 | 2.35 | 35.74 | 56.78 | 0.79 | (0.49, 0.79) | (1.99, 2.71) | (35.06, 36.58)
G1 | Set 4 | 0.83 | 2.79 | 34.68 | 41.57 | 0.77 | (0.62, 1.16) | (2.27, 3.41) | (33.54, 36.01)
G2 | Set 1 | 0.56 | 1.34 | 34.45 | 61.05 | 0.70 | (0.40, 0.81) | (1.04, 1.73) | (33.35, 35.82)
G2 | Set 2 | 0.66 | 1.86 | 35.73 | 54.11 | 0.74 | (0.51, 0.85) | (1.57, 2.21) | (34.87, 36.77)
G2 | Set 3 | 0.67 | 1.72 | 33.70 | 50.56 | 0.72 | (0.50, 0.91) | (1.40, 2.11) | (32.78, 34.87)
G2 | Set 4 | 0.92 | 2.33 | 32.79 | 35.48 | 0.72 | (0.62, 1.45) | (1.74, 3.09) | (31.65, 34.68)
Appendix 4—table 6
Inferred transcriptional rate (normalized) per gene copy for the G2 cell cycle phase using nascent data curated with the fusion method under the assumption that the two gene states are perfectly synchronized.
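nascent fusion | Set | σoff | σon | ρ | burst size | fON
G2 sync | Set 1 | 1.84 | 4.42 | 33.84 | 18.42 | 0.71
G2 sync | Set 2 | 2.18 | 5.78 | 35.85 | 16.44 | 0.73
G2 sync | Set 3 | 2.15 | 5.36 | 33.61 | 15.63 | 0.71
G2 sync | Set 4 | 3.53 | 7.45 | 34.16 | 9.69 | 0.68
G2 sync | Mean | 2.42 | 5.75 | 34.36 | 15.05 | 0.71
G2 sync | Std | 0.75 | 1.26 | 1.01 | 3.76 | 0.02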
Appendix 6—table 1
Comparison of the performance of three methods to compute the probability distribution.

Parameters same as in Appendix 6—figure 1. The time was calculated using the Julia package BenchmarkTools.jl.

Statistic | Exact solution: precision = 85 | Exact solution: precision = 300 | FSP method
Minimum time | 6.422 ms | 9.277 ms | 8.317 ms
Median time | 6.868 ms | 9.562 ms | 9.092 ms
Mean time | 8.279 ms | 11.482 ms | 9.415 ms
Maximum time | 16.791 ms | 16.919 ms | 14.203 ms
# of Simulations | 604 | 436 | 531

Cite this article

Xiaoming Fu, Heta P Patel, Stefano Coppola, Libin Xu, Zhixing Cao, Tineke L Lenstra, Ramon Grima (2022)
Quantifying how post-transcriptional noise and gene copy number variation bias transcriptional parameter inference from mRNA distributions
eLife 11:e82493.
https://doi.org/10.7554/eLife.82493