Phenotypic landscape inference reveals multiple evolutionary paths to C4 photosynthesis

Abstract

C₄ photosynthesis has independently evolved from the ancestral C₃ pathway in at least 60 plant lineages, but, as with other complex traits, how it evolved is unclear. Here we show that the polyphyletic appearance of C₄ photosynthesis is associated with diverse and flexible evolutionary paths that group into four major trajectories. We conducted a meta-analysis of 18 lineages containing species that use C₃, C₄, or intermediate C₃–C₄ forms of photosynthesis to parameterise a 16-dimensional phenotypic landscape. We then developed and experimentally verified a novel Bayesian approach based on a hidden Markov model that predicts how the C₄ phenotype evolved. The alternative evolutionary histories underlying the appearance of C₄ photosynthesis were determined by ancestral lineage and initial phenotypic alterations unrelated to photosynthesis. We conclude that the order of C₄ trait acquisition is flexible and driven by non-photosynthetic drivers. This flexibility will have facilitated the convergent evolution of this complex trait.

https://doi.org/10.7554/eLife.00961.001

eLife digest

Plants rely on carbon for their growth and survival: in a process called photosynthesis, they use energy from sunlight to convert carbon dioxide and water into carbohydrates and oxygen gas. The chemical reactions that make up photosynthesis are powered by a chain of enzymes, and plants must ensure that these enzymes—which are in the leaves of the plant—are supplied with enough carbon dioxide and water. Carbon dioxide from the atmosphere enters plants through pores in their leaves, but water must be carried up the plant from the roots.

The type of photosynthesis used by about 90% of flowering plant species—including tomatoes and rice—is called C₃ photosynthesis. The first step in this process begins with an enzyme called RuBisCO, which reacts with carbon dioxide and a substance called RuBP to form molecules that contain three carbon atoms (hence the name C₃ photosynthesis).

In a hot climate, however, a plant can lose a lot of water through the pores in its leaves: closing these pores allows the plant to retain water, but this also reduces the supply of carbon dioxide. This causes problems because, when carbon dioxide is not readily available, RuBisCO uses oxygen to break down RuBP instead of creating sugars. To prevent this process, which wastes a lot of energy and resources, some plants (including maize, sugar cane and many other agricultural staples) have evolved an alternative process called C₄ photosynthesis. Although it is more complex than C₃ photosynthesis, and required many changes to the structure of leaves, C₄ photosynthesis has evolved on more than 60 different occasions.

In C₄ plants, the mesophyll (the region that is associated with the capture of carbon dioxide by RuBisCO in C₃ plants) contains high levels of an alternative enzyme called PEPC that converts carbon dioxide molecules into an acid that contains four carbon atoms. To avoid carbon dioxide being captured by both enzymes, C₄ plants evolved to relocate RuBisCO from the mesophyll to a second set of cells in an airtight structure known as the bundle sheath. The four-carbon acids produced by PEPC diffuse to the cells in the bundle sheath, where they are broken down into carbon dioxide molecules, and photosynthesis then proceeds as normal. This process allows photosynthesis to continue when the level of carbon dioxide in the leaf is low because the plant has closed its pores to retain water.

Since C₄ plants grow faster than C₃ plants, and also require less water, plant biologists would like to introduce certain C₄ traits into C₃ crop plants. To help with this process, Williams, Johnston et al. have used computational methods to explore how C₄ photosynthesis evolved from ancestral C₃ plants. This involved investigating the prevalence of 16 traits that are common to C₄ plants in a total of 73 species that undergo C₃ or C₄ photosynthesis (including 37 species that possess characteristics of both C₃ and C₄).

Williams, Johnston et al. then went on to produce a new mathematical model that represents evolutionary processes as pathways across a multi-dimensional “landscape”. The model shows that traits can be acquired in various orders, and that C₄ photosynthesis evolved through a number of independent pathways. Some traits that evolved early in the transitions to C₄ photosynthesis influenced how evolution proceeded, providing “foundations” upon which further changes evolved.

Interestingly, the structure of the leaf itself appeared to change before any of the photosynthetic enzymes changed. This led Williams, Johnston et al. to conclude that climate change—in particular, the declines in carbon dioxide levels that occurred in prehistoric times—was probably not responsible for the original evolution of C₄ photosynthesis. Nevertheless, these results could help with efforts to adapt important C₃ crop plants to on-going changes in our climate.

https://doi.org/10.7554/eLife.00961.002

Introduction

The convergent evolution of complex traits is surprisingly common, with examples including camera-like eyes of cephalopods, vertebrates, and cnidaria (Kozmik et al., 2008), mimicry in invertebrates and vertebrates (Santos et al., 2003; Wilson et al., 2012) and the different photosynthetic machineries of plants (Sage et al., 2011a). While the polyphyletic origin of simple traits (Hill et al., 2006; Steiner et al., 2009) is underpinned by flexibility in the underlying molecular mechanisms, the extent to which this applies to complex traits is less clear. C₄ photosynthesis is both highly complex, involving alterations to leaf anatomy, cellular ultrastructure, and photosynthetic metabolism, and also convergent, being found in at least 60 independent lineages of angiosperms (Sage et al., 2011a). As the emergence of the entire C₄ phenotype cannot be comprehensively explored experimentally, C₄ photosynthesis is an ideal system for the mathematical modelling of complex trait evolution as transitions on an underlying phenotype landscape. Furthermore, understanding the evolutionary events that have generated C₄ photosynthesis on many independent occasions has the potential to inform approaches being undertaken to engineer C₄ photosynthesis into C₃ crop species (Hibberd et al., 2008).

The C₄ pathway is estimated to have first evolved between 32 and 25 million years ago (Christin et al., 2011b) in response to multiple ecological drivers, including decreasing atmospheric CO₂ concentration (Vicentini et al., 2008). C₄ species have since radiated to represent the most productive crops and native vegetation on the planet because modifications to their leaves increase the efficiency of photosynthesis in the sub-tropics and tropics (Edwards et al., 2010). In C₄ plants, photosynthetic efficiency is improved compared with C₃ species because significant alterations to leaf anatomy, cell biology and biochemistry lead to higher concentrations of CO₂ around the primary carboxylase RuBisCO (Slack and Hatch, 1967; Langdale, 2011). The morphology of C₄ leaves is typically modified into so-called Kranz anatomy that consists of repeating units of vein, bundle sheath (BS) and mesophyll (M) cells (Hattersley, 1984; Langdale, 2011) (Figure 1—figure supplement 1). Photosynthetic metabolism becomes modified and compartmentalised between the M and BS, with M cells lacking RuBisCO but instead containing high activities of the alternate carboxylase PEPC to generate C₄ acids. The diffusion of these acids followed by their decarboxylation in BS cells around RuBisCO increases CO₂ supply and therefore photosynthetic efficiency (Zhu et al., 2008). C₄ acids are decarboxylated by at least one of three enzymes within BS cells: NADP- or NAD-dependent malic enzymes (NADP-ME or NAD-ME respectively), or phosphoenolpyruvate carboxykinase (PCK) (Hatch et al., 1975). Specific lineages of C₄ species have typically been classified into one of three sub-types, based on the activity of these decarboxylases, as well as anatomical and cellular traits that consistently correlate with each other (Furbank, 2011).

The genetic mechanisms underlying the evolution of cell-specific gene expression associated with the separation of photosynthetic metabolism between M and BS cells involve both alterations to cis-elements and trans-acting factors (Akyildiz et al., 2007; Brown et al., 2011; Kajala et al., 2012; Williams et al., 2012). Phylogenetically independent lineages of C₄ plants have co-opted homologous mechanisms to generate cell specificity (Brown et al., 2011) as well as the altered allosteric regulation of C₄ enzymes (Christin et al., 2007), indicating that parallel evolution underpins at least part of the convergent C₄ syndrome. However, while a substantial amount of work has addressed the molecular alterations that generate the biochemical differences between C₃ and C₄ plants (Williams et al., 2012), much less is known about the order and flexibility with which phenotypic traits important for C₄ photosynthesis are acquired (Sage et al., 2012). Clues to this question exist in the form of C₃–C₄ intermediates, species exhibiting characteristics of both C₃ and C₄ photosynthesis, such as the activity or localisation of C₄ cycle enzymes (Hattersley and Stone, 1986), the possession of one or more anatomical or cellular adaptations associated with C₄ photosynthesis (Moore et al., 1987), or combinations of both (e.g., Kennedy et al., 1980; Koteyeva et al., 2010). To address these unknown aspects of C₄ evolutionary history, we combined the concept of considering evolutionary paths as stochastic processes on complex adaptive landscapes (Wright, 1932; Gavrilets, 1997) with the analysis of extant C₃–C₄ intermediate species to develop a predictive model of how the full C₄ phenotype evolved.

Results

A meta-analysis of photosynthetic phenotypes

To parameterise the phenotypic landscape underlying photosynthetic phenotypes, data was consolidated from 43 studies encompassing 18 C₃, 18 C₄, and 37 C₃–C₄ intermediate species from 22 genera (Table 1). These C₃–C₄ species are from 18 independent lineages likely representing 18 distinct evolutionary origins of C₃–C₄ intermediacy (Sage et al., 2011a) (Figure 1—figure supplement 2). These studies were used to quantify 16 biochemical, anatomical, and cellular characteristics associated with C₄ photosynthesis (Figure 1—source data 1). Principal components analysis (PCA) was performed to confirm the phenotypic intermediacy of the C₃–C₄ species (Figure 1A). This result, the sister-group relationships of C₃–C₄ species with congeneric C₄ clades (McKown et al., 2005; Vogan et al., 2007; Christin et al., 2011a; Sage et al., 2011a; Khoshravesh et al., 2012) and the prevalence of extant C₃–C₄ species in genera with the most recent origins of C₄ photosynthesis (Christin et al., 2011b) all support the notion that C₃–C₄ species represent phenotypic states through which transitions to C₄ photosynthesis could occur. The combined traits of C₃–C₄ intermediate species therefore represent samples from across the space of phenotypes connecting C₃ to C₄ photosynthesis (Figure 1B). Within our meta-analysis data, C₃–C₄ phenotypes were available for 33 eudicot and 4 monocot species. 16 and 17 of these species have extant congeneric relatives performing NADP-ME or NAD-ME sub-type C₄ photosynthesis respectively. No C₃–C₄ relatives of PCK sub-type C₄ species are known (Sage et al., 2011a). Our meta-analysis therefore encompassed a variety of taxonomic lineages, as well as representing close relatives of known phenotypic variants performing C₄ photosynthesis.

Table 1

Summary of C₃–C₄ lineages assessed

https://doi.org/10.7554/eLife.00961.003

Family	Species	References*
Amaranthaceae	Alternanthera ficoides (C₃–C₄)	Rajendrudu et al. (1986)
	Alternanthera tenella (C₃–C₄)	Devi and Raghavendra (1993)
	Alternanthera pungens (C₄)	Devi et al. (1995)
Asteraceae	Flaveria cronquistii (C₃)
	Flaveria pringlei (C₃)
	Flaveria robusta (C₃)
	Flaveria angustifolia (C₃–C₄)
	Flaveria anomala (C₃–C₄)	Ku et al. (1983)
	Flaveria chloraefolia (C₃–C₄)	Holaday et al. (1984)
	Flaveria floridana (C₃–C₄)	Adams et al. (1986)
	Flaveria linearis (C₃–C₄)	Brown and Hattersley (1989)
	Flaveria oppositifolia (C₃–C₄)	Ku et al. (1991)
	Flaveria ramosissima (C₃–C₄)	Rosche et al. (1994)
	Flaveria sonorensis (C₃–C₄)	Casati et al. (1999)
	Flaveria brownii (C₃–C₄)	McKown et al. (2005)
	Flaveria vaginata (C₃–C₄)	McKown and Dengler (2007)
	Flaveria pubescens (C₃–C₄)	Gowik et al. (2011)
	Flaveria australasica (C₄)
	Flaveria bidentis (C₄)
	Flaveria kochiana (C₄)
	Flaveria trinervia (C₄)
	Parthenium incanum (C₃)	Moore et al. (1987)
	Parthenium hysterophorus (C₃–C₄)	Devi and Raghavendra (1993)
Boraginaceae	Heliotropium europaeum (C₃)
	Heliotropium calcicola (C₃)	Vogan et al. (2007)
	Heliotropium convolvulaceum (C₃–C₄)	Muhaidat et al. (2011)
	Heliotropium greggii (C₃–C₄)
	Heliotropium polyphyllum (C₄)
Brassicaceae	Moricandia foetida (C₃)	Holaday et al. (1981)
	Moricandia arvensis (C₃–C₄)	Rawsthorne et al. (1988)
	Moricandia spinosa (C₃–C₄)	Beebe and Evert (1990)
	Moricandia nitens (C₃–C₄)	Rawsthorne et al. (1998)
	Raphanus sativus (C₃)	Ueno et al. (2003)
	Diplotaxis muralis (C₃–C₄)	Ueno et al. (2006)
	Diplotaxis tenuifolia (C₃–C₄)
Chenopodiaceae	Salsola oreophila (C₃)	P’yankov et al. (1997)
	Salsola arbusculiformis (C₃–C₄)	Voznesenskaya et al. (2001)
	Salsola arbuscula (C₄)
Cleomaceae	Cleome spinosa (C₃)	Voznesenskaya et al. (2007)
	Cleome paradoxa (C₃–C₄)	Koteyeva et al. (2010)
	Cleome gynandra (C₄)
Cyperaceae	Eleocharis acuta (C₃)	Bruhl and Perry (1995)
	Eleocharis acicularis (C₃–C₄)	Keeley (1999)
	Eleocharis tetragona (C₄)
Euphorbiaceae	Euphorbia angusta (C₃)
	Euphorbia acuta (C₃–C₄)	Sage et al. (2011b)
	Euphorbia lata (C₃–C₄)
	Euphorbia mesembryanthemifolia (C₄)
Molluginaceae	Mollugo tenella (C₃)
	Mollugo verticillata (C₃–C₄)	Sayre et al. (1979)
	Mollugo nudicaulis (C₃–C₄)	Kennedy et al. (1980)
	Mollugo pentaphylla (C₃–C₄)	Christin et al. (2011a)
	Mollugo cerviana (C₄)
Poaceae	Avena sativa (C₃)	Slack and Hatch (1967)
	Neurachne tenuifolia (C₃)	Hattersley and Stone (1986)
	Neurachne minor (C₃–C₄)	Brown and Hattersley (1989)
	Neurachne munroi (C₄)
	Panicum bisulcatum (C₃)	Goldstein et al. (1976)
	Panicum hians (C₃–C₄)	Ku et al. (1976)
	Panicum milioides (C₃–C₄)	Ku and Edwards (1978)
	Panicum miliaceum (C₄)	Rathnam and Chollet (1978)
		Rathnam and Chollet (1979)
		Holaday and Black (1981)
		Hattersley (1984)
	Saccharum officinarum (C₄)	Slack and Hatch (1967)
	Sorghum bicolor (C₄)	Slack and Hatch (1967)
	Triticum aestivum (C₃)	Slack and Hatch (1967)
	Zea mays (C₄)	Slack and Hatch (1967)
Portulacaceae	Sesuvium portulacastrum (C₃)
	Portulaca cryptopetala (C₃–C₄)	Voznesenskaya et al. (2010)
	Portulaca oleracea (C₄)
Scrophulariaceae	Anticharis kaokoensis (C₃)	Khoshravesh et al. (2012)
	Anticharis ebracteata (C₃–C₄)
	Anticharis imbricata (C₃–C₄)
	Anticharis namibensis (C₃–C₄)
	Anticharis glandulosa (C₄)

The family, species, photosynthetic type and original study are listed. In total, 16 characteristics relating to C₄ photosynthesis were extracted from 43 studies encompassing 18 C₃, 18 C₄, and 37 C₃–C₄ intermediate species.
*References apply to all species within each genus.

Figure 1

Evolutionary paths to C₄ phenotype space modelled from a meta-analysis of C₃–C₄ phenotypes.

Principal component analysis (PCA) on data for the activity of five C₄ cycle enzymes confirms the intermediacy of C₃–C₄ species between C₃ and C₄ phenotype spaces (A). Each C₄ trait was considered absent in C₃ species and present in C₄ species, with previously studied C₃–C₄ intermediate species representing samples from across the phenotype space (B). With a dataset of 16 phenotypic traits, a 16-dimensional space was defined. (C) A 2D representation of 50 pathways across this space. The phenotypes of multiple C₃–C₄ species were used to identify pathways compatible with individual species (e.g., *Alternanthera ficoides* [red nodes] and *Parthenium hysterophorus* [blue nodes]), and pathways compatible with the phenotypes of multiple species (purple nodes).

https://doi.org/10.7554/eLife.00961.004

Figure 1—source data 1. Binary scoring of C₄ traits present in C₃–C₄ species. The EM algorithm was used to assign binary scores for the presence or absence of 16 C₄ traits in 37 C₃–C₄ intermediate species. 1 denotes the presence of a trait, 0 denotes absence. Blank cells denote traits that have not been defined.

https://doi.org/10.7554/eLife.00961.005

We defined each C₄ trait as either being absent (0) or present (1). For quantitative traits the expectation-maximization (EM) algorithm and hierarchical clustering were used to impartially assign binary scores (Figure 1—figure supplement 3). This generated a 16-bit string for each of the species (Figure 1—source data 1), with a presence or absence score for each of the traits included in our meta-analysis. This defined a 16-dimensional phenotype space with 2¹⁶ (65,536) nodes corresponding to all possible combinations of presence (1) and absence (0) scores for each characteristic.
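The binarisation step can be illustrated with a minimal sketch. It assumes a two-component one-dimensional Gaussian mixture fitted by EM, which is one common way to split a quantitative trait into "absent" and "present" groups; the exact configuration used in the study is not given in this section, and the function name `em_binarise` and the example values are hypothetical.

```python
import math

def em_binarise(values, iterations=100):
    """Fit a two-component 1D Gaussian mixture by EM and return 0/1 labels.

    Label 1 marks membership of the component with the higher mean
    (trait scored as present). Illustrative sketch only.
    """
    lo, hi = min(values), max(values)
    means = [lo, hi]                      # initialise components at the extremes
    spread = (hi - lo) / 4 or 1.0         # fall back to 1.0 if all values are equal
    sds = [spread, spread]
    weights = [0.5, 0.5]

    def pdf(x, mu, sd):
        return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

    resp = [0.5] * len(values)
    for _ in range(iterations):
        # E-step: responsibility of the high-mean component for each value
        resp = []
        for x in values:
            p0 = weights[0] * pdf(x, means[0], sds[0])
            p1 = weights[1] * pdf(x, means[1], sds[1])
            total = p0 + p1
            resp.append(p1 / total if total > 0 else 0.5)
        # M-step: re-estimate weight, mean and standard deviation per component
        for k in (0, 1):
            r = [ri if k == 1 else 1 - ri for ri in resp]
            n_k = sum(r)
            if n_k == 0:
                continue
            weights[k] = n_k / len(values)
            means[k] = sum(ri * x for ri, x in zip(r, values)) / n_k
            var = sum(ri * (x - means[k]) ** 2 for ri, x in zip(r, values)) / n_k
            sds[k] = max(math.sqrt(var), 1e-6)   # floor avoids a collapsed component
    return [1 if ri > 0.5 else 0 for ri in resp]

# Hypothetical enzyme activities for eight species (arbitrary units)
activities = [0.1, 0.2, 0.15, 0.12, 5.0, 5.2, 4.9, 5.1]
scores = em_binarise(activities)
```

Applying such a scoring to each of the 16 traits in turn would give the per-species bit strings described above, with unscored traits left blank.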

A novel Bayesian approach for predicting evolutionary trajectories

Many existing methods of inference for evolutionary trajectories rely on phylogenetic information or assumptions about the fitness landscape underlying evolutionary dynamics (Weinreich et al., 2005; Lobkovsky et al., 2011; Mooers and Heard, 2013). In convergent evolution, these properties are not always known, as convergent lineages may be genetically distant and associated with poor phylogenetic reconstructions. In addition, the selective pressures experienced by each may be different and dynamic. We therefore consider the convergent evolution of C₄ fundamentally as the acquisition of the key phenotypic traits identified through our meta-analysis (Figure 1B). The process of acquisition of these traits can be pictured as a path on the 16-dimensional hypercube (Figure 1C), from the node labelled with all 0’s (the C₃ phenotype, with no C₄ characteristics) to the node labelled with all 1’s (the C₄ phenotype, with all C₄ characteristics).
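A short sketch may help to picture this hypercube. It encodes each phenotype as a 16-bit integer and builds one random path from the C₃ node to the C₄ node, under the restriction imposed in this analysis that traits are gained one at a time and never lost. The names `N_TRAITS` and `random_path` are illustrative and do not come from the study.

```python
import math
import random

N_TRAITS = 16
C3 = 0                       # all traits absent (bit string of sixteen 0's)
C4 = (1 << N_TRAITS) - 1     # all traits present (bit string of sixteen 1's)

def random_path(rng):
    """Return the node sequence of one random C3-to-C4 path.

    Each step sets exactly one previously unset bit, so traits are gained
    one at a time and never lost.
    """
    order = list(range(N_TRAITS))
    rng.shuffle(order)
    state, path = C3, [C3]
    for trait in order:
        state |= 1 << trait
        path.append(state)
    return path

path = random_path(random.Random(0))
n_nodes = 1 << N_TRAITS              # number of phenotypes on the hypercube
n_paths = math.factorial(N_TRAITS)   # number of orderings of 16 single-trait gains
```

Under these assumptions every path has 17 nodes, and the number of distinct orderings (16!) is far too large to enumerate, which is one reason a sampling-based approach is natural here.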

The phenotypic landscape underlying the evolution of C₄ photosynthesis was then modelled as a transition network, with weighted edges describing the probability of transitions occurring between two phenotypic states (two nodes on the hypercube, Figure 1—figure supplement 4). Observed intermediate points were then used to constrain the structure of these phenotypic landscapes. To do this, we developed inferential machinery based on the framework of Hidden Markov Models (HMMs) (Rabiner, 1989) (Figure 1—figure supplement 4) and simulated an ensemble of Markov chains on trial transition networks. Each of these chains represents a possible evolutionary pathway from C₃ to C₄, and passes through several intermediate phenotypic states. The likelihood of observing intermediate states with characteristics compatible with the biologically observed data on C₃–C₄ intermediates was recorded for the set of paths supported on each trial network. A Bayesian MCMC procedure was used to sample from the set of networks most compatible with the meta-analysis dataset, and thus most likely to represent the underlying dynamics of C₄ evolution. The order in which phenotypic characteristics were acquired was recorded for paths on each network compatible with the C₃–C₄ species data, and posterior probability distributions (given uninformative priors) for the time-ordered acquisition of each C₄ trait were generated. For further information and mathematical details, see ‘Methods’.
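The idea of scoring a trial network by how well its paths agree with partially observed intermediates can be sketched as follows. This is a simplified Monte Carlo illustration, not the HMM and MCMC machinery described in 'Methods': it assumes a hypothetical parameterisation with one weight per trait rather than one per edge, and the names `sample_path`, `compatible` and `support` are invented for the example.

```python
import random

def sample_path(weights, rng):
    """Sample one path in which each not-yet-acquired trait is gained
    with probability proportional to its weight."""
    remaining = list(range(len(weights)))
    state, path = 0, [0]
    while remaining:
        trait = rng.choices(remaining, weights=[weights[t] for t in remaining])[0]
        remaining.remove(trait)
        state |= 1 << trait
        path.append(state)
    return path

def compatible(state, observation):
    """True if the state agrees with every scored trait; unscored traits are ignored."""
    return all(((state >> t) & 1) == v for t, v in observation.items())

def support(weights, observations, n_paths=2000, seed=0):
    """Fraction of sampled paths that visit a compatible state for every observation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_paths):
        path = sample_path(weights, rng)
        if all(any(compatible(s, obs) for s in path) for obs in observations):
            hits += 1
    return hits / n_paths

# Toy example with four traits and two partially scored intermediates:
# species A has trait 0 but not trait 1; species B has traits 0 and 1 but not 2.
observations = [{0: 1, 1: 0}, {0: 1, 1: 1, 2: 0}]
early_first = support([8, 4, 2, 1], observations)   # favours gaining trait 0 early
late_first = support([1, 2, 4, 8], observations)    # favours gaining trait 0 late
```

In this toy case the weights that favour early acquisition of trait 0 receive higher support, which is the kind of signal a sampler over networks could use to concentrate on landscapes consistent with the observed intermediates.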

To model the evolutionary paths generating C₄ without requiring additional dimensionality, we imposed that only one C₄ trait may be acquired at a time, and loss of acquired C₄ traits was forbidden. To test if we were nevertheless able to detect traits acquired simultaneously in evolution, we tested our approach on artificial positive control datasets containing intermediate nodes representing a stepwise evolutionary sequence of events (Figure 2A) and an evolutionary pathway in which four traits are acquired simultaneously (Figure 2B). Our approach clearly assigned equal acquisition probabilities to traits whose timing was linked in the underlying dataset, even when 50% of the data was occluded (Figure 2B). These data are consistent with this approach detecting the simultaneous acquisition of traits in evolution, even though single-trait acquisitions are simulated.

Figure 2

Verifying a novel Bayesian approach for predicting evolutionary trajectories.

(A and B) Datasets were obtained from an artificially constructed diagonal dynamic matrix (A), and a diagonal matrix with linked timing of locus acquisitions (B). The single, diagonal evolutionary trajectory was clearly replicated in both examples, over a time-scale of 16 individual steps, or four coarse-grained quartiles. We subjected these artificial datasets to our inferential machinery with fully characterised artificial species, and with 50% of data occluded in order to replicate the proportion of missing data from our C₃–C₄ dataset. (C) When applied to our meta-analysis of C₃–C₄ data, predictions were generated for every trait missing from the biological dataset. We tested this predictive machinery by generating 29 artificial datasets, each missing one data point, and comparing the presence/absence of the trait as predicted by our approach with the experimental data from the original study. (D and E) Quantitative real-time PCR (qPCR) was used to verify the predicted phenotypes of four C₃–C₄ species. The abundance of *RbcS* (D) and *MDH* (E) transcripts was determined from six *Flaveria* species. White bars represent phenotypes already determined by other studies, grey bars those that were predicted by the model, and asterisks denote intermediate species phenotypes correctly predicted by our approach. Error bars indicate SEM (N = 3).

https://doi.org/10.7554/eLife.00961.010

Verifying prediction accuracy

The presence and absence of unknown phenotypes were predicted by recording all phenotypes encountered along a set of simulated evolutionary trajectories that were compatible with the data from a given species (Figure 1—figure supplement 4), and calculating the posterior distribution of the proportion of these phenotypes with the value 1 for the unknown trait. If the mean of this distribution was <25% or >75%, and that value fell outside one standard deviation of the mean, the missing trait was assigned a strong prediction of absence or presence. To comprehensively test the accuracy of our predictive machinery, we generated 29 occluded datasets, each consisting of the original full dataset with one randomly chosen data point removed. The predicted phenotype of each missing trait was then compared with the known phenotype published in the original study. Of the 29 occluded traits, 18 were strongly predicted to be present or absent, and the remaining 11 predictions were neutral. Of the 18 strongly predicted traits (i.e., <25% or >75% probability), 15 were correct, with only one false positive and two false negative predictions (Figure 2C). The approach therefore assigns neutral predictions much more frequently than false positive or false negative predictions, suggesting that its outputs are highly conservative, and thus unlikely to produce artefacts. Predictions were generated for phenotypes that have not yet been described in C₃–C₄ species (Figure 2—figure supplement 1). Quantitative real-time PCR experimentally verified a subset of these, relating to abundance of C₄ enzymes not previously measured (Figure 2D–E). We also found that the model was able to successfully infer evolutionary dynamics in artificially constructed datasets (Figure 2A–B). Taken together, these prediction and verification studies illustrate that our approach robustly identifies key features of C₄ evolution.
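The decision rule and the reported tally can be written out as a brief sketch. The wording of the second condition is open to interpretation; the code below assumes it means that the 25% or 75% threshold lies more than one standard deviation away from the posterior mean, and this reading is an assumption rather than a statement of the published method. The function name `classify` and the example samples are hypothetical.

```python
from statistics import mean, pstdev

def classify(samples, low=0.25, high=0.75):
    """Classify a missing trait from posterior samples of P(trait = 1).

    Assumed reading of the rule: a strong call requires the mean to lie
    beyond the threshold by more than one standard deviation.
    """
    m, sd = mean(samples), pstdev(samples)
    if m < low and m + sd < low:
        return "absent"
    if m > high and m - sd > high:
        return "present"
    return "neutral"

# Hypothetical posterior samples for three missing traits
calls = [
    classify([0.90, 0.95, 0.92]),   # tightly concentrated near 1
    classify([0.05, 0.10, 0.08]),   # tightly concentrated near 0
    classify([0.70, 0.90, 0.80]),   # mean above 75% but too widely spread
]

# Tally reported in the text for the 29 occluded data points
strong, neutral = 18, 11
correct, false_pos, false_neg = 15, 1, 2
accuracy_of_strong_calls = correct / strong   # 15 of 18 strong calls were correct
```

The tally is internally consistent: 15 correct, one false positive and two false negatives account for all 18 strong predictions, and these plus the 11 neutral predictions account for all 29 occluded data points.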

A high-resolution model for the evolutionary events generating C₄

The posterior probability distributions for the acquisition time of each phenotypic trait were combined to produce an objective, computationally generated blueprint for the order of evolutionary events generating C₄ photosynthesis (Figure 3). These results were consistent with previous work on subsets of C₄ lineages that proposed the BS-specificity of GDC occurs prior to the evolution of C₄ metabolism (Hylton et al., 1988; Rawsthorne et al., 1988; Devi et al., 1995; Sage et al., 2012), and loss of RuBisCO from M cells occurs late (Cheng et al., 1988; Khoshravesh et al., 2012), but also provided higher resolution insight into the order of events generating C₄ metabolism. Alterations to leaf anatomy as well as cell-specificity and increased abundance of multiple C₄ cycle enzymes were predicted to evolve prior to any alteration to the primary C₃ and C₄ photosynthetic enzymes RuBisCO and phosphoenolpyruvate carboxylase (PEPC) (Figure 3).

Figure 3

The mean ordering of phenotypic changes generating C₄ photosynthesis.

EM-clustered data from C₃–C₄ intermediate species were used to generate posterior probability distributions for the timing of the acquisition of C₄ traits in sixteen evolutionary steps (A) or four quartiles (B). Circle diameter denotes the mean posterior probability of a trait being acquired at each step in C₄ evolution (the Bayes estimator for the acquisition probability). Halos denote the standard deviation of the posterior. The 16 traits are ordered from left to right by their probability of being acquired early to late in C₄ evolution. Abbreviations: bundle sheath (BS), glycine decarboxylase (GDC), chloroplasts (CPs), decarboxylase (Decarb.), pyruvate, orthophosphate dikinase (PPDK), malate dehydrogenase (MDH), phosphoenolpyruvate carboxylase (PEPC).

https://doi.org/10.7554/eLife.00961.012

There was also strong evidence for enlargement of BS cells as an early innovation in most C₄ lineages (Figure 3), consistent with the suggestion that this was an ancestral state within C₃ ancestors of C₄ grass lineages and that this contributed to the high number of C₄ origins within this family (Christin et al., 2013; Griffiths et al., 2013). The compartmentation of PEPC into M cells and its increased abundance compared with C₃ leaves was predicted to occur at similar times, but for all other C₄ enzymes the evolution of increased abundance and cellular compartmentation were clearly separated by the acquisition of other traits (Figure 3). This result is consistent with molecular analysis of genes encoding C₄ enzymes that indicates cell-specificity and increased expression are mediated by different cis-elements (Akyildiz et al., 2007; Kajala et al., 2012; Wiludda et al., 2012).

Two approaches were taken to verify that these conclusions are robust and accurately reflect biological data. First, the analysis was repeated using scores for presence or absence of traits that were assigned by hierarchical clustering, as opposed to using the EM algorithm (Figure 3—figure supplement 1A). Although hierarchical clustering generated differences in the scoring of a small number of traits, the predicted evolutionary trajectories were not affected, producing highly similar results (Figure 3—figure supplement 1B). Second, we introduced structural changes to the phenotype space, by both adding and subtracting traits from the analysis (Figure 3—figure supplement 2). Removing two independent pairs of traits from the analysis did not affect the predicted timing of the remaining 14 traits (Figure 3—figure supplement 2A–B). However, increased standard deviations were observed in some cases (e.g., for the probabilities of acquiring enlarged BS cells, or decreased vein spacing) likely a consequence of using fewer data. To test if the addition of data might also affect the results, we performed an analysis with two additional traits included (Figure 3—figure supplement 2C). We selected two traits that have been widely observed in C₃–C₄ species, the centripetal positioning of mitochondria and the centrifugal or centripetal position of chloroplasts within BS cells (Sage et al., 2012). Despite the widespread occurrence of these traits, their functional importance remains unclear (Sage et al., 2012). Consistent with observations made from several genera, we predict that these cellular alterations are acquired early in the evolution of C₄ photosynthesis (Hylton et al., 1988; McKown and Dengler, 2007; Muhaidat et al., 2011; Sage et al., 2011b). Importantly, including these additional early traits in the analysis did not alter the predicted order of the original 16 traits. Together, these analyses did not alter our main conclusions, suggesting that they are robust.

The order of C₄ trait evolution is flexible

In addition to the likely order of evolutionary events generating C₄ photosynthesis, the number of molecular alterations required is also unknown. We therefore aimed to test if multiple traits were predicted to evolve with linked timing, and therefore likely mediated by a single underlying mechanism. To achieve this, we performed a contingency analysis by considering trajectories across phenotype space beginning with a given initial acquisition step. In this analysis, the starting genome had one of the 16 traits acquired and the rest absent, and the contingency of the subsequent trajectory upon the initial step was recorded. This approach was designed to test if acquiring one C₄ trait increased the probability of subsequently acquiring other traits, thus detecting if the evolution of multiple traits is linked by underlying mechanisms. Inflexible linkage between multiple traits was detected in artificial positive control datasets (Figure 2B) but not in the C₃–C₄ dataset (Figure 3—figure supplement 3). This result suggests that the order of C₄ trait acquisition is flexible. Multiple origins of C₄ may therefore have been facilitated by this flexibility in the evolutionary pathways connecting C₃ and C₄ phenotypes.

C₄ evolved via multiple distinct evolutionary trajectories

Our Bayesian analysis strongly indicates that there are multiple evolutionary pathways by which C₄ traits are acquired by all lineages of C₄ plants. First, no single sequence of acquisitions was capable of producing intermediate phenotypes compatible with all observations (‘Methods’). Second, several traits such as compartmentation of GDC into BS and the increased number of chloroplasts in the BS clearly displayed bimodal probability distributions for their acquisition (Figure 3). This bimodality is indicative of multiple distinct pathways to C₄ photosynthesis that acquire traits at earlier or later times. To investigate factors underlying this bimodality, we inferred evolutionary pathways generating the C₄ leaf using data from monocot and eudicot lineages, or from lineages using NAD malic enzyme (NAD-ME) or NADP malic enzyme (NADP-ME) as their primary C₄ acid decarboxylase. PCA on the entire set of inferred transition networks for monocot and eudicot subsets revealed distinct separation (Figure 4A), suggesting that the topology of the evolutionary landscape surrounding C₄ is largely different for these two anciently diverged taxa. Performing this PCA including networks that were inferred from the full data set (with both lineages) confirmed that this separation is a robust result and involves posterior variation on a comparable scale to that of the full set of possible networks (Figure 4—figure supplement 1). Analysis of the posterior probabilities of the mean pathways representing either monocots or eudicots revealed that this separation is the result of differences in the timing of events generating both anatomical and biochemical traits (Figure 4C). We propose that the ancient divergence of the monocot and eudicot clades constrained the evolution of C₄ photosynthesis to broadly different evolutionary pathways in each.

Figure 4 (with 1 supplement)

Differences in the evolutionary events generating different C₄ sub-types and distantly related taxa.

Principal component analysis (PCA) on the entire landscape of transition probabilities using only monocot and eudicot data (A) and data from NADP-ME and NAD-ME sub-type lineages (B) shows broad differences between the evolutionary pathways generating C₄ in each taxon. Monocots and eudicots differ in the predicted timing of events generating C₄ anatomy and biochemistry (C), whereas NADP-ME and NAD-ME lineages differ primarily in the evolution of decreased vein spacing and greater numbers of chloroplasts in BS cells (D).

https://doi.org/10.7554/eLife.00961.016

There was more overlap between the landscapes generating NAD-ME and NADP-ME species (Figure 4B), likely reflecting the convergent origins of NAD-ME and NADP-ME sub-types (Furbank, 2011; Sage et al., 2011a). Despite the traditional definition of these lineages on the basis of biochemical differences, we detected differences in the timing of their anatomical evolution (Figure 4D). For example, in NAD-ME lineages, increased vein density was predicted to be acquired early in C₄ evolution, while in NADP-ME species this trait showed a broadly different trajectory (Figure 4D). The proliferation of chloroplasts in the BS was also acquired with different timings between the two sub-types. The alternative evolutionary pathways generating the NADP-ME and NAD-ME sub-types were therefore defined by differences in the timing of anatomical and cellular traits that are predicted to precede the majority of biochemical alterations (Figure 3, Figure 4D). We therefore conclude that these distinct sub-types evolved as a consequence of alternative evolutionary histories in response to non-photosynthetic pressures. Furthermore, we propose that early evolutionary events determined the downstream phenotypes of C₄ sub-types by restricting lineages to independent pathways across phenotype space.

Discussion

A novel Bayesian technique for inferring stochastic trajectories

The adaptive landscape metaphor has provided a powerful conceptual framework within which evolutionary transitions can be modelled (Gavrilets, 1997; Whibley et al., 2006; Lobkovsky et al., 2011). However, the majority of complex biological traits provide numerous challenges in utilising such an approach, including missing phenotypic data, incomplete phylogenetic information and in the case of convergent evolution, variable ancestral states. Here we report the development of a novel, predictive Bayesian approach that is able to infer likely evolutionary trajectories connecting phenotypes from sparsely sampled, highly stochastic data. With this model, we provided insights into the evolution of one of the most complex traits to have arisen in multiple lineages: C₄ photosynthesis. However, as our approach is not dependent on detailed phylogenetic inference, we propose that it could be used to model the evolution of other complex traits, such as those in the fossil record, which are also currently limited by the fragmented nature of data available (Kidwell and Holland, 2002). Our approach is also not limited by the time-scale over which predicted trajectories occur. As a result, it may be useful in inferring pathways underlying stochastic processes occurring over much shorter timescales, such as disease or tumour progression, or the differentiation of cell types.

C₄ evolution was initiated by non-photosynthetic drivers

A central hypothesis for the ecological drivers of C₄ evolution is that declining CO₂ concentration in the Oligocene decreased the rate of carboxylation by RuBisCO, creating a strong pressure to evolve alternative photosynthetic strategies (Christin et al., 2008; Vicentini et al., 2008). According to this hypothesis, alterations to the localisation and abundance of the primary carboxylases PEPC and RuBisCO would be expected to occur early in the evolutionary trajectories generating C₄. In contrast, our results predict that alterations to anatomy and cell biology preceded the majority of biochemical alterations, and that other enzymes of the C₄ pathway were recruited prior to PEPC and RuBisCO (Figure 3). These enzymes, such as PPDK and C₄ acid decarboxylases, function in processes not related to photosynthesis within leaves of C₃ plants (Aubry et al., 2011), so the early changes to abundance and localisation of these enzymes within C₄ lineages may have been driven by non-photosynthetic pressures. A recent in silico study also predicts that changes to photorespiratory metabolism and GDC in BS cells evolved prior to the C₄ pathway (Heckman et al., 2013). Our model predicts that BS-specificity of GDC was acquired early in C₄ evolution for the majority of lineages. However, we also note that the predicted timing of GDC BS-specificity is bimodal in our analysis (Figure 3), and that it is not predicted to be acquired early in monocot lineages (Figure 4C). These results suggest that early acquisition of GDC BS-specificity is not a feature of C₄ evolution that occurred repeatedly in all lineages.

Recent evidence from physiological and ecological studies has identified a number of additional environmental pressures that may have driven the evolution and radiation of C₄ lineages, including high evaporative demands (Osborne and Sack, 2012) and increased fire frequency (Edwards et al., 2010). Increased BS volume and vein density have been proposed as likely adaptations to improve leaf hydraulics under drought (Osborne and Sack, 2012; Griffiths et al., 2013), but nothing is known about how early recruitment of GDC, PPDK, and C₄ acid decarboxylases (Figure 3) may relate to these pressures. A better understanding of the mechanisms underlying the recruitment of these enzymes (Brown et al., 2011; Kajala et al., 2012; Wiludda et al., 2012) may help identify the key molecular events facilitating C₄ evolution.

Our data also suggest that modifications to leaf development drove the evolution of diverse C₄ sub-types. For example, we find that differences in the timing of events altering leaf vascular development and BS chloroplast division occur prior to the appearance of the alternative evolutionary pathways generating the NADP-ME and NAD-ME biochemical sub-types (Figure 4D). These traits are predicted to evolve prior to any alterations to the C₄ acid decarboxylase enzymes that traditionally define these sub-types (Furbank, 2011). As a homologous mechanism has been shown to regulate the cell-specificity of gene expression in both NADP-ME and NAD-ME gene families in independent lineages (Brown et al., 2011), it is unlikely that mechanisms underlying the recruitment of these enzymes drove the evolution of distinct sub-types. We therefore conclude that these different sub-types evolved as a consequence of alternative evolutionary histories in leaf development, rather than biochemical or photosynthetic pressures. This may explain why differences in the carboxylation efficiency or photosynthetic performance of different C₄ sub-types have never been detected (Furbank, 2011), making the adaptive significance of different decarboxylation mechanisms difficult to explain. Instead, we propose that early evolutionary events determined the downstream phenotypes of C₄ sub-types by restricting lineages to independent pathways across phenotype space. The numerous differences in leaf development and cell biology between C₄ sub-types (Furbank, 2011) may provide clues as to which developmental changes underlie subsequent differences in metabolic evolution.

Convergent evolution was facilitated by flexibility in evolutionary trajectories

C₄ photosynthesis provides an excellent example of how independent lineages with a wide range of ancestral phenotypes can converge upon similar complex traits. Several studies on simpler traits have demonstrated that convergence upon a phenotype can be specified by diverse genotypes, and thus non-homologous molecular mechanisms in independent lineages (Wittkopp et al., 2003; Hill et al., 2006; Steiner et al., 2009). Taken together, our data indicate that flexibility in the viable series of evolutionary events has also facilitated the convergence of this highly complex trait. First, we show that at least four distinct evolutionary trajectories underlie the evolution of C₄ lineages (Figure 4). Second, we find no evidence for inflexible linkage between the predicted timing of distinct C₄ traits (Figure 3—figure supplement 1). This diversity in viable pathways also helps explain why C₄ has been accessible to such a wide variety of species and not limited to a smaller subset of the angiosperm phylogeny. A recent model for the evolution of the biochemistry associated with the C₄ leaf also found that C₄ photosynthesis was accessible from any surrounding point of a fitness landscape (Heckman et al., 2013). Our study of C₄ anatomy, biochemistry, and cell biology likewise suggests that the C₄ phenotype is accessible from multiple trajectories. Encouragingly, the trajectories predicted by Heckman et al. (2013) were found to pass through phenotypes of C₃–C₄ species, despite the fact that these species were not used to parameterise the evolutionary landscape. As different mechanisms generate increased abundance and cell-specificity for the majority of C₄ enzymes in independent C₄ lineages (reviewed in Langdale, 2011; Williams et al., 2012), it is likely that mechanistic diversity underlies the multiple evolutionary pathways generating C₄ photosynthesis and may be a key factor in facilitating the convergent evolution of complex traits. This may benefit efforts to recapitulate the acquisition of C₄ photosynthesis through the genetic engineering of C₃ species (Hibberd et al., 2008), expanding the molecular toolbox available to establish C₄ traits in distinct phenotypic backgrounds.

Author details

Ben P Williams

Iain G Johnston

Sarah Covshoff

Julian M Hibberd
