Deep Evolutionary Comparison of Gene Expression Identifies Parallel Recruitment of Trans-Factors in Two Independent Origins of C4 Photosynthesis

With at least 60 independent origins spanning monocotyledons and dicotyledons, the C4 photosynthetic pathway represents one of the most remarkable examples of convergent evolution. The recurrent evolution of this highly complex trait involving alterations to leaf anatomy, cell biology and biochemistry allows an increase in productivity by ∼ 50% in tropical and subtropical areas. The extent to which separate lineages of C4 plants use the same genetic networks to maintain C4 photosynthesis is unknown. We developed a new informatics framework to enable deep evolutionary comparison of gene expression in species lacking reference genomes. We exploited this to compare gene expression in species representing two independent C4 lineages (Cleome gynandra and Zea mays) whose last common ancestor diverged ∼ 140 million years ago. We define a cohort of 3,335 genes that represent conserved components of leaf and photosynthetic development in these species. Furthermore, we show that genes encoding proteins of the C4 cycle are recruited into networks defined by photosynthesis-related genes. Despite the wide evolutionary separation and independent origins of the C4 phenotype, we report that these species use homologous transcription factors to both induce C4 photosynthesis and to maintain the cell specific gene expression required for the pathway to operate. We define a core molecular signature associated with leaf and photosynthetic maturation that is likely shared by angiosperm species derived from the last common ancestor of the monocotyledons and dicotyledons. We show that deep evolutionary comparisons of gene expression can reveal novel insight into the molecular convergence of highly complex phenotypes and that parallel evolution of trans-factors underpins the repeated appearance of C4 photosynthesis. Thus, exploitation of extant natural variation associated with complex traits can be used to identify regulators. Moreover, the transcription factors that are shared by independent C4 lineages are key targets for engineering the C4 pathway into C3 crops such as rice.