A within-species analysis reveals the adaptive value of vein density along climatic gradients
Introduction
Functional ecology holds promise for revealing the phenotypic determinants of the adaptation of organisms to their local environment thanks to the comparison of plant functional traits across species or genotypes on a physiological basis (Calow, 1987; Keddy, 1992; Violle et al., 2007; Garnier et al., 2016). In that respect, plant ecology has gained tremendous progress in our understanding of local adaptation through structure-function analyses at the leaf level (e.g., REF Kikuzawa, wright et al. 2004, un papier de Reich). Notably, leaf vein density (VD) has a strong physiological basis, and has been advanced as a key trait to explain both leaf-level variation in metabolism (Brodribb et al., 2007; Brodribb & Feild, 2010) and plant-level adaptation to climate (Blonder et al., 2018). This echoes macroevolution studies that have long emphasized the complexification of leaf vein architecture as a major morphological innovation in the success of Angiosperms over Gymnosperm and ferns (Roth-Nebelsick et al., 2001; Boyce C. Kevin et al., 2009; de Boer et al., 2012; Simonin & Roddy, 2018). In addition, leaf vein traits have been proposed as a palaeoclimate and palaeoenvironmental proxy (Uhl & Mosbrugger, 1999; Blonder et al., 2014). However, most of these studies are based on the environment-leaf venation-fitness linkage that is in fact still poorly known. Establishing trait-environment relationships (TERs) is a foundation stone of functional ecology and functional biogeography (Violle et al., 2014; Garnier et al., 2016), but many of them have been hardly examined in plants (Shipley et al., 2016). This is especially true for VDclimate relationships even if theoretical expectations have been provided. Physiological models hypothesized that VD is positively linked to growing season temperature, based on an assumed coupling between leaf transpiration rate and the maximum potential water demand of the environment (potential evapotranspiration) (Blonder & Enquist, 2014). Higher VD strategies should also be advantageous in arid environments given the putative role of leaf venation in withstanding hydraulic continuity failure (i.e. embolism) resulting from soil water depletion (Brodribb et al., 2016). Some studies indeed highlighted a positive relationship between VD and temperature or aridity (Zhu et al., 2012)(Blonder et al., 2018)(Schneider et al., 2017). However, many others also showed a lack of climate signal on VD variation (e.g., Jordan et al., 2013). Overall, VD-climate relationships appeared to be taxon-specific, which can blur crossspecies explorations. The comparison of intraspecific and interspecific TERs has gained momentum in functional ecology (Siefert et al., 2015), and has allowed to reveal physiological mechanisms at play. More generally, cross-species TERs are sensitive to many sampling bias Chapitre 2 – Introduction 62 (Borgy et al., 2017), and phylogeny-controlled TERs can only partly capture their underlying mechanisms (Violle et al., 2014). TERs are classically built trait-by-trait, i.e. by overlooking trait covariation. Such a single-trait approach can lead to spurious physiological response curves if one or several traits covary with the trait and the environmental factor under scrutiny (Wüest et al., 2018). This could be particularly true for VD that is computed as the total path length of vein conduit divided by leaf area. By construction, it is a leaf area-based trait, which makes it a scaledependent parameter (Price et al., 2014a). If VD is under the control of leaf area, TERs can be quite complex because the latter is already known to display a strong environmental signal at both local and global scales (Moles et al., 2014; Wright et al., 2017). In this context, the scaling relationship for vascular architecture with leaf size is pivotal to examine (Sack et al., 2012). Sack et al. (2012) advocated that the relationship between VD and leaf area depends on the developmental stage of the leaves, and by consequence, on the different orders of veins that are used to compute VD. For full mature leaves, the authors did not expect any relationship between the density of minor veins as well of total VD and leaf area. Global cross-species analyses indeed revealed a lack of relationship between total VD and leaf area (Price et al., 2012; Sack et al., 2012). On the opposite, a negative relationship was found in several taxa and in more local studies (Roth-Nebelsick et al., 2001). The divergence between global interspecific patterns and local and/or intraspecific patterns is common in functional ecology (Price et al., 2014b; Messier et al., 2017; Anderegg et al., 2018; Osnas et al., 2018). It can translate differential actions of evolutionary, ecophysiological and biophysical constraints at different scales and biological organizational levels. Within-species studies are required to go deeper into the characterization of the constraints at the origin of trait-trait relationships. In particular, allometric scaling relationships are expected to hold within species if they result from fundamental biophysical (Witting, 1998; Shoval et al., 2012) and/or evolutionary (Donovan et al., 2011) constraints. More broadly, it is the time for functional ecology to meet molecular ecology in order to reveal the adaptive meaning of plant functional traits. In particular, model species for which genetic information is available are the best candidates to examine whether VD is linked to local adaptation. The colocalization (or lack of) of genes involved in both VD and leaf area could further be very promising to identify pleiotropic effects and explain the interdependency of both traits. Among model species, Arabidopsis thaliana has been used for decades in molecular biology due to its Chapitre 2 – Material and Methods 63 short life cycle and small genome (Krämer, 2015). The natural distribution of the species covers large climatic gradients that led to a strong genetic structure of the populations, suggesting underlying local adaptation (Lasky et al., 2012). Recently, A. thaliana has been used in functional ecology to quantify the heritability of functional traits and plant ecological strategies (Vasseur et al., 2018b; Kazakou et al., 2019) as well as to identify the mechanisms at the origin of functional tradeoffs and allometric scaling relationships (Vasseur et al., 2012, 2018a; Blonder et al., 2015; Sartori et al., 2019). Surprisingly, the analysis of the natural variability of VD in A. thaliana remains scarce (Rishmawi et al., 2017), as well as the exploration of VDenvironment linkage and the underlying local adaptation (Stewart et al., 2015, 2016). The combined use of the unique genetic data available for the species and the newly developed fast and efficient genome analysis methods (Zhou & Stephens, 2012; Luu et al., 2017) allows an unprecedented exploration of the VD genetic determinism and adaptive value across the distribution range of A. thaliana. Specifically, following a recent study that highlights drought resistance-associated alleles at both Northern and Southern extremes of A. thaliana distribution (Exposito-Alonso et al., 2018), we expect genes conferring high vein density to be under selection at these margins, too.
Plant material and growth conditions
We selected a set of 169 A. thaliana genotypes covering the natural distribution of the species (Fig. 1) and loaded the genetic sequences from the 1001 genome website (1001genomes.org). Seeds were sown in moist organic compost and stratified in a cold chamber at 4 °C for four days. Four seedlings per genotype were then transferred in individual pots filled up with organic compost (Neuhaus N2). Pots were randomly distributed on four tables, i.e. blocks, with one replicate per genotype per table. Tables were placed in a greenhouse with temperature maintained at 18°C during the day and 16°C during the night and with a supplemental lighting to maintain a constant 12.5 h day length. Plants were watered twice a week. Tables were rotated daily to reduce block effects within the greenhouse. The experiment lasted 140 days, from sowing (1st of December 2015) to the last harvest (19th of April 2016).
Vein density measurement
We harvested the last developed leaf which was fully expanded and fully exposed to light, at the bolting stage of each individual plant. Doing so, we aimed to measure traits expressed by adult leaves and avoid bias potentially caused by ontogenic trait variations (Vasseur et al., 2018b). Leaf tissues were fixed by placing the leaves in individual micro-tubes filled with Formalin–Acid–Alcohol for at least two days. Leaves were then cleared by a solution of 95% ethanol and 5% glacial acetic acid for 24 h. To ensure a high contrast between the vein network and other leaf tissues, the solution was supplemented with 0.001% of safranin powder. Then, leaves were dipped one hour in pure glycerol before mounting between two glass blades. We took pictures of the samples using a backlight device and a digital camera (Nikon D300s) equipped with a macro lens, at a 100 pixel per millimeter resolution. Leaf veins were manually traced using the Gimp software (The GIMP Development Team, 2019) and the vein networks were analyzed using MATLAB (Thompson & Shure, 1995) with the code provided in Blonder et al. (2018). Vein density was computed as the ratio of total vein length to leaf area. Leaf area were measured using the ImageJ software (Schneider et al., 2012). We compared the VD range in our dataset to the range reported in a previous A. thaliana study and global interspecific study using published datasets (Sack et al., 2012; Rishmawi et al., 2017).
Statistical analyses
Statistical analyses were performed with the R software (R Core Team, 2019, version 3.6.1). We calculated the genotype means of both VD and LA by estimating the marginal means of the variables from linear mixed models. The linear mixed models included the genotype as a random effect and the experimental block as a fixed effect. The linear mixed models were performed with the lme function from the nlme package (Pinheiro et al., 2020). The marginal means were computed with the emmeans function from the emmeans package (Searle et al., 1980). We evaluated the part of VD variation explained by LA and by the genotypes using a partial regression model, performed using the varpart function from the vegan package We explored the relationships between leaf traits and mean annual temperature and precipitation using spearman correlation test. When the correlation was significant but the distribution of the trait were asymmetric, we fitted 5th and 95th quantile linear regressions. The mean annual temperature and mean annual precipitation of genotypes’ collecting sites were extracted from the CHELSA database (chelsa-climate.org). The 5th and 95th quantile regressions and their slope comparison were performed using the quantreg package.
Genotype-phenotype associations
Taking advantage from the genetic data available for A. thaliana, we estimated the narrow sense heritability (h2) of the traits, i.e. the phenotypic variance due to the additive effect of the alleles. We used the Bayesian sparse linear mixed model using a Markov chain Monte Carlo method performed by the GEMMA software (Zhou et al., 2013). The model takes the genetic structure into account by computing the relatedness matrix of the genotypes. The model compute the overall additive effect of all the single nucleotide polymorphisms (SNPs) on the phenotype. To look for specific association between particular SNPs and the phenotype, and detect candidate genes, we performed genome wide association studies (GWAs) for both VD and LA. We used a linear mixed model performed with GEMMA (Zhou & Stephens, 2012) that Chapitre 2 – Material and Methods 66 controls for the genetic structure of the dataset by using the genetic relatedness matrix. In this case, we calculated the relatedness matrix using the -gk GEMMA function. We ensured that the tests were not liberal looking at the histograms of significance values (François et al., 2016) and calculated q-values (corrected p-values for the false discovery rates) using the Bioconductor’s q value R package when required (Storey, 2002). Using a Bonferroni significance threshold, we spotted sequences significantly associated with the phenotype (quantitative trait loci, QTL). To account for the effect of linkage disequilibrium, we extended the targeted sequences by 10 Kb upstream and downstream significant sequences (Kim et al., 2007). Using the Arabidopsis Information Resource database (www.arabidopsis.org), we extracted the list and functions of genes carried by these QTL. The final filter consisted in considering the relevance of the gene functions for the studied phenotype.
Selection cues
We scanned the genome for selection cues using the PCAdapt R package (Luu et al., 2017). The algorithm assesses the importance of each SNP on the genetic structure of the dataset. It postulates that SNPs under selection contribute more to the population structure than expected by neutral processes, such as genetic drift. The method uses a multivariate analysis that is well suited to study the large regional continuous populations of A. thaliana (Horton et al., 2012). It identifies the principal components (PC) of genetic variation, i.e. the genetic structure, and measures the contribution of each SNPs on their construction. The contributions of the SNPs are processed as p-values and filtered for false discovery rates. Using this method, we conducted an analysis with a genotype set independent from the GWAS’ genotype sets that are limited by the available phenotypic data. We started with the full set of 1135 genotype sequences available on the 1001 genomes project website (1001genomes.org) containing 12,883,854 variant loci. The distribution of P-values computed by PCAdapt from this dataset did not show the expected uniform distribution due to correlations between genotypes and between SNPs. We reduced the dataset until the uniform distribution assumption was correct (François et al., 2016). We pruned the data by keeping only the Single Nucleotide Polymorphisms, and loci and genotypes having less than 10% of missing data, which resulted in a 1,032 genotypes by 6,385,774 SNPs dataset. We then filtered the genotypes by the genetic distance using a 0.075 correlation coefficient limit. Final data contained 222 genotypes and 6,385,774 SNPs. As for GWAs, the p-values produced by PCAdapt were controlled for false discovery rates using the Bioconductor’s q value R package and the subsequent q-values were used to scan for adaptive peaks using a Bonferroni threshold.