The genomic basis of adaptive evolution in threespine sticklebacks
Sticklebacks are originally marine fish that colonized freshwater habitats after the last glaciation. Adaptation to freshwater environments occurred independently in many rivers and lakes around the globe, and natural selection repeatedly gave rise to similar phenotypes. In a recent study, researchers aimed to identify loci repeatedly associated with the divergence between marine and freshwater sticklebacks. An underlying question was whether this adaptation is due to regulatory or protein-coding changes.
To ensure that the changes reflected parallel evolution, the authors sequenced a reference freshwater stickleback and 20 other freshwater and marine sticklebacks from both Pacific and Atlantic populations. They selected populations showing characteristic marine and freshwater morphologies (Figure 1a, b).
To find loci involved in repeated adaptation to freshwater habitats, the authors used two methods, both aiming to identify regions where sequences from freshwater sticklebacks were similar to each other but different from those of marine sticklebacks. The first method is a self-organizing map-based iterative Hidden Markov Model (SOM/HMM) (Figure 1c). With this method, they identified the 20 most common patterns of genetic relationships (trees) among the 21 individuals. For most of the genome, the fish clustered by geography, with fish from Pacific regions being closer to each other than they were to fish from Atlantic regions. For 215 regions (0.46% of the genome), however, the fish clustered by marine or freshwater ecology.
The second method was based on genetic distance. For each region, the authors built a 21 × 21 distance matrix of pairwise nucleotide divergence among the individuals. They then calculated a marine-freshwater cluster separation score (CSS) for each distance matrix, which quantifies the average distance between the marine and freshwater clusters (Figure 1c). This approach found 174 marine-freshwater divergent regions, covering 0.26% of the genome. The two methods overlap substantially and complement each other: together they identified 242 regions found by either method (0.5% of the genome), of which 147 were found by both (0.2% of the genome). Both methods confirmed that the previously known EDA locus on chromosome IV plays an important role in the difference between marine and freshwater populations.
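The distance-based idea can be sketched in a few lines of Python. This is only a simplified illustration, not the published CSS formula: the score below is assumed to be the mean between-ecotype distance minus the mean within-ecotype distance, and the function names (`pairwise_divergence`, `separation_score`) and the four toy sequences are hypothetical.

```python
from itertools import combinations


def pairwise_divergence(seqs):
    """Build a symmetric matrix of per-site nucleotide differences."""
    n = len(seqs)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        diffs = sum(a != b for a, b in zip(seqs[i], seqs[j]))
        dist[i][j] = dist[j][i] = diffs / len(seqs[i])
    return dist


def separation_score(dist, marine, freshwater):
    """Mean between-ecotype distance minus mean within-ecotype distance.

    A simplified stand-in for the CSS, not the formula used in the paper.
    """
    between = [dist[i][j] for i in marine for j in freshwater]
    within = [dist[i][j]
              for group in (marine, freshwater)
              for i, j in combinations(group, 2)]
    return sum(between) / len(between) - sum(within) / len(within)


# Hypothetical 4-bp window from four fish (the study compared 21 genomes).
window = ["AAAA", "AAAT", "GGAA", "GGAT"]
dist = pairwise_divergence(window)

# A grouping that matches the two haplotype clusters gives a positive score,
ecotype_score = separation_score(dist, marine=[0, 1], freshwater=[2, 3])
# while a grouping that cuts across the clusters gives a score of zero.
mixed_score = separation_score(dist, marine=[0, 2], freshwater=[1, 3])
print(ecotype_score, mixed_score)
```

In a genome scan, a score of this kind would be computed window by window, and windows where the marine/freshwater grouping separates the sequences unusually well would stand out as peaks.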

Figure 1: Genome scans for parallel marine-freshwater divergence. a. Marine (red) and freshwater (blue) stickleback populations were surveyed from diverse locations. b. Morphometric analysis was used to select individuals for re-sequencing. The 20 chosen individuals are from multiple geographically-proximate pairs of populations with typical marine and freshwater morphology (solid symbols). Points: population mean morphologies; ellipses: 95% confidence intervals for ecotypes. c. Genomes were analysed using SOM/HMM (upper) and CSS (lower) methods to identify parallel marine-freshwater divergent regions. Across most of the genome, the dominant patterns reflect neutral divergence or geographic structure. In contrast, <0.5% of the genome shows haplotype-ecotype association, a pattern characteristic of divergent marine and freshwater adaptation via parallel reuse of standing genetic variation.

The authors then asked how much of the divergence within a particular marine-freshwater species pair involves the globally shared regions found with the previous methods, as opposed to locally evolved genomic regions. To do this, they sequenced the whole genomes of a single marine-freshwater pair sampled across a marine-freshwater hybrid zone in a river in Scotland. Among the 0.1% most divergent regions between the two fish, 35.3% also showed globally shared marine-freshwater divergence. This result means that only part of the divergence is due to globally shared variants, and that the larger part may be due to locally evolved mutations (Figure 4).
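One way to read this overlap calculation is sketched below in Python. The function name `shared_fraction`, the window-based representation, and the ten toy values are all hypothetical; the sketch only illustrates taking the most locally divergent windows and asking what share of them is also flagged as globally divergent.

```python
def shared_fraction(local_divergence, globally_divergent, top_fraction):
    """Share of the most locally divergent windows that are also flagged globally."""
    n_top = max(1, round(len(local_divergence) * top_fraction))
    # Rank window indices from most to least locally divergent.
    ranked = sorted(range(len(local_divergence)),
                    key=lambda i: local_divergence[i], reverse=True)
    top = ranked[:n_top]
    return sum(globally_divergent[i] for i in top) / n_top


# Hypothetical toy data for ten windows, not values from the study.
local = [0.9, 0.1, 0.8, 0.2, 0.7, 0.3, 0.05, 0.15, 0.25, 0.6]
flagged = [True, False, False, False, True, False, False, False, True, False]

# The top 30% by local divergence are windows 0, 2 and 4; two of them are flagged.
fraction = shared_fraction(local, flagged, top_fraction=0.3)
print(fraction)
```

With real data, `top_fraction` would be 0.001 and the result would correspond to the 35.3% figure, assuming the windows and the global flags are defined as in the study.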

Figure 4: How much of local marine-freshwater adaptation occurs by reuse of global variants? a. Classic marine and freshwater ecotypes are maintained in downstream and upstream locations of the River Tyne, despite extensive hybridization at intermediate sites. b. Pairwise sequence comparisons identify many genomic regions that show high divergence between upstream and downstream fish (X-axis). Many, but not all, of these regions also show high global marine-freshwater divergence (Y-axis; red points indicate significant CSS FDR<0.05), indicating that both global and local variants contribute to formation and reproductive isolation of a marine-freshwater species pair.
The team also observed extended regions of marine-freshwater divergence on chromosomes I, XI and XXI that correspond to chromosomal inversions. Inversions are a known genetic mechanism that can maintain diverging ecotypes in hybridizing populations by preventing recombination between independent adaptive loci (Figure 3).

Figure 3: Genome-wide distribution of marine-freshwater divergence regions. Whole-genome profiles of SOM/HMM and CSS analyses reveal many loci distributed on multiple chromosomes (plus unlinked scaffolds, here grouped as “ChrUn”). Extended regions of marine-freshwater divergence on chrI, XI, and XXI correspond to inversions (red arrows). Marine-freshwater divergent regions detected by CSS are shown as grey peaks with grey points above chromosomes indicating regions of significant marine-freshwater divergence (FDR 0.05). Genomic regions with marine-freshwater-like tree topologies detected by SOM/HMM are shown as green points below chromosomes.

The authors were then interested in the relative contributions of regulatory and coding changes to stickleback adaptation to freshwater environments. To estimate this, they analyzed the 64 divergent regions showing the strongest evidence of parallel evolution among those identified with the SOM/HMM and CSS methods. They found that although both coding and regulatory changes are involved in adaptation to freshwater habitats, regulatory changes seem to play a much stronger role. Seventeen percent of these 64 regions consisted of coding regions with consistent non-synonymous substitutions between marine and freshwater fish. In contrast, 41% consisted of non-coding regions of the genome that were most likely regulatory, while 42% were evaluated as probably regulatory, as they contained both coding and non-coding sequences but lacked ecotype-specific amino acid substitutions. Finally, the authors investigated whole-genome expression levels in freshwater and marine fish. Of the 12594 informative genes across the whole genome, 2817 showed significant differences in expression levels between freshwater and marine ecotypes. Genes that differed in expression between ecotypes were also more likely to be situated in or near the adaptive regions previously discovered with the SOM/HMM or CSS methods (Figure 6).
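The arithmetic behind these proportions can be checked with a short Python sketch. The percentages and gene counts are those quoted above; the per-category region counts are an inference from the rounded percentages, not figures reported in this summary, and the category labels are informal.

```python
# Figures quoted in the summary above.
n_regions = 64
shares = {
    "coding": 0.17,
    "non-coding, likely regulatory": 0.41,
    "mixed, probably regulatory": 0.42,
}

# Approximate region counts implied by the rounded percentages (an inference).
counts = {label: round(share * n_regions) for label, share in shares.items()}

# Combined share of regions attributed to likely or probable regulatory change.
regulatory_share = (shares["non-coding, likely regulatory"]
                    + shares["mixed, probably regulatory"])

# Fraction of informative genes that differ in expression between ecotypes.
de_genes, informative_genes = 2817, 12594
de_fraction = de_genes / informative_genes

print(counts)
print(f"regulatory: {regulatory_share:.0%}, differentially expressed: {de_fraction:.1%}")
```

Under these assumptions the three categories come to roughly 11, 26 and 27 regions, which sum to 64, and about 22% of informative genes differ in expression between ecotypes.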

Figure 6: Contributions of coding and regulatory changes to parallel marine-freshwater stickleback adaptation. a. A genome-wide set of marine-freshwater loci recovered by both SOM/HMM and CSS analyses includes regions with consistent amino acid substitutions between marine and freshwater ecotypes (yellow sector); regions with no predicted coding sequence (green sector); and regions with both coding and non-coding sequences, but no consistent marine-freshwater amino acid substitutions (grey). b. Genome-wide expression analysis shows that marine-freshwater regions identified by SOM/HMM or CSS analyses are enriched for genes showing significant expression differences in 6 out of 7 tissues between marine LITC and freshwater FTC fish (observed: grey bars; expected: white bars; *P<0.01, **P<0.001, ***P<0.0001, ****P≪0.00001), consistent with a role for regulatory changes in marine-freshwater evolution.
In conclusion, the fact that sticklebacks repeatedly adapted from marine to freshwater habitats, coupled with the power of whole-genome sequencing, has made it possible to uncover a large number of loci globally involved in marine-freshwater adaptation. The differentiation is spread across the genome, on several different chromosomes. Globally shared mutations, however, account for only a fraction of the differences, as many locally evolved mutations also seem to play a significant role. Moreover, regulatory changes are particularly important in this case of repeated evolution, although protein-coding changes have also been found among the loci implicated in differences between ecotypes.
Finally, the authors suggest that although they focused on marine-freshwater differences, other ecological contrasts could be studied in the same way, such as lake versus stream habitats, open-water versus bottom-dwelling habitats, or gigantism in particular lakes, as sticklebacks have also repeatedly evolved these characteristics.

Jones FC, Grabherr MG, Chan YF, Russell P, Mauceli E, Johnson J, Swofford R, Pirun M, Zody MC, White S, Birney E, Searle S, Schmutz J, Grimwood J, Dickson MC, Myers RM, Miller CT, Summers BR, Knecht AK, Brady SD, Zhang H, Pollen AA, Howes T, Amemiya C, Broad Institute Genome Sequencing Platform & Whole Genome Assembly Team, Baldwin J, Bloom T, Jaffe DB, Nicol R, Wilkinson J, Lander ES, Di Palma F, Lindblad-Toh K, & Kingsley DM (2012). The genomic basis of adaptive evolution in threespine sticklebacks. Nature, 484 (7392), 55-61 PMID: 22481358