Tutorial Genomics, Ecology, Evolution, etc
Blog of a tutorial of Ecole doctorale de biologie UNIL
https://wp.unil.ch/genomeeee

The genomic landscape of rapid repeated evolutionary adaptation to toxic pollution in wild fish
https://wp.unil.ch/genomeeee/2017/12/22/the-genomic-landscape-of-rapid-repeated-evolutionary-adaptation-to-toxic-pollution-in-wild-fish-2/
Fri, 22 Dec 2017 10:45:57 +0000

Introduction

The pace of evolutionary change depends on the existence of genetic variation, population size and the intensity of selection. While the rate of environmental change very often exceeds the rate of evolution for many species, killifish (Fundulus heteroclitus) living in U.S. Atlantic coast estuaries turn out to be remarkably resilient. They have adapted to survive toxic industrial pollutants, tolerating concentrations up to 8000 times higher than sensitive fish can. In this study, Reid et al. use population genomic and transcriptomic analyses to reveal the complex genetic basis of rapid adaptation in killifish to dramatic, human-induced environmental change.

Results

Four pairs of sensitive and tolerant populations were compared. Based on comparative transcriptomics and the analysis of 384 whole-genome sequences, a few candidate regions were identified as underlying tolerance to complex mixtures of polycyclic and halogenated aromatic hydrocarbons. Interestingly, these regions are shared among the four tolerant populations and rank highly among the outliers. This suggests that the most important targets of selection have evolved in parallel across polluted sites.

Among the shared outliers are genes involved in the aryl hydrocarbon receptor (AHR) signalling pathway, which mediates toxicity. Experiments showed that tolerant populations exhibit reduced inducibility of AHR-regulated genes, while sensitive populations showed up to 70 upregulated genes in response to the pollutant. At the genetic level, the tolerant populations evolved in highly similar ways, indicating constrained phenotypic variation. It seems that selection acts on only a few genes.

The processes involved in the adaptation of killifish to lethal levels of environmental pollution are complex. The AHR pathway is a key target of natural selection, but the potentially negative effects of its desensitisation lead to compensatory adaptations in genes responsible for estrogen and hypoxia signalling, regulation of the cell cycle or immune system function. The authors identified CYP1A gene duplications as a dosage-compensating adaptation to the impaired AHR signalling pathway. In northern tolerant populations, CYP1A duplications have swept to high frequencies, and some individuals carry up to eight copies of this gene. Other selective targets include genes outside the AHR signalling pathway, such as KCNB2 and KCNC3, whose products form the conductance pore of the voltage-gated potassium channel. Compensatory changes seem to accompany rapid adaptive evolution very often.

Conclusion

This study underlines the role of high nucleotide diversity and extensive pre-existing genetic variation as crucial for selective sweeps and evolutionary rescue. It also shows that the number of evolutionary solutions to this kind of pollution is limited. Even though this study showed that some species have the capacity to overcome severe environmental changes thanks to the natural richness of their gene pool, most species, unfortunately, are not able to adapt to such rapid changes because of their low level of genetic variation.

Reid, N. M., Proestou, D. A., Clark, B. W., Warren, W. C., Colbourne, J. K., Shaw, J. R., et al. (2016). The genomic landscape of rapid repeated evolutionary adaptation to toxic pollution in wild fish. Science (New York, N.Y.), 354(6317), 1305–1308.

 

 

How the Galapagos cormorant lost its ability to fly
https://wp.unil.ch/genomeeee/2017/12/19/how-the-galapagos-cormorant-lost-its-ability-to-fly/
Tue, 19 Dec 2017 13:50:29 +0000

Introduction

Novel traits play a key role in evolution by facilitating access to new ecological niches. Novelty is often recognized at the phenotypic level and is usually related to the gain of a new function. But can nature innovate through the loss of a function? Wing reduction and loss of flight have occurred several times in the evolutionary history of birds and are found in 26 bird families. However, the genetic basis underlying this change is difficult to determine.

In this study, Burga et al. use the flightless Galapagos cormorant (Phalacrocorax harrisi) as a model to study the evolution of a recent loss of flight. P. harrisi diverged from its flighted relatives within the past 2 million years and is the only flightless cormorant among the 40 existing species. The entire population (approximately 1500 individuals) is distributed along the coastlines of Isabela and Fernandina islands in the Galapagos archipelago.

Two evolutionary paths could explain the loss of flight. Flightlessness could be positively selected if it helps birds to develop an alternative ability, such as swimming, to escape from predators and survive. Alternatively, if flying is not essential for survival (no need to escape from predators), mutations that obstruct flight might accumulate in the gene pool. These two scenarios are not mutually exclusive: a passive loss of flight might be followed by positive selection that keeps reducing the wings.

Results

In this study, the authors show that comparative genomics is a powerful tool for disentangling evolutionary history and for understanding the molecular mechanisms behind evolutionary changes. They sequenced and de novo assembled the complete genomes of the flightless Galapagos cormorant and three flighted relatives.

Initially, the authors examined highly conserved regions of non-coding DNA in an attempt to find candidates for wing shortening, but they eventually focused on coding variants. Among these, many were found exclusively in the Galapagos cormorant genome. Variants related to dysfunction of the primary cilium, which plays a key role in mediating the hedgehog signaling pathway during development, were selected as candidates linked to the reduction of the wings. Interestingly, impaired function of the same genes in humans leads to bone development disorders described as skeletal ciliopathies.

To test hypotheses about the effects of mutations in cilia-related genes, the authors combined in vivo experiments in C. elegans with in vitro experiments in mouse chondrogenic cell lines. The experiments in C. elegans confirmed that a missense variant present in the Galapagos cormorant IFT122 protein is sufficient to affect cilia function in vivo. Nevertheless, it would be interesting to see whether a knock-in of some other functional gene from the Galapagos cormorant would lead to the same behavior as the wild-type gene from C. elegans (used as a positive control in this experiment).

Further analyses focused on a gene called CUX1, which is linked to shortened wings in chickens. The results of these experiments suggest that genes related to cilia and the hedgehog signaling pathway are likely transcriptional targets of CUX1 in chondrocytes. When the hedgehog signaling pathway functions normally, CUX1 regulates the expression of cilia-related genes and promotes chondrogenesis. Since Galapagos cormorants carry a different variant of the CUX1 gene, the gene's function is possibly modified, influencing both the formation of cilia and their functioning. All of this affects hedgehog pathway activity, resulting in impaired bone growth.

Conclusion

This exhaustive study underlines the importance of sophisticated genetic tools such as genomic analyses in explaining the molecular mechanisms responsible for changes observable at the phenotypic level. The experiments revealed a polygenic basis of flightlessness. Moreover, they point to ciliary dysfunction as a likely contributor to the evolution of loss of flight. The most valuable aspect of this study is its approach, which can be used to identify other variants responsible for evolutionary innovation by analyzing the genomes of closely related species.

 

Burga et al. (2017) A genetic signature of the evolution of loss of flight in the Galapagos cormorant. Science 356 (6341), eaal3345.

Convergent evolution of caffeine in plants by co-option of exapted ancestral enzymes
https://wp.unil.ch/genomeeee/2017/12/18/convergent-evolution-of-caffeine-in-plants-by-co-option-of-exapted-ancestral-enzymes/
Mon, 18 Dec 2017 16:41:24 +0000

A biochemical story on convergent evolution

Introduction

Convergent evolution is the process by which similar traits evolve independently in distantly related organisms, such as wings in bats and birds. It can involve orthologous or unrelated genes, which raises the question of how much convergent evolution is constrained to certain pathways or, conversely, how diverse the paths to the same function can be.

For a convergent trait to arise, different proteins must be assembled into an ordered, functional pathway. Currently, three hypotheses shed light on the matter. Under the cumulative hypothesis, enzymes catalyzing the earlier reactions of a pathway must have evolved first, because otherwise the enzymes that perform the following steps would have no substrate to react with. Later steps would arise by duplication of the first enzyme. This supposes that intermediates are advantageous. Conversely, under the retrograde hypothesis, enzymes catalyzing the later steps of a pathway would have evolved first, and gene duplication would then have given rise to the enzymes catalyzing earlier steps. This supposes that intermediates could be produced non-enzymatically, but it assumes nothing about their potential effect. Finally, the patchwork hypothesis states that a novel pathway arises by the recruitment and rerouting of an alternative, preexisting pathway; the recruited enzymes are said to be 'exapted' or 'co-opted'. This supposes that the older, recruited enzyme was already catalyzing a promiscuous reaction.

In plants, one of the most studied examples of convergent evolution is caffeine biosynthesis, which seems to have appeared independently at least five times during flowering plant history: only a few representatives of each clade display caffeine biosynthesis, which means, under the parsimony principle, that the trait is more likely to have emerged independently and repeatedly than to have been ancestrally shared. For the past 30 years, only one biosynthetic path among the several possible was shown to have convergently evolved (Fig. 1), though with paralogous enzymes from the SABATH family in Coffea (XMT) and Camellia (CS). Most enzymes of this family actually catalyze O-methylation, but they were recently co-opted in those species to catalyze N-methylation.

In their article, Huang et al. [1] shed light on how caffeine convergence occurred in five different plants: Theobroma, Paullinia, Citrus, Coffea and Camellia. They uncovered different biosynthetic paths, thereby contradicting the idea that convergence in caffeine biosynthesis was constrained to a single path, and reconstructed ancestral enzyme activities, thereby illustrating at the molecular level how a new function can arise.

 

Different biosynthetic pathways to caffeine exist in Theobroma, Paullinia, and Citrus

To uncover the genes and pathways used by plants to produce caffeine, Huang et al. first identified SABATH enzymes from each species studied and mapped them to the EST database to find those expressed in caffeine-producing tissues. Next, they conducted enzyme assays to identify the substrates with which these enzymes react, and mass spectrometry scans to identify the products, reconstructing full pathways to caffeine in those distantly related species (Fig. 2A).

They revealed that Theobroma and Paullinia express, in their caffeine-producing tissues, CS-type enzymes orthologous to the ones expressed in Camellia. Surprisingly, these enzymes catalyze a different biosynthetic path (Fig. 1). Importantly, the enzymes catalyzing the same step in the two species (TcCS1 and PcCS1 for the first step, TcCS2 and PcCS2 for the second, in Theobroma and Paullinia respectively) are more distantly related to each other than are the two enzymes within each species (TcCS1 and TcCS2, PcCS1 and PcCS2), which was unexpected considering their catalytic similarity. This is strong evidence for convergent, repeated duplication of those enzymes in each lineage, rather than for ancestrally duplicated enzymes that would have been lost in other, non-caffeine-producing lineages. However, although Huang et al. provide a phylogenetic tree of SABATH enzymes with bootstrap values for major nodes, no support is given for the nodes underlying that statement, which would have lent it further credit.

Because of the phylogenetic proximity of Paullinia and Citrus, which both belong to the order Sapindales, one could expect them to use the same enzyme type for caffeine production. Nevertheless, Citrus does not express CS-type enzymes at sites of caffeine production. Instead, it expresses two recently duplicated XMT-type enzymes orthologous to the ones found in Coffea, but specialized in another biosynthetic path to caffeine (Fig. 1).

Therefore, contrary to what has been believed for more than 30 years, plants have a much broader biosynthetic repertoire than previously known, with at least three different paths to caffeine biosynthesis that emerged convergently. However, it is unclear which proteins were exapted and what function they previously served, allowing them to be preserved over millions of years of evolution.

Ancestral XMT enzymes displayed O-methylation

The ancestors of the Coffea and Citrus XMT enzymes had to be maintained for more than 100 My after their common ancestor before independently giving rise to the N-methylating enzymes involved in caffeine production. To understand what allowed them to be maintained, Huang et al. used a method that 'resurrects' ancient proteins [2]. It consists of inferring ancestral sequences from alignments of present-day descendant proteins and synthesizing them to characterize their function.

Using that method, Huang et al. resurrected the 100 My old XMT enzyme ancestral to Rosids and Asterids, here called RAAncXMT (Fig. 2A), and its descendant CisAncXMT1, at the node giving rise to the Citrus lineage (Fig. 2B). Both exhibit high O-methylation activity (Fig. 2C), which explains why they would have been maintained over such a long time, but no N-methylation activity. This is still the case for one of their present-day descendants in Mangifera, whereas their descendants in Citrus have specialized in N-methylation. Today, Citrus possesses a SAMT enzyme capable of both methylations, which could account for the loss of that function in XMT enzymes.

They also resurrected CisAncXMT2, at the node giving rise to both present-day CisXMT1 and CisXMT2, which are responsible for caffeine production. Interestingly, CisAncXMT2, while still maintaining low O-methylation activity, displays N-methylation activity that includes almost all of the activities of both present-day enzymes, which together carry out the last two steps needed for caffeine production. It is still unclear, though, how it was recruited to form a functional pathway.

Ancestral Citrus XMT enzymes were only a few steps away from present-day caffeine production

To understand how present-day Citrus XMT enzymes arose from CisAncXMT2, Huang et al. aligned it against CisXMT1 and CisXMT2 and identified key mutations. They then mutagenized the resurrected enzyme. In the lineage leading to CisXMT2, they identified one key mutation, P25S, that was sufficient to reproduce qualitatively the activity of CisXMT2 (Fig. 2C). Similarly, in the lineage leading to CisXMT2, they identified H150N as the mutation sufficient to roughly reproduce today's activity. Although other mutations could have shifted the ancestral enzyme activity, this shows that, after duplication, these two single mutations alone would have been enough for a complete pathway to caffeine to emerge.
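Mutation names such as P25S and H150N follow the usual protein notation: original residue, 1-based position, new residue (so P25S replaces the proline at position 25 with a serine). As a minimal sketch of what introducing such a substitution into a sequence means, the snippet below applies one to a toy peptide. The function name and the 30-residue sequence are hypothetical and chosen only for illustration; they are not the real enzyme sequence or the authors' mutagenesis protocol.

```python
import re


def apply_substitution(sequence, mutation):
    """Apply a substitution written as <original><position><new>, e.g. 'P25S'.

    Positions are 1-based, as in standard protein mutation notation.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognised mutation: {mutation}")
    original, new = match.group(1), match.group(3)
    index = int(match.group(2)) - 1  # convert to 0-based index
    if not 0 <= index < len(sequence) or sequence[index] != original:
        raise ValueError(f"{mutation}: sequence does not carry {original} there")
    return sequence[:index] + new + sequence[index + 1:]


# Hypothetical toy peptide with a proline at position 25 (not the real enzyme).
toy_ancestor = "M" + "A" * 23 + "P" + "GGGGG"
toy_mutant = apply_substitution(toy_ancestor, "P25S")
```

Checking the original residue before substituting guards against off-by-one errors, which are easy to make when an alignment position and a sequence position differ.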

Two very interesting points should be noted here. First, very few mutations are sufficient to shift an enzyme's substrate preference, which may have been widespread during evolution. Second, contrary to the very linear view of biosynthesis one may have, several activities can emerge at the same time and, more importantly, while the original activity is maintained, as was the case for CisAncXMT1, thereby reconciling several hypotheses.

 

Conclusion

Using the very concrete example of caffeine biosynthesis, Huang et al. nicely illustrate the mechanisms of convergent evolution, unveiling much more diversity in the biosynthetic paths than previously thought. They also give us a view of the transition from the ancestral enzymes to the present-day ones, demonstrating that the hypotheses current in the field are not mutually exclusive, since biological pathways are not as linear as one may think.

Although the enzymatic data were quite strong and the whole story quite convincing, the phylogenetic analysis leading to the resurrection of enzymes would have benefited from the authors sharing the statistical confidence of the alignment, especially at the sites they mutagenized thereafter. Nevertheless, the study by Huang et al. was well constructed and easily understandable for people outside the field, and we hope to learn more about other caffeine-producing plants, such as Guayusa, which contains much more caffeine than Coffea itself.

 

[1] Huang R, O’Donnell AJ, Barboline JJ, Barkman TJ (2016) Convergent evolution of caffeine in plants by co-option of exapted ancestral enzymes. Proc Natl Acad Sci USA 113:10613–10618

[2] Thornton JW (2004) Resurrecting ancient genes: Experimental analysis of extinct molecules. Nat Rev Genet 5(5):366–375

A genomic history of Aboriginal Australia
https://wp.unil.ch/genomeeee/2017/11/13/a-genomic-history-of-aboriginal-australia-blogpost/
Mon, 13 Nov 2017 19:43:14 +0000

Blogpost on:

Malaspinas et al 2016 A genomic history of Aboriginal Australia. Nature 538: 207–214.

Introduction:

Prior to the publication of Malaspinas et al. 2016, investigation of Aboriginal Australian genome sequences had been quite limited. In fact, only 3 whole genome sequences from Aboriginals had been analyzed, 2 of these obtained with limited information concerning their place of origin (Rasmussen et al. 2011).

Malaspinas et al. 2016 is the first comprehensive study aimed at uncovering how the settlement of Australia occurred. It combines genomic, linguistic and archaeological evidence to obtain a more detailed picture of that process.

For most of the past 100000 years, Tasmania, New Guinea and Australia were part of a single continent known as Sahul. This continent was separated from mainland Asia, and its settlement by human populations remains poorly understood.

Previous archaeological evidence led to the hypothesis that Australia was settled by an African emigration wave that predates the one that settled Eurasia (Lahr, M. et al. 1994). This has been called the two Out of Africa events hypothesis (2OoA). Yet other genetic studies support the notion of one major migration out of Africa (OoA), followed by 1 or 2 independent migratory waves that led to the settlement of Eurasia and Oceania respectively.

The authors find that the data collected in the study more closely fit a model of a single out of Africa dispersal (OoA), followed by the divergence of Eurasians from Australo-Papuans. Finally, Aboriginals and Papuans diverged from their common ancestral population between 25000 and 40000 years ago.

Dataset

The study is based on 108 newly sequenced Aboriginal and Papuan genomes (83 Aboriginals and 25 Papuans) and genotype data for 45 additional Papuans. In addition, SNP genotype data from previous studies, on Aboriginal Australians from Arnhem Land and from the European Collection of Cell Cultures panel, were used for the admixture analyses.

Colonization of Sahul

The authors use sparse non-negative matrix factorization (sNMF) on the combined datasets to determine the genomic ancestry proportions of Papuans and Aboriginals (Frichot, E. et al. 2014). They find that Aboriginals are mainly a mix of European, East Asian, New Guinean and Aboriginal ancestry, with the largest contributions coming from European and Aboriginal ancestry. As expected, individuals from the Australian coastline display a higher proportion of European ancestry than individuals from the desert interior.

Papuans instead derive most of their genomic ancestry from New Guineans and East Asians. The proportion of New Guinean ancestry in Aboriginal Australians is related to the distance from Papua, with Northeastern Australians carrying a significantly higher proportion of Papuan ancestry than Southwestern Australians (Fig. 2a).

Based on f3 statistics, multidimensional scaling (MDS) analyses and the inference of genomic ancestry proportions, the authors show that Australians and New Guineans are more similar to each other than to the other populations analyzed in the study (Fig. 2b,c). This favors the hypothesis that they share a common ancestral population that settled the continent of Sahul.
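To make the f3 statistic more concrete, here is a minimal sketch, assuming allele frequencies per population are already available at a set of SNPs. In its "outgroup" form, f3(O; A, B) is the average of (o - a)(o - b) over SNPs and measures the drift shared by A and B since they split from the outgroup O, so a larger value means A and B are more closely related. The function name and the toy frequencies are hypothetical, and the finite-sample bias correction used in practice is omitted; this is not the authors' pipeline.

```python
def f3(freq_c, freq_a, freq_b):
    """Naive f3(C; A, B): mean over SNPs of (c - a) * (c - b).

    Inputs are allele frequencies per SNP, in the same SNP order.
    No finite-sample bias correction is applied (simplifying assumption).
    """
    products = [(c - a) * (c - b) for c, a, b in zip(freq_c, freq_a, freq_b)]
    return sum(products) / len(products)


# Hypothetical toy frequencies at 4 SNPs (not data from the study).
outgroup = [0.0, 0.0, 1.0, 0.5]
pop_a = [0.8, 0.1, 0.3, 0.5]
pop_b = [0.7, 0.2, 0.2, 0.5]  # similar to pop_a
pop_x = [0.1, 0.9, 0.9, 0.5]  # dissimilar to pop_a

shared_drift_ab = f3(outgroup, pop_a, pop_b)
shared_drift_ax = f3(outgroup, pop_a, pop_x)
```

In this toy example the pair (pop_a, pop_b) shares more drift than the pair (pop_a, pop_x), which is the kind of comparison used to say that two populations are closer to each other than to the rest.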

For the subsequent analyses, the authors mask data stemming from non-Aboriginal ancestry or select samples based on their Aboriginal ancestry. Specifically, they use the inferred ancestry proportions to keep only loci at which both haplotypes show Aboriginal ancestry (Suppl Inf S06).

To shed some light on whether the settlement of Australia proceeded through one or two separate founding waves, the authors use a simulation-based framework initially presented by Excoffier et al. (2013). Specifically, this composite likelihood method compares the observed joint site frequency spectrum (SFS) to the one expected under a given demographic model, allowing coalescent-based inference from SNPs (Excoffier et al. 2013).
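As a rough illustration of the two ingredients named above, the sketch below builds a joint SFS for two populations from derived-allele counts and scores it against an expected spectrum with a multinomial composite log-likelihood. The function names, the toy counts and the uniform "model" are hypothetical placeholders; in the actual method the expected spectrum is estimated by coalescent simulation under each demographic model, which is not reproduced here.

```python
import math


def joint_sfs(derived_counts_pop1, derived_counts_pop2, n1, n2):
    """Return an (n1 + 1) x (n2 + 1) table; entry [i][j] is the number of
    SNPs with i derived alleles in population 1 and j in population 2."""
    sfs = [[0] * (n2 + 1) for _ in range(n1 + 1)]
    for i, j in zip(derived_counts_pop1, derived_counts_pop2):
        sfs[i][j] += 1
    return sfs


def composite_log_likelihood(observed_sfs, expected_probs):
    """Multinomial log-likelihood (up to a constant): sum of count * log(prob).

    Assumes every cell with a non-zero count has a positive expected
    probability.
    """
    total = 0.0
    for obs_row, exp_row in zip(observed_sfs, expected_probs):
        for count, prob in zip(obs_row, exp_row):
            if count > 0:
                total += count * math.log(prob)
    return total


# Hypothetical derived-allele counts at 6 SNPs, 4 chromosomes per population.
pop1 = [1, 1, 0, 4, 2, 1]
pop2 = [0, 0, 3, 4, 2, 0]
observed = joint_sfs(pop1, pop2, n1=4, n2=4)

# Placeholder expectation: every cell equally likely (not a real model).
uniform = [[1 / 25] * 5 for _ in range(5)]
loglik_uniform = composite_log_likelihood(observed, uniform)
```

Competing scenarios, such as one versus two founding waves, would each produce their own expected spectrum, and the one with the higher likelihood fits the observed SFS better.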

The SFS-based analysis and the MDS results both suggest that a one-wave founding model, followed by the divergence of a common ancestral population into Papuan and Aboriginal populations, fits the data more closely (Fig. 2a,b, Fig. 3).

Fig. 1 (Malaspinas et al. 2016) Locations of the analyzed Australo-Papuan datasets

Fig. 2 (Malaspinas et al. 2016) Australian Aboriginal ancestry. A) Analysis of admixture in Australo-Papuans by sNMF. B-C) MDS analysis and f3 statistics to assess relationships within the Aboriginal population and between Australo-Papuans

 

 

 

Archaic Admixture

Next, the authors characterize the extent of archaic (Neanderthal and Denisovan) admixture in the Australian and Papuan genomes. They do so using the previously described SFS-based modelling approach, a goodness-of-fit analysis based on D-statistics (Green et al. 2010) and a method for identifying putative archaic haplotypes (Suppl Inf. Section 10).

The D-statistic test was initially used in genetics by Green et al. (2010) to determine the extent of admixture among three populations. Based on SNPs, it assesses which pair in the trio is more closely related, using a fourth population as an outgroup. The archaic haplotype method used here is instead based on enhanced D-statistics (Meyer, M. et al. 2012) and linkage disequilibrium approaches (Wall, JD. et al. 2013).
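The logic of the D-statistic can be sketched by counting site patterns. With populations ordered as P1, P2, P3 and an outgroup, and alleles coded A (ancestral) or B (derived), "ABBA" sites are those where P2 and P3 share the derived allele and "BABA" sites those where P1 and P3 do. D is the normalized difference between the two counts; a value near zero means P3 is equally related to P1 and P2. The function name and the toy site patterns below are hypothetical, and the significance testing used in practice (for example by block jackknife) is left out.

```python
def d_statistic(site_patterns):
    """Compute D = (nABBA - nBABA) / (nABBA + nBABA).

    site_patterns: iterable of 4-character strings giving the allele
    (A = ancestral, B = derived) carried by P1, P2, P3 and the outgroup.
    Other patterns are uninformative and ignored.
    """
    patterns = list(site_patterns)
    abba = sum(1 for p in patterns if p == "ABBA")
    baba = sum(1 for p in patterns if p == "BABA")
    if abba + baba == 0:
        raise ValueError("no informative (ABBA or BABA) sites")
    return (abba - baba) / (abba + baba)


# Hypothetical site patterns (not data from the study).
toy_sites = ["ABBA", "BABA", "ABBA", "ABBA", "BBAA", "AABA", "BABA", "ABBA"]
d_value = d_statistic(toy_sites)
```

Here there are four ABBA and two BABA sites, so D is positive, meaning P3 shares more derived alleles with P2 than with P1 in this toy example.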

Based on these three approaches, the authors report that Aboriginal and Papuan genomes display an accumulation of Denisovan-introgressed genes compared to other non-Africans, and carry the highest proportion of putatively Denisovan-derived haplotypes among non-Africans. Additionally, they show that the estimated number of Denisovan-derived haplotypes correlates with the proportion of Australo-Papuan ancestry across individuals (Ext. Fig. 3a,b,c). In summary, the evidence indicates that Denisovan admixture predates both the split of Australo-Papuans and the widespread Eurasian admixture into Aboriginal Australians.

Ext. Fig 3 (Malaspinas et al. 2016) Archaic (Denisovan and Neanderthal) genome introgressed haplotypes. A) Analysis of putative introgressed Denisovan and Neanderthal sites in European, East Asian, Australo-Papuan and South American Populations. B-C) Analysis of estimated archaic haplotype number in world populations. D-E) metrics of archaic derived haplotypes in populations of study.

 

Out of Africa

To determine whether the OoA or the 2OoA scenario is more likely, the authors perform D-statistic tests on the following trios: Aboriginals and Eurasians compared to Africans, versus Aboriginals and Eurasians compared to Ust’-Ishim (a proxy for early modern humans from Asia (Fu, Q. et al. 2014)).

The authors find that, when Denisovan admixture is not accounted for, Africans and Ust’-Ishim are closer to Eurasians than to Aboriginal Australians, supporting a 2OoA model. Yet when the previously identified Denisovan admixture events are accounted for, the test results indicate that Aboriginals and Eurasians are equally related to Ust’-Ishim, favouring a single OoA wave model. The same is seen when taking Denisovan admixture into account and considering populations across the whole world (Ext. Data Fig. 4a,b).

The SFS analysis, when accounting for moderate Denisovan admixture, also shows a better fit of the data to the OoA model. Under the best-fitting model, Australo-Papuans most likely diverged from Eurasians about 58000 years ago, and Europeans diverged from East Asians about 42000 years ago (Fig. 4).

Multiple sequentially Markovian coalescent (MSMC) analysis also supports a model in which Australo-Papuans and Eurasians split from one ancestral population (Ext. Fig. 4, Ext. Fig. 6).

 

Fig. 3 (Malaspinas et al. 2016) SFS based modelling approximation of Australia settlement founding waves

 

 

 

 

 

 

Fig. 4 (Malaspinas et al. 2016) SFS based modelling approximation of most likely OoA migration

 

Genetic Structure of Aboriginal Australians

Subsequent investigation of between-group variation in mitochondrial DNA (mtDNA) and the Y chromosome shows that male-mediated migration was a driving factor in the substructuring of Aboriginals. MDS analyses of Aboriginal genomes, masked and non-masked for non-Aboriginal ancestry, together with the geographic location of samples (Ext. Fig. 7a,b), suggest a separation between Southwestern and Northeastern groups. This is in line with the model of Australian settlement proposed by the SFS analysis.

A further modelling approach, based on a three-layer neural network (Bishop, CM. 1996; Heaton, J. 2011), reveals that most of the gene flow took place along the Australian coasts (Ext. Fig. 7e-g). This result is consistent with the hypothesis that the desert interior of Australia formed a natural barrier to gene flow.

Bayesian analyses of European, East Asian and Papuan admixture among Aboriginals reveal that Papuan admixture predated both East Asian and European admixture (Ext. Fig. 8a). Local ancestry inference based on tract length likewise indicates that Papuan gene flow into Aboriginals occurred before the European and East Asian gene influx.

Ext. Fig. 4 A) Model of African emigration proposed on the basis of D-statistics. B) Sum of squared errors for the possible models predicted by D-statistics. C) MSMC-derived cross-coalescence rates determined by analysis of pairs of individuals. D) Assessment by modelling of the effect of archaic admixture on the cross-coalescence rates presented in C

 

Pama-Nyungan languages and genetic structure

Next, the study investigates how closely linguistic relationships reflect genetic relationships, by comparing a phylogenetic tree based on linguistics with a tree based on the fixation index (FST) computed after masking Eurasian tracts (Ext. Fig. 7). Analysis of the patterns common to the two trees, using distance matrices and correlation analysis of linguistics and genetics, reveals an initial divergence of populations beginning about 30000 years ago, followed by population size changes and highly reduced gene flow from Northeastern to Southwestern Australia, due in great part to the geographic barrier of the desert.
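For readers unfamiliar with the fixation index, the sketch below computes a simple FST between two populations from allele frequencies, as (HT - HS) / HT, where HS is the mean expected heterozygosity within populations and HT the expected heterozygosity of the pooled population. It assumes equal population sizes and sums numerator and denominator across SNPs. The function name and the toy frequencies are hypothetical, and this textbook estimator is not necessarily the one used in the study.

```python
def fst(freqs_pop1, freqs_pop2):
    """Simple two-population FST = (HT - HS) / HT, summed over SNPs.

    Assumes equal population sizes and no sample-size correction.
    """
    total_ht = 0.0
    total_hs = 0.0
    for p1, p2 in zip(freqs_pop1, freqs_pop2):
        p_bar = (p1 + p2) / 2
        total_ht += 2 * p_bar * (1 - p_bar)
        total_hs += (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    if total_ht == 0:
        raise ValueError("no polymorphic sites")
    return (total_ht - total_hs) / total_ht


# Hypothetical allele frequencies at 2 SNPs (not data from the study).
group_one = [0.1, 0.5]
group_two = [0.9, 0.5]

fst_between = fst(group_one, group_two)
fst_identical = fst(group_one, group_one)
```

Pairwise values of this kind, computed between all groups, form the distance matrix from which a genetic tree can be built and compared with the linguistic one.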

 

Ext. Fig. 7 A-B) MDS analysis performed when including all genome sequences (A) and when masking non-Aboriginal variants (B). C-D) Comparison of phylogenetic trees computed based on genetic and linguistic analysis respectively.

 

 

 

 

 

 

 

Selection in Aboriginal Australians

Finally, the authors scan Aboriginal genomes for regions that have diverged strongly in allele frequency since the split from Papuans (about 10000-30000 years ago), and then for regions that diverge strongly between geo-ecological locations. Based on this analysis, the authors identify two mutations which may have played a role in Aboriginal adaptation to the arid, desert interior of Australia.

 

Conclusions

The authors' analyses support a model in which a single OoA movement was followed by the split of the ancestral population that first colonized the continent. This split predated the physical separation of Sahul into mainland Australia and the Papuan/New Guinea islands.

 

References:

Rasmussen, M. et al. An Aboriginal Australian genome reveals separate human dispersals into Asia. Science 334, 94–98 (2011)

Davidson, I. The colonization of Australia and its adjacent islands and the evolution of modern cognition. Curr. Anthropol. 51, S177–S189 (2010)

Lahr, M. M. & Foley, R. Multiple dispersals and modern human origins. Evol. Anthropol. Issues News Rev. 3, 48–60 (1994)

Frichot, E., Mathieu, F., Trouillon, T., Bouchard, G. & François, O. Fast and efficient estimation of individual ancestry coefficients. Genetics 196, 973–983 (2014)

Excoffier, L., Dupanloup, I., Huerta-Sanchez, E., Sousa, V. C., Foll, M. et al. Robust demographic inference from genomic and SNP data. PLoS Genetics 10, 1–17 (2013)

Green, R. E., Krause, J., Briggs, A. W. et al. A draft sequence of the Neandertal genome. Science 328, 710–722 (2010)

Meyer, M. et al. A high-coverage genome sequence from an archaic Denisovan individual. Science 338, 222–226 (2012). doi: 10.1126/science.1224344

Wall, J. D. et al. Higher levels of Neanderthal ancestry in East Asians than in Europeans. Genetics 194, 199–209 (2013)

Fu, Q., Li, H., Moorjani, P., Jay, F., Slepchenko, S. M., Bondarev, A. A., Johnson, P. L., Aximu-Petri, A., Prüfer, K., de Filippo, C. et al. Genome sequence of a 45,000-year-old modern human from western Siberia. Nature 514, 445–449 (2014)

Bishop, C. M. Neural Networks for Pattern Recognition. 1st edition. Clarendon Press, Oxford, New York (1996)

Heaton, J. Programming Neural Networks with Encog3 in Java. 2nd edition. Heaton Research, Inc. (2011)

The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons
https://wp.unil.ch/genomeeee/2016/12/14/the-spotted-gar-genome-illuminates-vertebrate-evolution-and-facilitates-human-teleost-comparisons/
Wed, 14 Dec 2016 12:43:01 +0000

About 450 mya, bony vertebrates radiated into lobe-finned fish, from which tetrapods later arose, and ray-finned fish, which include the teleosts (Fig. 1). Nowadays teleosts make up about 96 percent of all fish on the planet. Some teleost species, such as zebrafish (Danio rerio) and medaka (Oryzias latipes), are used as model organisms in biomedical research to understand the genetic basis of certain human diseases. However, transferring findings between these models and humans is difficult, given the phylogenetic distance between tetrapods (humans) and ray-finned fish. For this reason, the authors decided to sequence the genome of the spotted gar (Lepisosteus oculatus), which can act as a bridge because it split off from the teleosts before the TGD (Teleost Genome Duplication). Two other genome duplications happened earlier in the vertebrate lineage: VGD1 and VGD2.

Fig1: Spotted gar is a ray-finned fish that diverged from teleost fishes before the TGD. Gar connects teleosts to lobe-finned vertebrates, such as coelacanth, and tetrapods, including human, by clarifying evolution after the two earlier rounds of vertebrate genome duplication (VGD1 and VGD2) that occurred before the divergence of ray-finned and lobe-finned fishes 450 million years ago (MYA)

Gene duplicates derived from a genome duplication such as the TGD are called ohnologs. They are named after Susumu Ohno, who argued that genome duplication may play an important role in evolution. The resulting paralogs (a special case of homology when duplicate genes or regions are in the same genome) are associated with development, signaling and gene regulation [2 sentences edited by Marc Robinson-Rechavi]. In addition, ohnologs, which amount to about 20 to 35% of genes in the human genome, are frequently implicated in cancer and genetic diseases. Evolution acts on these duplicates, and three mechanisms can lead to the preservation of both copies: subfunctionalization (partitioning of ancestral gene functions between the duplicates), neofunctionalization (acquisition of a novel function by one of the duplicates) and dosage selection (preserving genes to maintain dosage balance between interconnected components). Nevertheless, the most likely outcome is non-functionalization of one of the duplicates, due to the lack of selective constraint on preserving both. Because of the asymmetric evolution of ohnologs after the TGD and the speed at which teleost genomes have evolved, connecting teleost sequences to human sequences can be challenging.
The authors reasoned, however, that the gar genome could solve these problems thanks to its slow rate of evolution. Using this “gar bridge” makes it possible to clarify the evolution of human orthologs (genes in different species that evolved from a common ancestral gene by speciation) such as: (i) Hox and ParaHox genes, involved in the formation of body segments during embryogenesis; (ii) the SCPP genes (secretory calcium-binding phosphoproteins), involved in the mineralization of tissues; (iii) miRNA genes, small non-coding RNA molecules that function in RNA silencing and post-transcriptional regulation of gene expression; (iv) CNEs (conserved non-coding elements), regulatory sequences that had never been detected in previous comparisons between tetrapods and teleosts. Finally, using transcriptome data, they tried to quantify the expression domains and expression levels of the TGD-duplicated genes to work out how these genes evolved.

Genome assembly and annotation
The authors sequenced the genome of one adult female gar to 90x coverage using Illumina technology. By anchoring scaffolds to a meiotic map, they captured 94% of assembled bases in 29 linkage groups (LGs). Next, they constructed a gene set of 21,433 high-confidence protein-coding genes and discovered that 20% of the genome is repetitive, with transposable elements (TEs) of types found in both teleosts and lobe-finned fishes. Thanks to this they could clarify the phylogenetic origins of these TEs.

The Gar lineage evolved slowly
The authors performed a Bayesian phylogenetic analysis using 243 one-to-one orthologs from 25 jawed vertebrates (Fig.2). An evolutionary rate analysis showed that the proteins of Holostei, the sister group of teleosts, have evolved more slowly than those of the other vertebrates included in the analysis. These results suggest that the TGD may have played a role in the rapid evolution of teleosts, which is consistent with the greater branch lengths of the teleost species in the tree.

Fig2: Bayesian phylogeny inferred from 243 proteins with a one-to-one orthology ratio from 25 jawed (gnathostome) vertebrates using PhyloBayes under the CAT + GTR + Γ4 model with rooting on cartilaginous fishes. Node support is shown as posterior probability (first number at each node) and bootstrap support from maximum-likelihood analysis (second number at each node).

Gar inform the evolution of bony vertebrate karyotypes
The karyotype of gar (2n = 58), which is composed of micro- and macrochromosomes, was compared to those of human, chicken and medaka, a teleost fish. Microchromosomes are present in a wide range of vertebrate classes but not in mammals or teleosts. They are probably the product of an evolutionary process that minimizes DNA content (mostly by reducing the number of repeats) and maximizes recombination rate. The authors chose gar because its genome is the first from a bony vertebrate that is neither a teleost nor a lobe-finned vertebrate. They could demonstrate a high degree of one-to-one synteny (co-localization of genetic loci on the same chromosome) between the gar and chicken genomes. This adds support to the hypothesis that the bony vertebrate ancestor possessed both micro- and macrochromosomes. They explain the absence of microchromosomes in teleosts by chromosome fusions that occurred after the divergence from gar, followed by the TGD. Indeed, in the comparison between gar and medaka chromosomes the synteny relationship is one-to-two: the gene content of a gar chromosome is conserved, but is now spread over two medaka chromosomes. This supports the view that after the fusions and the TGD, teleost chromosomes underwent rearrangements and rediploidization, and that the gar lineage diverged from the teleost lineage before the genome duplication (Fig.3).

Fig.3: Gar-chicken-medaka comparisons illuminate the karyotype evolution leading to modern teleosts. The genome of the bony vertebrate ancestor contained both macro- and microchromosomes, some of which remain largely conserved in chicken and gar, for example, macrochromosome Loc2-GgaZ and microchromosomes Loc20-Gga15 and Loc21-Gga17. All three chromosomes possess double-conserved synteny with medaka chromosomes Ola9 and Ola12, which is explained by chromosome fusion in the lineage leading to teleosts after divergence from gar, followed by TGD duplication of the fusion chromosome and subsequent intrachromosomal rearrangements and rediploidization.
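
The difference between one-to-one and one-to-two synteny can be illustrated with a toy calculation. The sketch below is only an illustration and not the authors' pipeline: the ortholog counts are invented, and `synteny_ratio` is a hypothetical helper. For each gar chromosome it lists the chromosomes of the other species that carry at least a minimum number of its orthologs.

```python
from collections import Counter, defaultdict

def synteny_ratio(ortholog_pairs, min_genes=2):
    """Map each source chromosome to the target chromosomes that share
    at least `min_genes` orthologs with it."""
    counts = defaultdict(Counter)
    for source_chr, target_chr in ortholog_pairs:
        counts[source_chr][target_chr] += 1
    return {
        source_chr: sorted(t for t, n in targets.items() if n >= min_genes)
        for source_chr, targets in counts.items()
    }

# Invented toy data: one (gar chromosome, chicken chromosome) tuple per ortholog
gar_chicken = [("Loc20", "Gga15")] * 5 + [("Loc21", "Gga17")] * 4
# Invented toy data: one (gar chromosome, medaka chromosome) tuple per ortholog
gar_medaka = (
    [("Loc20", "Ola9")] * 3 + [("Loc20", "Ola12")] * 2
    + [("Loc21", "Ola9")] * 2 + [("Loc21", "Ola12")] * 2
)

print(synteny_ratio(gar_chicken))  # one target chromosome per gar chromosome
print(synteny_ratio(gar_medaka))   # two target chromosomes per gar chromosome
```

In this toy example each gar chromosome matches a single chicken chromosome but two medaka chromosomes, which is the double-conserved synteny pattern expected after a fusion followed by the TGD.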

Gar clarifies vertebrate gene family evolution
Many molecular and physiological mechanisms are shared among vertebrates, which makes it possible to trace the different evolutionary fates of genes. However, after a genome duplication some ohnolog lineages can be lost. The analysis of the gar genome made it possible to find ancestral genes dating back to VGD1 and VGD2 and to clarify the functions of some gene families. For instance, the authors analyzed the Hox family and identified four clusters in gar. Gar possesses more Hox genes than tetrapods or teleosts; both of these groups lack some Hox orthologs, indicating that these were lost independently in the two groups. Hox genes are very important during embryonic development, and intuitively one would think that they should be better preserved than other genes. Surprisingly, in my opinion, this study reveals that teleosts, instead of the 82 Hox cluster genes expected, have fewer than 50, indicating massive gene loss after the TGD. Similar results were obtained by analyzing circadian clock genes, specifically opsins; the MHC family; the immunoglobulin genes; and the Toll-like receptors. All these gene families show that the gar genome can act as a bridge between teleosts and tetrapods, as it possesses characteristics of both.

Gar uncover evolution of vertebrate mineralized tissues
The authors chose this class of proteins because they are conserved in almost all vertebrates. In gar they play an important role because the body is covered with ganoid scales, which are formed of ganoin, an “ancestor” of enamel. However, the evolution of the Scpp (secretory calcium-binding phosphoprotein) genes was not clear. Gar contains the largest number of Scpp genes, 35, and this large repertoire made it possible to identify orthologs that could not be found by a direct teleost-tetrapod comparison. The Ambn, Enam and Amel genes encode ameloblastin, enamelin and amelogenin, respectively. They had been found in lobe-finned vertebrates but not in teleosts. They are, however, present in the gar transcriptome and show sequence similarity to zebrafish Scpp genes. This suggests that teleosts may have different orthologs and that the common ancestor of bony vertebrates had a rich repertoire of Scpp genes. Gar has kept it, whereas teleosts and tetrapods have each lost subsets of these genes.

Gar connects vertebrate microRNAomes
A miRNA is a small non-coding RNA molecule (about 22 nucleotides long) that functions in RNA silencing and post-transcriptional regulation of gene expression. This gene class has met the same evolutionary fate as the others mentioned previously: some sequences have become tetrapod- or teleost-specific. The gar genome made it possible to identify 107 families. In my opinion the authors made an interesting discovery: the TGD did not lead to miRNA loss in teleosts. Indeed, the retention rate is higher than for some protein-coding genes, shedding new light on the hypothesis that “miRNA genes are likely to be retained after a duplication owing to their incorporation into multiple gene regulatory networks”. This illustrates how often we focus on the evolution of coding sequences when regulatory mechanisms and non-coding sequences seem to have greater importance.

Gar highlights hidden orthology of cis-regulatory elements
Conserved non-coding elements (CNEs) are non-coding regions of the genome identified by conventional alignment of genomic sequences from two or more species.
These regions are widely studied because the role they play is unclear. However, they are often considered to be cis-acting regulatory sequences (acting on the same DNA molecule that they regulate). The authors analyzed the evolution of these sequences close to developmental Hox and ParaHox genes, considering that during embryonic development gene expression must be controlled precisely both spatially and temporally. This control is brought about, in large part, by the combinatorial interaction of specific transcription factors with cis-regulatory modules. They chose CNS65, a limb enhancer, because in previous alignments its sequence was shown to be conserved in human and chicken but not in teleosts. Again, using gar CNS65 it was possible to find an ortholog in zebrafish. They tested whether this cryptic CNS65 enhancer preserves the ancestral function by generating transgenic zebrafish and mouse embryos. They discovered that the ancestral function is maintained in zebrafish, but with different spatial dynamics. In mouse embryos, gar CNS65 drives expression in forelimbs and hindlimbs in the early stages of development, and only later is its activity restricted to the proximal portion of the limb. Zebrafish CNS65, in contrast, is active only in the developing forelimbs (Fig.4).

Fig.4: Gar CNS65 drives expression throughout the early mouse forelimbs and hindlimbs (arrows) at stage E10.5 (left). At later stages (E12.5), gar CNS65 activity is restricted to the proximal portion of the limb and is absent in developing digits (middle). Zebrafish CNS65 drives reporter expression in developing mouse limbs at E10.5 but only in forelimbs (right).

This is an example of partial loss of the original function, a process that is more frequent during evolution than the gain of a new function. Besides CNS65, they found 108 other limb enhancers shared with humans, compared to the 81 that had previously been found by direct alignment with teleosts, confirming the presence of hidden orthology (Fig.5).

Fig.5: The gar bridge principle of vertebrate CNE connectivity from human through gar to teleosts. Hidden orthology is uncovered for elements that do not directly align between human and teleosts but become evident when first aligning tetrapod genomes to gar, and then aligning gar and teleost genomes

This shows that teleosts have lost a great number of limb enhancers. In the future, gar will be the ideal candidate for studying the fin-to-limb transition.
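
The gar bridge principle amounts to chaining two alignments. The following minimal sketch assumes each alignment is available as a simple dictionary of hits; apart from CNS65, all element names are invented, and `bridge_orthologs` is a hypothetical helper rather than anything used in the paper.

```python
def bridge_orthologs(human_to_gar, gar_to_teleost, human_to_teleost):
    """Return (direct, hidden) sets of human elements.

    direct: elements that align straight from human to teleost.
    hidden: elements with no direct hit that are connected via gar.
    """
    direct = set(human_to_teleost)
    hidden = {
        human_el
        for human_el, gar_el in human_to_gar.items()
        if gar_el in gar_to_teleost and human_el not in direct
    }
    return direct, hidden

# Toy alignments (element names other than CNS65 are made up)
human_to_gar = {"CNS65": "gar_CNS65", "enhA": "gar_enhA", "enhB": "gar_enhB"}
gar_to_teleost = {"gar_CNS65": "zf_CNS65", "gar_enhA": "zf_enhA"}
human_to_teleost = {"enhA": "zf_enhA"}

direct, hidden = bridge_orthologs(human_to_gar, gar_to_teleost, human_to_teleost)
print("direct:", sorted(direct))
print("hidden:", sorted(hidden))
```

Here the toy CNS65 entry is recovered only through gar, while the invented element enhB has a gar hit but no teleost counterpart and therefore stays unconnected.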

Gar illuminate gene expression evolution following the TGD
Earlier I described the evolutionary paths that ohnologs (paralogs arising from genome duplication) may follow after the duplication of the genome. Here the authors obtained two very clear, and I think very rare, examples of how they evolved. The gene slc1a3 underwent neofunctionalization. In gar it is expressed only in brain, bone and testis, while in medaka, which the authors chose as the representative teleost, one ohnolog is mainly expressed in the brain and the other in the liver (Fig.6.c). A completely different fate befell the gpr22 gene, which has undergone subfunctionalization. In gar it is expressed in the brain and in the heart, while in medaka one ohnolog is expressed in the brain and the other in the heart (Fig.6.d).

Fig.6: (c) Neofunctionalized ohnologs for slc1a3 showing new expression in liver. (d) Subfunctionalized TGD ohnologs of gpr22 with one expressed in brain as in gar and the other expressed in heart as in gar. In c and d, the r values denote the correlation of the expression profile of each ohnolog with the gar pattern.

This second mechanism is the more likely outcome: ancestral gene subfunctions tend to be partitioned between the TGD-derived paralogs. The authors also observed a similar pattern for the level of gene expression, where an ohnolog pair tends to evolve toward the expression level of the pre-duplication gene.
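
The logic behind these two examples can be written down as a small decision rule. This is a deliberately simplified sketch: it treats expression as present or absent per tissue, whereas the paper works with quantitative expression profiles and correlations, and `classify_fate` is a hypothetical function, not the authors' method.

```python
def classify_fate(ancestral, copy_a, copy_b):
    """Classify an ohnolog pair from sets of tissues showing expression.

    ancestral: tissues where the unduplicated (gar) gene is expressed.
    copy_a, copy_b: tissues where each teleost ohnolog is expressed.
    """
    if not copy_a or not copy_b:
        return "non-functionalization"   # one copy is silent
    if (copy_a | copy_b) - ancestral:
        return "neofunctionalization"    # expression in a new tissue
    partitioned = copy_a != ancestral and copy_b != ancestral
    if partitioned and (copy_a | copy_b) == ancestral:
        return "subfunctionalization"    # ancestral domains split up
    return "other"

# Tissue sets as described in the text for gar and medaka
slc1a3 = classify_fate({"brain", "bone", "testis"}, {"brain"}, {"liver"})
gpr22 = classify_fate({"brain", "heart"}, {"brain"}, {"heart"})
print(slc1a3, gpr22)
```

With the tissue sets given above, the rule labels slc1a3 as neofunctionalized (liver is a new domain) and gpr22 as subfunctionalized.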

Conclusions
The “gar bridge” led to the identification of many orthologous and paralogous genes and clarified their fate during evolution. Previously, the lack of a direct connection between teleost and tetrapod genomes often led to the wrong use of the word “innovation” for one group or the other. I think that this work is an excellent starting point to connect the evolution of the genetic, developmental and physiological mechanisms that made the human genome evolve to its present state. To fully understand the differences between humans and the model organisms used in biomedicine, it is crucial to create very powerful and close-to-reality models. For these reasons, this path should not stop here, because the gar is only one species of Holostei, which is composed of nine species and two orders. The study of their genomes, and also of those of other so-called “primitive” fish, can help to shed more light on the striking points that have emerged from this study. Perhaps the outcome of other comparative studies can give even more emphasis to these results, or maybe provide answers that may now seem counterintuitive.

References

Braasch I, Gehrke AR, Smith JJ, Kawasaki K, Manousaki T, Pasquier J, Amores A, Desvignes T, Batzel P, Catchen J, Berlin AM, Campbell MS, Barrell D, Martin KJ, Mulley JF, Ravi V, Lee AP, Nakamura T, Chalopin D, Fan S, Wcisel D, Cañestro C, Sydes J, Beaudry FE, Sun Y, Hertel J, Beam MJ, Fasold M, Ishiyama M, Johnson J, Kehr S, Lara M, Letaw JH, Litman GW, Litman RT, Mikami M, Ota T, Saha NR, Williams L, Stadler PF, Wang H, Taylor JS, Fontenot Q, Ferrara A, Searle SM, Aken B, Yandell M, Schneider I, Yoder JA, Volff JN, Meyer A, Amemiya CT, Venkatesh B, Holland PW, Guiguen Y, Bobe J, Shubin NH, Di Palma F, Alföldi J, Lindblad-Toh K, & Postlethwait JH (2016). The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons. Nature genetics, 48 (4), 427-37 PMID: 26950095

]]>
The genetic sex-determination system predicts adult sex ratios in tetrapods https://wp.unil.ch/genomeeee/2016/05/29/the-genetic-sex-determination-system-predicts-adult-sex-ratios-in-tetrapods/ Sun, 29 May 2016 08:56:41 +0000 http://wp.unil.ch/genomeeee/?p=701 ResearchBlogging.org

Genetic sex determination (GSD), i.e. the determination of sexual phenotypes by the effect of sex-determining genes, is found in the majority of vertebrates. Sex determination genes have evolved multiple times independently and can be located on different chromosomes. Depending on whether the presence of the sex determining region (SDR) determines female or male sex, genetic systems of sex determination are called ZW or XY systems respectively, and the sex which is heterozygous for the SDR is called the heterogametic sex. Lower fitness in the heterogametic sex has long been observed in interspecific hybrids in a wide range of animal and even plant species, an observation called Haldane’s rule. In this paper the authors find a similar pattern in (non-hybrid) tetrapod species: by comparing the adult sex ratio (ASR) in XY and ZW systems in 344 tetrapod species, they find that the ASR is skewed towards the homogametic sex (towards females in an XY system and towards males in a ZW system).

This observation is based on a dataset containing known genetic sex determination systems and adult sex ratios (ASRs) of species across the vertebrate phylogeny. Within amphibians and reptiles (in which both XY and ZW systems are found), the authors show that ASRs in ZW systems are significantly more male-biased than in XY systems and that the proportion of species with male-biased ASRs is greater in ZW than in XY systems. Furthermore, these observations hold true for the combined dataset of amphibians, reptiles, mammals (which have a conserved XY system and female-biased ASRs), and birds (which have a conserved ZW system and male-biased ASRs).
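
To make the two comparisons concrete, here is a minimal sketch with invented counts. It assumes that ASR is expressed as the proportion of males among adults, and it deliberately ignores phylogeny; the species counts and the `summarise` helper are hypothetical.

```python
from statistics import mean

def adult_sex_ratio(males, females):
    """ASR expressed as the proportion of males among adults (assumption)."""
    return males / (males + females)

def summarise(species):
    """species: list of (system, n_males, n_females) tuples."""
    by_system = {}
    for system, males, females in species:
        by_system.setdefault(system, []).append(adult_sex_ratio(males, females))
    return {
        system: {
            "mean_asr": mean(ratios),
            # share of species whose ASR exceeds 0.5, i.e. is male-biased
            "male_biased": sum(r > 0.5 for r in ratios) / len(ratios),
        }
        for system, ratios in by_system.items()
    }

# Invented counts for six imaginary species
toy = [("XY", 40, 60), ("XY", 45, 55), ("XY", 55, 45),
       ("ZW", 60, 40), ("ZW", 70, 30), ("ZW", 48, 52)]
summary = summarise(toy)
for system, stats in summary.items():
    print(system, round(stats["mean_asr"], 3), round(stats["male_biased"], 3))
```

A raw comparison like this treats every species as an independent observation, which is exactly the assumption that the phylogenetic corrections discussed later in the post are meant to relax.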

It is important to test whether these observations are actually caused by the GSD or whether there are other factors, which could systematically influence ASR:

– ASRs could be influenced by body size and breeding latitude through correlated life history traits like development, growth and reproductive ecology.

– Differences in body size and dispersal between sexes can lead to differences in mortality which influence ASRs.

The authors account for potential effects of sex-biased dispersal, body size, breeding latitude and sexual size dimorphism in a phylogenetically corrected multi-predictor analysis. Although they do find a significant correlation between sexual size dimorphism and ASR as well as between sex-biased dispersal and ASR, the effect of the GSD remains significant in all cases. Because the dataset for sex-biased dispersal is limited to 32 species in total, which is less than 10% of the number of species in the complete dataset, it is not included in the main multi-predictor model.

Another important factor is the effect of phylogenetic relatedness between species: The effects of GSDs on ASRs of more closely related species are more likely to be correlated due to shared genetic and phenotypic traits.

To account for this, phylogenetic corrections, which are based on composite phylogenies of different tetrapod groups, are applied. As these composite phylogenies don’t include branch length information, different methods are used to assign arbitrary branch lengths, which has surprisingly little effect on the results. Two different methods are applied to account for phylogenetic relatedness across samples: phylogenetic generalized least squares (PGLS) models to test for differences in ASRs between XY and ZW taxa, and Pagel’s discrete method (PDM) to test the fit of dependent and independent models of transitions in ASR bias and GSD. As the second model implies, the number of transitions between GSDs should be more important than the phylogenetic relatedness between species. The authors claim to take this into account by rerunning their analyses while reducing three large groups with a known shared sexual system (mammals, birds and snakes) to a single datapoint each, resulting in unchanged significant differences in ASRs between GSDs.

I wonder whether it would also make a difference to reduce further groups, which share non-independent evolution of SDRs, to single datapoints. For example this dataset includes five species of lizards from the family Lacertidae, which are assumed to share a conserved GSD (Rovatsos et al. 2016) and 9 lizard species of the genus Anolis included in the dataset are likely to share a common sex chromosome system (Gamble et al. 2014). Furthermore in many amphibians and reptiles nothing is known about synteny across sex chromosomes and it is likely that a rigorous reduction of GSDs with common ancestry into single datapoints would reduce the number of independent observations and thus statistical power.
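
Reducing a group with a shared sex chromosome system to a single datapoint could look like the sketch below. It assumes that a collapsed group is represented by the mean ASR of its species, which is my assumption and not necessarily what the authors did; the species names and ASR values are invented.

```python
from statistics import mean

def collapse_clades(records, clades_to_collapse):
    """records: list of (species, clade, system, asr) tuples.

    Species in `clades_to_collapse` are pooled into one datapoint per
    clade; all other species are kept as they are.
    """
    kept, pooled = [], {}
    for species, clade, system, asr in records:
        if clade in clades_to_collapse:
            pooled.setdefault((clade, system), []).append(asr)
        else:
            kept.append((species, system, asr))
    for (clade, system), values in pooled.items():
        kept.append((clade, system, mean(values)))
    return kept

# Invented example records
records = [
    ("mammal_1", "mammals", "XY", 0.45),
    ("mammal_2", "mammals", "XY", 0.47),
    ("bird_1", "birds", "ZW", 0.58),
    ("frog_1", "frogs", "XY", 0.50),
]
reduced = collapse_clades(records, {"mammals", "birds"})
for row in reduced:
    print(row)
```

Adding further groups, such as Lacertidae or Anolis, to `clades_to_collapse` shows directly how quickly the number of independent datapoints shrinks.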

However, the number of relevant datapoints in amphibians is fairly limited anyway: Amphibian species with an XY sex determination system show no significant ASR bias (or even a slight male bias after phylogenetic correction). Thus the observed effect within amphibians relies on data for only 11 species with a ZW system.

There are good reasons to be careful when making general conclusions from this dataset:

Sex reversal is common in some amphibian species, which could bias the observed ASRs. Furthermore, although the authors claim to have included only species with known GSDs, the GSD for amphibians with homomorphic, microscopically indistinguishable sex chromosomes is difficult to determine and frequent subject of scientific dissent.

One example for this is Bufo viridis. The ASR of B. viridis is strongly male biased (0.70), and the GSD is supposed to be a ZW system based on the entry from www.treeofsex.org. However, the claim that B. viridis is female heterogametic is based on a single study, which detected that all seven females examined in a single Moldavian population were heterozygous for a chromosomal inversion. Such a pattern has never been found in any other green toad population, but instead multiple sex linked genetic markers have been developed, which show male-heterogametic segregation patterns in crosses from different B. viridis populations as well as in the closely related species B. siculus, B. balearicus and B. variabilis (Stöck et al. 2011). In my opinion it would be more appropriate to assign B. viridis to species with XY system, which would result in a decrease in the overall differences in ASRs between both groups.

Possible reasons for the effect of the sex-determination system on adult sex ratios

In general, a skewed adult sex ratio can have two different causes: a skewed gametic sex ratio, or higher mortality of one sex resulting in different sex ratios in adults. In more detail, six potential, not mutually exclusive explanations of how the GSD could bias adult sex ratios are proposed and discussed:

– Sexual selection in males could increase mortality.

This would be expected to result in a bias towards females in XY and ZW systems and cannot explain male biased ASRs in ZW systems.

– Recessive deleterious mutations on X/Z chromosomes or Y/W specific deleterious mutations.

Recombination suppression on sex chromosomes leads to degeneration of the sex-linked region on Y/W chromosomes, which can result in adverse fitness effects caused by either deleterious mutations on the Y/W, or deleterious recessive mutations on the hemizygous part of the X/Z chromosome.

Based on a population genetic model they develop, the authors claim that the accumulation of deleterious mutations may not be enough to cause the observed adult sex-ratio bias. However, they admit that many of their parameter estimates are very crude and results may vary when other factors are taken into account, like large differences in the rate of deleterious mutations.

The number of deleterious mutations is expected to increase with increasing sex chromosome differentiation and degeneration. Sex chromosome differentiation in tetrapods spans a wide range from completely homomorphic sex chromosomes in many lizards and amphibians but also in some families of snakes and birds to complete loss of the Y chromosome in some mammals. It would thus be interesting to look if there is an association between variable sex chromosome degeneration and skews in the ASR within groups with homologous sex chromosomes.

– Imperfect dosage compensation.

In the heterogametic sex, genes located in the hemizygous region of the X/Z chromosome are present in only one functional copy. In order to reach similar expression levels as in the homogametic sex, the expression of these genes has to be increased. However, research has shown that not all genes are upregulated in the same way, and as a result many sex chromosomal genes have lower expression levels in the heterogametic than in the homogametic sex.

This explanation is unlikely to result in a general pattern across tetrapods, because there are different mechanisms of dosage compensation in vertebrates: mammals deactivate one X chromosome in females to compensate for gene loss on the Y chromosome, while birds show incomplete dosage compensation on a gene-by-gene basis. Since one X is deactivated in the homogametic sex in mammals, we would expect to find sex-specific fitness differences based on dosage compensation only for non-mammals.

– Meiotic drive:

Meiotic drive systems are genetic variants that favor their own transmission by distorting sex ratios at meiosis. The authors point out that the observed skews in ASR are unlikely to be caused by meiotic drive, because the sex ratio at birth does not predict the adult sex ratio in mammals and birds. However, there is little information on sex ratio at birth in reptiles or amphibians. Furthermore, a better measure of the effect of meiotic drive would be the gametic sex ratio, since the sex ratio may already be skewed at birth due to sex-specific differences in embryonic mortality.

– More rapid degeneration of Y and W chromosomes during lifetime:

The authors propose that the Y/W may be more affected by further degeneration during lifetime (for example by increased telomere shortening or loss of epigenetic marks). To my knowledge this is rather speculative, as I am not aware of any results supporting this hypothesis.

– Sexually antagonistic selection:

Loci that are beneficial to only one sex but may be detrimental to the other are expected to accumulate on sex chromosomes. In an XY system, male-beneficial loci are expected to be found in linkage disequilibrium with the SDR, which ensures that they are exclusively transmitted to males. The positive fitness effects of these Y/W-linked sexually antagonistic mutations would thus result in a positive skew towards the heterogametic sex (although the evolution of recombination suppression may introduce further degeneration of the Y/W chromosome, which can be detrimental). Furthermore, the authors develop a model for sexually antagonistic selection of loci located on X/Z chromosomes and come to the conclusion that there are no robust generalizations about the direction of the skew of the adult sex ratio resulting from these loci.

The authors point out that there is no clear support for any of these hypotheses. Further research could test the assumptions of some of them: recessive deleterious mutations on X/Z chromosomes or Y/W-specific deleterious mutations, imperfect dosage compensation and sexually antagonistic selection are all related to sex chromosome degeneration and recombination suppression. Although it is difficult to comparatively quantify sex chromosome degeneration across species, more high-quality sequences of sex chromosomes are becoming available, and it may soon be possible to link sex chromosome degeneration at the gene level to sex-specific fitness differences. A very crude proxy for this would be to include whether sex chromosomes are microscopically distinguishable (heteromorphic) or indistinguishable (homomorphic) in this analysis and test whether this explains significant variance in ASRs. Further research could also clarify whether there is a connection between ASR and sex ratio at birth, or even better gametic sex ratio, in amphibians or reptiles, which could be indicative of meiotic drive.

Conclusions

Overall, I am skeptical that comparing sexual systems as a simple binary character (male or female heterogametic) adequately represents the diversity of tetrapod sex chromosome systems, and I expect that fitness differences should be more related to sex chromosome degeneration than to the GSD itself. Although a significant proportion of the interspecific variation in ASRs is explained by the GSD in groups with variable sex determination systems, there are multiple possible confounding factors (like sex reversal, problems in determining GSDs, uncertainty of common ancestry of GSDs), which could easily lead to biases in the relatively small number of observations in these groups.

References:

Gamble T, Geneva AJ, Glor RE, Zarkower D (2014). Anolis sex chromosomes are derived from a single ancestral pair. Evolution.68(4):1027-41

Rovatsos M, Jasna V, Altmanova M, Johnson Pokorna M (2016). Conservation of sex chromosomes in lacertid lizards. Molecular Ecology.

Stöck M, Croll D, Dumas Z, Biollay S, Wang J, Perrin N (2011). A cryptic heterogametic transition revealed by sex-linked DNA markers in Palearctic green toads. Journal of Evolutionary Biology. 24:1064-1070

]]>
Supergenes and social organization in a bird species https://wp.unil.ch/genomeeee/2016/05/06/supergenes-and-social-organization-in-a-bird-species/ Fri, 06 May 2016 16:52:51 +0000 http://wp.unil.ch/genomeeee/?p=681 ResearchBlogging.org

 

 

 

Cindy Dupuis, Xinji Li, Casper van der Kooi

 

The development of new molecular methods and next-generation sequencing techniques has advanced our knowledge of the genetic basis underlying phenotypic polymorphism. Over the course of recent years, scientific studies have documented large genomic regions with drastic phenotypic effects, the so-called supergenes. A supergene is a set of genes on the same chromosome that exhibit close genetic linkage and are thus inherited as one unit.

The evolution of a supergene requires that multiple loci with complementary effects become linked (i.e. they are genetically clustered and recombination between the loci is suppressed) and that optimal alleles at the linked loci are combined. Genetic clustering of different loci can occur when, via mutation, an adaptive interaction between two closely placed loci is created. In addition, gene duplications or translocations that generate a series of (novel) complementary genes can give rise to supergenes. The probability of a recombination event occurring between two loci depends on various factors. It is small when the loci are located close together, as the chance of recombination between two loci generally increases with the physical distance between them. Given the large size of supergenes, additional mechanisms nonetheless seem important. Recombination suppression can, for instance, be maintained via structural differences, such as inversions, between the supergene and its homologous chromosomal region.
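
How quickly recombination becomes likely with distance can be illustrated with Haldane's map function, a standard textbook model that converts genetic map distance into an expected recombination fraction. The sketch assumes no crossover interference and works with genetic distance in centimorgans rather than physical distance in base pairs; it is not taken from any of the studies discussed here.

```python
import math

def haldane_recombination_fraction(distance_cm):
    """Expected recombination fraction between two loci separated by
    `distance_cm` centimorgans, under Haldane's map function."""
    distance_morgans = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * distance_morgans))

for cm in (0.1, 1, 10, 50, 200):
    r = haldane_recombination_fraction(cm)
    print(f"{cm:>5} cM -> r = {r:.4f}")
```

The recombination fraction rises with distance and levels off towards 0.5, the value for unlinked loci. This is why tight physical clustering alone is unlikely to hold a large supergene together and why a mechanism such as an inversion helps.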

An interesting example of a supergene in an invertebrate is the case documented by Purcell et al. (2014). They documented a large, nonrecombining region that is associated with social organisation in an ant species. The nonrecombining region was found to span most of one chromosome and was hence aptly called the ‘social chromosome’. They find a structurally similar region with similar effects in another ant species; however, the regions exhibit no homology, suggesting parallel evolution of the social chromosome. Examples of vertebrate social systems determined by supergenes were, to our knowledge, unknown until recently.

Two recent articles (Küpper et al., 2016; Lamichhaney et al., 2016) revealed a single supergene controlling alternative male mating tactics in the ruff (Philomachus pugnax). The studies were carried out independently by two research groups, but reach almost the same conclusions. The ruff is a lekking wader known for the great diversity in male plumage color and for its behavioral polymorphism. Three types of males can be distinguished; these types are characterized by differences in territoriality and behavior that are highly correlated with differences in nuptial plumage and body size. Predominantly dark-colored Independent males are the most common (80-95% of males); these males defend small territories on a lek. Smaller, lighter-colored Satellite males (5-20%) are non-territorial and less faithful to a particular lek. Satellite males make use of the territories of resident Independent males, who largely tolerate them. The third type are the Faeder males, which are very rare (<1% of males). Faeder males lack male display, are small and resemble the unornamented females; however, they have disproportionately large testes.

Previous studies using pedigrees of large, captive populations showed that the reproductive polymorphism follows a single-locus autosomal pattern of inheritance (Lank et al., 1995; Lank et al., 2013). The dominant Faeder allele controls development into Faeder males, whereas the Satellite allele (which is dominant to Independent) controls development into Satellite rather than Independent males. Ekblom et al. (2012) studied nucleotide sequence variation and gene expression in ornamental feathers from 5 Independent and 6 Satellite males using transcriptome sequencing. No significant expression divergence of pre-identified coloration candidate genes was found, but many genetic markers showed nucleotide differentiation between the two morphs. Later, Farrell et al. (2013) used linkage analysis and comparative mapping to locate the Faeder locus, and found linkage to microsatellite markers on avian chromosome 11, in a region that includes the Melanocortin-1 receptor (MC1R) gene. MC1R is a strong candidate for alternative male morph determination because it is considered important in plumage coloration.

Using the captive population that was previously phenotyped, Küpper et al. set out to determine the genomic structure underlying the morph divergence in P. pugnax. The first step in their analysis was to generate and annotate the full genome of one Independent male. Next, the authors identified SNPs in the population using RAD sequencing. More than one million SNPs were identified, and the Faeder and Satellite loci could be placed on a genetic map based on 3,948 SNPs. Interestingly, both morphs mapped to the same region on chromosome 11, but exhibited clear structural differences. This was corroborated by a GWAS on 41 unrelated Satellite, Independent and Faeder males from a natural population.

 

To characterize the genomic region more precisely, they sequenced the whole genomes of a small set of Independent, Satellite and Faeder males. They showed that the region on chromosome 11 was highly differentiated between Satellite and Faeder morphs and that it contained greater nucleotide variation than the adjacent regions. Using read orientation, they found clear evidence for an inversion of this chromosomal region between the different morphs. Interestingly, one breakpoint lies within an essential gene, CENPN (encoding centromere protein N; recessive lethal), which implies that individuals homozygous for the inversion are not viable, an observation that is confirmed by breeding experiments. The authors also suggested that a recombination or gene conversion event occurred between the Satellite and Independent alleles.

 

By comparing gene sequences among morphs, the authors discovered that 78% of the gene sequences differed between morphs, and that those differences had the potential to change the encoded protein. Among the divergent genes, some were found to be involved in hormone production, like HSD17B2, which encodes an enzyme that inactivates testosterone and estradiol. Because it varies between morphs, this enzyme may alter steroid metabolism and partly explain why plumage patterns and behavior differ between morphs. The MC1R gene was also found within the altered genomic region. This gene is considered an important locus controlling color polymorphism and could be the source of the reduced melanin levels in Satellites. The PLCG2 gene, which has been rearranged in Faeders, was proposed as a candidate gene for the rather feminine appearance and non-aggressive behavior of Faeders. Presumably, this gene is part of a cascade leading to the development of the usual impressive plumage of the other male morphs.

 

In the second article, Lamichhaney et al. (2016) studied a natural ruff population using whole-genome sequencing. They first established a high-quality reference genome assembly from an Independent male and conducted functional annotation based on both evidence data and de novo gene predictions. Then, whole-genome resequencing and SNP calling were performed for 15 Independent, 9 Satellite and 1 Faeder males. Their genome-wide screen for genetic divergence (FST) between the male morphs identified a 4.5-Mb region, based on which Independents and Satellites could be phylogenetically clustered as distinct groups. Screening for structural variants identified a 4.5-Mb inversion in Satellites that perfectly overlapped with the differentiated region. In addition, PCR-based sequencing confirmed the positions of the proximal and distal breakpoints and identified a 2,108-bp insertion of a repetitive sequence at the distal breakpoint. Diagnostic tests showed that Satellite males were heterozygous (S/I), while most Independent males were homozygous (I/I). They suggested that the Independent allele represents the ancestral state, which is consistent with the conserved synteny among birds.

The comparison between Faeder and Independent males showed that the genetic differentiation was equally strong across the same region, creating a mirror image of the differentiation pattern between Satellites and Independents. Accordingly, the region could be subdivided into two parts: region A, where the Satellite and Faeder chromosomes were closely related to each other and less closely related to Independent, and region B, where the Satellite and Independent alleles were more closely related and divergent from Faeder. Since an inversion is expected to reduce recombination within the region between the wild-type (I) and mutant alleles (either S or F), the disruption of the differentiation pattern might be the result of one or two recombination events between an Independent and a Faeder-like chromosome. The divergence time between the Independent allele and the Satellite or Faeder alleles was estimated to be approximately 4 million years, using the nucleotide divergence and estimated mutation rates for birds. The last recombination event was estimated to have occurred 520,000 ± 20,000 years ago.
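
As a rough illustration of how a divergence time can be derived from nucleotide divergence and a mutation rate, the sketch below uses the textbook relation T = d / (2μ): two alleles that split T years ago each accumulate μ substitutions per site per year. The function name and the input values are hypothetical placeholders, not the figures used by Lamichhaney et al., and the sketch assumes a constant, neutral substitution rate.

```python
def divergence_time_years(pairwise_divergence: float,
                          mutation_rate_per_site_per_year: float) -> float:
    """Estimate the time since two alleles split as T = d / (2 * mu).

    The factor 2 reflects that substitutions accumulate independently
    on both lineages after the split.
    """
    if mutation_rate_per_site_per_year <= 0:
        raise ValueError("mutation rate must be positive")
    if pairwise_divergence < 0:
        raise ValueError("divergence must be non-negative")
    return pairwise_divergence / (2.0 * mutation_rate_per_site_per_year)


# Hypothetical inputs for illustration only (not values from the paper).
example_divergence = 0.01   # proportion of sites that differ between alleles
example_rate = 1.0e-9       # substitutions per site per year
print(f"{divergence_time_years(example_divergence, example_rate):,.0f} years")
```

The estimate scales inversely with the assumed mutation rate, so the uncertainty in that rate carries over directly to the date.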

To better understand the genetic consequences of the inversion and relate it to the phenotypic variation in male ruffs, the authors searched for candidate mutations among the genes in the inverted region. Mutations in several genes with important functions were found on the Satellite and Faeder chromosomes, including the abovementioned CENPN, HSD17B2 and MC1R genes as well as SDR42E1 (the latter is important for the metabolism of sex hormones). Missense mutations in derived MC1R were found to be associated with the Satellite and Faeder alleles, hinting at a potential mechanism explaining the male plumage polymorphism during the breeding season.

In conclusion, these two studies demonstrated the presence of a genomic inversion that led to the evolution of a supergene. This supergene determines the complex phenotypic variation in male ruffs. Both papers contribute to our understanding of supergenes, complex phenotypes and social organization.

 

Küpper C, Stocks M, Risse JE, Dos Remedios N, Farrell LL, McRae SB, Morgan TC, Karlionova N, Pinchuk P, Verkuil YI, Kitaysky AS, Wingfield JC, Piersma T, Zeng K, Slate J, Blaxter M, Lank DB, & Burke T (2016). A supergene determines highly divergent male reproductive morphs in the ruff. Nature Genetics, 48 (1), 79-83 PMID: 26569125

]]>
Evolution of Darwin’s finches and their beaks revealed by genome sequencing https://wp.unil.ch/genomeeee/2015/06/02/evolution-of-darwins-finches-and-their-beaks-revealed-by-genome-sequencing-2/ Tue, 02 Jun 2015 10:48:37 +0000 http://wp.unil.ch/genomeeee/?p=621 The recent formation and habitat diversity of the Galápagos archipelago, in conjunction with its relative isolation from the mainland, have helped the islands become rich in endemic species that have much to offer for the study of evolutionary biology.

As a result of their volcanic origin and fluctuating climates, the islands of the Galápagos archipelago vary in age, size, topography and vegetation. In conjunction with their isolation from the mainland, this diversity of relatively new environments, both within and between islands, makes them perfect breeding grounds for speciation. The finches of the Galápagos archipelago and Cocos Island are the product of a fascinating adaptive radiation that started only about 1.5 million years ago, following the arrival of a common ancestor from South America. These finches are most notable for their diversity in beak morphology, which reflects differences in their adaptations to exploiting various food resources. Charles Darwin’s observations of this diversity in beak morphology played an important role in the development of his theory of natural selection.

“Seeing this gradation and diversity of structure in one small, intimately related group of birds, one might really fancy that from an original paucity of birds in this archipelago, one species had been taken and modified for different ends” – Charles Darwin

There has since been great interest in the study of Darwin’s finches, and much research has aimed at resolving their phylogenetic history and elucidating the mechanisms that drive their variation. In the paper reviewed here, the authors took the extraordinary step of sequencing the whole genomes of 120 individuals, representing all 15 of the Darwin’s finch species across the Galápagos and Cocos Islands as well as two close relatives (Tiaris bicolor and Loxigilla noctis) from Barbados. Analyzing this rich data set, they find important deviations from previous taxonomies and identify several genomic regions associated with beak shape.

Phylogeny

Figure 1: (a) Sample locations and (b) phylogeny based on all autosomal sites
Species tree from Farrington et al. (2014)

After sequencing and assembly, they generated four phylogenies based on (i) autosomal DNA (see Figure 1 above), (ii) mitochondrial DNA and (iii, iv) sequences linked to the sex chromosomes Z and W. Their phylogenies largely supported previous taxonomies (compare Figure 1 with the species tree from Farrington et al. (2014), which was generated using 14 nuclear introns and two short sequences of mitochondrial DNA). However, this new genome-based phylogeny also showed some important differences. For one, as we can see in Figure 1, the species classified as G. difficilis actually forms three distinct groups, which cluster geographically by the islands of (1) Pinta, Santiago and Fernandina, (2) Wolf and Darwin and (3) Genovesa. Apparently this is consistent with taxonomies proposed in two studies that appeared in 1931 and 1945, but it is unclear to me why they remained classified as a single species until now.

Similarly, they found that G. conirostris is likely also paraphyletic, given that G. conirostris on Española was most similar to G. magnirostris, and G. conirostris on Genovesa was most similar to G. scandens. Following these findings, the authors of this study recommend that the taxonomy for G. difficilis and G. conirostris be revised to reflect their paraphyly, according to the new genome-based phylogeny.

Gene flow

Evidence for introgression was found by comparing the autosomal phylogenetic tree with those of the sex-linked loci and mtDNA, and through ABBA-BABA tests. They found that there has been extensive gene flow and hybridization between the species throughout the radiation, which likely contributed to their rapid evolution.
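
For readers unfamiliar with the ABBA-BABA test, here is a minimal sketch of Patterson's D statistic, D = (ABBA - BABA) / (ABBA + BABA), for four taxa related as (((P1, P2), P3), outgroup). Without gene flow the two discordant site patterns are expected to be equally frequent, so D is close to zero; an excess of ABBA sites suggests gene flow between P2 and P3. The sketch assumes one allele per taxon per site and uses made-up toy sites; it is not the authors' pipeline, which would work on allele frequencies and assess significance with something like a block jackknife.

```python
from typing import Iterable, Tuple

Site = Tuple[str, str, str, str]  # alleles at (P1, P2, P3, outgroup)


def patterson_d(sites: Iterable[Site]) -> float:
    """Patterson's D from biallelic sites.

    The outgroup allele is treated as ancestral (A); any other allele is
    treated as derived (B).
    """
    abba = 0
    baba = 0
    for p1, p2, p3, outgroup in sites:
        if p3 == outgroup:
            continue  # P3 must carry the derived allele to be informative
        if p1 == outgroup and p2 == p3:
            abba += 1  # pattern A B B A: P2 shares the derived allele with P3
        elif p2 == outgroup and p1 == p3:
            baba += 1  # pattern B A B A: P1 shares the derived allele with P3
    if abba + baba == 0:
        raise ValueError("no informative ABBA or BABA sites")
    return (abba - baba) / (abba + baba)


# Toy data, invented for illustration: 6 ABBA, 2 BABA, 5 uninformative sites.
toy_sites = ([("A", "G", "G", "A")] * 6
             + [("G", "A", "G", "A")] * 2
             + [("A", "A", "G", "A")] * 5)
print(patterson_d(toy_sites))
```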

While this is certainly a very interesting find, it is perhaps unsurprising, given the proximity of the islands and the relative ease for individuals to fly from one to the other. This does offer a nice comparison to the adaptive radiations of cichlid fishes, which occur in geographically isolated lakes without the opportunity for gene flow between them (see blog post).

Genetic basis of beak shape

Network tree of Darwin’s finches with images showing diversity of beak shape

Now that they had this large genomic dataset, they wanted to address the question of how molecular differentiation contributes to beak morphology. They did this by choosing four closely related populations that differed in beak shape (two blunt and two pointed), and then scanning the whole genomes to identify regions with high genetic differentiation (Z-transformed FST, ZFST) between the two phenotypes. In figure 3a, they mark the 15 regions with the highest ZFST values, along with the genes identified within them.
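
The Z-transformation behind such a scan can be sketched as follows: each window's FST is expressed as the number of standard deviations it lies above or below the genome-wide mean, and the most extreme windows are reported. The FST values, the window indexing and the use of the population standard deviation are assumptions made for this illustration; the review does not say how the authors set up their windows.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def z_transform(window_fst: Sequence[float]) -> List[float]:
    """Convert per-window FST values to Z-scores (ZFST)."""
    mu = mean(window_fst)
    sd = pstdev(window_fst)  # population SD; a sample SD would also be defensible
    if sd == 0:
        raise ValueError("all windows have the same FST")
    return [(value - mu) / sd for value in window_fst]


def top_windows(window_fst: Sequence[float], n: int) -> List[Tuple[int, float]]:
    """Return (window index, ZFST) for the n most differentiated windows."""
    zfst = z_transform(window_fst)
    ranked = sorted(range(len(zfst)), key=lambda i: zfst[i], reverse=True)
    return [(i, zfst[i]) for i in ranked[:n]]


# Hypothetical FST values for eight consecutive genomic windows.
toy_fst = [0.02, 0.03, 0.01, 0.45, 0.02, 0.04, 0.03, 0.30]
for index, z in top_windows(toy_fst, 2):
    print(f"window {index}: ZFST = {z:.2f}")
```

Standardising in this way makes peaks comparable across comparisons with different background levels of differentiation, but the ranking within one scan is the same as for raw FST.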

Of those genes, they found 6 that were previously reported to be involved in craniofacial/beak development in mammals and birds. Interestingly, they did not find high genetic differentiation in bone morphogenetic protein 4 (BMP4), a gene that was previously reported to show differential expression between beak types. This may be because BMP4 acts through differences in expression rather than in sequence, and it’s a pity this study does not include any RNA-seq work to complement their huge genomic dataset, but perhaps they’re saving that for another Nature paper.

Haplotype tree of the ALX1 region

The highest ZFST peak contained the gene ALX1, which is involved in craniofacial development, and whose loss in humans can even cause severe facial clefting. The authors found that two variants of this ALX1 gene are present, and each remarkably corresponds to one of two categories of beak shape: blunt and pointed. A phylogenetic tree constructed from this region (figure 3c left) shows that the blunt shape was an early adaptation that seems to have been quite favorable; the short branch lengths among the blunt haplotypes (red) are indicative of a selective sweep, which is further supported by the low nucleotide diversity shown in figure 3b below (although it looks as though G. difficilis from Wolf may also be showing low nucleotide diversity in part of this region, possibly from introgression with G. magnirostris?).

Nucleotide diversity in the ALX1 region

G. fortis populations show substantial diversity in beak shape, and so the authors then genotyped an additional 62 birds from this species and found a textbook association between beak shape and genotype (figure 3e below; BB is blunt haplotype homozygote, PP is pointed haplotype homozygote, and BP is heterozygote). While beak morphology certainly involves multiple genes, as evidenced by the 15 significant genomic regions, their work shows that ALX1 alone is one of the most important, if not the most important contributor.
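
To show what a regression of beak shape on genotype looks like in practice, here is a minimal ordinary least squares sketch. It assumes an additive coding of the ALX1 genotypes (BB = 0, BP = 1, PP = 2, i.e. the number of pointed haplotypes) and uses invented beak shape scores; neither the coding nor the numbers come from the paper.

```python
from typing import List, Sequence, Tuple

# Assumed additive coding: number of pointed (P) haplotypes carried.
GENOTYPE_CODE = {"BB": 0, "BP": 1, "PP": 2}


def linear_regression(x: Sequence[float],
                      y: Sequence[float]) -> Tuple[float, float, float]:
    """Ordinary least squares fit of y on x; returns (slope, intercept, r_squared)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, (sxy ** 2) / (sxx * syy)


# Hypothetical birds: (genotype, beak shape score); higher = more pointed.
toy_birds = [("BB", -1.0), ("BB", -0.8), ("BP", 0.1),
             ("BP", -0.1), ("PP", 0.9), ("PP", 1.1)]
codes: List[float] = [GENOTYPE_CODE[g] for g, _ in toy_birds]
scores: List[float] = [s for _, s in toy_birds]
slope, intercept, r_squared = linear_regression(codes, scores)
print(f"slope per P haplotype = {slope:.2f}, R^2 = {r_squared:.2f}")
```

The slope is the expected change in beak shape score per additional pointed haplotype, which is what makes heterozygotes fall between the two homozygotes in an additive model.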

Linear regression analysis of beak shape score by genotype

Conclusions

The authors put in a tremendous effort to sequence the genomes of all of these individuals, representing each of the 15 species of Darwin’s finches, and once these are made accessible to the public they will no doubt be a valuable resource for any future studies involving those species, and indeed for anyone interested in the field of adaptive radiation. However, given such a large dataset, it would have been nice to see some additional work done, e.g. an assessment of possible differential expression of the genes within the 15 observed ZFST peaks, or further analyses of some of the other genes found in those peaks. I wonder also whether they might be able to apply the approach of Zhan et al. (2014), which was used to identify regions of the genome associated with migratory behavior in Monarch butterflies. This, or a similar approach, might yield more information than only scanning for high FST.

Update 21/10/2015: a previous version of this post stated incorrectly that the species tree from Farrington et al. (2014) was based largely on mitochondrial DNA.

Lamichhaney, Sangeet, Jonas Berglund, Markus Sällman Almén, Khurram Maqbool, Manfred Grabherr, Alvaro Martinez-Barrio, Marta Promerová, et al. “Evolution of Darwin’s Finches and Their Beaks Revealed by Genome Sequencing.” Nature 518 (2015): 371–375. http://www.nature.com/doifinder/10.1038/nature14181.

]]>
Convergent evolution of the genomes of marine mammals https://wp.unil.ch/genomeeee/2015/05/28/convergent-evolution-of-the-genomes-of-marine-mammals-2/ Thu, 28 May 2015 13:59:04 +0000 http://wp.unil.ch/genomeeee/?p=611

Introduction

Convergent evolution is the independent evolution of similar features in species of different lineages. Marine mammals from different mammalian orders share several phenotypic traits adapted to the aquatic environment, which makes them a classic example of convergent evolution. Although there are potentially several genomic routes to reach the same phenotypic outcome, it has been suggested that the genomic changes underlying convergent evolution may to some extent be reproducible and that convergent phenotypic traits may commonly arise from the same genetic changes. To investigate convergent evolution at the genomic level, the authors present high-coverage whole-genome sequences for four marine mammal species: the walrus (Odobenus rosmarus), the bottlenose dolphin (Tursiops truncatus), the killer whale (Orcinus orca) and the West Indian manatee (Trichechus manatus latirostris) (figure 1). Here are some interesting results of this paper.

Fig 1: Phylogeny of 20 eutherian mammalian genome sequences, rooted with a marsupial outgroup.

Detecting positively selected protein-coding genes

To study the molecular mechanisms of convergent evolution, they first focused on detecting positively selected protein-coding genes in all three orders. The branch-site likelihood ratio test is a powerful phylogenetic method for detecting relatively ancient selection. It is useful for identifying positive selection along prespecified lineages that affects only a few sites in the protein. They applied the branch-site test to four different sets of branches: one on the combined marine mammal branches and one on each of the individual branches leading to the manatee, the walrus and the order containing the dolphin and killer whale (see the branches colored red in Fig. 1). They identified 191 genes under positive selection across the combined marine mammal branches, of which 5 remained after conservative correction for multiple testing.
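
The logic of a likelihood ratio test followed by a multiple-testing correction can be sketched in a few lines. The test statistic is twice the difference in log-likelihood between the model that allows positive selection and the null model. The sketch assumes that the statistic is compared to a chi-square distribution with one degree of freedom (a commonly used conservative choice for the branch-site test) and that the correction is a Bonferroni correction; the review does not state which correction the authors used, and the gene names and log-likelihoods below are invented.

```python
import math
from typing import Dict, List


def lrt_pvalue_df1(lnl_null: float, lnl_alternative: float) -> float:
    """P-value of a likelihood ratio test with one degree of freedom.

    For one degree of freedom the chi-square survival function equals
    erfc(sqrt(statistic / 2)), so only the standard library is needed.
    """
    statistic = max(2.0 * (lnl_alternative - lnl_null), 0.0)
    return math.erfc(math.sqrt(statistic / 2.0))


def bonferroni_significant(pvalues: Dict[str, float],
                           alpha: float = 0.05) -> List[str]:
    """Names of tests whose p-value is below alpha divided by the number of tests."""
    threshold = alpha / len(pvalues)
    return sorted(name for name, p in pvalues.items() if p < threshold)


# Hypothetical (null, alternative) log-likelihoods for three made-up genes.
toy_fits = {"geneA": (-1000.0, -990.0),
            "geneB": (-500.0, -499.5),
            "geneC": (-750.0, -746.0)}
toy_pvalues = {gene: lrt_pvalue_df1(null, alt)
               for gene, (null, alt) in toy_fits.items()}
print(bonferroni_significant(toy_pvalues))
```

With thousands of genes tested, the corrected threshold becomes very strict, which is consistent with only a handful of the 191 candidate genes surviving the correction.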

Identifying convergent amino acid substitutions in positively selected genes

Secondly, they focused on identifying convergent amino acid substitutions encoded within the positively selected genes found in the first part. They found that parallel nonsynonymous changes, mapping to the same amino acid site in more than one marine mammal lineage, were widespread across the genome. In total, they identified 44 parallel nonsynonymous amino acid substitutions that occurred along all 3 marine mammal lineages. More specifically, 15 of these 44 identical substitutions were encoded within genes evolving under positive selection in at least one lineage; 8 of these genes were inferred to have evolved under positive selection in the test including all 3 marine mammal lineages (Fig. 2 and Table 1).
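
Conceptually, finding parallel substitutions boils down to asking, for each amino acid site, whether every lineage of interest changed from the same ancestral residue to the same derived residue. The sketch below assumes that ancestral states have already been reconstructed and that the inferred changes are stored per site and per lineage; this data layout, the lineage labels and the toy values are hypothetical and are not the authors' actual format.

```python
from typing import Dict, List, Sequence, Tuple

Substitution = Tuple[str, str]  # (ancestral amino acid, derived amino acid)


def parallel_sites(changes: Dict[int, Dict[str, Substitution]],
                   lineages: Sequence[str]) -> List[int]:
    """Sites at which every listed lineage shows the identical substitution."""
    hits = []
    for site in sorted(changes):
        per_lineage = changes[site]
        if any(name not in per_lineage for name in lineages):
            continue  # at least one lineage did not change at this site
        observed = {per_lineage[name] for name in lineages}
        if len(observed) == 1:
            ancestral, derived = next(iter(observed))
            if ancestral != derived:
                hits.append(site)
    return hits


# Toy data: inferred changes per alignment site and lineage (invented).
toy_changes = {
    12: {"cetaceans": ("A", "S"), "walrus": ("A", "S"), "manatee": ("A", "S")},
    40: {"cetaceans": ("L", "F"), "walrus": ("L", "F"), "manatee": ("L", "I")},
    77: {"cetaceans": ("K", "R"), "walrus": ("K", "R")},
}
print(parallel_sites(toy_changes, ["cetaceans", "walrus", "manatee"]))
```

Relaxing the requirement from all three lineages to any two would recover more sites, which corresponds to the larger set of substitutions shared by at least two lineages in the genome scan.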

Table 1 Positively selected genes that encode parallel substitution in all three marine mammal lineages
Figure 2 Genome scans for convergence. Marine mammal genomes showed a large number of parallel substitutions (blue) that occurred along the branches of at least two marine mammal lineages since they evolved from a terrestrial ancestor. Parallel substitutions that occurred in positively selected genes are shaded red.

Is the phenotype associated with the genotypes identified in this study? Indeed, they found that several of the 15 genes under positive selection have known functional associations that suggest a role in the convergent phenotypic evolution of the marine mammal lineages. For example, S100A9 and MGP encode calcium-binding proteins that have a role in bone formation, SMPX has a role in hearing and inner ear formation, C7orf62 has known links to hyperthyroidism, MYH7B has a role in the formation of cardiac muscle and SERPINC1 regulates blood coagulation. These genes could therefore be linked to convergent phenotypic traits such as changes in bone density (S100A9 and MGP), which is high in shallow-diving species such as the manatee and walrus to overcome neutral buoyancy but low in deep-diving cetacean species that collapse their lungs to overcome neutral buoyancy.

For me, the most interesting result they found is an unexpectedly high level of convergence along the combined branches of the terrestrial sister taxa (cow, dog and elephant) to the marine mammals, for which there is no obvious phenotypic convergence. This finding suggests that the options for both adaptive and neutral substitutions in many genes may be limited, possibly because substitutions at alternative sites have pleiotropic and deleterious effects.

Summary

This paper nicely showed that convergent amino acid substitutions were widespread throughout the genome and that a subset of these substitutions were in genes evolving under positive selection and putatively associated with a marine phenotype. However, the authors also found higher levels of convergent amino acid substitutions in a control set of terrestrial sister taxa to the marine mammals. These results suggest that, whereas convergent molecular evolution is relatively common, adaptive molecular convergence linked to phenotypic convergence is comparatively rare.

Foote, A., Liu, Y., Thomas, G., Vinař, T., Alföldi, J., Deng, J., Dugan, S., van Elk, C., Hunter, M., Joshi, V., Khan, Z., Kovar, C., Lee, S., Lindblad-Toh, K., Mancia, A., Nielsen, R., Qin, X., Qu, J., Raney, B., Vijay, N., Wolf, J., Hahn, M., Muzny, D., Worley, K., Gilbert, M., & Gibbs, R. (2015). Convergent evolution of the genomes of marine mammals Nature Genetics, 47 (3), 272-275 DOI: 10.1038/ng.3198

]]>
The genomic landscape underlying phenotypic integrity in the face of gene flow in crows https://wp.unil.ch/genomeeee/2015/05/25/the-genomic-landscape-underlying-phenotypic-integrity-in-the-face-of-gene-flow-in-crows-2/ Mon, 25 May 2015 18:20:44 +0000 http://wp.unil.ch/genomeeee/?p=592

In this paper the authors return to the question of the role of interspecific gene flow in evolution and species diversification. They studied the hybrid zone between two taxa, the all-black carrion crow (Corvus corone) and the gray-coated hooded crow (C. cornix). Their morphological hybrid zone in Europe makes it possible to study the effects of introgression during early species divergence. The authors identified genome-wide introgression and showed divergence in the expression levels of genes implicated in plumage coloration and of genes involved in visual perception, which could be important for maintaining phenotypic differences and could be responsible for the heterogeneity in introgression landscapes.

Principal results

First, the authors assembled a high-quality reference genome of one hooded crow male, which was aligned to the chicken and zebra finch genomes and then annotated using mRNA sequencing. This yielded a set of 20,794 protein-coding genes containing open reading frames of more than 100 amino acids. RNA-seq data were used to validate the genes identified in silico. The authors then resequenced 60 genomes of unrelated birds from four populations of carrion and hooded crows and found 8.44 million single-nucleotide polymorphisms (SNPs) segregating across all investigated populations. Interestingly, carrion and hooded crows shared just 5.27 million of these SNPs. The authors also discovered substantial genome-wide gene flow across the hybrid zone. They observed that the major axes of genetic variation corresponded to the hypothesized direction of spatial expansion out of Spain. Moreover, German carrion crows grouped more closely with both Swedish and Polish hooded crow populations than with Spanish carrion crows (Figure 1). Using multiple approaches, such as the ABBA-BABA test, admixture analysis and coalescent-based parameter estimates under an isolation-with-migration model, the authors demonstrated extensive gene flow between hooded crows and the German carrion crow population.
Further, mRNA sequencing was performed on 19 individuals and five tissues to assess gene expression divergence between the species across the hybrid zone. The authors observed only a low proportion (0.03% to 0.41%) of differentially expressed genes across tissues between carrion and hooded crows. Most of the differentially expressed genes were involved in plumage coloration, and all overexpressed genes found were implicated in the melanogenesis pigmentation pathway (Figure 3). Nineteen of the 20 genes involved in melanogenesis were underexpressed in the gray hooded crows. All of these differentially expressed genes were detected in growing feather follicles from the bird’s torso. The authors confirmed that the expression bias reflected down-regulation of a broad spectrum of genes in the melanogenesis pathway rather than a defect in melanin deposition due to differences in melanocyte density (Figure 4).
Then, the authors investigated the landscape of genomic divergence using a 50-kb window-based approach, in which a clustering algorithm reconstructs local genomic phylogenies without any a priori hypothesis. They showed that only 0.28% of the genome was divergent between carrion and hooded crows. They also found one 1.95-Mb genomic region on chromosome 18 that exhibited strong genetic differentiation between the two species. This region harbored 81 of all 82 fixed sites between carrion and hooded crows and contained 40 annotated protein-coding genes. Moreover, it was characterized by markedly reduced nucleotide diversity and differentiation in all populations and by increased linkage disequilibrium (LD). The authors do not rule out an inversion in this region. In Figure 2, the authors show one region under recent positive selection in hooded crows. This region had many fixed hooded crow-specific derived variants and reduced values of Fu and Li’s D statistic (P < 0.05). Moreover, the region contained members of the voltage-gated calcium channel subunit gene (CACNG) family, which encode transmembrane regulators of AMPA receptors. These proteins modulate the activity of the microphthalmia-associated transcription factor gene MITF, a principal regulator of melanogenesis (Figure 3C). The authors found 11 melanogenesis genes that are regulated by MITF and were underexpressed in gray hooded crow feather follicles. Thus, the authors connect gene expression, color differences and the signature of local divergent selection, and postulate that a number of genes cause color divergence in crows. Further gene expression analysis revealed that regulator of G protein signaling 9 (RGS9), normally highly expressed in the eye, together with members of the SLC24 gene family, which are involved in pigmentation, showed decreased expression levels in hooded crows.
To conclude, this paper underlines the significance of inversions for the evolutionary process and the role of sexual selection in phenotypic and genotypic differentiation.

Personal comment

This paper presents a thorough and complete piece of work that deepens our understanding of the role of interspecific gene flow in evolution and species diversification. However, in figure 1 the authors show a map of the European distribution of the carrion and hooded crows that is not linked to the principal components PC1 and PC2. In my opinion, it would be better to perform an analysis that links the PCA to geographical coordinates, for example a Procrustes analysis (a form of statistical shape analysis used to analyze the distribution of a set of shapes).

Poelstra, J., Vijay, N., Bossu, C., Lantz, H., Ryll, B., Muller, I., Baglione, V., Unneberg, P., Wikelski, M., Grabherr, M., & Wolf, J. (2014). The genomic landscape underlying phenotypic integrity in the face of gene flow in crows Science, 344 (6190), 1410-1414 DOI: 10.1126/science.1253226

]]>