From CBG
»  An association method for discovering pioneer transcription factors

We have just published a method for discovering new pioneer transcription factors, i.e., transcription factors that play an important role in the establishment and maintenance of open chromatin. The method associates transcription factor expression with chromatin accessibility genome-wide across multiple cell lines. Applied to the ENCODE data, it rediscovers known pioneer transcription factors and identifies new ones. It shows yet again that novel biological insights can be obtained by sophisticated analysis of large-scale public data sets. The paper is published in PLoS Computational Biology.

14 Feb 2017 — 13:02
»  A new paper on regulatory circuits

Using gene expression data and other genomic information, we constructed 394 cell type- and tissue-specific gene regulatory networks for humans, each specifying the genome-wide connectivity between transcription factors, enhancers, promoters and genes. Each of these networks describes hundreds of thousands of regulatory interactions among thousands of genes, giving the first global view of the “control system” of cells and tissues. We found that genetic variants associated with human diseases disrupt components of these networks in disease-relevant tissues, giving new insights into disease mechanisms, which may lead to personalised treatments that are more effective and have fewer side effects. The paper is published in Nature Methods.

24 Mar 2016 — 18:03
»  Sven Bergmann is on TV

CBG director Sven Bergmann was interviewed on the science show Quarks & Co on the German TV channel WDR. The host wanted to know whether there are such things as German genes. The answer can be seen in the show.

22 Mar 2016 — 15:03
» Pascal

A new pathway analysis method for GWAS summary statistics


We recently published Pascal (Pathway scoring algorithm), a tool that allows gene- and pathway-level analysis of GWAS association results without the need to access the original genotypic data. Pascal was designed to be fast and accurate, and to have high power to detect relevant pathways. We extensively tested our approach on a large collection of real GWAS association results and saw better discovery of confirmed pathways than with other popular methods. The paper is available in PLoS Computational Biology.

Jan 25

== Rigorous gene and pathway analysis of GWAS ==

'''Pascal (Pathway scoring algorithm) is an easy-to-use tool for gene scoring and pathway analysis from GWAS results'''. Pascal uses external data to estimate linkage disequilibrium, so the user only needs to supply genome-wide SNP p-values. Pascal then derives p-values for genes and predefined pathways. Because Pascal does not rely on Monte Carlo simulation to derive gene p-values, it is both faster and more accurate. This speed in gene scoring is then leveraged to control the false positive rate in pathway scoring. For pathway scoring we implemented and tested enrichment strategies that compared very favorably to hypergeometric enrichment. This comparison was done on a large collection of GWAS results, giving us confidence to recommend Pascal for downstream analysis of GWAS results. Pascal is mainly written in Java and has been tested on Unix systems and Mac OS X.


  • The Pascal paper was among the '''top 50 most downloaded papers''' from PLoS journals in 2016.


  • '''[http://www2.unil.ch/cbg/images/3/3d/PASCAL.zip Pascal package]''' (Download might take a while because the 1KG-EUR data are included)
  • '''[[PascalTestData | Test data]]''' (Additional data that were used for evaluation in the paper)

'''Note''': We found an issue with the genotype files packaged with versions of Pascal released before June 6th 2017 (thanks to Sujoy Ghosh for pointing us to this issue). Genotypes on chromosome 1 were truncated, leading to a loss of about 5% of gene scores overall (other gene scores are unchanged). We have now updated the genotype files. While the pathway scores are well calibrated in both cases, one would expect a small drop in power with the truncated files. We investigated this issue on a large GWAS collection, which showed small power gains for the updated genotype files in the investigated settings (see results [[Updated_vs_deprecated_genotypes| here]]).


  • '''Fast and rigorous computation of gene and pathway scores from SNP-based summary statistics'''. ('''[http://regulatorycircuits.org/data/papers/LamparterMarbach2016.pdf PDF]''')
    Lamparter D*, Marbach D*, Rueedi R, Kutalik Z, and Bergmann S.
    [http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004714 ''PLoS Computational Biology'' 12, e1004714, 2016.]

[[File:PascalFigure1.jpg|500px|left]] '''Figure: Overview of methodology to compute gene and pathway scores'''

('''a''') We compute gene scores by aggregating SNP p-values from a GWAS meta-analysis (without the need for individual genotypes), while correcting for linkage disequilibrium (LD) structure. To this end, we use numerical and analytic solutions to compute gene p-values efficiently and accurately given LD information from a reference population. Two options are available: the max and sum of chi-squared statistics, which are based on the most significant SNP and the average association signal across the region, respectively.
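The two gene-level statistics named in panel (a) can be sketched as follows. This is a minimal illustration, not Pascal's implementation: the function names and the toy LD matrix are hypothetical, and the step that turns the statistics into exact gene p-values (the numerical and analytic solutions mentioned above) is deliberately left out. The sketch only shows how SNP p-values map to chi-squared statistics and how LD changes the null moments of the sum statistic.

```python
from statistics import NormalDist

def pvalue_to_chi2(p):
    """Convert a two-sided SNP p-value (0 < p <= 1) to a 1-dof chi-squared statistic."""
    z = NormalDist().inv_cdf(p / 2.0)  # lower-tail quantile; its square is the statistic
    return z * z

def gene_statistics(snp_pvalues):
    """Return the (sum, max) of chi-squared statistics over the SNPs of one gene."""
    chi2 = [pvalue_to_chi2(p) for p in snp_pvalues]
    return sum(chi2), max(chi2)

def sum_null_moments(ld):
    """Null mean and variance of the sum statistic for an LD (correlation) matrix.

    For z ~ N(0, ld), E[z'z] = trace(ld) and Var[z'z] = 2 * sum of squared entries.
    """
    mean = sum(ld[i][i] for i in range(len(ld)))
    variance = 2.0 * sum(r * r for row in ld for r in row)
    return mean, variance

# Hypothetical gene with three SNPs and a toy LD matrix.
snp_pvalues = [0.05, 0.2, 0.001]
ld = [[1.0, 0.5, 0.0],
      [0.5, 1.0, 0.0],
      [0.0, 0.0, 1.0]]
print(gene_statistics(snp_pvalues))
print(sum_null_moments(ld))
```

With correlated SNPs the null variance of the sum statistic exceeds the value it would take under independence, which is why ignoring LD would give miscalibrated gene p-values.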

('''b''') We use external databases to define gene sets for each reported pathway. We then compute pathway scores by combining the scores of genes that belong to the same pathways, i.e. gene sets. The fast gene scoring method allows us to dynamically recalculate gene scores by aggregating SNP p-values across pathway genes that are in LD and thus cannot be treated independently. This amounts to fusing the genes and computing a new score that takes the full LD structure of the corresponding locus into account. We evaluate pathway enrichment of high-scoring (potentially fused) genes using one of two parameter-free procedures (chi-square or empirical score), avoiding any p-value cutoffs inherent to standard binary enrichment tests.
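For contrast, the standard binary enrichment test that Pascal's procedures are compared against can be sketched as below. This is the hypergeometric baseline, not Pascal's chi-square or empirical score; the gene names, p-values and pathway are hypothetical toy data. The point of the sketch is that the result depends on the p-value cutoff used to binarise gene scores.

```python
from math import comb

def hypergeom_tail(n_genes, n_hits, pathway_size, overlap):
    """P(X >= overlap) when pathway_size genes are drawn without replacement
    from n_genes genes, n_hits of which pass the significance cutoff."""
    upper = min(pathway_size, n_hits)
    favourable = sum(
        comb(n_hits, k) * comb(n_genes - n_hits, pathway_size - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / comb(n_genes, pathway_size)

def binary_enrichment(gene_pvalues, pathway, cutoff):
    """Binarise gene p-values at `cutoff`, then test the pathway for excess hits."""
    hits = {gene for gene, p in gene_pvalues.items() if p <= cutoff}
    overlap = len(hits & set(pathway))
    return hypergeom_tail(len(gene_pvalues), len(hits), len(pathway), overlap)

# Hypothetical toy data: 10 scored genes and one pathway of 4 genes.
gene_pvalues = {"g1": 0.001, "g2": 0.004, "g3": 0.03, "g4": 0.04, "g5": 0.2,
                "g6": 0.3, "g7": 0.5, "g8": 0.6, "g9": 0.8, "g10": 0.9}
pathway = ["g1", "g3", "g4", "g9"]

for cutoff in (0.01, 0.05):
    print(cutoff, binary_enrichment(gene_pvalues, pathway, cutoff))
```

The same pathway receives quite different enrichment p-values at the two cutoffs, which is the kind of arbitrariness a parameter-free procedure avoids.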

== Tissue-specific regulatory circuits disrupted in complex disease ==

[[File:Pascal_network_analysis.png|260px|right]] '''The efficiency and accuracy of Pascal open the door to large-scale analyses''' that would not have been possible with previous tools. For example, summarizing SNP p-values at the level of genes is a crucial step in most network-based GWAS analysis methods. Pascal was key for our recent work, where we integrated '''37 GWAS datasets''' with close to '''400 tissue-specific gene regulatory circuits''' to systematically analyze the inter-connectivity of genes that are perturbed by trait-associated genetic variants. This study showed that disease-associated genetic variants often disturb regulatory modules in cell types or tissues that are highly specific to that disease, giving new insights into disease mechanisms.


  • '''Tissue-specific regulatory circuits reveal variable modular perturbations across complex diseases'''. ('''[http://compbio.mit.edu/publications/144_Marbach_NatureMethods_16.pdf PDF]''')
    Marbach D, Lamparter D, Quon G, Kellis M, Kutalik Z, and Bergmann S.
    [http://www.nature.com/nmeth/journal/v13/n4/abs/nmeth.3799.html ''Nature Methods'', 13, 366-370, 2016].
    • [http://dx.doi.org/10.1038/nrg.2016.36 ''Nature Reviews Genetics'' highlight], [http://regulatorycircuits.org/papers/PR_genes_social_network_en_fr.pdf SIB news], [https://www.unil.ch/getactu/wwwfbm/1457426312508/ UNIL news]
    • [http://regulatorycircuits.org regulatorycircuits.org]
2 Feb 2016 — 17:02
» Genome Wide Association Studies


First GWAS on Drosophila height published

We recently collaborated with the Hafen group in Zurich on a project to identify natural variants impacting size in Drosophila. We found an association in the kek1 locus, a well-characterized growth regulator. Additionally, 33 novel loci were validated. The paper is available in PLoS Genetics.

11 Jan 2016

== Introduction ==

Genome Wide Association Studies (GWAS) search for correlations between genetic markers (usually Single Nucleotide Polymorphisms, or SNPs for short) and any measurable trait in a population of individuals. The motivation is that such associations could provide new candidates for causal variants in genes (or their regulatory elements) that play a role in the phenotype of interest. In the clinical context this may eventually lead to a better understanding of the genetic components of diseases and their risk factors.

Our current focus is on the Cohorte Lausannoise (CoLaus), a population-based sample of more than 6'000 individuals from the Lausanne area. The CoLaus phenotypic dataset includes a large range of measurements, including extensive blood chemistry, anatomic and physiological measures, as well as parameters related to lifestyle and history. Genotypes have been measured for ~500'000 SNPs using Affymetrix 500k SNP arrays. Regressing the various phenotypes onto these SNPs has already revealed a number of highly significant associations (see our [[publications]]).

Current GWAS usually include the following steps:

  • genotype calling from the raw chip-data and basic quality control
  • principal component analysis (PCA) to detect and possibly correct for population stratification
  • genotype imputation (using linkage disequilibrium information from HapMap)
  • testing for association between a single SNP and continuous or categorical phenotypes
  • global significance analysis and correction for multiple testing
  • data presentation (e.g. using quantile-quantile and Manhattan plots)
  • cross-replication and meta-analysis for integration of association data from multiple studies
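The single-SNP association and multiple-testing steps in the list above can be sketched as follows. This is a minimal illustration on simulated data, not the pipeline used for CoLaus: the function names, the simulated effect size and the allele frequency are hypothetical, no covariates are included, and the p-value uses a normal approximation that is only reasonable for large samples.

```python
import math
import random

def snp_association(genotypes, phenotype):
    """Regress a continuous phenotype on allele counts (0/1/2) for one SNP.

    Returns (effect size, standard error, two-sided p-value).
    """
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotype) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, phenotype))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_g
    rss = sum((y - intercept - beta * g) ** 2 for g, y in zip(genotypes, phenotype))
    se = math.sqrt(rss / (n - 2) / sxx)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided, normal approximation
    return beta, se, p_value

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold controlling the family-wise error rate."""
    return alpha / n_tests

def simulate(n, maf, effect, seed=0):
    """Simulate allele counts and a phenotype with an additive effect (toy data)."""
    rng = random.Random(seed)
    genotypes = [(rng.random() < maf) + (rng.random() < maf) for _ in range(n)]
    phenotype = [effect * g + rng.gauss(0.0, 1.0) for g in genotypes]
    return genotypes, phenotype

genotypes, phenotype = simulate(n=2000, maf=0.3, effect=0.5)
beta, se, p_value = snp_association(genotypes, phenotype)
threshold = bonferroni_threshold(0.05, 500_000)  # roughly one test per SNP on a 500k array
print(beta, se, p_value, p_value < threshold)
```

For about 500'000 tested SNPs the Bonferroni threshold is 1e-7, which is why only very strong associations are reported as genome-wide significant.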

From the many GWAS performed in recent years it has become apparent that even well-powered (meta-)studies with many thousands (and even tens of thousands) of samples could at best identify a few (dozen) candidate loci with highly significant associations. While many of these associations have been replicated in independent studies, each locus explains but a tiny (<1%) fraction of the genetic variance of the phenotype (as predicted from twin-studies). Remarkably, models that pool all significant loci into a single predictive scheme still miss out by at least one order of magnitude in explained variance. Thus, while GWAS already today provide new candidates for disease-associated genes and potential drug targets, very few of the currently identified (sets of) genotypic markers are of any practical use for assessing the risk of predisposition to any of the complex diseases that have been studied.
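To make the orders of magnitude concrete, the variance explained by a single SNP under an additive model and Hardy-Weinberg equilibrium is 2·MAF·(1−MAF)·β² divided by the trait variance. The sketch below uses purely hypothetical numbers (allele frequency, effect size and number of loci are assumptions, not values from any study) and assumes the loci are independent so that their contributions add up.

```python
def variance_explained(maf, beta, trait_variance=1.0):
    """Fraction of trait variance explained by one additive SNP (Hardy-Weinberg assumed)."""
    return 2.0 * maf * (1.0 - maf) * beta ** 2 / trait_variance

# Hypothetical locus: minor allele frequency 0.3, effect of 0.1 trait standard deviations.
single = variance_explained(maf=0.3, beta=0.1)

# Hypothetical pooled model: 30 independent loci of the same size.
pooled = 30 * single

print(f"one locus: {single:.2%}, thirty loci: {pooled:.2%}")
```

Even a few dozen loci of this size jointly explain only a modest share of the trait variance.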

Various solutions to this apparent enigma have been proposed. First, it is important to realize that the expected heritabilities have usually been estimated from twin-studies, often several decades ago. It has been argued that these estimates entail problems of their own (independently raised twins shared a common prenatal environment and may have undergone intrauterine competition, etc.).

Second, the genotypic information is still incomplete. Most analyses used microarrays probing only around half a million SNPs, which is almost one order of magnitude fewer than the current estimate of about 4 million common variants from the HapMap CEU panel. While many of these SNPs can be imputed accurately using information on linkage disequilibrium, there still remains a significant fraction of SNPs that are poorly tagged by the measured SNPs. Furthermore, rare variants with a Minor Allele Frequency (MAF) of less than 1% are not assessed at all by SNP chips, but may nevertheless be the causal agents for many phenotypes. Finally, other genetic variants like Copy Number Variations (CNVs) (or even epigenetics) may also play an important role.

Third, it is important to realize that current analyses usually only employ additive models considering one SNP at a time with few, if any, covariates, such as sex, age and principal components reflecting population substructure. This obviously only covers a small set of all possible interactions between genetic variants and the environment. Even more challenging is taking into account purely genetic interactions, since the number of all possible pairwise interactions already scales as the square of the number of genetic markers.
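A quick calculation shows how fast the pairwise search space grows. The sketch below takes the roughly 500'000 SNPs of the genotyping array mentioned earlier as the marker count; the 0.05 family-wise error level is an assumption chosen for illustration.

```python
from math import comb

n_snps = 500_000

# Number of distinct unordered SNP pairs: n * (n - 1) / 2.
n_pairs = comb(n_snps, 2)

# Bonferroni-style per-test thresholds at a family-wise level of 0.05.
single_snp_threshold = 0.05 / n_snps
pairwise_threshold = 0.05 / n_pairs

print(f"{n_pairs:,} pairs")
print(f"single-SNP threshold: {single_snp_threshold:.1e}")
print(f"pairwise threshold:   {pairwise_threshold:.1e}")
```

Testing every pair therefore means on the order of 10^11 tests, with a correspondingly stricter significance threshold than the single-SNP scan.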

== More Advanced Statistical Methodology ==

An important and widely used approach to dealing with cryptic population structure is described in [PricePC]; key references on genotype imputation are [ServinImputation] and [MarchiniImputation].

A powerful approach to dealing with strain structure or relatedness between individuals is described in [KangEMMA].

== Software ==

[http://pngu.mgh.harvard.edu/~purcell/plink PLINK] is an excellent data handling tool, and implements many useful statistical methods. It's the Swiss Army Knife for GWAS.

[http://genepath.med.harvard.edu/~reich/Software.htm EIGENSOFT] is widely used for population structure analysis and correction.

[http://www.stats.ox.ac.uk/%7Emarchini/software/gwas/gwas.html IMPUTE and SNPTEST], or [http://www.sph.umich.edu/csg/abecasis/mach MACH] and [http://mga.bionet.nsc.ru/%7Eyurii/ABEL ProbABEL], or [http://stephenslab.uchicago.edu/software.html BimBam], can all be used to perform more sophisticated model-based genotype imputation and association testing.

[http://toby.freeshell.org/software/quicktest.shtml QUICKTEST] is our own software for association testing using uncertain genotypes. For quantitative trait analysis, we think it is faster and better than SNPTEST.
2 Feb 2016 — 17:02
» Robust gradient formation

Robust gradient formation through intermolecular phosphorylation


Together with the lab of Sophie Martin at DMF, we showed that the intracellular gradient of Pom1 in fission yeast achieves robustness to fluctuations through intermolecular auto-phosphorylation. Gradient robustness, i.e. how a molecular gradient can convey precise positional information despite large fluctuations in molecular dynamics, has been the subject of many conjectures in the last decades. In particular, it was hypothesized in 2003 that such robustness could be achieved by super-linear decay. In this work we show that in the Pom1 gradient, super-linear decay is obtained by a very simple and elegant mechanism, namely intermolecular auto-phosphorylation. This provides a first telling example of gradient robustness achieved by super-linear decay through auto-catalysis, which could be a widespread phenomenon. The paper is available in Molecular Systems Biology.


[[Image:Pombe pom1 gradient model.png|Qualitative model of Pom1 gradient function |thumb| 400px]] Fission yeast cells are rod-shaped cells that elongate until they reach a certain size (about 14 microns) and then split into two cells of equal size. This seems like a simple thing to do, but it raises the question of how the cell knows that it has reached the right size and where its middle is. A few years ago, our collaborator Sophie Martin suggested a model that could explain those two processes, namely the timing (when to divide) and the positioning (where to divide) of cell division. According to that model (see Fig. on the right) the Pom1 kinase is constantly brought to the two poles of the cell and diffuses along the cortex of the cell. While it diffuses, it also detaches from the cortex into the cytoplasm, such that the concentration of Pom1 at the cortex decreases towards the cell middle. In other words, it forms a gradient, or more precisely a double gradient, one from each pole of the cell. These gradients can be used as rulers, because the further away from the pole, the lower the concentration of Pom1. Indeed, a molecular mechanism that is inhibited by Pom1 could only take place when Pom1 is at sufficiently low concentration, that is when the cell is long enough, and also only in the cell middle, where Pom1 concentration is at its lowest. This molecular mechanism is implemented by the Cdr2 kinase, which triggers mitotic entry unless it is inactivated by Pom1 phosphorylation.

[[Image:Power gradient.png| Top: a standard gradient with linear decay. Variability in amplitude is carried over and translates into variability in positional information (L1 and L2). Bottom: a power gradient with super-linear decay. Variability in amplitude is buffered, resulting in much more precise positional information | thumb| 250px]] This is all nice (but still somewhat debated), but there is one issue. The amount of Pom1 that is brought to the cell tips is highly variable between cells and even within cells, such that the mechanism described above should be very imprecise. Indeed, if the Pom1 concentration at the pole varies a lot, this variation will be carried over to the cell middle, such that the Pom1 concentration will not be able to unambiguously indicate the distance to the pole. However, our collaborators had previously shown that Pom1 auto-phosphorylates multiple times, and that the affinity of Pom1 decreases with each phosphorylation. And this is where math comes in handy. Using a set of Partial Differential Equations (PDEs) we were able to show that if Pom1 auto-phosphorylation is intermolecular, i.e., a molecule of Pom1 can phosphorylate another molecule of Pom1 (not just itself), then the Pom1 gradient becomes a so-called power gradient. Power gradients have the special property that variations in concentration at one point of the gradient are attenuated and do not fully carry over downstream of the gradient (see Figure). Power gradients were hypothesized more than a decade ago as an abstract concept, but were not clearly observed in nature until now. Our paper shows theoretically that intermolecular phosphorylation coupled to a phosphorylation-dependent detachment (or decay) is a way to implement a power gradient.
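The buffering property can be illustrated with a deliberately simplified toy model, which is an assumption made here for illustration and not the full set of PDEs from the paper. With linear decay, the steady state of D·c'' = k·c is the exponential c(x) = c0·exp(−x/λ) with λ = sqrt(D/k). With quadratic decay, the steady state of D·c'' = k·c² on a half-line is the power gradient c(x) = A/(x + x0)² with A = 6D/k and x0 = sqrt(A/c0). The sketch compares how far a concentration threshold moves when the amplitude c0 at the pole doubles; all parameter values are arbitrary.

```python
import math

def exponential_position(c0, threshold, decay_length):
    """Position where c(x) = c0 * exp(-x / decay_length) falls to `threshold`."""
    return decay_length * math.log(c0 / threshold)

def power_position(c0, threshold, amplitude):
    """Position where c(x) = amplitude / (x + x0)**2 falls to `threshold`,
    with x0 = sqrt(amplitude / c0) chosen so that c(0) = c0."""
    return math.sqrt(amplitude / threshold) - math.sqrt(amplitude / c0)

# Arbitrary illustrative parameters (not fitted to any data).
threshold = 0.01
decay_length = 1.0   # lambda = sqrt(D / k) for linear decay
amplitude = 1.0      # A = 6 * D / k for quadratic decay

shift_exponential = (exponential_position(2.0, threshold, decay_length)
                     - exponential_position(1.0, threshold, decay_length))
shift_power = (power_position(2.0, threshold, amplitude)
               - power_position(1.0, threshold, amplitude))

print(f"threshold shift, exponential gradient: {shift_exponential:.3f}")
print(f"threshold shift, power gradient:       {shift_power:.3f}")
```

In the exponential case the shift is λ·ln 2 regardless of how large c0 already is, whereas in the power gradient it shrinks as c0 grows, because far from the pole c(x) approaches A/x² independently of c0. In this toy model the offset x0 scales with the pole amplitude to the power −0.5.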

But there is more to it. Since it is possible to find an analytical solution to our PDEs, we were able to extract three quantitative constitutive relations characteristic of our power gradient. Those constitutive relations are the following power laws:

  • A power law of -0.5 between the length scale of the gradient and the amplitude of Pom1 at the pole.
  • A power law of 2/3 between the amplitude of Pom1 and the amplitude of Tea4 at the pole
  • A power law of 0.5 between the total cortical Pom1 and the amplitude of Pom1 at the pole

Those numbers do not depend on any particular parameters; they are constitutive relations of our system. We then checked whether we could observe those values experimentally by quantifying the concentration of Pom1 along the cortex of many cells. And, lo and behold, this is exactly what we observed: -0.52 for the first one, 0.63 for the second one and 0.46 for the third one (see Figures below). It is quite rare in biology that you can predict actual non-trivial numbers irrespective of any parameters, and we do not think it is possible to find a different reasonable model that would account for those numbers. We thus believe that this work reveals the power of analytical computational models used with experimental data to validate biological hypotheses. Indeed, there has been a growing concern among biologists that you can make many computational models say almost anything with suitable sets of parameters. Here we provide an example of a computational model that makes quantitative and unambiguous predictions independently of the model parameters. The fact that those predictions precisely match our observations strongly supports the model. In our case, the model was further validated by showing in vivo and in vitro that Pom1 auto-phosphorylates intermolecularly.

[[Image:Pom1 power laws.png| The power laws translate into lines in the log-log space. The power laws predicted by the model (red lines) closely match the experimental data (black dots and black regression lines). Left: For the relationship between the gradient decay length and the Pom1 amplitude at the pole, the model predicts a -0.5 power law (red line), which is very close to the observed power law of -0.52 (black line). Right: For the relationship between Pom1 and Tea4 amplitude at the pole, the model predicts a 2/3 power law (red line), which is very close to the observed power law of 0.63 (black line). |thumb| left| 500px]]

[[Image:Pom1_cortical.png| The power law of 0.5 predicted by the model (red line) between the total cortical Pom1 and the Pom1 amplitude at the pole closely matches the experimentally observed power law of 0.46 (black dots and black regression line). |thumb| 350px]]
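Since a power law y = a·x^b is a straight line in log-log space, its exponent can be estimated as the slope of an ordinary least-squares regression of log y on log x. The sketch below shows this on synthetic, noise-free data; the function name and the data are hypothetical and this is not the analysis code released with the paper.

```python
import math

def fit_power_law(x, y):
    """Least-squares fit of log(y) = log(a) + b * log(x); returns (a, b)."""
    log_x = [math.log(v) for v in x]
    log_y = [math.log(v) for v in y]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    exponent = sxy / sxx
    prefactor = math.exp(mean_y - exponent * mean_x)
    return prefactor, exponent

# Synthetic example: decay length proportional to amplitude ** -0.5.
pole_amplitude = [1.0, 2.0, 4.0, 8.0, 16.0]
decay_length = [3.0 * a ** -0.5 for a in pole_amplitude]

prefactor, exponent = fit_power_law(pole_amplitude, decay_length)
print(prefactor, exponent)
```

On real measurements the fitted exponent comes with sampling noise, so it should be compared to the predicted value together with a confidence interval.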

More details are available in the paper and the supplement. The data and code for the analysis are also freely available.

14 Jul 2015 — 12:07
» Tim

The proton pump plays a crucial role in the phototropic response of Arabidopsis

Plants such as Arabidopsis thaliana orient towards the light, thus optimizing their access to this source of energy. This so-called phototropic response is mediated by the formation of a gradient of the plant growth hormone auxin. Using computational models validated by biological experiments, we showed, in collaboration with the group of Christian Fankhauser from the CIG at UNIL, that the proton pump plays a crucial role in the establishment of this gradient and that this pump is regulated by the plant's photoreceptors. The paper has just been published and is available in Molecular Systems Biology.

1 Oct 2014

[[File:timCBG.jpg|200px|thumb|right|Tim Hohm, Postdoc]]

Tim Hohm is a postdoc at the [[Main_Page|''Computational Biology Group'']] in the [http://www.unil.ch/dgm Department of Medical Genetics] at the [http://www.unil.ch University of Lausanne]. He received his PhD from [http://www.ethz.ch ETH Zurich] in 2009, developing techniques for parameter estimation on gene regulatory networks for multi-cell systems. He is now involved in the [http://www.systemsx.ch/index.php?id=150 Plant Growth] project from [http://www.Systemsx.ch SystemsX.ch], investigating gene regulatory networks responsible for [[Phototropism_in_Arabidopsis|phototropism in ''Arabidopsis thaliana'']].

== Contact Details ==

  • address: Rue du Bugnon 27 - BU 01 116 - CH-1005 Lausanne - Switzerland
  • email: tim.hohm at unil.ch
  • phone: +41 - 21 - 692 - 53 78

== Publications ==

=== 2013 ===

  1. phbf2013 pmid=24076239
  2. hpf2012 pmid=23152332

=== 2010 ===

  1. hzs2010a pmid=20169148
  2. wbhb2010a Woehrle M, Brockhoff D, Hohm T, and Bleuler S. ''Investigating Coverage and Connectivity Trade-offs in Wireless Sensor Networks: The Benefits of MOEAs''. In Ehrgott M et al., editors, Multiple Criteria Decision Making for Sustainable Energy and Transportation Systems (MCDM 2008), volume 634 of LNEMS, pages 211–221, Heidelberg, Germany, 2010. Springer. [http://dx.doi.org/10.1007/978-3-642-04045-0_18 doi]
  3. baha2010a Brockhoff D, Auger A, Arnold DV, and Hohm T. ''Mirrored Sampling and Sequential Selection for Evolution Strategies''. In Schaefer R et al., editors, Parallel Problem Solving from Nature (PPSN XI), volume 6238 of LNCS, pages 11–21, Heidelberg, Germany, 2010. Springer. [http://dx.doi.org/10.1007/978-3-642-15844-5_2 doi]
  4. hz2010a pmid=20851739

=== 2009 ===

  1. hz2009b Hohm T and Zitzler E. ''A Multiobjective Evolutionary Algorithm for Numerical Parameter Space Characterization of Reaction Diffusion Systems''. In Kadirkamanathan V et al., editors, International Conference on Pattern Recognition in Bioinformatics (PRIB 2009), volume 5780 of LNBI, pages 162–174, Heidelberg, Germany, 2009. Springer. [http://dx.doi.org/10.1007/978-3-642-04031-3_15 doi]
  2. hz2009c pmid=19622425
  3. hz2009a Hohm T and Zitzler E. ''Multiobjectivization for parameter estimation: a case-study on the segment polarity network of drosophila''. In Rothlauf F et al., editors, GECCO '09: Genetic and Evolutionary Computation Conference (GECCO 2009), pages 209–216, New York, NY, USA, 2009. ACM. [http://dx.doi.org/10.1145/1569901.1569931 doi]

=== 2008 ===

  1. hegb2008a Hohm T, Egli M, Gaehwiler S, Bleuler S, Feller J, Frick D, Huber R, Karlsson M, Lingenhag R, Ruetimann T, Sasse T, Steiner T, Stocker J, and Zitzler E. ''An Evolutionary Algorithm for the Block Stacking Problem''. In Monmarché N et al., editors, Evolution Artificielle 2007, volume 4926 of LNCS, pages 111–122, Berlin, Germany, 2008. Springer. [http://dx.doi.org/10.1007/978-3-540-79305-2_10 doi]
  2. ghh2008a pmid=18284690

=== 2007 ===

  1. hz2007a Hohm T and Zitzler E. ''Modeling the Shoot Apical Meristem in A. thaliana: Parameter Estimation for Spatial Pattern Formation''. In Marchiori E et al., editors, Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics (evoBIO 2007), volume 4447 of LNCS, pages 102–113. Springer, 2007. [http://dx.doi.org/10.1007/978-3-540-71783-6_10 doi]

=== 2006 ===

  1. hlh2006a pmid=16472025

=== 2005 ===

  1. hh2005a Hohm T and Hoffmann D. ''A multi-objective evolutionary approach to peptide structure redesign and stabilization''. In Beyer H-G and O'Reilly U-M, editors, Genetic and Evolutionary Computation Conference (GECCO 2005), pages 423–429, New York, NY, USA, 2005. ACM Press. [http://dx.doi.org/10.1145/1068009.1068077 doi]
1 Oct 2014 — 17:10
» Regulatory network of the shade avoidance

Adaptive hormonal signalling for the plant shade avoidance response


In joint work with the lab of Christian Fankhauser at CIG, UNIL, we showed that plants adapt their hormonal signal to the availability of resources when avoiding shade. If resources are scarce, the signal is weaker but the sensitivity is enhanced, whereas when resources are abundant, a stronger and more robust signal is produced. Our study, which thus suggests that the plant optimizes a signal cost-to-robustness trade-off, has just been published in PNAS.


[[Image:ShadeAvoidanceCartoon.jpg|Metaphorical illustration of the theory: Plants with direct access to sunlight (left) have more energy and can send a stronger signal, whereas shaded plants (right) have less energy and thus send a weaker signal requiring increased sensitivity. Drawn by Miguel Giraldo. | thumb|250px]] When perceiving neighboring plants, Arabidopsis seedlings send a hormonal signal triggering stem elongation in order to secure access to unfiltered sunlight. This happens when they are already shaded by the neighboring plants, but also before those neighbors are shading them. In both cases, the plants send a growth hormone, auxin, from the leaves to the stem, leading to its elongation. However, the availability of resources is very different in the two cases. When the plant is already shaded, its light resources are limited, which is not the case when the plant is in the sun. We thus compared auxin signalling in those two situations and found that in true shade a weaker signal is sent, but the sensitivity to auxin is enhanced. In contrast, when light is abundant, more auxin is produced, leading to a stronger signal and reduced sensitivity. As a result, the signal is indeed more robust when resources are abundant, hinting at a cost-to-robustness trade-off, which lies at the heart of information theory.

This work is the result of a collaboration with the lab of Christian Fankhauser at CIG, UNIL, within the framework of the SystemsX.ch project "Plant Growth in Changing Environment".

The [[Media:RegNet.tar.gz | source code]] used for this research is available under GPL. You can also listen to a short [https://www.rts.ch/la-1ere/programmes/cqfd/5737053-cqfd-du-15-04-2014.html interview] (in French) about this work on the Swiss Radio.

15 Apr 2014 — 12:04
» Gender bias in neurodevelopmental disorders


Females are more resilient to mutations causing neuro-developmental disorders


In a study initiated by Sébastien Jacquemont from the Service of Medical Genetics in collaboration with the group of Evan Eichler, we investigated the CNV burden of autistic patients and close relatives. We could show that females diagnosed with autism have on average more deleterious mutations in genes involved in neuro-developmental disorders than males, hinting that women can cope with a higher mutational burden than men. Moreover, most of the deleterious mutations in genes important for brain function are transmitted by the non-affected mothers, showing that they can tolerate more mutations than the fathers. This study was published in the American Journal of Human Genetics and featured in the French newspaper Le Figaro and in the Economist. The paper received the neuroscience award from the French scientific magazine La Recherche. This yearly award recognizes outstanding papers published by French-speaking scientists in 12 different disciplines ranging from archeology to engineering.

   23 Oct 2015 — 12:40

It has long been observed that males are more affected by neuro-developmental disorders such as autism than females. In a study initiated by Sébastien Jacquemont from the Service of Medical Genetics in collaboration with the group of Evan Eichler from the University of Washington, we investigated the CNV burden of autistic patients and close relatives. We could show that females diagnosed with autism have on average more deleterious mutations in genes involved in neuro-developmental disorders than males, hinting that women can cope with a higher mutational burden than men. With the same mutational burden, males tend to be diagnosed more often than females. This is unlikely to be due to social factors only, because most of the deleterious mutations in genes important for brain function are transmitted by the non-affected mothers, showing that they can tolerate more mutations than the non-affected fathers. This study was published in the [http://www.cell.com/AJHG/retrieve/pii/S0002929714000597 American Journal of Human Genetics] and featured in the French newspaper [http://sante.lefigaro.fr/actualite/2014/02/27/22048-filles-mieux-armees-face-lautisme Le Figaro] and in the [http://www.economist.com/news/science-and-technology/21597877-women-have-fewer-cognitive-disorders-men-do-because-their-bodies-are-better Economist].

5 Mar 2014 — 17:03
» Augmenting genomics through metabolomics
In a metabolome-wide genome-wide association study (MWGWAS) on the CoLaus cohort, we found two novel gene-metabolite associations, with both gene-metabolite pairs additionally linked to clinical phenotypes. For this "untargeted" MWGWAS, we used metabolic features, rather than metabolite concentrations, as phenotypes, and developed a metabolite identification method based on genetic association signals. Details of the method, and future progress, can be found on the metabomatching page. The paper has been published in PLOS Genetics.
21 Feb 2014 — 16:02
» Cellophane

New insights into cell cycle regulation through semi-automated quantification of protein gradients

In collaboration with the group of Sophie Martin from DMF (UNIL), we developed Cellophane, an ImageJ plugin that semi-automatically quantifies fluorescent protein concentration profiles along the cell cortex. This plugin enabled the quantification of hundreds of profiles of two key regulators of the fission yeast cell cycle, Pom1 and Cdr2. The data analysis, along with other experimental evidence, showed that two important functions of Pom1, deciding when and where to divide, require distinct levels of Pom1. Lower Pom1 levels are sufficient for division positioning, but higher levels are required to delay mitotic entry until the proper size is reached. The paper has been published in Cell Cycle.

6 Dec 2013



== Introduction ==

Cellophane is an ImageJ plugin for the semi-automated quantification of a protein profile along the cell membrane. It allows for quantification along two channels. The plugin also has a manual mode, which has a lower throughput but allows for higher precision.

[[Image:PicCellophane1.jpg| thumb | screenshot | 350px]]

== Download ==

The plugin is available under a GPL license and the code is available [[Media:Cellophane.zip | here]]. This zip file contains two .java files with the semi-automated and manual versions of Cellophane, as well as the necessary Java library file.

You can also download the short [[Media:CellophaneManual.pdf | manual]]. If you have additional questions, you can contact [[User:Micha | Micha]]

== Credits ==

This plugin was developed by Micha Hersch in collaboration with the lab of Sophie Martin at the University of Lausanne, in particular with Olivier Hachet. Sascha Dalessi helped test the software. It uses the [http://www.imagescience.org/meijering/software/ imagescience] library written by Erik Meijering. If you use Cellophane for your published work, please acknowledge it by citing the following paper:

9 Dec 2013 — 15:12
» Phototropism in Arabidopsis

Localizing phototropism in Arabidopsis


Together with the group of Christian Fankhauser from the CIG at UNIL, CBG post-doc Tim Hohm showed that the site of light perception for phototropism is located in the upper hypocotyl, where asymmetric elongation occurs. Thus, in contrast to monocots, where a phototropism signal is sent from the leaves to the stem, in Arabidopsis it all happens "on site". The paper has been published in Current Biology. 27 Sept 2013


== Introduction ==

Being sessile organisms, plants possess various mechanisms to react to different and changing environmental stimuli. One of these mechanisms allows plants to adjust their growth direction to the direction of incoming blue light. This ''phototropic response'' involves sensing of light by photoreceptors, here mainly the membrane-associated proteins phot1 and phot2 (kk2006a), redirection of the flux of the hormone auxin (bbpm2004a, etls2006a, nbps2003a, pbbm2004a), as well as other downstream signaling events (ddri2010a, flds2003a, ikmn2008a, hs2007a, lshp2006a, week2008a). Although these key players in phototropism in ''Arabidopsis thaliana'' are known, the details of their interactions remain unclear.

The current view on phototropism can be summarized as follows: phototropism is a process initiated by blue light, and its response depends on the fluence rate. For simplicity, only low fluence rates of at most 0.1 μmol m<sup>-2</sup> s<sup>-1</sup> are considered here, a scenario in which the phototropic response depends mostly on the activity of the photoreceptor phot1. Under these fluence conditions, the second receptor of the same family, phot2, can be neglected. In addition, the two cryptochromes cry1 and cry2 have a mild effect on phototropism (kk2006a) but are not further considered here.

== Open Questions ==

During phototropism, a lateral auxin gradient forms with its maximum on the shaded side, which raises the question of how such a gradient is established. It is of special interest why the maximum of the gradient is located on the shaded side, since the original blue light stimulus is applied to the opposite side and photoactivation seems to correlate positively with fluence. One could argue that a tissue like a dark-grown hypocotyl (with a diameter of about 250 μm) hardly absorbs any light, but then one would need to ask why a gradient forms at all.

In the course of this project, we plan to investigate this gradient formation using both experimental techniques and computational modeling, in collaboration with the groups of [http://www.botany.unibe.ch/associated/systemsx/index.php Richard Smith] and [http://www.unil.ch/cig/page8391.html Christian Fankhauser] as part of the [http://www.systemsx.ch/index.php?id=150 Plant Growth] project from [http://www.Systemsx.ch SystemsX.ch].
27 Sep 2013 — 10:09
» HypoPhen

CBG's high-throughput plant phenotyping software HypoPhen helps understand phototropism in plants

In collaboration with the group of Christian Fankhauser at CIG, UNIL, we developed the HypoPhen software for the high-throughput quantification of seedling elongation and bending from time-lapse images. Using this tool, hundreds of Arabidopsis seedlings were measured to show that phytochrome A in the nucleus is important for phototropism. The results were published in Plant Cell on February 28, 2012.


== Introduction ==

Hypophen is an open-source software enabling the semi-automatic phenotyping of growing seedlings from time-lapse images. More precisely, it computes and records the elongation and bending of the seedlings. It is semi-automatic in the sense that manual calibration, verification and adjustments are sometimes needed. In my experience, it allows a throughput of about 50 images of 20 hypocotyls in about 10 minutes, given a reasonable image quality.

The software was developed within the context described in the following paper:


[[Image:HypoPhen_screenshot1.jpg| thumb | screenshot | 350px]]

=== Movies ===

Here is a [[Media:HypoPhen_movie1.avi | movie]] showing in real time an excerpt from the semi-automated processing of images of 14 hypocotyls. In those 20 seconds, five frames (70 hypocotyls) are processed. Here is another [[Media:HypoPhen_movie2.avi | movie]] showing the analysis from scratch of the 12 example images of 5 hypocotyls, including the manual calibration procedure.

== Prerequisite ==

Hypophen works on '''Linux''', '''Mac OS X''' version 10.6 or later and '''Windows'''.

For Linux, it needs the [http://opencv.willowgarage.com/wiki/ OpenCV] library (free) and must be compiled from source. The use of [http://www.cmake.org CMake] (also free) makes this rather straightforward. You also need a two-button mouse to use the software. More details are given in the manual.

Windows and Mac users can directly download the executable file and launch it from the command prompt.

== Download ==

You can download the [[Media:Hypophen.tar.gz | C++ source code]] (version 0.4). The latest code is available on [https://sourceforge.net/p/hypophen sourceforge].

The Windows executable of version 0.4 (along with the required DLLs) for Windows 7 is [[Media:HypoPhen_win.zip | here]]. Note that this executable is for the standard x86 (32-bit) architecture. Let [[User:Micha | Micha]] know if you need it for a 64-bit architecture.

There is also a [[Media:HypoPhen.dmg.zip | dmg file]] for MacOS (version 0.4).

The [[Media:HypoPhenManual.pdf | manual ]] explains how to install and use the software.

You can also download a small set of hypocotyl [[Media:HypoPhenTestImages.zip | test images]] to test the software. Those images were kindly provided by Emilie Demarsy.

== Quick testing ==

For '''Mac OS''', you can download the [[Media:HypoPhen.dmg.zip | dmg file]] and the [[Media:HypoPhenTestImages.zip | test images]]. Unzip the test images and put the "images" folder in the hypoPhen.app/Contents/MacOS/ folder (yes, you have to enter the hypoPhen application folder). Double-clicking the hypoPhen file in this same folder launches the software on the test images. Refer to the manual for usage and more detailed instructions.

For '''Windows 7''', you can download the [[Media:HypoPhen_win.zip | Windows executable]] and the [[Media:HypoPhenTestImages.zip | test images]]. Unzip both files and put the "images" folder in the "HypoPhen_win" folder. Double-clicking the hypoPhen file in this same folder launches the software on the test images. Refer to the manual for usage and more detailed instructions.

== Related software ==

Software with similar goals includes [http://brie.cshl.edu/~liyawang/HYPOTrace HypoTrace] and [http://cactus.salk.edu/hyde/ HyDe], both of which are MATLAB-based and not open-source.

== Benefiting and contributing ==

This software is provided "as is", in the hope that it will be useful but without any warranty of any kind. If you use this software for research purposes, please be kind enough to mention it in your scientific publications by citing the [http://www.plantcell.org/cgi/content/short/tpc.111.095083?keytype=ref&ijkey=krNVCQ5WJMpbgHV Plant Cell paper] above. If you find a bug, have a problem with the installation or would like to contribute to the further development of this software, please write an email to [[User:Micha|Micha Hersch]]. The project is also available on [https://sourceforge.net/p/hypophen sourceforge].

== Credits ==

Hypophen was written by [[User:Micha|Micha Hersch]] in collaboration with Chitose Kami and Christian Fankhauser from the Center for Integrative Genomics at the University of Lausanne. The project was initiated with the help of Ioannis Xenarios from Vital-IT and [[User:Sven|Sven Bergmann]], head of the CBG. It uses the OpenCV library and some code written by Basilio Noris. Emilie Demarsy provided useful feedback and example images.

The development of this software was funded by [http://www.systemsx.ch SystemsX] through the [https://wiki.systemsx.ch/display/PGRTDproj/Home Plant Growth] project.

[[Image:Plantgrowthlogo.png| plant growth| thumb |100 px]] [[File:SystemsXlogo.png| SystemsX |250 px]]

20 Aug 2012 — 17:08
»  Novel hypertension susceptibility locus in the promoter region of eNOS

A genome-wide association study by the HYPERGENES Consortium identified a novel susceptibility locus for essential hypertension in the promoter region of the eNOS gene. The article appeared online in Hypertension on 19 December 2011.

20 Aug 2012 — 17:08
»  Modeling morphogen gradient formation

In recent work, we developed a general formalism for modeling diffusive gradient formation from an arbitrary source. This formalism applies to various diffusion problems, and we illustrate our theory with the explicit example of Bicoid gradient establishment in Drosophila embryos. The article appeared online in Journal of Theoretical Biology on 10 November 2011.

20 Aug 2012 — 17:08
» SIB Young Bioinformatician award 2010

On June 25 2010, during the 8th [BC]2 Computational Biology Conference in Basel, SIB Swiss Institute of Bioinformatics announced that CBG member [[Aitana Morton de Lachapelle]] is the winner of the SIB Young Bioinformatician Award 2010.

On June 25 2010, during the 8th [BC]2 [http://www.bc2.ch/ Computational Biology Conference] in Basel, [http://www.isb-sib.ch/ SIB Swiss Institute of Bioinformatics] announced the winner of the [http://www.isb-sib.ch/research/sib-awards/sib-2010-young-bioinformatician-award.html SIB Young Bioinformatician Award 2010].

[[Image:aitana_sib_award_2010.JPG|thumb|Aitana Morton de Lachapelle, Basel, June 25 2010|300px]]

The winner is SIB Member [[Aitana Morton de Lachapelle]], 27, PhD student in the Computational Biology Group led by Prof. Sven Bergmann at the Department of Medical Genetics of the University of Lausanne, which she joined after graduating in Physics from the EPFL (Swiss Federal Institute of Technology in Lausanne). During her PhD thesis, she has been investigating how robust pattern formation can be achieved during development.

Within a developing organism, cells need to know where they are in order to differentiate into the correct cell type. Pattern formation is the process by which cells acquire positional information and thus determine their fate. This can be achieved by the production and release of a diffusible signaling molecule, called a “morphogen”, which forms a concentration gradient: exposure to different morphogen levels then leads to different cell fates. Though morphogens have been known for decades, Mrs. Morton de Lachapelle explains that “it is not yet clear how these gradients form and yield such robust patterns. We have been investigating the properties of Bicoid and Decapentaplegic, two morphogens involved in the patterning of the anterior-posterior axis of Drosophila embryo and wing primordium, respectively”. In particular, she is interested in understanding how the pattern proportions are maintained across embryos of different sizes or within a growing tissue, which is essential to yield a correctly proportioned organism or organ. Ultimately, the general understanding of how cells respond to signals and coordinate their actions could bring new insights into some diseases like cancer and, theoretically, provide the ground to make artificial tissues.

[[Image:aitana_sven_sib_award_2010.JPG|thumb|Aitana Morton de Lachapelle & Sven Bergmann, Basel, June 25 2010|left|400px]]

In their published work, Mrs. Morton de Lachapelle and Prof. Bergmann investigated two systems properties of Drosophila early embryo development: using staining images for three gap genes and the pair-rule gene Eve, they investigated the precision and scaling of their expression domains. Their results suggest that these properties are, at least in part, already achieved at the level of the Bicoid gradient itself and then passed on to its target genes. Investigating models that can reproduce the position-dependent signatures of precision and scaling, they identified two necessary ingredients: it is essential to include nuclear trapping and an external pre-steady state morphogen gradient to achieve both maximal precision at mid-embryo and almost perfect scaling away from the source. Current work within the SystemsX.ch WingX collaboration aims at understanding how scaling can be achieved by the Decapentaplegic signaling pathway during wing imaginal disc growth.

The Young Bioinformatician Award is given yearly by SIB Swiss Institute of Bioinformatics. It recognises a graduate student or young researcher who has carried out a research project centered on the in silico analysis of biological sequences, structures and processes. The award is given competitively by a jury of experts and comes with a cash prize of CHF 10'000.
20 Aug 2012 — 17:08

ISA application note

A new application note about the '''isa''' and '''eisa''' packages and the Iterative Signature Algorithm has recently been published in Bioinformatics. 24 Jul 2010 — 17:49

[[image:expmat.png|An ISA transcription module|300px|right|link=ISA]]
Large sets of data, such as expression profiles from many samples, require analytic tools to reduce their complexity. The '''Iterative Signature Algorithm (ISA)''' was designed to reduce the complexity of very large data sets by decomposing them into so-called "modules". In the context of gene expression data, these modules consist of subsets of genes that exhibit a coherent expression profile only over a subset of microarray experiments. Genes and arrays may be attributed to multiple modules, and the level of required coherence can be varied, resulting in different "resolutions" of the modular mapping. Since the ISA does not rely on the computation of correlation matrices (as many other tools do), it is extremely fast even for very large datasets.
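The iteration behind a single module can be illustrated with a short Python sketch. This is a simplified illustration and not the implementation in the isa2 or eisa packages: it assumes a plain list-of-lists matrix, looks only for up-regulated genes, and uses hypothetical names (<code>isa_module</code>, <code>gene_thr</code>, <code>cond_thr</code>) chosen for this example. Starting from a seed gene set, it alternates between scoring conditions and scoring genes, keeps whatever exceeds a z-score threshold, and stops when the module no longer changes.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a list of numbers to zero mean and unit variance."""
    mu, sd = mean(values), pstdev(values)
    return [0.0 if sd == 0 else (v - mu) / sd for v in values]


def isa_module(matrix, seed_genes, gene_thr=1.0, cond_thr=1.0, max_iter=100):
    """Refine a non-empty seed gene set into a (genes, conditions) module.

    matrix[g][c] is the expression of gene g in condition c.
    Thresholds are in units of standard deviations (an assumption of
    this sketch); raising them yields smaller, more coherent modules.
    """
    n_genes, n_conds = len(matrix), len(matrix[0])
    genes, conds = set(seed_genes), set()
    for _ in range(max_iter):
        # Score each condition by the mean expression of the current genes.
        cond_scores = zscores(
            [mean(matrix[g][c] for g in genes) for c in range(n_conds)]
        )
        new_conds = {c for c, s in enumerate(cond_scores) if s > cond_thr}
        if not new_conds:
            return set(), set()
        # Score each gene by its mean expression over the selected conditions.
        gene_scores = zscores(
            [mean(matrix[g][c] for c in new_conds) for g in range(n_genes)]
        )
        new_genes = {g for g, s in enumerate(gene_scores) if s > gene_thr}
        if not new_genes:
            return set(), set()
        if new_genes == genes and new_conds == conds:
            break  # fixed point reached: the module is stable
        genes, conds = new_genes, new_conds
    return genes, conds


if __name__ == "__main__":
    # Toy data: genes 0-2 are switched on in conditions 0-1 only.
    toy = [[5.0, 5.0, 0.0, 0.0, 0.0, 0.0] for _ in range(3)]
    toy += [[0.0] * 6 for _ in range(5)]
    print(isa_module(toy, seed_genes=[0]))
```

Note that the sketch only ever computes gene and condition score vectors, never a gene-by-gene correlation matrix, which is the property the paragraph above refers to. Running it from many different seeds and at several thresholds would give overlapping modules at different resolutions.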

= Software for Gene expression data =

We developed the eisa [http://www.r-project.org GNU R] package to facilitate the modular analysis of gene expression data. The package uses standard [http://www.bioconductor.org BioConductor] data structures and includes various visualization tools as well.

=== Requirements, download and installation ===

To use eisa you will need a working [http://www.r-project.org GNU R] installation.

As of the 23rd of April, 2010, the eisa package is an official [http://www.bioconductor.org BioConductor] package.

eisa depends on a number of other R packages: isa2, Biobase, AnnotationDbi, Category, genefilter, DBI. The good news is that all these dependencies are installed automatically when you install eisa from within R using the standard BioConductor installation procedure. See [http://bioconductor.org/packages/release/bioc/html/eisa.html the eisa package page at the BioConductor website] for details.

Alternatively, you can also download the package from here:

  • '''[http://www.unil.ch/cbg/homepage/downloads/eisa_1.0.0.zip Microsoft Windows (32 bit)]'''
    Download [http://www.unil.ch/cbg/homepage/downloads/eisa_1.0.0.zip this file], save it in a temporary directory, and then start R. From the Packages menu choose 'Install packages from local zip files' and select the saved file.
  • '''[http://www.unil.ch/cbg/homepage/downloads/win64/eisa_1.0.0.zip Microsoft Windows (64 bit)]'''
    Download [http://www.unil.ch/cbg/homepage/downloads/win64/eisa_1.0.0.zip this file], save it in a temporary directory, and then start R. From the Packages menu choose 'Install packages from local zip files' and select the saved file.
  • '''[http://www.unil.ch/cbg/homepage/downloads/eisa_1.0.0.tgz Mac OSX (Leopard)]'''
    Download and install [http://www.unil.ch/cbg/homepage/downloads/eisa_1.0.0.tgz this file].
  • '''[http://www.unil.ch/cbg/homepage/downloads/eisa_1.0.0.tar.gz Linux and Unix systems, R source package]'''
    Download [http://www.unil.ch/cbg/homepage/downloads/eisa_1.0.0.tar.gz this file], save it in a temporary directory, and start R. Install the downloaded package using the install.packages() function: give the full path of the saved file and use the 'repos=NULL' argument of install.packages().

=== License ===

The eisa package is licensed under the GNU General Public License, version 2 or later. For details, see http://www.gnu.org/licenses/old-licenses/gpl-2.0.html.

= Software for any tabular data =

The ISA can be applied to identify coherent substructures (i.e. modules) from any rectangular matrix of data. You can use the isa2 R package for such an analysis.

=== Requirements ===

No additional R package is required to install and use isa2. But on Linux and Unix systems you will need a C compiler to install it. E.g. on Ubuntu Linux you will need to install the build-essential package.

=== Installation ===

The isa2 package is available from [http://cran.r-project.org/ CRAN], the standard R package repository. You can install it on any platform that is supported by GNU R, e.g. Microsoft Windows, Mac OSX and Linux systems. To install it, start R and type in <code>install.packages("isa2")</code> at the prompt. On Linux and Unix-like systems, you will need a working C compiler for a successful installation.

=== License ===

The isa2 package is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.

= Tutorials =

===[[EISA tutorial|The Iterative Signature Algorithm for Gene Expression Data]]===
Shows the typical steps of modular analysis, from loading your expression data to the visualization of transcription modules.
[[EISA tutorial|HTML]] [http://www2.unil.ch/cbg/homepage/downloads/EISA_tutorial.pdf PDF] [http://www2.unil.ch/cbg/homepage/downloads/EISA_tutorial.Rnw Rnw] [http://www2.unil.ch/cbg/homepage/downloads/EISA_tutorial.R R code]

===[[EISA and the biclust package|ISA and the biclust package]]===
The biclust package implements several biclustering algorithms. It is possible to convert the results of biclust to transcription modules and vice-versa.
[[EISA and the biclust package|HTML]] [http://www2.unil.ch/cbg/homepage/downloads/EISA_biclust.pdf PDF] [http://www2.unil.ch/cbg/homepage/downloads/EISA_biclust.Rnw Rnw] [http://www2.unil.ch/cbg/homepage/downloads/EISA_biclust.R R code]

===[[Tissue specific expression with the Iterative Signature Algorithm]]===
[[Tissue specific expression with the Iterative Signature Algorithm|HTML]] [http://www2.unil.ch/cbg/homepage/downloads/tissues.pdf PDF] [http://www2.unil.ch/cbg/homepage/downloads/tissues.Rnw Rnw] [http://www2.unil.ch/cbg/homepage/downloads/tissues.R R code]

===[[EISA module trees|Hierarchical module trees]]===
A module tree is the hierarchical modular organization of a data set.
[[EISA module trees|HTML]] [http://www2.unil.ch/cbg/homepage/downloads/EISA_module_trees.pdf PDF] [http://www2.unil.ch/cbg/homepage/downloads/EISA_module_trees.Rnw Rnw] [http://www2.unil.ch/cbg/homepage/downloads/EISA_module_trees.R R code]

===[[ISA tutorial|The Iterative Signature Algorithm]]===
Tutorial for the analysis of tabular data with the isa2 R package.
[[ISA tutorial|HTML]] [http://www2.unil.ch/cbg/homepage/downloads/ISA_tutorial.pdf PDF] [http://www2.unil.ch/cbg/homepage/downloads/ISA_tutorial.Rnw Rnw] [http://www2.unil.ch/cbg/homepage/downloads/ISA_tutorial.R R code]

===[[Running ISA in parallel]]===
Shows how to run ISA on a computer cluster or multi-processor machine, using MPI and the Rmpi and snow R packages.
[[Running ISA in parallel|HTML]] [http://www2.unil.ch/cbg/homepage/downloads/ISA_parallel.pdf PDF] [http://www2.unil.ch/cbg/homepage/downloads/ISA_parallel.Rnw Rnw] [http://www2.unil.ch/cbg/homepage/downloads/ISA_parallel.R R code]

===[[ISA internals]]===
[[ISA internals|HTML]] [http://www2.unil.ch/cbg/homepage/downloads/ISA_internals.pdf PDF] [http://www2.unil.ch/cbg/homepage/downloads/ISA_internals.Rnw Rnw] [http://www2.unil.ch/cbg/homepage/downloads/ISA_internals.R R code]

= Matlab package =

You can download it from [[Media:ISApackage-1.03.zip|here]]. It also includes an implementation of the Ping-pong algorithm (Kutalik2008NB). The "testPP.m" file shows how the algorithm is applied to a pair of toy data sets. To test the ISA functionality, run "testISA.m".

= Papers =

18464786 15606968 15044247 14737187 12689096 12134151

''PDF files:'' [[Media:PPA.pdf|Kutalik2008]] [[Media:review.pdf|Ihmels 2004]] [[Media:bioISA.pdf| Ihmels 2004a]] [[Media:comparative.pdf| Bergmann 2004]] [[Media:ISA.pdf|Bergmann 2003]] [[Media:SA.pdf|Ihmels2002]]
20 Aug 2012 — 17:08
» Calcium Meta-Analysis

Our calcium meta-analysis paper was published in PLoS Genetics.

20 Aug 2012 — 17:08
» ExpressionView

ExpressionView application note

We published an application note about the ExpressionView bicluster visualization tool in Bioinformatics. Please see the [[ExpressionView]] page for documentation, downloads and screenshots. 29 Jul 2010 — 13:32

[[image:Expressionview.screenshot.png|Screenshot of the ExpressionView applet|500px|left]]

ExpressionView is an R package that provides an interactive environment to explore biclusters identified in gene expression data. A sophisticated ordering algorithm is used to present the biclusters in a visually appealing layout. From this overview, the user can select individual biclusters and access all the biologically relevant data associated with them. The package aims to facilitate collaboration between bioinformaticians and life scientists who are not familiar with the R language.

= Demos =

  • [http://www2.unil.ch/cbg/software/expressionview/flash/ExpressionView.html?filename=../data/Expressionview.sampledata.all.small.evf Launch ExpressionView with adult T-cell acute lymphocytic leukemia (ALL) data (8 modules)]
  • [http://www2.unil.ch/cbg/software/expressionview/flash/ExpressionView.html?filename=../data/Expressionview.sampledata.all.large.evf Launch ExpressionView with adult T-cell acute lymphocytic leukemia (ALL) data (108 modules)]
  • [http://www2.unil.ch/cbg/software/expressionview/flash/ExpressionView.html Launch ExpressionView]

= Requirements and installation =

== Download the R package (includes the Flash applet) ==

To use the ExpressionView R package you will need a working [http://www.r-project.org GNU R] installation.

As of the 23rd of April, 2010, the ExpressionView package is an official [http://www.bioconductor.org BioConductor] package.

ExpressionView depends on a number of other R packages: isa2, Biobase, AnnotationDbi, etc. The good news is that all these dependencies are installed automatically when you install ExpressionView from within R using the standard BioConductor installation procedure. See [http://bioconductor.org/packages/2.8/bioc/html/ExpressionView.html the ExpressionView package page at the BioConductor website] for details.

Alternatively, you can also download the package from here:

  • '''[http://www.unil.ch/cbg/homepage/downloads/ExpressionView_1.0.0.zip Microsoft Windows (32 bit)]'''
    Download [http://www.unil.ch/cbg/homepage/downloads/ExpressionView_1.0.0.zip this file], save it in a temporary directory, and then start R. From the Packages menu choose 'Install packages from local zip files' and select the saved file.
  • '''[http://www.unil.ch/cbg/homepage/downloads/win64/ExpressionView_1.0.0.zip Microsoft Windows (64 bit)]'''
    Download [http://www.unil.ch/cbg/homepage/downloads/win64/ExpressionView_1.0.0.zip this file], save it in a temporary directory, and then start R. From the Packages menu choose 'Install packages from local zip files' and select the saved file.
  • '''[http://www.unil.ch/cbg/homepage/downloads/ExpressionView_1.0.0.tgz Mac OSX (Leopard)]'''
    Download and install [http://www.unil.ch/cbg/homepage/downloads/ExpressionView_1.0.0.tgz this file].
  • '''[http://www.unil.ch/cbg/homepage/downloads/ExpressionView_1.0.0.tar.gz Linux and Unix systems, R source package]'''
    Download [http://www.unil.ch/cbg/homepage/downloads/ExpressionView_1.0.0.tar.gz this file], save it in a temporary directory, and start R. Install the downloaded package using the install.packages() function: give the full path of the saved file and use the 'repos=NULL' argument of install.packages().

The Flash applet requires a Flash-enabled web browser. Please install Adobe Flash Player from the [http://get.adobe.com/flashplayer/ Adobe web site] if your browser does not have it yet.

== Download the stand-alone viewer (Adobe AIR) ==

If you prefer a stand-alone viewer, you can download and install the Adobe AIR build [http://www2.unil.ch/cbg/software/expressionview/air/ExpressionView.air ExpressionView.air] (right-click to download file).
To run the program, you need the AIR runtime environment, which you can get from [http://get.adobe.com/air Adobe].
ExpressionView creates file associations for .evf files, allowing you to simply double-click on such files to launch the viewer and load the data.

== Download sample data ==

  • [http://www2.unil.ch/cbg/software/expressionview/data/Expressionview.sampledata.all.small.evf Gene expression profile of adult T-cell acute lymphocytic leukemia (ALL) with 8 modules] (right-click to download file).
  • [http://www2.unil.ch/cbg/software/expressionview/data/Expressionview.sampledata.all.large.evf Gene expression profile of adult T-cell acute lymphocytic leukemia (ALL) with 108 modules] (right-click to download file).

= License =

The ExpressionView package is licensed under the GNU General Public License, version 2 or later. For details, see http://www.gnu.org/licenses/old-licenses/gpl-2.0.html.

= Screenshots =

<gallery>
image:Expressionview.screenshot.1.png|Startup screen
image:Expressionview.screenshot.2.png|Global view after loading dataset
image:Expressionview.screenshot.3.png|Highlighting modules
image:Expressionview.screenshot.4.png|Zoom functions
image:Expressionview.screenshot.5.png|Highlighting genes (probes) and samples
image:Expressionview.screenshot.6.png|GO and KEGG associations
image:Expressionview.screenshot.7.png|Modules view emphasizing the underlying gene expression data
image:Expressionview.screenshot.8.png|Global view without gene expression data
image:Expressionview.screenshot.9.png|Experiment description
</gallery>

= Tutorials =

There are several tutorials describing how to use ExpressionView. The features of the R package are documented within the program. Just have a look at the ExpressionView help page after you have installed the package. Below, you can download the tutorial presenting the basic workflow and the description of the ordering algorithm. For the Flash applet, we have produced a few videos showing you how to use the program.

===[[Getting started with ExpressionView|Getting started with ExpressionView]]===
[[Getting started with ExpressionView|HTML]] [http://www2.unil.ch/cbg/homepage/downloads/ExpressionView.tutorial.pdf PDF] [http://www2.unil.ch/cbg/homepage/downloads/ExpressionView.tutorial.Rnw Rnw] [http://www2.unil.ch/cbg/homepage/downloads/ExpressionView.tutorial.R R code]

===[[Ordering algorithm used in ExpressionView|Ordering algorithm used in ExpressionView]]===
[[Ordering algorithm used in ExpressionView|HTML]] [http://www2.unil.ch/cbg/homepage/downloads/ExpressionView.ordering.pdf PDF]

===[[ExpressionView File Format|ExpressionView File Format]]===
[[ExpressionView File Format|HTML]] [http://www2.unil.ch/cbg/homepage/downloads/ExpressionView.format.pdf PDF]

===[http://www2.unil.ch/cbg/software/expressionview/r/ExpressionView.pdf ExpressionView R package manual]===
[http://www2.unil.ch/cbg/software/expressionview/r/ExpressionView.pdf PDF]

=== Flash applet ===

  • [[Media:Expressionview.quickhelp.pdf|Quick help (pdf)]]
  • [http://www2.unil.ch/cbg/software/expressionview/videos/Expressionview.videotutorial.getting.started.mov Getting started (video tutorial, 6 minutes)]
  • [http://www2.unil.ch/cbg/software/expressionview/videos/Expressionview.videotutorial.tables.mov Using the tables (video tutorial, 6 minutes)]
  • [http://www2.unil.ch/cbg/software/expressionview/videos/Expressionview.videotutorial.modularview.mov Modular view (video tutorial, 4 minutes)]
  • [http://www2.unil.ch/cbg/software/expressionview/videos/Expressionview.videotutorial.view.mov Fullscreen feature (video tutorial, 1 minute)]

=== Installing the stand-alone version ===

  • [http://www2.unil.ch/cbg/software/expressionview/videos/Expressionview.videotutorial.standalone.mov Stand-alone installation (video tutorial, 2 minutes)]

= Additional documentation and downloads =

The ExpressionView data file is an XML file. We have created a corresponding XML Schema file that defines its structure.

  • [http://www2.unil.ch/cbg/software/expressionview/data/expressionview.xsd ExpressionView XML data file schema]

The Flash applet is written in ActionScript. It is open source and can be built from the command line using the Adobe Flex SDK or more conveniently with the Adobe Flex Builder IDE. For more information, visit the [http://www.adobe.com/flex Flex website].

  • [http://www2.unil.ch/cbg/software/expressionview/source/ExpressionView.tar.gz ExpressionView source code (for the Flash applet)]
  • [http://www2.unil.ch/cbg/software/expressionview/doc ActionScript source code documentation]

For the Flash applet, we have implemented components that could also be used in other applications. The most important one is the LargeBitmapData class, which allows one to work with BitmapData of arbitrary size. In the [http://livedocs.adobe.com/flex/3/langref/flash/display/BitmapData.html standard BitmapData class], the maximum size for a BitmapData object is 8,192 pixels in width or height, and the total number of pixels cannot exceed 16,777,216 pixels. Note that the ResizablePanel class is no longer used in ExpressionView.

  • [[Media:Expressionview.largebitmapdata.tar.gz|ActionScript implementation of the LargeBitmapData class (allows the use of bitmaps of arbitrary dimensions)]]
  • [[Media:Expressionview.resizablepanel.tar.gz|ActionScript implementation of the ResizablePanel class (a panel with open, maximize, minimize, close and resize buttons)]]
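
One way to work around the BitmapData size limits quoted above is to cover a large image with tiles that each stay within them. The Python sketch below illustrates that idea only. It is not a port of the ActionScript LargeBitmapData class, whose internal layout is not described here, and the tile side of 4,096 pixels as well as the function names are assumptions made for this example.

```python
MAX_SIDE = 8192          # maximum width or height of one BitmapData (from the text)
MAX_PIXELS = 16_777_216  # maximum pixel count of one BitmapData (from the text)
TILE = 4096              # assumed tile side: 4096 * 4096 == MAX_PIXELS


def tile_rects(width, height, tile=TILE):
    """Return (x, y, w, h) rectangles that together cover the image."""
    rects = []
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            # Tiles on the right and bottom edges may be smaller.
            w = min(tile, width - x)
            h = min(tile, height - y)
            rects.append((x, y, w, h))
    return rects


def locate_pixel(x, y, width, tile=TILE):
    """Map a global pixel (x, y) to (tile index, local x, local y)."""
    tiles_per_row = -(-width // tile)  # ceiling division
    index = (y // tile) * tiles_per_row + (x // tile)
    return index, x % tile, y % tile


if __name__ == "__main__":
    rects = tile_rects(10000, 5000)
    print(len(rects), "tiles")
    print(locate_pixel(9000, 4500, 10000))
```

A square tile of 4,096 pixels is the largest square that satisfies both limits at once, so every tile can be stored in an ordinary bitmap while reads and writes are redirected through a lookup like <code>locate_pixel</code>.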
20 Aug 2012 — 17:08
» Genome-wide Association Study reveals new HLA-haplotype strongly protective against narcolepsy

In collaboration with Mehdi Tafti's research group we have published our recent discovery on a newly identified HLA haplotype that protects individuals from narcolepsy even if they carry the famous risk haplotype. Our article appeared online in Nature Genetics on 15 August 2010.

20 Aug 2012 — 17:08
» GIANT height

Hundreds of genomic variants are associated with human anthropometric traits


Via the CoLaus and Hypergenes cohorts our group contributed to the meta-analysis of the GIANT consortium that revealed hundreds of genetic variants associated with human height; 18 new loci for body mass index; and 13 new loci for waist-hip-ratio.

   12 Oct 2010 — 9:18

'''Via the CoLaus and Hypergenes cohorts our group contributed to the meta-analysis of the GIANT consortium that revealed hundreds of genetic variants associated with [http://www.nature.com/nature/journal/vaop/ncurrent/full/nature09410.html human height]; 18 new loci for [http://www.nature.com/ng/journal/vaop/ncurrent/full/ng.686.html body mass index]; and 13 new loci for [http://www.nature.com/ng/journal/vaop/ncurrent/full/ng.685.html waist-hip-ratio]. '''

Our group - as the analyst of the CoLaus and Hypergenes cohorts - provided [[Genome Wide Association Studies | genome-wide association]] summary statistics for the meta-analytic effort of the GIANT consortium. The meta-analysis used 183,727 individuals to reveal 180 loci harbouring genetic variants associated with adult height. These variants are clustered in genomic loci and biological pathways.

A further association analysis of 249,796 individuals revealed 18 new loci associated with body mass index. Some loci (at MC4R, POMC, SH2B1 and BDNF) map near key hypothalamic regulators of energy balance, and one of these loci is near GIPR, an incretin receptor. Furthermore, genes in other newly associated loci may provide new insights into human body weight regulation.

Finally, a meta-analysis identified 13 new loci associated with waist-hip ratio. Seven of these loci exhibited marked sexual dimorphism, all with a stronger effect on WHR in women than men. These findings provide evidence for multiple loci that modulate body fat distribution independent of overall adiposity and reveal strong gene-by-sex interactions.
20 Aug 2012 — 17:08
» Total Explained Variance

Novel method to estimate explained variance of GWAS hits reveals large fraction of the missing heritability


In collaboration with John Whittaker (GSK) we have published a new methodology to infer the total variance explained by [[Genome Wide Association Studies | GWAS]] hits. Our method was applied to the most recent GIANT association summary statistics and revealed that GWAS hits explain at least 30% of the variation in human height. The article appeared online in Genetic Epidemiology on 6 April 2011.

   6 April 2011 — 21:35

Genome-wide association studies (GWAS) are conducted with the promise to discover novel genetic variants associated with diverse traits. For most traits, associated markers individually explain just a modest fraction of the phenotypic variation, but their number can well be in the hundreds. We developed a maximum likelihood method that allows us to infer the distribution of associated variants even when many of them were missed by chance. Compared to previous approaches, the novelty of our method is that it (a) does not require having an independent (unbiased) estimate of the effect sizes; (b) makes use of the complete distribution of P-values while allowing for the false discovery rate; (c) takes into account allelic heterogeneity and the SNP pruning strategy. We applied our method to the latest GWAS meta-analysis results of the GIANT consortium. It revealed that while the explained variance of genome-wide (GW) significant SNPs is around 1% for waist-hip ratio (WHR), the observed P-values provide evidence for the existence of variants explaining 10% (CI=[8.5–11.5%]) of the phenotypic variance in total. Similarly, the total explained variance likely to exist for height is estimated to be 29% (CI=[28–30%]), three times higher than what the observed GW significant SNPs give rise to. This methodology also enables us to predict the benefit of future GWA studies that aim to reveal more associated genetic markers via increased sample size. For more details click [http://onlinelibrary.wiley.com/doi/10.1002/gepi.20582/abstract;jsessionid=9C088B31B90307D9588A29394535DDDF.d03t01 here].

A simple Matlab package of the algorithm can be downloaded from [[Media:TotVar.zip|here]]. Read, modify and launch the ''main.m'' file.
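
For orientation, the quantity being estimated can be illustrated with the textbook additive-model formula: a SNP with allele frequency p (assuming Hardy-Weinberg equilibrium) and per-allele effect β contributes 2p(1−p)β² to the phenotypic variance, and the contributions of independent SNPs add up. The Python sketch below shows only this bookkeeping. It is neither the maximum likelihood method of the paper nor a port of the Matlab package, and the function name and example values are hypothetical.

```python
# Hypothetical sketch of the standard additive-model bookkeeping; this is
# NOT the maximum likelihood method described in the article.
def explained_variance(effects, freqs, phenotype_variance=1.0):
    """Fraction of phenotypic variance explained by independent SNPs.

    effects: per-allele effect sizes (beta), in phenotype units
    freqs:   allele frequencies, assuming Hardy-Weinberg equilibrium
    """
    if len(effects) != len(freqs):
        raise ValueError("effects and freqs must have the same length")
    if phenotype_variance <= 0:
        raise ValueError("phenotype_variance must be positive")
    total = 0.0
    for beta, p in zip(effects, freqs):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
        # Genotype variance under Hardy-Weinberg is 2p(1-p).
        total += 2.0 * p * (1.0 - p) * beta ** 2
    return total / phenotype_variance


if __name__ == "__main__":
    # Two made-up SNPs acting on a standardized phenotype (variance 1).
    print(explained_variance([0.1, 0.05], [0.5, 0.2]))
```

Summing over genome-wide significant SNPs alone gives the smaller of the two figures discussed above; the published method instead infers the total from the full distribution of P-values.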

20 Aug 2012 — 17:08
»  Understanding Dpp gradient formation mechanism

In collaboration with the Basler group (University of Zurich), we developed a theoretical model to identify the leading mechanism behind the formation of the long-range Dpp gradient. The article appeared online in PLoS Biology on 26 July 2011.

20 Aug 2012 — 17:08
»  The evolution of gene expression levels in mammalian organs

A collaborative study with the Kaessmann group, "The evolution of gene expression levels in mammalian organs", in which we applied the [[ISA]] to RNA-seq data for the first time, was published online as an article in Nature on 19 October 2011.

20 Aug 2012 — 17:08
» Sven Bergmann

Sven Bergmann is Associate Professor

Sven Bergmann has successfully completed his tenure track as Assistant Professor and has been Associate Professor since August 2010.

   1 Aug 2010 — 9:12

[[File:Sven_cat_pic.jpg|240px|thumb|left|Sven Bergmann, PI]] Sven Bergmann heads the [http://www2.unil.ch/cbg ''Computational Biology Group''] in the [http://www.unil.ch/dgm ''Department of Computational Biology (formerly Department of Medical Genetics)''] at the [http://www.unil.ch ''University of Lausanne'']. He joined the [http://www.unil.ch/fbm Faculty of Biology and Medicine] in 2005 as Assistant Professor and became Associate Professor in 2010 after successfully completing his tenure track. He has also been affiliated with the [http://www.isb-sib.ch/ Swiss Institute of Bioinformatics] since 2006.

Sven studied theoretical particle physics with [http://www.weizmann.ac.il/home/ftnir Prof. Yosef Nir] at the [http://www.weizmann.ac.il Weizmann Institute of Science] (Israel) where he received his PhD in 2001 for [http://inspirehep.net/search?p=find+author+bergmann%2C+s+and+not+author+storchi&FORMAT=WWW&SEQUENCE= studies of neutrino oscillations and CP violation]. He then joined the laboratory of [http://barkai-serv.weizmann.ac.il/GroupPage/ Prof. Naama Barkai] in the Department of Molecular Genetics at the same institute, where he first worked as a [http://www.weizmann.ac.il/RGP_open/postdoc/Weizmann-Postdoc.html Koshland postdoctoral fellow] and later as staff scientist.

His work in the field of computational biology includes designing and applying novel algorithms for the analysis of large-scale biological and medical data, as well as modeling of genetic networks pertaining to the development of the Drosophila embryo and the response of plants to environmental changes. A list of publications is available [http://www2.unil.ch/cbg/index.php?title=Publications here] or at [https://scholar.google.co.uk/citations?user=kuTL6u8AAAAJ&hl=en Google Scholar].


Permanent Address: Rue du Bugnon 27 - DGM 135 - CH-1005 Lausanne - Switzerland

Phone at work: +41-21-692-5452

Cell phone: +41-78-663-4980

e-mail: Sven.Bergmann_AT_unil.ch


PS: Do you know how to get ''smoothly'' from A to B? Well, you just need to minimize a functional expression; see this [http://arxiv.org/PS_cache/physics/pdf/0105/0105039v1.pdf paper] for details!
6 Aug 2010 — 11:08