Revision as of 10:38, 1 February 2016 by Micha

An ISA transcription module

Large data sets, such as expression profiles from many samples, require analytic tools to reduce their complexity. The Iterative Signature Algorithm (ISA) was designed to do this for very large data sets by decomposing them into so-called "modules". In the context of gene expression data, these modules consist of subsets of genes that exhibit a coherent expression profile only over a subset of microarray experiments. Genes and arrays may be attributed to multiple modules, and the level of required coherence can be varied, resulting in different "resolutions" of the modular mapping. Unlike many other tools, the ISA does not rely on the computation of correlation matrices, so it remains fast even for very large datasets.
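To make the idea of a module concrete, the following base-R sketch runs a simplified ISA-style iteration on toy data. It is not the isa2 or eisa implementation: the toy matrix, the function names (keep_high, isa_sketch), the normalization and the threshold values are all assumptions chosen for illustration. Starting from a seed set of genes, it alternately scores samples and genes and keeps only those scoring well above average, using matrix-vector products rather than a correlation matrix.

```r
# Simplified ISA-style iteration in base R (illustrative sketch, not isa2).
set.seed(1)

# Toy data (assumption): 200 genes x 40 samples of noise, with one planted
# module in which genes 1-10 are elevated in samples 1-8.
n_genes <- 200
n_samples <- 40
E <- matrix(rnorm(n_genes * n_samples, sd = 0.3), n_genes, n_samples)
E[1:10, 1:8] <- E[1:10, 1:8] + 3

# Two normalized copies of the data.
E_by_sample <- scale(E)        # each sample: mean 0, sd 1 across genes
E_by_gene   <- t(scale(t(E)))  # each gene: mean 0, sd 1 across samples

# Keep scores lying more than `thr` standard deviations above the mean;
# everything else is set to zero.
keep_high <- function(score, thr) {
  z <- (score - mean(score)) / sd(score)
  ifelse(z > thr, score, 0)
}

# Hypothetical helper: iterate from a seed gene set until the set of
# selected genes stops changing (or max_iter is reached).
isa_sketch <- function(seed_genes, thr_gene, thr_sample, max_iter = 50) {
  g <- numeric(n_genes)
  g[seed_genes] <- 1
  for (i in seq_len(max_iter)) {
    # Score samples by the current gene set, then threshold.
    s <- keep_high(drop(crossprod(E_by_sample, g)), thr_sample)
    # Score genes by the selected samples, then threshold.
    g_new <- keep_high(drop(E_by_gene %*% s), thr_gene)
    if (all(g_new == 0)) stop("no genes passed the threshold")
    converged <- identical(g_new != 0, g != 0)
    g <- g_new / sqrt(sum(g_new^2))  # rescale so the scores do not grow
    if (converged) break
  }
  list(genes = which(g != 0), samples = which(s != 0))
}

res <- isa_sketch(seed_genes = 1:5, thr_gene = 2.5, thr_sample = 1)
print(res)
```

Raising thr_gene or thr_sample demands more coherence and tends to give smaller modules, which corresponds to the different "resolutions" of the modular mapping.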

Software for gene expression data

We developed the eisa GNU R package to facilitate the modular analysis of gene expression data. The package uses standard BioConductor data structures and includes various visualization tools as well.

Requirements, download and installation

To use eisa you will need a working GNU R installation.

As of the 23rd of April, 2010, the eisa package is an official BioConductor package.

eisa depends on a number of other R packages: isa2, Biobase, AnnotationDbi, Category, genefilter and DBI. These dependencies are installed automatically when you install eisa from your R prompt using the standard BioConductor installer. See the eisa package page on the BioConductor website for details.
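As a sketch of the installation step, the commands below assume that eisa is still distributed in the BioConductor release you are using (this page does not confirm that) and that the BiocManager installer is used; older BioConductor releases used a different installer script.

```r
# Assumption: eisa is available in the BioConductor release you are using.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")  # installer for BioConductor packages
}
BiocManager::install("eisa")       # dependencies are installed as well
library(eisa)                      # load the package to check the install
```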

Alternatively, you can also download the package from here:


The eisa package is licensed under the GNU General Public License, version 2 or later. For details, see

Software for any tabular data

The ISA can be applied to identify coherent substructures (i.e. modules) in any rectangular matrix of data. You can use the isa2 R package for such an analysis.


No additional R package is required to install and use isa2, but on Linux and Unix systems you will need a C compiler to install it. On Ubuntu Linux, for example, you will need to install the build-essential package.


The isa2 package is available from CRAN, the standard R package repository. You can install it on any platform supported by GNU R, e.g. Microsoft Windows, Mac OS X and Linux. To install it, start R and run install.packages("isa2") at the prompt. On Linux and Unix-like systems, a working C compiler is needed for the installation to succeed.
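The installation and a first check can be sketched as follows. The check assumes that isa.in.silico() and isa() behave as described in the isa2 package documentation (synthetic test data and a run with default thresholds); the variable names are illustrative.

```r
install.packages("isa2")    # from CRAN; needs a C compiler on Linux/Unix
library(isa2)

# Quick check on synthetic data with planted modules.
toy <- isa.in.silico()      # list; the first element is the data matrix
modules <- isa(toy[[1]])    # run the ISA with its default settings
ncol(modules$rows)          # number of modules found (one per column)
```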


The isa2 package is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 License. To view a copy of this license, visit or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.


The Iterative Signature Algorithm for Gene Expression Data

Shows the typical steps of modular analysis, from loading your expression data to the visualization of transcription modules.
HTML PDF Rnw R code

ISA and the biclust package

The biclust package implements several biclustering algorithms. It is possible to convert the results of biclust to transcription modules and vice-versa.
HTML PDF Rnw R code

Tissue specific expression with the Iterative Signature Algorithm

HTML PDF Rnw R code

Hierarchical module trees

A module tree is the hierarchical modular organization of a data set.
HTML PDF Rnw R code

The Iterative Signature Algorithm

Tutorial for the analysis of tabular data with the isa2 R package.
HTML PDF Rnw R code

Running ISA in parallel

Shows how to run ISA on a computer cluster or multi-processor machine, using MPI and the Rmpi and snow R packages.
HTML PDF Rnw R code

ISA internals

HTML PDF Rnw R code

Matlab package

You can download the Matlab package from here. It also includes an implementation of the Ping-pong algorithm (Kutalik et al., 2008). The "testPP.m" file shows how the algorithm is applied to a pair of toy data sets. To test the ISA functionality, run the "testISA.m" file.


Kutalik Z, Beckmann JS, Bergmann S
A modular approach for integrative analysis of large-scale gene-expression and drug-response data.
Nat Biotechnol: 2008 May, 26(5);531-9
[PubMed:18464786]

Ihmels JH, Bergmann S
Challenges and prospects in the analysis of large-scale gene expression data.
Brief Bioinform: 2004 Dec, 5(4);313-27
[PubMed:15606968]

Ihmels J, Bergmann S, Barkai N
Defining transcription modules using large-scale gene expression data.
Bioinformatics: 2004 Sep 1, 20(13);1993-2003
[PubMed:15044247]

Bergmann S, Ihmels J, Barkai N
Similarities and differences in genome-wide expression data of six organisms.
PLoS Biol: 2004 Jan, 2(1);E9
[PubMed:14737187]

Bergmann S, Ihmels J, Barkai N
Iterative signature algorithm for the analysis of large-scale gene expression data.
Phys Rev E Stat Nonlin Soft Matter Phys: 2003 Mar, 67(3 Pt 1);031902
[PubMed:12689096]

Ihmels J, Friedlander G, Bergmann S, Sarig O, Ziv Y, Barkai N
Revealing modular organization in the yeast transcriptional network.
Nat Genet: 2002 Aug, 31(4);370-7
[PubMed:12134151]
