Difference between revisions of "ISA"
Revision as of 11:38, 6 December 2009
Large sets of data, such as expression profiles from many samples, require
analytic tools to reduce their complexity. Classical (bi-)clustering
algorithms typically attribute elements (genes, arrays) to disjoint groups
("clusters"). Yet, in some cases overlapping cluster assignments would suit
the biological reality much better.
The Iterative Signature Algorithm (ISA) was designed to overcome this and other limitations of standard clustering algorithms. It aims to reduce the complexity of very large data sets by decomposing them into so-called "modules". In the context of gene expression data, these modules consist of subsets of genes that exhibit a coherent expression profile only over a subset of microarray experiments. Genes and arrays may be attributed to multiple modules, and the level of required coherence can be varied, resulting in different "resolutions" of the modular mapping. Since the ISA does not rely on the computation of correlation matrices (unlike many other tools), it is extremely fast even for very large data sets.
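The alternating refinement described above can be illustrated with a minimal base-R sketch. This is a simplified illustration, not the implementation used in the isa2 or eisa packages: it follows a single seed, uses one-sided thresholds only, and the function name isa_sketch, its arguments, and the toy matrix are all assumptions made for this example.

```r
# Simplified sketch of one ISA run (hypothetical helper, not the isa2 code).
# Starting from a seed set of rows, it alternately selects the columns over
# which the current rows score highly and the rows that score highly over
# those columns, until both sets stop changing.
zscore <- function(v) {
  s <- sd(v)
  if (is.na(s) || s == 0) return(rep(0, length(v)))
  (v - mean(v)) / s
}

isa_sketch <- function(x, seed_rows, thr_row = 1, thr_col = 1, max_iter = 50) {
  x_by_col <- scale(x)        # each column standardized across rows
  x_by_row <- t(scale(t(x)))  # each row standardized across columns
  rows <- seq_len(nrow(x)) %in% seed_rows
  cols <- rep(FALSE, ncol(x))
  for (iter in seq_len(max_iter)) {
    # Score every column by its mean over the current rows, then threshold.
    col_score <- colMeans(x_by_col[rows, , drop = FALSE])
    new_cols <- zscore(col_score) > thr_col
    if (!any(new_cols)) return(list(rows = integer(0), cols = integer(0)))
    # Score every row by its mean over the selected columns, then threshold.
    row_score <- rowMeans(x_by_row[, new_cols, drop = FALSE])
    new_rows <- zscore(row_score) > thr_row
    if (!any(new_rows)) return(list(rows = integer(0), cols = integer(0)))
    converged <- identical(new_rows, rows) && identical(new_cols, cols)
    rows <- new_rows
    cols <- new_cols
    if (converged) break
  }
  list(rows = which(rows), cols = which(cols))
}

# Toy data: a deterministic background pattern plus one planted module
# in rows 1-10 and columns 1-5 (no random numbers, so it is reproducible).
x <- outer(1:30, 1:20, function(i, j) (i + 3 * j) %% 5)
x[1:10, 1:5] <- x[1:10, 1:5] + 10
module <- isa_sketch(x, seed_rows = 1:5)
print(module)
```

Raising thr_row and thr_col makes the sketch keep only the most coherent rows and columns, which is the sense in which the thresholds set the "resolution" of a module. Note that no row-by-row or column-by-column correlation matrix is ever formed; each step is a mean over the currently selected rows or columns.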
Software for gene expression data
We developed the eisa GNU R package to facilitate the modular analysis of gene expression data. The package uses standard BioConductor data structures and also includes various visualization tools.
Requirements
To use eisa you will need a working GNU R and BioConductor installation. You will also need the isa2, Category and genefilter R packages. You can install these by typing
install.packages("isa2")
source("http://bioconductor.org/biocLite.R")
biocLite(c("Category", "genefilter"))
at your R prompt.
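Before installing eisa, it can help to check which of the required packages are still missing. The following is a small sketch using only base R; the helper name missing_packages is made up for this example. Note also that on current BioConductor releases the biocLite() script has been replaced by BiocManager::install(), so the commands above may need adapting on a recent installation.

```r
# Hypothetical helper: return the names of packages that cannot be loaded.
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!unname(available)]
}

needed <- c("isa2", "Category", "genefilter")
still_missing <- missing_packages(needed)
if (length(still_missing) > 0) {
  message("Not yet installed: ", paste(still_missing, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```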
Download and Installation
The eisa package is currently being reviewed by the BioConductor team. Until it is available from the standard BioConductor repositories, it can be downloaded from here. The most recent version of the eisa package is 0.2. Please follow the installation instructions for your platform.
- Microsoft Windows (all versions)
Download this file, save it in a temporary directory, and then start R. From the Packages menu choose 'Install packages from local zip files' and select the saved file.
- Mac OSX (all versions)
Currently not available.
- Linux and Unix systems, R source package
Download this file, save it in a temporary directory, and start R. Install the downloaded package using the install.packages() function: give the full path of the saved file and use the 'repos=NULL' argument of install.packages().
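The local installation steps above can be wrapped in a small function. This is a sketch only: the helper name install_local_package and the file path in the usage line are hypothetical, so substitute the path where you actually saved the download.

```r
# Hypothetical helper: install a package from a file saved on disk.
install_local_package <- function(path) {
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  # Windows binary packages are .zip files; anything else is treated
  # as an R source package here.
  pkg_type <- if (grepl("\\.zip$", path)) "win.binary" else "source"
  install.packages(path, repos = NULL, type = pkg_type)
}

# Usage (the path and file name are placeholders, not the real download):
# install_local_package("/tmp/eisa_0.2.tar.gz")
```

Checking for the file first gives a clearer error message than the one install.packages() reports when the path is mistyped.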
License
The eisa package is licensed under the GNU General Public License, version 2 or later. For details, see http://www.gnu.org/licenses/old-licenses/gpl-2.0.html.
Software for any tabular data
The ISA can be applied to identify coherent substructures (i.e. modules) from any rectangular matrix of data. You can use the isa2 R package for such an analysis.
Requirements
No additional R package is required to install and use isa2. On Linux and Unix systems, however, you will need a C compiler to install it. For example, on Ubuntu Linux you will need to install the 'build-essential' package.
Installation
The isa2 package is available from CRAN, the standard R package repository. You can install it on any platform that is supported by GNU R, e.g. Microsoft Windows, Mac OSX and Linux systems. To install it, start R and type
install.packages("isa2")
at the prompt. On Linux and Unix-like systems, you will need a working C compiler for a successful installation.
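Once isa2 is installed, a first analysis can look roughly like the sketch below. It assumes that isa2 provides isa.in.silico() for generating a test matrix with planted modules and isa() for running the algorithm with default settings, and that the result is a list whose rows and columns elements hold one column per module; please check the package documentation, since this example was not taken from this page. The requireNamespace() guard simply skips the example when the package is absent.

```r
if (requireNamespace("isa2", quietly = TRUE)) {
  library(isa2)
  set.seed(1)  # the default seeds are random, so fix the RNG state

  # Artificial data: the first list element is assumed to be the data matrix.
  insilico <- isa.in.silico()
  data_matrix <- insilico[[1]]

  # Run the ISA with the package defaults.
  modules <- isa(data_matrix)

  # Each column of modules$rows / modules$columns is assumed to describe
  # one module; non-zero entries mark its members.
  cat("Modules found:", ncol(modules$rows), "\n")
} else {
  message("isa2 is not installed; skipping the example.")
}
```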
License
The isa2 package is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.
Tutorials
The Iterative Signature Algorithm for Gene Expression Data
Shows the typical steps of a modular analysis, from loading your expression data to the visualization of transcription modules.
HTML
PDF
Rnw
R code
ISA and the biclust package
The biclust package implements several biclustering algorithms. It is possible to convert the results of biclust to transcription modules and vice versa.
HTML
PDF
Rnw
R code
Hierarchical module trees
A module tree is the hierarchical modular organization of a data set.
HTML
PDF
Rnw
R code
The Iterative Signature Algorithm
Tutorial for the analysis of tabular data with the isa2 R package.
HTML
PDF
Rnw
R code
Running ISA in parallel
Shows how to run ISA on a computer cluster or multi-processor machine, using MPI and the Rmpi and snow R packages.
HTML
PDF
Rnw
R code
Papers
<biblio>
- Ihmels2004 pmid=15606968 // PDF
- Ihmels2004a pmid=15044247 // PDF
- Bergmann2004 pmid=14737187 // PDF
- Bergmann2003 pmid=12689096 // PDF
- Ihmels2002 pmid=12134151 // PDF
</biblio>