ISA

From Computational Biology Group

Revision as of 10:39, 6 December 2009

Large data sets, such as expression profiles from many samples, require analytic tools to reduce their complexity. Classical (bi-)clustering algorithms typically assign elements (genes, arrays) to disjoint groups ("clusters"). Yet in some cases overlapping cluster assignments would suit the biological reality much better.

The Iterative Signature Algorithm (ISA) was designed to overcome this and other limitations of standard clustering algorithms. It aims to reduce the complexity of very large data sets by decomposing them into so-called "modules". In the context of gene expression data, these modules consist of subsets of genes that exhibit a coherent expression profile only over a subset of microarray experiments. Genes and arrays may be attributed to multiple modules, and the level of required coherence can be varied, resulting in different "resolutions" of the modular mapping. Unlike many other tools, the ISA does not rely on the computation of correlation matrices, so it is extremely fast even for very large data sets.
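
To make the idea of a module more concrete, here is a toy sketch in base R of an alternating, threshold-based iteration on simulated data. It is only an illustration under simplifying assumptions (plain row and column means as scores, z-score thresholds, one planted module); it is not the code used by the packages described below, and the names toy_isa, above, thr_gene and thr_array are hypothetical.

```r
# Toy illustration of a signature-style iteration (base R only).
set.seed(1)

# Simulated data: 60 "genes" x 20 "arrays" of noise with one planted module
# (genes 1-10 are elevated on arrays 1-5).
expr <- matrix(rnorm(60 * 20), nrow = 60, ncol = 20)
expr[1:10, 1:5] <- expr[1:10, 1:5] + 5

# Logical vector marking the entries whose z-score exceeds `thr`.
above <- function(score, thr) {
  as.vector(scale(score)) > thr
}

toy_isa <- function(x, seed_genes, thr_gene = 1, thr_array = 1, max_iter = 50) {
  genes <- seq_len(nrow(x)) %in% seed_genes
  arrays <- rep(FALSE, ncol(x))
  for (i in seq_len(max_iter)) {
    # Score each array by its mean over the current gene set, then threshold.
    arrays <- above(colMeans(x[genes, , drop = FALSE]), thr_array)
    # Score each gene by its mean over the selected arrays, then threshold.
    new_genes <- above(rowMeans(x[, arrays, drop = FALSE]), thr_gene)
    if (identical(new_genes, genes)) break  # gene set no longer changes
    genes <- new_genes
  }
  list(genes = which(genes), arrays = which(arrays))
}

# Start from a small seed set of genes and iterate to a stable module.
module <- toy_isa(expr, seed_genes = c(1, 2, 3))
print(module)
```

Raising thr_gene or thr_array in this sketch demands stronger coherence and therefore yields smaller modules, which mirrors the notion of "resolution" mentioned above.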

Software for Gene expression data

We developed the eisa GNU R package to facilitate the modular analysis of gene expression data. The package uses standard BioConductor data structures and includes various visualization tools as well.

Requirements

To use eisa you will need a working GNU R and BioConductor installation. You will also need the isa2, Category and genefilter R packages. You can install these by typing

 install.packages("isa2")
 source("http://bioconductor.org/biocLite.R")
 biocLite(c("Category", "genefilter"))

at your R prompt.
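
Before installing eisa, it can be useful to confirm which of these required packages are already present. The following is a minimal sketch using only base R functions; the variable names are illustrative.

```r
# Packages that eisa needs, as listed above.
required <- c("isa2", "Category", "genefilter")

# Names of all packages currently installed in the R library paths.
installed <- rownames(installed.packages())

# Anything in `required` that is not installed yet.
missing_pkgs <- setdiff(required, installed)

if (length(missing_pkgs) == 0) {
  message("All required packages are installed.")
} else {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
}
```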

Download and Installation

The eisa package is currently being reviewed by the BioConductor team. Until it is available from the standard BioConductor repositories, it can be downloaded using the links below. The most recent version of the eisa package is 0.2. Please follow the installation instructions for your platform.

  • Microsoft Windows (all versions)
    Download this file, save it in a temporary directory, and then start R. From the Packages menu choose 'Install packages from local zip files' and select the saved file.
  • Mac OS X (all versions)
    Currently not available.
  • Linux and Unix systems, R source package
    Download the source package (http://www.unil.ch/cbg/homepage/downloads/eisa_0.2.tar.gz), save it in a temporary directory, and start R. Install the downloaded package with the install.packages() function: give the full path of the saved file as the first argument and pass 'repos=NULL'.
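
For the source package, the install.packages() call might look like the sketch below. The file location is an assumption made for illustration (here, R's temporary directory); replace it with the full path where you actually saved the file.

```r
# Hypothetical location of the downloaded source package; adjust as needed.
pkg_file <- file.path(tempdir(), "eisa_0.2.tar.gz")

if (file.exists(pkg_file)) {
  # repos = NULL makes R install from the local file instead of a repository.
  install.packages(pkg_file, repos = NULL, type = "source")
} else {
  message("Package file not found: ", pkg_file)
}
```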

License

The eisa package is licensed under the GNU General Public License, version 2 or later. For details, see http://www.gnu.org/licenses/old-licenses/gpl-2.0.html.

Software for any tabular data

The ISA can be applied to identify coherent substructures (i.e. modules) from any rectangular matrix of data. You can use the isa2 R package for such an analysis.
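
As a rough sketch of such an analysis, the example below generates an artificial matrix and runs the ISA on it with default settings. It assumes that isa2 is installed and that isa.in.silico() and isa() behave as described in the isa2 package documentation; the wrapper demo_isa2 and the chosen matrix size are illustrative only.

```r
demo_isa2 <- function() {
  # Skip gracefully when the package is not available.
  if (!requireNamespace("isa2", quietly = TRUE)) {
    message("isa2 is not installed; skipping the demo.")
    return(invisible(NULL))
  }
  set.seed(1)
  # Artificial data with planted modules; the first list element is the matrix.
  sim <- isa2::isa.in.silico(num.rows = 100, num.cols = 50, num.fact = 3)
  # Run the ISA with default thresholds and seeds.
  isa2::isa(sim[[1]])
}

modules <- demo_isa2()
if (!is.null(modules)) {
  # Each column of `rows` (and of `columns`) describes one module.
  cat("Modules found:", ncol(modules$rows), "\n")
}
```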

Requirements

No additional R package is required to install and use isa2. However, on Linux and Unix systems you will need a C compiler to install it. For example, on Ubuntu Linux you will need to install the build-essential package.
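
One quick way to see whether a C compiler is reachable from R is sketched below. Checking for executables named gcc or cc is an assumption that fits typical Linux and Unix setups; your system may use a different compiler name.

```r
# Look up common C compiler executables on the search path.
# Sys.which() returns an empty string for names it cannot find.
compilers <- Sys.which(c("gcc", "cc"))
has_compiler <- any(nzchar(compilers))

if (has_compiler) {
  message("Found a C compiler: ", compilers[nzchar(compilers)][1])
} else {
  message("No C compiler found on the search path.")
}
```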

Installation

The isa2 package is available from CRAN, the standard R package repository. You can install it on any platform that is supported by GNU R, e.g. Microsoft Windows, Mac OS X and Linux systems. To install it, start R and type

 install.packages("isa2")

at the prompt. On Linux and Unix-like systems, you will need a working C compiler for a successful installation.

License

The isa2 package is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.

Papers

  1. Ihmels JH and Bergmann S. Challenges and prospects in the analysis of large-scale gene expression data. Brief Bioinform 2004 Dec; 5(4) 313-27. pmid:15606968.

  2. Ihmels J, Bergmann S, and Barkai N. Defining transcription modules using large-scale gene expression data. Bioinformatics 2004 Sep 1; 20(13) 1993-2003. doi:10.1093/bioinformatics/bth166 pmid:15044247.

  3. Bergmann S, Ihmels J, and Barkai N. Similarities and differences in genome-wide expression data of six organisms. PLoS Biol 2004 Jan; 2(1) E9. doi:10.1371/journal.pbio.0020009 pmid:14737187.

  4. Ihmels J, Friedlander G, Bergmann S, Sarig O, Ziv Y, and Barkai N. Revealing modular organization in the yeast transcriptional network. Nat Genet 2002 Aug; 31(4) 370-7. doi:10.1038/ng941 pmid:12134151.

Tutorials

  • ISA tutorial