Robust inference of gene regulatory networks using bootstrapping

Revision as of 11:08, 21 February 2014 by DanielMarbach

Background: Genome-scale inference of transcriptional gene regulation has become possible with the advent of high-throughput technologies such as microarrays and RNA sequencing, which provide snapshots of the transcriptome under many experimental conditions. The challenge is to computationally predict, from these data, direct regulatory interactions between transcription factors and their target genes; the set of all predicted interactions constitutes the gene regulatory network. A wide range of network inference methods has been developed to address this challenge. We previously organized a competition (the DREAM5 network inference challenge) in which we rigorously assessed the state of the art in gene network inference (see our paper to learn more). However, the robustness of these predictions to variability in the input data has not yet been characterized.

Goal: The aims of this project are to (1) investigate how robust the performance of the top-performing network inference methods from the DREAM5 challenge is to variability in the input data, (2) improve the quality of predicted networks using a bootstrapping approach, and (3) generate an improved prediction of the transcriptional regulatory network of E. coli and analyze its structural properties.
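
As a rough illustration of the bootstrapping idea in aim (2), the following minimal Python sketch resamples the experimental conditions with replacement, reruns an inference method on each resampled dataset, and averages the resulting edge ranks. It is a sketch under stated assumptions, not the project's actual pipeline: absolute Pearson correlation stands in for a real network inference method, the function names (correlation_scores, bootstrap_network) and the synthetic data are hypothetical, and Python is used here only for illustration.

```python
import numpy as np

def correlation_scores(expr, tf_idx):
    """Score TF -> gene edges by absolute Pearson correlation.

    Stand-in for a real inference method (an assumption of this sketch).
    expr is a genes x conditions matrix; tf_idx lists the rows that are TFs.
    """
    corr = np.corrcoef(expr)                          # genes x genes
    scores = np.nan_to_num(np.abs(corr[tf_idx, :]))   # n_tfs x n_genes
    scores[np.arange(len(tf_idx)), tf_idx] = 0.0      # ignore self-edges
    return scores

def bootstrap_network(expr, tf_idx, n_boot=50, seed=0):
    """Average edge ranks over bootstrap resamples of the conditions."""
    rng = np.random.default_rng(seed)
    n_genes, n_cond = expr.shape
    rank_sum = np.zeros((len(tf_idx), n_genes))
    for _ in range(n_boot):
        # Draw conditions (columns) with replacement.
        cols = rng.integers(0, n_cond, size=n_cond)
        scores = correlation_scores(expr[:, cols], tf_idx)
        # Convert scores to ranks (0 = weakest edge) so runs are comparable.
        ranks = scores.ravel().argsort().argsort().reshape(scores.shape)
        rank_sum += ranks
    return rank_sum / n_boot

# Synthetic example: 5 genes, 40 conditions; gene 2 closely follows TF 0.
data_rng = np.random.default_rng(1)
expr = data_rng.normal(size=(5, 40))
expr[2] = expr[0] + 0.1 * data_rng.normal(size=40)
tf_idx = [0, 1]

avg_rank = bootstrap_network(expr, tf_idx)
top_edge = np.unravel_index(avg_rank.argmax(), avg_rank.shape)
print("Highest-ranked edge (TF row, target gene):", top_edge)
```

Averaging ranks rather than raw scores is one simple aggregation choice; the spread of an edge's rank across bootstrap runs could also serve as a measure of its robustness.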

Mathematical tools: This project is computational in nature. Students will familiarize themselves (at a high level) with gene network inference approaches, ensemble-based approaches in machine learning (bootstrapping, bagging), and basic network properties such as the degree distribution. A programming environment such as R or MATLAB will be used. Network inference tools may have to be run from the command line (Unix console).
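
To make the notion of a degree distribution concrete, here is a small Python sketch that counts, for a list of predicted TF -> target edges, how many regulators have a given out-degree and how many targets have a given in-degree. The edge list and node names are invented placeholders rather than real E. coli interactions, the function name degree_distribution is hypothetical, and Python is used only for illustration.

```python
from collections import Counter

def degree_distribution(edges):
    """Return (out-degree histogram, in-degree histogram) for directed edges.

    edges is an iterable of (regulator, target) pairs. Each histogram maps
    a degree to the number of nodes having that degree. Nodes with degree
    zero do not appear in the edge list and are therefore not counted.
    """
    out_deg = Counter(tf for tf, _ in edges)          # targets per regulator
    in_deg = Counter(target for _, target in edges)   # regulators per target
    return dict(Counter(out_deg.values())), dict(Counter(in_deg.values()))

# Hypothetical predicted network (placeholder names, not real genes).
edges = [
    ("tfA", "g1"), ("tfA", "g2"), ("tfA", "g3"),
    ("tfB", "g2"),
    ("tfC", "g4"),
]

out_hist, in_hist = degree_distribution(edges)
print("out-degree histogram:", out_hist)
print("in-degree histogram:", in_hist)
```

In practice the edge list would come from thresholding a ranked prediction, for example by keeping a fixed number of top-scoring edges; the choice of threshold is left open here.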

Biological or Medical aspects: The students will predict and analyze a genome-wide transcriptional regulatory network for E. coli.

Supervisor: Daniel Marbach

Students: