Module 4: How does feature selection impact integrative clustering analysis?

Revision as of 11:16, 6 March 2015

Title: "How to make valid prognostic models with gene expression signatures?"

Paper to be examined: "The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups", Nature 486(7403):346-352 (2012) [1]

Key claim of the paper: "We have generated a robust, population-based molecular subgrouping of breast cancer based on multiple genomic views. [...] The joint clustering of CNAs and gene expression profiles further resolves the considerable heterogeneity of the expression-only subgroups."

Data and Code

Schedule:
- H1: General introduction to the paper/motivation
- H2: Write code to import the data and practice using the iClusterPlus R package by working through its vignette example
- H3: Reproduce results from Figure 4 on subsample(s) of the data
- H4-5: Write code to import a second dataset and reproduce the clustering results
- H6: Discussion: "What features discriminate the resulting clusters? Do we see the issue? How can we improve?"
- H7-8: Based on the discussion, modify the feature selection and redo the analyses on one (or two) datasets
- H9: Summarize results (e.g. on this wiki)

Key bioinformatics concepts of this module:
- Feature selection (and its importance for cluster analyses)
- Integrative analysis
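
To make the first concept concrete, here is a minimal sketch of one common unsupervised filter: keeping the features with the highest variance across samples before clustering. It is written in plain Python rather than R, it does not use iClusterPlus, and it is not the feature selection used in the paper. The function names `top_variance_features` and `subset` and the toy matrix are hypothetical and serve only as illustration.

```python
from statistics import pvariance


def top_variance_features(rows, k):
    """Return the column indices of the k features with the largest variance.

    rows: list of samples, each a list of numeric feature values.
    """
    n_features = len(rows[0])
    variances = [pvariance([row[j] for row in rows]) for j in range(n_features)]
    # Rank feature indices by decreasing variance (stable sort: ties keep
    # their original column order).
    ranked = sorted(range(n_features), key=lambda j: -variances[j])
    # Report the kept columns in their original order.
    return sorted(ranked[:k])


def subset(rows, columns):
    """Keep only the selected feature columns of every sample."""
    return [[row[j] for j in columns] for row in rows]


# Hypothetical toy data: 4 samples (rows) x 4 features (columns).
# Features 0 and 2 vary strongly but split the samples differently;
# features 1 and 3 are nearly constant.
toy = [
    [0.0, 5.0, 1.0, 2.0],
    [0.1, 5.1, 9.0, 2.0],
    [4.0, 5.0, 1.1, 2.1],
    [4.1, 5.1, 9.1, 2.1],
]

selected = top_variance_features(toy, 2)   # columns kept for clustering
reduced = subset(toy, selected)            # matrix passed on to clustering
```

In this toy matrix, keeping only column 2 would group the first and third samples against the second and fourth, whereas keeping only column 0 would group the first two samples against the last two. Any clustering run on the reduced matrix can only recover structure carried by the features that survive the filter, which is why the selection step deserves scrutiny.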

back to UNIL MSc course: "Forensics in Bioinformatics 2015"
