Module 4: How does feature selection impact integrative clustering analysis?
Revision as of 11:16, 6 March 2015
- Title: "How does feature selection impact integrative clustering analysis?"
- Paper to be examined: “The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups”, Nature 486(7403):346-52 (2012)[1]
- Key claim of the paper: "We have generated a robust, population-based molecular subgrouping of breast cancer based on multiple genomic views. [...] The joint clustering of CNAs and gene expression profiles further resolves the considerable heterogeneity of the expression-only subgroups."
- Data and Code
- Schedule:
  - H1: General introduction to the paper/motivation
  - H2: Write code to import the data and practice with the iClusterPlus R package using the vignette example
  - H3: Reproduce results from Figure 4 on subsample(s) of the data
  - H4-5: Write code to import a second dataset and reproduce the clustering results
  - H6: Discussion: "What features discriminate the resulting clusters? Do we see the issue? How can we improve?"
  - H7-8: Based on the discussion, modify the feature selection and redo the analyses on one or two datasets
  - H9: Summarize results (e.g. on this wiki)
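For the H6 question ("What features discriminate the resulting clusters?"), one simple starting point is to score each feature by how much of its variance is explained by the cluster labels. The sketch below is an illustration only: it is not the method used in the paper or in iClusterPlus, the function name `discrimination_scores` and the toy data are hypothetical, and it is written in plain Python rather than the R used in the module.

```python
from statistics import mean, pvariance


def discrimination_scores(data, labels):
    """Score each feature by the share of its variance explained by the clusters.

    data:   list of samples, each a list of feature values (samples x features)
    labels: one cluster label per sample
    Returns one score per feature in [0, 1]; a higher score means the
    feature separates the clusters more strongly.
    """
    n_samples = len(data)
    n_features = len(data[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in data]
        total = pvariance(column)
        if total == 0:
            scores.append(0.0)  # a constant feature cannot discriminate anything
            continue
        grand_mean = mean(column)
        between = 0.0
        for cluster in set(labels):
            members = [v for v, lab in zip(column, labels) if lab == cluster]
            between += len(members) * (mean(members) - grand_mean) ** 2
        # between-cluster sum of squares divided by total sum of squares
        scores.append(between / (n_samples * total))
    return scores


# Hypothetical toy data: 6 samples, 3 features, 2 clusters.
# Only feature 0 differs between the clusters.
toy_data = [
    [0.0, 5.0, 1.0],
    [0.2, 1.0, 1.1],
    [0.1, 3.0, 0.9],
    [4.0, 5.0, 1.0],
    [4.2, 1.0, 1.1],
    [3.9, 3.0, 0.9],
]
toy_labels = [0, 0, 0, 1, 1, 1]

toy_scores = discrimination_scores(toy_data, toy_labels)
best_feature = max(range(len(toy_scores)), key=lambda j: toy_scores[j])
print(best_feature, [round(s, 3) for s in toy_scores])
```

If a handful of features receive scores close to 1 while the rest are near 0, the clustering is effectively driven by those few features, which is worth checking against how the features were selected in the first place.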
- Key bioinformatics concepts of this module:
  - Feature selection (and its importance for cluster analyses)
  - Integrative analysis
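To make the idea of feature selection concrete, here is a minimal sketch of one common unsupervised filter: keep the `k` features with the highest variance before clustering. This is an assumption chosen for illustration, not necessarily the selection rule used in the paper. The helper names `top_variance_features` and `select_columns` and the toy matrix are hypothetical, and the sketch uses plain Python rather than the R used in the module.

```python
from statistics import pvariance


def top_variance_features(data, k):
    """Return the (sorted) column indices of the k highest-variance features.

    data: list of samples, each a list of feature values (samples x features)
    """
    n_features = len(data[0])
    variances = [pvariance([row[j] for row in data]) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:k])


def select_columns(data, indices):
    """Keep only the given feature columns, preserving sample order."""
    return [[row[j] for j in indices] for row in data]


# Hypothetical toy matrix: 4 samples, 3 features.
# Feature 1 varies a lot, feature 0 a little, feature 2 not at all.
toy_matrix = [
    [1.0, 10.0, 0.5],
    [1.1, 2.0, 0.5],
    [0.9, 6.0, 0.5],
    [1.0, 14.0, 0.5],
]

kept = top_variance_features(toy_matrix, 2)
reduced = select_columns(toy_matrix, kept)
print(kept)
print(reduced)
```

Whatever filter is chosen determines which features the clustering can use at all, so different filters (variance, association between data types, prior knowledge) can lead to different subgroups on the same samples.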