Module 3: How to make valid prognostic models with gene expression signatures?
Latest revision as of 11:03, 28 October 2015
- Title: "How to make valid prognostic models when data contain many features like gene expression signatures?"
- Paper to be examined / reproduced:
“Pitfalls in the Use of DNA Microarray Data for Diagnostic and Prognostic Classification”,
JNCI J Natl Cancer Inst (2003) 95 (1): 14-18; doi: 10.1093/jnci/95.1.14 (http://jnci.oxfordjournals.org/content/95/1/14.full) by R. Simon, M. D. Radmacher, K. Dobbin, L. M. McShane.
Richard Simon's team is at the Biometric Research Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD (http://linus.nci.nih.gov/index.html)
- Key claim of the paper: "Many publications report erroneous classification performances due to incorrect application of cross-validation methodology."
- Data and Code
The study is based on simulated data with known results and shows the impact of variations in the cross-validation implementation with a well-chosen "toy example".
- Approximate Schedule:
- H1: General introduction to the field and to useful terms
- H2: Reading sections of the paper, extracting the main messages and the details of what exactly was done; discussion
- H3-6: Programming by students to reproduce the results of the paper, at least partially. Writing of a short report to be
mailed to the teacher along with the code used.
- H7: Presentation of the results obtained in the course. Discussion of the take-home messages and of possible extensions to the study.
- H8-9: Programming to adjust the code used previously or to explore extensions of the investigation.
Writing of a revised short report to be mailed to the teacher along with the code used.
- Key bioinformatics concepts of this module:
- Prediction models - classifiers
- Cross-validation
- Requirements for students of this module:
- Ability to program in R. Students should come to the course with R ready to use.
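
The pitfall named in the key claim above can be illustrated with a small simulation. The following is a minimal sketch, not the code or the exact setup of the paper: the data sizes, the t-like gene ranking, and the nearest-centroid classifier are all assumptions chosen for brevity, and the function names (`simulate`, `select_genes`, `predict`, `loocv_errors`, `compare`) are hypothetical. It is written in Python only for illustration; the course work itself is expected in R. The data are pure noise, so no classifier can truly do better than chance. The sketch compares leave-one-out cross-validation where genes are selected once on all samples (incorrect) with cross-validation where gene selection is repeated inside every fold (correct).

```python
import random


def simulate(n_samples, n_genes, rng):
    """Pure-noise expression matrix; the class labels carry no real signal."""
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(n_samples)]
    y = [i % 2 for i in range(n_samples)]  # two balanced, arbitrary classes
    return X, y


def select_genes(X, y, rows, k):
    """Return the k genes with the largest |t|-like score, using only `rows`."""
    scores = []
    for g in range(len(X[0])):
        a = [X[i][g] for i in rows if y[i] == 0]
        b = [X[i][g] for i in rows if y[i] == 1]
        ma, mb = sum(a) / len(a), sum(b) / len(b)
        ss = sum((v - ma) ** 2 for v in a) + sum((v - mb) ** 2 for v in b)
        pooled_var = ss / (len(a) + len(b) - 2)
        se = (pooled_var * (1.0 / len(a) + 1.0 / len(b))) ** 0.5
        scores.append((abs(ma - mb) / se, g))
    scores.sort(reverse=True)
    return [g for _, g in scores[:k]]


def predict(X, y, train, test, genes):
    """Nearest-centroid prediction for one held-out sample (assumed classifier)."""
    dist = []
    for label in (0, 1):
        members = [i for i in train if y[i] == label]
        d = 0.0
        for g in genes:
            centroid = sum(X[i][g] for i in members) / len(members)
            d += (X[test][g] - centroid) ** 2
        dist.append(d)
    return 0 if dist[0] <= dist[1] else 1


def loocv_errors(X, y, k, select_inside):
    """Count leave-one-out misclassifications.

    select_inside=False: genes chosen once on ALL samples (information leak).
    select_inside=True:  genes re-chosen on each training set (correct).
    """
    all_rows = list(range(len(X)))
    fixed = None if select_inside else select_genes(X, y, all_rows, k)
    errors = 0
    for test in all_rows:
        train = [i for i in all_rows if i != test]
        genes = select_genes(X, y, train, k) if select_inside else fixed
        errors += int(predict(X, y, train, test, genes) != y[test])
    return errors


def compare(n_datasets=3, n_samples=20, n_genes=500, k=10, seed=1):
    """Total errors over several noise datasets: (selection outside, inside)."""
    rng = random.Random(seed)
    partial = full = 0
    for _ in range(n_datasets):
        X, y = simulate(n_samples, n_genes, rng)
        partial += loocv_errors(X, y, k, select_inside=False)
        full += loocv_errors(X, y, k, select_inside=True)
    return partial, full


if __name__ == "__main__":
    partial, full = compare()
    print("errors, selection outside CV:", partial)
    print("errors, selection inside CV: ", full)
```

With selection done outside the loop, the held-out sample has already influenced which genes are used, so the error count is expected to look far too optimistic. With selection inside the loop, the error count should stay near the chance level of about half the samples, which is the honest answer for noise data.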