Difference between revisions of "Predicting Blood Pressure from the retina using Deep Learning"


Revision as of 19:53, 5 June 2022

File:Retina DNN analysis Alex.pdf

Retina Image Analysis

Background and Motivation

Heart disease has been the leading cause of death worldwide for the last twenty years, so finding ways to prevent it is of great importance. In this project, fundus images of the retinas of tens of thousands of participants, collected by the UK Biobank, are used together with a dataset of biologically relevant variables for two purposes. First, GWAS analysis of some of the variables in the dataset lets us examine how strongly they are associated with the genome. Second, the dataset is used to refine the selection of retinal images that are fed to a classification model called DenseNet, whose output is a prediction of hypertension. A key point for both analyses, and especially for the classification part, is that mathematically sound data cleaning should improve the results: more significant GWAS p-values, or a higher accuracy of hypertension prediction.

Data cleaning processes

The data were collected from the UK Biobank and consist of:

1. Retina images of the left eye, the right eye, or both eyes of each participant. In addition, a few hundred participants had a replica image taken of either their left or right eye.

2. A 92366 x 47 dataset in which each row corresponds to one left or right retina image. The columns hold biologically relevant variables previously measured on those images.

The cleaning process involved:

1. Removing 20 variables on the recommendation of the assistants and splitting the dataset in two: one part containing only participants who had exactly one left-eye image (labelled "L") and one right-eye image (labelled "R") and nothing else, and the other containing each replica image (labelled "1") alongside its original (labelled "0").

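2. Applying <math> \delta = \frac{|L-R|}{L+R} </math> to the left-right dataset and <math> \delta = \frac{|x_0-x_1|}{x_0+x_1} </math> to the original-replica dataset. Here <math>L</math> and <math>R</math> are the values of a given variable for the left and right eye, and <math>x_0</math> and <math>x_1</math> are its values for the original (labelled "0") and replica (labelled "1") image.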
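
The relative difference used in step 2 can be illustrated with a minimal Python sketch. It is not the project's actual code: the function name `relative_difference`, the participant identifiers, and the numeric values are hypothetical, and returning `None` when the two values sum to zero is an assumption made here to avoid dividing by zero.

```python
from typing import Dict, Optional


def relative_difference(a: float, b: float) -> Optional[float]:
    """Return delta = |a - b| / (a + b), or None if a + b is zero.

    The None convention for a zero denominator is an assumption of this
    sketch, not something specified in the text.
    """
    total = a + b
    if total == 0:
        return None
    return abs(a - b) / total


# Hypothetical values of one variable measured on left ("L") and right ("R") eyes.
measurements: Dict[str, Dict[str, float]] = {
    "participant_A": {"L": 1.0, "R": 3.0},
    "participant_B": {"L": 2.0, "R": 2.0},
}

# One delta per participant for this variable.
deltas = {
    pid: relative_difference(values["L"], values["R"])
    for pid, values in measurements.items()
}
print(deltas)
```

The same function applies unchanged to the original-replica dataset by passing the values labelled "0" and "1" instead of "L" and "R". A delta of 0 means the two measurements agree exactly; for non-negative values, delta approaches 1 as they diverge.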


Deep Learning Model

This part of the project used the previously defined <math>\delta</math> variable to sort the images used as input for the classifier. A CNN model was built by the CBG to predict hypertension from retina fundus images. Our aim was to improve its predictions by reducing technical error in the input images.
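
One way to picture the sorting step is the sketch below, which ranks images by their delta value and keeps only the fraction with the smallest values before they are passed to the classifier. This is an illustration under assumptions, not the procedure actually used: the function `select_low_delta`, the `keep_fraction` parameter, and the image identifiers are all hypothetical.

```python
from typing import Dict, List


def select_low_delta(deltas: Dict[str, float], keep_fraction: float) -> List[str]:
    """Rank image IDs by ascending delta and keep the lowest fraction.

    keep_fraction is a hypothetical tuning parameter in (0, 1].
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    ranked = sorted(deltas, key=lambda image_id: deltas[image_id])
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:n_keep]


# Hypothetical delta values, one per image.
example_deltas = {"img_a": 0.40, "img_b": 0.05, "img_c": 0.10, "img_d": 0.25}

# Keep the half of the images with the smallest delta.
kept = select_low_delta(example_deltas, keep_fraction=0.5)
print(kept)
```

Only the images in `kept` would then be used as classifier input; how strict the cut-off should be is a choice that has to be validated against prediction accuracy.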

GWAS