Now some more about population specific betas, and their interpretation. We will look at slides 176-181 in some details, using equations from slides 150-154, but without the sample size correction since we have allele frequencies.

np<-4
#Bruce's example (slides 176-177)
paf1<-c(0.5,0.5)
paf2<-c(0.3,0.7)
pin1<-c(0.28,0.72)
pin2<-c(0.32,0.68)
mp<-cbind(paf1,paf2,pin1,pin2) # all pop all freq
hs<-1-t(mp)%*%mp #all diversities
(hb<-1/np/(np-1)*sum(hs-diag(4)*diag(hs))) # world Hb
## [1] 0.4601333
(hba<-1/2*sum(hs[1:2,1:2]-diag(2)*hs[1:2,1:2])) # afr Hb
## [1] 0.5
(hbi<-1/2*sum(hs[3:4,3:4]-diag(2)*hs[3:4,3:4])) # In Hb
## [1] 0.4208
(1-diag(hs)/hb) # betai world
##        paf1        paf2        pin1        pin2 
## -0.08664155  0.08722110  0.12373225  0.05418719
(1-hs[1,1]/hba) # beta af1 rel africa
## [1] 0
(1-hs[2,2]/hba) # beta af2 rel africa
## [1] 0.16
(1-hs[3,3]/hbi) # beta in1 rel in
## [1] 0.0418251
(1-hs[4,4]/hbi) # beta in2 rel in
## [1] -0.03422053

Now we can write a function to carry out the calculations, basically a wrapper around the code of teh previous section:

betas.freq<-function(freq){
np<-dim(freq)[2]
hs<-1-t(freq)%*%freq #all diversities
hb<-1/np/(np-1)*sum(hs-diag(np)*diag(hs)) # world Hb
betas<-1-diag(hs)/hb
betaw<-mean(betas)
return(list(betaw=betaw,betai=betas,hs=hs,hb=hb))
}
betas.freq(mp) #world
## $betaw
## [1] 0.04462475
## 
## $betai
##        paf1        paf2        pin1        pin2 
## -0.08664155  0.08722110  0.12373225  0.05418719 
## 
## $hs
##      paf1  paf2   pin1   pin2
## paf1  0.5 0.500 0.5000 0.5000
## paf2  0.5 0.420 0.4120 0.4280
## pin1  0.5 0.412 0.4032 0.4208
## pin2  0.5 0.428 0.4208 0.4352
## 
## $hb
## [1] 0.4601333
betas.freq(mp[,1:2]) #africa only
## $betaw
## [1] 0.08
## 
## $betai
## paf1 paf2 
## 0.00 0.16 
## 
## $hs
##      paf1 paf2
## paf1  0.5 0.50
## paf2  0.5 0.42
## 
## $hb
## [1] 0.5
betas.freq(mp[,3:4]) #Inuit only
## $betaw
## [1] 0.003802281
## 
## $betai
##        pin1        pin2 
##  0.04182510 -0.03422053 
## 
## $hs
##        pin1   pin2
## pin1 0.4032 0.4208
## pin2 0.4208 0.4352
## 
## $hb
## [1] 0.4208

Now, imagine that we are dealing with microsats (STRs) rather than snps. Each locus can have many alleles. For instance, 4 alleles. We are thinking of the first 2 populations being from Africa, with a lot of variation, while the other two are for instance inuit, and have been populated from a sample of the african variation. For instance:

paf1<-c(0.23,0.22,0.25,0.30)
paf2<-c(0.24,0.23,0.27,0.26)
pin1<-c(0.82,0.18,0,0)
pin2<-c(0.87,0.13,0,0)

mp<-cbind(paf1,paf2,pin1,pin2) # all pop all freq

We can now estimate beta worlwide, or by continent:

betas.freq(mp) #world
## $betaw
## [1] 0.2582575
## 
## $betai
##       paf1       paf2       pin1       pin2 
## -0.0978642 -0.1019838  0.5656801  0.6671980 
## 
## $hs
##        paf1   paf2   pin1   pin2
## paf1 0.7462 0.7487 0.7718 0.7713
## paf2 0.7487 0.7490 0.7618 0.7613
## pin1 0.7718 0.7618 0.2952 0.2632
## pin2 0.7713 0.7613 0.2632 0.2262
## 
## $hb
## [1] 0.6796833
betas.freq(mp[,1:2]) #africa only
## $betaw
## [1] 0.001469213
## 
## $betai
##          paf1          paf2 
##  0.0033391211 -0.0004006945 
## 
## $hs
##        paf1   paf2
## paf1 0.7462 0.7487
## paf2 0.7487 0.7490
## 
## $hb
## [1] 0.7487
betas.freq(mp[,3:4]) #Inuit only
## $betaw
## [1] 0.00949848
## 
## $betai
##       pin1       pin2 
## -0.1215805  0.1405775 
## 
## $hs
##        pin1   pin2
## pin1 0.2952 0.2632
## pin2 0.2632 0.2262
## 
## $hb
## [1] 0.2632