These functions are used in the upcoming manuscript on the application of Haseman-Elston regression to estimate heritability (h2) and genetic correlations (rg) among traits measured in Diversity Outbred (DO) mouse populations. See Test_HE_Reg.pdf for a knit of the .Rmd file under /code.
Haseman-Elston regression is a method for estimating the heritability (h2) of traits or the genetic correlations (rg) among pairs of traits. At a high level, this involves regressing phenotypic similarity (defined here as the product of z-normalized trait values) on kinship among pairs of individuals. The regression coefficient (β) can be used to estimate the covariance of traits as a function of kinship (ρ):
ρ = β / (2*σ1*σ2)
where σ1 is the standard deviation of the first trait and σ2 is the standard deviation of the second. For a single trait, ρ = h2. For a pair of traits, ρ can be used to estimate rg via a structural equation model if the heritabilities of both traits are known.
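As a rough illustration of the conversion above, the sketch below fits the regression with base R's `lm()` on simulated values and applies the formula to the slope. The data are synthetic, the variable names (`kin`, `similarity`, `rho`) are chosen here for illustration, and none of this is taken from HER_functions.R.

```r
# Simulated inputs for illustration only (not DO data).
set.seed(42)
n_pairs <- 2000
kin <- runif(n_pairs, 0, 1)                         # pairwise kinship values
similarity <- 0.6 * kin + rnorm(n_pairs, sd = 0.1)  # stand-in for z-score products, true slope 0.6

# Regress phenotypic similarity on kinship; the slope is beta.
fit <- lm(similarity ~ kin)
beta <- unname(coef(fit)["kin"])

# z-normalized traits have unit standard deviation.
sigma1 <- 1
sigma2 <- 1

# Apply the conversion given in the text: rho = beta / (2 * sigma1 * sigma2).
rho <- beta / (2 * sigma1 * sigma2)
```

Because the traits are z-normalized, both standard deviations equal 1 and the conversion reduces to halving the slope.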
Here, we have implemented Haseman-Elston regression for use with data derived from studies in Diversity Outbred (DO) mice. However, these functions can, in theory, be applied to any set of z-score normalized phenotype data with an accompanying kinship matrix of the sort generated via R/qtl2.
Functions include:
- z-score normalization of vectors of trait values.
- generation of combinatorial pairwise products from pairs of z-normalized trait vectors.
- estimation of h2 or rg.
- adjustments to the standard error of the estimates as a function of trait h2 and sample size.
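The first two items in the list can be sketched in a few lines of base R. The helper names `z_normalize` and `pairwise_products` are hypothetical and may not match the names or signatures used in HER_functions.R; the sketch also assumes that each unordered pair of distinct individuals is counted once.

```r
# Hypothetical helpers for illustration; not the functions in HER_functions.R.

# z-score normalize a numeric vector (mean 0, sample SD 1), ignoring NAs.
z_normalize <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

# Products z1[i] * z2[j] for every pair of distinct individuals with i < j.
# Assumed convention: each pair is used once and self-pairs are excluded.
pairwise_products <- function(z1, z2 = z1) {
  stopifnot(length(z1) == length(z2))
  pairs <- combn(length(z1), 2)   # 2 x choose(n, 2) matrix of index pairs
  z1[pairs[1, ]] * z2[pairs[2, ]]
}

z <- z_normalize(c(1, 2, 3))    # -1, 0, 1
prods <- pairwise_products(z)   # pairs (1,2), (1,3), (2,3)
```

Passing a single vector gives the within-trait products used for h2; passing two z-normalized vectors gives cross-trait products of the kind used for rg.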
The main HER_functions.R file is a collection of functions that can be sourced in R to estimate h2 and rg for sets of z-score normalized phenotypes. This repo also contains an R Markdown file, Test_HE_Reg.Rmd, that serves as documentation and demonstrates the use of each function.
The repo additionally includes a set of phenotype data, DO_Phenotypes_non_molecular.RData (corresponding to the 240507 version of the non-molecular data), and the corresponding kinship information, kinship_loco.RData, which can be used to test the functions. Functions are loaded via:
source('/path/to/HER_functions.R')
Required R packages:
- dplyr
- parallel
For the .Rmd specifically:
- scales
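One way to confirm that the packages listed above are available before sourcing the functions is sketched below. The variable and helper names are chosen here for illustration and are not part of the repo; the snippet only reports missing packages and does not install anything.

```r
# Packages named in this README; scales is only needed for the .Rmd.
required <- c("dplyr", "parallel")
rmd_only <- "scales"
all_pkgs <- c(required, rmd_only)

# TRUE for each package that can be loaded in the current R library.
is_installed <- function(pkgs) {
  vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
}

missing_pkgs <- all_pkgs[!is_installed(all_pkgs)]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All listed packages are available.")
}
```

Note that parallel ships with base R, so only dplyr and scales would normally need to be installed, for example with `install.packages()`.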