rNeighborGWAS

Overview

This R package provides a set of functions to test neighbor effects in marker-based regressions. In this vignette, we first estimate an effective range of neighbor effects, and then perform an association mapping of neighbor effects. See Sato et al. (2019) for model description.

Input files

First of all, let us see the data structure of input files necessary for the neighbor GWAS. Here is an example using a simulated dataset. Genotype data are a matrix including individuals (rows) x markers (columns) with -1 or +1 digit for bialleles. A spatial map indicates a individual distribution along x-axis and y-axis in a 2D space.

library(rNeighborGWAS)

#simulate "fake_nei" dataset using nei_simu()
set.seed(1)
g <- matrix(sample(c(-1,1),100*1000,replace = TRUE),100,1000)
gmap <- cbind(c(rep(1,nrow(g)/2),rep(2,nrow(g)/2)),c(1:ncol(g)))
x <- runif(nrow(g),1,100)
y <- runif(nrow(g),1,100)
smap <- cbind(x,y)
grouping <- c(rep(1,nrow(g)/2), rep(2,nrow(g)/2))
pheno <- nei_simu(geno=g, smap=smap, scale=44,
                  grouping=grouping, n_causal=50,
                  pveB=0.4, pve=0.8
                  )

fake_nei <- list()
fake_nei[[1]] <- g
fake_nei[[2]] <- gmap
fake_nei[[3]] <- smap
fake_nei[[4]] <- data.frame(pheno,grouping)
names(fake_nei) <- c("geno","gmap","smap","pheno")

fake_nei$geno[1:5,1:10]
#>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,]   -1    1    1   -1   -1   -1   -1    1   -1     1
#> [2,]    1    1    1   -1   -1   -1   -1    1   -1     1
#> [3,]   -1    1   -1   -1   -1   -1    1    1    1    -1
#> [4,]   -1   -1    1   -1    1   -1   -1    1    1     1
#> [5,]    1   -1    1   -1   -1   -1   -1   -1   -1     1
head(fake_nei$smap)
#>             x        y
#> [1,] 70.35128 31.43208
#> [2,] 70.47472 42.45353
#> [3,] 31.98069 56.06230
#> [4,] 55.68317 78.83193
#> [5,] 87.37672 39.53956
#> [6,] 68.43125 98.53647

Variation partitioning across a space

To estimate an effective range of neighbor effects, we calculate a heritability-like metric, that is the proportion of phenotypic variation explained by neighbor effects (PVE_nei). The optimal scale of neighbor effects is analyzed by how PVE_nei approaches to a plateau.

min_s <- min_dist(fake_nei$smap, fake_nei$pheno$grouping)
scale_seq <- c(min_s, quantile(dist(fake_nei$smap),c(0.2*rep(1:5))))

pve_out <- calc_PVEnei(geno=fake_nei$geno, pheno=fake_nei$pheno[,1],
                       smap=fake_nei$smap, scale_seq=scale_seq,
                       addcovar=as.matrix(fake_nei$pheno$grouping),
                       grouping=fake_nei$pheno$grouping
                       )
#> scale = 21.9252977049056
#> scale = 28.5810503901963
#> scale = 43.2690896363057
#> scale = 57.3188266134066
#> scale = 72.5152771863378
#> scale = 122.47901977593
delta_PVE(pve_out)

#>      scale     PVEnei 
#> 43.2690896  0.8655676

Association mapping

Based on the estimated scale of neighbor effects, we then perform genome-wide association mapping of neighbor effects as follows.

scale <- 43
gwas_out <- neiGWAS(geno=fake_nei$geno, pheno=fake_nei$pheno[,1],
                    gmap=fake_nei$gmap, smap=fake_nei$smap,
                    scale=scale, addcovar=as.matrix(fake_nei$pheno$grouping),
                    grouping=fake_nei$pheno$grouping
                    )

gaston::manhattan(gwas_out)

gaston::qqplot.pvalues(gwas_out$p)

References

Sato Y, Yamamoto E, Shimizu KK, Nagano AJ (2019) Neighbor GWAS: incorporating neighbor genotypic identity into genome-wide association studies of field herbivory on Arabidopsis thaliana. bioRxiv https://doi.org/10.1101/845735