apex

Toolkit for QTL mapping and meta-analysis.

APEX: trans-xQTL analysis guide

This page describes trans-xQTL analysis using APEX. Once installed, you can quickly get started by running ./apex trans --help.

Overview

The command apex trans can be used to analyze genome-wide associations between molecular traits and all genetic variants. This is in contrast to apex cis, which analyzes only genetic variants within a window around each molecular trait. The underlying statistical methods are broadly similar between modes cis and trans; however, mode trans introduces additional optimizations to reduce computation time, memory, and storage.
As in apex cis mode, trans-xQTL analysis in APEX (apex trans) uses either (a) ordinary least squares (OLS) or (b) a linear mixed model (LMM) with either a genetic relatedness matrix (GRM) or a low-rank matrix of random-effect covariates. For detailed descriptions of input file formats, please see the input file documentation page.

Table of Contents
  1. OLS trans-xQTL analysis
  2. LMM trans-xQTL analysis
  3. Command line arguments

Return to APEX main page.

OLS trans-xQTL analysis

Example command:
./apex trans --vcf {vcf} --bed {expression-file} --cov {covariate-file} --prefix {output-prefix}

QTL software concordance. When no GRM or random effects are specified, APEX single-variant output is equivalent to the R regression model lm(traits[,j] ~ covariates + genotype[,k]) for each trait j and genotype k. APEX output is additionally equivalent to FastQTL single-variant output. Note that some tools, such as QTLtools, instead fit the model lm(residuals[,j] ~ genotype[,k]) where residuals[,j] = resid(lm(traits[,j] ~ covariates)). APEX can mimic this model if the flag --no-resid-geno is specified. This approach is slightly faster than standard OLS, but can cause conservative p-values (loss of statistical power). For accepted input file formats, please see the input file documentation page.
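
The difference between the two models above can be checked numerically. The following is a minimal sketch in plain Python, not APEX code: it simulates one trait, one covariate, and one genotype (all simulated values and helper names are illustrative assumptions) and compares the genotype coefficient from the full model with the coefficient obtained when only the trait is residualized. With a single covariate, the trait-only estimate equals the full-model estimate multiplied by (1 - r2), where r2 is the squared correlation between genotype and covariate, so it shrinks toward zero whenever the two are correlated. Residualizing both trait and genotype recovers the full-model coefficient.

```python
import random

def mean(x):
    return sum(x) / len(x)

def cross(x, y):
    """Sum of cross-products of x and y after centering both."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y))

def residualize(y, c):
    """Residuals of y after regressing it on an intercept and covariate c."""
    my, mc = mean(y), mean(c)
    b = cross(c, y) / cross(c, c)
    return [(yi - my) - b * (ci - mc) for yi, ci in zip(y, c)]

def slope(y, x):
    """OLS slope from lm(y ~ x), intercept included."""
    return cross(x, y) / cross(x, x)

def full_model_slope(y, c, g):
    """Genotype coefficient from lm(y ~ c + g), via the 2x2 normal equations."""
    scc, sgg, scg = cross(c, c), cross(g, g), cross(c, g)
    scy, sgy = cross(c, y), cross(g, y)
    return (scc * sgy - scg * scy) / (scc * sgg - scg ** 2)

# Simulated data (assumption: one covariate that is correlated with genotype).
random.seed(1)
n = 500
covar = [random.gauss(0, 1) for _ in range(n)]
geno = []
for c in covar:
    freq = 0.5 if c > 0 else 0.2          # allele frequency depends on covariate
    geno.append(sum(random.random() < freq for _ in range(2)))  # dosage 0/1/2
trait = [0.5 * c + 0.3 * g + random.gauss(0, 1) for c, g in zip(covar, geno)]

b_full = full_model_slope(trait, covar, geno)                      # lm(y ~ c + g)
b_both = slope(residualize(trait, covar), residualize(geno, covar))
b_trait_only = slope(residualize(trait, covar), geno)              # lm(resid ~ g)
r2 = cross(covar, geno) ** 2 / (cross(covar, covar) * cross(geno, geno))

print("full model:            ", b_full)
print("both residualized:     ", b_both)
print("trait-only residualized:", b_trait_only)
print("full * (1 - r2):        ", b_full * (1 - r2))
```

The shrinkage of the trait-only estimate is consistent with the conservative behavior noted above; when genotype and covariates are uncorrelated, r2 is close to zero and the two approaches agree closely.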

LMM trans-xQTL analysis

Example commands:

## Estimate null LMM models for all molecular traits and
## store estimates for later use:
./apex lmm --vcf {vcf} --bed {expression-file} --cov {covariate-file} --grm {grm-file} --fit-null --prefix {theta-prefix}
## Run trans-xQTL analysis, re-using variance component
## estimates from the previous step:
./apex trans --vcf {vcf} --bed {expression-file} --cov {covariate-file} --grm {grm-file} --theta-file {theta-prefix}.theta.gz --prefix {output-prefix}


Here, a linear mixed model (LMM) is used to account for cryptic or familial relatedness in trans-xQTL analysis. To use this feature, specify a genetic relatedness matrix (GRM) file to APEX using --grm {grm-file}. For accepted input file formats, please see the input file documentation page.
LMM analysis is divided into two steps. First, we estimate variance component parameters for all molecular traits under the null hypothesis (no single-variant genetic effects), and store these estimates for later use. Second, we use these estimates to quickly calculate trans-xQTL association statistics. When jobs are parallelized across chromosomes, this two-step approach saves substantial computational resources, because the null model for each molecular trait needs to be estimated only once.
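
The two-step workflow can be sketched as a small Python helper that builds one null-fit command and one apex trans command per chromosome, all sharing the same theta file. This is an illustrative sketch only: it assumes genotypes are already split into one VCF per chromosome, and the function name, file names, and per-chromosome output prefixes are hypothetical. It uses only the flags shown in the example commands above, and simply passes whichever VCF the caller supplies to the null-fit step.

```python
import shlex

def two_step_commands(chrom_vcfs, null_vcf, bed, cov, grm, theta_prefix, out_prefix):
    """Build (null_fit_cmd, [trans_cmd, ...]) as argument lists.

    chrom_vcfs: dict mapping a chromosome label to its VCF path
                (assumed layout: one VCF per chromosome).
    """
    shared = ["--bed", bed, "--cov", cov, "--grm", grm]
    # Step 1: estimate the null model once for all molecular traits.
    null_fit = ["./apex", "lmm", "--vcf", null_vcf, *shared,
                "--fit-null", "--prefix", theta_prefix]
    # Step 2: every per-chromosome job re-uses the same theta file.
    theta_file = f"{theta_prefix}.theta.gz"
    trans = [
        ["./apex", "trans", "--vcf", vcf, *shared,
         "--theta-file", theta_file, "--prefix", f"{out_prefix}.{chrom}"]
        for chrom, vcf in chrom_vcfs.items()
    ]
    return null_fit, trans

# Hypothetical file names, for illustration only.
null_fit, trans_cmds = two_step_commands(
    chrom_vcfs={"chr1": "geno.chr1.vcf.gz", "chr2": "geno.chr2.vcf.gz"},
    null_vcf="geno.chr1.vcf.gz",
    bed="expression.bed.gz",
    cov="covariates.txt",
    grm="relatedness.grm.gz",
    theta_prefix="null_fit",
    out_prefix="trans_results",
)

print(shlex.join(null_fit))
for cmd in trans_cmds:
    print(shlex.join(cmd))
```

The printed commands can be submitted to a job scheduler, with the per-chromosome jobs started only after the null-fit step has written its theta file.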

LMM software concordance. APEX’s LMM estimates are consistent with those from the R packages GMMAT and GENESIS using AI-REML.

Command line arguments

A partial list of options is given below. Please run ./apex trans --help to see a complete list of command line flags and options.