APEX: trans-xQTL analysis guide

This page describes trans-xQTL analysis using APEX. Once APEX is installed, you can quickly get started by running ./apex trans --help.

Overview

The command apex trans analyzes genome-wide associations between molecular traits and all genetic variants. This is in contrast to apex cis, which analyzes only genetic variants within a window around each molecular trait. The underlying statistical methods are broadly similar between the cis and trans modes; however, trans mode introduces additional optimizations to reduce computation time, memory, and storage.
As in apex cis, trans-xQTL analysis in APEX (apex trans) uses either a) ordinary least squares (OLS) or b) a linear mixed model (LMM) with either a genetic relatedness matrix (GRM) or a low-rank matrix of random-effect covariates. For detailed descriptions of input file formats, please see the input file documentation page.

OLS trans-xQTL analysis
LMM trans-xQTL analysis
Command line arguments

OLS trans-xQTL analysis

Example command:
./apex trans --vcf {vcf} --bed {expression-file} --cov {covariate-file} --prefix {output-prefix}

QTL software concordance. When no GRM or random effects are specified, APEX single-variant output is equivalent to the R regression model lm(traits[,j] ~ covariates + genotype[,k]) for each trait j and genotype k. APEX output is also equivalent to FastQTL single-variant output. Note that some tools, such as QTLtools, instead fit the model lm(residuals[,j] ~ genotype[,k]), where residuals[,j] = resid(lm(traits[,j] ~ covariates)). APEX can mimic this model if the flag --no-resid-geno is specified. This approach is slightly faster than standard OLS, but can yield conservative p-values (a loss of statistical power). For accepted input file formats, please see the input file documentation page.
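The difference between the two regression models described above can be illustrated with a small simulation. The following is a minimal sketch in plain Python, not APEX's implementation: the simulated data, the helper names (solve, ols, resid), and the single covariate are all assumptions made for illustration. It compares the genotype slope from the full model, from residualizing both trait and genotype, and from residualizing the trait only (the model the text associates with --no-resid-geno).

```python
import random

def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * p for a, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(columns, y):
    """Least-squares coefficients for y regressed on the given columns (normal equations)."""
    XtX = [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]
    Xty = [sum(a * b for a, b in zip(ci, y)) for ci in columns]
    return solve(XtX, Xty)

def resid(columns, y):
    """Residuals of y after regressing on the given columns."""
    beta = ols(columns, y)
    return [yi - sum(b * col[i] for b, col in zip(beta, columns)) for i, yi in enumerate(y)]

# Simulated data (hypothetical): the genotype is correlated with the covariate,
# as can happen with population structure.
random.seed(1)
n = 200
ones = [1.0] * n
cov = [random.gauss(0, 1) for _ in range(n)]
geno = [float(sum(random.random() < (0.5 if c > 0 else 0.2) for _ in range(2))) for c in cov]
trait = [0.5 * g + 1.0 * c + random.gauss(0, 1) for g, c in zip(geno, cov)]

# Full model: trait ~ intercept + covariate + genotype.
beta_full = ols([ones, cov, geno], trait)[2]

# Residualize trait and genotype on the covariates, then regress one on the other.
trait_res = resid([ones, cov], trait)
geno_res = resid([ones, cov], geno)
beta_both = ols([ones, geno_res], trait_res)[1]

# Residualize the trait only and regress on the raw genotype.
beta_trait_only = ols([ones, geno], trait_res)[1]

print("full model slope:        ", beta_full)
print("both residualized slope: ", beta_both)
print("trait-only residualized: ", beta_trait_only)
```

In this sketch the first two slopes agree up to floating-point error, while the trait-only slope is shrunk toward zero whenever the genotype is correlated with the covariates, which is consistent with the conservative behavior noted above.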

LMM trans-xQTL analysis

Example command:

## Estimate null LMM models for all molecular traits and 
## store estimates for later use:
 ./apex lmm --vcf {vcf} --bed {expression-file} --cov {covariate-file} --grm {grm-file} --fit-null --prefix {theta-prefix}
## Run trans-xQTL analysis, re-using variance component 
## estimates from the previous step:
 ./apex trans --vcf {vcf} --bed {expression-file} --cov {covariate-file} --grm {grm-file} --theta-file {theta-prefix}.theta.gz --prefix {output-prefix}

In this workflow, a linear mixed model (LMM) is used to account for cryptic or familial relatedness in trans-xQTL analysis. To use this feature, pass a genetic relatedness matrix (GRM) file to APEX with --grm {grm-file}. For accepted input file formats, please see the input file documentation page.
LMM analysis is divided into two steps. First, we estimate variance component parameters for all molecular traits under the null hypothesis (no single-variant genetic effects), and store these estimates for later use. Second, we use these estimates to quickly calculate trans-xQTL association statistics. When jobs are parallelized across chromosomes, this two-step approach saves substantial computational resources, because the null model for each molecular trait needs to be estimated only once.
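The two-step workflow can be sketched as a small job builder that fits the null model once and then prepares one trans job per chromosome, all pointing at the same theta file. This is only an illustration under stated assumptions: the function name apex_lmm_jobs, all file names, and the split of genotypes into one VCF per chromosome are hypothetical, and passing one of the per-chromosome VCFs to the null-model step is an assumption rather than documented behavior. Only flags shown in the example commands above are used, and the commands are printed, not executed.

```python
import shlex

def apex_lmm_jobs(vcf_by_chrom, bed, cov, grm, theta_prefix, out_prefix):
    """Build the null-model command and one trans command per chromosome.

    Commands are returned as argument lists; nothing is executed here.
    """
    shared = ["--bed", bed, "--cov", cov, "--grm", grm]

    # Step 1: estimate the null LMM once for all traits.
    # Assumption: any one of the per-chromosome VCFs can be supplied here.
    first_vcf = next(iter(vcf_by_chrom.values()))
    null_cmd = ["./apex", "lmm", "--vcf", first_vcf, *shared,
                "--fit-null", "--prefix", theta_prefix]

    # Step 2: one trans job per chromosome, each reusing the same theta file.
    theta_file = theta_prefix + ".theta.gz"
    trans_cmds = [
        ["./apex", "trans", "--vcf", vcf, *shared,
         "--theta-file", theta_file, "--prefix", f"{out_prefix}.{chrom}"]
        for chrom, vcf in vcf_by_chrom.items()
    ]
    return null_cmd, trans_cmds

# Hypothetical input files, one VCF per autosome.
vcfs = {f"chr{i}": f"genotypes.chr{i}.vcf.gz" for i in range(1, 23)}
null_cmd, trans_cmds = apex_lmm_jobs(
    vcfs, "expression.bed.gz", "covariates.txt", "kinship.grm",
    theta_prefix="null_fit", out_prefix="trans_results",
)

print(shlex.join(null_cmd))
for cmd in trans_cmds:
    print(shlex.join(cmd))
```

The printed lines could be submitted to a scheduler, with the per-chromosome jobs starting only after the null-model job has written its theta file.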

LMM software concordance. APEX’s LMM estimates are consistent with the R packages GMMAT and GENESIS using AI-REML.

Command line arguments

A partial list of options is given below. Please run ./apex trans --help to see a complete list of command line flags and options.

General options
- --pvalue {P} : Only report trans-xQTL associations with p-value <= {P}.
Output options
- --prefix, -o : Output file prefix.
- --long, -l : Write trans-xQTL results in long-table format.
Scale and transform options
- --rankNormal : Apply rank normal transform to trait values.
- --rankNormal-resid : Apply rank normal transform to residuals (can be combined with --rankNormal). [Not compatible with LMM].
- --no-resid-geno : Do not residualize genotypes (not recommended). Output using this flag is concordant with QTLtools and some other tools.
Computational resources
- --threads {N} : Number of threads to use (should not exceed the number of available cores).
- --low-mem : Reduce memory usage by reading and processing genotypes in chunks.
Subsetting samples
- --exclude-iids {LIST} : Comma-delimited list of sample IDs to exclude.
- --include-iids {LIST} : Only include the specified comma-delimited sample IDs.
Filtering regions and variants
- --region {chr:start-end} : Only analyze variants and traits within the specified region.
- --gene {LIST} : Only analyze the specified comma-delimited molecular trait IDs.
- --exclude-snps {LIST} : Comma-delimited list of SNPs to exclude.
- --include-snps {LIST} : Only include the specified comma-delimited SNPs.
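For readers unfamiliar with the rank normal transform named in the scale and transform options above, the following is a minimal sketch of a rank-based inverse normal transform in plain Python. It shows the general technique only: the function name rank_normal, the offset convention, and the tie handling are assumptions, and the exact formula APEX applies with --rankNormal is not specified on this page.

```python
from statistics import NormalDist

def rank_normal(values, offset=0.5):
    """Rank-based inverse normal transform; tied values receive their average rank.

    Each rank r (1-based) is mapped to the standard normal quantile of
    (r - offset) / (n - 2 * offset + 1). The offset is an assumption:
    0.5 gives (r - 0.5) / n, and 0.375 gives the Blom variant.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the block of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - offset) / (n - 2 * offset + 1)) for r in ranks]

# Hypothetical trait values for five samples, including one tie.
example = [3.2, 1.1, 8.7, 1.1, 5.0]
z = rank_normal(example)
print(z)
```

The transform preserves the ordering of the input and depends only on ranks, so extreme trait values cannot dominate the regression. Applying the same function to the residuals of a trait after regressing out covariates would correspond conceptually to the residual variant of the option.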
