APEX: xQTL meta-analysis guide
This page describes xQTL meta-analysis using APEX. Once installed, you can quickly get started by running ./apex meta --help.
Overview
The command apex meta can be used for xQTL meta-analysis, including single-variant and joint/conditional multiple-variant analysis, from xQTL association summary statistics. xQTL summary statistics should be generated by apex cis (or follow a similar format), and study-specific variance-covariance data (only required for multiple-variant analysis) can be generated using apex store.
Table of Contents
- Generating QTL summary statistics
- Single-variant xQTL meta-analysis
- Multiple-variant xQTL meta-analysis
- Command line options
Generating QTL summary statistics
Example command:
# Generate sumstats
./apex cis --region chr1 --bed {study1_bed} --cov {study1_cov} --vcf {study1_bcf} --out {study1_chr1} --window 1000000
# Generate vcov (covariate-adjusted LD)
./apex store --region chr1 --bed {study1_bed} --cov {study1_cov} --vcf {study1_bcf} --out {study1_chr1} --window 1000000
The above commands generate cis-QTL summary statistics and vcov (covariate-adjusted LD) files for downstream analysis from summary statistics. Here, we specify a 1 Mbp window around each gene, and therefore LD will be calculated in a 2-Mbp sliding window. Note: generating LD files can be time-consuming due to compression, and may take several hours. We recommend running the commands one chromosome at a time. See here for information on expected output file sizes.
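Because the recommendation above is to run one chromosome at a time, a small wrapper loop can help. The following is a minimal sketch, assuming placeholder input paths (study1.bed.gz, study1.cov.txt, study1.bcf), contigs named chr1 to chr22, and output prefixes of the form study1_chrN; none of these names come from APEX itself. It defaults to a dry run that only prints the commands, so nothing is executed until DRY_RUN=0 is set.

```shell
#!/bin/sh
set -eu

# Hypothetical input paths; replace with your own files.
BED="${BED:-study1.bed.gz}"
COV="${COV:-study1.cov.txt}"
VCF="${VCF:-study1.bcf}"
STUDY="${STUDY:-study1}"
WINDOW="${WINDOW:-1000000}"   # 1 Mbp cis window, as in the example above
DRY_RUN="${DRY_RUN:-1}"       # 1 = print commands only, 0 = execute them

# Print the command (dry run) or execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Run `apex cis` (sumstats) and then `apex store` (vcov) for one chromosome.
apex_one_chrom() {
  for mode in cis store; do
    run ./apex "$mode" --region "$1" --bed "$BED" --cov "$COV" \
      --vcf "$VCF" --out "${STUDY}_$1" --window "$WINDOW"
  done
}

# Autosomes only; adjust the range to match the contig names in your data.
i=1
while [ "$i" -le 22 ]; do
  apex_one_chrom "chr${i}"
  i=$((i + 1))
done
```

On a cluster, each call to apex_one_chrom could instead be submitted as a separate job, since the chromosomes are processed independently.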
Single-variant xQTL meta-analysis
Example command:
./apex meta --meta --ivw1 --sumstats {study1_chr1,study2_chr1,...} --out {output-prefix}
Software concordance. APEX single-variant meta-analysis is equivalent to the inverse-variance weighted meta-analysis as implemented in METAL and multiple R packages.
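For reference, the standard fixed-effects inverse-variance weighted estimator that this concordance statement refers to can be written as below. This is the textbook formula rather than a transcription of APEX's source code; the symbols $\hat\beta_k$ and $s_k$ (the slope and standard error for one variant-trait pair in study $k$ of $K$) are notation introduced here.

```latex
w_k = \frac{1}{s_k^2}, \qquad
\hat\beta_{\mathrm{IVW}} = \frac{\sum_{k=1}^{K} w_k \hat\beta_k}{\sum_{k=1}^{K} w_k}, \qquad
\mathrm{SE}\big(\hat\beta_{\mathrm{IVW}}\big) = \sqrt{\frac{1}{\sum_{k=1}^{K} w_k}}, \qquad
Z = \frac{\hat\beta_{\mathrm{IVW}}}{\mathrm{SE}\big(\hat\beta_{\mathrm{IVW}}\big)}
```

As a worked example with two studies, $(\hat\beta_1, s_1) = (0.2, 0.1)$ and $(\hat\beta_2, s_2) = (0.4, 0.2)$ give weights $w_1 = 100$ and $w_2 = 25$, so $\hat\beta_{\mathrm{IVW}} = (20 + 10)/125 = 0.24$, $\mathrm{SE} = \sqrt{1/125} \approx 0.089$, and $Z \approx 2.68$.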
Multiple-variant xQTL meta-analysis
Example command:
- Assume the conditional SNPs have the same effects across studies
– Assume the SNP of interest has the same effect across studies; the p-value threshold for the stepwise selection procedure is applied to ACAT p-values (p-values corrected for the number of tested SNPs)
./apex meta --sumstats {study1_chr1,study2_chr1,...} --stepwise --tests hom --pvalue 0.05 --backward --out {output-prefix}
– Assume the SNP of interest has different effects across studies (calculate p-values under both the het and alt assumptions)
./apex meta --sumstats {prefix1,prefix2} --stepwise --tests het,alt --pvalue 0.05 --backward --prefix {output-prefix}
- Assume the conditional SNPs have different effects across studies
– Assume the SNP of interest has the same effect across studies; the p-value threshold for the stepwise selection procedure is applied to marginal p-values (raw p-values, not corrected for the number of tested SNPs)
./apex meta --sumstats {study1_chr1,study2_chr1,...} --stepwise --tests hom --het --marginal --pvalue 2.5E-6 --backward --out {output-prefix}
– Assume the SNP of interest has different effects across studies; in the stepwise selection procedure, do not perform backward selection (do not drop SNPs that fail the p-value threshold in the joint model)
./apex meta --sumstats {study1_chr1,study2_chr1,...} --stepwise --tests het,alt --het --pvalue 0.05 --out {output-prefix}
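The --sumstats argument in the examples above takes a comma-separated list of per-study prefixes. A minimal sketch for assembling that list for one chromosome follows. The study names and the meta_chr1 output prefix are hypothetical, the braces in the examples are assumed to be placeholder notation rather than literal syntax, and the prefixes are assumed to match the --out values used when running apex cis and apex store.

```shell
#!/bin/sh
set -eu

# Build "<study>_<chrom>" for every study and join the results with commas.
# Usage: sumstats_list CHROM STUDY [STUDY ...]
sumstats_list() {
  chrom="$1"
  shift
  list=""
  for study in "$@"; do
    # ${list:+...} adds a leading comma only when the list is non-empty.
    list="${list:+${list},}${study}_${chrom}"
  done
  printf '%s\n' "$list"
}

# Hypothetical study names; prefixes must not contain commas.
LIST="$(sumstats_list chr1 study1 study2 study3)"

# Print the command for review instead of executing it.
echo ./apex meta --sumstats "$LIST" --stepwise --tests hom --pvalue 0.05 --backward --out meta_chr1
```

Printing the command first makes it easy to check the prefix list before launching a long-running analysis; remove the leading echo to run it.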
Software concordance. Regression slopes and standard errors from APEX multiple-variant meta-analysis (where all studies have unrelated samples) are equivalent to those from the R regression model lm(trait ~ genotypes + study*covariates, weights = 1/study_mse), where study_mse is the mean squared error from the null model (no genotypes) fit within each study.
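To make the weighted regression above concrete, the slope estimator it implies can be sketched as follows. This is a derivation from the stated R model under the assumption of genotype effects shared across studies, not a description of APEX internals. The notation is introduced here: for study $k$, $y_k$ is the trait vector, $G_k$ the genotype matrix of the variants in the joint model, $X_k$ the covariate matrix (including the intercept, assumed full column rank), and $\hat\sigma_k^2$ the value of study_mse.

```latex
P_k = I - X_k \left( X_k^{\top} X_k \right)^{-1} X_k^{\top},
\qquad
\hat\beta =
\left( \sum_{k=1}^{K} \frac{1}{\hat\sigma_k^{2}} \, G_k^{\top} P_k G_k \right)^{-1}
\left( \sum_{k=1}^{K} \frac{1}{\hat\sigma_k^{2}} \, G_k^{\top} P_k y_k \right)
```

Because the study*covariates term gives each study its own intercept and covariate effects, and the weights are constant within a study, adjusting for covariates reduces to the within-study projection $P_k$. The estimator then depends on the individual-level data only through $G_k^{\top} P_k G_k$ (a covariate-adjusted genotype covariance) and $G_k^{\top} P_k y_k$ (covariate-adjusted genotype-trait associations), quantities of the kind the vcov and summary statistic files are described as providing.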
Command line options
A partial list of options is given below. Please run ./apex meta --help to see a complete list of command line flags and options.
- Analysis options
--tests=[hom,het,alt]: Assumptions under which p-values for the SNPs of interest are estimated, given as a comma-separated list. P-values are estimated under every assumption specified; for example, hom,het reports p-values under both the hom and het assumptions.
--het: If specified, assume the conditional SNPs have heterogeneous effects across studies; otherwise, assume homogeneous effects.
--rsq: Maximum multiple R² threshold. Only SNPs with multiple R² below this threshold are considered, to avoid collinearity. Default: 0.7.
--marginal: If specified, use the raw p-values (unadjusted for the number of tested SNPs) in the stepwise selection procedure; otherwise, use the ACAT p-values (adjusted for the number of tested SNPs).
--pvalue: P-value threshold for the stepwise selection procedure.
--backward: If specified, perform both forward and backward selection in the stepwise selection procedure; otherwise, perform forward selection only, i.e. do not drop SNPs that fail the p-value threshold in the joint model of all selected SNPs.
- Output options
--out, -o: Output file prefix.
- Computational resources
--threads {N}: Number of threads to use (should not exceed the number of available cores).
- Filtering regions and variants
--region {chr:start-end}: Only analyze variants and traits within the specified region.
--gene {LIST}: Only analyze the specified comma-delimited molecular trait IDs.