GWAS analysis of regulatory or functional information enrichment with LD correction
GARFIELD is a functional enrichment analysis approach described in the paper GARFIELD: GWAS analysis of regulatory or functional information enrichment with LD correction. Briefly, it combines GWAS findings with regulatory or functional annotations (primarily from ENCODE and Roadmap Epigenomics data) to find features relevant to a phenotype of interest. It performs greedy pruning of GWAS SNPs (LD r² > 0.1) and then annotates them based on functional information overlap. Next, it quantifies enrichment using odds ratios (OR) at various GWAS p-value cutoffs and assesses their significance with generalized linear model testing, while accounting for minor allele frequency, distance to the nearest transcription start site and number of LD proxies (r² > 0.8). Within this framework, GARFIELD accounts for major sources of confounding that current methods do not address.
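The pruning and odds-ratio steps described above can be illustrated with a minimal Python sketch. This is not GARFIELD's own code (the tool itself is written in C++ and R), and it leaves out the generalized linear model and its covariates entirely. The function names `greedy_prune` and `odds_ratio`, the toy SNP identifiers, the pairwise r² dictionary and the rule that a SNP counts as annotated only if it is itself in the annotated set are all assumptions made for illustration.

```python
import math


def greedy_prune(pvalues, ld_r2, r2_threshold=0.1):
    """Greedy LD pruning (illustrative): walk SNPs from most to least
    significant, keep each SNP that has not been removed yet, and remove
    every SNP in LD with it above r2_threshold.

    pvalues: dict mapping SNP id -> GWAS p-value
    ld_r2:   dict mapping frozenset({snp_a, snp_b}) -> pairwise r2
             (pairs missing from the dict are treated as r2 = 0)
    """
    kept = []
    removed = set()
    for snp in sorted(pvalues, key=pvalues.get):
        if snp in removed:
            continue
        kept.append(snp)
        for other in pvalues:
            if other == snp or other in removed:
                continue
            if ld_r2.get(frozenset((snp, other)), 0.0) > r2_threshold:
                removed.add(other)
    return kept


def odds_ratio(pruned, pvalues, annotated, threshold):
    """Unadjusted odds ratio from the 2x2 table of
    (p-value below threshold) x (SNP overlaps the annotation).
    Returns NaN when the ratio is undefined (a zero in the denominator)."""
    a = b = c = d = 0
    for snp in pruned:
        significant = pvalues[snp] < threshold
        overlaps = snp in annotated
        if significant and overlaps:
            a += 1
        elif significant:
            b += 1
        elif overlaps:
            c += 1
        else:
            d += 1
    if b * c == 0:
        return math.nan
    return (a * d) / (b * c)


# Toy data, invented for this example only.
example_pvalues = {
    "rs1": 1e-9, "rs2": 1e-6, "rs3": 0.2, "rs4": 1e-8,
    "rs5": 0.5, "rs6": 0.7, "rs7": 0.03,
}
example_ld = {
    frozenset(("rs1", "rs2")): 0.9,
    frozenset(("rs3", "rs4")): 0.05,
    frozenset(("rs5", "rs6")): 0.3,
}
example_annotated = {"rs1", "rs3"}

if __name__ == "__main__":
    pruned = greedy_prune(example_pvalues, example_ld)
    print(pruned)
    print(odds_ratio(pruned, example_pvalues, example_annotated, 1e-5))
```

In this sketch the odds ratio is a raw 2x2 estimate at a single p-value cutoff; repeating the call over several thresholds mirrors the "various GWAS p-value cutoffs" mentioned above, while the significance testing and confounder adjustment are handled by the tool's model and are not reproduced here.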
We have implemented GARFIELD as a standalone tool, using C++ for data pre-processing and R for enrichment estimation, significance testing and visualisation. It provides a way of assessing the enrichment of association analysis signals in 1005 features extracted from the ENCODE, GENCODE and Roadmap Epigenomics projects, including genic annotations, chromatin states, histone modifications, DNaseI hypersensitive sites and transcription factor binding sites, among others, in a number of publicly available cell lines.
GARFIELD package
Source code for your own compilation can be downloaded, together with all the data files needed to use it, as compressed tarballs:
- garfield-v2.tar.gz (19Kb compressed)
- garfield-data.tar.gz (5.9Gb compressed, 83Gb uncompressed)
The software is distributed under the GPL v2 license.
Documentation
Full documentation, which is also supplied with the package and describes both the data required to run GARFIELD and the options and usage of the tool itself, can be obtained from here
Further information
All LD and allele frequency data have been calculated from the UK10K data, so the data we provide are suitable for the analysis of GWAS studies in European population cohorts. For other populations, these data need to be recalculated before using the method. Additionally, we have provided information on all UK10K variants, so variants absent from that genotype set will not have any information.
How to cite
Valentina Iotchkova, Graham Ritchie, Matthias Geihs, Josine Min, Klaudia Walter, Ian Dunham, Ewan Birney and Nicole Soranzo. GARFIELD - GWAS Analysis of Regulatory or Functional Information Enrichment with LD correction. In preparation
Bug fixes
16 March 2018: bug fixed in the garfield_annotate_uk10k.sh script from the garfield-v2 package.
Contact
If you have any problems with, or questions about, this resource, please contact Valentina Iotchkova.
Manuscript Custom Code
Custom manuscript code can be downloaded from: manuscript_custom_code.tar.gz
Previous versions
Source code and documentation for GARFIELD version 1 (which uses a permutation testing approach) can be found below
- garfield.tar.gz (19Kb compressed)
- garfield-data.tar.gz (5.9Gb compressed, 83Gb uncompressed)