Potential functional motifs per tissue type
Annotated variants using funMotifsDB
All scored motifs per tissue type
Data files used to build funMotifsDB
The input should be a table of genomic coordinates with at least three columns: chromosome number, start and end positions. The coordinates should be BED 0-based.
For variants, the input should have at least five columns: the first three columns as above, a 4th column for the reference allele, and a 5th column for the alternative allele. Any additional columns in the user's input will be ignored.
The funMotifs database uses the hg19 version of the human reference genome.
The columns can be separated by tab, comma, space, or semicolon.
Check the Header checkbox if the input contains a header line.
*By default, input regions or variants that do not overlap any TF motif are not reported in the output file. Uncheck the box 'Exclude input regions with no matches?' to retrieve all entries in the input list regardless of their overlap with TF motifs.
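The input rules above can be illustrated with a small parser. This is only a sketch of the described format, not the code the funMotifs server runs: the names parse_line and parse_input are hypothetical, and treating a run of consecutive separators as a single delimiter is an assumption made here for simplicity.

```python
import re

# Separators listed above: tab, comma, space, or semicolon.
# Assumption: a run of consecutive separators counts as one delimiter.
_SEP = re.compile(r"[\t,; ]+")


def parse_line(line, is_variant=False):
    """Parse one input row into a dict; columns beyond the required ones are ignored."""
    fields = [f for f in _SEP.split(line.strip()) if f]
    needed = 5 if is_variant else 3
    if len(fields) < needed:
        raise ValueError(f"expected at least {needed} columns, got {len(fields)}")
    record = {"chrom": fields[0], "start": int(fields[1]), "end": int(fields[2])}
    if is_variant:
        # 4th column: reference allele, 5th column: alternative allele.
        record["ref"], record["alt"] = fields[3], fields[4]
    return record


def parse_input(text, has_header=False, is_variant=False):
    """Parse a whole input table, skipping blank lines and an optional header line."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if has_header:
        lines = lines[1:]
    return [parse_line(ln, is_variant) for ln in lines]
```

For example, parse_input("1\t100\t120\n2;200;230;extra") yields two region records and drops the extra column, while parse_line("1,100,101,A,G", is_variant=True) also carries the ref and alt alleles.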
We provide extracted files from the database in the download section. For batch analysis, we recommend downloading the funMotifs file for the desired tissue type and using IntersectBed to identify the overlapping motifs.
To search all motifs regardless of their functionality, you can download the all_tissues archive file from the download section. The file contains all motifs (~85 million) and the fscore for each tissue type.
Be aware that chromosomes X, Y, and M are represented as 23, 24, and 25, respectively.
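When preparing your own regions for batch analysis against the downloaded files, the chromosome labels need to follow the numeric convention above. The following is a minimal sketch under stated assumptions: the helper names to_funmotifs_chrom and overlaps are hypothetical, accepting "MT" as a synonym for M is an assumption, and whether the downloaded files carry a "chr" prefix is not stated here, so check the files and adjust the output accordingly. The overlap test assumes the standard BED convention of 0-based, half-open intervals.

```python
# Numeric codes for the sex and mitochondrial chromosomes, as described above.
# Assumption: "MT" is treated the same as "M".
CHROM_TO_NUMBER = {"X": "23", "Y": "24", "M": "25", "MT": "25"}


def to_funmotifs_chrom(name):
    """Convert labels such as 'chrX' or 'Y' to the numeric form ('23', '24')."""
    label = name.strip()
    if label.lower().startswith("chr"):
        label = label[3:]
    # Autosome labels ('1'..'22') pass through unchanged.
    return CHROM_TO_NUMBER.get(label.upper(), label)


def overlaps(a_start, a_end, b_start, b_end):
    """True if two 0-based, half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end
```

With these helpers, to_funmotifs_chrom("chrX") gives "23", and a region 100-120 overlaps a motif at 119-130 but not one starting at 120, because the end coordinate is exclusive.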
About
The funMotifs database contains annotated transcription factor motifs. Its aim is to aid researchers in interpreting the noncoding genome by enabling analysis of noncoding variants and regions. funMotifs is built from assays and datasets from ENCODE, FANTOM, RoadMap Epigenomics, GTEx, and other published works. For further details about the content and the methods used to build funMotifs, please refer to the manual page.
PostgreSQL 10.0 is used to store the annotations and the motifs. The motifs are indexed based on genomic ranges to enable highly efficient query retrieval.
Availability
The motif annotations per tissue type can be downloaded for batch analysis. We also provide the pipeline source code that is used to build the funMotifs database. The funMotifs pipeline, which is freely available on GitHub, allows the database to be regenerated using additional datasets:
(funMotifs source code)
funMotifs can also be accessed via the ENSEMBL Variant Effect Predictor (VEP) tool.
Citation
Use the following to cite funMotifs:
Umer et al. 2018. funMotifs: Tissue-specific transcription factor motifs. bioRxiv doi:10.1101/683722
Contact:
For questions and issues regarding funMotifs, please contact funMotifs Support. For issues related to the pipelines, please post a new issue on the GitHub repository.