funMotifs DB

Enter or upload genomic regions

A sample from the results

Download the results for the full set by clicking on the 'Download Output File' button on the left (note that not all rows and columns are shown here)

Download extracted files from funMotifsDB

Potential functional motifs per tissue type

Annotated variants using funMotifsDB

Putative regulatory variants from the 1000 Genomes project: candidate_variants_1000G.tar.gz.
Putative regulatory variants from the GTEx project (eQTLs): GTEx_eQTLs_candidates.tar.gz.
Putative regulatory variants from GWAS: GWAS_SNPs_LD_candiadtes.tar.gz.

All scored motifs per tissue type

All motifs: fscore per tissue type (.tbi)

Data files used to build funMotifsDB

datafiles.tar.gz: information about the contents of this archive is listed here.

User Guide

The current version of the funMotifs database (v1.0) contains annotations for predicted motifs of 510 Transcription Factors (TFs) from the JASPAR2016 vertebrates CORE database. Click here to see the entire TF list.

Annotations and assays from the following resources were used to annotate the motifs:

ChIP-seq datasets from ENCODE.
DNase1-seq datasets from the ENCODE and RoadMap Epigenomics projects.
CAGE peaks in promoters and enhancers from the FANTOM5 project.
Chromatin states from the RoadMap Epigenomics 15-state core marks model.
Replication domains from Liu F. et al.
Hi-C contacting domains from Rao, S.S., et al.
Gene expression data from GTEx and ENCODE.
Regulatory elements from MPRAs: Ernst, J., et al.; Tewhey, R., et al.; and Vockley, C.M., et al.

Further details on the preprocessing of each dataset are listed in the corresponding files in the project repository on GitHub.

What input format is accepted?

The input should be a table of genomic coordinates with at least three columns: chromosome number, start position, and end position. The coordinates should be 0-based, as in the BED format.
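If your coordinates are 1-based and inclusive (as in VCF or GFF files), they need to be shifted before upload. The sketch below assumes the standard BED convention of a 0-based start and an exclusive end; the function name `one_based_to_bed` is hypothetical and not part of funMotifs.

```python
def one_based_to_bed(start, end):
    """Convert a 1-based inclusive interval to a 0-based, end-exclusive one."""
    if start < 1 or end < start:
        raise ValueError(f"invalid 1-based interval: {start}-{end}")
    # Only the start moves; the inclusive 1-based end equals the exclusive 0-based end.
    return start - 1, end


# A single-base variant at 1-based position 1000 covers [999, 1000) in BED.
print(one_based_to_bed(1000, 1000))  # (999, 1000)
```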

For variants, the input should have at least five columns: the first three as above, a 4th column for the reference allele, and a 5th column for the alternative allele. Any additional columns are ignored.

The funMotifs database uses the hg19 version of the human reference genome.

The columns can be separated by tab, comma, space, or semi-colon.

Check the Header checkbox if the input contains a header line.
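To check a file against these rules before uploading it, something like the following may help. It is a minimal sketch of the format described above, not the parser used by the funMotifs server; `parse_input` and its parameters are hypothetical names, and collapsing runs of separators is an assumption made for convenience.

```python
import re

# Separators named in the guide: tab, comma, space, semicolon.
# Assumption: runs of separators are collapsed, so "1, 100, 200" gives three fields.
SEPARATORS = re.compile(r"[\t,; ]+")


def parse_input(text, has_header=False, variants=False):
    """Parse region (3+ columns) or variant (5+ columns) lines into tuples."""
    min_cols = 5 if variants else 3
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if has_header:
        lines = lines[1:]  # mirrors ticking the Header checkbox
    records = []
    for line in lines:
        fields = SEPARATORS.split(line)
        if len(fields) < min_cols:
            raise ValueError(f"expected at least {min_cols} columns: {line!r}")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if start < 0 or end < start:
            raise ValueError(f"invalid 0-based interval: {line!r}")
        record = (chrom, start, end)
        if variants:
            record += (fields[3], fields[4])  # reference and alternative allele
        records.append(record)  # any further columns are dropped
    return records


regions = parse_input("chrom\tstart\tend\n1\t100\t200\tnote\n2,300,400", has_header=True)
print(regions)  # [('1', 100, 200), ('2', 300, 400)]
```

Passing `variants=True` applies the five-column rule and keeps the two allele columns.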

What do the output columns represent?

*By default, input regions or variants that do not overlap any TF motif are not reported in the output file. Uncheck the box 'Exclude input regions with no matches?' to retrieve all entries in the input list regardless of their overlap with TF motifs.

Batch Analysis

We provide extracted files from the database in the download section. For batch analysis, we recommend downloading the funMotifs file for the desired tissue type and using intersectBed (BEDTools) to identify the overlapping motifs.

To search all motifs regardless of their functionality, download the all_tissues archive file from the download section. The file contains all motifs (~85 million) and the fscore for each tissue type.

Be aware that chromosomes X, Y, and M are represented as 23, 24, and 25, respectively.
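Because of this numeric encoding, region files that use names such as chrX will not match the downloaded motif files until the chromosome names are converted. The sketch below shows one way to do the conversion and a simple overlap check on 0-based, end-exclusive intervals. All function names are hypothetical, the motif records are made up for illustration, and stripping a "chr" prefix and treating "MT" like "M" are assumptions about your own input, not documented funMotifs behaviour.

```python
# Encoding stated in the guide: X -> 23, Y -> 24, M -> 25.
# Assumption: "MT" is treated as a synonym for "M".
CHROM_CODES = {"X": "23", "Y": "24", "M": "25", "MT": "25"}


def encode_chrom(name):
    """Map a chromosome name such as 'chrX' or 'x' to the numeric code."""
    name = name.strip()
    if name.lower().startswith("chr"):
        name = name[3:]
    return CHROM_CODES.get(name.upper(), name)


def overlaps(a_start, a_end, b_start, b_end):
    """True if two 0-based, end-exclusive intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def find_overlaps(regions, motifs):
    """Pair each region with the names of the motifs it overlaps (quadratic scan)."""
    hits = []
    for r_chrom, r_start, r_end in regions:
        for m_chrom, m_start, m_end, m_name in motifs:
            same_chrom = encode_chrom(r_chrom) == encode_chrom(m_chrom)
            if same_chrom and overlaps(r_start, r_end, m_start, m_end):
                hits.append(((r_chrom, r_start, r_end), m_name))
    return hits


# Made-up records for illustration only.
regions = [("chrX", 100, 200)]
motifs = [("23", 150, 160, "motif_A"), ("23", 200, 210, "motif_B"), ("1", 150, 160, "motif_C")]
print(find_overlaps(regions, motifs))  # [(('chrX', 100, 200), 'motif_A')]
```

The nested loop is only suitable for small inputs. For the full motif files, converting the chromosome names first and then running intersectBed, as recommended above, is the practical route.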

About
The funMotifs database contains annotated transcription factor motifs. Its aim is to help researchers interpret the noncoding genome by enabling analysis of noncoding variants and regions. funMotifs is built from assays and datasets from ENCODE, FANTOM, RoadMap Epigenomics, GTEx, and other published works. For further details about the content and the methods used to build funMotifs, please refer to the manual page.
PostgreSQL 10.0 is used to store the annotations and the motifs. The motifs are indexed based on genomic ranges to enable highly efficient query retrieval.

Availability
The motif annotations per tissue type can be downloaded for batch analysis. We also provide the source code of the pipeline used to build the funMotifs database. The pipeline is freely available on GitHub and allows the database to be regenerated with additional datasets: (funMotifs source code)

funMotifs can also be accessed via the Ensembl Variant Effect Predictor (VEP) tool.

Citation
Use the following to cite funMotifs: Umer et al. 2018. funMotifs: Tissue-specific transcription factor motifs. bioRxiv doi:10.1101/683722

Contact:
For questions and issues regarding funMotifs, please contact funMotifs Support. For issues related to the pipelines, please post a new issue on the GitHub repository.