Systematic identification and annotation of multiple-variant compound effects at transcription factor binding sites in human genome

Cheng, Si-Jin, Jiang, Shuai, Shi, Fang-Yuan, Ding, Yang, Gao, Ge
Journal of genetics and genomics 2018 v.45 no.7 pp. 373-379
binding sites, genetic variation, genome, genomics, humans, regulatory sequences, transcription factors
Understanding the functional effects of genetic variants is crucial in modern genomics and genetics. Transcription factor binding sites (TFBSs) are among the most important cis-regulatory elements. While multiple tools have been developed to assess the functional effects of genetic variants at TFBSs, they usually assume that each variant acts in isolation and neglect the potential “interference” among multiple variants within the same TFBS. In this study, we present COPE-TFBS (Context-Oriented Predictor for variant Effect on Transcription Factor Binding Site), a novel method that considers sequence context to accurately predict variant effects on TFBSs. We systematically re-analyzed the sequencing data from both the 1000 Genomes Project and the Genotype-Tissue Expression (GTEx) Project with COPE-TFBS, and identified a number of novel TFBSs, transformed TFBSs and discordantly annotated TFBSs resulting from multiple variants, further highlighting the necessity of sequence context in accurately annotating genetic variants. COPE-TFBS is freely available for academic use at
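
The difference between scoring variants in isolation and scoring them together in their sequence context can be illustrated with a minimal sketch. This is not the COPE-TFBS algorithm, which the abstract does not describe in detail. It assumes a toy 4-bp log-odds position weight matrix with made-up values, a made-up binding threshold, and hypothetical helper names (`score`, `apply_variants`, `is_bound`). Each of the two substitutions alone leaves the site above the threshold, but the haplotype carrying both falls below it, so a per-variant annotation would miss the loss of the site.

```python
# Toy log-odds position weight matrix for a hypothetical 4-bp motif.
# All values are invented for illustration; they do not come from COPE-TFBS.
PWM = [
    {"A": 2.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 2.0, "G": 0.5, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 2.0, "T": 1.0},
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 2.0},
]
THRESHOLD = 6.0  # assumed cutoff above which the site is called "bound"


def score(seq):
    """Sum the per-position PWM weights for a sequence of motif length."""
    return sum(column[base] for column, base in zip(PWM, seq))


def apply_variants(ref, variants):
    """Return ref with each (0-based position, alt base) substitution applied."""
    seq = list(ref)
    for pos, alt in variants:
        seq[pos] = alt
    return "".join(seq)


def is_bound(seq):
    """Call a site bound when its PWM score reaches the threshold."""
    return score(seq) >= THRESHOLD


REF = "ACGT"
VARIANT_1 = (1, "G")  # C>G at motif position 1
VARIANT_2 = (2, "T")  # G>T at motif position 2

# Isolated view: each variant is applied to the reference on its own.
alone_1 = apply_variants(REF, [VARIANT_1])
alone_2 = apply_variants(REF, [VARIANT_2])

# Context-aware view: both variants are applied to the same haplotype.
together = apply_variants(REF, [VARIANT_1, VARIANT_2])

for label, seq in [("reference", REF), ("variant 1 only", alone_1),
                   ("variant 2 only", alone_2), ("both variants", together)]:
    print(label, seq, score(seq), is_bound(seq))
```

In this simple additive model the compound effect comes from the threshold: the two score decreases add up, and only their sum crosses the cutoff. Real multi-variant cases may also involve insertions or deletions that shift the motif, which this sketch does not attempt to model.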