RCDA: A highly sensitive and specific alternatively spliced transcript assembly tool featuring upstream consecutive exon structures

Sturgeon, Xiaolu H.; Gardiner, Katheleen J.
Genomics 2012 v.100 no.6 pp. 357-362
alternative splicing, chromosomes, data collection, exons, genes, humans, messenger RNA
When applied to complex transcript datasets, current tools for automated assembly of mRNA sequences require long run times and produce exponentially increasing numbers of splice variants. Here, we describe RCDA, a genome-based transcript assembly tool comprising RCluster, which recursively clusters transcripts, and DAssemble, which generates composite transcript sequences through path-finding using a directed acyclic graph. Each exon included in a final transcript is associated with an array of all upstream consecutive exon structures obtained from the original transcripts. When a depth-first-search path reaches an exon, the path is retained only if it contains a structure from that exon's array. RCDA assemblies, therefore, include only those transcripts with experimentally supported exon patterns. When applied to >23,000 transcripts from human chromosome 21, using biologically reasonable filters, RCDA execution time was approximately 4 h. RCDA outperformed ECgene in reconstructing RefSeq transcripts and in limiting the total number of transcripts and transcripts per gene.
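
The path-filtering idea described in the abstract can be illustrated with a minimal Python sketch. This is not the RCDA or DAssemble implementation. It assumes that each transcript is an ordered list of exon labels, that an "upstream consecutive exon structure" is the run of exons from the start of an original transcript up to and including the exon in question, and that a path "contains" a structure when the path ends with it. The function names `build_graph`, `supported`, and `assemble` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_graph(transcripts):
    """Build DAG edges and per-exon arrays of upstream structures.

    Assumption: a structure is the exon run from the start of an
    original transcript up to and including the current exon.
    """
    edges = defaultdict(set)     # exon -> exons that directly follow it
    upstream = defaultdict(set)  # exon -> set of upstream structures
    for exons in transcripts:
        for i, exon in enumerate(exons):
            upstream[exon].add(tuple(exons[:i + 1]))
            if i + 1 < len(exons):
                edges[exon].add(exons[i + 1])
    return edges, upstream


def supported(path, structures):
    """True if the path ends with at least one recorded structure."""
    return any(path[-len(s):] == s for s in structures)


def assemble(transcripts):
    """Enumerate depth-first paths, keeping only supported ones."""
    edges, upstream = build_graph(transcripts)
    has_incoming = {e for targets in edges.values() for e in targets}
    starts = sorted(e for e in upstream if e not in has_incoming)
    results = []

    def dfs(path):
        # Drop the path as soon as it reaches an exon whose array
        # holds no structure matching the end of the path.
        if not supported(path, upstream[path[-1]]):
            return False
        extended = False
        for nxt in sorted(edges.get(path[-1], ())):
            if dfs(path + (nxt,)):
                extended = True
        # Emit the path when no supported extension exists.
        if not extended:
            results.append(path)
        return True

    for start in starts:
        dfs((start,))
    return results


# Two overlapping fragments are merged into one composite transcript.
print(assemble([["A", "B", "C"], ["B", "C", "D"]]))

# The unsupported combination A-B-D-E is never emitted.
print(assemble([["A", "B", "D"], ["A", "C", "D", "E"]]))
```

In the second call, an unfiltered enumeration of the graph would also produce A-B-D-E, an exon pattern that no input transcript supports. The filter is what keeps the number of assembled variants from growing with every alternative branch.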