ButterflyBase: a platform for lepidopteran genomics

Papanicolaou, Alexie, Gebauer-Jung, Steffi, Blaxter, Mark L., McMillan, W. Owen, Jiggins, Chris D.
Nucleic acids research 2008 v.36 no.suppl_1 pp. D582
Lepidoptera, data collection, expressed sequence tags, gene ontology, genomics, insects, microarray technology, nucleic acids, pests, phylogeny, prediction, proteins, proteomics, translation (genetics), unigenes
With over 100 000 species and a large community of evolutionary biologists, population ecologists, pest biologists and genome researchers, the Lepidoptera are an important insect group. Genomic resources [expressed sequence tags (ESTs), genome sequence, genetic and physical maps, proteomic and microarray datasets] are growing, but until now there has been no single access and analysis portal for this group. Here we present ButterflyBase, a unified resource for lepidopteran genomics. A total of 273 077 ESTs from more than 30 different species have been clustered to generate stable unigene sets, and robust protein translations have been derived from each unigene cluster. Clusters and their protein translations are annotated with BLAST-based similarity, gene ontology (GO), enzyme classification (EC) and Kyoto encyclopaedia of genes and genomes (KEGG) terms, and are also searchable using similarity tools such as BLAST and MS-BLAST. The database supports many needs of the lepidopteran research community, including molecular marker development, orthologue prediction for deep phylogenetics, and detection of rapidly evolving proteins likely to be involved in host-pathogen interactions or other evolutionary processes. ButterflyBase is expanding to include additional genomic sequence, ecological and mapping data for key species.