Main content area

AnaCoDa: analyzing codon data with Bayesian mixture models

Landerer, Cedric, Cope, Alexander, Zaretzki, Russell, Gilchrist, Michael A
Bioinformatics 2018 v.34 no.14 pp. 2496-2498
Bayesian theory, algorithms, bioinformatics, computer software, data collection, gene expression, genes, genomics, models, ribosomes, translation (genetics)
AnaCoDa is an R package for estimating biologically relevant parameters of mixture models, such as selection against translation inefficiency, non-sense errors and ribosome pausing time, from genomic and high throughput datasets. AnaCoDa provides an adaptive Bayesian MCMC algorithm, fully implemented in C++ for high performance with an ergonomic R interface to improve usability. AnaCoDa employs a generic object-oriented design to allow users to extend the framework and implement their own models. Current models implemented in AnaCoDa can accurately estimate biologically relevant parameters given either protein coding sequences or ribosome foot-printing data. Optionally, AnaCoDa can utilize additional data sources, such as gene expression measurements, to aid model fitting and parameter estimation. By utilizing a hierarchical object structure, some parameters can vary between sets of genes while others can be shared. Genes may be assigned to clusters or membership may be estimated by AnaCoDa. This flexibility allows users to estimate the same model parameter under different biological conditions and categorize genes into different sets based on shared model properties embedded within the data. AnaCoDa also allows users to generate simulated data which can be used to aid model development and model analysis as well as evaluate model adequacy. Finally, AnaCoDa contains a set of visualization routines and the ability to revisit or re-initiate previous model fitting, providing researchers with a well rounded easy to use framework to analyze genome scale data. AnaCoDa is freely available under the Mozilla Public License 2.0 on CRAN (