Evolutionary design of multiple genes encoding the same protein

Terai, Goro, Kamegai, Satoshi, Taneda, Akito, Asai, Kiyoshi
Bioinformatics 2017 v.33 no.11 pp. 1613-1620
Internet, algorithms, bioinformatics, genes, homologous recombination, nucleotide sequences, synthetic biology
Motivation: Enhancing expression levels of a target protein is an important goal in synthetic biology. A widely used strategy is to integrate multiple copies of genes encoding a target protein into a host organism genome. Integrating highly similar sequences, however, can induce homologous recombination between them, ultimately reducing the number of integrated genes.

Results: We propose a method for designing multiple protein-coding sequences (CDSs) that encode the same protein but are unlikely to induce homologous recombination. The method, which is based on a multi-objective genetic algorithm, is intended to design a set of CDSs whose nucleotide sequences are as different as possible and whose codon usage frequencies are as highly adapted as possible to the host organism. We show that our method not only successfully designs a set of intended CDSs, but also provides insight into the trade-off between nucleotide differences among gene copies and codon usage frequencies.

Availability and Implementation: Our method, named Tandem Designer, is available as a web-based application.

Supplementary information: Supplementary data are available at Bioinformatics online.
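
The two design objectives named in the Results (nucleotide difference among gene copies and codon adaptation to the host) can be illustrated with a minimal Python sketch. This is not the Tandem Designer implementation: the abstract does not define its scoring functions, so the sketch assumes Hamming distance as the difference measure and a geometric mean of per-codon weights as the adaptation measure. The weights in `CODON_WEIGHTS` and all function names are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import exp, log

# Hypothetical relative codon weights (illustrative values, not real host data).
CODON_WEIGHTS = {
    "GCT": 1.0, "GCC": 0.6,  # alanine
    "AAA": 1.0, "AAG": 0.4,  # lysine
}


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_difference(cds_set):
    """Assumed difference objective: smallest Hamming distance over all pairs."""
    return min(hamming(a, b) for a, b in combinations(cds_set, 2))


def codon_adaptation(cds):
    """Assumed adaptation objective: geometric mean of the codon weights."""
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return exp(sum(log(CODON_WEIGHTS[c]) for c in codons) / len(codons))


def objectives(cds_set):
    """Both objectives for a set of CDSs; higher is better for each."""
    return (
        min_pairwise_difference(cds_set),
        min(codon_adaptation(cds) for cds in cds_set),
    )


def dominates(p, q):
    """True if p is at least as good as q on every objective and better on one."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))


# Two candidate designs, each a pair of CDSs encoding the dipeptide Ala-Lys.
design_a = ["GCTAAA", "GCCAAG"]  # more divergent, uses lower-weight codons
design_b = ["GCTAAA", "GCTAAG"]  # less divergent, better adapted
score_a = objectives(design_a)
score_b = objectives(design_b)
print(score_a, score_b)
print(dominates(score_a, score_b), dominates(score_b, score_a))
```

Under these assumed scores neither design dominates the other, which is the kind of trade-off between nucleotide differences and codon usage that the abstract describes. A multi-objective genetic algorithm would keep such mutually non-dominated designs rather than collapsing the two objectives into a single number.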