Background
Synthetic biology is a branch of science in which genetic elements are engineered to create biologically based solutions or products. Like natural genetic elements, synthetic ones can acquire mutations, which can alter the final product’s function. Because many genes overlap and reading frames can shift, a mutation that seems benign in one frame can have dire consequences in another, in humans and microbes alike. Overlapping genes also tend to coincide with higher mutation rates. Some method is therefore needed to keep the synthetic oligos from compounding mutation rates or, if the synthetic elements are incorporated as recombinant DNA, to confine them to the intended host. Blazejewski et al. devised a computational platform, Constraining Adaptive Mutations using Engineered Overlapping Sequences (CAMEOS), that both confines synthetic biology constructs and safeguards them against mutations.
CAMEOS considers point mutations and indels that impact protein coding, as well as mutations that disrupt long-range interactions. It focuses on the overlapping nature of genes and computationally determines whether a mutation can be tolerated by each of the proteins encoded by the same sequence. This matters because host cells that develop a mutation in the synthetic gene construct can outgrow and overtake the population, leaving a final product that is altered or low in yield. Although gene flow can increase genetic variation, if the synthetic gene overlaps with an essential gene, a mutation that also disrupts the essential gene would die out instead of taking over the population.
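The frame dependence described above can be illustrated with a minimal Python sketch. It is not CAMEOS and not the authors' code: the function name `classify_substitution`, the nine-base toy sequence, and the choice of frames 0 and +1 are assumptions made only for illustration. The sketch uses the standard genetic code to classify a single substitution separately in each reading frame.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_substitution(seq, pos, new_base, frame):
    """Classify a single-base substitution as seen from one reading frame.

    seq      : DNA string (coding strand, uppercase A/C/G/T)
    pos      : 0-based index of the substituted base
    new_base : replacement base
    frame    : offset (0, 1 or 2) at which this frame's first codon starts
    """
    if pos < frame:
        raise ValueError("position lies before the first codon of this frame")
    start = frame + ((pos - frame) // 3) * 3  # first base of the affected codon
    codon = seq[start:start + 3]
    if len(codon) < 3:
        raise ValueError("position falls in an incomplete trailing codon")
    mutated = seq[:pos] + new_base + seq[pos + 1:]
    before = CODON_TABLE[codon]
    after = CODON_TABLE[mutated[start:start + 3]]
    if before == after:
        return "synonymous"
    return "nonsense" if after == "*" else "missense"


# Hypothetical toy overlap: frame 0 reads GCT GGA ACC, frame +1 reads CTG GAA.
toy = "GCTGGAACC"
for frame in (0, 1):
    # T->C at index 2: GCT->GCC (Ala->Ala) in frame 0, CTG->CCG (Leu->Pro) in frame +1
    print(frame, classify_substitution(toy, 2, "C", frame))
```

In this toy case the same base change is silent for one encoded protein and changes an amino acid in the other, which is the kind of joint constraint that an overlapping design relies on.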
Methods
Blazejewski et al. tested their algorithm on essential genes and other genes of interest. They chose genes that regulate antibiotic resistance so they could test each gene’s function by measuring its effect on E. coli growth. They generated plasmids carrying synthesized double-encoded sequences using gBlocks™ Gene Fragments. They first verified whether CAMEOS could design functional proteins using synthetic biology without overlapping the host genome. Next, they co-encoded essential genes with biosynthetic genes. To control for plasmid effects, they also inserted genes directly into the E. coli genome.
One theory based on this research is that overlapping genes protect against the accumulation of mutations in the population. If a mutation is not tolerated when incorporated into multiple encoded proteins, it will result in gene deletion. To test this theory, they overlapped or “entangled” genes, saturated them with mutations, and evaluated gene fitness. They also developed a high-throughput method to functionally test and validate the in silico predictions. The researchers pooled the synthetic oligos before transforming them. Viable E. coli cells carrying transformed plasmids were then selected using MOPS dropout media.
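As a rough in silico analogue of the saturation idea above, the following Python sketch enumerates every single-base substitution in a toy overlap and counts how many are silent in one frame versus silent in both. It is an illustration under stated assumptions, not the authors' assay: silence at the amino acid level is only a crude stand-in for measured fitness, and the names `translate` and `saturation_scan` and the toy sequence are hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, frame):
    """Translate the complete codons of seq, starting at offset frame."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(frame, len(seq) - 2, 3))


def saturation_scan(seq, frame_a=0, frame_b=1):
    """Count single-base substitutions that are silent in frame_a and in both frames.

    Only positions covered by complete codons in BOTH frames are scanned.
    """
    def covered_end(frame):
        return frame + 3 * ((len(seq) - frame) // 3)

    first = max(frame_a, frame_b)
    last = min(covered_end(frame_a), covered_end(frame_b))
    ref_a, ref_b = translate(seq, frame_a), translate(seq, frame_b)
    counts = {"total": 0, "silent_in_a": 0, "silent_in_both": 0}
    for pos, base in product(range(first, last), BASES):
        if base == seq[pos]:
            continue  # not a substitution
        mutant = seq[:pos] + base + seq[pos + 1:]
        silent_a = translate(mutant, frame_a) == ref_a
        silent_b = translate(mutant, frame_b) == ref_b
        counts["total"] += 1
        counts["silent_in_a"] += silent_a
        counts["silent_in_both"] += silent_a and silent_b
    return counts


# Hypothetical toy overlap: frame 0 reads GCT GGA ACC, frame +1 reads CTG GAA.
result = saturation_scan("GCTGGAACC")
print(result)
```

For this toy sequence, 6 of the 18 scanned substitutions are silent in frame 0 and none are silent in both frames, which shows how a second overlapping reading frame shrinks the set of mutations that can pass unnoticed.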
Lastly, they tested the hypothesis that overlapping or entangling the sequences would generate biocontainment barriers that prevent horizontal gene transfer (HGT). They tested this by embedding a toxin within the gene of interest and transforming cells with a plasmid containing an antitoxin. HGT recipients lacking the antitoxin would die.
Results
CAMEOS successfully designed several functional genes. When genes were overlapped, Blazejewski et al. found that mutations at the beginning of the overlapping sequence were more likely to be deleterious. Genes that overlapped with essential genes were more likely to be constrained. Their high-throughput method yielded six functional, unique sequences between the two genes tested, suggesting that synthesizing functional overlaps could be a viable option for many gene pairs. Lastly, their HGT test results showed that overlapping biosynthetic genes with a toxin limited HGT. Constructs that did undergo HGT carried nonfunctional mutants of the gene of interest.
This platform marks another achievement toward improving gene synthesis, and it builds on previous computational protein design work. Together with other recent advances in throughput [1], Markov random field optimization [2], and protein structure integration [3], it can further the goal of developing next-generation synthetic biology products that function as predicted under stringently defined settings.