DNA tilts and stretches underlie differences in mutation rates across genomes
Each cell in the body stores its genetic information in DNA in a stable and protected form that is readily accessible for the cell to carry on its activities. Nevertheless, mutations—changes in genetic information—occur throughout the human genome and can have a powerful influence on human health and evolution.
"Our team is interested in a classical question about mutation—why do mutation rates in the genome vary so tremendously from one DNA location to another? We just do not have a clear understanding of why this occurs," said Dr. Md. Abul Hassan Samee, assistant professor of integrative physiology at Baylor College of Medicine and corresponding author of the work.
Previous studies have shown that the DNA sequences flanking a mutated position, known as the sequence context, play a strong role in the mutation rate. "But this explanation still leaves unanswered questions," Samee said. "For example, one type of mutation occurs frequently in a specific sequence context while a different type of mutation occurs infrequently in that same sequence context. So, we think that a different mechanism could explain how mutation rates vary in the genome. We know that each building block or base that makes up a DNA sequence has its own 3D chemical shape. We proposed, therefore, that there is a connection between DNA shape and mutation rates, and this paper shows that our idea was correct."
The genetic code is "written" as a string of bases, but that string is also underwound or overwound and constrained into loops, all of which is known to influence every aspect of DNA activity. Surprisingly, most genome analyses treat DNA merely as a string of bases and ignore the fact that each base has a shape.
"We built a statistical model using only DNA structural information, otherwise ignoring the sequence data," said first author Zian Liu, a graduate student in the Samee lab. "We used the model to pinpoint which DNA shape features, such as stretches, twists or tilts, underlie variations of mutation rates in the human genome. Surprisingly, we found that although the sequence context may look very different from one mutation to another, the structural properties are remarkably similar."
"We found that stretch—the distance between paired building blocks in the two DNA strings forming the double helix—is one of the top structural properties that defines whether a location is mutable," Liu said.
"Although we were expecting these results, we did not anticipate that for all types of mutations the same DNA structural feature, stretch, would be similarly important in affecting the mutation rate," Samee said. "DNA tilt was the second structural feature that most influenced mutation rate of all types. We confirmed that DNA shape is important in functionally relevant regions of the human genome, such as protein-DNA binding sites that regulate gene expression, and that this structural mechanism is conserved across many species."
The DNA-shape models of mutation rates developed by Liu and Samee showed similar or improved performance when compared to sequence-based models and accurately characterized mutation hotspots. This study supports considering DNA shape when studying mechanisms of mutation rate variations in the human genome.
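As a rough illustration of what a shape-only analysis can look like, the sketch below ranks structural features by how strongly each one correlates with a per-site mutation rate. It is not the method used in the study: the feature names (`shape_a`, `shape_b`, `shape_c`), the helper functions, and all numbers are hypothetical, and the toy rates are constructed from the toy features purely so that the example runs. In a real analysis the columns would be measured or predicted shape parameters such as stretch or tilt, and the rates would come from observed variants.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_shape_features(features, rates):
    """Sort features by |correlation| with the mutation rate, strongest first."""
    scores = {name: pearson(values, rates) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical toy values for five sites; NOT data from the study.
toy_features = {
    "shape_a": [0.1, 0.2, 0.3, 0.4, 0.5],
    "shape_b": [0.5, 0.1, 0.4, 0.2, 0.6],
    "shape_c": [0.3, 0.3, 0.2, 0.4, 0.3],
}

# Synthetic rates built from the toy features (an assumption for illustration):
# shape_a contributes most, shape_b less, shape_c not at all.
toy_rates = [
    2.0 * a + 0.5 * b
    for a, b in zip(toy_features["shape_a"], toy_features["shape_b"])
]

ranking = rank_shape_features(toy_features, toy_rates)
for name, score in ranking:
    print(f"{name}: correlation with mutation rate = {score:.2f}")
```

Note that the sequence itself never enters the calculation; only shape-derived numbers do. A single-feature correlation is a much cruder tool than a fitted statistical model, so this sketch should be read as an illustration of the input and output only.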
Dr. Lynn Zechiedrich, professor of molecular virology and microbiology at Baylor and not an author on the paper, notes, "I'm so excited by this advance by Liu and Samee. Of course, DNA is more than just a string of letters, so genomic analysis tools that treat it as such miss out."
"For the last 20 years, the human genome has been seen as a linear sequence of building blocks. But studies like ours and others show that DNA is much more than that; it has a 3D structure that carries important meaning," Samee said. "A meaning that is very consistent when it comes to explaining variation of mutation rates and is likely conserved among species."
The study is published in the journal Nucleic Acids Research.
More information: Zian Liu et al., Structural underpinnings of mutation rate variations in the human genome, Nucleic Acids Research (2023). DOI: 10.1093/nar/gkad551