An international team of scientists has successfully employed single molecule, real-time (SMRT) DNA sequencing technology from Pacific Biosciences of California, Inc. (NASDAQ: PACB) to provide valuable insights into the pathogenicity and evolutionary origins of the highly virulent bacterium responsible for the German E. coli outbreak. Published online today in the New England Journal of Medicine, the results provide the most detailed genetic profile to date of the outbreak strain, including medically relevant information.
The researchers determined that the outbreak strain was a member of the enteroaggregative pathotype of E. coli (EAEC) with serotype O104:H4. The outbreak isolates are distinguished from other O104:H4 strains because they contain genes encoding Shiga toxin 2 (Stx2) and a distinct set of additional virulence and antibiotic resistance factors. In addition, the team found that expression of the stx2 gene was increased by certain antibiotics, including ciprofloxacin, suggesting that caution is warranted before certain classes of antibiotics are used to counteract this newly emerged pathogen.
By sequencing the outbreak strain and 11 related strains with the PacBio RS, the team concluded that horizontal genetic exchange with the Shiga toxin-producing enterohemorrhagic E. coli (EHEC) strain enabled the emergence of the highly virulent Shiga toxin-producing O104:H4 EAEC strain. The genetic analysis also indicates that evolution of this new form was a relatively recent event.
The team identified many virulence factor genes commonly found in EAEC. Furthermore, the exceptionally long sequencing reads that are characteristic of PacBio SMRT DNA sequencing technology also enabled the team to detect larger-scale deletions, insertions, inversions and other structural variation between the O104:H4 outbreak samples and the other O104:H4 EAEC samples that were sequenced. Several of these structurally divergent regions house genes that encode virulence factors. The outbreak strain also diverges from common EAEC isolates in the number and nature of its SPATE proteases. Taken together, the results provide a possible explanation for the increased virulence of the German E. coli outbreak strain.
The authors included scientists in the U.S. and Denmark from Pacific Biosciences, the University of Maryland School of Medicine, the University of Virginia School of Medicine, the World Health Organization Collaborating Centre for Reference and Research on Escherichia coli and Klebsiella, the Statens Serum Institut, Hvidovre University Hospital, Brigham and Women's Hospital and Harvard Medical School.
"This multi-strain sequencing data and analysis significantly increases the amount of scientific information available for the study of this new deadly form of E. coli and has yielded critical insights into its causative agent," said co-author David A. Rasko, Ph.D., Assistant Professor, University of Maryland School of Medicine, Institute for Genome Sciences and Department of Microbiology and Immunology. "Our results provide the most complete published genome of this strain to date and highlight the importance of DNA sequencing to understanding how the plasticity of bacterial genomes facilitates the emergence of new pathogens."
Whole genome sequencing involves decoding the precise order of nucleotide bases that make up an organism's complete set of DNA and provides more comprehensive information than other analysis methods such as DNA fingerprinting or arrays. With advances in technology and decreasing cost, whole genome sequencing is emerging as the gold standard method for identifying and classifying infectious agents. SMRT technology is the latest advance in DNA sequencing, capable of generating long sequence reads to resolve structural variations and complex genomes at ultra-fast speeds by 'eavesdropping' on DNA replicating in real time.
Eric Schadt, Ph.D., Chief Scientific Officer of Pacific Biosciences and co-author of the paper, commented: "We have reached a new era in which communities of researchers can rapidly share large-scale data sets and analyses vital for public health. Sequencing genomes in hours, as opposed to days or weeks, with unprecedented read lengths is the emerging hallmark of third generation DNA sequencing. The long PacBio RS reads enabled a PacBio-only de novo genome assembly, a key component of new pathogen characterization, as well as deeper insights into structural variants."