A scientific response to the recent reporting on the origin of SARS-CoV-2

Synthetic RNA virus assembly. Directed assembly of ~30 kb CoV genomes requires several design considerations. A) Several identical type II enzymes cannot be used for directed genome assembly as this leads to random fragment sequences, inverted fragments, and loops. Use of different type II enzymes that cut in their recognition sequence for every junction requires working with numerous buffers, running numerous reactions at different temperatures, and may require modifying numerous recognition sites in the genome. The use of fewer distinct enzymes is preferred. B) Endonucleases that cleave outside of their recognition sequence (type II shifted or type IIS) can produce distinct sticky ends allowing for directed assembly of complex viral genomes. C) For IVGA, individual fragments derived from PCR or DNA synthesis are first amplified in bacterial plasmids. D) Fragments are then cut out of the plasmids using type IIS endonucleases. E) Unique sticky ends at each section enable directed assembly in a full-length cDNA or bacterial artificial chromosome. F) Use of a different type IIS endonuclease with sites flanking a region of interest (ROI) allows for efficient substitutions of that region. G) This method does not alter viral proteins. However, it does leave a distinct pattern (fingerprint) of regularly spaced type IIS recognition sites of the endonucleases that were used for synthetic assembly. Credit: bioRxiv (2022). DOI: 10.1101/2022.10.18.512756

The origin of SARS-CoV-2 remains unresolved. In a non-peer-reviewed preprint published on bioRxiv on Oct. 20, 2022, three authors present analyses that, according to their interpretation, suggest a "synthetic emergence" of SARS-CoV-2 and its release in the context of a "laboratory accident".

Experts from the University of Würzburg and the University Hospital have reviewed the preprint on the origin of SARS-CoV-2. In summary, the analyses presented in the study do not provide sufficient evidence for the authors' conclusion that SARS-CoV-2 is of synthetic origin.

The authors of the preprint are Valentin Bruttel (Department for Obstetrics and Gynecology, University Clinics of Würzburg, Germany), Alex Washburne, and Antonius Van Dongen.

Summary of the preprint

The core message of the preprint is that the genome of SARS-CoV-2 has an "abnormal pattern" of recognition sites for certain restriction enzymes (BsaI and BsmBI) and is therefore highly unlikely to have arisen by natural evolution.

Based on statistical analyses, the authors conclude that this pattern most likely arose as a "fingerprint" in the SARS-CoV-2 genome when a reverse genetics system for the virus (most likely a bat-derived predecessor of SARS-CoV-2) was established in a research laboratory.


Restriction enzymes such as BsaI and BsmBI are used to clone genomes of coronaviruses isolated from animals such as bats and render them accessible for systematic research.

Such so-called reverse genetics systems are of great importance for virological research. They allow the genetic conservation of otherwise mutation-prone viruses as well as the investigation of individual viral gene functions. Using these tools, coronavirus genomes, which are relatively large at just under 30,000 nucleotides, can be cloned and modified in bacteria in 5-8 small subfragments.

For the cloning of the various fragments, individual bases of the viral genome occasionally have to be exchanged in order to insert the restriction sites necessary for cloning, or to remove unwanted restriction sites. Subsequently, the individual DNA fragments can be re-assembled to give rise to a complete viral genome.
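
To make the bookkeeping behind such a cloning strategy concrete, the following minimal Python sketch locates BsaI and BsmBI recognition sequences on both strands of a sequence and reports the resulting fragment lengths. The recognition sequences GGTCTC (BsaI) and CGTCTC (BsmBI) are standard; the function names and the toy sequence are illustrative assumptions and are not taken from the preprint. For simplicity, the sketch splits at the start of each recognition sequence, whereas type IIS enzymes actually cut a few bases outside it.

```python
# Recognition sequences (5'->3') of the two type IIS enzymes named in the article.
RECOGNITION_SITES = {"BsaI": "GGTCTC", "BsmBI": "CGTCTC"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(genome, motif):
    """Return sorted 0-based start positions of motif on either strand."""
    genome = genome.upper()
    positions = set()
    for pattern in (motif, reverse_complement(motif)):
        start = genome.find(pattern)
        while start != -1:
            positions.add(start)
            start = genome.find(pattern, start + 1)
    return sorted(positions)


def fragment_lengths(genome, positions):
    """Lengths of the pieces obtained by splitting at each site start.

    Simplification: real type IIS enzymes cut outside the recognition site.
    """
    bounds = [0] + sorted(positions) + [len(genome)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical 50-nt toy sequence: one BsaI site (forward strand) and
# one BsmBI site (reverse strand, written here as GAGACG).
toy = "A" * 10 + "GGTCTC" + "A" * 20 + "GAGACG" + "A" * 8

bsai = find_sites(toy, RECOGNITION_SITES["BsaI"])    # [10]
bsmbi = find_sites(toy, RECOGNITION_SITES["BsmBI"])  # [36]
print(fragment_lengths(toy, bsai + bsmbi))           # [10, 26, 14]
```

Applied to a real genome, the same two functions would give the number and size of the fragments that a given enzyme combination produces, which are the quantities the evaluation below refers to.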

Scientific evaluation

1. Contrary to the authors' claim, the restriction site pattern of SARS-CoV-2 may well have arisen naturally: similar patterns are also found in coronaviruses closely related to SARS-CoV-2.

All five restriction sites central to the analyses in the preprint, three for BsmBI and two for BsaI, are also commonly found in closely related coronaviruses. Thus, the existence of these five sites in the SARS-CoV-2 genome can be explained without human manipulation. Some known coronaviruses closely related to SARS-CoV-2 were evidently not included in the analyses performed in the preprint.

The authors of the preprint further argue that restriction sites in the SARS-CoV-2 genome that are unfavorable for genetic work are absent and presumably have been artificially altered ("deleted"). However, while many other coronaviruses actually show significantly more restriction sites for the two restriction enzymes analyzed, some closely related bat coronaviruses such as BANAL-20-103 and BANAL-20-116 also show only 5 and 7 restriction sites, respectively, with similarly sized genome fragments.

2. The position of the two BsaI restriction sites in the region of the S gene does not indicate genetic manipulation of the SARS-CoV-2 genome.

The Spike protein of coronaviruses is of particular interest because it determines whether human cells can be infected. For reverse genetics models, researchers have therefore wanted to be able to exchange or modify the Spike-coding region of coronavirus genomes.

As shown by the authors, the two restriction sites for BsaI could be used to easily manipulate the most important part of the Spike protein of SARS-CoV-2, namely the receptor-binding domain (RBD) and the furin cleavage site (FCS). For no other coronavirus isolate does this appear to be so easily possible with BsaI, since either the appropriate restriction sites are missing or additional sites would lead to additional unwanted fragments upon digestion with BsaI.

The authors argue that this suggests that the SARS-CoV-2 genome has been optimized for easy replacement and manipulation of the most important parts of the Spike protein. In fact, the combination of BsaI and BsmBI has been used in the past by research groups in Wuhan to clone coronavirus genomes from bats and perform so-called gain-of-function experiments. In that earlier work, however, the two BsaI restriction sites were positioned to allow the exchange of the entire Spike protein.

Exchanging the entire Spike protein in this way is not possible for SARS-CoV-2. If the two BsaI restriction sites had been in exactly the same positions as in previously published reverse genetics models, this would indeed have provided strong evidence for human manipulation. Instead, the two observed BsaI restriction sites are also frequently found in coronaviruses closely related to SARS-CoV-2.

It is noteworthy, however, that these coronaviruses usually possess at least one further BsaI site, which would have to be eliminated (i.e., mutated). Given the high mutation rate of circulating SARS-CoV-2 variants, one would also expect that artificially inserted synonymous (wobble) mutations, inserted to create or eliminate defined restriction sites, would disappear and artificially eliminated ones would reappear over the course of the more than two-year pandemic.

In fact, however, the Omicron variants still have the same restriction site distribution pattern as the original Wuhan virus.

3. The statistical analyses of the paper on the distribution of restriction sites are flawed or incomplete in important respects.

The combination of BsaI and BsmBI analyzed in the preprint is indeed not suitable for dissecting the genomes of the vast majority of coronaviruses into an appropriate number of fragments (5 to 7) of suitable size (<8000 nucleotides). However, as our own analyses showed, this is easily possible with other similar restriction enzymes. With the algorithms used and transparently documented in the preprint, it can be shown that a suitable combination of restriction enzymes can be found for virtually any coronavirus genome.

The analysis of a single, selectively chosen combination of two restriction enzymes (here: BsaI and BsmBI) is not suitable to prove human intervention. If one analyzes only one combination of two restriction enzymes suitable for a particular virus, it is predictable that this combination will be significantly less suitable for other virus isolates to construct a reverse genetics model.

This leads to the authors' misinterpretation that the virus, in this case SARS-CoV-2, for which the selection of the analyzed combination of restriction endonucleases was optimized, is not of natural origin.
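
The selection effect described in this point can be sketched in a few lines of Python. The example below searches every pair from a small enzyme panel for combinations that split a sequence into an acceptable number of fragments below a maximum size. The panel (recognition sequences of four commonly used type IIS enzymes), the two toy sequences, the function names, and the scaled-down thresholds are assumptions chosen for illustration; this is not the algorithm documented in the preprint, and fragments are split at the start of each recognition sequence for simplicity.

```python
from itertools import combinations

# Illustrative panel of type IIS recognition sequences (5'->3').
PANEL = {"BsaI": "GGTCTC", "BsmBI": "CGTCTC", "BbsI": "GAAGAC", "SapI": "GCTCTTC"}


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def site_starts(genome, motif):
    """0-based start positions of motif on either strand."""
    hits = set()
    for pattern in (motif, revcomp(motif)):
        i = genome.find(pattern)
        while i != -1:
            hits.add(i)
            i = genome.find(pattern, i + 1)
    return hits


def suitable_pairs(genome, panel, min_frags, max_frags, max_len):
    """Enzyme pairs giving min_frags..max_frags fragments, all shorter than max_len."""
    found = []
    for a, b in combinations(panel, 2):
        cuts = sorted(site_starts(genome, panel[a]) | site_starts(genome, panel[b]))
        bounds = [0] + cuts + [len(genome)]
        frags = [hi - lo for lo, hi in zip(bounds, bounds[1:])]
        if min_frags <= len(frags) <= max_frags and max(frags) < max_len:
            found.append((a, b))
    return found


# Two hypothetical toy "genomes", each carrying sites for a different pair.
genome_x = "A" * 10 + "GGTCTC" + "A" * 10 + "CGTCTC" + "A" * 10
genome_y = "A" * 10 + "GAAGAC" + "A" * 10 + "GCTCTTC" + "A" * 10

# Toy thresholds: 3 to 4 fragments, each shorter than 20 nt.
pairs_x = suitable_pairs(genome_x, PANEL, 3, 4, 20)
pairs_y = suitable_pairs(genome_y, PANEL, 3, 4, 20)
print(pairs_x)  # [('BsaI', 'BsmBI')]
print(pairs_y)  # [('BbsI', 'SapI')]
```

Judged only by the pair that happens to fit the first sequence (BsaI and BsmBI), the second sequence looks unsuitable for cloning, although a search over the whole panel finds a pair that fits it equally well. In a real analysis the thresholds would be the 5 to 7 fragments of fewer than 8000 nucleotides mentioned above.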

4. The analyses on the in silico evolution of two closely related coronaviruses with the aim of obtaining a restriction pattern comparable to that of SARS-CoV-2 are not convincing.

The assumption of purely random mutations in a viral genome is not valid, since most mutations disrupt or destroy the amino acid sequence of the viral proteins and are thus under selective pressure. In addition, the authors would have had to analyze all acceptable combinations of restriction enzymes in this case as well.

The calculation described in the preprint of the probability for natural evolution of the observed restriction site pattern of SARS-CoV-2 is flawed. For this purpose, the authors combine the probabilities for a total of five different criteria. However, these are neither independent of each other, nor is the method used to combine these probability values appropriate. Furthermore, each individual probability calculated is affected by the same potential sources of error as listed above.
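
Why the independence of the criteria matters can be shown with a deliberately simple toy calculation. The Python sketch below uses a fair die rather than genome data, and the two criteria are hypothetical; it does not reproduce the authors' calculation. It only shows that multiplying the probabilities of dependent criteria can understate the probability that all of them are met together.

```python
from fractions import Fraction

OUTCOMES = range(1, 7)  # a fair six-sided die, used purely as a toy model


def prob(event):
    """Exact probability of an event over the equally likely outcomes."""
    return Fraction(sum(1 for o in OUTCOMES if event(o)), len(OUTCOMES))


def criterion_a(outcome):
    """Hypothetical criterion A: the roll is at most 2."""
    return outcome <= 2


def criterion_b(outcome):
    """Hypothetical criterion B: the roll is at most 3 (always true when A is)."""
    return outcome <= 3


# Multiplying the single probabilities is only valid for independent criteria.
naive_product = prob(criterion_a) * prob(criterion_b)

# The actual probability that both criteria hold at once.
true_joint = prob(lambda o: criterion_a(o) and criterion_b(o))

print(naive_product)  # 1/6
print(true_joint)     # 1/3
```

Here the naive product is half the true joint probability, because criterion B adds no information once criterion A is met. With five partly overlapping criteria, such underestimates can compound.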


In their preprint, the authors present statistical analyses of the genome sequence of SARS-CoV-2 from which they conclude a synthetic origin of SARS-CoV-2. The preprint is carefully prepared and meets basic scientific requirements, particularly with respect to a sound and transparent presentation of the methodology used.

However, the analyses presented in the preprint show considerable methodological weaknesses. As a result, the authors' main conclusions do not stand up to scientific scrutiny or result from over-interpretation of their analyses. In contrast to the statements formulated in the preprint, the pattern of restriction sites found in the genome of SARS-CoV-2 does not suggest genetic manipulation with the claimed probability.

In summary, the analyses presented in the study do not provide sufficient evidence for the authors' conclusion that SARS-CoV-2 is of synthetic origin. The origin of SARS-CoV-2 thus remains unresolved.

More information: Valentin Bruttel et al, Endonuclease fingerprint indicates a synthetic origin of SARS-CoV-2, bioRxiv (2022). DOI: 10.1101/2022.10.18.512756

Provided by Universitätsklinikum Würzburg
Citation: A scientific response to the recent reporting on the origin of SARS-CoV-2 (2022, October 26) retrieved 27 May 2024 from https://medicalxpress.com/news/2022-10-scientific-response-sars-cov-.html
