Blockchain not just for bitcoin. It can secure and store genomes too
Blockchain is a digital technology that keeps a secure, decentralized record of transactions, and it is increasingly used for everything from cryptocurrencies to artwork. But Yale researchers have found a new use for blockchain: they have leveraged the technology to give individuals control of their own genomes.
Their findings are published June 29 in the journal Genome Biology.
"Our primary goal is to give ownership of genomic data back to the individual," said senior author Mark Gerstein, the Albert L. Williams Professor of Biomedical Informatics and professor of molecular biophysics and biochemistry, of computer science, and of statistics and data science.
Millions of people seeking insights into their ancestry or information about medical risks have already donated their genetic information to private commercial companies. Whether they know it or not, however, they also have given up control over how that information is used or sold.
The new technology, dubbed SAMchain, ensures that individual genomic information remains secure and under the control of the individual. Because information cannot be changed once it is stored in a blockchain, the technology also protects against the occasional corruption of DNA data in the cloud, the far-flung networks of computers where most genomic information is now stored.
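To illustrate why data stored in linked blocks resists silent change, here is a minimal Python sketch of a hash-linked chain. It is not SAMchain's code or data model; the function names (`block_hash`, `build_chain`, `first_invalid_block`) and the variant strings are hypothetical, chosen only for this example. Each block's hash covers its contents and the hash of the block before it, so altering any stored record breaks the chain from that point on.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(payload, prev_hash):
    """Hash a block's contents together with the previous block's hash."""
    body = json.dumps({"payload": payload, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def build_chain(payloads):
    """Link each payload to its predecessor through the predecessor's hash."""
    chain = []
    prev = GENESIS
    for payload in payloads:
        current = block_hash(payload, prev)
        chain.append({"payload": payload, "prev": prev, "hash": current})
        prev = current
    return chain


def first_invalid_block(chain):
    """Return the index of the first block that fails verification, or None."""
    prev = GENESIS
    for i, block in enumerate(chain):
        expected = block_hash(block["payload"], prev)
        if block["prev"] != prev or block["hash"] != expected:
            return i
        prev = block["hash"]
    return None


# Hypothetical records, invented purely for illustration.
chain = build_chain(["chr1:12345 A>G", "chr2:67890 C>T", "chr7:555 G>A"])
intact = first_invalid_block(chain)      # None: nothing has been altered

chain[1]["payload"] = "chr2:67890 C>A"   # simulate corruption of one record
tampered = first_invalid_block(chain)    # 1: the altered block is detected
```

The sketch only detects a change. In a real blockchain, many computers hold copies of the chain and reject any copy that fails this kind of check, which is what makes stored records effectively unchangeable.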
"As genomic data becomes increasingly integral to our understanding of human health and disease, its integrity and security must be a priority when providing solutions to storage and analysis," Gerstein said. "Corruption, change, or loss of personal genomes could create problems in patient care and research integrity in the future."
The SAMchain technology could also speed up the advance of truly personalized medicine, the study authors say. For instance, patients would be able to provide direct access to their genomic data to doctors, who could then use the information to help diagnose and treat medical conditions. Patients could also give medical researchers permission to use their genetic information in their investigations, or even sell it to pharmaceutical companies.
Researchers say the development of blockchain technology for medical purposes has been hampered by a huge roadblock: the immense amount of data contained within our DNA. A financial transaction facilitated by blockchain, such as a bitcoin trade, requires only a limited amount of data storage. By contrast, data from the sequencing of a single human chromosome can contain millions of "reads," or short fragments of DNA.
The Yale team, led by Gamze Gürsoy, a former Yale postdoctoral research associate who is now at Columbia University, and Charlotte Brannon, a member of Gerstein's lab, worked around that problem by comparing an individual's DNA against a standard reference genome. They then stored only the differences in linked blocks of the blockchain. The blocks, in turn, are indexed in a special way to allow for rapid query.
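The general idea of storing only differences and indexing them for fast lookup can be sketched in a few lines of Python. This is a simplified illustration, not the team's method: it assumes the two sequences are already aligned and of equal length, handles only single-letter substitutions, and omits the cryptographic linking of blocks. The function names, block size, and bin size are all hypothetical.

```python
from collections import defaultdict


def find_differences(reference, sample):
    """List (position, reference_base, sample_base) wherever the two differ.

    Assumes equal-length, pre-aligned sequences (substitutions only).
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must have the same length")
    return [
        (pos, ref, alt)
        for pos, (ref, alt) in enumerate(zip(reference, sample))
        if ref != alt
    ]


def pack_into_blocks(diffs, block_size):
    """Group the differences into fixed-size blocks."""
    return [diffs[i:i + block_size] for i in range(0, len(diffs), block_size)]


def build_index(blocks, bin_size):
    """Map each positional bin to the set of block numbers holding its data."""
    index = defaultdict(set)
    for block_no, block in enumerate(blocks):
        for pos, _ref, _alt in block:
            index[pos // bin_size].add(block_no)
    return index


def query(blocks, index, start, end, bin_size):
    """Return differences with start <= position < end, reading only the
    blocks that the index says overlap that range."""
    block_nos = set()
    for bin_no in range(start // bin_size, (end - 1) // bin_size + 1):
        block_nos |= index.get(bin_no, set())
    hits = []
    for block_no in sorted(block_nos):
        hits.extend(d for d in blocks[block_no] if start <= d[0] < end)
    return hits


# Toy sequences, invented for illustration only.
reference = "ACGTACGTACGTACGT"
sample = "ACGAACGTACCTACGG"

diffs = find_differences(reference, sample)
blocks = pack_into_blocks(diffs, block_size=2)
index = build_index(blocks, bin_size=8)
hits = query(blocks, index, start=8, end=16, bin_size=8)
```

Here only three of the 16 positions differ from the reference, so only three records need to be stored, and a query for positions 8 through 15 consults the index instead of scanning every block. Real genomes also contain insertions and deletions, which this sketch does not handle.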
Those individual differences can be linked to conditions with known genetic risk factors. That information can not only tell patients their personal risk of developing a disease but also help guide treatments for existing disorders.
Gerstein also hopes to expand SAMchain so that it can store information on gene expression profiles, the set of genes that are metabolically active in an individual.
The new approach, he said, would be released as open source and made available to all researchers free of charge, with an individual's permission.
"We think this will actually make genomic research easier," Gürsoy said.
More information: Gamze Gürsoy et al, Storing and analyzing a genome on a blockchain, Genome Biology (2022). DOI: 10.1186/s13059-022-02699-7