After Covid-19 data is deleted, NIH reviews how its gene archive is handled

The data, a series of gene sequences from coronavirus samples obtained from Covid-19 patients in Wuhan in January and February 2020, could hold clues about the origin of the pandemic (Photo: AP)
wsj

Removal of coronavirus gene sequences that might hold clues to the pandemic’s origin sparked concern among scientists and US senators

The National Institutes of Health said it was reviewing the removal of genetic data about the Covid-19 virus from an agency-run archive after a scientist raised concerns about the episode earlier this summer.

The data—a series of gene sequences from coronavirus samples obtained from Covid-19 patients in Wuhan in January and February 2020—could hold clues about the origin of the pandemic. The sequences were deleted from the Sequence Read Archive (SRA) last year at the request of one of the Wuhan University researchers who had originally provided them—a move that three Republican U.S. senators questioned in June in a sternly worded letter to NIH Director Francis Collins.

“The efforts by Chinese researchers to delete the data demands additional explanation,” Senators Marsha Blackburn (R., Tenn.), Charles E. Grassley (R., Iowa) and Roger Marshall (R., Kansas) wrote in the letter. The senators cited a June 23 Wall Street Journal article about the deletion of the sequences as the reason for their inquiry.

In a reply to the senators dated Sept. 8, Dr. Collins said a review was under way to determine “whether appropriate steps were taken to assess this withdrawal request.” An NIH spokeswoman on Sunday said that the review had been completed and that NIH leaders would weigh the findings.

“After all the American people have been through since the pandemic started, they deserve straight answers to basic questions which the Biden administration has failed to give them to date,” the three senators said on Sunday in a statement after receiving Dr. Collins’s letter. The NIH and its parent agency, the Department of Health and Human Services, “have failed to be fully transparent with Congress and the American people,” the senators said.

Aides to all three senators said they intended to seek greater clarity from the NIH on its decision to comply with the request and whether that request was handled appropriately.

The agency is withholding the names of the individuals involved in the data’s removal to protect their privacy, the NIH spokeswoman said.

The exchange of letters comes as scientists and the federal government work to determine the origin of the pandemic while criticizing China for withholding information that might be helpful.

An international scientific team led by the World Health Organization reported in March that the pandemic virus, SARS-CoV-2, likely spread to humans by contact with an unknown animal that had been infected by another animal, possibly a bat. But that finding has been sharply criticized in recent months, with some scientists saying there isn’t enough evidence to determine whether that hypothesis or the other leading one—that the deadly virus began spreading in humans after escaping from a lab—is the correct one.

U.S. intelligence agencies recently delivered a report to President Biden saying a lack of data made it difficult to reach a definitive conclusion on the origin of the pandemic.

Scientists routinely scour gene sequences of the sort removed from the archive as a way to find clues about the origin and evolution of viral pathogens. Gene sequences often mutate as a virus spreads from person to person, and studying the mutations can shed light on when, where and how pathogens like the Covid-19 virus get their start.

The controversy began in June, when a virologist at the Fred Hutchinson Cancer Research Center in Seattle, Wash., reported in a paper posted online that he had discovered that the sequences had been deleted from the NIH-run database, which is widely used by scientists around the world. As a result, “nobody was aware these sequences existed,” the virologist, Jesse Bloom, wrote in the paper, adding that the deletion “suggests a less than wholehearted effort to maximize information about viral sequences from early in the Wuhan epidemic.”

Two weeks after Dr. Bloom’s paper was posted, the Chinese researchers uploaded the deleted sequences to a public database maintained by the China National Center for Bioinformation. The researchers had published information about the sequences in a scientific journal in June 2020.

The researchers didn’t respond to an email requesting comment.

China’s National Health Commission said the request for the deletion came about as the result of a misunderstanding between the Chinese researchers and the journal that had published the paper describing the sequences, according to an online post identified as that of an employee of the state-affiliated Xinhua News Agency. The commission’s vice-minister said that Dr. Bloom had “made up a conspiracy theory that it was a cover up” and that the deleted sequences were of little value for tracing the origin of the pandemic virus, according to a translation in the post.

China’s National Health Commission didn’t respond to a request for comment.

Dr. Collins said in his letter that the Wuhan University researcher requested the withdrawal of the sequences because updated data was being uploaded to another database and the researchers wanted to prevent confusion. Dr. Bloom said he later analyzed the sequences in the Chinese database and found them to be identical to the ones removed from the U.S. database.

“To me anyway, it seems like the policies might have ended up being abused to obscure the existence of these sequences,” Dr. Bloom said.

It is unusual for data submitted to the Sequence Read Archive to be deleted later on. From March 2020 to March 2021, the archive received about 2.4 million submissions of sequence data, according to a spokeswoman for the National Center for Biotechnology Information, the NIH division that maintains the archive. In that same period, 2.09% of the submissions were updated and 0.19% were withdrawn, the spokeswoman said.

NCBI officials said that it was hard to determine the validity of requests to update or remove data from the Sequence Read Archive and that they take such requests at face value. “We can’t adjudicate the truth,” said Stephen Sherry, acting director of the NCBI.

The NIH said it retains withdrawn data for the scientific record and for disaster recovery.

The review encompassed the archive’s procedures and training practices generally as well as the specific request from the Chinese researcher to remove the sequences from the Covid-19 virus, according to Dr. Sherry.

The NIH spokeswoman said the agency is still discussing whether policy changes are needed. “In the meantime,” she added, “should the owners of the original data wish to redeposit the SARS-CoV-2 sequences into SRA, we will make that data available.”
