Medical studies with striking results often prove false

by Eryn Brown, Los Angeles Times

If a medical study seems too good to be true, it probably is, according to a new analysis.

In an analysis of nearly 230,000 trials compiled from a variety of disciplines, study results that claimed a "very large effect" rarely held up when other research teams tried to replicate them, researchers reported in Wednesday's edition of the Journal of the American Medical Association.

"The effects largely go away; they become much smaller," said Dr. John Ioannidis, the Stanford researcher who was the report's senior author. "It's likely that most interventions that are effective have modest effects."

Ioannidis and his colleagues came to this conclusion after examining 228,220 trials grouped into more than 85,000 "topics" - collections of studies that paired a single intervention (such as taking a non-steroidal anti-inflammatory drug for postoperative pain) with a single outcome (such as experiencing 50 percent relief over six hours). In 16 percent of those topics, at least one study in the group claimed that the intervention made patients at least five times more likely to either benefit or suffer compared with control patients who did not receive the treatment.

In at least 90 percent of those cases, the team found, including data from subsequent trials reduced those odds.

The analysis revealed several reasons to question the significance of the very-large-effect studies, Ioannidis said.

Studies that reported striking results were more likely to be small, with fewer than 100 subjects who experienced fewer than 20 medical events. With such small sample sizes, Ioannidis said, large effects are more likely to be the result of chance.

"Trials need to be of a magnitude that can give useful information," he said.
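The small-sample point can be illustrated with a quick simulation. The sketch below is illustrative only: the trial sizes, event rates, and the assumed "true" odds ratio of roughly 1.5 are hypothetical numbers, not figures from the JAMA analysis. It shows how often a small trial reports a fivefold effect purely by chance when the real effect is modest.

```python
import random

random.seed(42)

def simulate_trial(n_per_arm, p_control, p_treatment):
    """Simulate one two-arm trial; return the observed odds ratio."""
    treat_events = sum(random.random() < p_treatment for _ in range(n_per_arm))
    ctrl_events = sum(random.random() < p_control for _ in range(n_per_arm))
    # Haldane correction (add 0.5 to each cell) avoids division by zero
    a, b = treat_events + 0.5, n_per_arm - treat_events + 0.5
    c, d = ctrl_events + 0.5, n_per_arm - ctrl_events + 0.5
    return (a / b) / (c / d)

# A modest true effect: event rates giving a true odds ratio of about 1.5
p_ctrl, p_treat = 0.20, 0.27

def frac_extreme(n_per_arm, n_trials=5000):
    """Fraction of simulated trials that report an odds ratio of 5 or more."""
    hits = sum(simulate_trial(n_per_arm, p_ctrl, p_treat) >= 5
               for _ in range(n_trials))
    return hits / n_trials

# Small trials occasionally report a fivefold effect by chance;
# large trials of the same intervention essentially never do.
print("n=25 per arm :", frac_extreme(25))
print("n=500 per arm:", frac_extreme(500))
```

Running this, the tiny trials produce "very large effects" a few percent of the time even though the true effect is modest, while the larger trials converge on the truth - the same shrinkage pattern the researchers observed when later, bigger studies were added to a topic.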

What's more, the studies that claimed a very large effect tended to measure intermediate outcomes - for example, whether a drug lowered the level of a biomarker in patients' blood - rather than the incidence of disease or death itself, outcomes that are more meaningful in assessing medical treatments.

The analysis did not examine individual study characteristics, such as whether the experimental methods were flawed.

The report should remind patients, physicians and policymakers not to give too much credence to small, early studies that show huge treatment effects, Ioannidis said.

One such example: the cancer drug Avastin. Clinical trials suggested the drug might double the time breast cancer patients could live with their disease without getting worse. But follow-up studies found no improvements in progression-free survival, overall survival or patients' quality of life. As a result, the U.S. Food and Drug Administration in 2011 withdrew its approval to use the drug to treat breast cancer, though it is still approved to treat several other types of cancer.

With early glowing reports, Ioannidis said, "one should be cautious and wait for a better trial."

Dr. Rita Redberg, a cardiologist at the University of California, San Francisco, who was not involved in the study, said devices and drugs frequently get accelerated approval on the basis of small studies that use intermediate end points.

"Perhaps we don't need to be in such a rush to approve them," she said.

The notion that dramatic results don't hold up under closer scrutiny isn't new. Ioannidis, a well-known critic of the methods used in medical research, has written for years about the ways studies published in peer-reviewed journals fall short. (He's perhaps best known for a 2005 essay in the journal PLoS Medicine titled, "Why Most Published Research Findings Are False.")

But the scope of the JAMA analysis sets it apart from Ioannidis' earlier efforts, said Dr. Gordon Guyatt, a clinical epidemiologist at McMaster University in Hamilton, Canada, who was not involved in the work.

"They looked through a lot of stuff," he said.

Despite widespread recognition that big effects are likely to disappear upon further scrutiny, people still "get excited, and misguidedly so" when presented with home-run results, Guyatt said.

He emphasized that modest effects could benefit patients and were often "very important" on a cumulative basis.

