Evidence-based medicine lacks solid supporting evidence

Assumptions for generalizing clinical trial data to particular patients rarely withstand scrutiny

For millennia, medicine was more art than science.

From at least the time of Hippocrates in ancient Greece, physicians were taught to use their intuition, based on their experience.

“For it is by the same symptoms in all cases that you will know the diseases,” he wrote. “He who would make accurate forecasts as to those who will recover, and those who will die … must understand all the symptoms thoroughly.”

In other words, doctors drew general conclusions from experience to forecast the course of disease in particular patients.

But Hippocratic medicine also incorporated “scientific” theory — the idea that four “humors” (blood, black bile, yellow bile and phlegm) controlled the body’s health. Excess or deficiency of any of the humors made you sick, so treating patients consisted of trying to put the humors back in balance. Bloodletting, used for centuries to treat everything from fevers to seizures, serves as an example of theory-based medicine in action.

Nowadays medical practice is supposedly (more) scientific. But actually, medical theory seems to have taken a backseat to the lessons-from-experience approach. Today’s catch phrase is “evidence-based medicine,” and that “evidence” typically takes the form of results from clinical trials, in which potential treatments are tested on large groups of people. It’s basically just a more systematic approach to Hippocrates’ advice that doctors base diagnosis, treatments and prognosis on experience with previous patients. But instead of doctors applying their own personal clinical experience, they rely on generalizing the results of large trials to their particular patients.

Call this approach the “Risk Generalization-Particularization” model of medical prediction, Jonathan Fuller and Luis Flores write in a paper to be published in Studies in History and Philosophy of Biological and Biomedical Sciences. (It’s OK to call it ‘Risk GP’ for short, they say.) “Risk GP,” they note, is “the model that many practitioners implicitly rely upon when making evidence-based decisions.”

Risk GP as a model for making medical judgments is the outgrowth of demands for evidence-based medicine, write Fuller, on the medicine faculty at the University of Toronto in Canada, and Flores, a philosopher at King’s College London in England. It “advocates applying the results of population studies over mechanistic reasoning … in diagnosis, prognosis and therapy.” Evidence-based medicine has set a new standard for clinical reasoning, Fuller and Flores declare; it “has become dominant in medical research and education, accepted by leading medical schools and all of the major medical journals.”

So it seems like a good idea to ask whether the “evidence” actually justifies this evidence-based approach. In fact, it doesn’t.

“There are serious problems with the Risk GP Model, especially with its assumptions, which are often difficult to warrant with evidence and will often fail in practice,” Fuller and Flores assert.

In their paper, they outline serious problems with both the generalization and particularization sides of the GP coin. If you treat a patient on the basis of data from clinical trials, you ought to be sure that the patient actually is a member of the population sampled to perform the trial. You’d also like to be sure that the sample of patients in the trial actually did fairly represent the whole population it was sampled from. In practice, these requirements are never fully met. Physicians simply assume that the study populations are “sufficiently similar” to the people being treated (the “target population”). But, as Fuller and Flores point out, that assumption is rarely questioned, and evidence supporting it is lacking.

“Our target might be a population that was ineligible for the trial, such as older patients or patients with other concurrent diseases,” they write. In fact, patients with other diseases or people taking multiple medications — those typically not allowed in trials — are often exactly the people seeking treatment.

“Given the demographics of patients in hospital and community practice, target populations often include these very patients that trials exclude,” Fuller and Flores write.

Generalizing from a trial’s results to the target population is therefore risky. But even if that generalization is fair, applying trial results to treating a particular patient may still be unjustified. A patient may belong to a defined target population but still not respond to a treatment the way the “average” patient in a trial did.

After all, trials report aggregate outcomes. Suppose a drug reduces the incidence of fatal heart attacks in a trial population by 20 percent. In other words, say, 100 people died in the group not getting the drug, while only 80 people died in the group receiving it. But that doesn’t mean the drug will reduce the risk to any given individual by 20 percent. Genetic differences among the patients in a trial may have determined who survived because of the drug. For a patient without the favorable gene, the drug might actually make things worse. No one knows.
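The gap between aggregate and individual effects is easy to see in a toy calculation. The sketch below follows the article’s numbers (100 vs. 80 deaths); the genetic split is purely hypothetical, invented only to show how one aggregate figure can hide opposite effects in subgroups.

```python
# Hypothetical illustration (not data from any real trial): an aggregate
# 20 percent relative risk reduction can mask opposite effects in subgroups.

def relative_risk_reduction(deaths_control, deaths_treated, n_control, n_treated):
    """RRR = 1 - (risk in treated group / risk in control group)."""
    risk_control = deaths_control / n_control
    risk_treated = deaths_treated / n_treated
    return 1 - risk_treated / risk_control

# The article's aggregate example: 100 vs. 80 deaths, assuming (for
# illustration) equal groups of 1,000 patients each.
print(round(relative_risk_reduction(100, 80, 1000, 1000), 3))  # 0.2

# Invented genetic split: carriers of a favorable gene benefit greatly,
# non-carriers are actually harmed, yet the totals (100 vs. 80 deaths)
# still yield the same "20 percent" aggregate result.
carriers = relative_risk_reduction(50, 20, 500, 500)       # 60% reduction
non_carriers = relative_risk_reduction(50, 60, 500, 500)   # negative: harm
print(round(carriers, 2), round(non_carriers, 2))
```

The point of the sketch: the same aggregate trial result is consistent with wildly different, even harmful, effects for particular patients.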

Fuller and Flores go into considerable quantitative detail to illustrate the flaws in the Risk GP approach to medical practice. Their underlying point is not that the Risk GP method is always wrong or never useful. It’s just that the assumptions that justify its use are seldom explicitly recognized and are rarely tested.

It’s worth asking, of course, whether such philosophical objections have truly serious implications in the real world. Maybe clinical research, while not resting on a rigorously logical foundation, is generally good enough for most practical purposes. Sadly, a survey of the research relevant to this issue suggests otherwise.

“Evidence from clinical studies … often fails to predict the clinical utility of drugs,” health researchers Huseyin Naci and John Ioannidis write in the current issue of Annual Review of Pharmacology and Toxicology.

Questionable evidence

In their review, Naci and Ioannidis find all sorts of turbulence in the medical evidence stream. Often clinical studies fall short of the rigorous methodology demanded by “gold standard” clinical trials, in which patients are assigned to treatment groups at random and nobody knows which group is which. And even randomized studies have “important limitations,” the researchers write.

Apart from methodological weaknesses, clinical research also suffers from biases stemming from regulatory and commercial considerations. Drug companies typically test new products against placebos rather than head-to-head against other drugs, so doctors don’t get good evidence about which drug is the better choice. And restrictions on what patients are admitted to trials (as Fuller and Flores noted) make the test groups very unlike the people doctors actually treat. “As a result, drugs are approved on the basis of studies of very narrow clinical populations but are subsequently used much more broadly in clinical practice,” Naci and Ioannidis point out.

Their assessment documents all sorts of other problems. Tests of drug effects often rely on short-term surrogate indicators (say, change in cholesterol level), for instance, rather than eventual meaningful outcomes (say, heart attacks). Surrogates often overstate a drug’s effectiveness and imply beneficial effects twice as often as studies recording actual clinical outcomes.

Medical evidence is also skewed by secrecy. Studies with good news about a drug are more likely to be published than bad news results, and bad news that does get reported may be delayed for years. Evidence synthesis, as in meta-analyses of multiple studies, therefore seldom really paints the whole picture. Studies suggest that meta-analyses exaggerate treatment effects, Naci and Ioannidis report.
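Why selective publication inflates pooled estimates can be shown with a quick simulation (a sketch under simplified assumptions, not a model of any real literature): if trials of a drug with no true effect are published only when they happen to come out positive, a naive average of the published results drifts upward.

```python
import random

random.seed(0)

def simulate_trial(true_effect=0.0, sd=1.0, n=50):
    """Observed mean effect of one hypothetical trial: noise around the truth."""
    return sum(random.gauss(true_effect, sd) for _ in range(n)) / n

# Simulate many trials of a drug with NO true effect.
effects = [simulate_trial() for _ in range(1000)]

# A naive pooling of every trial recovers roughly zero.
pooled_all = sum(effects) / len(effects)

# If only positive results get published, pooling the published
# literature overstates the effect.
published = [e for e in effects if e > 0]
pooled_published = sum(published) / len(published)

print(round(pooled_all, 3), round(pooled_published, 3))
# pooled_all is near 0; pooled_published is clearly positive.
```

Real meta-analyses are far more sophisticated than this averaging, but the bias works in the same direction: they can only synthesize the studies that were reported.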

All these factors, many the offspring of regulatory requirements and profit-making pressure, render the evidence base for modern medicine of questionable value. “Driven largely by commercial interests, many clinical studies generate more noise than meaningful evidence,” Naci and Ioannidis conclude.

Thus Fuller and Flores’ concern about the Risk GP approach’s assumptions is joined by real-world pressures that exacerbate evidence-based medicine’s shortcomings, suggesting the need for different, or at least additional, approaches. In some situations, a theory-based approach might work better.

Nobody advocates a return to the four humors and wholesale bloodletting. But Fuller and Flores do argue for more flexibility in choosing a basis for predicting treatment outcomes. A mechanistic understanding of how diseases operate on the biochemical level offers one alternative.

This approach is not exactly absent from medicine today. Much progress has been made over the last century or so in identifying the biochemical basis for many sorts of medical maladies. But apart from a few specific cases (such as some genetic links relevant to choosing breast cancer treatments) its advantages have not been widely realized.

Ideally, mechanistic-based methods based on medical theory and biochemical knowledge would improve decisions based solely on generalization and particularization. And, as Fuller and Flores note, other models for making predictions also exist, such as a doctor’s personal experience with patients or even one individual patient.

Many doctors do apply different approaches on a case-by-case basis. No doubt some particular doctors have superb intuition for when to rely on clinical trials and when to go with their gut. But modern medical philosophy seems to have strayed from the Hippocratic ideal of combining theory with experience.  

Medicine today would be better off if it became standard practice to assess multiple models for making predictions and “match the model to the circumstances,” Fuller and Flores say. “When it comes to medical prediction, many models are surely better than one.”

Source: ScienceNews



NHLBI ARDS Network studies

The NHLBI ARDS Network enrolled 5,527 patients across ten randomized controlled trials and one observational study.


Ketoconazole for ALI/ARDS (KARMA)

Mar 1996 – Feb 1998

The first clinical trial completed by the Network was a randomized, controlled trial of Ketoconazole versus placebo in patients with acute lung injury and ARDS. It enrolled 234 participants.

Lower Tidal Volume Trial (ARMA)

Mar 1996 – Jul 1999

The ARMA study was a randomized, controlled multi-center 2×2 factorial study consisting of a drug treatment (Ketoconazole vs. placebo) and a ventilation strategy (6ml/kg tidal volume vs. 12ml/kg tidal volume). It enrolled 861 participants.

Lisofylline for ALI/ARDS (LARMA)

Feb 1998 – Jun 1999

The LARMA study was a randomized, double-blind, placebo-controlled multi-center study in which each patient was randomized to lisofylline or placebo. It was designed to test whether administering lisofylline early after the onset of ALI or ARDS would reduce mortality and morbidity. It was also conducted as a 2×2 factorial with the later stages of the ARMA trial. It enrolled 236 participants.

Late Steroid Rescue Study (LaSRS)

Aug 1997 – Nov 2003

The late phase of ARDS is often characterized by excessive fibroproliferation leading to gas-exchange and compliance abnormalities. The objective of the LaSRS study was to determine whether administering corticosteroids, in the form of methylprednisolone sodium succinate, in severe late-phase ARDS would reduce this fibroproliferation, thereby reducing mortality and morbidity. It enrolled 180 participants.

Higher vs Lower PEEP (ALVEOLI)

Nov 1999 – Mar 2002

The ALVEOLI study was a prospective, randomized, controlled multi-center trial. The objective was to compare clinical outcomes of patients with acute lung injury (ALI) and acute respiratory distress syndrome (ARDS) treated with a higher end-expiratory lung volume/lower FiO2 versus a lower end-expiratory lung volume/higher FiO2 ventilation strategy. It enrolled 549 participants.

Fluid and Catheter Treatment Trial (FACTT)

Jun 2000 – Oct 2005

The FACTT study was a prospective, randomized, multi-center trial evaluating the use of a pulmonary artery catheter versus a less invasive alternative, the central venous catheter, for the management of patients with acute lung injury (ALI) or acute respiratory distress syndrome (ARDS). It was combined in a 2×2 factorial design with a second study contrasting a conservative and a liberal fluid management strategy in patients with ALI or ARDS. It enrolled 1000 participants.

Albuterol for the Treatment of ALI (ALTA)

Aug 2007 – Sep 2008

The ALTA study was a prospective, randomized trial of aerosolized albuterol vs. placebo, testing the safety and efficacy of aerosolized beta-2 adrenergic agonist therapy for improving clinical outcomes in patients with acute lung injury. It enrolled 282 participants.

Early vs. Delayed Enteral Nutrition (EDEN)

Nov 2006 – Mar 2011

The EDEN study was a prospective, randomized trial of initial trophic enteral feeding followed by advancement to full-calorie enteral feeding vs. early advancement to full-calorie enteral feeding. It enrolled 1,000 participants. The trial was originally run as a 2×2 factorial with the Omega trial; when the Omega arm was stopped for futility, the EDEN arm continued to completion.

Omega Nutrition Supplement Trial (Omega)

Nov 2006 – Apr 2009

A trial of omega-3 fatty acid, gamma-linolenic acid, and anti-oxidant supplementation vs. a comparator. It enrolled 272 participants. It was run as part of a 2×2 factorial trial with the EDEN study. The Omega arm was stopped for futility.

H1N1 Registry

Nov 2009 – Jun 2010

A registry created in collaboration with the CDC to track severe cases of H1N1. It enrolled 683 participants.

Rosuvastatin vs. Placebo (SAILS)

Mar 2010 – Sep 2013

The SAILS study (Statins for Acutely Injured Lungs from Sepsis) was a trial of rosuvastatin versus a placebo comparator for the treatment of patients with ALI or ARDS. It enrolled 745 participants.