I mentioned Dr. John Ioannidis a number of times in Genesis and Genes, as well as in several posts. A reader has kindly referred me to an excellent article about Dr. Ioannidis that appeared in The Atlantic.[1] Below are some pertinent points from the article, interspersed with my comments.
David H. Freedman, who wrote the article in The Atlantic, notes that “Medical research is not especially plagued with wrongness. Other meta-research experts[2] have confirmed that similar issues distort research in all fields of science, from physics to economics (where the highly regarded economists J. Bradford DeLong and Kevin Lang once showed how a remarkably consistent paucity of strong evidence in published economics studies made it unlikely that any of them were right).”
Understanding the factors that can distort research is a crucial step in becoming an informed consumer of science. Below, we look at some issues that are raised in the Atlantic article, and suggest how they may be relevant to other fields of science.
***
John Ioannidis may be one of the most influential – and popular – scientists today. In 2005, he published a paper in PLoS [Public Library of Science] Medicine that remains the most downloaded in the journal’s history. He has published papers with 1,328 different co-authors at 538 institutions in 43 countries. In 2009 he received, by his estimate, invitations to speak at 1,000 conferences and institutions around the world. Ioannidis is one of the world’s foremost experts on the credibility of medical research. He and his team have shown, again and again, that much of what biomedical researchers conclude in peer-reviewed published studies – conclusions that doctors keep in mind when they prescribe antibiotics or blood-pressure medication, or when they advise us to consume more fibre or less meat, or when they recommend surgery for heart disease or back pain – is misleading, exaggerated, and often just wrong. Ioannidis charges that as much as 90 percent of the published medical information that doctors rely on is flawed.
In the PLoS Medicine paper, Ioannidis laid out a detailed mathematical proof that, assuming modest levels of researcher bias, typically imperfect research techniques, and the tendency to focus on exciting rather than plausible theories, medical researchers will come up with wrong findings most of the time. His model predicted, in different fields of medical research, rates of wrongness roughly corresponding to the observed rates at which findings were later convincingly refuted: 80 percent of non-randomized studies (by far the most common type) turn out to be wrong, as do 25 percent of supposedly gold-standard randomized trials, and as much as 10 percent of the platinum-standard large randomized trials. [Vioxx, Zelnorm, and Baycol were among the widely prescribed drugs found to be safe and effective in large randomized controlled trials before the drugs were yanked from the market as unsafe or not so effective, or both.] The article articulated Ioannidis’ conclusion that researchers were frequently manipulating data analyses, chasing career-advancing findings rather than good science, and using the peer-review process to suppress unpopular views. These are all phenomena that are well-known to informed consumers of science, but still invisible, to a significant extent, to the general public.
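The logic of such a model can be sketched as a simple Bayesian calculation. What follows is a minimal illustration in the spirit of the argument, not a reproduction of the paper: the function name and all the numerical inputs (prior odds, statistical power, level of bias) are assumptions chosen for demonstration. The idea is that the chance a "positive" finding is true depends on how plausible the hypothesis was beforehand, how well-powered the study is, and what fraction of would-be negative results get reported as positive through bias.

```python
def positive_predictive_value(prior_odds, alpha=0.05, power=0.8, bias=0.0):
    """Probability that a claimed positive finding is actually true.

    prior_odds: odds (true : false) that a tested hypothesis is correct
                before the study is run (illustrative input).
    alpha:      false-positive rate of the statistical test.
    power:      chance of detecting an effect that really exists.
    bias:       assumed fraction of would-be negative results that end up
                reported as positive anyway (hypothetical parameter).
    """
    beta = 1.0 - power
    # Chance of reporting "positive" when the hypothesis is true:
    # genuine detections plus biased reporting of missed effects.
    p_pos_given_true = power + bias * beta
    # Chance of reporting "positive" when the hypothesis is false:
    # statistical flukes plus biased reporting of correct negatives.
    p_pos_given_false = alpha + bias * (1.0 - alpha)
    true_positives = prior_odds * p_pos_given_true
    false_positives = p_pos_given_false
    return true_positives / (true_positives + false_positives)


# A plausible hypothesis tested in a well-powered, unbiased trial.
print(round(positive_predictive_value(prior_odds=1.0, power=0.8, bias=0.0), 3))

# A long-shot hypothesis tested in a small study with some bias.
print(round(positive_predictive_value(prior_odds=0.1, power=0.2, bias=0.3), 3))
```

Under these assumed inputs, the first scenario yields a finding that is true more than nine times out of ten, while the second yields one that is true less than one time in five. The numbers are illustrative only; the point is how quickly reliability falls as plausibility and power drop and bias rises.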
In a seminal paper that was published in the Journal of the American Medical Association, Ioannidis zoomed in on 49 of the most highly regarded research findings in medicine over the previous 13 years, as judged by the science community’s two standard measures: the papers had appeared in the journals most widely cited in research articles, and the 49 articles themselves were the most widely cited articles in these journals. These were articles that helped lead to the widespread popularity of treatments such as the use of hormone-replacement therapy for menopausal women, vitamin E to reduce the risk of heart disease, coronary stents to ward off heart attacks, and daily low-dose aspirin to control blood pressure and prevent heart attacks and strokes. Of the 49 articles, 45 claimed to have uncovered effective interventions. Thirty-four of these claims had been retested, and 14 of these, or 41 percent, had been convincingly shown to be wrong or significantly exaggerated. So a large fraction of the most acclaimed research in medicine is untrustworthy.
***
There are many reasons for the dismal record of medical research, and we shall consider only a few of them. Ioannidis suggests that the desperate quest for research grants has gone a long way toward weakening the reliability of medical research. Readers of Genesis and Genes will recall the passage from Seed:
Cash-for-science practices between the nutrition and drug companies and the academics that conduct their research may also be playing a role. A survey of published results on beverages earlier this year found that research sponsored by industry is much more likely to report favorable findings than papers with other sources of funding. Although not a direct indication of bias, findings like these feed suspicion that the cherry-picking of data, hindrance of negative results, or adjustment of research is surreptitiously corrupting accuracy. In his essay, Ioannidis wrote, “The greater the financial and other interest and prejudices in a scientific field, the less likely the research findings are to be true.”[3]
In The Atlantic article, Ioannidis is blunt about one important factor in this situation. “The studies were biased,” he says. “Sometimes they were overtly biased. Sometimes it was difficult to see the bias, but it was there.” Researchers headed into their studies wanting certain results – and, lo and behold, they were getting them. We think of the scientific process as being objective and rigorous, but in fact it’s easy to manipulate results, sometimes unintentionally or unconsciously. “At every step in the process, there is room to distort results, a way to make a stronger claim or to select what is going to be concluded,” says Ioannidis. “There is an intellectual conflict of interest that pressures researchers to find whatever it is that is most likely to get them funded.” The fact that financial conflicts of interest are a feature of contemporary science is familiar to readers of Genesis and Genes:
I randomly pulled out from my shelf an issue of Scientific American. It happened to be the September 23, 2004 issue. It carried this announcement, made by the Center for Science in the Public Interest: “Some scientists and consumer advocates have called for a re-evaluation of studies that led to lower cholesterol guidelines. Among other concerns: eight of nine authors of the recommendations had ties to firms that make cholesterol-lowering statin drugs.” This is a thoroughly typical news item in science magazines. This particular note was so ordinary that it warranted all of a tiny mention on page 17. Anyone who reads science publications will periodically come across such items.
Ioannidis says that perhaps only a minority of researchers were succumbing to this type of bias, but their distorted findings were having an outsize effect on published research. To get funding and tenured positions, and often merely to stay afloat, researchers have to get their work published in well-regarded journals, where rejection rates can climb above 90 percent. Not surprisingly, the studies that tend to make the grade are those with eye-catching findings. But while coming up with eye-catching theories is relatively easy, getting reality to bear them out is another matter. The great majority collapse under the weight of contradictory data when studied rigorously. Imagine, though, that five different research teams test an interesting theory that’s making the rounds, and four of the groups correctly prove the idea false, while the single less cautious group incorrectly “proves” it true through some combination of error, fluke, and clever selection of data. Guess whose findings your doctor ends up reading about in the journal?
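The five-team scenario can be given rough numbers. This is a minimal sketch under stated assumptions: the theory is false, each team tests it independently, and each has a 5 percent chance of a false positive (the conventional significance threshold; the article itself gives no figure). The function name is hypothetical.

```python
def chance_of_spurious_success(n_teams=5, alpha=0.05):
    """Probability that at least one of n_teams independent studies
    'confirms' a theory that is in fact false.

    alpha is the assumed per-study false-positive rate.
    """
    # Every team correctly finds nothing with probability (1 - alpha);
    # the complement is the chance that at least one gets a fluke.
    return 1.0 - (1.0 - alpha) ** n_teams


for teams in (1, 5, 20):
    print(teams, round(chance_of_spurious_success(teams), 3))
```

With five teams the chance of at least one spurious "confirmation" is roughly 23 percent, and it grows with every additional team that tests the idea. If journals favour the eye-catching positive result, that fluke is the one most likely to reach print.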
***
Another issue discussed by Ioannidis is the process of peer-review. The average member of the public (who is, needless to say, not an informed consumer of science) considers peer-review to be a magic pill. Peer-review is supposed to be an objective process, manned by referees who have no personal stake in the research they are reviewing, and who have all the time in the world to devote to carefully checking other people’s results. The real world, alas, is a little less rosy. Biased, erroneous, and even blatantly fraudulent studies easily slip through peer-review. In a 2006 editorial, Nature stated that “Scientists understand that peer review per se provides only a minimal assurance of quality, and that the public conception of peer review as a stamp of authentication is far from the truth.”
Furthermore, the peer-review process often pressures researchers to shy away from striking out in genuinely new directions, and instead to build on the findings of their colleagues – that is, their potential reviewers – in ways that only seem like breakthroughs. One example is the glut of hyped papers touting gene linkages (autism genes identified!) and nutritional findings (olive oil lowers blood pressure!) that are plain dubious.
***
Here is one example of a point made by Ioannidis in the context of medical research that is also applicable to palaeontology. Ioannidis says, “Even when the evidence shows that a particular research idea is wrong, if you have thousands of scientists who have invested their careers in it, they’ll continue to publish papers on it. It’s like an epidemic, in the sense that they’re infected with these wrong ideas, and they’re spreading it to other researchers through journals.”
This phenomenon will be familiar to readers of Genesis and Genes. In the section on the alleged evolution of dinosaurs to birds, I discussed the work of researchers like Professor John A. Ruben of Oregon State University, whose work casts heavy doubt on the reigning paradigm. I wrote:
The Science Daily report from which these quotations are taken continues: “The conclusions [of the Oregon State University researchers] add to other… evidence that may finally force many palaeontologists to reconsider their long-held belief that modern birds are the direct descendants of ancient, meat-eating dinosaurs…” Professor Ruben adds, “But old theories die hard, especially when it comes to some of the most distinctive and romanticized animal species in world history.” He continues, “Frankly, there’s a lot of museum politics involved in this, a lot of careers committed to a particular point of view even if new scientific evidence raises questions.”
Furthermore, Ioannidis found that even when a research error is publicised, it typically persists for years or even decades. He looked at three prominent health studies from the 1980s and 1990s that were each later soundly refuted, and discovered that researchers continued to cite the original results as correct more often than as flawed – in one case for at least 12 years after the results were discredited.
***
Early in his career, Ioannidis was disabused of the notion that mechanisms like randomized trials and double-blind studies were magic wands that ensure infallibility. In poring over medical journals, Ioannidis was struck by how many findings of all types were refuted by later findings. This is particularly visible in medical research. One month ago, TIME Magazine published an article entitled Spin Doctors.[4] The article states:
Mammograms help you live longer. Or wait; they may not… In the medical world, this kind of uncertainty is increasingly common… Enter the US Preventive Services Task Force (USPSTF), a panel of independent experts charged by Congress with sifting through all the studies about health procedures…
In a side-bar entitled Four Surprising Recommendations, TIME highlights four prominent turnabouts:
- What you may have heard: Taking estrogen and progestin after menopause can lower the risk of heart disease and bone fractures. What you may not have: The USPSTF says supplemental estrogen can increase the risk of breast cancer and does not protect against heart disease, as earlier studies suggested.
- What you may have heard: All men over age 50 should get regular blood tests for prostate cancer. What you may not have: Those blood tests, which detect many growths that are not cancerous, can lead to risky interventions. Plus, many prostate tumors are slow-growing and don’t need to be removed, even if they are cancerous.
- What you may have heard: Women should start annual screening for breast cancer at age 40. What you may not have: Women in their 40s have lower cancer rates than older women and higher rates of false positives that lead to additional tests and procedures that may come with complications.
- What you may have heard: Vitamin D and calcium can strengthen bones and lower the risk of fractures in postmenopausal women. What you may not have: They may slow bone loss, but recommended doses may not be high enough to lower the risk of fractures. And too much calcium can increase the risk of heart disease.
The article in The Atlantic makes much the same point: mammograms, colonoscopies, and PSA tests are far less useful cancer-detection tools than we had been told; widely prescribed antidepressants such as Prozac, Zoloft, and Paxil have been revealed to be no more effective than a placebo for most cases of depression; staying out of the sun entirely can actually increase cancer risks; taking fish oil, exercising, and doing puzzles doesn’t really help fend off Alzheimer’s disease; and peer-reviewed studies have come to opposite conclusions on whether taking aspirin every day is more likely to save your life or cut it short, and whether routine angioplasty works better than pills to unclog heart arteries.
One important reason for this see-sawing is that most studies involve a relatively small number of participants and run for a relatively short time, perhaps five years. The reason for this is straightforward – it’s expensive and cumbersome to run experiments for thirty or forty years. But the price paid for these short-term savings is that the results of clinical trials are more often than not incorrect. Let’s see why.
Randomized controlled trials constitute the gold standard in medical research. These studies compare how one group responds to a treatment against how an identical group fares without the treatment. Various checks and balances are used to try to shield the researchers from bias, and, consequently, these trials had long been considered nearly unshakable evidence. But these trials, too, are sometimes wrong. “I realized even our gold-standard research had a lot of problems,” Ioannidis says. Before long he discovered that the range of errors being committed was astonishing: from what questions researchers posed, to how they set up the studies, to which patients they recruited for the studies, to which measurements they took, to how they analyzed the data, to how they presented their results, to how particular studies came to be published in medical journals.
In a typical nutrition or drug study, researchers follow a few thousand people for a number of years, tracking what they eat and what supplements they take, and how their health changes over the course of the study. Then they ask, ‘What did vitamin E do? What did vitamin C or D or A do? What changed with calorie intake, or protein or fat intake? What happened to cholesterol levels? Who got what type of cancer?’
After this, complex statistical models are used to find all sorts of correlations between, say, Vitamin X and cancer Y. When a five-year study of 10,000 people finds that those who take more vitamin X are less likely to get cancer Y, you’d think you have good reason to take more vitamin X, and physicians routinely pass these recommendations on to patients. But these studies often sharply conflict with one another. Studies have gone back and forth on the cancer-preventing powers of vitamins A, D, and E; on the heart-health benefits of eating fat and carbohydrates; and even on the question of whether being overweight is more likely to extend or shorten your life. Ioannidis suggests a simple approach to these studies: ignore them all.
For starters, he explains, the odds are that in any large database of many nutritional and health factors, there will be a few apparent connections that are in fact merely flukes, not real health effects. But even if a study managed to highlight a genuine health connection to some nutrient, a given individual is unlikely to benefit much from taking more of it, because we consume thousands of nutrients that act in concert, and changing the intake of any one nutrient is bound to cause ripples throughout the network that are far too complex for these studies to detect, and that may be as likely to harm you as help you [this is why I explained in Genesis and Genes that science is strongest when it deals with observable, repeatable and limited phenomena.] Even if changing that one factor does bring on the claimed improvement, there’s still a good chance that it won’t do you much good in the long run, because these studies rarely go on long enough to track the decades-long course of disease and ultimately death. Instead, they track easily measurable health ‘markers’ such as cholesterol levels, blood pressure, and blood-sugar levels, and meta-experts have shown that changes in these markers often don’t correlate as well with long-term health as we have been led to believe.
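The first point, that a large database will throw up apparent connections that are merely flukes, can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of any real study: 1,000 hypothetical participants, 200 hypothetical dietary factors, and a health outcome that is deliberately generated to be unrelated to every one of them. Each factor is then tested against the outcome with a standard two-proportion z-test at the conventional 5 percent level.

```python
import math
import random


def count_fluke_associations(n_people=1000, n_factors=200,
                             outcome_rate=0.1, seed=0):
    """Count 'statistically significant' factor-outcome links in data
    where, by construction, no factor has any real effect."""
    rng = random.Random(seed)
    # The outcome (say, a diagnosis) is pure chance for every person.
    outcome = [rng.random() < outcome_rate for _ in range(n_people)]
    flukes = 0
    for _ in range(n_factors):
        # Each person either does or does not consume this factor,
        # independently of the outcome.
        exposed = [rng.random() < 0.5 for _ in range(n_people)]
        n1 = sum(exposed)
        n0 = n_people - n1
        if n1 == 0 or n0 == 0:
            continue  # cannot compare groups if one is empty
        x1 = sum(1 for e, o in zip(exposed, outcome) if e and o)
        x0 = sum(1 for e, o in zip(exposed, outcome) if not e and o)
        pooled = (x1 + x0) / n_people
        se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n0))
        if se == 0:
            continue  # no variation in the outcome at all
        z = (x1 / n1 - x0 / n0) / se
        if abs(z) > 1.96:  # "significant" at the 5 percent level
            flukes += 1
    return flukes


print(count_fluke_associations())
```

With 200 tests at the 5 percent level, about ten "significant" links are expected on average even though none is real. The exact count depends on the random seed, but it will rarely be zero.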
On the relatively rare occasions when a study does go on long enough to track mortality, the findings frequently upend those of the shorter studies. (For example, though the vast majority of studies of overweight individuals link excess weight to ill health, the longest of them have not convincingly shown that overweight people are likely to die sooner, and a few of them have seemingly demonstrated that moderately overweight people are likely to live longer.) Now add to all of the above the ubiquitous measurement errors (for example, people habitually misreport their diets in studies) and routine misanalysis (researchers rely on complex software capable of juggling results in ways they do not always understand).
If a study somehow avoids every one of these pitfalls and finds a real connection to long-term changes in health, you’re still not guaranteed to benefit, because studies report average results that typically represent a vast range of individual outcomes. Should you be among the lucky minority that stands to benefit, don’t expect a noticeable improvement in your health, because studies usually detect only modest effects that merely tend to whittle your chances of succumbing to a particular disease from small to somewhat smaller. “The odds that anything useful will survive from any of these studies are poor,” says Ioannidis – dismissing in a breath a good chunk of the research into which $100 billion a year in the United States is sunk.
I have pointed out before (see the post Blowing Hot and Cold, for example) that the problem of tackling research that is diffuse – the opposite of limited – is by no means restricted to medical research. Take the climate. It is affected by many dozens, perhaps hundreds, of factors. In the context of human health, we know that there can be a huge difference between what is detected over a 5-year study and what ultimately transpires when subjects die fifty years later. In climate studies, too, there may be enormous differences between what is measured over a few decades and what happens over millennia.
Furthermore, as we saw above, most medical studies do not actually track the individual’s health as a whole; rather, they measure ‘markers’ which are taken as proxies for overall health. The assumption that markers are good proxies for overall health is, at best, dubious. In climate science too, it is often ‘markers’ that are used to indicate the overall ‘health’ of the climate, and this may well lead to erroneous conclusions. Consider glaciers.[5]
In 1895, geologists thought the world was freezing up due to the ‘great masses of ice’ that were frequently seen farther south than before. The New York Times reported that icebergs were so bad, and that they decreased the temperature of Iceland so much, that inhabitants fearing a famine were ‘emigrating to North America.’ But in 1902 the Los Angeles Times, in a story on disappearing glaciers in the Alps said the glaciers were not ‘running away,’ but rather ‘deteriorating slowly, with a persistency that means their final annihilation.’ The melting led to alpine hotel owners having trouble keeping patrons. It was established that it was a ‘scientific fact’ that the glaciers were ‘surely disappearing.’ But the glaciers instead grew once more.
The Boston Daily Globe reported in 1923 that the purpose of an Arctic expedition it was covering was to determine the beginning of the next ice age, ‘as the advance of glaciers in the last 70 years would indicate.’ When that era of ice-age reports melted away, retreating glaciers were again highlighted. In 1953’s Today’s Revolution in Weather, William Baxter wrote that ‘the recession of glaciers over the whole earth affords the best proof that climate is warming’. He gave examples of glaciers melting in Lapland, the Alps, and Antarctica. In 1952, the New York Times reported on the global warming studies of climatologist Dr. Hans W. Ahlmann, whose ‘trump card’ ‘has been the melting glaciers.’ The next year the paper said that ‘nearly all the great ice sheets are in retreat.’ U.S. News and World Report agreed, noting on January 8, 1954 that ‘winters are getting milder, summers drier. Glaciers are receding, deserts growing.’
But in the 1970s, glaciers did an about face. Lowell Ponte, in his 1976 book The Cooling, warned that ‘The rapid advance of some glaciers has threatened human settlements in Alaska, Iceland, Canada, China, and the Soviet Union.’
In 1951, TIME magazine noted that permafrost in Russia was receding northward at up to 100 yards per year. But in a June 24, 1974, article, TIME stated that the cooling trend was here to stay. The report was based on ‘telltale signs’ such as the ‘unexpected persistence and thickness of pack ice in the waters around Iceland.’ The Christian Science Monitor in the same year noted ‘glaciers which had been retreating until 1940 have begun to advance.’ The article continued, ‘the North Atlantic is cooling down about as fast as an ocean can cool.’ And the New York Times noted that in 1972 the ‘mantle of polar ice increased by 12 percent’ and had not returned to ‘normal’ size. North Atlantic sea temperatures declined, and shipping routes were ‘cluttered with abnormal amounts of ice.’ Furthermore, the permafrost in Russia and Canada was advancing southward, according to the December 29 article that closed out 1974.
Two points are crucial. First, markers for ultra-complex entities such as human health or the climate may or may not be useful indicators of overall health. Secondly, studies of ‘markers’ – whether of human health or the climate – may require a lifetime (in the case of humans) or several centuries (in the case of global climate) to teach us anything significant. Shorter studies may well be misleading, as is certainly the case in many clinical studies.
***
In a nutshell, becoming an informed consumer of science involves the realization that science is a human endeavour. It is subject to a galaxy of factors beyond the nuts and bolts of the laboratory work, from political considerations that determine how much funding is funneled to particular fields to the interpretation of complex statistical analyses of murky results. As the physicist and philosopher John Polkinghorne has written,
Many people have in their minds a picture of how science proceeds which is altogether too simple. This misleading caricature portrays scientific discovery as resulting from the confrontation of clear and inescapable theoretical predictions by the results of unambiguous and decisive experiments… In actual fact… the reality is more complex and more interesting than that.
To its credit, the medical community seems to have embraced the work done by Ioannidis and its implications. The Atlantic reports that:
Ioannidis initially thought the community might come out fighting. Instead, it seemed relieved, as if it had been guiltily waiting for someone to blow the whistle, and eager to hear more. David Gorski, a surgeon and researcher at Detroit’s Barbara Ann Karmanos Cancer Institute, noted in his prominent medical blog that when he presented Ioannidis’ paper on highly cited research at a professional meeting, “not a single one of my surgical colleagues was the least bit surprised or disturbed by its findings.”
But Ioannidis is pessimistic about anything changing soon:
His bigger worry, he says, is that while his fellow researchers seem to be getting the message, he hasn’t necessarily forced anyone to do a better job. He fears he won’t in the end have done much to improve anyone’s health. “There may not be fierce objections to what I’m saying,” he explains. “But it’s difficult to change the way that everyday doctors, patients, and healthy people think and behave.”
***
Dr. John Ioannidis’ work deals with medical research, which is – at least theoretically – readily amenable to the tools of science. Even here, it is obvious that science consumers should ration out credibility carefully. The fact that you read about evidence-based medicine or peer-reviewed studies or randomized trials is by no means a guarantee that you’ve been touched by Truth. And this is all in the realm of the here and now. Contemporary science is vastly overrated when it deals with issues that go beyond those of medical research: issues that involve huge extrapolations, chains of reasoning and assumptions, and numerous ideological commitments.
***
See also: the post Dr. Ben Goldacre and the Reproducibility of Research:
https://torahexplorer.com/2013/04/10/dr-ben-goldacre-and-the-reproducibility-of-research/
The post Blowing Hot and Cold:
https://torahexplorer.com/2013/03/11/blowing-hot-and-cold-2/
References:
[1] See http://www.theatlantic.com/magazine/archive/2010/11/lies-damned-lies-and-medical-science/308269/. Retrieved 5th May 2013.
[2] Meta-research involves the analysis – often with advanced statistical tools – of a large number of primary studies performed by other researchers.
[3] See http://seedmagazine.com/content/article/dirty_little_secret/. Retrieved 5th June 2011.
[4] See http://www.time.com/time/magazine/article/0,9171,2139710,00.html. Retrieved 4th May 2013.
[5] The information on the media coverage of glaciers comes from a report by the Media Research Center entitled Fire and Ice. See http://www.mrc.org/special-reports/fire-and-ice. Retrieved 5th May 2013.