Clinical trials, medical studies performed on people, are held to high standards. The researchers running them endeavour to include a diverse group of participants, both men and women. Trial participants are randomly assigned to the treatments being compared, and trial staff measuring the outcomes of the study are often kept unaware of which patients received which therapy. These precautions reduce the possibility of bias creeping into the results. Calculations done while the study is being designed tell the researchers the minimum number of participants needed—the sample size—to reduce the likelihood that the results could be due to chance alone.
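To illustrate what such a pre-study calculation involves, here is a minimal sketch in Python using the statsmodels library; the effect size, significance level and power values are generic illustrative assumptions, not figures from any study discussed in this article.

```python
# Minimal sketch of a pre-study sample size (power) calculation for a
# two-group comparison. All inputs are illustrative assumptions.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(
    effect_size=0.5,  # assumed moderate effect (Cohen's d)
    alpha=0.05,       # accepted 5% risk of a false-positive result
    power=0.8,        # 80% chance of detecting a real effect
)
print(f"Minimum participants needed per group: {n_per_group:.0f}")  # ~64
```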
But according to two new papers from researchers at the Ottawa Heart Institute, these same methods are rarely used in preclinical research: the laboratory work with animals that must be done to test new theories and treatments in advance of human trials. The result is that the findings of many published research studies cannot be confirmed through replication by other research teams—a fundamental standard for validating research. In addition, promising therapies may not perform as expected in women when the research progresses to clinical trials in humans, and treatments that work exclusively in women may never be discovered.
Benjamin Hibbert, MD, PhD, an interventional cardiologist and clinician-scientist at the Heart Institute and senior author of the two studies, has experience as both a laboratory and clinical researcher. “The [laboratory] scientists who do the bulk of this work are focused on the technical aspects of their studies, and they get no training in how to appropriately design a study,” he said. “I think it’s just not part of the culture to implement these design elements.”
“But these elements are the checks and balances of preclinical science, to make sure that we’re not just seeing what we want to see,” he added. “If we’re not including these very important backstops in our methodological design, it can lead to a lot of wasted resources in chasing what may be false conclusions and, in the worst cases, to premature clinical studies and even clinical harm.”
Sex Imbalance Not Improving
In 2014, the National Institutes of Health (NIH) in the United States released guidance that sex should be one of the biological variables scientists take into account in all future applications for preclinical research funding.
Using both male and female animals is important, explained Dr. Hibbert, because some experimental treatments work in females but not males, or vice versa. Of those that are effective in both sexes, some are processed differently in the body and therefore require different doses for females and males. If only male mice are used in preclinical research—as tends to be the default in most labs—these differences could cause problems down the road as the therapies move into human studies, he said.
To see if the NIH guidance moved the needle on sex bias in preclinical research, Dr. Hibbert, cardiology resident and lead author Daniel Ramirez, MD, and their colleagues looked at all preclinical cardiovascular studies published in journals of the American Heart Association between 2006 and 2016. Their study was published in Circulation.
Overall, the results were discouraging. The sex of the animals used was not reported at all in 20% of the studies published during the decade. Of those that did report sex, more than 70% used only male animals, and only about 15% used both males and females. Even among that minority, just 17% reported a sex-based analysis of their results, and this proportion did not improve following the release of the NIH guidance.
Dr. Hibbert thinks that many ingrained—and erroneous—beliefs drive the preference for male animals, including concerns about hormone fluctuations in females that are actually largely irrelevant to research and a misguided impulse to “protect” females. “But we’re doing the opposite, really, because we don’t learn about how the therapies work in female biology, and we’re not learning how the disease process is specific to the gender,” he said.
Study Design Often Lacking
In their second study, published in Circulation Research, Drs. Hibbert, Ramirez and colleagues looked at whether randomization, blinding and sample size estimation had improved in the same journals over the same decade. Randomization eliminates bias in how study participants are assigned to treatment groups. Blinding—keeping staff unaware of who got which treatment—prevents staff from behaving differently with participants based on the treatment they received. Sample size estimation, done before a study begins, determines how many subjects are needed for the results to be statistically meaningful.
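For readers unfamiliar with these design elements, the following is a minimal sketch of randomization and blinding in Python; the animal labels, group names and coding scheme are hypothetical illustrations, not the methods of any study discussed here.

```python
# Minimal sketch of randomization and blinding for a hypothetical
# two-arm animal study. Labels and counts are illustrative only.
import random

rng = random.Random(0)  # seeded so the example is reproducible
animals = [f"mouse_{i:02d}" for i in range(1, 21)]  # 20 hypothetical animals

# Randomization: shuffle the animals, then split them evenly
# between the two treatment arms.
rng.shuffle(animals)
half = len(animals) // 2
assignment = {a: "treatment" for a in animals[:half]}
assignment.update({a: "control" for a in animals[half:]})

# Blinding: relabel each animal with an opaque code. Staff measuring
# outcomes see only the codes; the code-to-treatment key is held by
# someone who takes no part in the measurements.
key = {f"subject_{i:03d}": (animal, assignment[animal])
       for i, animal in enumerate(sorted(assignment), start=1)}
assessor_codes = sorted(key)  # all the blinded staff ever see
print(assessor_codes[:3])     # ['subject_001', 'subject_002', 'subject_003']
```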
Again, the results were disappointing. Overall, the proportion of studies that reported randomization or blinding was low—under a third for each measure—and it did not change over the 10-year period. The proportion of studies that calculated the needed sample size before research began did increase, but remained below 7%.
These methodological flaws likely contribute to how few published studies can be successfully replicated by other laboratories, commented Dr. Hibbert. In one widely cited example from cancer research, outside scientists were later able to duplicate the results of only 6 of 53 influential studies.
“People aren’t doing this intentionally, but by not having these checks and balances in their studies, as soon as they get a result that fits with their hypothesis, they stop looking and they stop questioning,” he said.
How to Raise the Bar
The one notable exception in their second study may point to a potential solution, explained Dr. Hibbert. That exception was stroke research: over the decade, preclinical stroke studies increasingly reported randomization, blinding, sample size calculations and the use of both female and male animals.
About 90% of the stroke research that he and his colleagues examined was published in a single journal, Stroke. In 2011, that journal implemented a “Basic Science Checklist,” which required that all of these design elements be reported in any paper submitted for publication.
“I really think that the gatekeepers for this have to be the journals—that’s our reward system in academia. Whether or not your lab stays open depends on your next paper,” said Dr. Hibbert. “People respond to incentives. If scientists were evaluated and incentivized to make their science reproducible, and if at the publishing stage they had to show that they incorporated these design elements, they’re going to do it—these are smart, bright, motivated people,” he added.
“And if you elevate the quality of the preclinical science that’s being done, I think you’re going to see a dramatic increase in successful translation to the clinic,” he concluded.
Learn More
- See more coverage of this research in The Scientist