Imperial County, California, a poor, largely Hispanic agricultural region in the southeastern corner of the state, has been hit hard by Covid-19. By the end of January, according to the New York Times’s Covid-19 database, Imperial County had suffered 845 Covid deaths, or 4.7 per thousand inhabitants—a rate almost 80 percent higher than the U.S. average. The case fatality rate in Imperial County is 1.44 percent, the second-highest in California—and was significantly higher, 2.10 percent, at the end of October 2021 before the Omicron wave.
Two doctors in Imperial County, though—George Fareed and Brian Tyson, who run the All Valley Urgent Care network of medical centers—claim to have done far better with their Covid-19 patients. In fact, they claim near-perfect success: in a book that they published last January, they claim to have seen more than 7,000 patients and had only three deaths, all among patients who began treatment in later disease stages. A statistical analysis of part of their results by the statistician Mathew Crawford, included in their book, counts only seven hospitalizations and three deaths among 4,376 patients seen up through March 13, 2021—a reduction in hospitalization risk of well over 90 percent from the county average, even after (admittedly imperfect) statistical adjustments for differences in age between Fareed and Tyson’s patients and the general population.
According to prevailing medical views, Fareed and Tyson’s claimed results should be impossible. The doctors’ first protocol was based around hydroxychloroquine (HCQ), a repurposed anti-malarial drug, with other drugs such as ivermectin as more recent additions. Received opinion on the drugs is that ivermectin is at best unproven in treating Covid-19 (the Food and Drug Administration maintains an official webpage warning against using it as a treatment for the virus), and that HCQ has been actively disproved: early optimism from laboratory experiments and small clinical studies did not hold up in larger, more rigorous trials.
Such opinions have influenced not just news coverage but also the moderation policies of social media platforms, which have imposed ever-stricter rules against “misinformation” (meaning, in practice, contradicting American public health authorities). After Fareed and Tyson spoke by invitation at a meeting of the Imperial County Board of Supervisors, the Los Angeles Times ran an article noting that the Imperial County Medical Society “had urged supervisors to ‘not contribute to the dissemination of false or misleading information by legitimizing unproven treatments.’” The paper also quoted an executive at an Imperial County hospital, saying, “We need to stick with what we know is approved by the FDA for COVID-19 treatments. . . . Misinformation itself ought to be stopped.” In December, Twitter also suspended Tyson’s account for breaking its policies against Covid misinformation.
The dismissal of hydroxychloroquine as a possible Covid-19 treatment, however, was never based on solid science. The Los Angeles Times article reveals a fundamentally authoritarian worldview: medical claims are “unproven,” and dangerous for the public to discuss, until some official body endorses them—an approach that threatens public health and science alike.
Interest in hydroxychloroquine as a coronavirus treatment stretches back at least to 2005, when an in vitro study showed that chloroquine, a very similar compound, might protect against SARS infection. Based on laboratory studies and small clinical trials, medical authorities in China and South Korea recommended chloroquine as a Covid-19 treatment in February 2020.
Some doctors outside East Asia followed. Vladimir Zelenko, a doctor in a Hasidic community in New York, advocated a combination of HCQ, azithromycin (an antibiotic to guard against secondary infections), and a zinc supplement: HCQ increases the uptake of zinc ions into cells, a property that Zelenko surmised might provide antiviral effects. In an open letter in April 2020, Zelenko claimed to have treated about 1,450 patients, including 405 that he judged “high risk,” with only two deaths. Luigi Cavanna, a doctor in Piacenza, Italy, also claimed about the same time that thanks to an HCQ treatment protocol, none of his patients had died and only 5 percent were hospitalized—one-sixth the contemporaneous Italian hospitalization rate of over 30 percent. Many more systematic “observational” studies of HCQ—comparing patients in a hospital or elsewhere who received a drug (because of their own or a doctor’s choice) with those who did not—returned good results both as a treatment of Covid-19 cases (including one large study from the Henry Ford Health System in metropolitan Detroit) and for prevention of Covid-19 in individuals at high exposure risk. One especially striking example of the latter is a set of 11 “case-control” studies from India, where medical authorities recommended but did not mandate a weekly prophylactic dose of HCQ for medical workers. Most of these studies found that workers who took HCQ had reduced odds of testing positive for SARS-CoV-2 antibodies, with especially marked reductions for those who took six or more doses of the protocol.
Medical researchers tend to discount doctors’ reports and observational studies—which, granted, have many potential biases that can’t always be spotted or corrected. For instance, observational studies can underestimate the efficacy of a treatment that’s given more often to sicker patients—or overestimate it, if health-conscious patients are more likely to demand experimental treatments, or if doctors who give ineffective experimental drugs are also more likely to give effective experimental drugs (this latter point was a common and valid criticism of the Henry Ford study). So doctors generally consider randomized trials, which avoid these classes of bias, to be more reliable—though they have drawbacks, too, such as considerably greater expense and, therefore, typically smaller sample sizes.
And most analyses of randomized trials of HCQ, on the basis of which mainstream medical opinion decided that it doesn’t work for Covid-19, do draw negative conclusions. For instance, a February 2021 review by Cochrane, an organization that produces comprehensive reviews of randomized trials, concludes, “HCQ for people infected with COVID-19 has little or no effect on the risk of death and probably no effect on progression to mechanical ventilation.” Another meta-analysis, published in Nature Communications by Cathrine Axfors et al., estimates an 11 percent increase in risk of death on the basis of 26 randomized trials.
The results of both meta-analyses were essentially determined by two large, similar trials: the Solidarity trial run by the World Health Organization and the Recovery trial at the University of Oxford. These trials accounted together for over 97 percent of the statistical weight in Cochrane’s main analysis, and both claimed to rule out more than a tiny benefit of HCQ for hospitalized Covid-19 patients.
But neither trial disproves claims such as Fareed and Tyson’s. First and most importantly, both trials were on hospitalized patients and are not necessarily applicable to “outpatients” earlier in the disease course. Antiviral treatments work better earlier: for instance, oseltamivir (also known as Tamiflu), an antiviral influenza treatment, works well if started within two days of symptom onset, but not later. In Covid-19, viral load peaks soon after symptom onset, and viral replication has already ceased in most hospitalized patients, guaranteeing that antiviral treatments will have limited effect. One review in The Lancet concluded that dozens of studies consistently show viral load in Covid-19 peaking in the first week of symptoms and that “No study detected live virus beyond day 9 of illness.” Lethal symptoms of Covid-19 in hospitalized patients are usually secondary effects, such as blood clotting and a dysregulated, hyperinflammatory immune response called “cytokine storm,” not continued action of the virus itself.
The Recovery and Solidarity trials also both tested HCQ alone, even though the most widespread protocols combined it with other medications such as zinc, and they used bizarrely high doses. Most doctors who prescribed HCQ used low doses: for instance, Zelenko’s original treatment regimen totaled 2,000 milligrams of hydroxychloroquine sulfate (a compound that is about 78 percent HCQ by weight), given as two doses of 200 mg each per day for five days. That is quite similar to the regimen of 2,800 mg total recommended by a group of Chinese researchers based on an in vitro study.
Most of the trials reviewed by Cochrane used dosages about this size or, at most, about twice as large. But the Recovery and Solidarity trials used much larger doses: respectively, 10,000 mg and 9,600 mg of HCQ sulfate, according to Cochrane’s summary table, comprising a “loading dose” of 1,600 mg given within the first six hours followed by twice-daily doses of 400 mg for several days. HCQ at high doses can cause potentially dangerous heart rhythm disturbances, and the total first-day dose of 2,400 mg of HCQ sulfate (or roughly 1,860 mg of base HCQ) is already nearly half the dose of 4 grams that a report from 2017 called “potentially fatal in adults.” (Other sources give higher lethal doses: for instance, one paper estimates 2 to 3 grams as a potentially fatal dose of chloroquine and estimates that HCQ is one-quarter as toxic as chloroquine.) The Indian Council of Medical Research, India’s equivalent of the CDC, even wrote to the WHO warning that the Solidarity trial’s dose was needlessly and possibly dangerously high, approximately quadruple the dose commonly used in Indian hospitals.
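The dose comparisons above are simple arithmetic, and a short Python sketch makes them easy to check. It uses only figures quoted in the text. The one assumption, labeled in the code, is that the 2,400 mg single-day figure is the 1,600 mg loading dose plus two 400 mg doses. The variable names are illustrative.

```python
# Dose arithmetic from the figures quoted in the text (mg of HCQ sulfate).
HCQ_BASE_FRACTION = 0.78  # HCQ sulfate is "about 78 percent HCQ by weight"

# Zelenko's regimen: two 200 mg doses per day for five days.
zelenko_total = 2 * 200 * 5

# Total doses reported in Cochrane's summary table, as cited in the text.
recovery_total = 10_000
solidarity_total = 9_600

# Assumption: the single-day figure is the 1,600 mg loading dose
# plus two further 400 mg doses on the same day.
first_day_sulfate = 1_600 + 2 * 400
first_day_base = first_day_sulfate * HCQ_BASE_FRACTION

# Share of the 4 g dose that the 2017 report called potentially fatal.
share_of_4g = first_day_base / 4_000

print(f"Zelenko total: {zelenko_total} mg")
print(f"Recovery / Zelenko: {recovery_total / zelenko_total:.1f}x")
print(f"Solidarity / Zelenko: {solidarity_total / zelenko_total:.1f}x")
print(f"First-day dose: {first_day_sulfate} mg sulfate, about {first_day_base:.0f} mg base")
print(f"Share of 4 g: {share_of_4g:.0%}")
```

With the rounded 78 percent figure, the first-day base dose comes out near 1,870 mg, consistent with the rough 1,860 mg quoted above, and the trial totals are roughly five times Zelenko's.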
The Recovery writeup argues that since there was no excess mortality in the HCQ arm in the first few days of the trial, the high doses were unlikely to have been harmful. But this argument is hardly bulletproof: HCQ has a long half-life (an FDA data sheet gives an estimate of 22 days), and it is not out of the question that the studies’ high continued doses could have accumulated to dangerous effect. The Recovery trial’s appendix even mentions that the HCQ group had a rate of major cardiac arrhythmia 31 percent higher than the control group (8.2 percent versus 6.3 percent), though the authors dismiss this finding as not statistically significant.
The Recovery researchers may even have chosen HCQ dosage based on confusion with another medication. In a June 2020 interview with the magazine France Soir, Martin Landray, one of the directors of the Recovery trial, justified the dose as “in line with the sort of doses that you used for other diseases such as amoebic dysentery”—a disease for which HCQ has not been regularly used for decades. It’s plausible that the researchers confused hydroxychloroquine with hydroxyquinoline, which is indeed used for amoebic dysentery.
A more optimistic picture emerges from studies of earlier treatment with HCQ. Axfors et al. mention five trials with “outpatients” (that is, outside hospitals). Only two have more than a few dozen subjects: Mitjà et al., from a team in Catalonia; and Skipper et al., from researchers at the University of Minnesota. A third substantial outpatient trial has since been published: the TOGETHER trial, conducted in Brazil. Though all these trials report no effects for HCQ, the truth is a bit more complicated: all three found some positive effects for HCQ, including a reduction in hospitalizations, but were not large enough on their own to establish these results with certainty.
This point rests on a subtle distinction: a study that fails to prove that a treatment is effective has not necessarily proved that the treatment is ineffective. Medical researchers typically decide whether an experiment proves a treatment’s efficacy by reversing the question, and calculating the probability, called a p-value, that an ineffective treatment would generate equal or better results in the same experiment. A p-value of 5 percent or less is by convention designated “statistically significant” and regarded as at least preliminary proof of efficacy. A small study with a statistically insignificant result, though, has not necessarily proved a treatment to be ineffective: small experiments can turn up good results by chance, so they are simply not powerful enough to distinguish worthless treatments from many good ones.
By analogy, imagine flipping a coin 20 times to test if it’s more likely to come up heads than tails. Getting 18 or 19 heads out of 20 flips would be firm evidence that the coin was biased, but not, say, only 12 heads out of 20 flips: an unbiased coin would give at least 12 heads out of 20 flips about a quarter of the time. But a larger experiment with the same disproportion, such as 120 heads out of 200 flips, would be powerful evidence of bias—as would be, for that matter, several repeated experiments of 20 flips each that all yielded more heads than tails.
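The coin-flip figures can be checked with an exact binomial tail calculation. The sketch below uses only the Python standard library. The function name prob_at_least is an illustrative choice, and the calculation is one-sided, matching the question of whether the coin favors heads.

```python
from math import comb


def prob_at_least(heads: int, flips: int) -> float:
    """Probability that a fair coin gives at least `heads` heads in `flips` flips."""
    # Count the equally likely flip sequences with `heads` or more heads,
    # then divide by the total number of sequences, 2**flips.
    favorable = sum(comb(flips, k) for k in range(heads, flips + 1))
    return favorable / 2 ** flips


p_12_of_20 = prob_at_least(12, 20)      # same 60 percent share, small experiment
p_18_of_20 = prob_at_least(18, 20)      # lopsided result, small experiment
p_120_of_200 = prob_at_least(120, 200)  # same 60 percent share, large experiment

print(f"At least 12 heads in 20 flips:   {p_12_of_20:.4f}")
print(f"At least 18 heads in 20 flips:   {p_18_of_20:.6f}")
print(f"At least 120 heads in 200 flips: {p_120_of_200:.4f}")
```

The first probability is about 0.25, the "quarter of the time" mentioned above. The same 60 percent share of heads in 200 flips has a probability well under 1 percent for a fair coin.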
In any case, though all three RCTs of HCQ for outpatients mentioned above reported no statistically significant results, a closer look suggests potential substantial benefits for HCQ that each study was too weak to establish on its own. In Mitjà et al., the HCQ group had a lower rate of Covid-19-related hospitalizations: 5.9 percent (8 of 136) against 7.1 percent (11 of 157) in the control group, a reduction of 16 percent (the paper elsewhere claims a 25 percent reduction, likely an arithmetic error), and the HCQ group had somewhat faster cessation of symptoms. Skipper et al., similarly, found a moderate acceleration in the pace of symptom improvement among a group of American and Canadian participants treated with HCQ; the control group also had eight Covid-19-related hospitalizations out of 211, versus four of 212 in the similarly sized experimental group. The TOGETHER trial, finally, was stopped early for “futility,” even though the preliminary results showed a reduction of about one-quarter in hospitalizations in the HCQ group compared with placebo: 8 of 214 versus 11 of 227.
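The relative reductions in hospitalization can be recomputed from the counts quoted above. The Python sketch below is illustrative only: the dictionary layout and function name are assumptions, and it calculates simple point estimates, not confidence intervals or significance tests.

```python
# Hospitalization counts quoted in the text: (events, group size)
# for the HCQ group first, then the control group.
trials = {
    "Mitjà et al.": ((8, 136), (11, 157)),
    "Skipper et al.": ((4, 212), (8, 211)),
    "TOGETHER": ((8, 214), (11, 227)),
}


def relative_reduction(treated, control):
    """Return 1 minus the ratio of the treated risk to the control risk."""
    treated_events, treated_n = treated
    control_events, control_n = control
    treated_risk = treated_events / treated_n
    control_risk = control_events / control_n
    return 1 - treated_risk / control_risk


reductions = {
    name: relative_reduction(treated, control)
    for name, (treated, control) in trials.items()
}

for name, value in reductions.items():
    print(f"{name}: {value:.0%} fewer hospitalizations in the HCQ group")
```

On these counts the reductions come to roughly 16 percent for Mitjà et al., 50 percent for Skipper et al., and 23 percent for TOGETHER. With so few events in each arm, none of these estimates is precise on its own.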
All three trials, furthermore, used HCQ in isolation, without zinc or any other treatments. Skipper et al., in addition, compared HCQ against folate as a placebo, and according to weak but suggestive evidence, folate might help fight Covid-19. One paper from Iran published in March 2020 noted that a computer simulation predicted that folic acid could inhibit an enzyme, furin, vital to SARS-CoV-2 replication; and one blogger from the United Kingdom has noted that hospitalizations of pregnant women for Covid-19 in the U.K. almost all occurred during later stages of pregnancy, while folate supplementation in the first trimester is near-universal. (In personal communication, David Boulware, one of the study authors, explained that the study used folate as a placebo because folate and HCQ pills matched one another in appearance well, and other common placebos such as talc or lactose would have been too expensive or harmed study participants with lactose intolerance.)
Another trial run by a similar University of Minnesota team considered post-exposure prophylaxis, that is, medication to stop people exposed to SARS-CoV-2 from developing symptoms. The trial sent subjects either HCQ or a folate placebo by overnight mail if they came into contact with a Covid case, finding a 17 percent reduction (not statistically significant) in total cases. A commentary by one professor from Brazil points out a sign in the study data that HCQ might do even better under optimal conditions: excluding trial participants who began showing Covid symptoms before completing a full five-day course of medication gives a significantly higher efficacy, approximately 40 percent. (The results of another post-exposure prophylaxis study, unfortunately, don’t trend toward showing effectiveness of HCQ. That study, however, used a lower HCQ dose and a Vitamin C placebo that itself might have beneficial effects, though the study authors claim that the Vitamin C dose used was too low to be effective.)
Finally, two pre-exposure prophylaxis studies are worth a brief mention, each with several hundred health-care worker participants and dozens of Covid-19 cases in both treatment and placebo groups, one conducted mostly by the same University of Minnesota team and another conducted by Duke University. Both studies estimated that once- or twice-weekly consumption of HCQ reduced the incidence of Covid-19 cases by about one-quarter, though neither result was quite statistically significant on its own.
In short, the actual scientific evidence used to dismiss HCQ is far from an absolute proof that it doesn’t work. Many of the studies commonly cited to dismiss the drug are irrelevant, too weak to bear much weight, or actually suggest some benefits. The RCT evidence alone is not enough for an affirmative case that HCQ certainly works, but neither does it provide any grounds to declare a priori that the results achieved by doctors such as George Fareed and Brian Tyson constitute “misinformation,” or are entirely due to confounding factors. At the very least, their claims merit good-faith close examination, including more formal trials that try to replicate their results with their exact protocol.
But there is a broader point here: the brokenness of the criteria that political authorities and Internet platforms use to determine acceptable opinion. With a handful of largely politically motivated exceptions (the scientific backing for mask mandates, for instance, amounts to scarcely more than artificial laboratory studies and cherry-picked epidemiological comparisons, with scant if any support from randomized controlled trials), medical regulatory agencies consider RCTs the only acceptable source of evidence. RCTs are immune from certain classes of bias, but they can be poorly designed in other ways and are hardly infallible. Moreover, RCTs are expensive, labor-intensive, and typically beyond the reach of researchers without institutional backing, for often wholly artificial reasons, such as pettifogging ethical oversight requirements imposed by institutional review boards and a ban on human challenge trials that could allow conclusive randomized testing of disease treatments with drastically reduced expense and time.
As blogger Scott Alexander has pointed out, the phrase “no evidence,” frequently used to dismiss potential alternative Covid-19 treatments, is one of the most overused in science communication, applied both to assuredly false statements and to those that are likely true but simply lack sufficiently authoritative proof. Critical thinking about medicine or any topic requires weighing multiple sources against one another and distinguishing between degrees of certainty, not ruling out all sources of evidence but one and equating “unproven” with “false.” The approach to health information increasingly taken by public officials, reporters, and social media—under which any statement is “unproven” and must be assumed harmful, barring some definitive pronouncement by public health authorities to the contrary—is thus not only authoritarian but also damaging to public health and science as a whole.