Searching for Truth?

When a person faces a new diagnosis of a debilitating disease like multiple sclerosis (MS), a natural and urgent question arises: "Why me?" A woman who was recently vaccinated against Hepatitis B may, given the temporal proximity of the two events, wonder whether the vaccine was the cause. This question, while personally profound, presents a classic scientific challenge: it forces us to confront the limits of our knowledge and the difficult task of establishing causation between a specific exposure and a complex disease. The case of the Hepatitis B vaccine and MS, which has been studied extensively, provides a powerful lens through which to examine the scientific tools we use—and their shortcomings.

To approach the woman's question systematically, we can turn to the Bradford Hill criteria, a set of nine guidelines used to evaluate evidence of causation in epidemiology. The first and most obvious criterion in this case is temporality: the cause must precede the effect. Since the vaccine was administered before the onset of MS symptoms, this criterion is met. It is, however, the only criterion that is unequivocally satisfied. Next, we consider strength of association and consistency. Is there a statistically significant link between the vaccine and MS in large populations? Have multiple independent studies, conducted by different researchers in different parts of the world, found the same association? Large epidemiological studies have generally found no strong or consistent association.

When we look at biological plausibility, we face a significant hurdle. There is no accepted biological mechanism by which the Hepatitis B vaccine triggers the autoimmune response characteristic of MS. The vaccine is designed to stimulate an immune response to a viral protein, not to attack the body's own nervous system. While a new, undiscovered mechanism cannot be ruled out, the absence of a known one makes it difficult to support a causal link based on our current understanding of biology. This is also why we need to invest in research scientists and the funding for their work: it is their efforts that fill such gaps in our biological understanding.
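Of these criteria, strength of association is the most straightforward to quantify. As a minimal sketch of what that calculation looks like, the Python snippet below computes a relative risk and its 95% confidence interval from a two-by-two cohort table; the counts are invented for illustration and are not drawn from any real study.

```python
import math

# Hypothetical 2x2 cohort table (illustrative counts, not real study data):
#                 MS cases   no MS
# vaccinated      a = 30     b = 99,970
# unvaccinated    c = 28     d = 99,972
a, b = 30, 99_970
c, d = 28, 99_972

# Relative risk: incidence among the vaccinated over incidence among the
# unvaccinated. RR = 1 means the exposure makes no difference.
risk_exposed = a / (a + b)
risk_unexposed = c / (c + d)
rr = risk_exposed / risk_unexposed

# 95% confidence interval via the usual Wald approximation on the log scale.
se_log_rr = math.sqrt(1/a - 1/(a + b) + 1/c - 1/(c + d))
ci_low = math.exp(math.log(rr) - 1.96 * se_log_rr)
ci_high = math.exp(math.log(rr) + 1.96 * se_log_rr)

print(f"RR = {rr:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
```

With these invented counts the interval spans 1.0, which is exactly the pattern of a weak, statistically unconvincing association; consistency then asks whether independent studies keep producing the same picture.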

Because the Bradford Hill criteria alone are insufficient in such a complex case, modern science relies on additional frameworks, especially for post-marketing surveillance. Regulatory bodies like the Centers for Disease Control and Prevention (CDC) and the Food and Drug Administration (FDA) operate systems such as the Vaccine Adverse Event Reporting System (VAERS), which collects reports of adverse events following vaccination and acts as an early-warning system. While these systems can surface rare events that might not appear in clinical trials, they are not designed to prove causation. A report to VAERS establishes only a temporal relationship; it does not account for confounding variables, pre-existing conditions, or the background rate of the disease in the population. The data from these systems are often incomplete and subject to reporting bias, since the woman or her doctor must actively choose to submit a report.
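To see why the background rate matters, consider the kind of observed-versus-expected check analysts run on top of raw report counts. The sketch below is a simplified illustration with assumed numbers (the incidence, cohort size, and report count are all hypothetical), not an actual VAERS analysis.

```python
import math

# Hypothetical surveillance check: did more MS onsets occur in the 90 days
# after vaccination than the background rate alone would predict?
background_rate = 5e-5      # assumed annual MS incidence per person (illustrative)
vaccinated = 2_000_000      # assumed number of people vaccinated
window_years = 90 / 365     # post-vaccination risk window
observed = 27               # assumed number of reported MS onsets in the window

expected = background_rate * vaccinated * window_years

# One-sided Poisson tail: probability of seeing at least `observed` cases
# if onsets occur at nothing more than the background rate.
p = 1.0 - sum(math.exp(-expected) * expected**k / math.factorial(k)
              for k in range(observed))

print(f"expected = {expected:.1f}, observed = {observed}, P(X >= obs) = {p:.2f}")
```

Even when such a check does flag an excess, it flags a signal worth investigating, not a proven cause; and if reporting is incomplete, the observed count itself is untrustworthy.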

Ultimately, the woman's question, and similar questions about causation, highlight a critical challenge for the future: the role of data and human bias in the age of artificial intelligence. To illustrate, if we were to ask an AI to list the top biological findings of 2024, it might readily provide a list of breakthroughs, such as the discovery of a neural circuit regulating the immune system or the completion of the fruit fly brain connectome. This ability is impressive, but it reinforces a crucial point: an AI is only as good as the data it is trained on and the questions it is asked. These findings are not the result of the AI itself; they are the culmination of years of human curiosity, rigorous research, and dedicated funding. If the data from post-marketing surveillance is incomplete or biased—for example, if only a small subset of people report their symptoms—the AI will learn from this flawed information and may fail to identify a true causal link or, conversely, find a link where none exists.
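A small simulation makes this failure mode concrete. In the sketch below, vaccination has no effect on MS risk by construction, but vaccinated cases are assumed to be far more likely to be reported (all probabilities are invented for illustration). A model trained only on the reports would confidently learn an association that does not exist.

```python
import random

random.seed(0)

# Ground truth of the toy population: vaccination does NOT change MS risk.
N = 200_000
MS_RISK = 1e-3                  # same for everyone, by construction

reports = {"vax": 0, "unvax": 0}
totals = {"vax": 0, "unvax": 0}

for _ in range(N):
    group = "vax" if random.random() < 0.5 else "unvax"
    totals[group] += 1
    has_ms = random.random() < MS_RISK
    # Biased reporting: a recent vaccination makes an MS diagnosis much more
    # likely to be reported (0.9 vs 0.2) -- the "why me?" effect.
    report_prob = 0.9 if group == "vax" else 0.2
    if has_ms and random.random() < report_prob:
        reports[group] += 1

apparent_rr = (reports["vax"] / totals["vax"]) / (reports["unvax"] / totals["unvax"])
print(f"apparent relative risk from reports alone: {apparent_rr:.1f}")
```

The printed relative risk lands around 4.5, the ratio of the two reporting probabilities, even though the true relative risk is exactly 1.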

This limitation is particularly relevant when considering whether an AI could, on its own, apply the Bradford Hill criteria to find causality in a vast electronic health record (EHR) database. An AI can certainly be trained to assist with the more quantitative criteria, such as temporality (ensuring the cause preceded the effect), strength of association (calculating the statistical link), and biological gradient (looking for a dose-response relationship). It falls short, however, on the qualitative criteria. An AI cannot, for instance, judge the biological plausibility of a proposed mechanism, nor can it account for unmeasured confounding variables that never appear in the database. Those judgments require human insight and the ability to synthesize findings from diverse sources: lab experiments, clinical observations, and the broader scientific literature. The most sophisticated algorithms will not matter if we do not first do the hard work of gathering good, unbiased data and asking the right questions. We must recognize that AI is a tool, not a replacement for rigorous scientific inquiry and the critical thinking needed to interpret findings and account for the inherent limits of our knowledge.
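To make that division of labor concrete, here is a sketch of two of the quantitative checks an AI could automate over EHR-style records: a temporality filter and a crude dose-response tally for the biological gradient. The record layout and all dates are hypothetical, not a real EHR schema.

```python
from datetime import date

# Each record: (doses, vaccination_date, ms_diagnosis_date or None).
# All values are invented for illustration.
records = [
    (1, date(2023, 3, 1), date(2024, 1, 10)),
    (2, date(2022, 6, 15), None),
    (3, date(2021, 9, 1), date(2021, 5, 2)),   # diagnosis precedes exposure
    (2, date(2023, 1, 20), date(2023, 11, 5)),
]

# Temporality: an exposure can only be a candidate cause if it precedes onset.
valid = [r for r in records if r[2] is None or r[1] < r[2]]
print(f"{len(records) - len(valid)} record(s) fail the temporality check")

# Biological gradient: does the diagnosis rate rise with the number of doses?
by_dose = {}
for doses, _vax_date, dx_date in valid:
    n, cases = by_dose.get(doses, (0, 0))
    by_dose[doses] = (n + 1, cases + (dx_date is not None))
for doses, (n, cases) in sorted(by_dose.items()):
    print(f"{doses} dose(s): {cases}/{n} diagnosed")
```

What no such script can do is decide whether a flat or rising curve is biologically coherent; that judgment still belongs to the human scientist.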