Welcome to prevention science meets sci-fi.
AI is knocking at the door of prevention science, carrying predictions instead of theories and probabilities instead of certainties. It’s exciting, slightly unsettling, and entirely unavoidable: welcome to efficacy’s existential crisis. (ChatGPT 4.0)
Prevention science has established Standards of Evidence to evaluate intervention efficacy, effectiveness, and readiness for scale-up. These standards originated with Flay et al. (2005), who set criteria defining when an intervention can be considered “tested and efficacious” or “effective,” primarily focusing on rigorous randomized controlled trials (RCTs) and replication. A decade later, the Society for Prevention Research (SPR) updated these standards in their influential paper, “Standards of Evidence for Efficacy, Effectiveness, and Scale-up Research in Prevention Science: Next Generation” (Gottfredson et al. 2015). The updated standards expanded guidance to include replication, scale-up, theoretical grounding, comprehensive intervention descriptions, implementation quality, adaptation documentation, and outcome reporting. These standards underscore that preventive programs should demonstrate impact both under ideal, controlled conditions (efficacy) and in real-world settings (effectiveness), complete with clear theoretical rationale and fidelity monitoring.
Now, enter AI into this neat picture. Suddenly, the tidy standards start feeling like they’re missing something. Traditional efficacy standards assume explicit theories, RCTs, systematic replication, and orderly scale-up. AI, however, often relies on messy observational big data or continuously adapting algorithms—methods that don’t exactly play by the traditional RCT rulebook. Which raises the obvious question: how do AI-driven methods fit into the world of the SPR Standards?
Before venturing into sci-fi waters, it’s worth noting that the Standards of Evidence frame efficacy not simply as a little badge a program wears, but as a function of several moving parts: the program itself, how it’s implemented, the population receiving it, and even the timing and setting (an efficacy statement might read: program X produces outcome Y for population Z at time T in setting S). It’s tempting, and frankly easier, to just slap on the label “efficacious” or “not efficacious” as if it were a permanent tattoo, independent of the real-world details. But this is not the case.
SPR’s standards demand theory and clear causal mechanisms. They don’t just want to see whether something works; they also insist on understanding why it works. AI, in contrast, often hands us remarkably accurate predictions while keeping the secrets of how it got there (the notorious “black box” scenario). This leaves prevention scientists scratching their heads: should we trust an algorithm predicting substance-use initiation if it stubbornly refuses to explain itself? Ironically, if we say “yes,” we might need to rewrite prevention science’s entire history, one that proudly resisted programs that mysteriously “just worked.”
But there is more to the story. Traditional standards keep intervention and evaluation separated, like good neighbors with clear fences. But AI casually strolls in and knocks these fences down. Imagine interventions that aren’t just evaluated by AI but continuously reshaped and adapted by it in real-time. Soon, we could have preventive programs with no two identical implementations, possibly no fixed interventions at all. Heck, why stop there? Maybe we’ll even have programs without human participants! Welcome to prevention science meets sci-fi.
This suggests it might be time to update our Standards of Evidence: think of it as “Evidence 3.0.” Sure, rigorous evaluations, replication, and real-world effectiveness remain essential. But in this brave new AI-driven world, we also need fresh guidelines. New standards might include performance benchmarks and regular bias audits for predictive models, transparency mandates (open-source algorithms, anyone?), and perhaps greater acceptance of robust quasi-experimental or even simulation-based evidence. The good ol’ RCT might not reign supreme forever (cue dramatic gasp).
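To make the idea of a “regular bias audit” less abstract, here is a minimal sketch of what such a check could look like for a predictive model. Everything here is hypothetical: the data, the subgroup labels, the `max_gap` threshold, and the choice of false positive rate as the fairness metric are all illustrative assumptions, not a prescribed standard.

```python
# Hypothetical bias-audit sketch for a predictive prevention model.
# Compares false positive rates across subgroups and flags large gaps.
# All names, data, and thresholds are invented for illustration.

def false_positive_rate(y_true, y_pred):
    """FPR = FP / (FP + TN), computed among true negatives."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return fp / (fp + tn) if (fp + tn) else 0.0

def bias_audit(y_true, y_pred, groups, max_gap=0.1):
    """Return per-group FPRs, the largest gap, and whether the gap
    stays within the (hypothetical) tolerance max_gap."""
    rates = {}
    for g in set(groups):
        idx = [i for i, grp in enumerate(groups) if grp == g]
        rates[g] = false_positive_rate([y_true[i] for i in idx],
                                       [y_pred[i] for i in idx])
    gap = max(rates.values()) - min(rates.values())
    return rates, gap, gap <= max_gap

# Toy example: risk predictions for two subgroups, A and B
y_true = [0, 0, 1, 0, 0, 1, 0, 0]   # 1 = initiated substance use
y_pred = [1, 0, 1, 0, 1, 1, 1, 0]   # model's flagged-at-risk labels
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates, gap, passes = bias_audit(y_true, y_pred, groups)
```

In this toy run the model flags group B’s true negatives as at-risk more often than group A’s, so the audit would fail the (made-up) 0.1 tolerance. A real standard would of course specify which metrics, subgroups, and tolerances matter, and how often the audit must be repeated.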
In short, prevention science now finds itself in the exciting and delightfully chaotic territory of balancing classic rigor with AI’s transformative unpredictability, the kind of sci-fi movie where the hero (our programs) is a bit too unpredictable for its own good.