Integrating AI Endpoint Technology in Early-Phase Trials Without Triggering Regulatory Scrutiny

About This Engagement

AI-assisted endpoint measurement is becoming more common in early-phase trials. Sponsors want cleaner measurements with less variability. Regulators want prospective validation before they trust a novel measurement tool. Those two timelines rarely line up.

This case study shows how EvoClinical helped a clinical-stage biotech navigate that gap. The sponsor had access to a promising AI measurement tool from a technology partner. The evidence base for that tool was still maturing. Using it in a high-risk role, such as a primary endpoint, would have created a credibility problem with regulators. Ignoring it entirely would have wasted a real opportunity to reduce endpoint noise.

The solution was not a compromise. It was a strategic plan that defined exactly how the technology could be used now, and what needed to happen before it could take on a larger role.

The Challenge

The sponsor was planning an early-phase Proof-of-Concept (PoC) study in an indication with a well-documented problem: noisy endpoints and unpredictable placebo-arm responses. Clean readouts are already difficult to achieve in this space. Adding an unvalidated measurement tool to that environment would have compounded the risk.

The sponsor had access to an AI-assisted endpoint measurement model developed by a technology partner. The model showed real promise for improving measurement precision. But the validation evidence behind it was earlier-stage than regulators would require to accept it as a primary endpoint or eligibility criterion.

The sponsor had no defined plan for how to use the tool within the trial. Without one, they were looking at two outcomes that were equally problematic.

Concerns

Two structural risks sat at the center of this engagement. Either outcome would have cost the program time it did not have to lose.

Regulatory overreach

Positioning an AI-derived measure as a primary endpoint, or as an eligibility criterion, before it had been prospectively validated would have raised the evidence bar for the entire program. Regulators would have required justification the sponsor could not yet provide.

Inconclusive Proof of Concept

Without simulation-backed interim rules for futility and dose selection, the trial risked producing a readout in the gray zone: not clearly positive, not clearly negative, and subject to extended internal debate before any next steps could be agreed upon.

Identifying the Issues

EvoClinical was engaged for strategic statistical consulting to close the gap between the technology and the clinical program. A review of the development plan, the model documentation, and the regulatory context surfaced four core structural gaps.

Translation Gap

There was no defined playbook for how AI measurements should inform clinical phase decisions. The technology existed. The intent to use it existed. The bridge between the two did not.

Maturity Mismatch

The AI model’s validation evidence was at an earlier stage than what would be required to position it as an AI biomarker. Assigning it a primary endpoint role carried real regulatory risk with no near-term path to resolution.

Decision-Framework Gap

The existing PoC design had no simulation-backed interim thresholds for futility or dose selection. That left the program exposed to drawn-out internal debate at the interim timepoint, with no pre-agreed criteria to anchor the discussion.

Evidence-Building Gap

The sponsor had no mechanism to generate their own validation data for the AI model during the study. Any future application of the tool in later phases would remain entirely dependent on the vendor’s data or third-party sources, neither of which the sponsor controlled.

The Solution

EvoClinical acted as the strategic translator between the technology and the clinical program. To maximize program-wide efficiency, we moved away from a fixed trial design in favor of a modular suite of four complementary options. Each option was documented with its regulatory risks, operational requirements, and expected return. The sponsor could adopt options independently or in combination for compounded benefits, based on their stage in the evidence-building process.

Option 1: The PoC Decision Engine (Recommended for Immediate Deployment)

What it does: A Phase 1b/2a adaptive design with pre-specified interim rules for futility and dose selection. The AI model is used as a covariate in the primary analysis only. The primary endpoint remains conventional and consistent with current regulatory guidance. 

When it activates: Immediately. This option requires no prior AI validation and is not dependent on evidence from any other option. 

What it gives the sponsor: A clean, defensible PoC with pre-agreed decision criteria. The interim readout produces one of three outcomes: advance, stop, or modify. There is no gray zone to debate.
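To make the covariate role concrete, here is a minimal sketch using simulated data and hypothetical effect sizes (none of these numbers come from the engagement). It uses plain least squares in place of a production primary analysis to show why adjusting for a baseline AI-derived measurement can shrink the standard error of the treatment-effect estimate while the conventional endpoint stays primary.

```python
import numpy as np

def ols_se(X, y):
    """Fit OLS and return (coefficients, standard errors)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta, np.sqrt(np.diag(cov))

rng = np.random.default_rng(0)
n = 200
trt = rng.integers(0, 2, n).astype(float)       # 1 = active, 0 = placebo
ai = rng.normal(size=n)                         # baseline AI-derived measure
# Conventional endpoint: assumed drug effect 0.5; the AI measure explains
# part of the outcome variability (coefficient 0.8, both hypothetical)
y = 0.5 * trt + 0.8 * ai + rng.normal(size=n)

X_unadj = np.column_stack([np.ones(n), trt])        # treatment only
X_adj = np.column_stack([np.ones(n), trt, ai])      # treatment + AI covariate
_, se_unadj = ols_se(X_unadj, y)
beta_adj, se_adj = ols_se(X_adj, y)
# Covariate adjustment shrinks the SE of the treatment-effect estimate
```

The endpoint definition itself is untouched; the AI measurement only soaks up residual noise, which is what keeps this use low-risk from a regulatory standpoint.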

Option 2: Placebo Right-Sizing

What it does: A Bayesian borrowing framework using robust-mixture priors to reduce placebo-arm sample size requirements in later-phase studies.

When it activates: After Option 1. Strict evidence criteria from the PoC must be met before this framework can be applied. It does not activate at study entry.

What it gives the sponsor: Meaningful sample size savings in later phases, grounded in data the sponsor already owns rather than external assumptions.
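The borrowing mechanism can be sketched with a two-component robust-mixture prior: an informative component summarizing (hypothetical) PoC placebo data, plus a vague component that automatically limits borrowing when the new data conflict with the historical data. All numeric values below are illustrative assumptions.

```python
import numpy as np

def robust_mixture_posterior(y_mean, y_se, components):
    """Posterior for a normal mean under a normal mixture prior.

    components: list of (weight, prior_mean, prior_sd).
    Returns updated (weight, post_mean, post_sd) per component.
    """
    out, log_w = [], []
    for w, mu, sd in components:
        # Marginal likelihood of the observed mean under this component
        m_var = sd ** 2 + y_se ** 2
        log_w.append(np.log(w) - 0.5 * (np.log(2 * np.pi * m_var)
                                        + (y_mean - mu) ** 2 / m_var))
        # Conjugate normal-normal update within the component
        p_var = 1.0 / (1.0 / sd ** 2 + 1.0 / y_se ** 2)
        p_mu = p_var * (mu / sd ** 2 + y_mean / y_se ** 2)
        out.append((p_mu, np.sqrt(p_var)))
    w = np.exp(np.array(log_w) - max(log_w))
    w /= w.sum()
    return [(wi, mu, sd) for wi, (mu, sd) in zip(w, out)]

# Hypothetical prior: 80% weight on an informative component from PoC
# placebo data, 20% on a vague robustifying component
prior = [(0.8, 0.0, 0.3), (0.2, 0.0, 10.0)]
posterior = robust_mixture_posterior(y_mean=0.1, y_se=0.2, components=prior)
```

When the new placebo data agree with the PoC data, the informative component's posterior weight rises and the effective sample size contribution is large; under conflict, weight shifts to the vague component and borrowing shrinks. That self-limiting behavior is what makes the approach defensible.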

Option 3: Precision Strategy

What it does: Stratified randomization and hierarchical subgroup modeling to support an evidence-gated path toward patient enrichment designs.

When it activates: Only if prospective data from earlier phases supports it. This is not a default next step and is not triggered by timeline alone.

What it gives the sponsor: A structured path to enrichment if the data justifies it, without committing to that path before the evidence exists.
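Stratified randomization of the kind described above is commonly implemented as permuted blocks within strata. The sketch below is a simplified 1:1 illustration with made-up subject IDs and a hypothetical stratification factor (an AI-measure tertile recorded at screening), not the engagement's actual scheme.

```python
import random

def stratified_block_randomize(subjects, stratum_of, block_size=4, seed=11):
    """Permuted-block 1:1 randomization within strata (illustrative sketch)."""
    rng = random.Random(seed)
    open_blocks = {}   # stratum -> remaining assignments in the current block
    allocation = {}
    for subject_id in subjects:
        stratum = stratum_of(subject_id)
        if not open_blocks.get(stratum):
            # Start a fresh balanced block and shuffle its order
            block = (["active"] * (block_size // 2)
                     + ["placebo"] * (block_size // 2))
            rng.shuffle(block)
            open_blocks[stratum] = block
        allocation[subject_id] = open_blocks[stratum].pop()
    return allocation

# Hypothetical strata: AI-measure tertile assigned at screening
strata = {f"S{i:03d}": ("low", "mid", "high")[i % 3] for i in range(24)}
arms = stratified_block_randomize(list(strata), strata.get)
```

Because each completed block is balanced within its stratum, treatment arms stay comparable on the stratification factor, which is the precondition for the hierarchical subgroup modeling to be informative later.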

Option 4: Prospective Validation Add-On

What it does: An embedded validation dataset within the study, using a locked AI model version, predefined analyses, and site generalizability checks.

When it activates: Designed to run concurrently with Option 1. It does not add a separate study phase or delay the primary timeline.

What it gives the sponsor: Sponsor-owned validation evidence for the AI model, generated during the trial rather than sourced entirely from the vendor or third parties. This directly addresses the Evidence-Building Gap identified at intake.
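One practical piece of the version-locking requirement can be sketched as a checksum of the model artifact recorded alongside audit metadata before the study starts. The file name, metadata fields, and values below are hypothetical stand-ins, not the partner's actual model.

```python
import hashlib
import json

def lock_model_version(weights_path, metadata):
    """Fingerprint a model artifact so the locked version is auditable (sketch)."""
    digest = hashlib.sha256()
    with open(weights_path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    record = {"weights_sha256": digest.hexdigest(), **metadata}
    return json.dumps(record, sort_keys=True)

# Hypothetical locked artifact and metadata
with open("model_v1.bin", "wb") as f:
    f.write(b"\x00" * 1024)  # stand-in for real model weights
lock_record = lock_model_version(
    "model_v1.bin", {"model": "endpoint-ai", "version": "1.0"}
)
```

Re-hashing the deployed artifact at any later point and comparing against the locked record gives sites and auditors a cheap, unambiguous check that the validated model version never drifted mid-study.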

The sponsor adopted Option 1 for immediate deployment and planned Options 2 and 4 as the program advances. The modular structure removed the pressure to commit to a single approach before the evidence existed to support it.

Simulation Framework for Interim Decision Rules

Before any patient is enrolled, interim decision rules need to be grounded in simulation, not assumption. EvoClinical proposed a foundational framework for simulation-backed interim thresholds covering futility boundaries, dose-pruning criteria, and Type I error trade-offs.

The framework did not produce final operating characteristics at this stage. Instead, it identified the critical parameters that must eventually be modeled and documented the AI model’s core operational dependencies, including data capture standards, version locking requirements, and protocols for handling missing data.

The output was a clear roadmap of what must be resolved in future protocols and SAPs to keep the trial defensible at the interim timepoint. It closed the Decision-Framework Gap without overreaching into design elements that required more mature evidence to specify.
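To make the idea of simulation-backed thresholds concrete, a toy Monte Carlo like the one below can estimate how a single binding futility look trades stopping probability against Type I error and power. The effect sizes, information fraction, and boundaries are illustrative assumptions; the engagement's framework deliberately deferred final operating characteristics.

```python
import numpy as np

def operating_characteristics(mu, t=0.5, futility_z=0.0, crit_z=1.96,
                              n_sims=100_000, seed=3):
    """Monte Carlo stop and rejection rates for one binding futility look.

    mu is the expected final z-statistic (0 under the null);
    t is the information fraction at the interim analysis.
    """
    rng = np.random.default_rng(seed)
    # Brownian-motion representation: interim z plus an independent increment
    z_interim = rng.normal(mu * np.sqrt(t), 1.0, n_sims)
    increment = rng.normal(mu * np.sqrt(1.0 - t), 1.0, n_sims)
    z_final = np.sqrt(t) * z_interim + np.sqrt(1.0 - t) * increment
    stop_futility = z_interim < futility_z          # binding stop for futility
    reject = ~stop_futility & (z_final > crit_z)    # rejection at final analysis
    return stop_futility.mean(), reject.mean()

stop_h0, type1 = operating_characteristics(mu=0.0)   # null scenario
stop_h1, power = operating_characteristics(mu=3.0)   # hypothetical true effect
```

Running both scenarios shows the trade-off a real simulation report would quantify at scale: a binding futility rule stops a large share of null trials early at a small cost in power, and Type I error stays at or below the nominal level.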

A Note on Transparency

As part of the strategic framework, EvoClinical established a mandate for complete transparency in the simulation and decision logic. Upon execution, the sponsor receives full simulation reports including all underlying code, parameter assumptions, and reproducibility documentation. The decision logic is entirely open so the sponsor can conduct independent verification and internal governance reviews without barriers.

The Lesson

When a sponsor has access to promising AI technology but no clinical trial playbook for it, the bottleneck is rarely the science or the model. It is finding a statistical partner who can translate novel capabilities into trial decisions built to withstand regulatory scrutiny.

That requires three things done in sequence: defining the right context of use for the technology at its current stage of evidence, demanding the right level of rigor for the decisions that context must support, and building a plan that delivers concrete value now while creating a structured path for the technology to take on a larger role later.

The sponsor in this case did not have to choose between using the technology and protecting the program. A structured statistical approach made both possible at the same time.

Contact us

Partner with a BioStatistics CRO you can trust.

Our process helps put you at ease when purchasing statistical services.

Whether you’re facing a mid-program regulatory change, complex study design questions, or need transparent statistical methodology that can withstand regulatory scrutiny, EvoClinical’s biostatistics team is ready to help.

What happens next?

1. Schedule a call
2. Discovery conversation
3. We prepare a proposal

Schedule a Free Consultation