Your AI Cleared the FDA. Now You're Afraid to Improve It.

Milos Zikic · 7 min read

Medtech teams freeze their AI models post-launch, fearing that any update means a costly return to the FDA. By leveraging the FDA’s Predetermined Change Control Plan (PCCP) framework and building automated evaluation infrastructure, engineering teams can escape the "deployment trap" and safely iterate in production.

The same story plays out across medtech constantly. A team ships an AI feature: a measurement tool that reads an ultrasound and returns a number. It clears the FDA. The launch deck calls it a milestone. Eighteen months later, customers are using the exact same model.

The ML engineers have a backlog of improvements they can't ship: a retrain on the last year of real cases, a threshold change that would cut false positives, support for a newer scanner that half the customers now use. None of it has gone out. Somewhere in the org, someone decided that touching the model means going back to the FDA, and going back to the FDA means months and money. So the model sits frozen. The part of the product they sold as "AI that learns" is the one part nobody is allowed to change.

This is the deployment trap that no demo warns you about. We've written before that the model is not the deliverable, arguing that the model is maybe 15% of the work and production is where the real job starts. This is the other half of that argument. Shipping a regulated AI system once requires intense effort. Maintaining the regulatory clearance to continuously improve it is far more complex, yet few teams architect for it. 

The Cost of a Frozen Model

Machine learning models in production face an uncomfortable truth: they decay. Patient populations shift. Hospitals swap scanner vendors. Clinical practices move on. Coding standards change. The model that validated perfectly against historical data quietly degrades against current real-world inputs. A model with no mechanism to adapt doesn't hold performance steady by being left alone; it degrades on a schedule you don't control.

Meanwhile, the market accelerates. By the end of 2025, the FDA's cumulative total of authorized AI/ML-enabled devices passed 1,450, with roughly three-quarters in radiology, while cardiology, neurology, and pathology climb fast. More AI devices clear every quarter, buyers get more options, and your product sits still because of regulatory friction.

This caution is understandable. Under the traditional framework, whether a change needs a new submission hinges on whether it could significantly affect safety or effectiveness. For software updated once a year, that review cycle is manageable. For a statistical system whose value relies on iterative data improvements, it acts as a straitjacket. The legacy approach, treating every meaningful optimization as a new submission, turns engineering teams into spectators.

The tool that changed the math: the PCCP

In December 2024, the FDA finalized its guidance on Predetermined Change Control Plans (PCCP) for AI-enabled device software functions. If you build medical AI, this framework decides whether your model ships frozen or iterates safely in production for years.

The mechanism, grounded in Section 515C of the FD&C Act, allows you to specify modifications upfront. In your original marketing submission, you describe the exact changes you intend to make after clearance and detail how you will validate them. Once the FDA authorizes that envelope, you deploy those changes without a new 510(k), De Novo, or PMA supplement, provided you operate within the agreed parameters.

A functional PCCP requires three components:

  1. Description of Modifications: The specific, bounded changes you will make, such as retraining on a stated cadence, adding specific input sources, or tuning output thresholds.
  2. Modification Protocol: The development, testing, and validation methods each change must clear before going live, including exact acceptance criteria.
  3. Impact Assessment: A technical analysis of how those changes shift device behavior and risk, proving the results remain clinically acceptable.
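One way to make the three components concrete is to hold them in a small data structure that a pipeline can query. The sketch below is only an illustration, not a format the FDA prescribes: the class names, the `max_allowed` field, the device name, and the 1.5 mm bound are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class AllowedModification:
    """One bounded change described in the plan (all fields are illustrative)."""
    name: str                      # Description of Modifications: short identifier
    description: str               # Description of Modifications: what changes
    max_allowed: Dict[str, float]  # Modification Protocol: metric -> upper bound
    impact_notes: str              # Impact Assessment: summary of risk analysis


@dataclass
class ChangeEnvelope:
    """The set of changes the team intends to pre-authorize for one device."""
    device: str
    modifications: Dict[str, AllowedModification] = field(default_factory=dict)

    def add(self, mod: AllowedModification) -> None:
        self.modifications[mod.name] = mod


def evaluate_change(
    envelope: ChangeEnvelope, change_name: str, measured: Dict[str, float]
) -> Tuple[bool, List[str]]:
    """Return (allowed, reasons). A change passes only if it is described in
    the envelope and every acceptance criterion was measured and met."""
    mod = envelope.modifications.get(change_name)
    if mod is None:
        return False, [f"'{change_name}' is not described in the plan"]
    reasons = []
    for metric, upper_bound in mod.max_allowed.items():
        value = measured.get(metric)
        if value is None:
            reasons.append(f"{metric}: not measured")
        elif value > upper_bound:
            reasons.append(f"{metric}: {value} exceeds {upper_bound}")
    return (not reasons), reasons


# Hypothetical envelope for an ultrasound measurement tool.
envelope = ChangeEnvelope(device="example-ultrasound-measurement")
envelope.add(
    AllowedModification(
        name="periodic_retrain",
        description="Retrain on newly collected cases at a stated cadence",
        max_allowed={"mean_abs_error_mm": 1.5},  # made-up acceptance bound
        impact_notes="Same intended use; no change to inputs or outputs",
    )
)
```

Calling `evaluate_change(envelope, "periodic_retrain", {"mean_abs_error_mm": 1.2})` returns an allowed result under these assumed bounds, while a change name that was never added to the envelope is rejected outright, which mirrors the idea that anything outside the authorized plan still needs its own submission.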

This process requires you to pre-negotiate the right to improve your product. The entry requirement is defining precisely what "better and still safe" means in automated code before you submit.

The Architectural Hurdle

The PCCP framework has been live for over eighteen months, yet out of well over a thousand cleared AI tools, only around two dozen devices have an authorized plan.

This gap exists because a PCCP is an engineering artifact, not a piece of regulatory prose written at the end of a project. You cannot pre-authorize a change you cannot specify, validate programmatically, or monitor in production. Your modification protocol is your Continuous V&V (Verification & Validation) pipeline translated into regulatory terms.

Without an automated evaluation harness to prove a new version matches or beats the old one, plus real-time data-drift monitoring, you cannot execute a PCCP.
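As a rough sketch of what "matches or beats the old one" could look like in code, the example below compares a candidate model against the current baseline on the same held-out cases. It assumes mean absolute error as the only metric and treats models as plain callables; the function names and the `margin` parameter are hypothetical. A real modification protocol would likely need a statistically justified comparison (for example, confidence intervals) rather than this point estimate.

```python
from statistics import mean
from typing import Callable, Dict, Sequence, Tuple

# One evaluation case: (input features, reference measurement).
Case = Tuple[Sequence[float], float]
Model = Callable[[Sequence[float]], float]


def mean_abs_error(model: Model, cases: Sequence[Case]) -> float:
    """Average absolute difference between model output and reference."""
    if not cases:
        raise ValueError("evaluation set is empty")
    return mean(abs(model(features) - reference) for features, reference in cases)


def regression_gate(
    baseline: Model, candidate: Model, cases: Sequence[Case], margin: float = 0.0
) -> Dict[str, object]:
    """Pass only if the candidate's error is no worse than the baseline's
    error plus an agreed margin (0.0 means 'must match or beat')."""
    baseline_mae = mean_abs_error(baseline, cases)
    candidate_mae = mean_abs_error(candidate, cases)
    return {
        "baseline_mae": baseline_mae,
        "candidate_mae": candidate_mae,
        "passed": candidate_mae <= baseline_mae + margin,
    }


# Toy stand-ins for two model versions and a tiny held-out set.
def old_model(features: Sequence[float]) -> float:
    return features[0] + 1.0  # systematically 1.0 too high on these cases


def new_model(features: Sequence[float]) -> float:
    return features[0] + 0.5  # systematically 0.5 too high on these cases


held_out = [([1.0], 1.0), ([2.0], 2.0)]
report = regression_gate(old_model, new_model, held_out)
```

Because the gate returns the measured values alongside the pass/fail flag, the same report can be logged as evidence for each release, under whatever record-keeping the team's quality system requires.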

Teams that treat regulatory strategy as a post-script fail here. They build a model and a clean demo, but skip the infrastructure that turns validation into an automated query with an objective answer. When retraining is required, they cannot prove safety cheaply. The only compliant path left is a full new submission. The model didn't freeze because the FDA required it to. It froze because nobody built the machinery to prove the next version safe.

A strong protocol specifies exact programmatic gates: 

  • Data Drift Thresholds: The system halts automated retraining if input data distribution shifts beyond a defined population embedding distance.
  • Automated Regression Harnesses: The pipeline tests the retrained model against a version-controlled, held-out gold-standard evaluation dataset of over 10,000 edge-case vectors.
  • Deterministic Safety Wrappers: Hard-coded boundary constraints ensure that, whatever the learned weights produce, an ultrasound measurement tool can never output a physically impossible value.
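Two of the gates above, the drift halt and the safety wrapper, can be sketched in a few lines. This is an illustration under stated assumptions: "population embedding distance" is approximated here as the Euclidean distance between the mean embedding of a reference batch and a recent batch, which is only one of many possible drift measures, and the function names, thresholds, and millimetre bounds are hypothetical.

```python
import math
from typing import List, Sequence


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of equal-length embedding vectors."""
    if not vectors:
        raise ValueError("no embeddings supplied")
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def drift_distance(
    reference: Sequence[Sequence[float]], recent: Sequence[Sequence[float]]
) -> float:
    """Euclidean distance between the two population mean embeddings."""
    return math.dist(mean_vector(reference), mean_vector(recent))


def retraining_allowed(
    reference: Sequence[Sequence[float]],
    recent: Sequence[Sequence[float]],
    max_distance: float,
) -> bool:
    """Halt automated retraining when drift exceeds the agreed threshold."""
    return drift_distance(reference, recent) <= max_distance


class OutOfRangeError(ValueError):
    """Raised when a model output violates a hard physical bound."""


def safe_measurement(raw_value_mm: float, lower_mm: float, upper_mm: float) -> float:
    """Deterministic wrapper: reject NaN or out-of-bounds outputs instead of
    returning them. The bounds are assumed to come from clinical limits."""
    if math.isnan(raw_value_mm) or not (lower_mm <= raw_value_mm <= upper_mm):
        raise OutOfRangeError(
            f"measurement {raw_value_mm} mm outside [{lower_mm}, {upper_mm}] mm"
        )
    return raw_value_mm


# Toy embeddings: the recent batch has shifted away from the reference batch.
reference_batch = [[0.0, 0.0], [2.0, 2.0]]
shifted_batch = [[4.0, 5.0], [4.0, 5.0]]
can_retrain = retraining_allowed(reference_batch, shifted_batch, max_distance=1.0)
```

The wrapper raises rather than clamps, on the assumption that silently correcting an impossible value would hide a failure that someone should review; a team could reasonably choose a different behavior as long as it is specified in advance.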

Europe Is Heading the Same Way

If you sell into Europe, similar structural expectations are arriving. Under the EU AI Act, AI integrated into a medical device is classified as high-risk, adding obligations on top of existing Medical Device Regulation (MDR) compliance. These requirements, including post-market monitoring and disciplined change management, are scheduled to take effect beginning August 2027.

Regulators across continents share a clear direction: you must manage your model across its entire lifecycle, not freeze it at launch. The firms that capture market share in regulated AI will be those that can safely ship version 12 while competitors remain hesitant to touch version 1.

What to do this week

You do not need an active submission to begin this work. Before your next AI feature enters development, document a clear change envelope using this structure:

[Figure: change envelope template]

If your team can populate all three columns with technical specifics, you have the foundation of a modification protocol and a deployable system. If you cannot, you have identified the exact engineering gaps blocking production.

Our engineering team operates under two strict principles on every regulated build:

  1. The evaluation harness is a precondition for shipping, not a post-launch addition. The test suite verifying the initial model is the exact same codebase that ensures the safety of the next deployment. Build it first.
  2. Regulatory constraints belong in the architecture conversation, not the submission phase. Your future freedom to iterate is decided when you design your data pipeline, telemetry monitoring, and retraining infrastructure, not in a compliance meeting eight months later.

How to move forward

This is where our studio model focuses. In a five-day Deployment Bootcamp, we work with your real data to deploy a working production system. Alongside it, we build the automated evaluation harness and the architectural documentation required to survive a compliance review.

That harness provides the foundation for the modification protocol, allowing you to iterate without refiling from scratch. Proving your AI works and earning the right to improve it happen within the same architecture.

If your model is stuck and your optimizations are piling up in the backlog, let's look at the blockers together. Or get in touch with our engineering team to talk through your system's architecture.

Your Turn

Shipping production AI into a regulated industry?

Tell us the regulatory or safety constraint slowing you down. 30 minutes with a senior engineer, a deployable architecture sketch, and an honest call on whether a Bootcamp is the right next step.