A Framework for Verifiable AI in Regulated Industries
Why 'the model said so' is not a compliance answer — and what verifiable AI infrastructure actually requires.
Three years ago, a hallucinating AI model in a regulated context was a theoretical risk. Today it is a regulatory liability. SEC staff bulletins, FINRA examination priorities, and FCA Consumer Duty guidance all now contain explicit language about AI governance. The question is no longer whether your AI outputs need to be verifiable — it is how you build the infrastructure to make them so.
This post lays out a framework. Not a philosophy. A set of concrete requirements that, if satisfied, make your AI outputs defensible to an examiner.
The four requirements of verifiable AI
1. Source-grounded outputs
Every factual claim in an AI output must trace back to a specific source document. Not "the model learned this during training." A specific document, with a retrievable citation, that your compliance team can produce on demand.
This is not a prompt engineering problem. It is an architecture problem. You need retrieval-augmented generation with citation tracking, or you need to constrain your model to operate only within a bounded document set. Models that generate from parametric memory alone are, by definition, unverifiable.
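One way to picture citation tracking is as a data model in which every claim carries its own citations, plus a check that each citation resolves inside the bounded document set. The sketch below is a minimal illustration, not a reference design: the `Citation` and `Claim` types, the `ungrounded_claims` helper, and the dictionary-based corpus are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Citation:
    doc_id: str   # hypothetical identifier of the source document
    locator: str  # where in the document, e.g. a page or section label
    quote: str    # the exact passage the claim relies on


@dataclass(frozen=True)
class Claim:
    text: str
    citations: Tuple[Citation, ...] = ()


def ungrounded_claims(claims: List[Claim], corpus: Dict[str, str]) -> List[Claim]:
    """Return the claims that have no citation resolvable in the corpus.

    `corpus` maps doc_id to the full document text. A citation resolves only
    if the document exists and the quoted passage appears in it verbatim.
    A claim counts as grounded if at least one of its citations resolves.
    """
    failures = []
    for claim in claims:
        resolved = [
            c for c in claim.citations
            if c.doc_id in corpus and c.quote in corpus[c.doc_id]
        ]
        if not resolved:
            failures.append(claim)
    return failures
```

A claim with no citations at all, or with a citation pointing outside the bounded set, comes back as ungrounded and can be blocked or routed to review before it reaches a client.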
2. Real-time verification scoring
Citation is necessary but not sufficient. A model can cite a source and still misrepresent it. Real-time verification means scoring the output against the cited source: confirming that the claim is supported by, not just adjacent to, the source material.
In production SEC filing summarization workflows, we measure hallucination rates of 4–8% in unverified outputs. After verification scoring with source cross-reference, this drops to under 0.3%. The delta is the liability your compliance team is currently carrying without knowing it.
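To make the shape of a verification score concrete, here is a deliberately crude sketch. It scores a claim by lexical overlap with the cited passage, which is an assumption made purely for illustration: a production scorer would more plausibly use an entailment or fact-checking model, since token overlap cannot detect negation or reordered meaning. The function names and the 0.8 threshold are hypothetical.

```python
import re


def _tokens(text):
    # Lowercased word and number tokens; punctuation and symbols are discarded.
    return re.findall(r"[a-z0-9]+(?:\.[0-9]+)?", text.lower())


def support_score(claim, source_passage):
    """Fraction of the claim's distinct tokens that appear in the cited passage.

    A lexical stand-in for an entailment check: 1.0 means every token in the
    claim occurs in the passage, 0.0 means none do.
    """
    claim_tokens = set(_tokens(claim))
    if not claim_tokens:
        return 0.0
    passage_tokens = set(_tokens(source_passage))
    return len(claim_tokens & passage_tokens) / len(claim_tokens)


def verify(claim, source_passage, threshold=0.8):
    # The threshold is an illustrative value, not a recommended setting.
    score = support_score(claim, source_passage)
    return {"claim": claim, "score": score, "supported": score >= threshold}
```

The point of the sketch is the interface rather than the scoring method: every claim leaves verification with a numeric score and a pass or fail flag that can be written to the audit log alongside the output.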
3. Immutable audit logs
Verification is worthless if it cannot be demonstrated after the fact. Audit logs must be:
- Tamper-evident. Hash-chained entries where modifying any record breaks the chain.
- Complete. Capturing prompt, retrieved context, model version, temperature, output, verification score, and any flagged claims.
- Cryptographically signed. Each entry signed with ECDSA-P256 so authenticity can be proven without trusting the log storage provider.
- Exportable. Producible on demand for regulatory requests, both as structured data (JSON) and as a human-readable document (PDF).
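The tamper-evidence property in the list above can be sketched with nothing more than a hash chain. This is a minimal illustration under stated assumptions: the log is an in-memory list, the helper names are hypothetical, and per-entry ECDSA signing is left out because it needs a cryptography library and key management that are beyond a short example.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder previous-hash for the first entry


def _digest(prev_hash, record):
    # Canonical JSON (sorted keys, fixed separators) so that the same record
    # always serializes, and therefore hashes, identically.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log, record):
    """Append a record, linking it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {
        "record": record,
        "prev_hash": prev_hash,
        "hash": _digest(prev_hash, record),
    }
    log.append(entry)
    return entry


def verify_chain(log):
    """Return the index of the first broken entry, or None if the chain is intact."""
    prev_hash = GENESIS
    for i, entry in enumerate(log):
        expected = _digest(prev_hash, entry["record"])
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return i
        prev_hash = entry["hash"]
    return None
```

Because each hash covers the previous entry's hash, editing any stored record changes its recomputed digest and `verify_chain` reports the position where the chain first fails. Signing each entry's hash would additionally let a third party confirm who wrote it.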
"We have logs" is not the same as "we have audit-ready logs." The difference is whether an examiner can trace a specific output back through the complete chain of its production.
4. Regulatory report templates
Different regulators ask for different things. An SEC examination staff questionnaire asks different questions than an FCA supervisory visit. Your audit infrastructure needs to produce documentation packages structured for each regulatory context — not raw log exports that your compliance team must translate.
This is the layer that turns infrastructure investment into examination outcomes.
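A template layer can be as simple as a mapping from regulatory context to the log fields that context's package should contain. The sketch below assumes exactly that. The template names and field selections are placeholders invented for illustration and do not reflect what any actual regulator requests.

```python
import json

# Hypothetical templates: which log fields each package includes, in order.
# These selections are placeholders, not real regulatory requirements.
TEMPLATES = {
    "regulator_a": ["output", "model_version", "verification_score", "flagged_claims"],
    "regulator_b": ["prompt", "retrieved_context", "output", "verification_score"],
}


def build_package(records, template_name):
    """Render audit records as a JSON package shaped by the named template.

    Raises ValueError if any record lacks a field the template requires, so
    an incomplete log is caught before a package is handed over.
    """
    fields = TEMPLATES[template_name]
    missing = [
        (i, f) for i, r in enumerate(records) for f in fields if f not in r
    ]
    if missing:
        raise ValueError(f"records missing required fields: {missing}")
    rows = [{f: r[f] for f in fields} for r in records]
    return json.dumps({"template": template_name, "entries": rows}, indent=2)
```

Keeping templates as data rather than code means compliance staff can adjust what a package contains without touching the logging pipeline, and the completeness check surfaces gaps in requirement 3 as soon as a report is attempted.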
Where most implementations fall short
The most common failure mode is treating these four requirements as independent. A team will build excellent retrieval (requirement 1), skip real-time verification (requirement 2), store logs in an application database without tamper-evidence (requirement 3 violated), and have no reporting templates (requirement 4 absent).
The result looks like a compliance posture from the outside. An examiner who asks for the audit trail behind a specific client-facing AI output can be given it in 45 minutes, but only if that output was flagged for review. For all the outputs that were not flagged, there is nothing to produce.
The operational question
For each AI workflow you have in production, ask: if a regulator asked me to explain the provenance of output X from date Y, how long would it take me to produce a complete answer?
If the answer is more than one business day, you have a compliance gap. If the answer is "we cannot produce that," you have a liability.
The framework is not about restricting what AI can do. It is about making what AI does defensible. Those are different problems with different solutions. The first is a business question. The second is an infrastructure question.
Fact AI Lab is the infrastructure answer. The business question — whether to deploy AI in a given workflow — remains yours.
Questions about applying this framework to your specific workflows? Book a 20-minute call.