Building an LLM Fact-Checker That Got Legal Approval

Removing the Bottleneck to AI Content at Scale

The Challenge

The Motley Fool was publishing AI-generated articles for premium subscribers, allowing the company to scale content creation beyond what human writers could produce. But there was a critical bottleneck: every article required human fact-checkers to catch LLM hallucinations. The company's ambition was to scale to thousands of articles per month, providing comprehensive coverage of every US-listed company. At that volume, human fact-checking would be both economically infeasible and operationally impossible: we simply couldn't hire enough qualified fact-checkers fast enough.

The real problem wasn't just operational; it was existential for the AI content strategy. Without solving fact-checking at scale, we couldn't deliver on the promise of comprehensive coverage. And in financial services, publishing inaccurate information wasn't just embarrassing—it was legally risky and could destroy subscriber trust.

The Approach

Proof of Concept Development

Built a prototype LLM-based fact-checking system in Python that reverse-engineered the content creation process. The system extracted individual factual statements from each article, then systematically checked those facts against the original source material used to create the content. Incorrect statements were corrected in the final output or, if correction wasn't possible with the available sources, removed entirely.

This created a verifiable audit trail for every claim in every article.
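The extract-verify-correct loop can be sketched roughly as follows. This is an illustrative reconstruction, not the production code: the function names are hypothetical, and the substring check in `verify_claim` stands in for what would be an LLM call in the real system.

```python
# Sketch of the fact-checking pipeline: extract claims, verify each one
# against the source material, drop what can't be verified, and keep an
# audit trail for every claim. Names and logic are illustrative.

def extract_claims(article: str) -> list[str]:
    """Split an article into candidate factual statements (one per sentence)."""
    return [s.strip() for s in article.split(".") if s.strip()]

def verify_claim(claim: str, source: str) -> bool:
    """Placeholder verifier: in production this would be an LLM call that
    checks the claim against the original source material."""
    return claim.lower() in source.lower()

def fact_check(article: str, source: str) -> tuple[str, list[dict]]:
    """Return the article with unverifiable claims removed, plus an audit
    trail recording a verdict for every extracted claim."""
    audit_trail, kept = [], []
    for claim in extract_claims(article):
        verified = verify_claim(claim, source)
        audit_trail.append({"claim": claim, "verified": verified})
        if verified:
            kept.append(claim)
    return ". ".join(kept) + ("." if kept else ""), audit_trail

source = "Acme Corp reported revenue of $10M in Q3. Shares rose 4%."
article = "Acme Corp reported revenue of $10M in Q3. Profits doubled"
checked, trail = fact_check(article, source)
print(checked)     # the unverifiable claim is dropped
print(len(trail))  # one audit entry per extracted claim
```

The audit trail is the point: every claim gets a recorded verdict, which is what made the system reviewable by legal and editorial teams rather than a black box.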

Cross-Functional Validation

Presented the prototype to cross-functional teams including legal, tech infrastructure, and executive leadership. This wasn't just a demo—it was the beginning of a collaborative process to understand what "good enough" looked like from legal, editorial, and technical perspectives.

Legal's involvement from this early stage was crucial; we needed to understand their comfort zones before building production systems.

Production System Development

Assembled a small, focused team: a product manager with prompt engineering expertise and an AI developer. They transformed the prototype into a production-ready "generic fact checker"—a modular system that could integrate into any content pipeline for any LLM-generated content type.

Over several weeks and hundreds of test runs, we systematically evaluated outputs, identified edge cases, and refined the system.
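The evaluation loop across those test runs can be sketched as a simple harness that scores each output against an expected verdict and surfaces disagreements as edge cases to inspect. The cases and scoring rule below are hypothetical stand-ins for the production criteria.

```python
# Minimal evaluation-harness sketch: aggregate accuracy over test runs and
# flag disagreements for manual review. Data and field names are illustrative.

def evaluate(cases: list[dict]) -> dict:
    """cases: [{"id": ..., "predicted": bool, "expected": bool}, ...]"""
    edge_cases = [c["id"] for c in cases if c["predicted"] != c["expected"]]
    accuracy = 1 - len(edge_cases) / len(cases)
    return {"accuracy": accuracy, "edge_cases": edge_cases}

runs = [
    {"id": "run-001", "predicted": True,  "expected": True},
    {"id": "run-002", "predicted": False, "expected": False},
    {"id": "run-003", "predicted": True,  "expected": False},  # disagreement
    {"id": "run-004", "predicted": True,  "expected": True},
]
report = evaluate(runs)
print(report)  # accuracy 0.75; run-003 is flagged as an edge case
```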

Legal Collaboration & Launch Strategy

Worked directly with the Legal Team to create a phased launch timeline with explicit safety guardrails. This included:

  • Running the AI fact-checker's outputs in parallel with human fact-checkers to prove statistical equivalence
  • Establishing thresholds for when human review was triggered
  • Defining monitoring systems for ongoing quality control

The goal wasn't to convince legal to take a risk—it was to demonstrate that the automated system was as reliable as human fact-checkers.
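One simple way to frame the parallel-run comparison is a two-proportion z-test on the error rates of human versus AI fact-checking over the same articles. The counts below are made up, and the actual acceptance criteria were defined with the legal team; this is only a sketch of the statistical shape of the argument.

```python
import math

# Illustrative two-proportion z-test: compare the error rate of human
# fact-checkers against the AI system on parallel runs. Counts are invented.

def two_proportion_z(err_a: int, n_a: int, err_b: int, n_b: int) -> float:
    """z statistic for H0: the two error rates are equal."""
    p_a, p_b = err_a / n_a, err_b / n_b
    p_pool = (err_a + err_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# e.g. 12 errors in 400 human-checked articles vs 14 in 400 AI-checked
z = two_proportion_z(12, 400, 14, 400)
print(round(abs(z), 2))  # well below 1.96, i.e. no significant difference
```

A |z| below the 1.96 critical value (at the 5% level) means the data cannot distinguish the two error rates, which is the "statistical equivalence" evidence the parallel run was designed to produce.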

The Results

Operational Impact

Legal Approval Achieved

First fully automated content publishing system approved at The Motley Fool

End-to-End Automation

Complete pipeline from SEC filing detection through fact-checking to CMS publication

Unprecedented Scale

Positioned the company to cover thousands of companies monthly, a scale that was previously impossible

Modular System

Used across multiple content pipelines and by the AI division of the Tech Team

Quality Validation

  • 8.4/10 AI content quality (vs. 8.5/10 human baseline)
  • 100% of legal standards met after testing
  • Multiple content types supported

Strategic Value

Removed Primary Bottleneck

Eliminated the key constraint preventing AI content scale, unlocking the full potential of automated content generation

Competitive Advantage

Comprehensive company coverage became economically feasible for us while remaining infeasible for competitors using traditional approaches

Proof of Concept for Regulated Industries

Demonstrated that AI could handle high-stakes content in regulated industries with proper system design

Leadership Lessons

This project reinforced three principles about building AI systems for high-stakes applications:

Build for Verification, Not Just Output

The key innovation wasn't better content generation—it was creating an audit trail. By extracting claims, checking them against sources, and documenting every verification step, we gave legal and editorial teams the confidence that the system was accountable, not just accurate.

Legal as Design Partner, Not Gatekeeper

Involving legal from the prototype stage transformed them from potential blockers into collaborators. They helped define what "provably reliable" meant, which shaped our technical approach. By the time we requested approval, they'd already shaped the system they were evaluating.

Statistical Proof Beats Theoretical Arguments

We didn't convince legal with promises about LLM capabilities. We proved equivalence by running hundreds of parallel comparisons between AI and human fact-checkers. When the data showed the systems performed comparably, approval became straightforward.

Building High-Stakes AI Systems?

I specialize in developing AI systems for regulated industries where accuracy, accountability, and legal approval are critical.