Building an LLM Fact-Checker That Got Legal Approval
Removing the Bottleneck to AI Content at Scale
The Challenge
The Motley Fool was publishing AI-generated articles for premium subscribers, allowing the company to scale content creation beyond what human writers could produce. But there was a critical bottleneck: every article required human fact-checkers to catch LLM hallucinations. The company's ambition was to scale to thousands of articles per month, providing comprehensive coverage across all US-listed companies. At that volume, human fact-checking would be both economically infeasible and operationally impossible; we simply couldn't hire enough qualified fact-checkers fast enough.
The real problem wasn't just operational; it was existential for the AI content strategy. Without solving fact-checking at scale, we couldn't deliver on the promise of comprehensive coverage. And in financial services, publishing inaccurate information wasn't just embarrassing—it was legally risky and could destroy subscriber trust.
The Approach
Proof of Concept Development
Built a prototype LLM-based fact-checking system in Python that reverse-engineered the content creation process. The system extracted individual factual statements from each article, then systematically checked those facts against the original source material used to create the content. Incorrect statements were corrected in the final output or, if correction wasn't possible with available sources, removed entirely.
This created a verifiable audit trail for every claim in every article.
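The extract-verify-correct loop above can be sketched in a few lines. This is a minimal illustration, not the production system: the real pipeline used an LLM to judge whether each claim was entailed by the source, whereas here a naive substring match stands in for that call, and all names (`AuditEntry`, `verify_claims`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AuditEntry:
    claim: str
    verdict: str             # "supported" or "removed"
    evidence: Optional[str]  # source passage backing the verdict, if any

def verify_claims(claims, source_text):
    """Check each extracted claim against the original source material.

    In the real system an LLM judges entailment; a case-insensitive
    substring match stands in for that call here (illustration only).
    Unsupported claims are dropped, and every decision is logged.
    """
    kept, audit = [], []
    for claim in claims:
        if claim.lower() in source_text.lower():
            kept.append(claim)
            audit.append(AuditEntry(claim, "supported", claim))
        else:
            # No supporting evidence in the source: remove the claim
            audit.append(AuditEntry(claim, "removed", None))
    return kept, audit

source = "Acme Corp reported revenue of $4.2B for Q3 2023."
claims = [
    "Acme Corp reported revenue of $4.2B for Q3 2023.",
    "Acme Corp doubled its revenue year over year.",
]
kept, audit = verify_claims(claims, source)
```

The `audit` list is the point: every claim in every article gets a recorded verdict with its evidence, which is what makes the output reviewable after the fact.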
Cross-Functional Validation
Presented the prototype to cross-functional teams including legal, tech infrastructure, and executive leadership. This wasn't just a demo—it was the beginning of a collaborative process to understand what "good enough" looked like from legal, editorial, and technical perspectives.
Legal's involvement from this early stage was crucial; we needed to understand their comfort zones before building production systems.
Production System Development
Assembled a small, focused team: a product manager with prompt engineering expertise and an AI developer. They transformed the prototype into a production-ready "generic fact checker"—a modular system that could integrate into any content pipeline for any LLM-generated content type.
Over several weeks and hundreds of test runs, we systematically evaluated outputs, identified edge cases, and refined the system.
Legal Collaboration & Launch Strategy
Worked directly with the Legal Team to create a phased launch timeline with explicit safety guardrails. This included:
- Running the AI fact-checker's outputs in parallel with human fact-checkers to prove statistical equivalence
- Establishing thresholds for when human review was triggered
- Defining monitoring systems for ongoing quality control
The goal wasn't to convince legal to take a risk—it was to demonstrate that the automated system was as reliable as human fact-checkers.
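The escalation guardrail can be expressed as a simple rule: if any claim's verification confidence falls below an agreed cutoff, the article routes to a human fact-checker instead of publishing automatically. The threshold value and function name below are hypothetical, chosen for illustration; the actual production thresholds were set with the Legal Team.

```python
# Hypothetical cutoff for illustration, not the actual production value
REVIEW_THRESHOLD = 0.90

def needs_human_review(claim_confidences):
    """Escalate an article to human review when any single claim's
    verification confidence falls below the agreed threshold."""
    return any(conf < REVIEW_THRESHOLD for conf in claim_confidences)

# One weak claim is enough to trigger review of the whole article
print(needs_human_review([0.99, 0.85]))  # True
print(needs_human_review([0.99, 0.97]))  # False
```

Triggering on the weakest claim, rather than an article-level average, reflects the risk profile: a single wrong financial fact is the failure mode that matters.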
The Results
Operational Impact
Legal Approval Achieved
First fully automated content publishing system approved at The Motley Fool
End-to-End Automation
Complete pipeline from SEC filing detection through fact-checking to CMS publication
Unprecedented Scale
Positioned company to cover thousands of companies monthly—previously impossible
Modular System
Used across multiple content pipelines and by the AI division of the Tech Team
Quality Validation
Output quality benchmarked against the 8.5/10 human fact-checker baseline, with launch criteria met after parallel testing
Strategic Value
Removed Primary Bottleneck
Eliminated the key constraint preventing AI content scale, unlocking the full potential of automated content generation
Competitive Advantage
Comprehensive company coverage became economically feasible, a cost structure infeasible for competitors relying on traditional fact-checking approaches
Proof of Concept for Regulated Industries
Demonstrated that AI could handle high-stakes content in regulated industries with proper system design
Leadership Lessons
This project reinforced three principles about building AI systems for high-stakes applications:
Build for Verification, Not Just Output
The key innovation wasn't better content generation—it was creating an audit trail. By extracting claims, checking them against sources, and documenting every verification step, we gave legal and editorial teams the confidence that the system was accountable, not just accurate.
Legal as Design Partner, Not Gatekeeper
Involving legal from the prototype stage transformed them from potential blockers into collaborators. They helped define what "provably reliable" meant, which shaped our technical approach. By the time we requested approval, they'd already shaped the system they were evaluating.
Statistical Proof Beats Theoretical Arguments
We didn't convince legal with promises about LLM capabilities. We proved equivalence by running hundreds of parallel comparisons between AI and human fact-checkers. When the data showed the systems performed comparably, approval became straightforward.
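The shape of that proof is an agreement rate with a confidence interval: run both systems on the same articles, count how often their verdicts match, and show the interval sits in the acceptable range. A stdlib-only sketch follows; the counts (478 of 500) are invented for illustration, and the normal-approximation interval is one of several reasonable choices.

```python
import math

def agreement_ci(agree, total, z=1.96):
    """Point estimate and 95% normal-approximation confidence interval
    for the rate at which AI and human fact-checkers agree."""
    p = agree / total
    se = math.sqrt(p * (1 - p) / total)  # standard error of a proportion
    return p, (p - z * se, p + z * se)

# Hypothetical parallel-run tally: 478 matching verdicts out of 500 articles
p, (low, high) = agreement_ci(agree=478, total=500)
print(f"agreement = {p:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

Presenting an interval rather than a single number is what made the case persuasive to legal: it bounds how wrong the estimate could plausibly be, rather than asserting a point value.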
Building High-Stakes AI Systems?
I specialize in developing AI systems for regulated industries where accuracy, accountability, and legal approval are critical.