Independent verification for AI-generated documents.

Seven adversarial AI agents from four providers verify every claim against authoritative databases. Returns an Ed25519-signed trust score.

SEC 10-K — Danaher Corporation
Verified 3 minutes ago
94
Score
Grade A
Certificate
GS-2024-09847

Claim Verdicts

Revenue increased 8.2% · VERIFIED
CEO is Richard T. Cote · INCONCLUSIVE
Fabricated acquisition of Acme Corp · DEBUNKED
Gross margin 54.3% · VERIFIED

Source Badges

SEC EDGAR
eCFR
CourtListener
PubMed
Signature: Ed25519 verified · 7-agent consensus
Data sourced from authoritative registries
SEC EDGAR
CourtListener
eCFR
PubMed
ClinicalTrials.gov
38 CFR Part 4

The Problem

A single AI model agreeing with itself is not verification.

Document verification is broken when it relies on a single model. That's why we built GauntletScore.

9.6:1
Model-to-model variance in errors flagged

In a pre-registered study of 20 publicly traded companies, one foundation model flagged 48 potential errors on the same document where another flagged only 5.

View pre-registered methodology on OSF →

The Self-Check Problem

When you ask an AI to review its own output, the number of errors it finds depends more on which model you chose than on what's actually wrong with the document.

Model Variance Creates Blind Spots

One foundation model flags critical errors while another misses them entirely. You don't know which one to trust.

How It Works

Three steps to verified trust

From document upload to certified score in minutes. No manual review, no single point of failure.

01

Submit

Upload a document via API, batch file, or web interface. We accept PDFs, text, and Markdown. Encrypted in transit and at rest.

02

Verify

Seven independent AI agents from four providers debate each claim, cross-referencing it against authoritative databases. Adversarial disagreement surfaces blind spots.

03

Certify

Receive a trust score, claim verdicts (VERIFIED/DEBUNKED/INCONCLUSIVE), and an Ed25519-signed certificate. Audit-ready within 5-10 minutes.
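The three-step flow above ends with a machine-readable result: a score, per-claim verdicts, and a certificate ID. A minimal sketch of consuming one (the response shape below is an illustrative assumption, not the documented API schema):

```python
import json

# Hypothetical shape of a completed analysis (illustrative only;
# the real GauntletScore response schema may differ).
sample_response = json.loads("""
{
  "score": 94,
  "grade": "A",
  "certificate_id": "GS-2024-09847",
  "signature_algorithm": "Ed25519",
  "verdicts": [
    {"claim": "Revenue increased 8.2%", "verdict": "VERIFIED"},
    {"claim": "CEO is Richard T. Cote", "verdict": "INCONCLUSIVE"},
    {"claim": "Fabricated acquisition of Acme Corp", "verdict": "DEBUNKED"}
  ]
}
""")

def summarize(result: dict) -> dict:
    """Count per-claim verdicts from one analysis result."""
    counts = {"VERIFIED": 0, "DEBUNKED": 0, "INCONCLUSIVE": 0}
    for v in result["verdicts"]:
        counts[v["verdict"]] += 1
    return counts

print(summarize(sample_response))
# {'VERIFIED': 1, 'DEBUNKED': 1, 'INCONCLUSIVE': 1}
```

Because every verdict is structured rather than free text, downstream compliance tooling can gate on it directly (e.g. block publication while any claim is DEBUNKED).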

What It Catches

Errors that self-check misses entirely.

Real categories of error from a pre-registered study of 20 public companies. Every error verified against external databases—not model opinion.

CRITICAL

Fabricated Executives

An AI-generated company profile named a Chief Growth Officer who does not exist. Verified against public corporate records.

CRITICAL

Invented Corporate Events

A profile described a $100-125 million settlement between two companies. No such settlement occurred. Verified via court records.

MAJOR

Wrong Financial Figures

A profile claimed the company held 16% of total U.S. commercial bank deposits. Actual figure per FDIC data: 13-14%.

MAJOR

Arithmetic Failures

A balance sheet claimed 13.05 + 2.81 = 19.07. Actual sum: 15.86. Caught by mathematical proof engine.

MINOR

Misattributed Roles

A profile stated one person held both Chairman and CEO titles. Verified: they are held by two different people.
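The arithmetic-failure category above is the easiest to make concrete: a stated total that disagrees with the sum of its own line items. A minimal sketch of that check (the tolerance value is an illustrative assumption, not the production engine):

```python
def check_sum(line_items, stated_total, tol=0.005):
    """Flag a stated total that disagrees with the sum of its line items.

    Returns (ok, actual_sum): ok is False when the stated total
    cannot be reconciled within the tolerance.
    """
    actual = round(sum(line_items), 2)
    ok = abs(actual - stated_total) <= tol
    return ok, actual

# The example from the study: 13.05 + 2.81 stated as 19.07.
ok, actual = check_sum([13.05, 2.81], 19.07)
print(ok, actual)  # False 15.86
```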

Solutions

Built for regulated industries

Purpose-built verification workflows for the industries where trust matters most.

Legal

Verify case citations against CourtListener. Flag fabricated holdings, incorrect regulatory references, and jurisdiction errors before filing.

Financial Compliance

Check executive names, M&A events, and financial figures against SEC EDGAR filings. Verify arithmetic in balance sheets and projections.

Insurance & Fraud

Validate billing codes, provider credentials, and claims arithmetic. Detect upcoding patterns, excluded providers, and geographic impossibility.

More Domains

Scientific research, VA disability claims, biotech regulatory filings, and general document verification. Same engine, domain-specific tools.

Comparison

How GauntletScore compares.

Built on independent verification, not single-model confidence.

Capabilities compared across General AI, RAG + Tools, and GauntletScore:

Multi-provider verification
Adversarial cross-examination
Primary source verification (partial in RAG + Tools)
Per-claim verdicts
Cryptographic proof
Air-gapped deployment
Audit-ready artifacts

Validation Data

Tested, measured, published.

840+
verification runs across 30+ models
+37 pts
avg improvement vs reasoning-only
96%
score parity between cloud and local
5-10 min
typical time to score and certify

Pre-registered validation study of 20 public companies completed. Full methodology published on Open Science Framework. White paper forthcoming.

Sovereign Edition

Private Preview

Air-gapped deployment. Zero data egress.

Run the same verification engine on your own hardware for absolute data sovereignty.

Air-gapped or VPC deployment

Runs on your own hardware. No cloud API calls. No data leaves your network.

Zero data egress

Complete control over your infrastructure and verification processes.

Customer-managed keys

You control the cryptographic keys and all signing operations.

96% score parity

Identical results between cloud and local air-gapped deployment.

CMMC 2.0 ready

Architecture maps to CMMC 2.0 controls for government compliance.

Same Score. No Cloud.

Absolute Data Sovereignty

Pricing

Simple pricing. No subscriptions.

Pay per credit. Credits never expire. Every analysis includes full 7-agent debate, all document types, cryptographic certificate, and full transcript.

Free

$0

3 credits included

One-time. No credit card required.

Starter

$29

10 credits

$2.90/credit

Perfect for initial testing.

Pro

$69

25 credits

$2.76/credit

Best for regular use.

Business

$125

50 credits

$2.50/credit

For high-volume verification.

Longer or multi-part documents may require more than one credit.

Enterprise and Sovereign Edition pricing available. Contact sales@genstrata.com

View full pricing →

FAQ

Frequently Asked Questions

Everything you need to know about GauntletScore and trust verification.

What counts as one analysis?
One credit = one document submission. Each document is processed by 7 agents from 4 providers, cross-referenced against authoritative sources, and returned with per-claim verdicts and a signed certificate. Longer or multi-part documents may require more than one credit. Free tier includes 3 credits, one-time.
What document types do you support?
PDFs, plain text, and Markdown. SEC 10-Ks, legal filings, earnings calls, research papers, insurance claims, biotech filings, VA disability claims—anything with verifiable claims. Submit via API, batch, or web upload.
Do you store my documents?
No. Documents are encrypted in transit, processed in memory, and deleted immediately after analysis. You receive a signed certificate of verification. Sovereign Edition customers can run air-gapped local deployment with zero data egress.
How do you handle disagreement between agents?
Disagreement is a feature, not a bug. When agents debate a claim and cannot reach consensus, we mark it INCONCLUSIVE and show you which agent flagged what. This surfaces blind spots that a single model would miss. Transparent uncertainty beats false confidence.
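One way to picture that aggregation rule: a consensus verdict is returned only when enough agents agree, and anything short of that threshold is surfaced as INCONCLUSIVE rather than averaged away. (The threshold below is an illustrative assumption, not the production logic.)

```python
from collections import Counter

def aggregate(agent_verdicts, min_agreement=0.75):
    """Return a consensus verdict, or INCONCLUSIVE when agents split.

    agent_verdicts: list of per-agent verdicts for one claim.
    """
    top, count = Counter(agent_verdicts).most_common(1)[0]
    if count / len(agent_verdicts) >= min_agreement:
        return top
    return "INCONCLUSIVE"

print(aggregate(["VERIFIED"] * 6 + ["DEBUNKED"]))      # VERIFIED
print(aggregate(["VERIFIED"] * 4 + ["DEBUNKED"] * 3))  # INCONCLUSIVE
```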
How long does verification take?
Typical documents complete in 5-10 minutes. Each analysis triggers between 100 and 200 API calls to authoritative databases across 4 rounds of adversarial debate. Longer or multi-part documents take proportionally longer.
What about privacy and compliance?
HIPAA BAA available. GDPR DPA with SCCs available. Architecture maps to SOC 2 and CMMC 2.0 controls. All documents deleted after processing. Ed25519-signed certificates provide cryptographic proof of verification for audit trails. Sovereign Edition runs air-gapped on your own hardware with customer-managed encryption keys and zero data egress.
How do I know the agents aren't just inventing their own errors?
GauntletScore's verdicts are not based on what agents believe or estimate. During the tool verification phase (Round 0), each agent issues direct, structured queries to authoritative databases — CourtListener, SEC EDGAR, eCFR, PubMed, and others — and receives structured responses. A claim is DEBUNKED when a database returns a record that contradicts it, or when the authoritative source returns no matching record for a citation that should be there. A claim is VERIFIED when the authoritative source confirms it. The debate rounds that follow are structured around that external evidence, not around agent opinion. The full source citation for every tool query is preserved in the audit transcript, so you can verify the database result yourself.
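The Round 0 mapping described above, from a database response to a verdict, can be pictured as a pure function of the lookup result (the record shape and rules below are illustrative assumptions, not the production pipeline):

```python
def verdict_from_lookup(claim_value, record):
    """Map an authoritative-database lookup to a per-claim verdict.

    record is None when the source returned no matching record for
    a citation that should exist; otherwise it carries the source's
    value for the claimed fact.
    """
    if record is None:
        return "DEBUNKED"          # cited record does not exist
    if record["value"] == claim_value:
        return "VERIFIED"          # source confirms the claim
    if record["value"] is not None:
        return "DEBUNKED"          # source contradicts the claim
    return "INCONCLUSIVE"          # source holds no usable value

print(verdict_from_lookup("8.2%", {"value": "8.2%"}))      # VERIFIED
print(verdict_from_lookup("Acme Corp acquisition", None))  # DEBUNKED
```

The key property is that the verdict depends only on the external evidence, never on what any agent believes; the debate rounds argue about this output, not around it.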
What happens when two agents disagree?
The structured debate is designed to surface and resolve disagreements — but not to manufacture false consensus when the evidence is genuinely ambiguous. If agents hold opposing positions after four rounds of debate and neither side can produce controlling external evidence, the claim is returned as INCONCLUSIVE. This is a distinct verdict in the scoring system, not a fallback. An INCONCLUSIVE verdict tells you that the claim could not be confirmed or refuted against available authoritative sources — which is materially different from both VERIFIED and DEBUNKED, and more useful than a false confidence score that buries the uncertainty.
Can I use GauntletScore on documents that contain confidential information?
The Cloud Edition processes documents in memory during analysis. The original document text is not stored — only its SHA-256 hash is retained for certificate verification. Temporary files are deleted after results are stored, and each analysis runs in an isolated subprocess whose memory is fully reclaimed by the OS on exit. Tenant data is isolated at the database level through PostgreSQL Row-Level Security. For organizations that cannot send documents through any external API, the Sovereign Edition runs the complete GauntletScore pipeline on your own hardware with no external network dependencies.
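Hash-only retention means a certificate can later be checked against a document you hold, without the service ever storing the text. A stdlib sketch of that fingerprint:

```python
import hashlib

def document_fingerprint(text: str) -> str:
    """SHA-256 hex digest of the document text, the only artifact retained."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

doc = "Revenue increased 8.2% year over year."
fp = document_fingerprint(doc)
print(len(fp))                          # 64 hex characters
print(fp == document_fingerprint(doc))  # True: same text, same fingerprint
```

Because SHA-256 is one-way, the retained digest reveals nothing about the document's contents, yet any later change to even one character of the text produces a different fingerprint.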
How is this different from running my document through multiple chatbots?
  1. Chatbots reason about claims. GauntletScore verifies them against authoritative databases. There is no amount of reasoning that produces the same result as a live CourtListener query confirming whether a case citation resolves to a real decision.
  2. Chatbots don't challenge each other with structured evidence. GauntletScore's four-round debate structure forces agents to defend their findings against adversarial challenge. Round 2 is specifically designed so that Pyrrho challenges every significant claim and demands evidence.
  3. Pyrrho's causal_validate pipeline has no equivalent in standard chatbot review. When a document makes a causal claim, Pyrrho evaluates its temporal, proportional, and logical structure — not just whether the stated facts are individually accurate.
  4. Bayesian confidence calibration produces scores that reflect evidentiary weight, not chatbot confidence language. “I'm fairly confident this is accurate” and a calibrated 0.91 confidence score are not the same thing.
  5. The knowledge graph means that every organization's second, tenth, and hundredth run benefits from verified facts accumulated in prior runs. Chatbots have no persistent memory of what they've verified before.
  6. Every GauntletScore analysis returns an Ed25519-signed cryptographic certificate. If you need to demonstrate to a court, regulator, or compliance auditor that a specific document was independently verified, the certificate provides that proof. A chatbot conversation log does not.
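Point 4 in the list above can be illustrated with a toy Bayesian update: each piece of tool evidence shifts a prior toward or away from the claim, yielding a numeric confidence rather than confidence language. (The prior and likelihood values below are made-up illustration numbers, not the calibrated production model.)

```python
def bayes_update(prior, likelihood_true, likelihood_false):
    """Posterior P(claim true) after one piece of evidence,
    given the evidence's likelihood under each hypothesis."""
    num = prior * likelihood_true
    return num / (num + (1 - prior) * likelihood_false)

# Start neutral, then fold in two confirming database hits.
p = 0.5
for lt, lf in [(0.9, 0.2), (0.85, 0.3)]:
    p = bayes_update(p, lt, lf)
print(round(p, 2))  # 0.93
```

Two independent confirmations move a neutral prior to roughly 0.93; a single vague "I'm fairly confident" carries no such evidentiary accounting.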
What's your error rate?
Our pre-registered validation study examined 13,579 claims across 360 evaluations of 20 public companies. The system detected 27 tool-verified factual errors — cases where an authoritative database directly contradicted a claim. What the system catches well: structurally verifiable claims — citations, arithmetic, regulatory references, executive names and titles. Tool-augmented verification produced a +37.1 point improvement over reasoning-only analysis. What the system does not catch: errors requiring deep domain expertise unavailable in any public database, claims that are technically accurate but selectively framed, and arguments that are logically structured but strategically misleading. We return INCONCLUSIVE rather than forcing a verdict when evidence is insufficient.
GauntletScore

Start verifying AI-generated documents today.

Get 3 free document analyses. No credit card. No trial period. Just sign up.