ICDAR 2026 • DocVQA Competition

Technical Report: DCA at DocVQA 2026

Welf Wustlich, CTO, PLANET AI


June 2026 · 19 pages

60.0%
Accuracy on the DocVQA 2026 leaderboard
~20 pp
Above the best mixture-of-experts baseline (~37.5%)
8
Document domains, from business reports to comics

PLANET AI Wins DocVQA 2026: Architecture Beats Model Size

DocVQA 2026 is the most demanding document understanding benchmark in the field. Organized by the Computer Vision Center at Universitat Autònoma de Barcelona, it requires deep reasoning across eight entirely different document types. No single model handles all eight well.

PLANET AI won, not with a larger model but with a different architecture. The Distributed Cognitive Architecture (DCA) coordinates multiple foundation models as a cooperative team, grounded by IDA as a deterministic OCR layer. The result: 60.0% accuracy, against roughly 40% for the best frontier-model configuration.

What the report covers

  • Architecture and methodology: How IDA, four VLMs, and a reasoning agent work as a coordinated system

  • Reader agent battlecard: Which model leads in which document domain and why, including where each one fails

  • Breakdown of the +20 pp gain: ~+7 pp attributed to IDA, ~+13 pp to DCA orchestration

  • Hallucination mitigation in the ensemble: How cross-perspective conflict detection surfaces errors before they reach the answer
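
To make the last point concrete, here is a minimal sketch of what cross-perspective conflict detection could look like. It is not taken from the report. The `ReaderAnswer` type, the `detect_conflict` function, the reader names, the sample answers, and the 0.8 agreement threshold are all hypothetical, and simple vote counting stands in for whatever mechanism DCA actually uses.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ReaderAnswer:
    reader: str  # hypothetical reader-agent name
    answer: str  # the answer that reader proposed


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different strings compare equal."""
    return " ".join(text.lower().split())


def detect_conflict(answers: list[ReaderAnswer], min_agreement: float = 0.8):
    """Return (majority_answer, agreement_ratio, conflict_flag).

    A conflict is flagged when the share of readers backing the most common
    answer falls below min_agreement (an assumed threshold, not a DCA value).
    """
    if not answers:
        raise ValueError("need at least one reader answer")
    counts = Counter(normalize(a.answer) for a in answers)
    top_answer, top_votes = counts.most_common(1)[0]
    agreement = top_votes / len(answers)
    return top_answer, agreement, agreement < min_agreement


# Illustrative data only: four readers, one of which misreads a decimal point.
answers = [
    ReaderAnswer("reader_a", "4.2 million"),
    ReaderAnswer("reader_b", "4.2  Million"),
    ReaderAnswer("reader_c", "42 million"),
    ReaderAnswer("reader_d", "4.2 million"),
]
top, agreement, conflict = detect_conflict(answers)
print(top, agreement, conflict)
```

In this sketch, a flagged conflict would be the signal to re-check the disputed value against a deterministic source (such as an OCR layer) before an answer is returned, rather than silently accepting the majority.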
