Blog
🔵 LightOn opens a new field for AI with LightOnOCR-2: Document Intelligence
LightOn integrates LightOnOCR-2 into Paradigm, a technology that surpasses the global state of the art.



Deep Research is now Open
Agent-ModernColBERT adds ~10% over Reason-ModernColBERT on BrowseComp-Plus, stays at 149M parameters, and brings GPT-5 + Qwen3-8B-level retrieval performance to a fully open stack.


EDiTh: Sharing the Data Nobody Shares
Meet EDiTh, a synthetic enterprise corpus built to benchmark what public datasets can't touch: 1,004 PDFs in six languages and 36 ground-truth use cases.


Your RAG Pipeline Is Eating Your Roadmap
The production gap, the retrieval trap, and the way out.


The Retriever You Actually Need
Introducing LateOn and DenseOn, two Apache 2.0 retrievers: SOTA on BEIR, built to generalize.


Document Intelligence at First Sight
OriOn-Qwen-SR1: Fast Implicit Reasoning for Long Documents


Open-Source LightOnOCR-2 Just Outscored Claude, GPT-5, Qwen3, Mistral and Mathpix at Table Extraction
The most valuable information in enterprise documents doesn't live in paragraphs. It lives in tables.



