Research And Development

Advancing Generative AI through Innovation

The R&D team at LightOn advances generative AI through continuous innovation and development. Its expertise spans creating and fine-tuning the large language models (LLMs) that form the backbone of the Paradigm platform, a comprehensive AI solution designed for enterprise use. The platform simplifies the integration of generative AI into business workflows and offers both on-premise and cloud options, giving businesses flexibility and scalability.

Recent R&D Posts

Adaptive Chunking: Reasoning Starts Before the LLM Sees a Token

Document-aware chunking selection for production RAG systems

May 19, 2026
research

Deep Research is now Open

Agent-ModernColBERT adds ~10% over Reason-ModernColBERT on BrowseComp-Plus, stays at 149M parameters, and brings GPT-5 + Qwen3-8B-level retrieval performance to a fully open stack.

May 12, 2026
research

🔴 The Retriever You Actually Need

Introducing LateOn and DenseOn, two Apache 2.0 retrievers: SOTA on BEIR, built to generalize.

April 21, 2026
research

Document Intelligence at First Sight

OriOn-Qwen-SR1: Fast Implicit Reasoning for Long Documents

April 9, 2026
research

Open-Source LightOnOCR-2 Just Outscored Claude, GPT-5, Qwen3, Mistral and Mathpix at Table Extraction

The most valuable information in enterprise documents doesn't live in paragraphs. It lives in tables.

April 7, 2026
research

The Bloated Retriever Era Is Over

Reason-ModernColBERT tops BrowseComp-Plus, the most rigorous agentic search benchmark, across every metric, with 54× fewer parameters and fewer search calls.

March 19, 2026
research

NOVA: A Guide to Actually Measuring How Your Agent Works on Your Data

Numbers Over Vibes: Building a RAG Evaluation Framework That Actually Works

March 11, 2026
research

Day Zero of Multi-Vector Retrieval

Introducing ColBERT-Zero: late interaction model trained from scratch with PyLate

February 19, 2026
research

Introducing OriOn: the SOTA Long-Context Engine That Powers Agentic Search & Reason

Agentic AI starts with retrieval. It scales with long context.

February 18, 2026
research
