Research And Development
R&D Overview
.avif)
Advancing Generative AI through Innovation
The R&D team at LightOn plays a pivotal role in advancing the field of generative AI through continuous innovation and development. Their expertise spans across creating and fine-tuning large language models (LLMs) that form the backbone of the Paradigm platform, a comprehensive AI solution designed for enterprise use. This platform simplifies the integration of generative AI into business workflows, offering both on-premise and cloud options to ensure flexibility and scalability for various business needs.
r&d publicationsRecent R&D Posts

LightOn Releases GTE-ModernColBERT, First State-of-the-Art Late-Interaction Model Trained on PyLate!
LightOn is proud to announce the release of GTE-ModernColBERT, our new state-of-the-art, open-source, multi-vector retrieval model. By leveraging ModernBERT architecture and our innovative PyLate library, we've created a solution that sets a new milestone in the field and addresses the complex challenges of modern enterprise information retrieval.
CTA Title
Lorem Ipsum

Finally, a Replacement for BERT
This blog post introduces ModernBERT, a family of state-of-the-art encoder-only models representing improvements over older generation encoders across the board.
CTA Title
Lorem Ipsum
.png)
MonoQwen-Vision, the first visual document reranker
We introduce MonoQwen2-VL-v0.1, the first visual document reranker to enhance the quality of the retrieved visual documents and take these pipelines to the next level. Reranking a small number of candidates with MonoQwen2-VL-v0.1 achieve top results on the ViDoRe leaderboard.
CTA Title
Lorem Ipsum

FC-AMF-OCR Dataset : LightOn releases a 9.3 million images OCR dataset to improve real world document parsing
With over 9.3 million annotated images, this dataset offers researchers and AI developers a valuable resource for creating models adapted to real world documents.
CTA Title
Lorem Ipsum

PyLate: Flexible Training and Retrieval for ColBERT Models
We release PyLate, a new user-friendly library for training and experimenting with ColBERT models, a family of models that exhibit strong retrieval capabilities on out-of-domain data.
CTA Title
Lorem Ipsum
CTA Title
Lorem Ipsum
%20(2).avif)
Transforming LLMs into Agents for Enterprise Automation
Developing Agentic Capabilities for LLMs to automate business workflows and create smart assistants.
CTA Title
Lorem Ipsum
Explore Publications by LightOn

.avif)