Research and Development
Advancing Generative AI through Innovation
The R&D team at LightOn plays a pivotal role in advancing the field of generative AI through continuous innovation. Its expertise spans creating and fine-tuning the large language models (LLMs) that form the backbone of the Paradigm platform, a comprehensive AI solution designed for enterprise use. The platform simplifies the integration of generative AI into business workflows and offers both on-premise and cloud deployment to ensure flexibility and scalability for a range of business needs.
Recent R&D Posts

LightOn’s Multi-Vector Retrieval Revolution: From Research to Production
Discover how LightOn’s late-interaction stack (ModernBERT, PyLate, and FastPlaid) is transforming semantic search and AI retrieval from academic theory into real-world production systems.

FastPlaid: Bringing Multi-Vector Search to Production Scale
FastPlaid is LightOn’s open-source Rust engine for late-interaction retrieval. Version 1.10.0 adds incrementally updatable indexes, 6.5× faster than Stanford PLAID, so your RAG, recommender, or search pipeline can evolve in real time without downtime.
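As a rough sketch of what a late-interaction index lifecycle looks like, the snippet below builds an index from per-document token embeddings and then searches it. The `FastPlaid`, `create`, and `search` names and their parameters are assumptions modelled on typical multi-vector index APIs rather than verbatim FastPlaid calls; check the FastPlaid README for the exact interface, including the incremental-update methods added in 1.10.0.

```python
# Hypothetical sketch of a multi-vector (late-interaction) index lifecycle.
# Module, class and method names below are assumptions, not verbatim FastPlaid
# API; consult the FastPlaid documentation for the real calls.
import torch
from fast_plaid import search  # assumed module layout

# One embedding matrix per document: (num_tokens, embedding_dim).
documents_embeddings = [torch.randn(300, 128) for _ in range(1_000)]

index = search.FastPlaid(index="my_index")                # assumed constructor
index.create(documents_embeddings=documents_embeddings)   # build the index

# Queries are also bags of token vectors: (num_queries, query_tokens, dim).
queries_embeddings = torch.randn(8, 32, 128)
results = index.search(queries_embeddings=queries_embeddings, top_k=10)
print(results)  # expected: per-query lists of (document_id, score) pairs
```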

Introducing Ettin Suite: the SoTA open recipe to outperform existing Generative & Retrieval Models
Introducing Ettin, the first state-of-the-art suite of paired encoder and decoder models, developed by Johns Hopkins University in collaboration with LightOn.

PyLate-rs: a lightweight tool to compute lightning-fast embeddings
PyLate-rs is a high-performance inference engine for PyLate models, meticulously crafted in Rust for optimal speed and efficiency.
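Under the hood, "computing embeddings" for a late-interaction model means producing one vector per token and scoring with MaxSim: each query token keeps its best match among the document tokens, and those maxima are summed. The minimal PyTorch sketch below shows that scoring step on random tensors (shapes and names are illustrative only); this, together with the encoder forward pass, is the workload PyLate-rs packages in Rust.

```python
# Minimal MaxSim (late-interaction) scoring sketch on random tensors.
# Shapes are illustrative; this is the operation PyLate-rs is built to speed up.
import torch

query_embeddings = torch.nn.functional.normalize(torch.randn(32, 128), dim=-1)      # (query_tokens, dim)
document_embeddings = torch.nn.functional.normalize(torch.randn(300, 128), dim=-1)  # (doc_tokens, dim)

# Token-to-token cosine similarities: (query_tokens, doc_tokens).
similarities = query_embeddings @ document_embeddings.T

# MaxSim: best document token per query token, summed over query tokens.
score = similarities.max(dim=1).values.sum()
print(f"late-interaction relevance score: {score.item():.3f}")
```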

Announcing BioClinical ModernBERT: a new SOTA encoder model for Medical NLP
ModernBERT, recently released by LightOn and AnswerAI, aims to provide the best base encoder for adaptation to different industry verticals. Today, Thomas Sounack from the Dana-Farber Cancer Institute, in collaboration with researchers at Harvard University, LightOn, MIT, McGill University, Albany Medical College and Microsoft Research, built on that foundation to train a new state-of-the-art (SOTA) medical encoder named BioClinical ModernBERT.

LightOn Unlocks Agentic RAG with new SOTA Model Reason-ModernColBERT
Following the recent release of GTE-ModernColBERT, which achieved state-of-the-art results on long-context retrieval, LightOn announces a major leap forward in AI-driven knowledge discovery: Reason-ModernColBERT, an open-source multi-vector model purpose-built for Deep Research applications.

LightOn Releases GTE-ModernColBERT, First State-of-the-Art Late-Interaction Model Trained on PyLate!
LightOn is proud to announce the release of GTE-ModernColBERT, our new state-of-the-art, open-source, multi-vector retrieval model. By leveraging the ModernBERT architecture and our PyLate library, we've created a model that sets a new milestone in the field and addresses the complex challenges of modern enterprise information retrieval.
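As an illustration, here is a minimal indexing-and-retrieval sketch based on PyLate's documented ColBERT/Voyager interface; argument names and defaults may differ between PyLate versions, so treat it as a starting point rather than a drop-in recipe.

```python
# Minimal multi-vector retrieval sketch with PyLate and GTE-ModernColBERT.
# Based on PyLate's documented interface; check the PyLate docs for your version.
from pylate import indexes, models, retrieve

model = models.ColBERT(model_name_or_path="lightonai/GTE-ModernColBERT-v1")
index = indexes.Voyager(index_folder="pylate-index", index_name="demo", override=True)

documents_ids = ["doc-1", "doc-2"]
documents = [
    "Paradigm is LightOn's enterprise generative AI platform.",
    "ColBERT-style models keep one embedding per token for late interaction.",
]

# Encode documents into per-token embeddings and add them to the index.
documents_embeddings = model.encode(documents, is_query=False)
index.add_documents(documents_ids=documents_ids, documents_embeddings=documents_embeddings)

# Encode queries and retrieve the top-k documents.
retriever = retrieve.ColBERT(index=index)
queries_embeddings = model.encode(["What is Paradigm?"], is_query=True)
print(retriever.retrieve(queries_embeddings=queries_embeddings, k=2))
```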

ModernBERT: Finally, a Replacement for BERT
This blog post introduces ModernBERT, a family of state-of-the-art encoder-only models that improve on previous-generation encoders across the board.
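Because ModernBERT is a drop-in encoder, it can be tried with a standard masked-language-modelling pipeline in a few lines. The sketch below assumes a recent transformers release with ModernBERT support and uses the publicly released answerdotai/ModernBERT-base checkpoint.

```python
# Quick masked-language-modelling check with ModernBERT.
# Requires a transformers version that includes ModernBERT support.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="answerdotai/ModernBERT-base")

for prediction in fill_mask("The capital of France is [MASK]."):
    print(f"{prediction['token_str']:>12}  {prediction['score']:.3f}")
```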
MonoQwen-Vision, the first visual document reranker
We introduce MonoQwen2-VL-v0.1, the first visual document reranker, built to enhance the quality of retrieved visual documents and take visual retrieval pipelines to the next level. Reranking a small number of candidates with MonoQwen2-VL-v0.1 achieves top results on the ViDoRe leaderboard.
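A rough sketch of how such a reranker can be scored is shown below: the model is prompted with a page image and a query, asked to answer True or False, and the probability of the "True" token is used as the relevance score. It assumes MonoQwen2-VL-v0.1 loads as a standard Qwen2-VL checkpoint in transformers; the exact prompt wording and repository IDs should be taken from the model card.

```python
# Rough sketch: score one (query, page image) pair with a True/False prompt.
# Assumes the checkpoint loads as a regular Qwen2-VL model in transformers;
# verify prompt wording and repository IDs against the MonoQwen2-VL model card.
import torch
from PIL import Image
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-2B-Instruct")
model = Qwen2VLForConditionalGeneration.from_pretrained(
    "lightonai/MonoQwen2-VL-v0.1", torch_dtype="auto", device_map="auto"
)

query = "What was the 2023 revenue?"
page = Image.open("report_page.png")  # a retrieved document page image

messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": f"Is this document relevant to the query? "
                                 f"Answer True or False. Query: {query}"},
    ],
}]
prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[page], return_tensors="pt").to(model.device)

with torch.no_grad():
    logits = model(**inputs).logits[:, -1, :]  # logits for the next token

true_id = processor.tokenizer.convert_tokens_to_ids("True")
false_id = processor.tokenizer.convert_tokens_to_ids("False")
relevance = torch.softmax(logits[:, [true_id, false_id]], dim=-1)[0, 0].item()
print(f"P(relevant) = {relevance:.3f}")
```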
Explore Publications by LightOn
