FastPlaid: Bringing Multi-Vector Search to Production Scale

FastPlaid is LightOn’s open-source Rust engine for late-interaction retrieval, delivering roughly 6.5× the query throughput of Stanford PLAID. Version 1.10.0 adds incrementally updatable indexes, so your RAG, recommender, or search pipeline can evolve in real time without downtime.

August 14, 2025

TL;DR

"Production-ready multi-vector retrieval just got real-time updates—FastPlaid v1.10.0 brings dynamic indexing to RAG and search pipelines"

At LightOn, we believe that late-interaction models will prevail in AI-based search engines. That is why we built FastPlaid, a high-performance engine written from scratch in Rust and optimized for GPUs.

FastPlaid matches the accuracy of Stanford’s PLAID while delivering a dramatic performance boost:

Lightning-fast: +554% queries per second (QPS) compared to Stanford PLAID multi-vector search, roughly 6.5× the throughput.
Purpose-built: Designed as the “Faiss for multi-vector models”, making similarity search faster and more practical than ever.
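
The +554% QPS figure and the "6.5× faster" headline describe the same measurement in two ways. A quick check of the arithmetic, where the 100 QPS baseline is a hypothetical number chosen only for illustration and not a benchmark result:

```python
def speedup_from_percent_gain(percent_gain: float) -> float:
    """Convert a relative gain such as +554% into a speedup multiplier."""
    return 1.0 + percent_gain / 100.0


# Hypothetical baseline throughput, for illustration only (not a measured value).
baseline_qps = 100.0

speedup = speedup_from_percent_gain(554.0)
print(f"speedup: {speedup:.2f}x")
print(f"throughput at that speedup: {baseline_qps * speedup:.0f} QPS")
```

A gain of +554% means the new throughput is the baseline plus 5.54 times the baseline, which is why it rounds to 6.5× rather than 5.5×.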

With FastPlaid, multi-vector retrieval systems finally have a production-ready engine that can keep up with the demands of RAG pipelines, recommender systems, and enterprise search.

Try it: Explore FastPlaid on GitHub

What’s New in FastPlaid 1.10.0

Today LightOn releases FastPlaid v1.10.0, adding a major new capability: mutable indexes.

Until now, FastPlaid delivered exceptional performance for static indexes, where data is pre-computed and periodically reloaded. That model is a strong fit for large-scale recommender systems and retrieval pipelines that are rebuilt nightly.

With this release, FastPlaid extends its reach to more dynamic environments. Users can now:

Insert new documents into an existing index without a full rebuild
Expand coverage seamlessly for evolving datasets
Enable real-time knowledge management, supporting workflows where information is continuously updated
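
To make the idea of a mutable multi-vector index concrete, here is a minimal pure-Python sketch. It is not FastPlaid's API or implementation: `ToyMultiVectorIndex`, `add`, and `search` are hypothetical names, and the search is brute force, whereas PLAID-style engines rely on clustering and compression. It only illustrates late-interaction (MaxSim) scoring and appending documents to a live index without rebuilding what is already there.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def maxsim(query: Sequence[Vector], doc: Sequence[Vector]) -> float:
    """Late-interaction score: for each query token vector, take the best
    matching document token vector, then sum those maxima."""
    return sum(max(dot(q, d) for d in doc) for q in query)


class ToyMultiVectorIndex:
    """Hypothetical brute-force index, for illustration only."""

    def __init__(self) -> None:
        self._docs: Dict[int, List[Vector]] = {}
        self._next_id = 0

    def add(self, documents: Sequence[Sequence[Vector]]) -> List[int]:
        """Append documents and return their ids.
        Existing entries are left untouched, so nothing is rebuilt."""
        ids = []
        for doc in documents:
            self._docs[self._next_id] = list(doc)
            ids.append(self._next_id)
            self._next_id += 1
        return ids

    def search(self, query: Sequence[Vector], top_k: int = 3) -> List[Tuple[int, float]]:
        """Score every stored document against the query and return the best top_k."""
        scored = [(doc_id, maxsim(query, doc)) for doc_id, doc in self._docs.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


# Build an initial index with two documents (each is a list of token vectors).
index = ToyMultiVectorIndex()
index.add([
    [(1.0, 0.0)],              # document 0
    [(0.6, 0.8), (0.8, 0.6)],  # document 1
])

query = [(0.0, 1.0)]
before = index.search(query, top_k=1)

# Insert a new document into the existing index; no rebuild is needed.
new_ids = index.add([[(0.0, 1.0), (1.0, 0.0)]])
after = index.search(query, top_k=1)

print("best match before insert:", before[0][0])
print("best match after insert:", after[0][0])
```

The newly inserted document becomes searchable immediately, which is the behavior the list above describes. A production engine has to achieve the same effect while keeping its compressed index structures consistent.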

This update strengthens FastPlaid’s position as the backbone for high-throughput, multi-vector retrieval in both stable and fast-changing enterprise contexts.

As Raphael Sourty, ML Engineer at LightOn and the driving force behind FastPlaid, explains:

“Plenty of use cases can be solved with static indexes, including large-scale recommenders or nightly retrieval systems. But having an updatable index definitely increases the coverage of FastPlaid projects, and will attract even more users.”

Try it: Check out the v1.10.0 release

Why It Matters

The ability to update indexes on the fly transforms FastPlaid from a research-friendly tool into a production-grade engine for applied AI. Real-world benefits include:

Enterprise RAG pipelines that stay up to date with evolving documentation
Recommender systems that adapt to new items instantly
Dynamic search engines that can grow without downtime

FastPlaid is open source and has already gained 140+ stars on GitHub (as of August 2025) since its inception on June 6, 2025. Whether you are building enterprise search, AI assistants, or large-scale recommenders, FastPlaid makes multi-vector retrieval fast, scalable, and production-ready.

Any contribution to the project on GitHub is welcome!

LightOn is committed to making late interaction retrieval architectures not just a research concept, but a practical foundation for AI at scale.
