TL;DR
"Production-ready multi-vector retrieval just got real-time updates: FastPlaid v1.10.0 brings dynamic indexing to RAG and search pipelines"
At LightOn, we believe that late interaction models will prevail in AI-based search engines. As a result, we are introducing FastPlaid, a high-performance engine built from scratch in Rust and optimized for GPUs.
FastPlaid brings Stanford’s PLAID accuracy with a dramatic performance boost:
- Lightning-fast: +554% queries per second (QPS), or roughly 6.5x the throughput, compared to Stanford PLAID multi-vector search.
- Purpose-built: Designed as the “Faiss for multi-vector models”, making similarity search faster and more practical than ever.
With FastPlaid, multi-vector retrieval systems finally have a production-ready engine that can keep up with the demands of RAG pipelines, recommender systems, and enterprise search.
Try it: Explore FastPlaid on GitHub
What’s New in FastPlaid 1.10.0
Today LightOn releases FastPlaid v1.10.0, adding a major new capability: mutable indexes.
Until now, FastPlaid delivered exceptional performance for static indexes, where data is pre-computed and periodically reloaded, a strong fit for large-scale recommender systems or retrieval pipelines that are rebuilt nightly.
With this release, FastPlaid extends its reach to more dynamic environments. Users can now:
- Insert new documents into an existing index without a full rebuild
- Expand coverage seamlessly for evolving datasets
- Enable real-time knowledge management, supporting workflows where information is continuously updated
This update strengthens FastPlaid’s position as the backbone for high-throughput, multi-vector retrieval in both stable and fast-changing enterprise contexts.
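To make the idea of a mutable multi-vector index concrete, here is a minimal, brute-force sketch in pure Python. It is not FastPlaid's API or algorithm: the `ToyMultiVectorIndex` class and its `add` and `search` methods are hypothetical names invented for illustration, and the real engine relies on PLAID-style indexing in Rust on GPUs rather than an exhaustive scan. The sketch only shows the late-interaction (MaxSim) score and what "insert without a full rebuild" means from a caller's point of view.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product between two token vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim(query: Sequence[Vector], doc: Sequence[Vector]) -> float:
    """Late-interaction score: for each query token vector, keep the
    best-matching document token vector, then sum over query tokens."""
    return sum(max(dot(q, d) for d in doc) for q in query)


class ToyMultiVectorIndex:
    """Hypothetical brute-force index (an assumption for illustration,
    NOT FastPlaid's API). Each document is a list of token vectors."""

    def __init__(self) -> None:
        self._docs: List[List[Vector]] = []

    def add(self, docs: Sequence[Sequence[Vector]]) -> List[int]:
        """Append documents to the existing index; nothing already
        stored is recomputed. Returns the ids of the new documents."""
        start = len(self._docs)
        self._docs.extend(list(d) for d in docs)
        return list(range(start, len(self._docs)))

    def search(self, query: Sequence[Vector], top_k: int = 3) -> List[Tuple[int, float]]:
        """Score every document with MaxSim and return the top_k (id, score) pairs."""
        scored = [(i, maxsim(query, d)) for i, d in enumerate(self._docs)]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


index = ToyMultiVectorIndex()
index.add([
    [(1.0, 0.0), (0.0, 1.0)],  # document 0: two token vectors
    [(0.5, 0.5)],              # document 1: one token vector
])

query = [(1.0, 0.0), (1.0, 0.0)]  # two query token vectors
before = index.search(query, top_k=1)
print(before)  # [(0, 2.0)]

# Insert a new document into the live index, then search again.
new_ids = index.add([[(2.0, 0.0)]])
after = index.search(query, top_k=1)
print(new_ids, after)  # [2] [(2, 4.0)]
```

The newly inserted document becomes searchable immediately, and the ids of earlier documents stay stable. A production engine has to provide the same behavior on top of compressed, clustered index structures, which is what makes supporting updates non-trivial.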
As Raphael Sourty, ML Engineer at LightOn and the driving force behind FastPlaid, explains:
"Plenty of use cases can be solved with static indexes, including large-scale recommenders or nightly retrieval systems. But having an updatable index definitely increases the coverage of FastPlaid projects, and will attract even more users."
Try it: Check out the v1.10.0 release
Why It Matters
The ability to update indexes on the fly transforms FastPlaid from a research-friendly tool into a production-grade engine for applied AI. Real-world benefits include:
- Enterprise RAG pipelines that stay up to date with evolving documentation
- Recommender systems that adapt to new items instantly
- Dynamic search engines that can grow without downtime
FastPlaid is open-source and has gathered 140+ stars on GitHub (as of August 2025) since its launch on June 6, 2025. Whether you are building enterprise search, AI assistants, or large-scale recommenders, FastPlaid makes multi-vector retrieval fast, scalable, and production-ready.
Any contribution to the project on GitHub is welcome!
- Get started: FastPlaid on GitHub
- Explore: PyLate
- Learn more: LightOn
LightOn is committed to making late interaction retrieval architectures not just a research concept, but a practical foundation for AI at scale.