For ISVs & Product Builders

The RAG API.
Upload - Search - Reason

Production-ready multimodal RAG. One API call. No GPUs to manage, no parsers to babysit, no vector DB to scale.
REST API
OpenAPI V3
LLM Agnostic
SOC 2 Type 1

HOW IT WORKS

Raw Document → Grounded Answer
LightOnOCR-2 • 1B

Ingest & Parse

Bulk-ingest your document base through a single API call. Sync from SharePoint, Teams, or ServiceNow. The parsing engine scores 83.2 on OlmOCR-Bench — beating every system evaluated, including models 9× its size.
NextPlaid • Rust • CPU-optimized

Chunk & Index

Intelligent splitting preserves context. The multi-vector database indexes each chunk with token-level precision, finding the exact paragraph in a 200-page manual, not just "a related document."
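To make "token-level precision" concrete, here is a minimal sketch of late-interaction (MaxSim) scoring, the technique commonly associated with ColBERT-style multi-vector retrieval. It is an illustration under stated assumptions, not LightOn's implementation: the function names, the chunk ids, and the toy 2-D embeddings are all hypothetical.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product between two token embeddings."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_tokens: List[Vector], chunk_tokens: List[Vector]) -> float:
    """Late-interaction score: each query token keeps only its best-matching
    chunk token, and the per-token maxima are summed."""
    return sum(max(dot(q, d) for d in chunk_tokens) for q in query_tokens)


def rank_chunks(query_tokens: List[Vector], chunks: Dict[str, List[Vector]]) -> List[str]:
    """Return chunk ids ordered from highest to lowest MaxSim score."""
    scores = {cid: maxsim_score(query_tokens, toks) for cid, toks in chunks.items()}
    return sorted(scores, key=lambda cid: scores[cid], reverse=True)


# Toy data (hypothetical): one vector per token instead of one vector per chunk.
query = [[1.0, 0.0], [0.0, 1.0]]
chunks = {
    "manual_p12": [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],  # matches both query tokens exactly
    "manual_p87": [[0.7, 0.7], [0.6, 0.6]],              # only loosely related
}

ranking = rank_chunks(query, chunks)
print(ranking)
```

Because every query token is matched against every chunk token, a chunk that contains the exact terms scores higher than one that is only similar on average, which is the gap a single pooled vector tends to blur.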
LateOn • Multi-vector Retrieval

Retrieve & Rerank

Candidate pages are retrieved, then individually evaluated. Out of Pages 3, 4, and 6: "Page 3: relevant. Page 4: no. Page 6: yes." Only pertinent content reaches the LLM. Noise eliminated.
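The retrieve-then-evaluate step can be sketched as a per-page filter. This is a simplified illustration, not the product's reranker: `filter_pages` and `toy_judge` are hypothetical names, and the keyword check stands in for whatever relevance model actually makes the decision.

```python
import re
from typing import Callable, Dict, List


def filter_pages(
    question: str,
    candidates: Dict[int, str],
    is_relevant: Callable[[str, str], bool],
) -> List[int]:
    """Judge each retrieved page on its own and keep only the relevant ones."""
    return [page for page, text in candidates.items() if is_relevant(question, text)]


def toy_judge(question: str, page_text: str) -> bool:
    """Stand-in relevance judge (assumption): a page is relevant if it contains
    any question word longer than four letters. A real system would use a
    reranking model here."""
    terms = [w for w in re.findall(r"[a-z]+", question.lower()) if len(w) > 4]
    text = page_text.lower()
    return any(term in text for term in terms)


# Hypothetical candidate pages returned by the retrieval step.
candidates = {
    3: "Warranty terms: coverage lasts 24 months from purchase.",
    4: "Installation requires a calibrated torque wrench.",
    6: "Extended warranty options are listed in the table below.",
}

kept = filter_pages("How long does the warranty last?", candidates, toy_judge)
print(kept)
```

Only the pages that pass the judge would be forwarded as context; Page 4 is dropped before it can reach the LLM.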
Any LLM • Grounded Output

Reason & Respond

Filtered context is sent to the LLM of your choice. Pick any model based on your privacy, cost, or performance requirements: open-source, commercial, or private.
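One way to picture "LLM of your choice" is to treat the model as a plain function from prompt to text, so swapping providers does not touch the rest of the pipeline. The sketch below is an assumption for illustration only: the `LLM` type, `build_grounded_prompt`, `answer`, and the stub model are hypothetical and do not reflect the actual LightOn API.

```python
from typing import Callable, List

# Assumption: any model, hosted or local, can be wrapped as prompt -> completion.
LLM = Callable[[str], str]


def build_grounded_prompt(question: str, passages: List[str]) -> str:
    """Assemble a prompt that restricts the model to the filtered context."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


def answer(question: str, passages: List[str], llm: LLM) -> str:
    """Send the grounded prompt to whichever model was plugged in."""
    return llm(build_grounded_prompt(question, passages))


def stub_llm(prompt: str) -> str:
    """Placeholder model used only to show the wiring; it makes no real call."""
    return f"(stub) received a prompt with {prompt.count('[')} passages"


passages = [
    "Warranty terms: coverage lasts 24 months from purchase.",
    "Extended warranty options are listed in the table below.",
]
reply = answer("How long does the warranty last?", passages, stub_llm)
print(reply)
```

Replacing `stub_llm` with a wrapper around an open-source, commercial, or private model changes one argument and nothing else.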
The hardest part isn't building a RAG pipeline. It's keeping it current. We handle every upgrade, every migration, every month. Your integration stays stable. Your product stays ahead.

Buy vs. Build

You've Built Internal RAG
Now Ship It at Scale.
Parser maintenance. Model updates. Chunking edge cases. Vector DB scaling. That's infrastructure work, not product work.
Without LightOn API

Your current reality

Ongoing Maintenance
Parsers, OCR, chunking logic...
Custom ACL Implementation
Synced with your IAM
GPU Scaling
And inference optimization
Model Upgrades
Regression testing
With LightOn API

Your potential reality

Universal Ingestion Endpoint
PDF, Office, scans, HTML. We handle extraction and semantic chunking.
Native Workspace & Collection Isolation
SSO/SAML integration, audit logs out of the box.
Managed Pipeline
Optimized for low-footprint deployment (on-prem or private cloud).
Versioned API V3
Backward compatibility, controlled rollout.

Why LightOn API

What You Get That
You Can't Build or Buy Elsewhere
| Features | DIY RAG | Legacy Search | Cloud RAG | LightOn API |
|---|---|---|---|---|
| Complex tables & diagrams | Breaks | Text only | Basic | OCR-2 native |
| Search precision | Single vector | Keywords | Single vector | Multi-vector |
| On-premise / air-gapped | Complex DIY | Possible | Cloud only | 3 GPUs |
| ACL enforcement | DIY | Partial | Varies | Doc-level |
| LLM flexibility | Locked | N/A | Vendor | Agnostic |
| Time to production | 6-12 months | 2-3 months | 1-2 months | Now |

Built on Open Research.
Tested and trusted by the community.
37,186,936 model downloads on Hugging Face
3,100 Hugging Face likes
ModernBERT, LightOnOCR-2, GTE-ModernColBERT, Alfred-40B, LateOn-code, ColBERT-Zero, OriOn, ...
1,861 GitHub stars
109,064 PyPI downloads
PyLate, NextPlaid, FastPlaid, ColGrep, PyLate-rs, ...