The retrieval API for the agentic AI era

Parsing, extraction, and retrieval with citations. One API key. Free tier on Console. Per-page pricing when you scale.
# 4 lines to structured Markdown
curl https://api.lighton.ai/v1/parse \
    -H "Authorization: Bearer $LIGHTON_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{"document":"https://console-examples.lighton.ai/AFD-091005-062.pdf"}'
Three endpoints.
Zero pipeline maintenance.

Parse a document. Extract any field. Retrieve with citations.
The same API key on Console, the same SDK on Enterprise.

LightOnOCR-2.
State-of-the-art parsing.

Turns scans, tables, handwriting, and multi-column layouts into structured Markdown. 20+ languages natively. The parsing engine behind every retrieval workflow.
83.2 on olmOCR-Bench (SOTA)
€0.002 per page
20+ native languages
Open weights on HuggingFace
$ curl https://api.lighton.ai/v1/parse \
    -H "Authorization: Bearer $LIGHTON_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{"document":"https://console-examples.lighton.ai/AFD-091005-062.pdf"}'
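The same call can be made from any HTTP client. Below is a minimal Python sketch using only the standard library, assuming the endpoint URL and JSON body shown in the curl command above. The helper names `build_parse_request` and `parse_document` are illustrative, and the response shape is not described on this page, so the sketch simply returns whatever JSON comes back.

```python
import json
import os
import urllib.request

API_BASE = "https://api.lighton.ai/v1"  # taken from the curl example above


def build_parse_request(document_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /parse endpoint."""
    body = json.dumps({"document": document_url}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/parse",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_document(document_url: str) -> dict:
    """Send the request. Assumption: the response body is a JSON object."""
    req = build_parse_request(document_url, os.environ["LIGHTON_API_KEY"])
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)
```

Keeping request construction separate from sending makes the headers and body easy to inspect before any network call is made.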

Define the schema.
Get JSON back.

Pull any field, entity, or key-value pair you care about. Invoice numbers, lease end dates, claim IDs, contract clauses. You define the schema; LightOn returns structured JSON.
JSON Schema in, structured JSON out
€0.004 per page
Async via webhooks
Cited per field
$ curl https://api.lighton.ai/v1/extract \
    -H "Authorization: Bearer $LIGHTON_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{"document":"https://console-examples.lighton.ai/AFD-091005-062.pdf","schema":"<YOUR_JSON_SCHEMA>"}'
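To make "define the schema" concrete, here is a hedged Python sketch of a JSON Schema for an invoice and the request body that would carry it. The field names are hypothetical examples, not LightOn's. Whether the `schema` key takes an inline JSON Schema object (as assumed here) or a string is not specified on this page.

```python
import json

# Hypothetical schema: field names are illustrative only.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "issue_date": {"type": "string", "format": "date"},
        "total_amount": {"type": "number"},
    },
    "required": ["invoice_number"],
}


def build_extract_payload(document_url: str, schema: dict) -> str:
    """Serialize the JSON body for POST /v1/extract.

    Assumption: the "schema" key accepts a JSON Schema object inline.
    """
    return json.dumps({"document": document_url, "schema": schema})
```

The resulting string can be passed to curl's `-d` flag or sent as the body of any HTTP POST.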

Grounded retrieval
with citations.

One query, three signals: dense, sparse, late-interaction. The index picks the right signal, not the developer. Every result ships with the source passage that produced it. Built on LateOn and NextPlaid, our open-source ColBERT family.
Multi-vector: dense + sparse + late interaction
€0.006 per query
<200 ms P50 latency
ACL at chunk level
$ curl https://api.lighton.ai/v1/search \
    -H "Authorization: Bearer $LIGHTON_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{"query":"<YOUR_QUERY>"}'
Try it in your browser.
No install needed.

Every endpoint is testable directly from Console.
Drop a file, get the response, copy the code into your project.

console.lighton.ai
Start free.
Pay per page and per query.

No seat licenses. No commitment. The free tier covers most prototypes.
Per-page pricing kicks in only when you scale.

STARTER
For developers and builders.
Free + PAYG
Pure usage-based, no commitment.
Includes
Unlimited workspaces
Native data source connectors
Data governance & access controls (RBAC)
SSO (SAML, OIDC, LDAP) & SCIM provisioning
Audit logs & usage analytics
Bring Your Own LLM or LightOn LLMaaS
EU sovereign hosting
BUSINESS
For teams shipping to production.
€149/mo + PAYG
Annual license, Business SLAs.
Everything in Starter, plus
99.5% monthly availability SLA
Business support, business hours (CET)
100 GB document storage included
Unlimited users — no per-seat fees
ENTERPRISE
For large organizations and the public sector.
Custom
Tailored deployment, Enterprise SLAs and white-glove support.
Everything in Business, plus
Dedicated environment: cloud, VPC, on-prem
SecNumCloud / regional options (MENA, APAC, US)
Managed deployment & setup
Custom data source connectors (1 included)
Dedicated CSM + Enterprise SLAs
Volume discounts
Optional custom AI engineering & integration
Need VPC, on-prem, or air-gapped deployment? See Enterprise deployment options
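The unit prices listed on this page (€0.002 per parsed page, €0.004 per extracted page, €0.006 per query) make usage cost easy to estimate. The Python sketch below does that arithmetic. It is only an illustration: it ignores any free-tier allowance, volume discount, or plan fee, none of which are quantified here, and the function name is hypothetical.

```python
from decimal import Decimal

# Unit prices in EUR as listed on this page.
PARSE_PER_PAGE = Decimal("0.002")
EXTRACT_PER_PAGE = Decimal("0.004")
SEARCH_PER_QUERY = Decimal("0.006")


def estimate_usage_cost(parsed_pages: int, extracted_pages: int, queries: int) -> Decimal:
    """Pay-as-you-go usage cost in EUR, ignoring free-tier allowances and discounts."""
    return (
        parsed_pages * PARSE_PER_PAGE
        + extracted_pages * EXTRACT_PER_PAGE
        + queries * SEARCH_PER_QUERY
    )
```

For example, 10,000 parsed pages, 2,000 extracted pages, and 5,000 queries come to 20 + 8 + 30 = €58 of usage under these assumptions.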
RAG was built for chatbots.
LightOn is built for agents.

Agents do not ask nicely. They dump raw PDFs, garbled tables, and off-domain queries into the same thread. The retrieval layer has to handle the input it gets, not the input you wish you had.

Retrieval

Hybrid retrieval

Dense, lexical, and late-interaction signals on one query. The index picks the right signal, not the developer. Built on LateOn and NextPlaid, our open-source ColBERT family.
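For readers unfamiliar with late interaction, the sketch below shows the general ColBERT-style MaxSim scoring rule in plain Python. It illustrates the technique only. How LateOn or NextPlaid implement it, and how the index weighs it against the dense and lexical signals, is not described on this page. The two-dimensional vectors are toy values.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    """ColBERT-style late interaction: for each query token embedding, take the
    similarity of its best-matching document token, then sum those maxima."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0]]  # covers both query tokens
doc_b = [[1.0, 0.0], [1.0, 0.0]]  # covers only the first query token
ranked = sorted([("a", doc_a), ("b", doc_b)], key=lambda p: maxsim_score(query, p[1]), reverse=True)
```

Because every query token is matched independently, a document that covers all of the query's terms outranks one that matches a single term strongly.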
Trust

Grounded by default

Every answer ships with the exact passage that supports it. Retrieval and reasoning are separable. Auditable by design, not retrofitted.
Infra

LLM-agnostic

Bring your own model. Open-source, commercial, or private. No lock-in on the inference layer. Your security policy dictates where inference happens.
Protocol

MCP-native

Drop LightOn into any agent that speaks Model Context Protocol. Single agent, multi-agent system, or business application integration. Same API.
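Model Context Protocol messages are JSON-RPC 2.0, and a client invokes a server tool with the `tools/call` method. The Python sketch below builds such a message. The tool name `search` and its `query` argument are assumptions made for illustration. The names a LightOn MCP server actually exposes would come from its `tools/list` response.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request (JSON-RPC 2.0)."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": tool_name, "arguments": arguments},
        }
    )


# Hypothetical tool name and argument, for illustration only.
message = build_tool_call(1, "search", {"query": "lease end date"})
```

In practice an MCP client library handles this framing, so the agent code only supplies the tool name and arguments.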
Scope

Workspaces and ACLs at chunk level

Multi-agent systems need scoped corpora. Each agent gets its own workspace, its own collections, its own permissions. Access control is enforced at the chunk, not at the document. An agent never sees a single token it should not see.
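The difference between chunk-level and document-level access control can be sketched in a few lines of Python. This is a conceptual illustration, not LightOn's implementation: the `Chunk` type, the `allowed_groups` field, and the group names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    text: str
    allowed_groups: frozenset[str]  # hypothetical per-chunk ACL


def visible_chunks(chunks: list[Chunk], agent_groups: set[str]) -> list[Chunk]:
    """Keep only chunks whose ACL shares at least one group with the agent."""
    return [c for c in chunks if c.allowed_groups & agent_groups]


corpus = [
    Chunk("handbook.pdf", "Office opening hours", frozenset({"all-staff"})),
    Chunk("handbook.pdf", "Salary bands", frozenset({"hr"})),
]
support_view = visible_chunks(corpus, {"all-staff"})
```

Both chunks belong to the same document, yet the support agent sees only the first one. A document-level check would have to expose or hide the whole file.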
What will you build?

Three endpoints. Hundreds of products waiting to be shipped.
Here are the ones builders are shipping right now on Console.

For Agent Builders
Ground your AI agent in your docs
Connect any agent (LangChain, LlamaIndex, custom) to LightOn via MCP. The agent calls /search, gets cited passages, and generates grounded answers. No hallucinations, no orphan claims.
LightOn API + MCP + your LLM
For Product Teams
Add semantic search to your app
Replace keyword search in your SaaS with multi-vector retrieval. Upload docs via /parse, query via /search, surface results with citations. Two endpoints, no vector DB to manage.
LightOn API + your frontend
For ISVs
Build a doc Q&A bot in an afternoon
Parse a folder of PDFs with /parse, index them in a workspace, query with /search, render the cited answer in your UI. Production-ready chatbot in <100 lines of code.
LightOn API + Next.js or Streamlit
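The last step of these workflows, rendering a cited answer, can be sketched in Python as follows. The passage shape (a dict with `source` and `text` keys) is an assumption made for illustration. The actual /search response format is not shown on this page.

```python
def render_cited_answer(answer: str, passages: list[dict]) -> str:
    """Append numbered source passages to an answer.

    Assumption: each passage is a dict with "source" and "text" keys.
    """
    lines = [answer, "", "Sources:"]
    for i, p in enumerate(passages, start=1):
        lines.append(f'[{i}] {p["source"]}: "{p["text"]}"')
    return "\n".join(lines)


rendered = render_cited_answer(
    "The lease ends on 31 March 2027 [1].",
    [{"source": "lease.pdf", "text": "This lease terminates on 31 March 2027."}],
)
```

Numbering the passages lets the UI link each marker in the answer back to the text that supports it.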

BUILT ON OPEN RESEARCH

Our retrieval models are in your dependency tree.
Empower developers with a production-ready Multimodal RAG API running on your infrastructure. Integrate secure reasoning into your apps (CRM, ERP) without managing the complex AI stack.
50M
HuggingFace Downloads
916K
PyPI installs per month
2,345
GitHub Stars
3,845
HuggingFace Likes
Open-source models in production
LateOn
NextPlaid
PyLate
DenseOn
LightOnOCR-2
+8 more on HF →

Developer Experience

Built by Developers, for Developers

No enterprise procurement process to read your docs. No SDK rewrite every quarter. Just stable endpoints, real protocols, and documentation that actually matches the API.

MCP-native

Connect any AI agent that speaks Model Context Protocol to LightOn in two lines. Single agent, multi-agent system, or app integration. The API is the same.

Comprehensive documentation

Full OpenAPI V3 specs, quickstart guides, copy-paste recipes for the most common workflows. Updated with every release.

Standard protocols

RESTful architecture. Consumable from Python, Node.js, Java, Go, or any HTTP client. No proprietary SDK required.

Don’t just take our word for it

Hear from some of our amazing customers who are building faster
Safran
The expertise of their tech team and the rapid evolution of the product, such as the hybrid search feature, put them at the forefront of innovation.
Jérôme Lacaille
Emeritus Expert in Algorithms
Babbar
Babbar needed an efficient SEO strategy enhancement through LLM technology to stay competitive in the dynamic SEO industry.
Sylvain Peyronnet
Co-founder & search engine specialist
Region Ile De France
LightOn responded very quickly with tools that perfectly matched our needs, enhancing our document base and onboarding users without experience.
Achille Lerpinière
Chief Information & Technology Officer