TL;DR
A case study on NIS2 compliance: LightOn identifies truly critical OT systems from incomplete, distributed, and contradictory documents.
Véracier Industries is a fully synthetic industrial group represented by a dataset of 1,004 documents spanning seven subsidiaries and six languages, including scanned PDFs, bilingual contracts, internal emails, and procurement artifacts.
8:12 AM. Marc Lefèvre’s phone vibrates during his commute to Palaiseau.
The message comes from the corporate secretary’s office:
“The compliance committee needs a consolidated NIS2 mapping by Friday. External audit in six weeks.”

Véracier Industries: a fictional industrial group represented by 1,004 documents spread across seven subsidiaries and six languages, including scanned PDFs, bilingual contracts, internal emails, and procurement agreements.
At first glance, the task seems straightforward:
- Identify the systems that fall within the NIS2 scope
- Verify the applicable obligations
- Detect compliance gaps
But in an industrial group like Véracier, the most exposed systems are not always the ones listed in official inventories.
- Some OT environments directly support strategic production lines without appearing in the group-wide inventory.
- An aging SCADA system is still in operation despite previously identified segmentation issues.
- An MES platform is marked as compliant even though its last security audit took place more than four years ago.
Taken individually, every document appears consistent. Together, they tell a different story.
Marc opens LightOn and asks the question exactly as he would ask his team:
“Which industrial systems actually fall within the NIS2 scope, and where are our compliance gaps?”
The mapping the audit never produced
A few minutes later, LightOn reconstructs what the documents had never expressed collectively.
- Eleven industrial systems genuinely fall within the NIS2 scope
- Three critical OT environments appear in no centralized inventory
- Two industrial networks present segmentation gaps incompatible with group policies
- One SCADA system remains in operation despite an architecture already flagged as non-compliant in a previous audit
- Several assets officially classified as “non-critical” actually support essential production operations
Every conclusion is directly linked to source documents:
- Pentest reports
- OT matrices
- Security policies
- Local audits
- Industrial reference frameworks
Friday morning, the compliance committee begins with an incomplete presentation from the IT teams. Industrial managers challenge parts of the scope. Legal reminds everyone of the regulatory obligations. Executive leadership asks for a clear estimate of the company’s actual exposure.
Marc then opens the consolidated mapping:
- Systems concerned
- Impacted sites
- Identified gaps
- Source documents
- Contradictory architectures
- Incomplete inventories
The discussion changes immediately.
From that point on, nobody debates interpretations anymore. The documents finally speak together.
Where traditional RAG loses the thread
This result is difficult not because the information is impossible to find, but because it is fragmented, partial, and sometimes contradictory.
The analysis spans twelve documents distributed across several European entities:
- OT inventories
- Pentest reports
- SCADA architectures
- Criticality matrices
- Cybersecurity policies
- Local industrial procedures
Taken individually, each document provides a plausible answer.
- The group inventory suggests certain systems are out of scope
- The criticality matrix classifies some assets as non-critical
- A local report nevertheless identifies essential production dependencies
- A previous audit mentions unresolved segmentation issues
- A SCADA architecture diagram reveals connections absent from any centralized repository
A traditional RAG system can retrieve each of these documents. But identifying the actual exposure requires something else entirely: reading the documents against one another.
That is where the use case becomes impossible for a conventional search system.
It is not enough to retrieve the NIS2 policy, nor to list the assets already labeled as critical. The system must understand that an officially “non-critical” asset can become critical from a regulatory standpoint because it supports an essential production line, depends on a poorly segmented network, or appears in a local audit that was never consolidated at the group level.
In other words, the risk is not written in any single document.
It only emerges when the documents begin to contradict each other.
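To make "reading the documents against one another" concrete, here is a minimal Python sketch. It is an illustration only, not LightOn's implementation: the `Claim` schema, the asset names, the document names, and the escalation rules are all hypothetical, and none of the values come from the EDiTh corpus. It assumes each document has already been reduced to simple per-asset statements, and it flags assets whose official "non-critical" label is contradicted by another document.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """One statement about an asset, taken from one document (hypothetical schema)."""
    asset: str
    source: str    # document the statement comes from
    field: str     # e.g. "criticality", "supports_essential_line"
    value: object


# Illustrative claims only; these are invented, not taken from the EDiTh corpus.
CLAIMS = [
    Claim("MES-A", "group_inventory", "criticality", "non-critical"),
    Claim("MES-A", "local_report", "supports_essential_line", True),
    Claim("SCADA-B", "criticality_matrix", "criticality", "non-critical"),
    Claim("SCADA-B", "previous_audit", "segmentation_issue_open", True),
    Claim("HVAC-C", "group_inventory", "criticality", "non-critical"),
]

# Assumed rule set: any of these being true overrides a "non-critical" label.
ESCALATING_FIELDS = {
    "supports_essential_line",
    "segmentation_issue_open",
    "unconsolidated_local_audit",
}


def find_hidden_exposure(claims):
    """Map each contradicted 'non-critical' asset to the documents that contradict it."""
    by_asset = defaultdict(list)
    for claim in claims:
        by_asset[claim.asset].append(claim)

    exposed = {}
    for asset, asset_claims in by_asset.items():
        labeled_non_critical = any(
            c.field == "criticality" and c.value == "non-critical"
            for c in asset_claims
        )
        evidence = [
            c.source
            for c in asset_claims
            if c.field in ESCALATING_FIELDS and c.value is True
        ]
        # Only the combination matters: no single document states the risk.
        if labeled_non_critical and evidence:
            exposed[asset] = sorted(evidence)
    return exposed


if __name__ == "__main__":
    for asset, sources in find_hidden_exposure(CLAIMS).items():
        print(f"{asset}: officially non-critical, contradicted by {', '.join(sources)}")
```

In this toy example, each claim is plausible in isolation; the exposure only appears once the claims are grouped by asset. Every flagged asset also carries the names of the contradicting documents, which mirrors the idea that each conclusion stays linked to its sources. The hard part in practice, extracting such claims reliably from scanned, multilingual, inconsistent documents, is deliberately left out of the sketch.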
Why this use case matters
The CISO-02 scenario in EDiTh was not designed to test whether a system can retrieve a cybersecurity policy.
It tests something far more difficult: reasoning over the true regulatory scope of an industrial group from incomplete, distributed, and contradictory documents.
The challenge is not retrieving a PDF that mentions NIS2. The challenge is identifying the systems you would actually have to defend before a regulator, an auditor, or an executive committee.
Test the scenario yourself
The NIS2 scenario is part of EDiTh, LightOn’s open enterprise benchmark built around Véracier Industries: a synthetic industrial group represented by 1,004 documents distributed across seven subsidiaries and six languages, including industrial architectures, cybersecurity audits, and realistic operational documents.
Ask the same question:
“Which industrial systems actually fall within the NIS2 scope, and where are our compliance gaps?”
Then see whether your system merely retrieves cybersecurity policies, or whether it genuinely identifies the systems you would need to defend before a European regulator.
Start with EDiTh. Then test it against your own documents.
Access LightOn Console to run the scenario yourself.
Want to understand how the corpus was built, how retrieval worked, and why this answer is so difficult to produce? Read the EDiTh launch article.
Previously in Impossible Use Cases: “The ambiguity of a force majeure clause, untangled in a single prompt.”




