Enriching AI with a Knowledge Base to Ease New Employee Onboarding

Accelerate Onboarding with a GenAI Assistant

November 15, 2024

TL;DR

LightOn helps you streamline onboarding for new employees within your teams. By integrating Paradigm, its retrieval-augmented search platform, you can index and structure your internal documents for fast, efficient search. This reduces the time needed to find essential information while keeping workflows secure and customizable.

Think back to your first day at your company. Regardless of its size or industry, a common pattern emerges: finding the right resources to accomplish your tasks is challenging, and you may hesitate to bother your colleagues. This lack of access to information hinders productivity and wastes resources, fueling growing interest in generative AI tools like ChatGPT. However, such tools have limitations, particularly around data privacy, security, and effective internal search. At LightOn, we address these issues with retrieval-augmented search through private, secure GPTs, enabling organizations to capture the value of AI without compromising security.

Integrating GenAI solutions into retrieval-augmented generation (RAG) projects, and even intelligent agents, can seem complex, with cumbersome deployments and difficult scaling. This is why LightOn, with Paradigm, offers a turnkey solution that provides secure access to internal information via large language models (LLMs), while ensuring control over sensitive data.

This article is inspired by a use case for the Île-de-France Region (see this article and the success story on LightOn’s website), where the time needed to find the right information was reduced by more than 85%, according to the regional agents.

Hands-On: From Document Integration to Organization-Wide Use

We’ll walk through Paradigm step-by-step to see how, in practice, it can help facilitate new employee onboarding. The Paradigm version used here includes new Agent-based logic (in beta starting with Illuminated Impala). This demo is based on the LightOn Book, LightOn’s internal documentation describing our internal processes, and similar documents.

Step 1: Efficient Document Import and Indexing

The first task falls to the manager of the team welcoming the new hire: importing the internal documentation into the Paradigm database. The imported documents are read by Paradigm, which captures the semantic and syntactic properties of the data and stores them in a vector database (VectorDB) so the LLM can better understand and interact with the content. LightOn has optimized this process to complete end-to-end in less than two seconds.

This step ensures that information is stored in a structured and accessible format, enabling the LLM to conduct real-time searches through large data sets. By breaking down large documents into smaller, more manageable parts, our system enables faster searches and more accurate responses.
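To illustrate the chunking idea, here is a minimal word-based splitter with overlapping windows. Paradigm's actual chunking strategy is not public, so the sizes and overlap below are arbitrary assumptions chosen only for the sketch.

```python
def chunk_text(text, size=200, overlap=50):
    """Split `text` into word-based chunks of roughly `size` words,
    with `overlap` words shared between consecutive chunks so that
    information spanning a boundary is not lost."""
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunk = words[start:start + size]
        if chunk:
            chunks.append(" ".join(chunk))
        if start + size >= len(words):
            break
    return chunks
```

Each chunk is then embedded and indexed individually, which is what makes retrieval over large documents fast and precise.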

Document imports are intuitive and secure. Through the Paradigm platform, it is possible to specify the workspace where a document will be uploaded:

  • A personal workspace, accessible only to the document uploader.
  • A company-wide workspace, accessible to all company members.
  • An intermediate workspace, created by an admin, with assigned users.

For now, we’ll choose one of the latter two options to give the new employee access to relevant documents. They can later import and query their own documents.

These workspaces provide a level of security granularity for different documents and enhanced collaboration.

Figure 1. Uploading documents in a collaborative & secure space.

Pro tip: If the number of documents to import is too large to do manually, the Paradigm API allows you to batch upload them with just a few lines of Python code.

Once the documents are uploaded, the new user can see the list of documents they have access to:

Figure 2. List of uploaded documents accessible to a user, organized by workspace.
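The batch upload mentioned in the pro tip above could be sketched as follows. The endpoint URL, field names, and authentication scheme here are placeholders, not the real Paradigm API; consult the Paradigm API documentation for the actual routes and parameters.

```python
from pathlib import Path

# Hypothetical endpoint -- replace with the real upload route from the
# Paradigm API documentation.
API_URL = "https://paradigm.example.com/api/v2/files"

def build_upload_jobs(folder, workspace_id, extensions=(".pdf", ".docx", ".md")):
    """Collect the documents under `folder` and pair each with the target
    workspace, ready to be POSTed one by one."""
    jobs = []
    for path in sorted(Path(folder).rglob("*")):
        if path.suffix.lower() in extensions:
            jobs.append({"file": str(path), "collection": workspace_id})
    return jobs

# Each job would then be sent with something like:
#   requests.post(API_URL,
#                 headers={"Authorization": f"Bearer {API_KEY}"},
#                 files={"file": open(job["file"], "rb")},
#                 data={"collection": job["collection"]})
```

Targeting the workspace at upload time (here the hypothetical `collection` field) is what preserves the access-control granularity described above.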

One important point to keep in mind: this database must be kept up to date, by modifying documents as needed and removing those that are outdated. As we know in data science, garbage in, garbage out.

In summary, this step allows our newcomer to view the documents and start asking questions to find the key information within them. Let’s go! 

Step 2: Intelligent Document Search Pipeline (RAG) 

Now that the document database is up to date, we can start querying it. 

One of Paradigm's strengths is its ability to intelligently build a retrieval-augmented generation (RAG) pipeline. This enables the system to extract relevant information from the documents in response to user requests.

Pro tip: When in doubt, it’s possible to compare multiple models to determine which one best suits a given task.

Figure 3. Comparison of multiple models without documents.

Once the user's question is asked, the RAG search proceeds in several steps:

  1. The user's question is rephrased to take into account the context of the conversation and to optimize it for retrieving the most appropriate documents. Paradigm uses a fine-tuned model from LightOn specifically designed for this task.

  2. Once the question is reformulated, Paradigm searches for the five most relevant document passages to answer the query, combining semantic proximity (embedding matching) with exact search (the presence of query keywords in chunks).

  3. These passages are then filtered again to ensure their relevance in answering the question. This step also uses a model fine-tuned specifically for this task, based on numerous examples of French-language documents, ensuring the highest level of performance on the French market.

  4. If no document passes the filtering stage, Paradigm responds to the user’s query in simple Chat mode (without documents). If documents are selected, Paradigm responds with the support of the documents. The response, along with the document passages used to respond, is displayed to the user.
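The four steps above can be sketched in a few lines of Python. The scoring, filtering, and fallback logic here are toy stand-ins for Paradigm's fine-tuned models, not its actual implementation:

```python
def hybrid_score(query, chunk):
    """Toy relevance score: simple keyword overlap stands in for both
    the embedding match and the exact keyword search."""
    q_words = set(query.lower().split())
    c_words = set(chunk.lower().split())
    return len(q_words & c_words) / max(len(q_words), 1)

def rag_answer(query, chunks, top_k=5, threshold=0.2):
    # 1. Rephrase the query (identity here; Paradigm uses a fine-tuned model).
    rephrased = query
    # 2. Retrieve the top_k most relevant passages.
    ranked = sorted(chunks, key=lambda c: hybrid_score(rephrased, c),
                    reverse=True)[:top_k]
    # 3. Filter out passages below a relevance threshold.
    kept = [c for c in ranked if hybrid_score(rephrased, c) >= threshold]
    # 4. Answer with documents if any survive, otherwise fall back to Chat mode.
    if kept:
        return {"mode": "rag", "sources": kept}
    return {"mode": "chat", "sources": []}
```

The `sources` list is what lets the interface display, for each response, the passages that contributed to it.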

For each question asked, Paradigm displays the documents that contributed to the response. Within each document, the passages used are highlighted.

Suppose the user wants to ask how to request their first vacation at LightOn and poses their question. Here’s what Paradigm returns:

Figure 4. Model response to a user question.

This response is accompanied by the documents used to generate it. The user can view the highlighted chunks (passages) within the document that contributed to the response.

Figure 5. Highlighting relevant information in the document.

This gives Paradigm users confidence in the response, as it is backed by the company’s documents. It also saves valuable time for the new user, allowing them to become operational more quickly.

Finally, after receiving the model’s response, the user can:

  • Provide feedback, positive or negative, on the model’s response by clicking a thumbs-up or thumbs-down icon. This is a crucial aspect, allowing the admin teams to understand, through Business Intelligence practices, what is well or poorly understood in the documents.
  • Report the response to support (flag icon below) to alert the team about a model malfunction, adding a comment that describes the issue. The admin team is then notified that the use case isn’t functioning as the user expects.

Figure 6. Conversation with Paradigm and feedback options.

Step 3 – Bonus: Customizing the Chatbot with Individual and Collective Instructions

Each user can write their own instructions, allowing the newcomer to tailor the model's responses according to their individual preferences. Additionally, within each company, an admin can write instructions specific to that company.

These two levels of customization guide the model’s responses to align with the expected behavior for the user. This is especially useful for:

  • providing contextual information about the person asking the question,
  • guiding the length of responses,
  • defining the company's internal terminology or business practices.

Figure 7. Customizing chat parameters. Company instructions are set by an Admin.
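Conceptually, the two instruction levels could be layered into a single system prompt along these lines. This is a hypothetical sketch; Paradigm's actual prompt-assembly logic is internal and may differ.

```python
def build_system_prompt(company_instructions, user_instructions,
                        base="You are a helpful assistant."):
    """Layer company-wide instructions (set by an admin) before the
    user's personal instructions, so both shape the model's behavior."""
    parts = [base]
    if company_instructions:
        parts.append(f"Company guidelines: {company_instructions}")
    if user_instructions:
        parts.append(f"User preferences: {user_instructions}")
    return "\n".join(parts)
```

Placing company guidelines first reflects the idea that organization-wide conventions frame individual preferences, not the other way around.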

Step 4: New Agent Capabilities 

Paradigm currently offers document search and internet search.

In the coming months, Paradigm will expand with more tools, such as text summarization and numerical data analysis through Python code execution. Document search will become just one tool among many.

The goal of these new tools is to create autonomous agents capable of selecting the relevant tools for a given request, giving business functions tighter integration with their tools and enabling them to generate value quickly through LLMs.

These tools can then be organized into agents using an orchestration model that can choose the right tool at the right time to fulfill the assigned task. Let’s take the example of a new employee joining LightOn. They will describe their needs through a pre-prompt tailored to their use case, add relevant documents to Paradigm’s knowledge base, and gain access to new tools. This opens up new possibilities:

  • Asking for a summary of a long HR document they don't have time to read.
  • Summarizing the team’s progress over the year before they joined.
  • Quickly developing BI tasks, even if they lack familiarity with the company's tools.
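The orchestration idea can be illustrated with a toy router that picks a tool per request. A real orchestration model would use an LLM for this decision; the keyword heuristic and the tool stubs below are purely hypothetical.

```python
def summarize(text):
    """Stub for a summarization tool."""
    return "summary of: " + text[:40]

def document_search(query):
    """Stub for the RAG document-search tool."""
    return "passages matching: " + query

TOOLS = {
    "summarize": summarize,
    "document_search": document_search,
}

def route(request):
    """Choose a tool name for the request (keyword heuristic standing
    in for an LLM-based orchestration model)."""
    if "summar" in request.lower():
        return "summarize"
    return "document_search"

def run_agent(request):
    tool = TOOLS[route(request)]
    return tool(request)
```

Adding a new capability then amounts to registering one more function in the tool registry and teaching the router when to pick it.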

By introducing agents, LightOn provides organizations with a powerful solution for automating and personalizing a wide variety of tasks, while ensuring that workflows remain secure. 

To learn more, see a demo, or explore our generative AI solution within your environment, contact us.

Ready to Transform Your Enterprise?
