Document QnA Migration Guide


The Document QnA element allows users to make a set of documents available for an LLM to consider in its response (RAG). We recently replaced it with a more modular and feature-rich RAG pipeline that gives you better control.

Below is a description of how a DocQnA flow can be set up in the new RAG pipeline:

RAG Ingestion Flow

The RAG (Retrieval-Augmented Generation) ingestion flow is the foundational pipeline responsible for processing and storing your documents in a structured format, enabling efficient querying and semantic retrieval during inference.

1. OCR

The ingestion process begins with the OCR (Optical Character Recognition) component. This element scans the specified folder or directory, identifying and reading various document types (e.g., PDFs, scanned images, Word documents). It iterates over each file in the input folder, ensuring every document is processed.
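
The snippet below is a rough sketch of what this stage amounts to, assuming a pypdf-based text extraction for PDFs; the actual OCR component may use a different engine and handle more formats:

```python
from pathlib import Path

from pypdf import PdfReader  # assumed extractor; the real OCR engine may differ


def extract_documents(data_path: str) -> dict:
    """Walk the input folder and pull raw text out of every PDF."""
    texts = {}
    for pdf_file in sorted(Path(data_path).rglob("*.pdf")):
        reader = PdfReader(pdf_file)
        # Join the extracted text of every page into one string per document.
        texts[pdf_file.name] = "\n".join(
            page.extract_text() or "" for page in reader.pages
        )
    return texts
```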

2. Semantic Chunking

Once the documents are parsed and text is extracted, the next stage involves chunking the content. The goal is to divide large documents into smaller, semantically meaningful sections or "chunks."

  • Natural Boundaries: Chunking is typically done based on paragraph breaks, sentence limits, or semantic boundaries to maintain context.
  • Chunk Size and Overlap: Parameters such as chunk size (e.g., 500 tokens) and overlap (e.g., 50 tokens) can be configured to optimize retrieval performance and prevent information loss between segments (see the sketch below).
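
As a point of reference, here is a minimal sketch of fixed-size chunking with overlap. It counts words rather than tokens for simplicity, whereas the pipeline's semantic chunker also respects paragraph and sentence boundaries:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list:
    """Split text into overlapping chunks (word-based for simplicity)."""
    words = text.split()
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(words), step):
        chunk = words[start:start + chunk_size]
        if chunk:
            chunks.append(" ".join(chunk))
    return chunks
```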

3. Embedding Generation

Each chunk of text is then passed through an embedding model. This model transforms the textual data into high-dimensional vector representations.

  • Vector Representation: These embeddings capture the semantic meaning of each chunk, enabling similarity-based search during retrieval (see the sketch below).
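
For illustration, assuming a sentence-transformers model (the element supports other providers and models):

```python
from sentence_transformers import SentenceTransformer

# Illustrative model choice; pick whichever embedding model the element exposes.
model = SentenceTransformer("all-MiniLM-L6-v2")

chunks = ["First semantic chunk of text...", "Second semantic chunk of text..."]
# encode() maps each chunk to a fixed-length vector (384 dimensions here).
embeddings = model.encode(chunks)
print(embeddings.shape)  # (2, 384)
```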

4. Storage in Vector Database

The final step in the ingestion pipeline involves storing the resulting data in a vector database. In this case, ChromaDB is used locally to store:

  • Chunks (Raw Text): The original semantic chunks of text.
  • Vectors (Embeddings): The numerical vector representation of each chunk.
  • Metadata: Additional information such as source document name, page number, timestamp, or any custom tags.

This structured storage allows the RAG system to efficiently retrieve relevant document segments based on user queries, enhancing the accuracy and contextual relevance of generated responses.
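
The following sketch shows what this step amounts to with ChromaDB's Python client; the collection name, paths, and metadata fields are illustrative:

```python
import chromadb

chunks = ["First semantic chunk of text...", "Second semantic chunk of text..."]
embeddings = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]  # placeholder vectors

# Persist the database to the configured output path (path is illustrative).
client = chromadb.PersistentClient(path="./artifacts/chroma")
collection = client.get_or_create_collection(name="doc_qna")

collection.add(
    ids=[f"chunk-{i}" for i in range(len(chunks))],
    documents=chunks,        # raw text of each chunk
    embeddings=embeddings,   # vector representation of each chunk
    metadatas=[{"source": "report.pdf", "page": 1} for _ in chunks],
)
```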

rag-ingestion-update-05-27.png

Settings

OCR

  • Data Path: Specify the direct path to the folder containing the source documents to be processed.
  • Output Path: Define a target directory where the processed output and generated database will be stored.
doc-qna-ingestion-ocr-settings.png

Chunking

  • Configurable Parameters: Users can customize settings such as chunk size and overlap.
  • Flexibility: All chunking-related settings are adjustable to suit individual preferences and document types.
doc-qna-ingestion-chunking-settings.png

Embedding

  • Enable Toggle: Set the "Is Ingestion" toggle to true to activate embedding during the ingestion process.
  • Model Selection: Users can choose from various embedding models based on their specific requirements, such as accuracy, speed, or model provider.
doc-qna-ingestion-embedding-settings.png

RAG Inference Flow

rag-inference-updated-5-27.png

Once your documents have been successfully processed and the vector database has been created, the system is ready to transition from ingestion to inference mode. This enables interaction with your documents via natural language queries.

Configuration for Inference Pipeline

Disable Ingestion Mode

  • Set the "Is Ingestion" toggle to false to switch the system from data ingestion to inference mode.
doc-qna-inference-embedding-settings.png

Set Artifact Path

Provide the path to the directory where the previously saved artifacts are stored. This allows the inference pipeline to load and use the database.
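
Assuming the artifacts were written with ChromaDB's persistent client as sketched above, loading and querying them might look like this (model and collection names are illustrative):

```python
import chromadb
from sentence_transformers import SentenceTransformer

# Point the client at the artifact path configured in the element.
client = chromadb.PersistentClient(path="./artifacts/chroma")
collection = client.get_collection(name="doc_qna")

# Embed the query with the same model used during ingestion.
model = SentenceTransformer("all-MiniLM-L6-v2")
query_vec = model.encode(["What does the report conclude?"]).tolist()

# Retrieve the chunks most similar to the user's query.
results = collection.query(query_embeddings=query_vec, n_results=3)
for doc, meta in zip(results["documents"][0], results["metadatas"][0]):
    print(meta["source"], "->", doc[:80])
```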

Select LLM Architecture

  • Choose your preferred Large Language Model (LLM) for generating responses.
doc-qna-inference-llm-chat-settings.png
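
Putting retrieval and generation together, the final step might look like the following minimal sketch; `call_llm` is a hypothetical placeholder for whichever model you select:

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for the LLM the chat element is configured with."""
    raise NotImplementedError("wire this to your chosen LLM provider")


def answer(question: str, retrieved_chunks: list) -> str:
    """Ground the LLM's answer in the chunks returned by the vector search."""
    context = "\n\n".join(retrieved_chunks)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
    return call_llm(prompt)
```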

Related Documentation

For additional guidance on working with Navigator:

  • Element Registry - Learn how to build powerful solutions with Navigator elements
  • Supported LLM Models - Discover which LLM models are supported for your flows
  • Terminology - Understanding key concepts and terms in Navigator
  • LLM API - Integrate external applications with your LLM workflows