Generate Batched RAG Outputs from Datasets

Input Field: Specify the name of the column in your dataset that contains the input questions.
RAG Parameters: Set retrieval options (number of chunks, similarity threshold, etc.) and configure generation parameters.
Reranker Options: Toggle Use Reranker to enable or disable reranking of retrieved documents. If enabled, enter the name of the reranker model from HuggingFace or the path to the reranker model.
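
As a rough illustration of how these settings relate to each other, the sketch below models them as a small Python dataclass. The class name, field names, and default values are all hypothetical and chosen for this example; they are not the plugin's actual configuration schema. The only rules it encodes are the ones stated above: the input field must be a column in the dataset, and a reranker model must be supplied when the reranker is enabled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RagBatchConfig:
    """Hypothetical container for the form fields; not the plugin's real schema."""

    input_field: str                      # dataset column holding the input questions
    num_chunks: int = 5                   # assumed default, for illustration only
    similarity_threshold: float = 0.0     # assumed default, for illustration only
    use_reranker: bool = False            # mirrors the "Use Reranker" toggle
    reranker_model: Optional[str] = None  # HuggingFace model name or a path

    def validate(self, dataset_columns: list[str]) -> None:
        """Raise ValueError if the settings contradict the rules described above."""
        if self.input_field not in dataset_columns:
            raise ValueError(f"input field {self.input_field!r} is not a dataset column")
        if self.num_chunks < 1:
            raise ValueError("num_chunks must be at least 1")
        if self.use_reranker and not self.reranker_model:
            raise ValueError("a reranker model name or path is required when the reranker is enabled")


# Example: a configuration without reranking, checked against the dataset's columns.
config = RagBatchConfig(input_field="question", num_chunks=3)
config.validate(["question", "expected_output"])
```

Checking the settings before queuing a task catches a misspelled column name early, instead of partway through a long batch run.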

This page explains how to generate batched RAG (Retrieval-Augmented Generation) outputs from datasets using Transformer Lab.

Prerequisites

Before using the RAG Batched Outputs Generator, you need:

Once you have configured your task, click the Create button.
Click the Queue button to start the generation process.
The task then processes each input in your dataset through the RAG pipeline.
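
Conceptually, the batch run is a loop over the dataset rows: read the question from the input field, retrieve context, generate an answer, and keep the result next to the original columns. The sketch below shows that flow with toy stand-ins for retrieval and generation. The function names and the "context" and "output" column names are assumptions made for this example and may differ from what the plugin actually uses.

```python
from typing import Callable


def run_rag_batch(
    rows: list[dict],
    input_field: str,
    retrieve: Callable[[str], list[str]],
    generate: Callable[[str, list[str]], str],
) -> list[dict]:
    """Run every row through retrieve-then-generate, preserving original columns."""
    results = []
    for row in rows:
        question = row[input_field]
        context = retrieve(question)           # retrieved chunks for this question
        answer = generate(question, context)   # answer conditioned on those chunks
        # "context" and "output" are hypothetical column names.
        results.append({**row, "context": context, "output": answer})
    return results


# Toy stand-ins so the sketch runs without a model or a document index.
def toy_retrieve(question: str) -> list[str]:
    return [f"chunk about {question}"]


def toy_generate(question: str, context: list[str]) -> str:
    return f"answer to {question!r} using {len(context)} chunk(s)"


rows = [{"question": "What is RAG?", "expected_output": "Retrieval-Augmented Generation"}]
results = run_rag_batch(rows, "question", toy_retrieve, toy_generate)
```

Passing the retrieval and generation steps in as functions keeps the loop independent of any particular model or index.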

When the task completes, a new dataset is created in the Datasets tab.
This dataset contains:
- Your original data columns
- The RAG-generated outputs
- Retrieved context used for generation
- If your input dataset was created with the "RAG QA Dataset Generator for RAG Evaluation" plugin, an expected_output column will also be preserved for evaluation purposes
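
One simple way to use the preserved expected_output column is to compare it against the generated answers. The sketch below computes an exact-match rate after light normalization. It assumes the generated answers live in a column named "output", which is a hypothetical name for this example, and exact match is only one possible metric; it is not necessarily how Transformer Lab evaluates RAG outputs.

```python
from typing import Optional


def exact_match_rate(
    rows: list[dict],
    output_column: str = "output",              # hypothetical column name
    expected_column: str = "expected_output",   # column named in the text above
) -> Optional[float]:
    """Fraction of rows whose output equals the expected answer, ignoring case and spacing."""

    def normalize(text) -> str:
        return " ".join(str(text).lower().split())

    # Only score rows that actually carry an expected answer.
    scored = [row for row in rows if row.get(expected_column) is not None]
    if not scored:
        return None
    hits = sum(
        normalize(row[output_column]) == normalize(row[expected_column])
        for row in scored
    )
    return hits / len(scored)


rows = [
    {"question": "Capital of France?", "output": "Paris", "expected_output": "paris"},
    {"question": "2 + 2?", "output": "5", "expected_output": "4"},
]
rate = exact_match_rate(rows)  # 0.5: the first row matches, the second does not
```

Exact match is strict for free-form generated text, so a low rate here does not by itself mean the answers are wrong.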

This plugin is particularly useful for: