Retrieval-augmented generation (RAG) is a powerful approach for grounding large language models (LLMs) in external knowledge. Sometimes, though, it is better to bake that knowledge into the LLM itself, especially when the model is small or when a large body of proprietary knowledge is involved.
To support this, we are adding data generation with the Retrieval-Augmented Fine-Tuning (RAFT) method to Curator.
RAFT is a technique the Berkeley Gorilla team developed in collaboration with Microsoft Research. It enables the creation of synthetic question-answer (QA) datasets from text documents, which can then be used to fine-tune LLMs for better performance in RAG pipelines. RAFT takes chunks of a document and generates relevant question-answer pairs based on the content. Two key ideas shape the training data:
1. Each question is paired with the chunk it came from (the "oracle" chunk) plus several irrelevant "distractor" chunks, so the model must pick out the relevant context.
2. For a fraction of the examples, the oracle chunk is deliberately left out, so the model must answer without it.
The former teaches the model to pay close attention to the relevant context, and the latter pushes the model to memorize some of the knowledge.
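To make this concrete, here is a minimal sketch of how one RAFT-style training example could be assembled. The function name and the p_oracle fraction are illustrative, not Curator's internals:

```python
import random

def build_raft_example(question, answer, oracle_chunk, distractor_chunks, p_oracle=0.8):
    """Assemble one RAFT-style training example (illustrative sketch,
    not Curator's implementation).

    With probability p_oracle, the oracle chunk is shuffled in among the
    distractors, teaching the model to find the relevant context. Otherwise
    only distractors are shown, forcing the model to rely on memorized knowledge.
    """
    if random.random() < p_oracle:
        context = distractor_chunks + [oracle_chunk]
        random.shuffle(context)
    else:
        context = list(distractor_chunks)
    prompt = "\n\n".join(context) + f"\n\nQuestion: {question}"
    return {"prompt": prompt, "completion": answer}
```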
In Curator, we have made it easy to create this data, which can then be fed to existing fine-tuning libraries.
We’ve integrated RAFT natively into Curator through curator.blocks.raft. That means you can generate high-quality synthetic QA datasets with just a few lines of code, and it fits right into your existing LLM workflow built on curator.LLM objects.
Here’s a working example you can try out: 👉 RAFT Example on GitHub
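For a quick orientation before you open the link, here is a minimal sketch of what usage can look like. The curator.blocks.raft entry point is real, but the class and parameter names below are illustrative assumptions; treat the linked example as canonical:

```python
# Minimal sketch: curator.blocks.raft is the actual entry point, but the
# class name and parameters below are illustrative assumptions; see the
# linked GitHub example for the exact API.
from bespokelabs import curator

raft = curator.blocks.raft.Raft(
    model_name="gpt-4o-mini",  # mirroring curator.LLM's model parameter
    n_questions=5,             # assumed: QA pairs generated per document chunk
    distractors=3,             # assumed: distractor chunks mixed into each example
    p=0.8,                     # assumed: fraction of examples that keep the oracle chunk
)

dataset = raft("docs/knowledge_base.txt")  # chunks the file and generates QA pairs
print(dataset[0])                          # inspect one generated training example
```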
In this walkthrough, we demonstrate:
✅ Native hosted viewer for live dataset previews
✅ Full support for batch mode generation at 50% lower cost (see the sketch after this list)
✅ Smart caching to avoid redoing work
✅ Faster QA dataset generation
✅ Clean and simple API: curator.blocks.raft
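Batch mode piggybacks on provider batch APIs, which is where the 50% saving comes from. Assuming the raft block forwards backend options the way curator.LLM does, enabling it would be a one-line change:

```python
# Assumed: the raft block accepts batch=True like curator.LLM, routing
# generation through the provider's discounted batch API.
raft_batched = curator.blocks.raft.Raft(model_name="gpt-4o-mini", batch=True)
```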
If you're building production-grade RAG systems and care about quality, speed, and developer experience, Curator’s RAFT support is designed for you.
As always, please give us feedback and show your support by starring Curator on GitHub!