Generating synthetic data at scale can be expensive. To offset this, several LLM API providers, including Google, offer 50-70% discounts through batch mode, which processes large volumes of requests asynchronously. However, Gemini's batch API is notoriously tricky to use because of the many steps involved and the scattered documentation.
Let's go over the steps required for simple Gemini batch processing when not using Curator (a sketch of this flow follows the list):

1. Format your requests as a JSONL file, one request per line.
2. Upload the file to a Google Cloud Storage bucket.
3. Create a batch prediction job that points at the uploaded file.
4. Poll the job until it completes, which can take hours.
5. Download the output files from the results bucket.
6. Parse the responses and match them back to your original requests.
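To make the friction concrete, here is roughly what that flow looks like by hand with the Vertex AI SDK. This is a minimal sketch, not production code: the bucket paths, polling interval, and output parsing below are placeholder assumptions.

import json
import time

import vertexai
from google.cloud import storage
from vertexai.batch_prediction import BatchPredictionJob

PROJECT_ID = "<project-id>"    # placeholder
BUCKET_NAME = "<bucket-name>"  # placeholder

vertexai.init(project=PROJECT_ID, location="us-central1")

# 1. Write requests as JSONL, one GenerateContentRequest per line.
requests = [
    {"request": {"contents": [
        {"role": "user", "parts": [{"text": "What is the capital of Montana?"}]}
    ]}},
]
with open("input.jsonl", "w") as f:
    for r in requests:
        f.write(json.dumps(r) + "\n")

# 2. Upload the file to a GCS bucket.
bucket = storage.Client(project=PROJECT_ID).bucket(BUCKET_NAME)
bucket.blob("batch/input.jsonl").upload_from_filename("input.jsonl")

# 3. Create the batch prediction job.
job = BatchPredictionJob.submit(
    source_model="gemini-1.5-flash-001",
    input_dataset=f"gs://{BUCKET_NAME}/batch/input.jsonl",
    output_uri_prefix=f"gs://{BUCKET_NAME}/batch/output",
)

# 4. Poll until the job finishes (batch jobs can run for hours).
while not job.has_ended:
    time.sleep(60)
    job.refresh()

# 5. Download the result files and 6. parse the responses.
if job.has_succeeded:
    for blob in bucket.list_blobs(prefix="batch/output"):
        if blob.name.endswith(".jsonl"):
            for line in blob.download_as_text().splitlines():
                print(json.loads(line))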
These steps add a lot of friction, causing many users to stick with online (synchronous) processing and miss out on significant cost savings. Curator solves this by making Gemini's batch API easy to use!
No manual polling, no file management, just cost-efficient batch processing in a few lines of code.
import os

from bespokelabs import curator

# Stream responses to the hosted Curator Viewer (optional).
os.environ["HOSTED_CURATOR_VIEWER"] = "1"

# Google Cloud project and GCS bucket used for the batch job files.
os.environ["GOOGLE_CLOUD_PROJECT"] = "<project-id>"
os.environ["GEMINI_BUCKET_NAME"] = "<bucket-name>"
os.environ["GOOGLE_CLOUD_REGION"] = "us-central1"  # us-central1 is the default

# batch=True routes requests through Gemini's batch API.
llm = curator.LLM(model_name="gemini-1.5-flash-001", backend="gemini", batch=True)

questions = [
    {"prompt": "What is the capital of Montana?"},
    {"prompt": "Who wrote the novel 'Pride and Prejudice'?"},
    {"prompt": "What is the largest planet in our solar system?"},
    {"prompt": "In what year did World War II end?"},
    {"prompt": "What is the chemical symbol for gold?"},
]

# Curator uploads the requests, submits the batch job, polls it,
# and downloads the results automatically.
ds = llm(questions)
print(ds.to_pandas())  # inspect the responses as a DataFrame
Happy data generation!
Read more about batch processing with Curator for OpenAI, Anthropic, and other providers here.
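The same pattern carries over with minimal changes. For example, here is a sketch of the OpenAI equivalent, assuming an OPENAI_API_KEY is set in the environment:

# Same few lines, now against OpenAI's batch API.
llm = curator.LLM(model_name="gpt-4o-mini", batch=True)
ds = llm(questions)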
As always, please give us feedback, and support us by spreading the word and starring Curator on GitHub!