Blog

OpenThinker is a decensored reasoning model

Scaling up Open Reasoning with OpenThinker-32B

Powerful reasoning models can be trained by scaling data, verifying reasoning traces, and scaling model size. We are releasing OpenThinker-32B, a state-of-the-art open-data reasoning model.

Measuring Reasoning with Evalchemy

If you can't measure it, you can't improve it. We are releasing reasoning benchmarks in our model evaluation tool, Evalchemy.

Launching the Open Thoughts Project

Open Thoughts is an open-source effort to curate the best open reasoning datasets.

Cut Token Costs in Half: Batch Processing Made Easy with Curator

Bespoke-Stratos: The unreasonable effectiveness of reasoning distillation

We trained Bespoke-Stratos-32B, our reasoning model distilled from DeepSeek-R1 using Berkeley NovaSky’s Sky-T1 data pipeline. The model outperforms Sky-T1 and o1-preview on reasoning benchmarks (math and code), and nearly matches the performance of DeepSeek-R1-Distill-Qwen-32B while being trained on 47x fewer examples.

Hallucinations, Fact checking, Entailment and all that. What does it all mean?

AI hallucinations can derail accuracy, but Bespoke's latest factuality model is designed to combat them. Learn how advanced checks help models deliver more reliable outputs, reducing common errors in data generation.
