Bespoke-MiniCheck

Combat hallucination with our SOTA grounded factuality model and API

Bespoke-MiniCheck is a SOTA grounded factuality model, which ranks at the top of the LLM-AggreFact leaderboard.

Figure: Leaderboard showing the performance of different models on the LLM-AggreFact benchmark for grounded factuality and hallucinations, with the Bespoke-MiniCheck-7B model ranking first.

Grounded Factuality

Grounded factuality is a way of measuring hallucination: the source of truth is given as a context document, and the factuality of a claim is then checked with respect to that context. This problem is also called textual entailment in the NLP and linguistics literature. Please read our blog article for more information.

Grounded factuality is extremely important for RAG, where a context naturally exists and LLMs generate claims (answers). If a claim is not factually grounded in the context, the model has hallucinated unsupported information. For example, a Stanford study found that in the legal setting, RAG-based AI research tools hallucinate 17% to 33% of the time, contrary to claims that RAG systems are “hallucination-free”.

Bespoke-MiniCheck-7B

Using our proprietary curation platform, we trained a 7B model that is remarkably good at grounded factuality. Given a context and a claim, the model outputs a probability score indicating how well the claim is supported by the context. We have made this model available on HuggingFace for non-commercial use.
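As a quick illustration, here is a minimal sketch of scoring claims against a context locally. It assumes the open-source minicheck package referenced on the model card; the class name and return values below follow that package's README, so double-check the HuggingFace page if the interface has changed.

```python
# Install the minicheck package per the HuggingFace model card before running.
from minicheck.minicheck import MiniCheck

context = "A group of students gather in the school library to study for their upcoming final exam."
claim_supported = "The students are preparing for an exam."
claim_unsupported = "The students are on vacation."

# Downloads bespokelabs/Bespoke-MiniCheck-7B from HuggingFace on first use.
scorer = MiniCheck(model_name="Bespoke-MiniCheck-7B", cache_dir="./ckpts")

# score() pairs each document with its claim and returns binary labels
# (1 = supported, 0 = unsupported) plus raw support probabilities.
pred_labels, raw_probs, _, _ = scorer.score(
    docs=[context, context],
    claims=[claim_supported, claim_unsupported],
)

print(pred_labels)  # expected: [1, 0]
print(raw_probs)    # support probability for each claim
```

Each claim is scored independently against its paired document, so many (context, claim) pairs can be batched in a single call.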

As mentioned above, this model tops the LLM-AggreFact leaderboard with 77.4% on the benchmark. Vectara’s HHEM 2.1, a model with similar capability, gets only 71.8%.

Small but Mighty

The model we trained is relatively small yet beats the performance of much larger models, such as Claude 3.5 Sonnet, on this task. As a result, it can return results in about 200 milliseconds on modern GPUs, making it useful as a guardrail, and it can run on consumer-grade hardware such as MacBooks. Please contact us if you are interested in 100ms response times.

Bespoke-MiniCheck API

We are excited to announce that this model's capability is now available via a self-serve API platform. You can sign up for free at our Bespoke Console and use our client library for easy access. Please check the documentation at Bespoke Docs. You can drastically improve your RAG game in just a few lines of code.
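To make that concrete, here is a sketch of a fact-check call with the Python client. The package name, method path, and response field shown here are illustrative assumptions; refer to Bespoke Docs for the authoritative interface.

```python
# pip install bespokelabs  (package name assumed; see Bespoke Docs)
import os

from bespokelabs import BespokeLabs

# API key from the Bespoke Console; environment variable name is an assumption.
client = BespokeLabs(auth_token=os.environ["BESPOKE_API_KEY"])

# Check whether the claim is supported by the context.
response = client.minicheck.factcheck.create(
    claim="The patient was prescribed 50mg of the drug.",
    context="The doctor recommended a 25mg dose, taken twice daily.",
)

# support_prob (field name assumed): probability that the claim is grounded in the context.
print(response.support_prob)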

Try Before You Buy

Not convinced that the benchmark numbers tell the full story? We get it. Experience the product’s snappy performance at our Bespoke Playground.

“Bespoke-MiniCheck blows everything else we tested out of the water”
— ML Engineer at GuardrailsAI

Integrations

Ollama has added support for our model as a first-class citizen. It's available here. Through Ollama, the model currently produces a yes/no answer; logit support will land soon.
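As a sketch, the model can be queried through Ollama's local REST API once it has been pulled. The model tag and the Document/Claim prompt format below are assumptions; check the Ollama model page for the exact usage.

```python
# Assumes Ollama is running locally and the model has been pulled, e.g. `ollama pull bespoke-minicheck`.
import requests

context = "The company reported revenue of $10M in Q2, down 5% from Q1."
claim = "Revenue grew in Q2."

resp = requests.post(
    "http://localhost:11434/api/generate",  # Ollama's local generate endpoint
    json={
        "model": "bespoke-minicheck",  # model tag assumed; see the Ollama library page
        "prompt": f"Document: {context}\nClaim: {claim}",  # prompt format assumed
        "stream": False,
    },
)

print(resp.json()["response"])  # currently "Yes" or "No"
```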

The model is also available via GuardrailsAI’s hub. See more information here.

Contact

For questions or comments about the product or model, please contact us or schedule a meeting with one of the founders.