Inference reliability metrics

Inference Provider Leaderboard

We rank inference providers on how accurately they serve models. The ranking is based on a single metric, Exact Match Rate: the share of output tokens returned by a provider that match the tokens produced by a trusted reference implementation of the same model.

We give the providers the same inputs as our trusted reference models and compare how closely their output tokens match.
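As a rough illustration of the comparison described above, the following is a minimal sketch and not the leaderboard's actual scoring code. It assumes outputs are available as sequences of token IDs, that tokens are compared position by position, and that provider tokens with no reference counterpart count as mismatches. The function names `exact_match_rate` and `aggregate_exact_match_rate` are hypothetical.

```python
from typing import Iterable, Sequence, Tuple


def count_matches(provider: Sequence[int], reference: Sequence[int]) -> int:
    """Count positions where the provider token equals the reference token.

    Assumption: position-wise alignment. zip() stops at the shorter
    sequence, so provider tokens past the end of the reference are
    never counted as matches.
    """
    return sum(1 for p, r in zip(provider, reference) if p == r)


def exact_match_rate(provider: Sequence[int], reference: Sequence[int]) -> float:
    """Share of provider output tokens that match the reference tokens."""
    if not provider:
        return 0.0  # assumption: an empty output scores zero
    return count_matches(provider, reference) / len(provider)


def aggregate_exact_match_rate(
    pairs: Iterable[Tuple[Sequence[int], Sequence[int]]]
) -> float:
    """Pool matches and token counts over many (provider, reference) pairs."""
    matches = 0
    total = 0
    for provider, reference in pairs:
        matches += count_matches(provider, reference)
        total += len(provider)
    return matches / total if total else 0.0


# One of four provider tokens differs from the reference.
print(exact_match_rate([1, 2, 3, 4], [1, 2, 9, 4]))  # 0.75
```

Pooling tokens across prompts, rather than averaging per-prompt rates, weights long outputs more heavily; which of the two the leaderboard uses is not stated here.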

Exact match rates are usually above 95 percent. Lower rates may signal issues such as quantization that affects model behavior, bugs in the inference setup, or the use of non-standard templates or tokenizers.
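To make the 95 percent guideline concrete, a reader could flag providers whose rate falls below it. This is only a sketch: the 0.95 cutoff is taken from the "usually above 95 percent" observation and is an assumption, not an official pass/fail threshold, and the function name and provider names are hypothetical.

```python
from typing import Dict, List


def flag_low_rates(rates: Dict[str, float], threshold: float = 0.95) -> List[str]:
    """Return the names of providers whose exact match rate is below threshold.

    Assumption: rates are fractions in [0, 1]; 0.95 is an illustrative cutoff.
    """
    return sorted(name for name, rate in rates.items() if rate < threshold)


# Hypothetical rates for illustration only.
example_rates = {"provider-a": 0.991, "provider-b": 0.93, "provider-c": 0.95}
print(flag_low_rates(example_rates))  # ['provider-b']
```

A flagged provider is a prompt for investigation, not proof of a fault; the causes listed above are possibilities to check.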