Why Different AI Benchmarks Report Widely Different Hallucination Rates: A Data-First Investigation
https://pastelink.net/ti0i6lqz
Gemini 2.0 Flash scored a 0.7% hallucination rate on a basic summarization test (March 2025), but other runs show much higher rates. The data suggests that a single headline number rarely tells the whole story.