QIMMA قِمّة ⛰: A Quality-First Arabic LLM Leaderboard

https://huggingface.co/blog/tiiuae/qimma-arabic-leaderboard (huggingface.co)

QIMMA is a new leaderboard for evaluating Arabic large language models (LLMs) that puts benchmark quality first. It addresses the problem of fragmented, unvalidated Arabic NLP benchmarks, many of which are direct translations from English and contain errors. Before any model is evaluated, QIMMA runs the benchmarks through a quality validation pipeline that combines automated assessment with human review. That process revealed systematic quality issues in existing resources, and the leaderboard aims to give a more reliable measure of genuine Arabic language capability.

0 points • by ogg • 3 months ago
