Adding Benchmaxxer Repellant to the Open ASR Leaderboard

https://huggingface.co/blog/open-asr-leaderboard-private-data (huggingface.co)

The Open ASR Leaderboard is incorporating new high-quality, private datasets from Appen and DataoceanAI to improve model evaluation. This change aims to combat "benchmaxxing," where models are over-optimized for public benchmarks, by providing a more robust measure of real-world performance. These new datasets, covering various speech types and accents, will be kept private to prevent test-set contamination and ensure fair comparisons. While the default average Word Error Rate (WER) remains calculated on public data, users can optionally include the private datasets to see their impact on model rankings.
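As a rough illustration of how an opt-in private split can change rankings, here is a minimal sketch. It assumes the average is an unweighted mean of per-dataset WER, which the summary does not confirm. The model names, dataset names, and WER values are all hypothetical and are not taken from the leaderboard.

```python
from statistics import mean

# Hypothetical per-dataset WER (%) for two made-up models.
# None of these names or numbers come from the actual leaderboard.
WER = {
    "model_x": {
        "public": {"pub_a": 5.0, "pub_b": 7.0},
        "private": {"priv_a": 9.0, "priv_b": 11.0},
    },
    "model_y": {
        "public": {"pub_a": 6.0, "pub_b": 7.0},
        "private": {"priv_a": 7.0, "priv_b": 8.0},
    },
}


def average_wer(scores, include_private=False):
    """Unweighted mean of per-dataset WER (an assumption); private sets are opt-in."""
    values = list(scores["public"].values())
    if include_private:
        values += list(scores["private"].values())
    return mean(values)


def rank(include_private=False):
    """Return model names sorted from lowest (best) to highest average WER."""
    return sorted(WER, key=lambda name: average_wer(WER[name], include_private))


public_ranking = rank()                    # default view: public data only
full_ranking = rank(include_private=True)  # optional view: public + private

print("public only:     ", public_ranking)
print("public + private:", full_ranking)
```

In this toy data, model_x leads on the public sets but falls behind once the private sets are counted, which is the kind of shift the optional view is meant to expose.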

0 points•by hdt•1 month ago
