A New Framework for Evaluating Voice Agents (EVA)

https://huggingface.co/blog/ServiceNow-AI/eva (huggingface.co)

EVA is a new end-to-end evaluation framework for conversational voice agents. Existing frameworks evaluate task accuracy and conversational experience in isolation; EVA is presented as the first to score both jointly. It uses a realistic bot-to-bot architecture to produce two primary scores: EVA-A (Accuracy) and EVA-X (Experience). The initial release includes an airline dataset and benchmark results for 20 systems, which reveal a consistent trade-off: agents that excel at task completion often deliver a worse user experience.

0 points • by ogg • 3 months ago
