Building Autoraters for Expert-Level Reasoning Data

https://scale.com/blog/building_autoraters_for_expert_level_reasoning_data (scale.com)

Scale AI uses AI "autoraters" to strengthen quality control for its expert-level reasoning data, which is in high demand for training foundation models. The autoraters combine debate among multiple LLMs with human-in-the-loop review to detect incorrect answers more effectively, particularly in challenging domains such as math, coding, and science. The multi-agent system evaluates the correctness of both final answers and reasoning traces, and it can be integrated with various foundation models. According to Scale AI's testing, its custom QC methods are more robust and more precise than single-model approaches, yielding significant improvements in the quality of the data delivered to customers.
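To make the debate-plus-human idea concrete, here is a minimal sketch of how such a loop could be wired together. It is not Scale AI's implementation: the names `Verdict`, `run_debate`, and the three toy raters are hypothetical, and each rater is a plain function standing in for what would be an LLM call. The sketch assumes raters vote on correctness, see their peers' previous verdicts in later rounds, and that any remaining disagreement is escalated to a human reviewer.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Verdict:
    rater: str
    is_correct: bool
    rationale: str


# Hypothetical rater interface: (question, answer, peer_verdicts) -> Verdict.
# A real system would call an LLM here; these are stand-in functions.
Rater = Callable[[str, str, List[Verdict]], Verdict]


def run_debate(question: str, answer: str, raters: List[Rater],
               max_rounds: int = 2) -> dict:
    """Run up to max_rounds of voting; escalate to a human if no consensus."""
    verdicts: List[Verdict] = []
    for round_idx in range(max_rounds):
        # Each rater sees the verdicts from the previous round (empty at first).
        verdicts = [rate(question, answer, verdicts) for rate in raters]
        votes = {v.is_correct for v in verdicts}
        if len(votes) == 1:  # all raters agree
            decision = "accept" if votes.pop() else "reject"
            return {"decision": decision, "rounds": round_idx + 1,
                    "needs_human": False, "verdicts": verdicts}
    # Raters still disagree: hand the item to a human reviewer.
    return {"decision": "escalate", "rounds": max_rounds,
            "needs_human": True, "verdicts": verdicts}


def strict_rater(question, answer, peers):
    # Toy check: compares against a hard-coded expected answer.
    ok = answer.strip() == "4"
    return Verdict("strict", ok, "compared against the expected answer")


def lenient_rater(question, answer, peers):
    # Accepts by default, but revises its vote if any peer objected.
    if any(not p.is_correct for p in peers):
        return Verdict("lenient", False, "revised after a peer objection")
    return Verdict("lenient", True, "answer looks plausible")


def stubborn_rater(question, answer, peers):
    # Always accepts and never revises; forces escalation on bad answers.
    return Verdict("stubborn", True, "does not change its vote")


if __name__ == "__main__":
    print(run_debate("What is 2 + 2?", "5", [strict_rater, lenient_rater])["decision"])
    print(run_debate("What is 2 + 2?", "5", [strict_rater, stubborn_rater])["decision"])
```

In this toy setup, a wrong answer is rejected once the lenient rater sees the strict rater's objection, while a persistent split between raters is routed to a human instead of being decided automatically.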

0 points•by hdt•6 months ago
