The Remote Labor Index: Measuring the Automation of Work

https://scale.com/blog/rli (scale.com)

Scale AI and the Center for AI Safety have introduced the Remote Labor Index (RLI), a benchmark to measure how well AI agents can automate paid freelance work. The initial findings show that the best AI agents can successfully automate only 2.5% of real-world, end-to-end projects across domains like software development and design. Agents frequently fail due to low-quality output, incomplete deliverables, and an inability to follow complex, multi-step instructions. The research indicates that while AI is not yet capable of widespread autonomous work, it excels at specific generative tasks like creating images or audio from a simple prompt, suggesting its immediate impact will be augmentation rather than full automation.

0 points • by chrisf • 8 months ago
