
Maybe AI agents can become lawyers after all


Last month, I wrote about Mercor’s new benchmark measuring how well AI agents handle professional tasks such as law and corporate analysis. At the time, the numbers were dismal: every major lab scored below 25%, so we concluded that lawyers were safe from displacement by AI, at least for now.

But the capabilities of AI can change dramatically in a matter of weeks.

This week’s release of Anthropic’s Opus 4.6 shook up the leaderboards, with the new model scoring just shy of 30% on a single attempt and about 45% when given more cracks at the problem. Notably, the release also included a lot of new features, including a “helper group,” which may have helped with this type of problem.

Regardless, the score is a big jump over previous results, and it’s a sign that progress on foundation models isn’t slowing down. Mercor CEO Brendan Foody, who was impressed, said, “jumping from 18.4% to 29.8% in a few months is crazy.”

The APEX-Agents Leaderboard. Image credit: Mercor

Thirty percent is still far from 100%, so it’s not like lawyers need to worry about being replaced by machines next week. But they should be much less confident than they were last month.


