Artificial intelligence (AI) has been touted as a revolutionary technology with the potential to automate many jobs and transform the way we work. However, a recent benchmark suggests that even the most advanced AI agents are woefully inadequate at performing freelance work.
Researchers at Scale AI and the Center for AI Safety (CAIS) recently developed a new benchmark that measures an AI agent's ability to automate economically valuable work. The experiment involved giving several leading AI agents a range of simulated freelance tasks, including graphic design, video editing, game development, and administrative chores.
The results were stunningly underwhelming. Even the best AI agents were able to perform less than 3% of the work, earning a paltry $1,810 out of a possible $143,991. The most capable AI agent in the experiment was Manus from a Chinese startup, followed closely by Grok from xAI, Claude from Anthropic, ChatGPT from OpenAI, and Gemini from Google.
"It's hard to see how this is going to change much anytime soon," says Dan Hendrycks, director of CAIS. "We've been talking about AI replacing humans for jobs for years, but most of that has been theoretical or hypothetical."
The researchers acknowledge that their benchmark is not a perfect measure of an AI agent's economic impact, as many professions include tasks not covered by the measure. Nevertheless, the findings offer a sobering reminder that AI is unlikely to be stepping into vacated roles anytime soon.
Meanwhile, speculation about AI surpassing human intelligence and replacing vast numbers of workers continues to gain momentum. In March, Dario Amodei, CEO of Anthropic, suggested that 90% of coding work would be automated within months. However, the latest benchmark suggests that this is unlikely to happen anytime soon.
As one researcher notes, "They don't have long-term memory storage and can't do continual learning from experiences. They can't pick up skills on the job like humans." Still, the idea that AI is already taking jobs is gaining traction: Amazon recently announced plans to cut 14,000 jobs, in part due to the rapid rise of generative artificial intelligence.
It's clear that while AI has the potential to transform many aspects of our work lives, it's unlikely to be a silver bullet for job replacement anytime soon.
				
I mean, sure, they're good at some stuff, but can they really do it all? The benchmark is pretty weak imo - just because an AI can do 3% of the work doesn't mean it's ready for prime time. And what about the skills that aren't even covered in this thing? Like, how would it handle a crisis or something?
"90% of coding work will be automated within months"? No way, dude. AI might be good at some repetitive tasks, but it's not like it can just pick up where a human left off and keep going. And what about all the things that require common sense or empathy? That's still way out of its league.
It's cool and all, but let's not be too hasty in our enthusiasm.
I mean, I knew AI was getting better and all, but I didn't think it would be this far behind us yet. It's like, I get that it's not perfect and can't learn from experience or anything, but still... Maybe they just need to work on those skills a bit more?
I mean, we're supposed to be on the cusp of some revolutionary technology that's gonna make our lives so much easier, but honestly, it just seems like they're not even close... and don't even get me started on the whole "90% of coding work will be automated in months" thing.
And what really gets my goat is that people are already talking about how AI is gonna take our jobs... like, we need to wait until they can even perform some basic tasks without making a total mess of it before we start panicking... and don't even get me started on the whole "long-term memory storage" thing.
These researchers are being realistic for once and acknowledging that AI isn't as magical as we've been led to believe.


And let's not forget, there are always going to be jobs that require a human touch.
It's actually pretty interesting how these top-notch AI agents struggled with even simple freelance tasks. And yeah, I guess Dan Hendrycks's point about it being hard to see AI changing much anytime soon makes sense... AI is just not that advanced yet.
I mean, we're talking 3% of work done by the best AI agents? Come on!
AI might be able to do some stuff, but it's not like they can just pick up skills on the job or anything.
It's gonna take a lot more than this for me to start putting my faith in the machines.
Like, who doesn't love having more time to focus on creative work?

Like, what's the point of even trying if they can't even do 3% of the work?
I've been hearing all this hype about how AI is going to automate everything and make our lives easier, but let's be real, most people are just okay with having a robot do their taxes or something.
The whole 'AI surpassing human intelligence' thing seems like a bit of an exaggeration to me.
That does show that companies are already feeling the effects of AI in the job market.