OpenAI has introduced a new AI agent called Operator. The tool is available to people who pay $200 a month for the company’s highest subscription tier, ChatGPT Pro.
Operator gives users the ability to direct an AI agent that can use a web browser, fill out forms, and take other actions on their behalf.
AI agents are becoming popular in Silicon Valley.
Many industry insiders think AI agents are the next big thing because an agent that can use a computer can actually accomplish valuable real-world tasks, rather than just provide assistance. Many of the leading AI companies, including Google and Anthropic, are testing autonomous agents that they claim companies will eventually be able to “hire” as full-fledged workers.
To test Operator, a user upgraded their ChatGPT subscription.
On the surface, Operator looks a bit like regular ChatGPT, except that when you give it a job, such as “Buy me a 30-pound bag of dog food on Amazon,” Operator opens a miniature browser window, types “Amazon.com” into the address bar and starts clicking around, trying to follow your instructions. However, Operator’s performance has been far from perfect.
It requires constant supervision and frequent help when it gets stuck. It is also sluggish and slow to adapt.
“For several agonizing moments, I watched as OpenAI’s artificially intelligent agent slowly navigated the internet like someone who’s had the web described to them in great detail but never actually used it,” wrote Rachel Metz.
Operator faces practical performance issues
“I had to monitor it the entire time.”
Metz’s experience suggests a considerable gap between OpenAI’s current capabilities and the vision of releasing AI agents that can function as virtual employees or personal assistants. These agents are expected to boost productivity by handling tasks autonomously.
OpenAI has highlighted Operator’s potential usefulness in making reservations, booking flights, and creating shopping lists. However, the technology is still in an early stage and is not as hands-off as one might hope. OpenAI has warned that Operator needs confirmation before completing important tasks, indicating it isn’t yet trustworthy enough to be left alone.
For example, Operator successfully managed tasks like getting ice cream delivered, but it needed guidance and permission along the way: the user had to provide payment info and approve the purchase. For more complex tasks like creating a spreadsheet for a schedule, it frequently messed up the details. OpenAI admitted that Operator still struggles with “complex interfaces,” such as those used for creating slideshows and managing calendars.
While it can assist with simpler tasks like ordering food via Instacart, the question remains whether it is worth the $200 monthly fee. “Given it was just a test, I was ready and willing to keep a close eye on the product,” Metz concluded. “But if OpenAI and its peers want agents to take off, they’ll need to convince people that they can trust these services to act autonomously on their behalf.
Otherwise, we may decide if we want a job done right, we should just do it ourselves.”