OpenAI Releases Its First AI Agent, Which Takes Control of a Web Browser and Performs Actions

IBL News | New York

OpenAI yesterday introduced a research preview of its general-purpose AI agent, called Operator, which can take control of a web browser and independently perform some actions. It is available to U.S. users on the ChatGPT Pro subscription plan, which costs $200 per month.

This move is OpenAI’s first foray into the emerging agentic economy, built on tools that automate tasks and take actions on behalf of humans.

“Powering Operator is Computer-Using Agent (CUA), a model that combines GPT-4o’s vision capabilities with advanced reasoning through reinforcement learning. CUA is trained to interact with graphical user interfaces (GUIs), the buttons, menus, and text fields people see on a screen, just as humans do. This gives it the flexibility to perform digital tasks without using OS- or web-specific APIs,” the San Francisco–based research lab explained.

Operator combines advanced GUI perception with structured problem-solving. It breaks tasks into multi-step plans and adaptively self-corrects when a step fails. The model seeks user confirmation before sensitive actions, such as entering login details or responding to CAPTCHA forms.

In other words, Operator can click buttons, navigate menus, and fill out forms on a web page much like a human would.
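
The plan-act-confirm behavior described above can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not OpenAI's implementation: the names `Action`, `FakeBrowser`, `run_agent`, and the `SENSITIVE_KINDS` set are all hypothetical, and the "browser" is a stand-in that only records actions rather than driving a real page.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical action kinds treated as sensitive; not OpenAI's actual schema.
SENSITIVE_KINDS = {"enter_login", "solve_captcha"}


@dataclass
class Action:
    kind: str    # e.g. "click", "type", "enter_login"
    target: str  # label of the GUI element to act on


@dataclass
class FakeBrowser:
    """Stand-in for a real browser: records the actions that succeed."""
    flaky: set = field(default_factory=set)  # targets that fail on the first try
    log: list = field(default_factory=list)

    def apply(self, action: Action) -> bool:
        if action.target in self.flaky:
            self.flaky.discard(action.target)  # fail once, then recover
            return False
        self.log.append((action.kind, action.target))
        return True


def run_agent(plan: list, browser: FakeBrowser,
              confirm: Callable[[Action], bool], max_retries: int = 1) -> list:
    """Run a multi-step plan; return the sensitive steps the user declined."""
    skipped = []
    for action in plan:
        # Pause for user confirmation before any sensitive step.
        if action.kind in SENSITIVE_KINDS and not confirm(action):
            skipped.append(action)
            continue
        # Simple self-correction: retry a failed step up to max_retries times.
        for _ in range(max_retries + 1):
            if browser.apply(action):
                break
    return skipped


plan = [
    Action("click", "Sign in"),
    Action("enter_login", "password field"),
    Action("click", "Search"),
]
browser = FakeBrowser(flaky={"Search"})
skipped = run_agent(plan, browser, confirm=lambda a: False)
```

In this run the user declines the login step, so it is skipped rather than executed, while the "Search" click fails once and succeeds on the retry. A real agent would replace the fixed plan with model-generated steps based on screenshots of the page.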

OpenAI says it’s collaborating with companies like DoorDash, eBay, Instacart, Priceline, StubHub, and Uber to ensure that Operator respects these businesses’ terms of service agreements.

OpenAI provided the following prompts to illustrate Operator’s reach:

• “Search Britannica for a detailed map view of bear habitats. Now please check out the black, brown and polar bear links and provide a concise general overview of their physical characteristics, specifically their differences. Oh and save the links for me so I can access them quickly.”

• “I want one of those target deals. Can you check if they have a deal on poppi prebiotic sodas? If they do, I want the watermelon flavor in the 12fl oz can. Get me the type of deal that comes with this and check if it’s gluten free.”

• “I am planning to shift to Seattle and I want you to search Redfin for a townhouse with at least 3 bedrooms, 2 bathrooms, and an energy-efficient design (e.g., solar panels or LEED-certified). My budget is between $600,000 – $800,000 and it should ideally be close to 1500 sq ft.”

Last week, OpenAI released Tasks, giving ChatGPT simple automation features such as setting reminders and scheduling prompts to run at a set time every day.

“The next challenge space we plan to explore is expanding the action space of agents,” said OpenAI.

OpenAI has been slower to release an AI agent than rivals such as Google and Anthropic.