ChatGPT functioning as an active agent capable of carrying out tasks
OpenAI, a leading artificial intelligence (AI) research company, has announced a new AI agent for ChatGPT, called the ChatGPT agent. Scheduled to roll out on Thursday, July 17, 2025, the agent is designed to perform a wide range of computer-based tasks on a user's behalf.
The ChatGPT agent posts state-of-the-art results on several benchmarks, scoring 41.6% on Humanity's Last Exam (pass@1) and 27.4% on FrontierMath when given access to tools such as a code-executing terminal. Those results are a significant leap, nearly double the scores of its predecessors.
The agent can handle multi-step tasks such as planning and buying the ingredients for a Japanese breakfast for four, or analyzing three competitors and creating a slide deck from the findings. It also incorporates features from OpenAI's earlier tools, Operator and Deep Research.
Because of these advanced capabilities, safety considerations were integral to the agent's development. To keep bad actors from extracting sensitive information through prompts, OpenAI has disabled the ChatGPT agent's memory feature. It has also implemented real-time monitoring of user interactions: a classifier first assesses each prompt for biological relevance, and if a prompt is flagged, a second monitor evaluates whether the content could pose a biological threat.
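The two-stage monitoring described above can be pictured with a minimal Python sketch. It is an illustration only, not OpenAI's implementation: the keyword lists, the `Verdict` type, and the functions `is_bio_relevant`, `is_bio_threat`, and `screen_prompt` are all hypothetical stand-ins for classifiers whose details have not been published.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical keyword lists standing in for real classifiers (assumption).
BIO_TERMS = {"pathogen", "virus", "toxin", "bacteria"}
THREAT_TERMS = {"weaponize", "aerosolize"}


@dataclass(frozen=True)
class Verdict:
    flagged: bool  # stage 1: prompt judged biologically relevant
    blocked: bool  # stage 2: flagged prompt judged a potential threat


def is_bio_relevant(prompt: str) -> bool:
    """Stage 1 stand-in: does the prompt touch on biology at all?"""
    text = prompt.lower()
    return any(term in text for term in BIO_TERMS)


def is_bio_threat(prompt: str) -> bool:
    """Stage 2 stand-in: could the flagged prompt pose a threat?"""
    text = prompt.lower()
    return any(term in text for term in THREAT_TERMS)


def screen_prompt(
    prompt: str,
    relevance: Callable[[str], bool] = is_bio_relevant,
    threat: Callable[[str], bool] = is_bio_threat,
) -> Verdict:
    """Run the second monitor only on prompts the first stage flags."""
    if not relevance(prompt):
        return Verdict(flagged=False, blocked=False)
    return Verdict(flagged=True, blocked=threat(prompt))
```

In this sketch the second check runs only on the subset of prompts the first stage flags, so an everyday request such as planning a breakfast never reaches it.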
The ChatGPT agent is available to subscribers of OpenAI's Pro, Plus, and Team plans. Users can enable it by selecting "agent mode" from the dropdown menu in ChatGPT. The agent can connect to applications such as Gmail and GitHub, has access to a terminal, and can use APIs for certain applications.
While the agent's real-world effectiveness has yet to be established, OpenAI claims it is significantly more capable than previous AI agents. The company also classifies the ChatGPT agent as "high capability" in the biological and chemical weapons domains.
OpenAI says its safeguards, including the disabled memory feature and real-time monitoring, are intended to ensure that the ChatGPT agent's capabilities are used responsibly.