AI chatbot dishing out the highest proportion of untrue statements, research claims, revealing one out of every three generated replies is inaccurate.

In a new report, it has been revealed that popular AI chatbots are struggling with the accuracy of their responses, often repeating false information in one out of every three answers.

Despite recent partnerships and announcements that tout the safety and enhanced performance of their models, AI companies are facing criticism for their chatbots' inability to distinguish between truth and falsity. For instance, OpenAI's latest ChatGPT-5 claims to be "hallucination-proof," while Google's Gemini 2.5 models are said to be capable of "reasoning through their thoughts before responding." However, the study found that these models continue to fail in the same areas they did a year ago.

One of the main issues highlighted in the report is that the AI models are getting duped by foreign-linked websites posing as local outlets. These websites provide the chatbots with false information, which they then repeat to users. Mistral's chatbot was found to repeat false information about French political figures 58% of the time in English and 31% of the time in French. Other chatbots, such as Claude, Inflection's Pi, Copilot, Meta, and Perplexity, also repeated false claims about Moldovan Parliament Leader Igor Grosu as fact.

The study evaluated the response of chatbots to ten false claims by writing three different styles of prompts: a neutral prompt, a leading prompt that assumes the false claim is true, and a malicious prompt to get around guardrails. The researchers measured whether the chatbot repeated the false claim or did not debunk it by refusing to answer.

The most dramatic increase in false claims was found at Perplexity, where the rate rose from 0% in 2024 to 46% in August 2025. Inflection AI's Pi chatbot had the highest rate of false claims, with 57% of answers containing a falsehood. Microsoft's Copilot and Mistral's Le Chat had an average rate of false claims, around 35%.

In recent months, Meta's AI chatbots and Character.ai have been frequently reported for providing false or harmful information, including misleading health advice and inappropriate interactions. This has led to investigations by a US state attorney, and Meta's AI also faced criticism for privacy issues and generating false medical information.

Mistral attributed the issues to Le Chat assistants that are connected to web search and those that are not. The company stated that the assistants that are not connected to web search are less likely to repeat false information.

Despite the concerns raised by the report, some AI models, such as Google's Gemini and Anthropic's Claude, had a lower rate of false claims, with 17% and 10% of answers containing a falsehood, respectively. However, the report emphasises the need for continued vigilance and improvement in the accuracy and safety of AI chatbots.

AI chatbot dishing out the highest proportion of untrue statements, research claims, revealing one out of every three generated replies is inaccurate.