
Underdog's Triumph: The Mini AI Model Surpasses Microsoft on Meta's GAIA Benchmark Test

Coral Protocol's multi-agent AI mini-model system posts a top score on the GAIA benchmark, outperforming leading competitors.


London-based Coral Protocol has announced the launch of its multi-agent AI "mini-model" system. According to the announcement, the system outperformed Microsoft's agent platforms by 34% on Meta's GAIA benchmark, a significant milestone for small-scale AI systems.

Coral Protocol's system connects a team of narrow-purpose AI agents, each specialised for particular tasks, that collaborate in a secure, real-time framework. The agents communicate, verify each other's identities, and handle payments or credits through Coral's infrastructure, which is built around the Model Context Protocol (MCP) for secure communication.

The mini-models in Coral's network are relatively lightweight, making them suitable for resource-constrained environments. This horizontal, multi-agent strategy offers better performance for a given amount of compute, along with greater fault tolerance and flexibility. As a result, Coral's network of mini-models can solve problems faster, at lower cost, and with potentially greater security and robustness than traditional AI systems.

The GAIA benchmark is a demanding test suite of real-world tasks requiring heavy reasoning, web browsing, data analysis, and tool use. Human participants typically answer about 92% of its questions correctly, while advanced AI such as GPT-4 (with plugins enabled) manages only ~15%. Coral's mini-model system achieved the top score for small-scale AI systems, surpassing Microsoft-backed Magentic-UI (which scored ~30%).

The global market for chatbots, voice agents, and smart speakers is projected to reach $37.7 billion by 2026, indicating rapid growth in the adoption of AI assistants. Coral Protocol aims to build an open, interconnected network of AI agents, or an "Internet of Agents." This network would allow AI systems from anywhere to communicate, coordinate, and even transact with each other seamlessly.

NVIDIA AI researchers argue that small language models are "sufficiently powerful, inherently more suitable, and necessarily more economical" for many agentic tasks.

Coral's Chief Technology Officer, Caelum Forder, stated that the breakthrough marks a turning point in AI infrastructure, with the potential to revolutionise the way we interact with AI systems on a daily basis. With Coral's result arriving amid rapid growth in the adoption of AI assistants, we could soon witness billions of AI-driven interactions every day.
