François Chollet’s ARC-AGI-3 Reveals Why Today’s AI Still Lacks True Intelligence

AI researcher François Chollet introduced ARC-AGI-3, a new benchmark designed to test real intelligence by evaluating how AI systems handle unfamiliar, interactive tasks. The results show that while humans can solve these problems through reasoning and adaptation, current AI models struggle because they rely heavily on pattern recognition and memorization rather than true understanding. This highlights a major gap in “fluid intelligence,” the ability to learn and adapt in new situations, and suggests that simply scaling up models is not enough to achieve true artificial general intelligence.

March 27, 2026

Key Highlights

François Chollet introduced a new benchmark called ARC-AGI-3 to test real AI intelligence.
The test reveals that current AI models struggle with new, unfamiliar problems.
Today’s systems rely heavily on memorization and pattern recognition, not true reasoning.
Humans outperform AI in “fluid intelligence,” the ability to adapt and solve novel tasks.
The benchmark highlights a key limitation: AI cannot easily recombine knowledge in new situations.

Top AI researcher explains why current AI models still fall short

Leading AI researcher François Chollet is drawing attention to a critical limitation in today’s artificial intelligence systems, despite rapid advancements in the field.

Through a newly introduced benchmark called ARC-AGI-3, Chollet aims to test whether AI can truly think and adapt like humans or whether it is still largely dependent on memorization.

What ARC-AGI-3 reveals about AI intelligence

The ARC-AGI-3 benchmark evaluates AI systems using game-like environments with no instructions, forcing them to figure out rules and strategies on their own.

Unlike traditional tests, these environments are completely new each time, meaning models cannot rely on prior training data or pattern recall.

Humans can solve these tasks by experimenting, learning and adapting. But current AI systems struggle significantly because they are not designed for this kind of open-ended reasoning.
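
The experiment-then-adapt loop described above can be illustrated with a minimal sketch. Everything here is a hypothetical toy, not the ARC-AGI-3 environments or their API: `HiddenRuleEnv` is an invented grid where four numbered actions move a token, but the mapping from action number to direction is hidden, and `explore_then_solve` is an invented agent that first tries each action to see what it does and then uses what it learned to reach the goal.

```python
import random


class HiddenRuleEnv:
    """Hypothetical toy environment (not part of ARC-AGI-3).

    Four opaque actions move a token on a square grid. The mapping from
    action id to direction is shuffled and hidden from the agent.
    """

    MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]

    def __init__(self, size=5, seed=0):
        rng = random.Random(seed)
        self._mapping = list(self.MOVES)
        rng.shuffle(self._mapping)  # the hidden rule
        self.size = size
        self.pos = (0, 0)
        self.goal = (size - 1, size - 1)
        self.steps = 0

    def step(self, action):
        """Apply an action, clamp to the grid, and report (position, solved)."""
        dx, dy = self._mapping[action]
        x = min(max(self.pos[0] + dx, 0), self.size - 1)
        y = min(max(self.pos[1] + dy, 0), self.size - 1)
        self.pos = (x, y)
        self.steps += 1
        return self.pos, self.pos == self.goal


def explore_then_solve(env, n_actions=4):
    """Try each action once to learn its effect, then head for the goal."""
    effects = {}
    # Exploration: no instructions are given, so observe what each action does.
    for action in range(n_actions):
        before = env.pos
        after, done = env.step(action)
        effects[action] = (after[0] - before[0], after[1] - before[1])
        if done:
            return effects

    def predicted_distance(action):
        # Manhattan distance to the goal if the learned effect holds.
        dx, dy = effects[action]
        return (abs(env.goal[0] - (env.pos[0] + dx))
                + abs(env.goal[1] - (env.pos[1] + dy)))

    # Exploitation: repeatedly pick the action expected to get closest.
    while env.pos != env.goal:
        env.step(min(range(n_actions), key=predicted_distance))
    return effects


env = HiddenRuleEnv(size=5, seed=0)
learned = explore_then_solve(env)
print("learned effects:", learned, "steps used:", env.steps)
```

The toy is far simpler than the game-like tasks the article describes, but it shows the basic shape of the problem: the agent cannot look up the rules and has to spend some of its actions discovering them.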

The core problem: AI depends on memorization

According to Chollet, modern AI models are extremely good at absorbing massive amounts of data and recognizing patterns, often surpassing humans in raw knowledge.

However, they fail when faced with something unfamiliar.

This is because most models depend on retrieving learned patterns rather than understanding and reasoning through new problems. Even when they contain vast knowledge, they lack the ability to combine it dynamically in real time.

Humans vs AI: the gap in “fluid intelligence”

One of the biggest differences highlighted by the benchmark is what researchers call fluid intelligence: the ability to adapt to new situations.

Humans can approach an unfamiliar problem, test ideas and figure out solutions on the fly. AI systems, on the other hand, often become “lost” when they encounter scenarios they haven’t seen before.

Research around ARC benchmarks consistently shows that while humans can solve nearly all such tasks, AI performance remains significantly lower.

Why scaling bigger models isn’t enough

Chollet has long argued that simply making AI models larger is not the solution.

While bigger models improve performance on known tasks, they do not address the fundamental issue: true general intelligence requires adaptability, not just more data.

The ARC-AGI-3 benchmark reinforces this idea by focusing on how efficiently a system can learn and act in completely new environments, not just on how much it already knows.
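
One way to picture an efficiency-focused measure is to compare how many actions a system needs against a human baseline. The function below is an illustrative assumption only, not the benchmark's official scoring rule; the name `action_efficiency` and its parameters are invented for this sketch.

```python
def action_efficiency(human_actions, agent_actions, solved):
    """Hypothetical efficiency score in [0, 1].

    Returns 0.0 for an unsolved task; otherwise the ratio of the human
    baseline action count to the agent's action count, capped at 1.0.
    """
    if not solved or agent_actions <= 0:
        return 0.0
    return min(1.0, human_actions / agent_actions)


# An agent that needs twice as many actions as a person scores 0.5.
print(action_efficiency(human_actions=20, agent_actions=40, solved=True))
```

Under a measure of this kind, an agent that eventually solves a task by brute-force trial and error still scores poorly, which matches the article's point that knowing more is not the same as learning quickly.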

A reality check for the AI industry

The findings come at a time when AI is often portrayed as approaching human-level intelligence. However, benchmarks like ARC-AGI-3 suggest that current systems are still far from achieving true general intelligence.

They excel at structured tasks but struggle with open-ended reasoning, exploration and decision-making, skills that humans use effortlessly in everyday life.

Bigger picture: what comes next for AI

Chollet’s work signals a shift in how AI progress is measured. Instead of focusing only on accuracy or scale, researchers are now emphasizing adaptability, reasoning and efficiency.

The message is clear: despite impressive breakthroughs, today’s AI models are still limited, and overcoming those limitations will require fundamentally new approaches, not just bigger models.