1. Introduction: Why This Matters
In mainstream media and industry, large language models (LLMs) are routinely described as "intelligent" or even showing "sparks of AGI." But current LLMs are self-supervised sequence models: extremely capable, but optimised to predict the next token in text. They are not grounded in the physical world and do not possess the embodied, causal, socially situated learning that characterises human intelligence.
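The "predict the next token" objective can be made concrete with a toy character-level bigram model — an illustrative sketch only, nothing like a modern transformer, but the training signal (what symbol follows what) is the same in kind:

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each character, which characters follow it in the corpus."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, prev):
    """Greedy next-token prediction: return the most frequent follower."""
    if prev not in counts:
        return None
    char, _ = counts[prev].most_common(1)[0]
    return char

model = train_bigram("the thin thorn")
print(predict_next(model, "t"))  # 'h' — every 't' in this corpus is followed by 'h'
```

The model "knows" nothing about what its symbols refer to; it only tracks co-occurrence statistics, which is the essay's point about grounding.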
This article argues:
- Calling LLMs "intelligent" in the human sense is misleading.
- A credible path toward AGI requires embodied, multimodal, socially grounded, continual learning.
2. What LLMs Actually Do: Stochastic Parrots, Not Reasoners
Bender et al. (2021) describe LLMs as "stochastic parrots": systems that can generate fluent text via statistical correlations but without grounding, understanding, or agency.1 They warn that LLMs:
- incur steep environmental and financial costs as they scale,
- amplify biases in web-scale data,
- produce plausible-sounding but ungrounded outputs.
Multiple studies show that LLMs struggle with tasks requiring genuine reasoning or causal understanding, especially when problems deviate from training distributions.5,6
3. Scaling Is Hitting Technical and Economic Limits
Between 2020 and 2024, scaling reliably delivered: bigger models trained with more compute meant better performance. Today, major constraints are clear:
- Flattening scaling laws — marginal gains require exponentially more compute.8
- Data exhaustion — high-quality text is nearly "mined out".
- Macroeconomic limits — The 2025 Storm report argues current AI valuations rest on unrealistic productivity assumptions.8
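The flattening can be pictured with a stylised power-law loss curve of the kind reported in the scaling-law literature; the constants and exponent below are purely illustrative, not fitted to any real model family:

```python
def loss(compute, a=10.0, alpha=0.3, floor=1.5):
    """Stylised scaling law: loss falls as a power of compute, toward a floor."""
    return a * compute ** (-alpha) + floor

# Each 10x increase in compute buys a smaller absolute improvement.
gains = [loss(10 ** k) - loss(10 ** (k + 1)) for k in range(4)]
print(gains)  # strictly decreasing marginal gains
```

The irreducible `floor` term is the crux: past some point, exponentially more compute chases a vanishing remainder.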
If scaling alone could yield AGI, we would not need elaborate scaffolds like chain-of-thought prompting, retrieval, tool use, memory systems, or Monte Carlo search. These exist precisely because text prediction alone cannot produce causal world understanding.
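One such scaffold, retrieval, can be sketched in a few lines. This is a deliberately naive toy (keyword overlap, invented function names), not any production retrieval-augmented-generation API — but it shows the pattern: fetch relevant text and prepend it, so the model need not "know" the fact itself:

```python
def retrieve(query, documents, k=1):
    """Rank documents by naive word overlap with the query (toy scoring)."""
    q_words = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query, documents):
    """Prepend retrieved context to the question before calling the model."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"

docs = ["The Eiffel Tower is in Paris.", "Mount Fuji is in Japan."]
print(build_prompt("Where is the Eiffel Tower?", docs))
```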
4. The Brain Is a Prediction Machine — But Predicting the World, Not Tokens
Predictive-processing theories (Friston 2010; Clark 2013) conceptualise the brain as a hierarchical prediction engine minimising error signals across sensory channels.2,3 But crucially, the brain predicts:
- visual and auditory features,
- motor outcomes and proprioception,
- interoceptive signals,
- social cues and intentions,
- causal consequences of actions.
A 2024 meta-analysis by Costa et al. identifies a domain-general "Dynamic Prediction Network" across cognition.4 Humans are indeed prediction machines — just not token-prediction machines.
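The shared computational idea — iteratively revising an internal estimate to shrink prediction error — can be sketched as a simple error-correction loop. This is a toy illustration of the principle, not a model of cortical circuitry:

```python
def minimise_prediction_error(observations, learning_rate=0.1):
    """Nudge an internal estimate toward each observation by a fraction of the error."""
    estimate = 0.0
    for obs in observations:
        error = obs - estimate              # prediction error
        estimate += learning_rate * error   # error-driven update
    return estimate

# The estimate converges toward the true signal (here, a constant 5.0).
print(minimise_prediction_error([5.0] * 100))
```

The difference the section draws is not the mechanism but its inputs: sensory, motor, and social signals rather than text tokens.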
5. How LLM Concept Representations Diverge from Human Concepts
Xu et al. (2025) compared human conceptual representations with LLM embeddings for 4,442 concepts.9 They found:
- High similarity for abstract/non-sensorimotor features,
- Lower similarity for sensory features,
- Very poor similarity for motor/action features.
This aligns with the symbol grounding problem: LLMs define symbols only via other symbols, not via embodied experience.10
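Comparisons of this kind typically reduce to similarity between vectors — for instance, cosine similarity between a human feature-rating vector and an LLM-derived one. A self-contained sketch; the "hammer" ratings below are invented for illustration, not data from the study:

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Invented ratings for the concept "hammer" along three motor-feature dimensions.
human_motor = [0.9, 0.8, 0.7]   # hand-action features loom large for humans
llm_motor = [0.2, 0.1, 0.9]     # a text-only model may weight them very differently
print(round(cosine_similarity(human_motor, llm_motor), 3))
```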
6. Why Better Data Alone Won't Fix It
Two structural barriers remain:
- Embodiment: Human cognition arises from closed-loop interaction with the world. LLMs lack sensorimotor grounding.10
- Continual learning: Humans learn over decades. LLMs forget catastrophically without specialised techniques.7,11
Multi-agent RL shows hints of emergent communication and culture,12,13 but these systems remain brittle and narrow.
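Catastrophic forgetting — and the rehearsal-style mitigation behind brain-inspired replay — can be illustrated with a one-parameter toy model. Everything here is illustrative; real replay methods regenerate or store past examples rather than interleaving raw targets:

```python
def train(weight, targets, lr=0.5, steps=50):
    """Repeated gradient steps pulling a single weight toward each target in turn."""
    for _ in range(steps):
        for t in targets:
            weight += lr * (t - weight)
    return weight

w_task_a = train(0.0, [1.0])               # task A: weight settles near 1.0
w_forgot = train(w_task_a, [5.0])          # task B alone: drifts to 5.0, forgetting A
w_replay = train(w_task_a, [5.0, 1.0])     # task B with replayed task-A data: compromise
print(w_forgot, w_replay)
```

Training on B alone erases A entirely, while interleaving replayed A-data holds the weight between the two solutions — the essence of rehearsal-based continual learning.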
7. A More Plausible Route Toward AGI
- Embodied agents acting in real or simulated environments.14
- Multimodal grounding linking concepts to perception and action.10
- Social learning including teaching, imitation, and cultural transmission.12,13
- Robust continual learning without catastrophic forgetting.7,11
- Causal and uncertainty-aware reasoning building explicit models of the world.15
LLMs may remain important modules — but not standalone artificial minds.
8. So, Are LLMs "Intelligent"?
If "intelligent" means surface competence, LLMs qualify. If it means grounded understanding, causality, agency, and lifelong learning — they do not.
Calling LLMs "intelligent" obscures real limitations and risks.
9. References
- Ahn, J., Verma, R., Lou, R., Liu, D., Zhang, R., & Yin, W. (2024). Large language models for mathematical reasoning: Progresses and challenges. arXiv:2402.00157. https://arxiv.org/abs/2402.00157
- An, Z., et al. (2025). Embodied intelligence: Recent advances and future perspectives. The Innovation Informatics. https://doi.org/10.59717/j.xinn-inform.2025.100008
- Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the dangers of stochastic parrots. FAccT '21 (pp. 610–623). ACM. https://doi.org/10.1145/3442188.3445922
- Clark, A. (2013). Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences, 36(3), 181–204. https://doi.org/10.1017/S0140525X12000477
- Costa, C., et al. (2024). Comprehensive investigation of predictive processing. Human Brain Mapping, 45(12), e26817. https://doi.org/10.1002/hbm.26817
- Friston, K. (2010). The free-energy principle: A unified brain theory? Nature Reviews Neuroscience, 11(2), 127–138. https://doi.org/10.1038/nrn2787
- Harnad, S. (1990). The symbol grounding problem. Physica D, 42(1–3), 335–346. https://doi.org/10.1016/0167-2789(90)90087-6
- Hao, S., et al. (2023). Reasoning with language model is planning with world model. arXiv:2305.14992. https://arxiv.org/abs/2305.14992
- Mirzadeh, I., et al. (2024). GSM-Symbolic: Understanding the limitations of mathematical reasoning in large language models. arXiv:2410.05229. https://arxiv.org/abs/2410.05229
- Ndousse, K., et al. (2021). Emergent social learning via multi-agent reinforcement learning. ICML (pp. 7991–8004). PMLR.
- Phan, T., et al. (2024). Multi-agent reinforcement learning: A comprehensive survey. IEEE TPAMI. https://doi.org/10.1109/TPAMI.2024.3378699
- Storm, S. (2025). The AI bubble and the U.S. economy. Working Paper No. 240. INET. https://doi.org/10.36687/inetwp240
- van de Ven, G. M., Siegelmann, H. T., & Tolias, A. S. (2020). Brain-inspired replay for continual learning. Nature Communications, 11, 4069. https://doi.org/10.1038/s41467-020-17866-2
- Xu, Q., et al. (2025). Large language models without grounding recover non-sensorimotor but not sensorimotor features of human concepts. Nature Human Behaviour, 9(9), 1871–1886. https://doi.org/10.1038/s41562-025-02203-8
- Zhao, W. X., et al. (2024). A survey on large language models for continual learning. arXiv:2402.01364. https://arxiv.org/abs/2402.01364