What Is AGI? Artificial General Intelligence Explained

Artificial general intelligence (AGI) refers to AI that can match or exceed human performance across most cognitive tasks, not just one narrow domain.
Today’s systems are narrow AI — superb at specific tasks (chess, translation, image labelling) but unable to generalise the way humans do.
There is no agreed-upon definition or test; the Turing test (1950) and the newer ARC-AGI benchmark capture only slices of “general” ability.
Expert timelines range from a few years to many decades to “never”; a 2022 AI Impacts survey of machine-learning researchers put the aggregate forecast at a 50% chance of high-level machine intelligence by around 2059.
Superintelligence — AI far beyond human ability — raises safety and control questions that motivate much of today’s AI-safety research.

What AGI actually means

Artificial general intelligence is the hypothetical ability of a machine to understand, learn, and apply knowledge across the full range of tasks a human can handle, rather than being confined to one domain. According to Wikipedia’s overview, AGI implies an agent that can transfer skills to unfamiliar problems and reason flexibly — the defining gap between it and the narrow systems deployed today.

The phrase gained currency in the early 2000s, popularised by researchers including Shane Legg and Ben Goertzel, partly to distinguish their ambition from the narrow, task-specific AI that dominated industry. The word “general” is load-bearing: it points at breadth and transfer, not raw performance on any single benchmark. A system can be superhuman at Go and still possess no general intelligence whatsoever.

Narrow AI vs general AI

Narrow AI, sometimes called weak AI, is built for a bounded task and cannot step outside it. A fraud-detection model, a chess engine, and a speech recogniser are all narrow: each is excellent within its lane and useless beyond it. Commercially deployed AI systems in 2026, large language models included, are generally classified as narrow, however broad their surface competence appears.

The interesting wrinkle is that modern foundation models blur the line. A single large model can write code, draft email, summarise law, and answer biology questions — far broader than a classic narrow system. Whether that breadth amounts to genuine generality, or merely wide-but-shallow pattern matching, is one of the central unresolved debates in the field.

AGI vs superintelligence

AGI and superintelligence are distinct stages. AGI denotes roughly human-level general capability; superintelligence, as philosopher Nick Bostrom defined it in his 2014 book Superintelligence, is any intellect that “greatly exceeds the cognitive performance of humans in virtually all domains of interest.” One is parity with humans; the other is decisive superiority over them.

The two are linked by a much-discussed idea: an “intelligence explosion.” The concept, traced to statistician I. J. Good in 1965, holds that a machine able to improve its own design could iterate rapidly, vaulting from human-level to far beyond. Good wrote that “the first ultraintelligent machine is the last invention that man need ever make.” Whether such recursive self-improvement is feasible — or fast — is contested, but it is why many researchers treat AGI and superintelligence as a continuum rather than separate problems.

Why the distinction matters for safety

The control problem grows sharper as capability rises. A human-level system that is misaligned with human intent is dangerous; a superhuman one that is misaligned may be impossible to correct after the fact. This is the core motivation behind the AI-safety and alignment field, where labs like Anthropic, DeepMind, and OpenAI maintain dedicated research teams. The reasoning is precautionary: design the controls before, not after, the capability arrives.

How would we test for AGI?

There is no single accepted test for AGI, and that absence is itself a major obstacle. Proposed yardsticks each capture only part of “general” intelligence, and a system can ace one while failing another. The most-cited proposals are the Turing test, the ARC-AGI benchmark, and a handful of more colourful thought experiments.

The Turing test and its limits

Alan Turing’s 1950 “imitation game” asks whether a machine can converse so convincingly that a human judge cannot reliably tell it from a person. For decades it was the popular shorthand for machine intelligence. But modern language models can produce fluent, human-like text while still failing at basic reasoning, which is why many researchers now regard the Turing test as a measure of conversational mimicry rather than general intelligence.

ARC-AGI and reasoning benchmarks

The ARC-AGI benchmark, created by François Chollet in 2019, tests abstract reasoning on novel visual puzzles designed so they cannot be solved by memorisation. Chollet has argued that “skill is not intelligence”: what matters is how efficiently a system acquires new skills. ARC tasks are easy for humans yet resisted machines for years, so progress on them is tracked closely as a proxy for genuine generalisation. The rise of reasoning models that “think” step by step has driven scores up sharply, intensifying debate over whether the benchmark still measures what it set out to measure. More broadly, the field leans on a battery of AI benchmarks, none of which, individually, certifies AGI.
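To make the idea of an ARC-style puzzle concrete, here is a minimal Python sketch. It assumes the commonly published task layout: a few demonstration pairs under "train", held-out pairs under "test", and grids stored as lists of rows of small integers standing for colours. The toy task, the two candidate solvers, and the exact-match scoring function are all hypothetical illustrations, not part of the official benchmark tooling.

```python
from typing import Callable, Dict, List

Grid = List[List[int]]            # each cell is a small integer colour code
Task = Dict[str, List[Dict[str, Grid]]]
Solver = Callable[[Grid], Grid]

# Hypothetical toy task. The hidden rule is "mirror each row left to right".
TOY_TASK: Task = {
    "train": [
        {"input": [[1, 0, 0], [0, 2, 0]], "output": [[0, 0, 1], [0, 2, 0]]},
        {"input": [[3, 3, 0], [0, 0, 4]], "output": [[0, 3, 3], [4, 0, 0]]},
    ],
    "test": [
        {"input": [[5, 0], [0, 6]], "output": [[0, 5], [6, 0]]},
    ],
}


def mirror_rows(grid: Grid) -> Grid:
    """Candidate rule: reverse every row."""
    return [list(reversed(row)) for row in grid]


def identity(grid: Grid) -> Grid:
    """Candidate rule: return a copy of the grid unchanged."""
    return [row[:] for row in grid]


def fits_demonstrations(solver: Solver, task: Task) -> bool:
    """True if the solver reproduces every demonstration output exactly."""
    return all(solver(pair["input"]) == pair["output"] for pair in task["train"])


def score(solver: Solver, task: Task) -> float:
    """Fraction of held-out test grids reproduced exactly (assumed metric)."""
    pairs = task["test"]
    solved = sum(solver(pair["input"]) == pair["output"] for pair in pairs)
    return solved / len(pairs)


if __name__ == "__main__":
    for name, solver in [("mirror_rows", mirror_rows), ("identity", identity)]:
        print(name, fits_demonstrations(solver, TOY_TASK), score(solver, TOY_TASK))
```

The point of the sketch is that a solver gets only a handful of demonstrations and must infer a rule that holds on an unseen grid. A system that has memorised other puzzles gains nothing unless it can work out the new rule from those few examples.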

When will AGI arrive — if ever?

Expert forecasts for AGI span an enormous range, from “within a few years” to “many decades” to “possibly never,” and the disagreement is genuine rather than a rounding error. A 2022 survey of 738 machine-learning researchers by AI Impacts put the aggregate forecast at a 50% chance of high-level machine intelligence by around 2059, though individual estimates varied wildly.

Industry voices tend to be more bullish than academics. Several lab leaders have publicly suggested AGI-like systems could appear before 2030, while many academic researchers caution that current models lack reliable reasoning, grounding, and continual learning. Skeptics such as Gary Marcus argue that scaling alone will not close those gaps. The honest summary is that no one knows: confident predictions in either direction outrun the available evidence.

What’s still missing

Today’s best models struggle with reliable multi-step reasoning, robust handling of genuinely novel situations, learning continuously after training, and grounding language in the physical world. They also lack persistent memory and stable goals. Whether these are incremental engineering gaps or fundamental barriers requiring new ideas is precisely the question that divides optimists from skeptics.

Why the stakes are high

The case for taking AGI seriously rests on asymmetric stakes: even a modest probability of a transformative outcome justifies serious preparation, given how large the consequences — positive or negative — could be. Proponents foresee breakthroughs in medicine, science, and productivity; critics warn of disruption, misuse, and loss of human control.
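The asymmetric-stakes argument is at bottom an expected-value comparison, which a few lines of Python can illustrate. Every number below is a hypothetical placeholder chosen only to show the shape of the reasoning. The function names and the `risk_reduction` parameter are assumptions of this sketch, and none of the figures estimates real-world probabilities or costs.

```python
def expected_magnitude(probability: float, magnitude: float) -> float:
    """Probability-weighted size of an outcome, in arbitrary units."""
    return probability * magnitude


def preparation_is_justified(
    probability: float,
    magnitude: float,
    preparation_cost: float,
    risk_reduction: float,
) -> bool:
    """True when the expected benefit of preparing exceeds its cost.

    risk_reduction is the assumed fraction of the outcome's magnitude
    that preparation would improve (0.0 = no effect, 1.0 = full effect).
    """
    expected_benefit = expected_magnitude(probability, magnitude) * risk_reduction
    return expected_benefit > preparation_cost


if __name__ == "__main__":
    # Hypothetical: a 5% chance of an outcome worth 1000 units, where
    # preparation costs 10 units and improves the outcome by half.
    print(preparation_is_justified(0.05, 1000.0, 10.0, 0.5))
    # Same probability and cost, but an outcome ten times smaller.
    print(preparation_is_justified(0.05, 100.0, 10.0, 0.5))
```

With these placeholder inputs, the conclusion flips when only the magnitude changes. That is the asymmetry the argument relies on: a low probability does not settle the question when the consequences are large enough.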

This is why AI safety has moved from the philosophical fringe to mainstream research agendas. Work on alignment, interpretability, evaluation, and governance aims to ensure that increasingly capable systems remain steerable and beneficial. The aim is not to predict the date AGI arrives, but to be ready regardless of when — or whether — it does.

Frequently asked questions

Is ChatGPT or Claude an AGI?
No. Large language models are impressively broad and can perform many tasks, but they are still generally classified as narrow AI. They lack reliable reasoning over novel problems, persistent memory, continual learning, and grounding in the physical world. Their breadth comes from training on enormous text corpora rather than from genuine general understanding, and they fail in ways no generally intelligent human would, which is why most researchers do not consider them AGI.

What is the difference between AGI and superintelligence?
AGI describes a machine with roughly human-level general intelligence — able to handle most cognitive tasks a person can. Superintelligence describes a system that vastly exceeds the best human minds across virtually every domain. AGI is parity; superintelligence is decisive superiority. Many researchers worry the gap between them could be crossed quickly if an AGI were able to improve its own design, a scenario sometimes called an intelligence explosion.

When do experts think AGI will arrive?
There is no consensus. A 2022 survey of 738 machine-learning researchers produced an aggregate forecast of a 50% chance of high-level machine intelligence by roughly 2059, but individual forecasts ranged from a handful of years to never. Industry leaders tend to predict sooner, often before 2030, while many academics are far more cautious, citing missing capabilities in reasoning, grounding, and continual learning that scaling alone may not resolve.

Why do people worry about AGI safety?
The concern is that a highly capable system whose goals are not well aligned with human intentions could cause serious harm, and a superhuman one might be impossible to correct after deployment. Because the consequences of getting this wrong are potentially severe and irreversible, researchers argue the safeguards — alignment, interpretability, evaluation, and governance — must be developed before the capability arrives, not after.