What Evolution Tells Us About the Path to AGI
Before we go any further, a question: Do you believe the human brain is the product of evolution, or of some form of intelligent design? This piece assumes the former. If you’re in the latter camp, you can stop here — not because the argument gets theological, but because the entire case rests on what evolution actually demonstrates about intelligence. If you accept evolution, keep reading. The implications for AGI are more direct than most people realize.
The First Dead End
The early days of artificial intelligence were dominated by a reasonable-sounding idea: figure out how thinking works, then write code that does it. Build explicit rules for vision. Hand-craft logic for reasoning. Encode knowledge as structured facts. This approach — now called Good Old-Fashioned AI, or GOFAI — ran into two walls simultaneously. The problems were too complex to specify by hand, and even partial solutions couldn’t scale. Someone had to write every rule. The combinatorial explosion of the real world crushed every attempt.
So researchers tried something different. Instead of designing intelligence from the top down, what if you imitated the structure that already produces it? The human brain — and to varying degrees, every animal nervous system — is a network of interconnected nodes that fire, adapt, and reorganize based on experience. Crude artificial versions of this, neural networks, had been theorized since the 1940s. The bet was simple: if the real thing works, a sufficiently good imitation might work too.
This is, at its core, an act of faith. But it is faith grounded in the most robust existence proof available. General intelligence exists. We know how it was built. It was built by evolution — by billions of years of random mutation, brutal selection pressure, and the gradual accumulation of structures that worked well enough to survive. No designer. No blueprint. Just variation and time.
The underlying assumption is worth stating plainly: general intelligence is an emergent property of sufficiently powerful predictive systems. That is not proven. But neither is its opposite. And evolution has already run the experiment once.
Why Does AI Need So Much Data for Training?
One of the most common dismissals of LLMs goes like this: a child can recognize a cat after seeing a handful of examples. An LLM needs to be trained on vastly more. Doesn’t that prove the architecture is inferior — that something essential is missing?
It proves nothing of the sort. The child is not starting from zero. Before that child sees a single cat, evolution has already loaded millions of years of visual processing capability into the hardware — object permanence, edge detection, motion tracking, pattern recognition, depth perception. All of it pre-installed, refined across countless generations of survival pressure. The child isn’t learning to see from scratch. The child is running a query against an extraordinarily sophisticated system that took billions of years to build.
LLMs start from zero. No pre-loaded visual cortex. No inherited reflexes. No evolutionary head start. They compensate with volume — training on the accumulated written output of human civilization, which is itself the encoded product of all that evolved intelligence. It is not less data than evolution used. It is the same data, stored differently — in text rather than in genes and synapses.
The comparison isn’t LLM versus child. It’s LLM versus the entire evolutionary lineage that produced the child. And that lineage has been accumulating training data for billions of years — almost certainly more than any current LLM has processed. The difference is that evolution compressed all of it into biological structures: DNA, neural architecture, instinct, reflex. Compact encodings of an unimaginable training run. Machine learning does something analogous with weights and embeddings — dense, compact representations that capture the essential structure of vast input. Different medium. Same principle.
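The compression claim can be made concrete. As a toy illustration (every number here is invented for the example, not drawn from any real model), an embedding table re-represents a large discrete vocabulary as small dense vectors, which is the sense in which weights "compactly encode" vast input:

```python
import random

vocab_size, embed_dim = 10_000, 64
random.seed(0)

# A one-hot encoding needs a 10,000-dimensional vector per word.
# An embedding table re-represents each word as 64 dense numbers instead,
# learned so that nearby vectors capture similar usage.
embedding = [[random.gauss(0.0, 0.02) for _ in range(embed_dim)]
             for _ in range(vocab_size)]

one_hot_dims = vocab_size                  # dimensions per word, one-hot
dense_dims = embed_dim                     # dimensions per word, embedding
compression = one_hot_dims / dense_dims    # over 150x fewer dimensions
```

The vectors here are random placeholders; in a trained model they are the product of the training run, which is exactly the analogy to DNA and synapses the paragraph above draws.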
The Existence Proof
The brain that resulted is not a precision instrument. It is an accumulation of evolutionary hacks — older structures repurposed, newer ones layered on top, the whole thing running on roughly 20 watts. It is, by any engineering standard, a mess. And yet it generalizes across domains, handles novel situations, reasons under uncertainty, and produces consciousness — or at least something that feels exactly like it from the inside.
Neural networks, even primitive ones, share something essential with this architecture: distributed representation, learned weights, emergent behavior from simple units operating in parallel. The neuroscientist David Marr argued in his 1982 book Vision that understanding any information-processing system requires analysis at multiple levels — what it computes, how it computes it, and what physical substrate runs it. His point was that the implementation level doesn’t determine the computational level. At the computational level, biological brains and artificial neural networks are doing something recognizably similar: finding patterns in input and using those patterns to predict and act.
Critics of modern AI often invoke “mere pattern matching” as a dismissal. The phrase assumes that pattern matching is categorically different from — and lesser than — genuine understanding. But this is precisely what evolution calls into question. The brain is a pattern-matching system. Its neurons fire based on weighted inputs; its architecture was selected because certain patterns of activation produced behaviors that survived. Whatever understanding feels like from the inside, the mechanism underneath is not categorically different from what large neural networks do. It is more complex, certainly. Categorically different, probably not.
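The "weighted inputs" mechanism is simple enough to write down. A minimal sketch of a single artificial neuron, the unit both the biological metaphor and modern networks rest on (weights and inputs chosen arbitrarily for illustration):

```python
import math

def neuron(inputs, weights, bias):
    """A single artificial neuron: a weighted sum of inputs passed
    through a nonlinearity (here, the logistic sigmoid)."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# The unit "fires" strongly for some input patterns and weakly for others,
# purely as a function of its weights -- pattern matching in miniature.
strong = neuron([1.0, 0.0], weights=[4.0, -4.0], bias=-1.0)  # close to 1
weak = neuron([0.0, 1.0], weights=[4.0, -4.0], bias=-1.0)    # close to 0
```

Everything interesting about both brains and large networks comes from stacking billions of these units and letting experience set the weights, but the primitive is no more mysterious than this.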
Neuroscience is increasingly explicit about this. An influential framework in computational neuroscience — predictive coding, associated with Karl Friston’s free energy principle — describes the brain as a prediction machine: a system that continuously generates probabilistic models of sensory input and updates them based on error signals. In other words, the brain is performing approximate Bayesian inference on a massive scale. If that’s what “genuine understanding” looks like under the hood, the “mere statistics” dismissal doesn’t just miss the point — it describes the thing it’s trying to dismiss.
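The core loop of predictive coding (predict, compare, correct) can be caricatured in a few lines. This is a toy delta-rule sketch, not Friston's full free-energy machinery, with made-up sensory values:

```python
def predictive_update(prediction, observation, learning_rate=0.1):
    """One predictive-coding-style step: compute the error between what
    the model expected and what arrived, then nudge the model toward
    the data in proportion to that error."""
    error = observation - prediction
    return prediction + learning_rate * error

# A stream of slightly noisy sensory input hovering around 10.0.
prediction = 0.0
for observation in [9.5, 10.2, 9.8, 10.1, 10.0] * 20:
    prediction = predictive_update(prediction, observation)
# After enough error-driven corrections, the internal model settles
# near the true signal without ever being told what it was.
```

The point of the sketch is the shape of the computation: nothing in the loop is a symbol or a rule, yet the system ends up with an accurate internal estimate of the world outside it.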
If evolution’s messy, unguided process produced general intelligence from a neural substrate, the burden of proof falls on anyone who claims a different substrate cannot do the same.
The Critique That Proves the Point
The dominant critique of large language models today runs something like this: LLMs are sophisticated statistical engines — they predict the next token without any genuine understanding of the world. No model of physics. No cause and effect. No real reasoning. Just pattern matching at enormous scale. The argument was formalized in a 2021 paper by Emily Bender and colleagues, “On the Dangers of Stochastic Parrots,” which coined the term for systems that manipulate linguistic form without any access to meaning. The conclusion drawn, then and since, is that you can scale this forever and never reach AGI because the foundation is wrong.
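For readers who want "predict the next token" made mechanical, here is a toy sketch. The vocabulary and scores are invented for illustration; a real model computes logits over tens of thousands of tokens with billions of parameters:

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(score - m) for score in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores a model might assign to candidate next tokens
# after the prefix "The cat sat on the". All numbers are invented.
vocab = ["mat", "moon", "dog", "quickly"]
logits = [4.0, 1.5, 0.5, -2.0]

probs = softmax(logits)
next_token = vocab[probs.index(max(probs))]  # greedy decoding: take the argmax
```

The entire debate is over what it takes for those scores to be good — whether assigning high probability to the right continuations, across every context humans write about, can be done without something that deserves the name understanding.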
In my opinion, this critique is a restatement of the GOFAI program in different language. The ingredients the stochastic parrot camp says LLMs are missing — grounding, causal models, explicit world representations — are precisely what symbolic AI researchers spent decades trying to hand-code. It didn’t work. The problems were too complex to specify, and there was no path to scale. Dressing those requirements up as a critique of neural networks doesn’t solve either problem; it just relocates the failure.
This critique is a serious argument made by serious people. It is also, on examination, an extraordinary claim. It asserts that the researchers making it know better than four billion years of selection pressure what intelligence actually requires. They have a theory — real intelligence needs symbols, or embodiment, or causal models, or something the critics can gesture at but rarely fully specify — and they’re willing to bet that theory against the only working example of general intelligence we have.
There is a word for the belief that intelligence requires something extra, something that mere physical processes operating on simple units cannot produce. That word is not “neuroscience.”
Nature’s Method, Compressed
Modern machine learning has not stayed static while critics catalogued its limitations. Retrieval-augmented generation gives models access to external memory — something evolution solved with the hippocampus. Multi-step reasoning chains externalize working memory — something evolution solved by expanding the prefrontal cortex. Tool use and agentic architectures let models act on the world and observe consequences — something evolution solved by connecting nervous systems to bodies over hundreds of millions of years.
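The retrieval step behind RAG is conceptually simple. A hedged sketch, with keyword overlap standing in for the vector search a production system would actually use, and with the documents invented for the example:

```python
def retrieve(query, documents):
    """Score each document by word overlap with the query and return the
    best match -- a crude stand-in for real embedding-based search."""
    query_words = set(query.lower().split())
    return max(documents,
               key=lambda d: len(query_words & set(d.lower().split())))

def augmented_prompt(query, documents):
    """Prepend the retrieved passage to the query, giving the model an
    external memory it was never trained on."""
    context = retrieve(query, documents)
    return f"Context: {context}\n\nQuestion: {query}"

docs = [
    "The hippocampus consolidates episodic memories.",
    "Retrieval-augmented generation fetches documents at inference time.",
]
prompt = augmented_prompt("How does retrieval-augmented generation work?", docs)
```

Note that nothing about the model itself changed: the memory lives outside the network and is spliced in at inference time, which is the sense in which it parallels the hippocampus rather than the cortex.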
None of these additions required inserting a theory of intelligence. They required observation, experimentation, and iteration. The same process nature used, compressed by engineering and running on a different substrate.
The empirical record is hard to argue with. A decade ago, the list of things neural networks couldn’t do was long and confident: they couldn’t hold a conversation, write coherent prose, generate images from descriptions, solve novel mathematical problems, pass professional licensing exams. The list is shorter every year. The velocity of improvement is not slowing.
You don’t need a PhD in AI to see it. Today’s LLMs may not be AGI — but they are remarkable, and anyone who has spent serious time with them knows it. The proof is in the pudding. Whatever these systems are doing, it is working, and it is getting better.
And the deficiencies keep shrinking. Not by throwing out the foundation and starting over, but by tweaking the basic mechanism — exactly the way nature does it. Hallucinations reduced. Reasoning improved. Memory added. Context expanded. Each problem that was supposed to prove the architecture was fatally flawed turned out to be an engineering problem with an engineering solution. Nature never scrapped the neuron and started over. It just kept refining.
The Theological Argument You Didn’t Know You Were Making
To say that LLMs cannot become AGI is to make a specific claim: that the architecture is fundamentally wrong. Not immature, not incomplete — wrong at the foundation. That no amount of scale, data, or architectural refinement will get you from here to general intelligence.
But consider what that claim requires you to believe. The human brain began as a simple neural tube in primitive vertebrates. It was not designed. It had no roadmap. Through random variation and selection, it became what it is now — capable of language, mathematics, art, science, and every other cognitive achievement we recognize as distinctly human. At no point in that process was the foundation “right.” It was always a hack, always imperfect, always good enough to survive and reproduce.
To say that the neural network approach cannot reach AGI because it lacks something essential is to say that the brain could never have become what it became — that at some point in evolutionary history, an observer would have been justified in saying: this architecture has a ceiling, and general intelligence is above it.
That observer would have needed a designer to explain where general intelligence actually came from.
Nature is not intelligent. It is random. But given enough time and selection pressure, randomness produces intelligence. We have proof. It’s reading this sentence.
The only question worth arguing about is how long it takes us to compress four billion years of evolution into something we can run in a data center. The evidence so far suggests: faster than anyone expected.

