Research directions for AGI that I am excited about.

It is well known that AI scientists have, for decades, looked to the human mind for inspiration. Through the lens of this parallel, this blog traces the progress we have made thus far, why it does not yet feel like AGI is here, and what might be missing. The final section explores what lies beyond LLMs.

I am aware that the examples and analogies I have chosen are biased towards the research directions I am more optimistic about. I do not pretend this is a purely objective view, but rather my intuition on how the AI world is evolving. This is an opinionated piece, and it points to what I am prioritising in my own research experiments.

tldr:

The Story Thus Far.

Most human intelligence throughout history has been expressed and discussed through language. If you are an AI researcher, this seems like a great starting point for building human-like intelligence. Language is intrinsically tied to how humans learn; or, more strongly, it shapes human cognition (formally, Linguistic Relativity) [1]. You train a model to mimic human speech. At first, it is simple “next word prediction”, but with scaling, intelligence emerges.

[1] Some people criticise this parallel on the grounds that LLMs seemingly require far more data to learn concepts that even babies pick up naturally and quickly. Researcher Yann LeCun has actually argued the opposite: a 4-year-old child has processed ~50x more data than the largest LLMs. The optic nerve alone delivers ~20 MB/sec to the visual cortex. Over 16,000 waking hours, that is ~10^15 bytes of visual data alone, roughly 50x the ~2×10^13 bytes of text used to train the largest LLMs.
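The arithmetic behind these figures is easy to sanity-check. A quick back-of-envelope sketch, using only the estimates quoted above:

```python
# Back-of-envelope check of the data-volume comparison above.
# All inputs are the rough estimates from the text: ~20 MB/s optic-nerve
# bandwidth, ~16,000 waking hours by age four, ~2e13 bytes of LLM training text.
SECONDS_PER_HOUR = 3600
optic_nerve_bytes_per_sec = 20e6   # ~20 MB/s to the visual cortex
waking_hours = 16_000

child_bytes = optic_nerve_bytes_per_sec * waking_hours * SECONDS_PER_HOUR
llm_bytes = 2e13                   # ~2 x 10^13 bytes of training text

print(f"child visual input ~ {child_bytes:.2e} bytes")
print(f"ratio ~ {child_bytes / llm_bytes:.0f}x")
```

Running this gives roughly 1.15×10^15 bytes for the child and a ratio of about 58x, which is where the “~50x” figure comes from.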

Scaling laws work well, and you are able to create the first generation of LLMs, like GPT-3.5 and Llama-3. They often produce human-like responses, but they tend to hallucinate and get things wrong[2]. At this stage, LLMs are like a “drunk guy at the bar”. If you asked a drunk person to multiply two large numbers, he might confidently respond with a best guess in the approximate range of the real answer (try it right now: do not calculate! answer quickly! what is 134 x 5453?); very similar to what LLMs do. If I asked you “how many Rs are in strawberry”, your brain would instinctively start counting explicitly. “Reasoning” or “thinking” is so baked into us that we sometimes do not consciously notice it, but the drunk guy at the bar, whose reasoning faculties are not active, is a decent model for an LLM’s responses.

[2] Humans “hallucinate” too. The key difference is that we have the metacognition to know when we have not done the analytical work required to get the correct answer, so we either present our answer with uncertainty or do the work needed to increase our confidence. LLMs do not communicate uncertainty, which makes their hallucinations more frustrating.

You realise there is a huge reasoning gap on the path to AGI, so you start training the next generation of models to mimic human “reasoning”, and soon you have models like OpenAI-o1 and DeepSeek-R1. These are the first models to plan over long horizons and solve complex math and coding problems. Promising.[3]

[3] It was recently discovered that the reasoning chains exposed by these LLMs are not faithful to their actual computation (i.e., CoT is not the causal mechanism but a story the model tells about a decision already made in the residual stream). I have recently fallen down the rabbit hole of Michael Gazzaniga’s fascinating work on split-brain patients. The research shows that humans are likely also very bad at accurately identifying and representing the reasoning behind our actions, and often confabulate post-hoc explanations. This parallel, although purely speculative, feels very interesting to me. This interview with Annaka Harris is a great explainer on the split-brain research from a philosophy-of-mind perspective.

That brings us to today. AI is smarter than most humans, but something is missing from the AGI picture.

The Missing Piece.

So why do LLMs not yet feel human-like? What are they missing?

One major gap that is frequently brought up is that humans evolve and update priors as we interact with the world; LLM weights, in contrast, are fixed. This is the concept of continual learning. For a second, imagine a magic wand was waved and continual learning was implemented in its ideal state. LLMs could then learn anything very quickly, without catastrophic forgetting. Like a human learning something new, such an LLM would perfectly “internalise” it and update its priors in real time. While this would be incredible, I would argue it is still not AGI: some external force still has to explicitly induce this learning. The model does not decide, unprompted on a random Tuesday morning, that it will specialise in black hole theory, the way a curious human might.
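To see why the “magic wand” matters, here is a deliberately naive toy (a single scalar weight, plain SGD; nothing to do with any real continual-learning method) showing the catastrophic forgetting that the wand is assumed to wave away:

```python
import numpy as np

# Toy illustration of catastrophic forgetting: a one-weight linear model
# (y_hat = w * x) is trained online on task A, then on task B with the same
# naive updates, and its task-A knowledge is overwritten.
rng = np.random.default_rng(0)

def sgd(w, xs, ys, lr=0.01, steps=500):
    """Plain online SGD on squared error, one random sample per step."""
    for _ in range(steps):
        i = rng.integers(len(xs))
        grad = 2 * (w * xs[i] - ys[i]) * xs[i]
        w -= lr * grad
    return w

xs = rng.uniform(-1, 1, 100)
w = sgd(0.0, xs, 2 * xs)            # task A: learn y = 2x
err_a_before = abs(w - 2)           # small: task A is learned
w = sgd(w, xs, -3 * xs)            # task B: learn y = -3x, same weights
err_a_after = abs(w - 2)            # large: task A has been forgotten

print(err_a_before, err_a_after)
```

The single weight cannot hold both tasks, so learning B destroys A. Ideal continual learning would keep `err_a_after` small while still mastering task B.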

This is agency. OpenClaw was the first agent harness to attempt to simulate this piece. I think cron jobs are an 80/20 solution to agency, and I would attribute a large part of OpenClaw’s virality to the marvel people felt at the 80 (through things such as Moltbook). As with any 80/20 solution, the last 20 is the hardest and often the most impactful. In humans, agency does not arrive at evenly spaced intervals like cron jobs. We are constantly thinking, identifying salient things and making internal decisions. This layer of metacognition does not exist in LLMs.
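The contrast between the two trigger models can be sketched in a few lines. The class and function names here (`CronAgent`, `SalienceAgent`, `act`) are hypothetical and illustrative, not OpenClaw’s actual API:

```python
# Hypothetical sketch: fixed-interval wake-ups versus salience-gated wake-ups.
# None of these names correspond to a real agent framework.

def act(reason: str) -> str:
    return f"acted because: {reason}"

class CronAgent:
    """Wakes at fixed intervals, regardless of what is happening."""
    def __init__(self, interval: int):
        self.interval = interval

    def run(self, duration: int) -> list[str]:
        return [act(f"tick at t={t}") for t in range(0, duration, self.interval)]

class SalienceAgent:
    """Wakes only when an internal salience score crosses a threshold."""
    def __init__(self, threshold: float):
        self.threshold = threshold

    def run(self, events: list[tuple[int, float]]) -> list[str]:
        return [act(f"salient event at t={t}") for t, s in events if s >= self.threshold]

cron_actions = CronAgent(interval=10).run(duration=30)      # wakes at t=0, 10, 20
salient_actions = SalienceAgent(threshold=0.8).run(
    [(3, 0.2), (7, 0.95), (21, 0.5), (22, 0.9)])            # wakes only twice
print(len(cron_actions), len(salient_actions))
```

The hard part, of course, is the salience function itself: in humans it is produced by continuous internal monitoring, which is exactly the metacognitive layer LLMs lack.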

Our decisions to act are often steered by internal emotional reactions (formally, Damasio’s Somatic Marker Hypothesis). Interestingly, LLMs do “express” emotions. That is, if you trained a linear probe on the activations of an LLM, you would see that certain directions in its activation space repeatedly “light up” for conventionally scary/happy/angry etc. responses (see here, here and here). But of course, LLMs are stateless, so an emotional activation does not persist or compound into the kind of self-directed agency we experience. LLMs also do not phenomenologically experience emotions, and I am unsure whether this is a necessary condition for developing agency (or whether the conceptual representations of pain, suffering, fear, joy and excitement are sufficient).
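For readers unfamiliar with linear probes, here is a minimal sketch of the idea on fabricated data: synthetic “activation” vectors where one direction encodes an emotion label, and a logistic-regression probe trained to recover it. Real probes are trained the same way, but on activations captured from an actual model:

```python
import numpy as np

# Toy linear probe: fabricate activations with a planted "emotion direction",
# then fit logistic regression by gradient descent to recover the label.
rng = np.random.default_rng(0)
d, n = 64, 400
emotion_dir = rng.normal(size=d)
emotion_dir /= np.linalg.norm(emotion_dir)        # unit "fear direction"

labels = rng.integers(0, 2, n)                    # 1 = "fear-like" response
# Gaussian noise plus a +/-2 shift along the planted direction:
acts = rng.normal(size=(n, d)) + np.outer(2 * labels - 1, emotion_dir) * 2.0

w, b = np.zeros(d), 0.0
for _ in range(300):                              # plain logistic regression
    p = 1 / (1 + np.exp(-(acts @ w + b)))
    w -= 0.5 * (acts.T @ (p - labels)) / n
    b -= 0.5 * np.mean(p - labels)

acc = np.mean(((acts @ w + b) > 0) == labels)
print(f"probe accuracy: {acc:.2f}")
```

The probe finds the planted direction almost perfectly here because the data was constructed to contain one; the interesting empirical finding in the linked work is that real LLM activations contain such directions too.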

Finally, on practicality. Is it computationally viable to develop a synthetic default mode network to simulate metacognition in LLMs? On the surface, it seems to require an absurd amount of compute and power to run the endless forward passes that would simulate the constant thinking humans do. And is that even a desirable outcome, or is our reference point for AGI not actually a practically useful one?

Looking Beyond Language.

I am increasingly unsure that language models are the best framework. Connecting this back to the start of this essay: language is indeed how we express ideas, but not how we actually think. We do not have an explicit CoT process in our brains drafting ideas in formal sentences which we then speak. Human reasoning is closer to latent thinking, so I am excited about frameworks like JEPA that train rich, semantic, physically-grounded embeddings as a primary objective rather than as a byproduct of scaling next-token prediction.
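To make the “loss lives in embedding space” point concrete, here is a heavily simplified, JEPA-flavoured toy: a predictor is trained to map the embedding of a context to the embedding of a related target, so the objective never touches pixel or token space. The random frozen encoders and linear predictor are fabrications for illustration, not the actual JEPA architecture:

```python
import numpy as np

# Toy JEPA-style objective: predict the *embedding* of a target from the
# embedding of its context. Encoders are frozen random projections; only
# the linear predictor P is trained. Purely a sketch of the loss structure.
rng = np.random.default_rng(0)
d_in, d_emb, n = 16, 8, 200

enc_ctx = rng.normal(size=(d_emb, d_in)) * 0.3    # context encoder (frozen)
enc_tgt = rng.normal(size=(d_emb, d_in)) * 0.3    # target encoder (frozen)
P = np.zeros((d_emb, d_emb))                      # predictor (trained)

x = rng.normal(size=(n, d_in))                    # "context" inputs
y = x + 0.05 * rng.normal(size=(n, d_in))         # "target" = nearby view

z_ctx, z_tgt = x @ enc_ctx.T, y @ enc_tgt.T       # everything below is latent
for _ in range(500):                              # gradient descent on MSE
    err = z_ctx @ P.T - z_tgt
    P -= 0.05 * (err.T @ z_ctx) / n

loss = np.mean((z_ctx @ P.T - z_tgt) ** 2)
print(f"latent prediction loss: {loss:.4f}")
```

The key structural point: the reconstruction target is an embedding, so the model is never forced to predict every surface detail of the input, only its semantically relevant content.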

Going back to the drunk guy analogy: scaling CoT reasoning feels like having a drunk guy at the bar who thinks out loud for a while before eventually stumbling onto the right answer. Imagine being asked to solve the same math problem while being forced to speak constantly, rather than doing the calculation in your head and then producing the answer. You might get the same result, but doing the multiplication out loud is neither convenient nor efficient compared to simply calculating it quietly. This is a roundabout way of saying that language feels like a very lossy space in which to reason. I am excited about directions focused on increasing latent thinking (like Soft Thinking and Coconut) and improving the quality of latent embedding representations.
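Where the lossiness comes from can be shown with a toy recurrent “model” standing in for a transformer. In token-space CoT, each step projects the hidden state onto the vocabulary, picks one token, and re-embeds it, a hard discretisation bottleneck. In latent-space reasoning (the Coconut idea), the full continuous state is fed back instead. The weights and dimensions here are arbitrary; this shows only where the bottleneck sits, not how Coconut is actually implemented:

```python
import numpy as np

# Toy contrast: token-space chain-of-thought (argmax + re-embed each step)
# versus latent-space reasoning (feed the hidden state straight back in).
rng = np.random.default_rng(0)
d_model, vocab = 32, 100
W_h = rng.normal(size=(d_model, d_model)) * 0.1   # recurrence weights
W_out = rng.normal(size=(vocab, d_model)) * 0.1   # unembedding (to vocab)
W_in = rng.normal(size=(d_model, vocab)) * 0.1    # embedding (from vocab)

h0 = rng.normal(size=d_model)

# Token-space CoT: the state is squeezed through a single discrete token
# at every step, discarding everything else the hidden state contained.
h_tok = h0.copy()
for _ in range(5):
    tok = np.argmax(W_out @ h_tok)                # discretise into one token
    h_tok = np.tanh(W_h @ h_tok + W_in[:, tok])   # continue from its embedding

# Latent CoT: the full continuous state is passed between steps unchanged.
h_lat = h0.copy()
for _ in range(5):
    h_lat = np.tanh(W_h @ h_lat + h_lat)          # no discretisation step

print(h_tok.shape, h_lat.shape)
```

Both trajectories have the same shape, but the token-space path carries at most log2(100) ≈ 6.6 bits per step across the bottleneck, while the latent path keeps the whole 32-dimensional state.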