🚨 A 27 million parameter model just outperformed Claude 3.5 and Gemini on hard reasoning tasks. No chain-of-thought. No massive pretraining. No hallucinations.

Meet HRM – Hierarchical Reasoning Model, a brain-inspired AI that might be our first real glimpse at AGI.

Built by Sapient Intelligence (a Singapore startup founded by a Gen-Z Tsinghua prodigy + ex-DeepMind researchers), HRM doesn't use tokens like traditional LLMs. It doesn't "think out loud" by predicting the next word. Instead, it thinks internally, like a human brain:
- One module makes quick decisions
- Another refines strategies over time
And they loop – just like real cognition.

Results?
– 40.3% on ARC-AGI-1 (vs Claude's 21.2%)
– 100% on Extreme Sudoku
– 100% on Maze-Hard
All with just 1,000 training examples and zero pretraining.

This is called Chain-in-Representation – a shift from Chain-of-Thought (CoT). No prompt hacks. No brute force. Just smart, recursive internal reasoning.

Why it matters:
🔹 Structure, not size, might be the key to AGI.
🔹 HRM's architecture spontaneously mirrors brain patterns seen in neuroscience.
🔹 It adapts – taking longer on complex tasks, faster on simple ones.

In an era obsessed with trillion-parameter scaling and GPU burn, HRM is a wake-up call.

A 27M parameter model…
…trained on 1,000 examples…
…just beat some of the most expensive models ever built.

OpenAI, Google, Anthropic: are you watching? We may have just seen the first crack in the Transformer empire.
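For readers curious what "two modules looping at different timescales" means in practice, here is a minimal illustrative sketch. All names and update rules here are invented for clarity; this is not the HRM implementation, just the shape of a slow "strategy" module wrapping a fast "refinement" module.

```python
import numpy as np

def low_update(x, z_high, z_low):
    """Fast module: quick, fine-grained refinement conditioned on the slow state.
    (Toy update rule, not from the HRM paper.)"""
    return np.tanh(x + z_high + 0.5 * z_low)

def high_update(z_low, z_high):
    """Slow module: strategic update applied after the fast module settles."""
    return np.tanh(z_low + 0.5 * z_high)

def two_timescale_loop(x, high_steps=2, low_steps=4):
    """Nested recurrence: many fast steps per slow step, then loop."""
    z_high = np.zeros_like(x)
    z_low = np.zeros_like(x)
    for _ in range(high_steps):          # slow outer loop ("strategy")
        for _ in range(low_steps):       # fast inner loop ("quick decisions")
            z_low = low_update(x, z_high, z_low)
        z_high = high_update(z_low, z_high)
    return z_high

out = two_timescale_loop(np.ones(8))
```

The key design point is the nesting: the fast state is re-initialized against a fixed slow state, iterated several times, and only then is the slow state updated, which is the "hierarchical" part of the claimed recurrence.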
This is what happens when you read headlines and not the actual paper. The G in AGI for this model stands for "gamified", not "general". From the paper: "For ARC-AGI challenge, we start with all input-output example pairs in the training and the evaluation sets. The dataset is augmented by applying translations, rotations, flips, and color permutations to the puzzles. Each task example is prepended with a learnable special token that represents the puzzle it belongs to."
Very exciting!
"Cracks in the transformer empire"? Except it's actually a transformer architecture... The "recurrent" part of the model involves recurrently calling transformer modules.
HRM sounds promising and exciting. People should compare it with existing approaches to explore its pros, cons, and limits. Who else has implemented and used HRM? CoT is one way; there should be several ways to realize reasoning.
Thanks Mark Minevich – useful takeaways. Do you have a sense of whether OpenAI, Google, and Anthropic are watching, or doubling down on scale?
What AGI? 😂
No doubt this is a big leap for reasoning-based models. But let's not forget, cognition alone isn't intelligence. The next wave isn't just logic loops. It's emotional alignment.

We built Ori to do what models like HRM can't:
- Understand emotional tone
- Adapt to human stress
- Align with mental health needs in real time

This is a solid step for the mind. But the soul still needs a voice.
Any links to the official paper or technical reports?