AI · 14h ago

Microsoft VP of AI Nando de Freitas proposes replacing complex training pipelines with unified, continual interactive causal agents

The single stream replaces separate SFT and RLHF objectives

Original post
Nando de Freitas @NandoDF · #28 in AI

The field of AI is at a local minimum. Not a local minimum in architectures and models, but a local minimum on how we train: a Frankenstein multi-stage approach. In this new blog entry, I propose a different route based on continual interaction and causality.

https://love4all.ai/blog/continual-interactive-causal-agents/

4:50 AM · Jun 7, 2026 · 13K Views
Sentiment

Users like Nando de Freitas's proposal for single-stream interactive causal agent training because they find the causal approach interesting and directionally promising.

Positive: 100.0% · Negative: 0.0%
6 comments with sentiment.
Cluster Engagement
Posts from X
Most Activity
Views 2.8K · Bookmarks 3 · Likes 9 · Replies 2
Michael Black @Michael_J_Black

@NandoDF I like this. Directionally, it feels right.

13h · Views 2.8K · Likes 9 · Bookmarks 3
Pedro A. Ortega @AdaptiveAgents

This is the way

6h · Views 278 · Likes 4 · Bookmarks 1
Julius Adebayo @juliusadml

thought provoking read.

2h · Views 180 · Likes 0 · Bookmarks 1
Pim de Witte @PimDeWitte

@Michael_J_Black @NandoDF So basically world models 😜

13h · Views 41 · Likes 1
Pim de Witte @PimDeWitte

@NandoDF @Michael_J_Black I view all of those things as just text conditioning and steerability on a WM architecture. What you’re describing here is precisely the original promise (and reason) WMs are being pursued so hard. In case you hadn’t read yet: https://www.notboring.co/p/world-models

13h · Views 8

@Michael_J_Black Thanks, Michael. It’s still not very developed or properly tested, but I agree that directionally, it feels worth trying

13h · Views 1.8K · Likes 3 · Bookmarks 0

@PimDeWitte @Michael_J_Black Precisely not. This is not about model architectures, what people often stress when talking about world models. This works with Jepa or GPT. This is about causal interactive training. It’s all about environments, not agents.

13h · Views 16

Thanks for asking. Pedro pointed out the issue much earlier, when I was working on General AgenT One (Gato 🐈)

https://arxiv.org/abs/2205.06175

and wrote about it

https://arxiv.org/abs/2110.10819

Then Pedro came up with this brilliant theoretical insight:

https://www.adaptiveagents.org/universal_ai_as_imitation

And we generalised it to LLMs recently:

https://love4all.ai/files/why-it-is-important-to-understand-causality-and-agency.pdf

https://love4all.ai/files/emergent-reward-maximization.pdf

7h · Views 55 · Likes 1
Pim de Witte @PimDeWitte

@NandoDF @Michael_J_Black P(S’|A,S) defines how the world evolves, P(S’|do(A),S) is how you train general agents inside those WMs. They supplement each other in the loop you lay out. I see your distinction though - two sides of the same coin

12h · Views 22 · Likes 1
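
The distinction between P(S’|A,S) and P(S’|do(A),S) in the post above can be made concrete with a toy calculation. The sketch below is an illustration only, not code from the thread or the linked blog: it assumes a hypothetical hidden binary factor `U` that influences both the logged action `A` and the next state `S'`, omits the current state `S` for brevity, and uses made-up probabilities. Under those assumptions, conditioning on a logged action and intervening on it give different answers.

```python
# Hypothetical toy world (all numbers are illustrative assumptions):
# a hidden binary factor U drives both the logged action A and the next state S'.
P_U = {0: 0.5, 1: 0.5}


def p_a_given_u(a, u):
    """Logged behaviour policy: chooses A equal to U 90% of the time."""
    return 0.9 if a == u else 0.1


def p_s_given_a_u(s_next, a, u):
    """World dynamics: the action helps a little, the hidden factor a lot."""
    p_one = 0.2 + 0.1 * a + 0.6 * u
    return p_one if s_next == 1 else 1.0 - p_one


def observational(s_next, a):
    """P(S'|A): seeing A in the logs also tells us something about U."""
    joint = sum(P_U[u] * p_a_given_u(a, u) * p_s_given_a_u(s_next, a, u) for u in P_U)
    evidence = sum(P_U[u] * p_a_given_u(a, u) for u in P_U)
    return joint / evidence


def interventional(s_next, a):
    """P(S'|do(A)): the agent sets A itself, so U keeps its prior distribution."""
    return sum(P_U[u] * p_s_given_a_u(s_next, a, u) for u in P_U)


if __name__ == "__main__":
    print(f"P(S'=1 | A=1)     = {observational(1, 1):.2f}")
    print(f"P(S'=1 | do(A=1)) = {interventional(1, 1):.2f}")
```

With these assumed numbers the observational estimate is 0.84 while the interventional one is 0.60: a model fit only to logged data would overstate what the action achieves when the agent takes it on its own initiative.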

@PimDeWitte @Michael_J_Black Could you please be more precise? Could you show us precisely how world models are trained causally? Thanks

8h · Views 61

@BlissyOnX I agree. But everything has a beginning 🙂

7h · Views 51

@NandoDF multi-stage pipeline feels like duct taping training phases together

the interventional agent is cleaner but do we have the infra to pull it off

14h · Views 34
Blissy @BlissyOnX

@NandoDF the single stream feels inevitable tbh, but the gap between proposing it and making it work is where the real friction lives

14h · Views 25
Strata @ChainZenit

@NandoDF this take on causality is super interesting, how did you start?

14h · Views 24
Pim de Witte @PimDeWitte

Loop would hold for causal / non causally trained WMs no? You are definitely the expert on the former, I was mostly commenting on the fact that your loop is why WMs are so exciting, as it rapidly accelerates the amount of environments. We do generative WMs (not causal). For what it’s worth, I like your post and my comment was intended as light jest, not criticism!

4h · Views 23

@NandoDF What is the conceptual difference between stages 2 and 3, or 5 and 6?

12h · Views 21
Rugbist @rugbist_

@NandoDF single stream agent approach would def reduce all the prompt engineering hell we deal with now

question is what happens to fine tuning in that setup

14h · Views 19
Alex YGift @Radipdegen

@NandoDF hard to disagree that multi-stage feels like patching a leak with more patches

wonder if the compute budget holds up in practice though

14h · Views 16
dimenwarper @tsuname

@NandoDF I like the causal take, makes a lot of sense as a unifier. How would you also merge pre-training here? or is that simply too much of a completely different thing (non-interventional bootstrapping)

5h · Views 15
Invincible @InvincibleEdge

@NandoDF curious how u see continual interaction scaling in practice tho

most labs cant even keep one training run stable

14h · Views 12