The world of artificial intelligence has long been obsessed with one simple idea: bigger models perform better. Companies have been locked in an arms race, building models with hundreds of billions of parameters that consume massive amounts of energy and require enormous computational resources. Microsoft's new research report introduces rStar2-Agent, which takes a different approach: instead of just thinking longer, it teaches models to think smarter by actively using coding tools to verify, explore, and refine their reasoning process.