Qwen's former lead argues hybrid thinking fell short and makes the case for agents

Junyang Lin, the former technical lead of Alibaba's Qwen team, offered a candid postmortem on Qwen3's design bets, specifically its hybrid thinking modes and dynamic thinking budgets, which tried to let a single model toggle between fast answers and deliberate reasoning. His conclusion: the merge fell short, and the more promising direction is a shift from 'reasoning thinking' to 'agentic thinking,' backed by agentic reinforcement-learning infrastructure that is harder to build.
The critique is significant because it comes from an insider who helped build one of the most influential open-weight model families. Hybrid thinking was a widely copied idea (variable reasoning budgets appear across the industry), so a former technical lead saying it didn't pan out as hoped is a meaningful signal about where model design is heading.
The 'agentic thinking' framing means optimizing models to act — plan, use tools, execute multi-step tasks in environments — rather than just reason longer internally. That requires RL environments and infrastructure that are harder to build than static reasoning benchmarks, which is Lin's core argument for where effort should go next.
Competitive context: this crystallizes the week's theme, the pivot to agents. It aligns with NVIDIA's ASPIRE robotics framework, AWS's agentic deployment push, and xAI's agent products. It is tempered, however, by Apple's cautionary multi-agent research, which finds that current multi-agent coordination underperforms and suggests the agentic era's infrastructure is still immature. What to watch: whether Alibaba's next Qwen release reflects this agentic-first philosophy.