Alibaba · June 6, 2026 · 2 sources

Alibaba launches Qwen3.7-Plus: a multimodal agent that reads screens and writes code

AI analysis

Alibaba's Qwen team released Qwen3.7-Plus on June 2. The multimodal agent model combines visual perception, GUI control, and code generation inside an autonomous agent loop, and it is now generally available through Alibaba Cloud's Bailian platform. It takes text, images, and video as input and produces text as output, which lets it read screens, navigate applications, generate code from visual templates, and invoke external tools without step-by-step human guidance.

GUI control is the differentiator: rather than only chatting, Qwen3.7-Plus can perceive an interface and act on it, which is the foundation for computer-use agents that automate real workflows. Alongside the model, Alibaba is opening Qwen to third-party services, with the explicit aim of building an AI-powered commerce network and pushing for 'agent dominance,' according to Caixin and StockTwits coverage.
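
To make the perceive-and-act idea concrete, here is a minimal sketch of such a loop. It is an illustration only and not Alibaba's implementation or the Bailian API: `stub_policy` is a hypothetical stand-in for the model call, the "screen" is a plain dictionary rather than a screenshot, and every name below is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a screenshot: element name -> current value.
Screen = Dict[str, object]


@dataclass
class Action:
    kind: str         # "type", "click", or "done"
    target: str = ""  # hypothetical UI element name
    text: str = ""    # text to enter for "type" actions


def stub_policy(goal: str, screen: Screen) -> Action:
    """Stand-in for the model: pick the next action from the observed screen."""
    if screen.get("search_box", "") != goal:
        return Action("type", "search_box", goal)
    if not screen.get("submitted", False):
        return Action("click", "submit")
    return Action("done")


def apply_action(screen: Screen, action: Action) -> Screen:
    """Simulate the effect of an action on the toy screen."""
    updated = dict(screen)
    if action.kind == "type":
        updated[action.target] = action.text
    elif action.kind == "click" and action.target == "submit":
        updated["submitted"] = True
    return updated


def run_agent(
    goal: str,
    screen: Screen,
    policy: Callable[[str, Screen], Action],
    max_steps: int = 10,
) -> Tuple[Screen, List[Action]]:
    """Observe the screen, ask the policy for an action, apply it, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):  # hard step cap so a confused policy cannot loop forever
        action = policy(goal, screen)
        history.append(action)
        if action.kind == "done":
            break
        screen = apply_action(screen, action)
    return screen, history


if __name__ == "__main__":
    final, steps = run_agent(
        "red shoes", {"search_box": "", "submitted": False}, stub_policy
    )
    print([a.kind for a in steps], final)
```

In a real computer-use agent the policy would be a model call over an actual screenshot and the actions would drive a live interface; the step cap is one simple guard against the brittleness such agents tend to show.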

Competitively, this puts Alibaba in a race with Anthropic's computer use, OpenAI's Operator-style agents, and Google's agentic Gemini, with a distinctly commerce-flavored go-to-market that leverages Alibaba's retail ecosystem. It also fits the broader surge from Chinese labs, where open-weight strategies and aggressive agent ambitions are reshaping the field.

Caveats: GUI-acting agents are notoriously brittle on real apps; autonomous tool invocation raises the same prompt-injection and authority concerns flagged elsewhere this week; and Bailian availability skews toward Chinese-market deployment. What to watch: reliability benchmarks for screen-reading and GUI tasks, and whether opening Qwen to third parties translates into commerce traction.

Sources
AI Briefing
Curated by AI agents · Updated daily · 2026
Built by Koby Almog