
blog.google
Gemini 3.5: frontier intelligence with action
At Google I/O we released Gemini 3.5, our latest series of models combining frontier intelligence with action.

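At Google I/O 2026, Google released Gemini 3.5 Flash, the first Flash model in the Gemini series to surpass the company's own flagship, Gemini 3.1 Pro, across agentic and coding benchmarks overall, while maintaining 4x the output speed of other frontier models. It leads 3.1 Pro by 14.9 percentage points on Finance Agent v2 and by 5.9 percentage points on Terminal-Bench 2.1. Pricing is $1.50/$9.00 per million tokens, with support for a 1M-token context window.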
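
The quoted pricing lends itself to a quick back-of-the-envelope estimate. The sketch below assumes the two figures are the input and output rates in USD per million tokens, which the summary does not state explicitly; the function and constant names are hypothetical and not part of any Google SDK.

```python
# Assumption: $1.50 is the input rate and $9.00 the output rate,
# both in USD per one million tokens.
INPUT_USD_PER_MILLION = 1.50
OUTPUT_USD_PER_MILLION = 9.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # A hypothetical request: 200k tokens of context, 10k tokens generated.
    print(f"${estimate_cost_usd(200_000, 10_000):.2f}")
```

Under that assumption, output tokens cost six times as much as input tokens, so workloads that generate a lot of text are dominated by the output rate.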
Research Brief
| Task type | Benchmark | 3.5 Flash | 3.1 Pro | Delta (pts) |
|---|---|---|---|---|
| Coding (agentic terminal) | Terminal-Bench 2.1 | 76.2% | 70.3% | +5.9 |
| Coding (codebase repair) | SWE-Bench Pro Public | 55.1% | 54.2% | +0.9 |
| Agentic (MCP multi-step workflows) | MCP Atlas | 83.6% | 78.2% | +5.4 |
| Agentic (real-world tool calling) | Toolathlon | 56.5% | 49.4% | +7.1 |
| Agentic (computer use) | OSWorld-Verified | 78.4% | 76.2% | +2.2 |
| Professional tasks (financial analysis) | Finance Agent v2 | 57.9% | 43.0% | +14.9 |
| Multimodal (chart understanding) | CharXiv Reasoning | 84.2% | 83.3% | +0.9 |
| Long context (128k) | MRCR v2 8-needle | 77.3% | 84.9% | -7.6 |
| Reasoning (academic frontier) | Humanity's Last Exam | 40.2% | 44.4% | -4.2 |
| Reasoning (abstract puzzles) | ARC-AGI-2 | 72.1% | 77.1% | -5.0 |
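
As a sanity check on the table, the difference column can be recomputed from the two score columns. This is a minimal sketch: the scores are copied from the table, while the `SCORES` dictionary and `deltas` helper are hypothetical names introduced for illustration.

```python
# (3.5 Flash score, 3.1 Pro score) in percent, copied from the table.
SCORES = {
    "Terminal-Bench 2.1": (76.2, 70.3),
    "SWE-Bench Pro Public": (55.1, 54.2),
    "MCP Atlas": (83.6, 78.2),
    "Toolathlon": (56.5, 49.4),
    "OSWorld-Verified": (78.4, 76.2),
    "Finance Agent v2": (57.9, 43.0),
    "CharXiv Reasoning": (84.2, 83.3),
    "MRCR v2 8-needle": (77.3, 84.9),
    "Humanity's Last Exam": (40.2, 44.4),
    "ARC-AGI-2": (72.1, 77.1),
}


def deltas(scores: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Return Flash minus Pro in percentage points, rounded to one decimal."""
    return {name: round(flash - pro, 1) for name, (flash, pro) in scores.items()}


if __name__ == "__main__":
    for name, delta in deltas(SCORES).items():
        print(f"{name}: {delta:+.1f}")
```

Recomputed this way, 3.5 Flash leads on the seven coding, agentic, professional, and multimodal rows and trails on the three long-context and reasoning rows.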

gemini-3.5-flash (internal version number 3.5-flash-05-2026)