Gemini CLI runs on Google's Gemini models, all of which carry a 1M-token context. Gemini 3.1 Pro is the quality flagship; Gemini 3.5 Flash is the fast, well-priced everyday default; and Gemini 3.1 Flash Lite is the cheapest option for high-volume calls.
The flagship — frontier quality and the best long-context recall, for the hardest tasks and biggest codebases.
The everyday default: fast, multimodal, a 1M context, and a great price for the quality.
The cheapest way to keep a 1M context — ideal for high-volume, low-complexity steps on Gemini CLI's free-tier quota.
The model picks the moves; the agent runs the loop, the tools, and the guardrails. Once you've chosen a model, see which agent gets the most out of it.