January 28, 2026

Introducing Agentic Vision: New Gemini 3 Flash-Powered Image Recognition

Key Points of This Article

Google has announced “Agentic Vision,” a new image recognition technology based on “Gemini 3 Flash.”
“Agentic Vision” is designed to recognize uploaded images accurately.
While conventional AI models process an entire image in a single pass, “Agentic Vision” repeatedly loops through three stages (Thinking, Acting, and Observing), taking actions such as zooming in on fine details of an uploaded image to recognize it precisely.

On Tuesday, January 27, 2026, Google announced “Agentic Vision,” a new image recognition technology based on “Gemini 3 Flash.”

“Agentic Vision” is designed to recognize uploaded images accurately. While conventional AI models process an entire image at once, “Agentic Vision” repeatedly loops through a three-stage process of Thinking, Acting, and Observing. During this loop it takes actions such as zooming in on specific details of the uploaded image, which lets it recognize the image precisely.
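The Thinking, Acting, and Observing loop can be pictured with a small toy sketch. This is not Google's implementation and does not call the Gemini API. It only illustrates the idea of zooming in step by step instead of reading the whole image in one pass. The names `Box`, `crop`, `ink`, `think`, and `look_closely` are hypothetical, and the "image" is assumed to be a plain grid of numbers where 0 means blank.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Image = List[List[int]]  # toy grayscale image: rows of pixel values, 0 = blank


@dataclass
class Box:
    """Hypothetical crop region, in coordinates of the current view."""
    top: int
    left: int
    height: int
    width: int


def crop(view: Image, box: Box) -> Image:
    """Act: zoom in by cropping the view to the chosen box."""
    rows = view[box.top:box.top + box.height]
    return [row[box.left:box.left + box.width] for row in rows]


def ink(view: Image) -> int:
    """Count non-blank pixels, a stand-in for 'how much detail is here'."""
    return sum(1 for row in view for px in row if px != 0)


def think(view: Image, min_size: int = 2) -> Optional[Box]:
    """Think: choose the quadrant with the most detail, or None to stop."""
    h, w = len(view), len(view[0])
    if h <= min_size or w <= min_size:
        return None  # view is already small enough to read directly
    hh, hw = h // 2, w // 2
    quadrants = [
        Box(0, 0, hh, hw), Box(0, hw, hh, w - hw),
        Box(hh, 0, h - hh, hw), Box(hh, hw, h - hh, w - hw),
    ]
    return max(quadrants, key=lambda b: ink(crop(view, b)))


def look_closely(image: Image, max_steps: int = 5) -> Tuple[Image, List[Box]]:
    """Run the Think -> Act -> Observe loop; return the final view and the crops taken."""
    view: Image = image
    trace: List[Box] = []
    for _ in range(max_steps):
        box = think(view)        # Think: plan the next zoom
        if box is None:
            break
        view = crop(view, box)   # Act: perform the zoom
        trace.append(box)        # Observe: the new view feeds the next round
    return view, trace


# Demo: an 8x8 blank image with a single marked pixel at row 6, column 5.
image = [[0] * 8 for _ in range(8)]
image[6][5] = 9
final_view, steps = look_closely(image)
```

In this demo the loop zooms twice and stops on a 2x2 view that still contains the marked pixel. A real system would presumably replace the quadrant heuristic with the model's own reasoning about where to look next.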

By understanding both the image and the prompt precisely, the AI can respond with high accuracy on tasks such as reading text within images, picking out fine details, counting items accurately, and interpreting numbers or graphs.

“Agentic Vision” is being rolled out via the Gemini API on the multimodal generative AI development platform “Google AI Studio” and the machine learning development platform “Vertex AI.” In the “Gemini” app, users can enable “Agentic Vision” by switching the AI model to “Thinking” mode, which gives them access to these advanced image recognition capabilities.

Source: Google
