Google Adds Conversational Image Segmentation to Gemini 2.5
Gemini 2.5 now supports natural-language image segmentation, letting developers query images with plain text prompts. The feature understands complex relationships, conditional logic, and multilingual queries, streamlining visual AI workflows without custom models. It is available through Google AI Studio and the Gemini API, targeting creative, compliance, and insurance use cases.
Google has expanded Gemini 2.5 with a new conversational image segmentation feature, allowing developers to analyze images using natural language prompts. The update enables queries like “find the person holding the umbrella” or “highlight food that is vegetarian,” bypassing the need for specialized computer-vision pipelines. It also supports multilingual inputs, in-image text detection, and high-level reasoning like identifying abstract areas to clean up.
This addition positions Gemini as a more versatile tool for visual workflows. Developers in creative industries can simplify media editing tasks, while safety engineers can quickly validate compliance by querying visual scenes. Insurance companies can use it for more efficient damage assessments. Google recommends using the gemini-2.5-flash model with JSON mask outputs and adjusted compute settings for best performance.
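As a rough illustration of that recommendation, the sketch below builds a natural-language segmentation prompt and parses a JSON mask response. The prompt wording, the response fields (`box_2d`, `mask`, `label`), and the sample response are illustrative assumptions, not Google's exact contract; the commented-out request uses the `google-genai` Python SDK and requires an API key.

```python
import json

# Hypothetical prompt, modeled on Google's published segmentation examples;
# the exact wording and output schema are assumptions for illustration.
SEGMENTATION_PROMPT = (
    "Give segmentation masks for the person holding the umbrella. "
    "Output a JSON list where each entry has a 2D bounding box in 'box_2d', "
    "a base64-encoded mask in 'mask', and a text label in 'label'."
)

def parse_masks(response_text: str):
    """Strip an optional ```json fence, then decode the JSON mask list."""
    text = response_text.strip()
    if text.startswith("```"):
        # Drop the opening fence line and the trailing fence.
        text = text.split("\n", 1)[1].rsplit("```", 1)[0]
    return json.loads(text)

# Actual call (requires GEMINI_API_KEY and an image part), using the
# google-genai SDK as recommended with the gemini-2.5-flash model:
# from google import genai
# client = genai.Client()
# resp = client.models.generate_content(
#     model="gemini-2.5-flash",
#     contents=[image_part, SEGMENTATION_PROMPT],
# )
# masks = parse_masks(resp.text)

# Illustrative response for local testing; values are made up.
sample = (
    "```json\n"
    '[{"box_2d": [120, 80, 640, 420], "label": "person with umbrella"}]\n'
    "```"
)
masks = parse_masks(sample)
print(masks[0]["label"])
```

Parsing defensively around a Markdown code fence is worthwhile here because models often wrap JSON output in one even when asked for raw JSON.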
By integrating this feature into a single API via Google AI Studio and the Gemini API, Google further blurs the line between text and vision applications. The move strengthens its position in the multi-modal AI market, offering a more accessible and flexible alternative to traditional vision models.