Aug 28, 2025
xAI launches Grok Code Fast 1 for agentic coding tasks
xAI has introduced Grok Code Fast 1, a specialized model built for speed and cost efficiency in coding workflows. The model brings tool-calling and agentic capabilities to development environments, supports multiple programming languages, and is rolling out in preview through GitHub Copilot and other partners.
xAI
Grok Code Fast
Aug 28, 2025
OpenAI launches GPT-Realtime for production voice agents
OpenAI has made its Realtime API generally available, introducing GPT-Realtime as a speech-to-speech model designed for production-scale voice agents. The release improves response quality, speed, and naturalness, offering developers and enterprises a single-model solution for low-latency conversational AI.
OpenAI
Realtime API
Aug 27, 2025
Anthropic pilots Claude for Chrome with security safeguards
Anthropic has released a browser extension that integrates its Claude AI assistant directly into Google Chrome. The tool is in early testing with a limited group of users on the Claude Max plan. Alongside the launch, the company detailed new research on AI security threats and its measures to counter cybercriminal misuse.
Anthropic
Claude
Aug 27, 2025
Microsoft releases VibeVoice-1.5B open-source long-form TTS model
Microsoft has released VibeVoice-1.5B, an open-source text-to-speech model capable of generating up to 90 minutes of continuous audio. The model can synthesize expressive conversations with up to four distinct speakers and is available under the MIT license with safeguards for responsible use.
Microsoft
VibeVoice
OpenSource
Aug 26, 2025
Google debuts Gemini Nano-Banana image editing with lower cost than OpenAI
Google has upgraded its Gemini image editing model, now named Gemini 2.5 Flash Image. The model improves likeness preservation and consistency across edits while undercutting OpenAI’s average image generation cost. Pricing positions Gemini as a lower-cost option for developers and enterprises focused on high-volume content workflows.
Gemini
Aug 25, 2025
xAI open-sources Grok 2 model on Hugging Face
Elon Musk’s AI company xAI has released Grok 2 as open source on Hugging Face. The model, with an estimated 270 billion parameters, comes under a custom license that permits commercial use but restricts using it to train other models. Musk confirmed plans to release Grok 3 in the coming months and reiterated xAI’s target of scaling compute to 50 million H100-equivalent GPUs within five years.
xAI
Grok
OpenSource
Aug 22, 2025
ByteDance releases Seed-OSS-36B with 512K token context
ByteDance has released Seed-OSS-36B, an open-source large language model with a native context length of 512,000 tokens. The model achieves leading open-source results in math, reasoning, and coding tasks while also supporting efficient deployment through quantization.
ByteDance
Seed
OpenSource
Aug 19, 2025
DeepSeek releases V3.1 with 685B parameters and 128k context window
DeepSeek has launched its latest open-source AI model, DeepSeek-V3.1-Base, which comes with 685 billion parameters and a 128,000-token context length. The model posts benchmark results close to leading proprietary systems and is freely available for download, marking a significant move in the open-source AI landscape.
DeepSeek
DeepSeek V3
OpenSource
Aug 18, 2025
Alibaba releases Qwen-Image-Edit, an open-source foundation model for image editing
Alibaba’s Qwen team has released Qwen-Image-Edit, a 20-billion-parameter foundation model for text-driven image editing. The model supports both semantic and appearance-level modifications, including precise bilingual text editing, and is licensed under Apache-2.0 for commercial use.
Qwen
Wan 2.2
OpenSource
Aug 14, 2025
Google adds automatic memory and temporary chat controls to Gemini
Google has begun rolling out automatic memory in Gemini AI, allowing the assistant to remember details from past interactions by default. The update also introduces a “Temporary Chat” mode that does not store or use conversations for training and expires after 72 hours. The changes aim to balance personalization with stronger privacy controls.
Gemini
Aug 14, 2025
Google tests Magic View in NotebookLM as potential new data visualization feature
Google is trialing a new experimental feature in NotebookLM called Magic View. The tool displays an animated, dynamic canvas when activated, resembling a generative simulation. While the exact function is unconfirmed, early indications suggest it could provide new ways to visualize and interact with notebook content. The development follows recent updates to NotebookLM that expand multimedia support, including video overviews and mind maps.
NotebookLM
Aug 14, 2025
Google releases Gemma 3 270M, an ultra-efficient open-source AI model for smartphones
Google DeepMind has released Gemma 3 270M, a compact 270-million-parameter model designed for instruction following and text structuring. The open-source model is optimized for low-power hardware, including smartphones, browsers, and single-board computers. Its efficiency enables AI capabilities in privacy-sensitive and resource-constrained environments.
Gemma 3
OpenSource
Aug 12, 2025
Claude adds memory search for past conversations
Anthropic has introduced a new memory feature in Claude that lets users search and reference past conversations in new chats. The feature launches today for Max, Team, and Enterprise plans, with availability for other plans expected soon. It aims to streamline workflows by eliminating repeated context-sharing and enabling project continuity.
Anthropic
Claude
Aug 12, 2025
Anthropic expands Claude Sonnet 4 to 1M-token context
Anthropic has increased the context window for Claude Sonnet 4 to 1 million tokens, allowing the model to process entire codebases or dozens of research papers in a single prompt. The upgrade is available in public beta through Anthropic’s API and Amazon Bedrock, with Google Cloud Vertex AI integration planned. The change enables more coherent large-scale reasoning and workflow automation for developers and enterprises.
Anthropic
Claude
Aug 12, 2025
Google makes Jules, its AI coding agent, generally available
Google has transitioned Jules from Google Labs beta to full public release. Powered by Gemini 2.5 Pro, Jules runs asynchronously in a cloud VM to read, test, improve, and visualize code with minimal developer oversight. The launch adds a free tier alongside paid “Pro” and “Ultra” plans and introduces a critic capability that flags issues before changes are submitted.
Jules
Aug 8, 2025
OpenAI Publishes GPT-5 Prompting Guide and Releases Prompt Optimizer Tool
OpenAI has added a GPT-5 prompting guide to its public Cookbook and launched a Prompt Optimizer in the Playground. The resources are designed to improve the quality, efficiency, and consistency of prompts for the new model, with specific guidance for agentic tasks, coding workflows, and instruction adherence.
OpenAI
GPT-5
Aug 8, 2025
OpenAI launches GPT-5 with unified fast and reasoning modes for API and ChatGPT
OpenAI has released GPT-5, a flagship language model combining high-speed responses with advanced reasoning in a single system. The model is available immediately in the OpenAI API and to ChatGPT Team users, with Enterprise and Education accounts gaining access next week. GPT-5 introduces a 400,000-token context window, enhanced coding performance, expanded developer controls, and improved reliability over previous models.
OpenAI
GPT-5
Aug 5, 2025
GPT‑OSS: OpenAI Publishes 20B and 120B Open‑Weight Models for Local Deployment
OpenAI has released gpt‑oss‑120b and gpt‑oss‑20b, its first open‑weight models since GPT‑2. The models match or exceed the performance of proprietary counterparts and mark a rare moment of open‑source leadership from a U.S.-based AI lab. With support for tool use, chain‑of‑thought reasoning, and smooth MacBook deployment, gpt‑oss is designed for full local control.
OpenAI
GPT-OSS
OpenSource
Aug 5, 2025
Claude Opus 4.1 Raises Coding and Reasoning Performance
Anthropic has released Claude Opus 4.1, a mid-cycle upgrade focused on coding accuracy, reasoning stability, and better multi-file task tracking. It delivers improved benchmark scores and enhanced real-world usability, particularly for enterprise and developer workflows. The update is live across Claude Pro, API, Bedrock, and Vertex AI with no pricing change.
Anthropic
Claude
Aug 5, 2025
DeepMind Debuts Genie 3 for Real-Time Text-to-3D World Generation
A new generation of world models is here. DeepMind’s Genie 3 turns text prompts into playable 3D environments at 24 fps with persistent memory and interactive elements. While still in research preview, it represents a major step toward AI agents that can learn and act in open-ended virtual worlds.
Genie
Aug 1, 2025
Alibaba Shrinks Its Coding AI to Run Locally
Qwen3-Coder-Flash is a compact, 30B MoE model capable of local inference on modern MacBooks. It joins Alibaba’s broader Qwen3 ecosystem as a nimble counterpart to the heavyweight 480B hosted version, giving developers a pragmatic hybrid setup for coding workflows.
Alibaba
Qwen
OpenSource
Jul 31, 2025
NVIDIA Releases Llama Nemotron Super v1.5 to Push Open-Source Agent Reasoning
The new 49B model tops open benchmarks with a 128K context window, tool-use capabilities, and single‑GPU efficiency. It's a signal that NVIDIA aims to lead in agent‑focused LLMs that actually run in production.
Nvidia
Nemotron
Jul 31, 2025
Krea Releases FLUX.1 Krea Model with Open Weights
The team behind the FLUX image ecosystem just open-sourced FLUX.1 Krea, a distilled model fine-tuned for photorealistic and aesthetic image generation. Developed with Krea, the model is now freely available under a non-commercial license and slots directly into the broader FLUX.1-dev workflow.
Krea
Flux.1
Jul 31, 2025
Deep Cogito Releases Cogito v2 Models, Challenging Frontier AI on a Budget
Four open-source language models with advanced reasoning and sub-$3.5M training cost aim to rival top-tier AI. Deep Cogito’s approach blends inference-time search with distilled intuition, offering a compelling alternative to closed frontier models.
Deep Cogito
Cogito v2













