AI Updates-May 2025

Generative AI

Jun 01, 2025

Image generated by DALL-E 3 on a prompt by the author

OpenAI

Windsurf

OpenAI has reportedly agreed to acquire Windsurf, an AI coding company (formerly called Codeium). I had used their Codeium product before OpenAI released the version of ChatGPT that can see on-screen Xcode windows and assist with coding.

iO

OpenAI announced a merger with iO, the AI device company founded by former Apple design leader Jony Ive. It is not clear what products this merger targets, but they evidently have in mind something other than the devices we use AI with today, most likely some kind of wearable. Sam Altman described how long it currently takes to look something up with what the duo call “legacy tools”, namely a laptop or phone, and that leads me to think wearables are what they have in mind. Given Jony Ive’s history with the design of the iPhone and MacBook at Apple, this merger might result in some interesting tools for AI. It’s a pity that Apple parted ways with Ive. There are also reports that iO employs many ex-Apple engineers.

Tencent

Hunyuan 3D v. 2.5

Tencent has released a new version of their Hunyuan model which can create 3D assets from 2D images.

Initial reactions are very positive, since the model can generate detailed 3D assets.

Alibaba

Vace

Built on Alibaba’s WAN 2.1 backbone model, the open-source model Vace handles both video generation and video editing in a single tool.

Bytedance

Seed1.5-VL

TikTok owner Bytedance has released a vision-language multimodal large model called Seed1.5-VL. The model is composed of a 532M-parameter vision encoder and a Mixture-of-Experts (MoE) LLM with 20B active parameters. It has been trained on 3 trillion tokens.

Deerflow

Deerflow is a deep research tool produced by ByteDance and released as open source. It is a framework combining LLMs, agents, and search tools. Using MCP (Model Context Protocol, a standard protocol introduced by Anthropic in 2024), it can connect tools to models effectively.
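To give a feel for what "combining tools with models" over MCP means in practice, here is a minimal sketch of the kind of message a client sends when it asks an MCP server to run a tool. MCP messages are JSON-RPC 2.0, and tool invocations use the `tools/call` method. This is not Deerflow's code: the helper `build_tool_call`, the tool name `web_search`, and its `query` argument are hypothetical names chosen only for illustration.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP-style JSON-RPC 2.0 request asking a server to run a tool.

    The function name and its parameters are illustrative assumptions,
    not part of any particular framework.
    """
    return {
        "jsonrpc": "2.0",          # MCP messages follow JSON-RPC 2.0
        "id": request_id,          # lets the client match the response to the request
        "method": "tools/call",    # MCP method for invoking a tool
        "params": {
            "name": tool_name,     # which tool the server should run
            "arguments": arguments # tool-specific inputs
        },
    }


# Hypothetical tool name and argument, for illustration only.
request = build_tool_call(1, "web_search", {"query": "Seed1.5-VL release"})

# Serialise the request as it would travel over the transport (stdio, HTTP, ...).
wire_message = json.dumps(request)
print(wire_message)
```

The point of the standard is that the model-side code only needs to know this one message shape; any search engine, crawler, or database exposed as an MCP server can then be plugged in without custom glue.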

Google

Gemini 2.5 Pro

Google has released the most advanced version of their Gemini model, which excels at coding and complex prompts. It is multimodal and can perform advanced reasoning over a 1-million-token context window.

Veo 3

Google released their new text-to-video model Veo 3, and it was an immediate sensation on the Internet. In addition to producing videos in 4K, Veo 3 can generate audio that is synchronised with the video. It can use different languages and accents in that audio, and it can add ambient sound and sound effects.

Search

Google now lets users switch to “AI Mode” while searching, which combines the search function with the Gemini model.

Imagen 4

The new version of Imagen can produce 2K images. Imagen is a latent diffusion model, and this version delivers more photorealistic images, sharper detail, and improved spelling and typography. I’ve included an image generated from a simple prompt below and the results are quite realistic.

Flow

Flow is an AI filmmaking tool built on top of Google’s models. It uses the underlying generative AI tools Veo, Imagen and Gemini to produce clips and longer sequences.

Anthropic

Claude 4

Anthropic has released the latest version of their Claude LLM. Available in Opus and Sonnet variants, it is a significant upgrade over the previous versions.

Claude Opus 4 is the most powerful model and it is especially strong at coding and complex problem-solving. Claude Sonnet 4 is not as powerful, but provides balanced performance.

Claude Code is an agentic coding tool that can be integrated into coding platforms (and is already available within GitHub).
