Apple
More Apple Intelligence Features
Apple released iOS 18.4 and macOS 15.4 with additional Apple Intelligence features. Notably, Visual Intelligence is now supported on iPhone 15 Pro models, which do not have the physical Camera Control button found on iPhone 16.
Google
Gemma 3
Google has released a new version of the Gemma LLM. Gemma 3 is multimodal and has advanced maths and reasoning capabilities. It supports 140 languages and can also run in single-CPU and phone environments. That last part is possible thanks to a 1-billion-parameter version of the model.
Gemini 2.5 Pro
Google released a new version of its Gemini LLM. The most significant update is a 1M-token context window, which should be extended to a 2M-token window in the near future.
OpenAI
Work with Apps
OpenAI has rolled out Work with Apps to all Mac users of its macOS desktop application. It is now possible to launch Xcode and open a ChatGPT chat window on top of it; the chat recognises the context and runs prompts against what is on the Xcode screen. I have written an article about my test with the new version.
New Image Generation capability
OpenAI has announced native image generation in GPT-4o (through ChatGPT). Previously, ChatGPT would first generate text from the prompt and then pass that text to DALL-E to generate the image. Now this is done natively within ChatGPT using the 4o model, which has become truly multimodal. Asking it to redraw uploaded images in Studio Ghibli style has almost become a meme, since it seems to produce incredible results. See the result from a photo I uploaded. (Notice anything odd?)

Alibaba
Qwen 2.5 Omni
This is a new release from Alibaba's Qwen team, with video chat and voice chat capabilities.
Manus
Manus AI
Manus AI is a new “general AI agent” released in China. Under the hood, it is powered by Anthropic’s Claude model and a Qwen model. Its performance is similar to OpenAI’s “Deep Research” mode and to comparable features in other products.