Anthropic
Claude
Anthropic’s LLM Claude can now search the Internet.
Meta
Llama 4
Meta has released version 4 of its open-weight LLM Llama. It comes in three sizes, all built on a Mixture of Experts (MoE) architecture: Behemoth is a 2-trillion-parameter model with 16 experts; Maverick is a 400-billion-parameter model with 128 experts (more experts, but each one smaller); and Scout is a 109-billion-parameter model with 16 experts. Maverick and Scout are immediately available, whereas Behemoth has only been previewed so far. All three models are multimodal, meaning they accept both text and images as input.
Meta claims that all Llama 4 models are very competitive with the top models available from other companies and in some benchmarks they perform better.
However, several observers have claimed that Llama 4's benchmark results were rigged and that its real performance is lower than Meta claims (see another Substack post for a detailed description of this “scandal”).
Midjourney
Midjourney v. 7 Alpha
Midjourney has released an alpha of version 7. It handles text prompts much better and produces better image quality and textures. The model can also be personalised: you rank images so that it learns your preferences.
It is also possible to use it in draft mode, which is less costly and is a good way to explore possibilities before finalising a result.
OpenAI
GPT 4.1
OpenAI has released a new version of its flagship model in three sizes (4.1, 4.1 Mini and 4.1 Nano). The new models perform better than 4.5 on many tasks, including coding, and OpenAI is therefore retiring 4.5. They support a context window of up to 1 million tokens.
o3 and o4-mini
OpenAI has released two new models in its o-series. o3 specialises in coding, math, science and visual perception, among other tasks. o4-mini is, as its name suggests, a small and efficient model specialising in math, coding and visual tasks. However, reported hallucination rates have risen to 30% or more, so the release is a mixed success.
Kuaishou
Kling 2.0 and Kolors 2.0
Kuaishou has released new versions of their text-to-video and text-to-image models.
Kling 2.0 has better prompt adherence, greatly enhanced dynamics and improved aesthetics. Kolors 2.0 has better prompt adherence, support for 60+ styles and more cinematic visuals for realistic imagery.
It is now also possible to edit a video or image with text prompts alone (e.g. replacing one character with another).
Perplexity
Perplexity AI
Perplexity has introduced a voice assistant mode in its iOS application, which seemingly does what Siri should have done for the iPhone.
DomoAI Pte.
DomoAI
DomoAI Pte. is a company based in Singapore, and DomoAI is its tool for generating images and videos. What is striking about the model, whose output is comparable to that of strong examples like Sora or Kling, is that it needs only minimal input and simple prompts.