Liquid AI
Liquid Foundation Models
Liquid AI, a new MIT spin-off based in Boston, has announced its foundation models. The release consists of three models with 1.3B, 3B and 40B parameters, the last of which is a Mixture of Experts (MoE) model. Liquid claims that its models outperform most comparable models, including the open-source Llama models.
The models can be tested in the Playground on their site.
Luma AI
Dream Machine 1.6 is the new version of Luma AI's text-to-video generation model. I had previously tried an earlier version of the model and the results were not very impressive.
The examples on Luma’s page are good, but the prompts must have been meticulously engineered. I tried my hand at it and was not impressed by the result, as shown below. (Just check the weird motion of the soldiers at the very end. The video also goes blurry when instructions like “zoom out” are used.)
Meta
Meta Movie Gen
Meta has released a text-to-video generation model named Movie Gen Video, which can generate 1080p videos from text prompts. The tool is based on a 30B parameter transformer model, and generated videos are limited to 16 seconds at 16 frames per second. It is also possible to generate sound effects and/or soundtracks for the generated video using Movie Gen Audio, a 13B parameter model that produces 48kHz audio.
A video can also be based on a given reference image.
Meta has published a research paper to provide details about the model.
Black Forest Labs
Flux 1.1 Pro
Black Forest Labs have announced Flux 1.1, the latest version of their text-to-image generation model. The claim is that the new version generates images six times faster and with better image quality compared to the previous version, Flux 1.
They have also announced a beta version of their BFL API.
I tried the beta model through together.ai and the results were not bad.
Pika
Pika 1.5
Pika released a new version of their text-to-video generation model, Pika 1.5.
Pika lets the user use Pikaffects, such as melting, with impressive results.
I had used the 1.0 version, but the new one seems to produce much more impressive videos.
Apple
MM 1.5
Apple has released MM1.5, a new version of their family of small multi-modal models.
Apple Intelligence - Second Batch
Apple has released beta versions of new Apple Intelligence features as part of iOS 18.2 and macOS 15.2. These include ChatGPT integration (with Siri and Writing Tools everywhere possible), Image Playground (an image generation tool producing animation- and illustration-style images), Genmoji (a tool to create custom emojis) and Visual Intelligence (which analyses what the camera sees to identify and describe objects and scenes).
See this post for an in-depth analysis of Apple Intelligence features.
Nvidia
NVLM
Nvidia has released NVLM, a family of open-weights multimodal LLMs.
OpenAI
ChatGPT
OpenAI has updated ChatGPT so that it starts up with a prompt interface that looks suspiciously similar to the default Search view of Google or the recently popular AI-powered search engine Perplexity. Canvas is also a more prominent feature.
Mistral
Ministral models
Mistral has released two small models, Ministral 8B and Ministral 3B. Mistral published benchmark results implying that these small models perform as well as some of their competitors’ larger models.
Anthropic
New models
Anthropic has released Claude 3.5 Sonnet New and Claude 3.5 Haiku. The interesting claim is that Claude 3.5 Sonnet New outperforms even OpenAI’s o1 model.
Computer Use
The latest Claude release introduces Computer Use, which allows Claude to operate your computer. This is demonstrated in a simple website-building demo, in which Claude builds a website and fixes its own coding problems. Although experimental, this shows enormous potential for future AI coding tasks.
Midjourney
Midjourney now has an Image Editor that can take an existing image and edit it based on prompts. Available to paid members, the editor can also re-texture existing images.
Adobe
Firefly Video Model
Adobe has released its previously announced video generation model to users on its waitlist. One important claim by Adobe is that its models have been trained only on material whose legal rights it has acquired.


