OpenAI
GPT Mentions
OpenAI released a new feature that lets the user type an @ within a chat line to talk to one or more applications (GPTs) and combine their functionality. Although available only to a limited set of users at the moment, this could be a game changer.
SORA
OpenAI has announced a text-to-video model called SORA. It can produce much longer videos than comparable models. It is currently undergoing red-team testing and is not yet open to the public.
Memory
OpenAI has introduced memory in ChatGPT. Retaining information across chats is now the default, though users can opt out of using memory.
RWKV
This is an open-source initiative under the Linux Foundation. Their new model, Eagle, seems to match or outperform most multilingual LLMs. The distinctive feature of the RWKV models is that they are a hybrid between Recurrent Neural Networks and Transformer models, which yields longer context windows and faster performance. RWKV stands for Receptance Weighted Key Value: it combines the efficiently parallelizable training of transformers with the efficient inference of RNNs. You can find the theoretical paper here, and the group has a blog here.
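To illustrate why this hybrid gives RNN-like inference costs, here is a toy, single-channel sketch of a receptance-gated weighted key-value recurrence. This is an illustrative simplification, not the actual RWKV formulation (the real model uses learned per-channel time-decay, token-shift, and a parallel training form); the function name and decay constant are invented for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def wkv_recurrence(keys, values, receptances, decay=0.9):
    """Toy WKV step for one channel: the state is a running, exponentially
    decayed weighted average of past values, so inference needs only O(1)
    state per token, like an RNN, instead of attending over all past tokens."""
    num = 0.0    # decayed running sum of exp(k) * v
    den = 1e-9   # decayed running sum of exp(k) (tiny epsilon avoids 0/0)
    outputs = []
    for k, v, r in zip(keys, values, receptances):
        num = decay * num + np.exp(k) * v
        den = decay * den + np.exp(k)
        outputs.append(sigmoid(r) * num / den)  # receptance gates the read-out
    return outputs

outs = wkv_recurrence(keys=[0.1, 0.2], values=[1.0, 2.0], receptances=[0.0, 0.5])
```

Because the state is a fixed-size pair of running sums, generating each new token costs the same regardless of how long the context already is, which is the practical payoff the paragraph above describes.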
Google
Bard → Gemini
Google has now re-branded its AI front-end Bard as Gemini. The Gemini models were released recently, but the most powerful Ultra model has just been released in a version called Gemini Advanced, which Google is offering through a paid subscription.
ImageFX
Google has released an image tool called ImageFX, which uses the Imagen 2.0 image model underneath. It is available through the Google AI Test Kitchen, but only to users in the U.S., Kenya, New Zealand, and Australia.
Gemini 1.5 Pro
Google launched Gemini 1.5 Pro, offering quality comparable to Gemini 1.0 Ultra while using less compute. It has a 1 million-token context window and improved understanding across all modalities, including video. It appears to be a Mixture-of-Experts model and is currently available in limited preview.
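The Mixture-of-Experts idea mentioned above is what lets quality scale without a proportional compute increase: a router activates only a few "expert" sub-networks per input. The sketch below is a minimal toy version under that general description, not Gemini's actual architecture; all names and sizes here are invented for illustration.

```python
import numpy as np

def moe_forward(x, expert_weights, router_weights, top_k=2):
    """Toy Mixture-of-Experts layer: a router scores every expert, but only
    the top-k experts actually run, so per-token compute stays roughly
    constant while total parameter count (capacity) grows with the number
    of experts."""
    logits = router_weights @ x                      # one routing score per expert
    top = np.argsort(logits)[-top_k:]                # indices of the k best experts
    gates = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over winners
    # weighted sum of only the selected experts' outputs
    return sum(g * (expert_weights[i] @ x) for g, i in zip(gates, top))

rng = np.random.default_rng(0)
x = rng.normal(size=4)                     # a single token embedding
experts = rng.normal(size=(8, 4, 4))       # 8 experts, each a 4x4 linear map
router = rng.normal(size=(8, 4))           # router producing 8 scores
y = moe_forward(x, experts, router)
```

With 8 experts but `top_k=2`, only a quarter of the expert parameters touch any given token, which is the sense in which such models deliver large-model quality at reduced compute.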
Gemma
Google has released two open-source models sized to run on laptops (2B parameters) and desktops (7B parameters). Fine-tuned variants, trained on human textual interactions, are also available.
Apple
MGIE
Apple has collaborated with UC Santa Barbara to develop a model named MLLM-Guided Image Editing (MGIE), where MLLM stands for Multimodal Large Language Model. The theoretical paper can be found here. Initial results indicate that starting with a multimodal LLM makes the instructions passed to the diffusion model that generates the image more expressive, so edits can be described precisely and images can be edited accurately from simple instructions.
Meta
Code Llama 70B
Meta has released an LLM version specialized for coding. The model comes in a base version, a Python-specific version, and an instruction-following version for code.
Mistral
Mistral has released the next version of their open LLM, called Mistral Next. It can be accessed from here, and first impressions are that it is good at math problems.
Stability AI
Stable Diffusion 3
Stability AI has released an early preview of its Stable Diffusion 3 model for image generation. It is a suite of models with parameter counts ranging from 800M to 8B. It uses a Diffusion Transformer architecture (not all diffusion models use transformers).
Stability Video
Stability AI has released a website for the use of its previously released Stable Video Diffusion model.
Midjourney
Midjourney v6 has added a Style option for applying different styles in image generation.
Leonardo AI
Leonardo has released a new version of their photorealistic image generation model named Lightning XL. They have also released an anime-specific model.