Meta
Emu Video
Meta has released their text-to-video model Emu Video, following in the footsteps of their text-to-image model Emu released earlier in 2023. Emu Video first generates an image conditioned on a text prompt and then generates a video conditioned on both the text and the generated image. This “factorised”, or split, approach lets Meta train video generation models efficiently. The system uses two diffusion models and was trained on 35 million text-video pairs.
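Meta has not released Emu Video as code, so the snippet below is only a toy sketch of the factorised data flow described above; the two stage functions are placeholder stubs I wrote for illustration, not Meta's models.

```python
# Illustrative sketch of a factorised text-to-video pipeline:
# stage 1 makes a keyframe from text, stage 2 makes a video from text + keyframe.
# Both "models" here are placeholder stubs, not Emu Video itself.
import numpy as np

def text_to_image(prompt: str) -> np.ndarray:
    """Stage 1 stand-in for a text-conditioned image diffusion model."""
    rng = np.random.default_rng(abs(hash(prompt)) % 2**32)
    return rng.random((512, 512, 3))  # placeholder "generated" image

def image_and_text_to_video(image: np.ndarray, prompt: str, frames: int = 16) -> np.ndarray:
    """Stage 2 stand-in for a diffusion model conditioned on the prompt and the image."""
    return np.stack([image] * frames)  # placeholder: repeat the keyframe

prompt = "a dog surfing a wave at sunset"
keyframe = text_to_image(prompt)                   # first diffusion model
video = image_and_text_to_video(keyframe, prompt)  # second diffusion model
print(video.shape)                                 # (frames, height, width, 3)
```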
Emu Edit
Meta also announced Emu Edit, which takes an image and edits it based on text instructions. Emu Edit uses few-shot learning to adapt to editing tasks it has not seen before.
Pika Labs
Pika Labs released their text-to-video model Pika. At this point there is not much public detail about the model, and Pika Labs is running a waitlist for anyone who wants to try it.
Google
Gemini
Google and Google DeepMind announced their new model Gemini, which comes in Ultra, Pro and Nano sizes. Google claims that Gemini Ultra slightly outperforms GPT-4 on most benchmarks. Gemini is natively multimodal, so a single prompt can mix text, images, audio and video.
Google simultaneously announced AlphaCode 2, a new version of their code generation model built on Gemini.
Google has deployed Gemini Pro in their Bard chatbot, and the Gemini Nano model will run on-device on their Pixel phones.
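For anyone who wants to experiment with Gemini Pro directly, here is a rough sketch using Google's generative AI Python SDK. The model names and exact calls are my assumptions based on the SDK shipped alongside Gemini, so double-check them against the official documentation.

```python
# Sketch of text and multimodal prompting via the google-generativeai SDK
# (pip install google-generativeai); model names are assumptions, verify in the docs.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")

# Text-only prompt against Gemini Pro.
model = genai.GenerativeModel("gemini-pro")
response = model.generate_content("Summarise the differences between Gemini Ultra, Pro and Nano.")
print(response.text)

# Multimodal prompt: an image plus a text question.
vision = genai.GenerativeModel("gemini-pro-vision")
reply = vision.generate_content([Image.open("chart.png"), "What does this chart show?"])
print(reply.text)
```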
Apple
MLX
Apple has released MLX, a machine learning framework with Python and C++ APIs, as an open-source library on GitHub. Although Apple does not seem to have built their own AI models, MLX is built for Apple hardware and can run popular models like LLaMA, Mistral or Stable Diffusion.
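Here is a minimal sketch of what MLX's Python API looks like, assuming the mlx package is installed (pip install mlx) on an Apple silicon machine. The point is that arrays live in unified memory and computation is lazy until you ask for it.

```python
# Minimal MLX sketch: arrays in unified memory, lazy evaluation.
import mlx.core as mx

a = mx.random.normal((1024, 1024))      # array allocated in unified memory
b = a @ mx.transpose(a) + 1.0           # builds a lazy compute graph, nothing runs yet
mx.eval(b)                              # force evaluation of the graph
print(b.shape, b.dtype)
```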
Mistral
Mixtral
The open-source LLM developer has released Mixtral, a sparse mixture-of-experts (SMoE) model with eight experts per layer and roughly 47 billion total parameters, of which about 13B are active for any given token at inference. Sparse mixture-of-experts models decouple model size from inference cost by activating only a small subset of the parameters for each input token (see this paper for more details on SMoE). In an MoE layer, a small router network scores the expert feed-forward blocks and sends each token to only the top few of them, so a token touches far fewer parameters than it would in a dense network of the same total size.
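To make the routing idea concrete, here is a toy NumPy sketch of top-2 expert routing. It is purely illustrative and not Mixtral's actual implementation.

```python
# Toy sparse MoE feed-forward layer: route each token to its top-2 experts.
import numpy as np

def top2_moe_layer(x, gate_w, experts):
    logits = x @ gate_w                       # router scores, shape (tokens, n_experts)
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        top2 = np.argsort(logits[t])[-2:]     # indices of the 2 highest-scoring experts
        weights = np.exp(logits[t][top2])
        weights /= weights.sum()              # softmax over just the selected experts
        for w, e in zip(weights, top2):
            out[t] += w * experts[e](x[t])    # weighted sum of the chosen experts' outputs
    return out

# Tiny demo: 4 tokens, model width 8, 4 experts implemented as random linear maps.
rng = np.random.default_rng(0)
d, n_experts = 8, 4
experts = [lambda v, W=rng.standard_normal((d, d)): v @ W for _ in range(n_experts)]
x = rng.standard_normal((4, d))
gate_w = rng.standard_normal((d, n_experts))
print(top2_moe_layer(x, gate_w, experts).shape)   # -> (4, 8)
```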
Benchmark results show Mixtral performing on a par with GPT-3.5 and Llama 2 70B.
OpenHermes 2.5
I’ve also come across a Mistral-based model named OpenHermes 2.5, released in October 2023. It is a fine-tune of Mistral 7B, trained on 1,000,000 entries of primarily GPT-4 generated data as well as other high-quality open datasets, plus an additional 100K examples for code generation. It performs well on many benchmarks, notably code generation.
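If you want to try it, something like the following should work with the transformers library. The Hugging Face model ID (teknium/OpenHermes-2.5-Mistral-7B) and its chat template support are assumptions on my part, so check the model card on the hub.

```python
# Sketch of running OpenHermes 2.5 locally with transformers;
# the model ID is an assumption, verify it on the Hugging Face hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "teknium/OpenHermes-2.5-Mistral-7B"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [{"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."}]
inputs = tok.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
output = model.generate(inputs, max_new_tokens=128)
print(tok.decode(output[0], skip_special_tokens=True))
```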
Distillery
Distillery is a new open-source image generation service that has been compared to Midjourney. It was built by a company called FollowFox, is based on Stable Diffusion 1.5, and its latest model, called Cosmopolitan, was trained on Midjourney data. The alpha version can be tried through a Discord invite and gives you 10 image generations per day. The results are decent, though I have not tried detailed prompts yet.
Writesonic
Although this was released in December 2022, I only just discovered it. Writesonic is an AI company based in India. It has released Chatsonic, an AI chatbot built on top of ChatGPT that is also connected to Google search, so it can pull in up-to-date information while generating text and images. The free tier gives the user 10,000 “premium words” and uses GPT-3.5, while the “Superior” plan, aimed at small teams, uses GPT-4.