AI Updates - April 2024

Generative AI

Dr. Levent Mollamustafaoğlu

Apr 29, 2024

Image generated with DALL-E 3 and depicting a software developer working on artificial intelligence

OpenAI

Fine-Tuning

OpenAI announced improvements to their fine-tuning API, giving users more control over their custom-trained models.

GPT-4 Vision

OpenAI has now released a production version of their GPT-4 Vision model, which was previously released as a preview.

New Personal AI Device

OpenAI CEO Sam Altman and ex-Apple designer Jony Ive are seeking $1 billion in funding to build a new personal AI device.

Microsoft

Stargate

According to some rumours, Microsoft and OpenAI are planning to invest $100 billion in AI data centres or a supercomputer, in a project named Stargate. There are claims that the goal is to reach Artificial General Intelligence (AGI).

WizardLM-2

Microsoft has released a family of open-source models under the name WizardLM-2. It comes in 8x22B, 70B and 7B variants. According to the reported evaluations, it seems to perform similarly to or better than GPT-4 models. It was, however, removed by Microsoft shortly after its release because the required toxicity testing had not been completed. The 8x22B variant is based on the Mixture-of-Experts architecture.

Phi-3

Microsoft has released a family of “tiny” open-source Small Language Models named Phi-3. The Mini variant, with 3.8B parameters, is available now on Hugging Face; the Small (7B parameters) and Medium (14B parameters) variants are to be released later. These models will typically be used on-device, with no Internet connection necessary.

Udio

Udio has released a beta version of an AI music generator. It is free to use during the beta period and creates music based on styles and tags. I generated two rock songs with the prompts “A song about a scientist desperately in love with a girl who spurns his advances, rock” and “a song about the heat death of the universe and the feelings it raises in the mind of a young man, symphonic rock”. The novelty of this generator seems to be that it produces songs with full lyrics, and it also allows you to supply the lyrics yourself.

Stability AI

Stable Audio 2.0

Stability AI has released Stable Audio 2.0, which can produce music tracks of up to 3 minutes in 44.1 kHz stereo. Apart from text-to-music generation, Stable Audio also supports audio-to-audio generation: users can upload music and ask for changes through a text prompt. The model was trained on a licensed audio dataset. A free account can generate up to 30 seconds of music when uploading existing audio and up to 3 minutes when generating from a text prompt. Free accounts are given 20 music points, and each generation costs 2 points, which allows for 10 generations. Pro accounts have higher limits.

I generated two pieces of music. The first one was smooth jazz, which I found to be really bland. The second one was an upbeat progressive metal piece, which I also found to be quite unoriginal and not very creative. My impression of Stable Audio 2.0 is that it is not great, given alternatives like Udio.

Freepik

Freepik has released an image generator that works from a prompt and an image style. Although simple in design, it has some nice features. Once the initial set of images is generated, the user can continue generating images with the same prompt by scrolling further. The speed is fine. Alternate versions can be created with “permutation prompts”, namely by providing alternatives for some of the words in the prompt in one go. The free subscription gives 10 downloads and 20 image generations per day.
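To make the idea of permutation prompts concrete, here is a minimal Python sketch of how alternatives inside a prompt could be expanded into every combination. The `{a|b}` syntax and the function name `expand_permutation_prompt` are assumptions chosen for illustration; Freepik's actual syntax and implementation may well differ.

```python
import itertools
import re


def expand_permutation_prompt(prompt: str) -> list[str]:
    """Expand every {a|b|c} group in the prompt into all combinations.

    The brace-and-pipe syntax is hypothetical, used only for this sketch.
    """
    # re.split with a capture group alternates literal text (even indices)
    # and the contents of each {...} group (odd indices).
    parts = re.split(r"\{([^{}]*)\}", prompt)
    choices = [
        part.split("|") if index % 2 else [part]
        for index, part in enumerate(parts)
    ]
    # The Cartesian product picks one alternative from every group.
    return ["".join(combo) for combo in itertools.product(*choices)]


if __name__ == "__main__":
    for variant in expand_permutation_prompt("a {red|blue} car in the {city|desert}"):
        print(variant)
```

With two groups of two alternatives each, the sketch produces four prompts, one for each combination of colour and location.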

Adobe

Firefly 3.0

Adobe has released Firefly 3.0 within beta versions of their applications, such as Photoshop.

Premiere Pro

Adobe has previewed plans to let Premiere Pro use third-party video generation models such as OpenAI’s Sora. They are also working on a video generation model of their own.

Apple

Apple has released a family of Small Language Models named OpenELM. There are 8 models, ranging from 270M parameters to 3B parameters. These models are small enough to be run on a smartphone.
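As a rough illustration of why models of this size can fit on a phone, the following Python sketch estimates the memory needed just to hold the weights. The 16-bit and 4-bit precisions are assumptions chosen for illustration, not figures from Apple, and the estimate ignores activations and runtime overhead.

```python
def weight_memory_gb(num_parameters: float, bits_per_parameter: int) -> float:
    """Approximate size of the model weights in gigabytes (10^9 bytes)."""
    return num_parameters * bits_per_parameter / 8 / 1e9


if __name__ == "__main__":
    # Smallest and largest OpenELM sizes mentioned above.
    models = [("OpenELM 270M", 270e6), ("OpenELM 3B", 3e9)]
    for name, params in models:
        for bits in (16, 4):  # assumed precisions, for illustration only
            size = weight_memory_gb(params, bits)
            print(f"{name} at {bits}-bit: about {size:.2f} GB")
```

Under these assumptions, the 270M model needs roughly 0.54 GB at 16-bit precision, while the 3B model needs about 6 GB at 16-bit and about 1.5 GB at 4-bit.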

Back to Software Development
