January 2024 - AI Tidbits Monthly Roundup

Generative models that turn text and images into high-quality videos, a cheaper GPT, a powerful non-transformer language model, and novel open-source multimodal models that understand documents and images

Feb 04, 2024

Welcome to the monthly curated round-up, where we sift through the firehose of AI research papers and tools so you won’t have to. If you’re pressed for time and can only catch one AI Tidbits edition, this is the one to read: it features the absolute must-knows.

Welcome to the first monthly edition of AI Tidbits Monthly, where we unravel the latest and greatest in AI. January has kicked off 2024 with a strong start and breakthroughs across generative AI modalities, from text to video and audio.

Most notably, January was the month of image and video generation: Google unveiled Lumiere, its new text-to-video generative model; TikTok open-sourced a new state-of-the-art depth estimation model; and an image synthesis tool called InstantID took the internet by storm.

We also saw substantial progress in multimodal AI with the release of LLaVA 1.6 and Qwen-VL-Max, two commercially permissive models capable of understanding images and reading documents.

On the open-source front, Microsoft allowed commercial use of its small yet powerful Phi-2 model, Meta open-sourced the 70B version of its coding language model Code Llama, and a non-transformer model rivaled large transformer-based SOTA models while being cheaper and faster.

This roundup includes these and many other exciting updates across generative audio, autonomous agents, useful AI-powered tools, and open-source repositories.

Let's dive in!

Overview

Industry announcements (6 entries)
Large Language Models
- Open-source (10 entries)
- Research (16 entries)
Autonomous Agents (2 entries)
Image and Video (12 entries)
Audio (4 entries)
Multimodal (5 entries)
Open-source packages (5 entries)
AI tools (7 entries)

Industry announcements

Microsoft’s Copilots are now everywhere

Large Language Models (LLMs)

Open-source

Research
