December 2023 - AI Tidbits Monthly Roundup

Google’s long-awaited Gemini, Apple's first major strides into generative AI, a state-of-the-art transcription model outperforming Whisper, and an autonomous AI that operates smartphone apps for you

Jan 06, 2024

Welcome to the monthly round-up, where I curate the firehose of AI research papers and tools so you won’t have to. If you're pressed for time and can only catch one AI Tidbits edition, this is the one to read, featuring the absolute must-knows.

This December edition of AI Tidbits Monthly unravels the latest and greatest in AI. December provided an exciting finale to a year filled with breakthroughs and groundbreaking research.

This December, Google debuted its long-awaited large multimodal AI, Gemini, incorporating it into its Bard chatbot and providing API access. Apple also made its first major strides into generative AI with its efficient on-device inference framework, an open-source multimodal model named Ferret, and a new robust Apple silicon framework for enhanced ML efficiency.

On the open-source front, Mistral released a fully open-source Mixture of Experts model that outperforms GPT-3.5 and Llama 2 70B. Deci released a state-of-the-art 7B base model, and Microsoft introduced WaveCoder, a coding LLM fine-tuned on its CodeOcean instruction dataset, that reportedly beats the current SOTA open and closed LLMs on coding tasks.

On the open-source audio front, Nvidia released Parakeet, a speech-to-text model that outperforms Whisper v3. Other noteworthy releases include OpenVoice, a novel voice cloning technology, and Amphion, an extensive toolkit for generating audio, music, and speech.

These and many more exciting updates across novel prompting frameworks, autonomous agents, multimodal AI, and open-source repositories are part of this month’s roundup.

Let's dive in!

Overview

Industry announcements (6 entries)
Large Language Models
- Open-source (9 entries)
- Prompting techniques (3 entries)
- Research (8 entries)
Autonomous Agents (3 entries)
Image and Video (8 entries)
Audio (5 entries)
Multimodal (5 entries)
Open-source packages (4 entries)

Recent Deep Dives

AI Tidbits 2023 SOTA Report

Sahar Mor

December 28, 2023

Most popular and upcoming Generative AI tools and APIs

Sahar Mor

December 19, 2023

Harnessing research-backed prompting techniques for enhanced LLM performance

Sahar Mor

December 10, 2023

Revolutionizing document processing with multimodal GPT

Sahar Mor

October 30, 2023

Industry announcements

Large Language Models (LLMs)

Open-source

Prompting techniques

OpenAI releases a prompt engineering guide to improve LLM performance
