AI Tidbits

December 2023 - AI Tidbits Monthly Roundup

Google’s long-awaited Gemini, Apple's first major strides into generative AI, a state-of-the-art transcription model outperforming Whisper, and an autonomous AI that operates smartphone apps for you

Arthur Mor
Jan 06, 2024

Welcome to the monthly round-up, where I curate the firehose of AI research papers and tools so you won’t have to. If you're pressed for time and can only catch one AI Tidbits edition, this is the one to read: it features the absolute must-knows.


Welcome to the December edition of AI Tidbits Monthly, where we unravel the latest and greatest in AI. December provided an exciting finale to a year filled with innovative breakthroughs and groundbreaking research.

This December, Google debuted its long-awaited large multimodal AI, Gemini, incorporating it into its Bard chatbot and providing API access. Apple also made its first major strides into generative AI with its efficient on-device inference framework, an open-source multimodal model named Ferret, and a new robust Apple silicon framework for enhanced ML efficiency.

On the open-source front, Mistral released a fully open-source Mixture of Experts model that outperforms GPT-3.5 and Llama 2 70B. Deci released a state-of-the-art 7B base model, and Microsoft introduced a coding LLM, WaveCoder, that beats the current SOTA open and closed LLMs on coding tasks.

Also on the open-source front, though for speech understanding and generation, Nvidia released Parakeet, a speech-to-text model that outperforms Whisper v3. Additional noteworthy developments include the unveiling of OpenVoice, a novel voice cloning technology, and Amphion, an extensive toolkit dedicated to generating audio, music, and speech.

These and many more exciting updates across novel prompting frameworks, autonomous agents, multimodal AI, and open-source repositories are part of this month’s roundup.

Let's dive in!


Overview

  • Industry announcements (6 entries)

  • Large Language Models

    • Open-source (9 entries)

    • Prompting techniques (3 entries)

    • Research (8 entries)

  • Autonomous Agents (3 entries)

  • Image and Video (8 entries)

  • Audio (5 entries)

  • Multimodal (5 entries)

  • Open-source packages (4 entries)

Recent Deep Dives

  • AI Tidbits 2023 SOTA Report (Sahar Mor, December 28, 2023)

  • Most popular and upcoming Generative AI tools and APIs (Sahar Mor, December 19, 2023)

  • Harnessing research-backed prompting techniques for enhanced LLM performance (Sahar Mor, December 10, 2023)

  • Revolutionizing document processing with multimodal GPT (Sahar Mor, October 30, 2023)

Industry announcements

  1. Google debuts Gemini - its large multimodal AI coming in three sizes: Nano, optimized for mobile devices and offering offline functionality; Pro, now powering Google's Bard and designed for a wide range of AI services; and Ultra, the most powerful version, targeting data centers and enterprise applications, set to release in Q1 ’24

  2. With €385M in new funding, Mistral AI debuts a new platform featuring its Mixtral MoE model, becoming a direct competitor of OpenAI, Google, and Anthropic

  3. DeepMind releases Imagen 2 - its next-generation text-to-image model that competes with DALL-E 3 and Midjourney

  4. Meta launches a standalone AI-powered image generator

  5. Midjourney breaks out of Discord and launches an alpha web-based image generation platform, offering an enhanced interface

  6. The New York Times files a billion-dollar lawsuit against OpenAI and Microsoft, accusing them of copyright infringement for using its articles in training ChatGPT and Copilot

    Google’s Gemini demo

Large Language Models (LLMs)

Open-source

  1. Mistral AI releases a commercially permissive Mixture of Experts model, featuring eight experts with 7B parameters each, outperforming models like GPT-3.5 and Llama 2 70B 

  2. Microsoft publishes WaveCoder, a family of code LLMs, along with CodeOcean, the instruction dataset used to fine-tune them, outperforming existing models in code-related tasks by generating and leveraging high-quality instruction data

  3. Microsoft open-sources E5-mistral-7b - a groundbreaking text embedding approach, leveraging synthetic data from LLMs and efficient training to achieve top performance on major benchmarks

  4. Alibaba releases Qwen-72B, a 32K context window LLM, and Qwen-1.8B, an efficient AI model requiring only 3GB GPU memory

  5. TII open-sources its family of Falcon LLMs, led by Falcon-180B, and publishes a paper outlining detailed evaluations and training methods used to develop Falcon

  6. DeciAI open-sources DeciLM 7B - the fastest and most cost-effective 7B pretrained model, topping the Open LLM Leaderboard

  7. Korean AI startup Upstage introduces SOLAR 10.7B, leveraging Depth Up-Scaling to surpass larger models in natural language tasks and achieve top rankings in the HF Open LLM Leaderboard without the complexities of MoE scaling

  8. Meta releases Purple Llama - a project encompassing CyberSec Eval and Llama Guard to enhance generative AI safety and trust

  9. Apple open-sources MLX - a robust framework for Apple silicon featuring user-friendly APIs and advanced computational features for enhanced ML efficiency 

    Microsoft’s WaveCoder outperforms previous state-of-the-art LLMs across programming languages

Prompting techniques

  1. OpenAI releases a prompt engineering guide to improve LLM performance
