August 2024 - AI Tidbits Monthly Roundup

Musk's xAI unveils Grok-2, OpenAI's new GPT-4o model, Microsoft's open-source Phi 3.5 series, Nvidia's Eagle multimodal LLMs, Black Forest Labs' FLUX.1 to flood the internet with uncensored images

Sep 08, 2024

Welcome to the monthly round-up, where we curate the firehose of AI research papers and tools so you won't have to. If you're pressed for time and can only catch one AI Tidbits edition, this is the one to read: it features the absolute must-knows.

August was a month of remarkable progress across AI domains, from industry giants' releases to open-source breakthroughs.

Elon Musk's xAI made waves with the unveiling of Grok-2 and Grok-2 mini, showcasing advanced capabilities that rival top models. OpenAI continued to refine its offerings with a more efficient GPT-4o and the introduction of Structured Outputs. The open-source community saw significant advancements, with Microsoft's Phi 3.5 series and AI21's Jamba models pushing the boundaries of what's possible with freely available models.

In multimodal AI, Nvidia's Eagle and Alibaba's Qwen2-VL demonstrated impressive performance in visual understanding tasks. Image and video generation saw major leaps with Black Forest Labs' FLUX.1 and Tsinghua University's CogVideoX-5B. Audio AI also made strides, with Qwen2-Audio enabling multilingual voice interaction and Hugging Face's Parler TTS v1 offering enhanced text-to-speech capabilities.

Perhaps most intriguingly, Sakana AI introduced The AI Scientist, a system that could revolutionize scientific research by automating idea generation, execution, and documentation.

These developments, along with many more exciting updates across language models, multimodal AI, and specialized applications, are part of this month's comprehensive roundup.

Let's dive in!

Overview

Industry announcements (8 entries)
Large Language Models
- Open-source (8 entries)
- Research (8 entries)
Multimodal (11 entries)
Image and Video (10 entries)
Audio (6 entries)
Robotics (2 entries)
Open-source Packages (8 entries)

Recent Deep Dives

Top 8 leaderboards to choose the right AI model for your task

Sahar Mor

February 17, 2024

12 techniques to reduce your LLM API bill and launch blazingly fast products

Sahar Mor

January 13, 2024

Harnessing research-backed prompting techniques for enhanced LLM performance

Sahar Mor

December 10, 2023

[cross-post] 7 methods to secure LLM apps from prompt injections and jailbreaks

Sahar Mor

February 9, 2024

Become a premium member to get full access to my content and $1k in free credits for leading AI tools and APIs like Claude, Replicate, and Hugging Face. Many readers expense the paid membership through their company's learning and development stipend.

Industry announcements

Large Language Models

Open-source
