AI Roundup 06/06 -> 06/13/2024
Apple's long-awaited AI announcements, new text-to-video models to rival OpenAI's Sora, Meta's open agent solving complex tasks, and new benchmarks to reliably measure generative models' performance
Welcome to the weekly edition of AI Tidbits, where I curate the firehose of AI research papers and tools so you won’t have to.
Overview
✨ Special feature: Apple WWDC
✨ Highlights (5 entries)
Language Models (13 entries)
Autonomous Agents (3 entries)
Vision (6 entries)
Audio (2 entries)
Recent Deep Dives
✨ Special feature: Apple Worldwide Developers Conference (WWDC)
Apple Intelligence - a suite of new AI features for iPhone, Mac, and other devices. This includes a more conversational Siri, custom AI-generated "Genmoji," and integration with OpenAI's GPT-4o. Apple's on-device AI models are efficient for their size but perform below state-of-the-art systems like GPT-4. With an adapter strategy, however, they excel at specific tasks, outperforming larger models in summarizing and composing text.
Enhanced Siri capabilities - Siri will gain new abilities such as managing notifications, writing and summarizing text, and carrying out actions across multiple apps. Users can interact with Siri through voice or typing, with many features functioning on-device for enhanced privacy.
Genmoji and Image Playground - Apple is launching Genmoji to create emoji-like reactions on demand and Image Playground for AI-generated images. These features will be integrated into various apps, including Photos, which will have improved search and editing capabilities similar to Google's Magic Eraser.
OpenAI integration - Siri will leverage ChatGPT, powered by GPT-4o, for complex requests, and will ask for user permission before sending any data. ChatGPT will be available across iOS, macOS, and iPadOS, supporting AI writing and image generation tools.
-> Apple WWDC 2024 keynote in 18 minutes
✨ Highlights
Luma AI debuts Dream Machine - an OpenAI Sora-like tool enabling users to create realistic videos from text prompts in just two minutes (Company blog)
Stability AI releases the weights for Stable Diffusion 3 Medium - its latest and most advanced text-to-image model (Stability AI)
Language Models
Autonomous Agents
Vision
Audio
Last week’s AI Tidbits roundup