AI Roundup 09/26 -> 10/03/2024
OpenAI debuts Realtime API for low-latency voice interactions, Whisper v3 Turbo boosts multilingual speech-to-text, Pika Labs introduces surreal video effects, and Salesforce unveils its SOTA LLM-as-a-Judge
Welcome to the weekly edition of AI Tidbits, where I curate the firehose of AI research papers and tools every week so you won’t have to.
📩 Published a new breakthrough paper? Just released an open-source package? Submit it here to ensure we don’t miss it and that it gets featured in next week’s post.
Overview
✨ Highlights (5 entries)
Language Models (8 entries)
Multimodal (6 entries)
Vision (5 entries)
AI Tools (3 entries)
Open-source Packages (2 entries)
✨ Highlights
Microsoft upgrades Copilot with new AI features like Copilot Voice for natural voice control and Copilot Vision for real-time screen understanding (Microsoft Blog)
Language Models
Multimodal
Vision
AI Tools
PDF to Audio - convert PDFs into audio podcasts, lectures, summaries, and more
Voice-Restore - fix background noise, reverberations, distortions, and signal loss
Open-source Packages
Plus >70 more open-source packages for AI engineers