AI Roundup 05/16 -> 05/23/2024
Microsoft's new AI PCs, Copilot agents to automate work, Anthropic's peek into an LLM's brain, open-source Phi-3 models with a 128k context window and vision, and Mistral's new 7B model with function calling
Welcome to the weekly edition of AI Tidbits, where I curate the firehose of AI research papers and tools every week so you won’t have to.
Overview
✨ Special feature: Microsoft Build 2024
✨ Highlights (3 entries)
Language Models (6 entries)
Multimodal (4 entries)
Vision (6 entries)
Open-source Packages (3 entries)
Recent Deep Dives
✨ Special feature: Microsoft Build 2024
Introduction of Copilot+ PCs - Microsoft revealed a new category of AI-optimized Windows PCs called Copilot+ PCs. These devices feature powerful new silicon, all-day battery life, and access to advanced AI models. Capable of tasks like real-time image creation with Cocreator and language translation with Live Captions, these devices will be available from June 18, starting at $999.
Copilot AI Agents - Microsoft unveiled the expansion of Copilot AI agents to handle menial tasks autonomously. These agents can monitor emails, automate tasks, assist with onboarding, and perform data entry. The new capabilities will be available in Copilot Studio later this year.
Phi-3 Vision and multimodal AI - Microsoft introduced Phi-3-vision, a compact multimodal AI model capable of reading text and analyzing images on mobile devices. This model is part of the Phi-3 family, designed for efficient performance and available in preview now.
Phi Silica local language model - Microsoft announced Phi Silica, a 3.3B-parameter transformer-based model optimized for Copilot+ PCs. It is the first small language model deployed locally on Windows, providing fast performance.
Real-time video translation in Edge - The Edge browser will soon feature AI-powered real-time video translation. This tool will let users translate videos from platforms like YouTube and LinkedIn into multiple languages, enhancing accessibility and global communication.
✨ Highlights
Microsoft open-sources new Phi-3 models: a 7B, a 14B, and a new multimodal variant with vision capabilities featuring a 128k context window (Microsoft Blog)
Mistral releases 7B v0.3 - a new version of its small and powerful open language model, with function calling support (Hugging Face)
Language Models
Multimodal
Vision
Open-source Packages
Plus >70 more open-source packages for AI engineers
I don't have a Computer Science background, as I often repeat. But I always read AI Roundup and it strikes me how it manages to fascinate and surprise me every time. Thanks for the links shared, the ones that struck me the most are certainly the ones from this week in the 'Vision' section.
Great, there have been a lot of AI product announcements in the past few weeks, which is very exciting!