October 2023 - AI Tidbits Monthly Roundup
Multimodal AI soars with Adept’s Fuyu, LLaVA 1.5, and Obsidian, new language models deliver unmatched performance at a fraction of the cost, and a host of new techniques to advance robotics.
Welcome to a subscriber-only edition 🔒 of AI Tidbits, where I curate the firehose of AI research papers and tools so you won’t have to.
This is the monthly curated round-up, so if you're pressed for time and can only catch one AI Tidbits edition, this is the one to read—featuring the absolute must-knows.
October was a busy month for AI, with new research and releases across multimodal models, language models, and robotics.
Open-source multimodal models led the way, including Adept’s Fuyu, LLaVA 1.5, and Obsidian, billed as the world’s smallest multimodal model. Also in open source, Hugging Face released Zephyr, a language model that beats Anthropic’s Claude 2 on AlpacaEval, and Distil-Whisper, a speech-to-text model that runs 6x faster than OpenAI’s Whisper.
Apple joined the generative AI race with a few new papers (Matryoshka, SAM-CLIP), and Google DeepMind introduced new techniques for generating high-quality training data for robotics.
These and many more exciting updates across multimodal AI, video models, and open-source tools are part of this month’s roundup.
Let's dive in!
Large Language Models
Commercial (5 entries)
Research (4 entries)
Open-source (8 entries)
✨ Special feature: Multimodal AI (10 entries)
Autonomous Agents (4 entries)
Image and Video (10 entries)
Robotics (5 entries)
Cool Tools (5 entries)
Open-source (6 entries)
Recent AI Tidbits Deep Dives
Large Language Models (LLMs)
DeepMind presents Step-Back Prompting - a two-step process in which the model first abstracts a question into a higher-level concept or principle and then reasons from it to the answer. Reported gains include a 27% improvement on TimeQA and up to 36% over other prompting methods
Nvidia introduces SteerLM - a technique that lets users customize an LLM’s responses at inference time. Nvidia reports strong benchmark performance and sees applications across gaming, education, and enterprise
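To make the Step-Back Prompting entry above concrete, here is a minimal sketch of the two-step flow. It is not DeepMind's implementation: the function name `step_back_answer`, the prompt wording, and the `llm` callable (any function that takes a prompt string and returns a completion string) are all assumptions made for illustration.

```python
from typing import Callable


def step_back_answer(question: str, llm: Callable[[str], str]) -> str:
    """Answer `question` in two LLM calls: abstract first, then reason.

    `llm` is a hypothetical stand-in for whatever model client you use.
    """
    # Step 1 (abstraction): ask for the general concept or principle
    # behind the question instead of answering it directly.
    abstraction_prompt = (
        "What general concept or principle is needed to answer the "
        f"following question? State it briefly.\n\nQuestion: {question}"
    )
    principle = llm(abstraction_prompt)

    # Step 2 (reasoning): answer the original question, grounded in
    # the principle returned by the first call.
    reasoning_prompt = (
        f"Principle: {principle}\n\n"
        "Using the principle above, answer the question step by step.\n\n"
        f"Question: {question}"
    )
    return llm(reasoning_prompt)
```

Passing the model in as a plain callable keeps the sketch independent of any particular API, so it can be tried with a stub function before wiring in a real client.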