October 2023 - AI Tidbits Monthly Roundup

Multimodal AI soars with Adept’s Fuyu, LLaVA 1.5, and Obsidian, new language models deliver unmatched performance at a fraction of the cost, and a host of new techniques advance robotics.

Nov 12, 2023

Welcome to a subscriber-only edition 🔒 of AI Tidbits, where I curate the firehose of AI research papers and tools so you won’t have to.

This is the monthly curated round-up, so if you're pressed for time and can only catch one AI Tidbits edition, this is the one to read: it features the absolute must-knows.

If you find AI Tidbits valuable, share it with a friend and consider showing your support.

Welcome to the October edition of AI Tidbits, where we unravel the latest and greatest in AI. October was filled with innovative breakthroughs and groundbreaking research showcasing the astounding pace of progress in AI.

Leading the charge were open-source multimodal models such as Adept’s Fuyu, LLaVA 1.5, and Obsidian - the world’s smallest multimodal AI. On the open-source front, Hugging Face released Zephyr, a language model beating Anthropic’s Claude 2 on AlpacaEval, and Distil-Whisper - a speech-to-text model that is 6x faster than OpenAI’s Whisper.

Apple joined the generative AI race with a few new papers (Matryoshka, SAM-CLIP), and Google DeepMind was hard at work on new techniques for generating high-quality training data for robotics.

These and many more exciting updates across multimodal AI, video models, and open-source tools are part of this month’s roundup.

Let's dive in!

Overview

Large Language Models
- Commercial (5 entries)
- Research (4 entries)
- Open-source (8 entries)
✨ Special feature: Multimodal AI (10 entries)
Autonomous Agents (4 entries)
Image and Video (10 entries)
Robotics (5 entries)
Cool Tools (5 entries)
Open-source (6 entries)