AI Tidbits

July 2024 - AI Tidbits Monthly Roundup

New language models from OpenAI, Mistral, and Google, Meta’s open release of its powerful Llama 3.1 models, Segment Anything v2 leapfrogs computer vision, and advanced RAG techniques

Arthur Mor
Aug 04, 2024

Welcome to the monthly round-up, where we curate the firehose of AI research papers and tools so you won’t have to. If you're pressed for time and can only catch one AI Tidbits edition, this is the one to read: it features the absolute must-knows.


July was filled with remarkable progress across various AI domains, from LLMs to text-to-video models. But most of all, July was the month of open-source AI, with Meta's Llama 3.1 suite and Google's Gemma-2-2B pushing the boundaries of what's possible with freely available models and rivaling top proprietary models such as GPT-4o and Claude Sonnet.

OpenAI made waves with the release of GPT-4o mini, a cost-effective multimodal model, and the unveiling of SearchGPT, a potential Google Search competitor. Meanwhile, Mistral AI flexed its muscles with Mistral Large 2 and partnered with the Mamba creators to release the impressive Codestral-Mamba 7B.

In the realm of research, DeepMind's AlphaProof and AlphaGeometry 2 showcased AI's growing prowess in advanced mathematics. The emergence of autonomous agents continued with OpenDevin, while computer vision and video generation saw leaps forward with Meta's Segment Anything Model 2 (SAM 2) and Stability AI's Stable Video 4D.

These developments, along with many more exciting updates across language models, multimodal AI, and specialized applications, are part of this month's comprehensive roundup.

Let's dive in!


Overview

  • Industry announcements (6 entries)

  • Large Language Models

    • Open-source (14 entries)

    • Research (8 entries)

  • Autonomous Agents (4 entries)

  • Multimodal (3 entries)

  • Image and Video (10 entries)

  • Audio (2 entries)

  • AI Tools (5 entries)

  • Open-source Packages (7 entries)

Recent Deep Dives

  • Top 8 leaderboards to choose the right AI model for your task (Sahar Mor, February 17, 2024)

  • 12 techniques to reduce your LLM API bill and launch blazingly fast products (Sahar Mor, January 13, 2024)

  • Harnessing research-backed prompting techniques for enhanced LLM performance (Sahar Mor, December 10, 2023)

  • [cross-post] 7 methods to secure LLM apps from prompt injections and jailbreaks (Sahar Mor, February 9, 2024)

  • Most popular and upcoming Generative AI tools and APIs (Sahar Mor, December 19, 2023)

Become a premium member to get full access to my content and $1k in free credits for leading AI tools and APIs like Perplexity, Replicate, and Hugging Face. It’s common to expense the paid membership from your company’s learning and development education stipend.


Industry announcements

  1. OpenAI releases GPT-4o mini - a cost-effective multimodal model, offering superior performance at 60% less cost than GPT-3.5 Turbo

  2. Mistral releases Mistral Large 2, boasting a 128k context window and superior performance in multiple languages and coding tasks

  3. OpenAI unveils SearchGPT - a Google Search competitor combining real-time web information with conversational abilities for precise and fast answers

  4. OpenAI begins alpha testing Advanced Voice Mode - ChatGPT's new real-time and natural conversational AI, demoed at OpenAI's Spring Update event in May

  5. Anthropic releases a revamped developer console to help LLM users and developers optimize and evaluate prompts, including a side-by-side comparison of outputs

  6. Odyssey, a new startup emerging from stealth, presents Hollywood-grade visual AI, offering filmmakers and artists detailed control over geometry, materials, lighting, and motion for high-end productions

    ChatGPT Advanced Voice Mode is starting to roll out to alpha users

Large Language Models

Open-source

  1. Meta unveils Llama 3.1 - an open suite of three language models (8B, 70B, 405B), with the largest one outperforming GPT-4o and Claude Sonnet on multiple benchmarks

  2. Google releases Gemma-2-2B - a compact, commercially permissive 2-billion-parameter model that outperforms models 20x its size, including Mixtral 8x7B, GPT-3.5, and Llama-2 70B

  3. Mistral partners with the Mamba creators to release the commercially permissive Codestral-Mamba 7B - the strongest code model for its size, featuring a 256k context window

  4. Mistral releases Mathstral - an open model excelling in multi-step logical reasoning for STEM subjects, achieving state-of-the-art scores on key benchmarks 

  5. Mistral AI releases Mistral NeMo - a powerful 12B open language model developed with Nvidia, featuring a 128k context window and outperforming competitors in reasoning and coding tasks
