AI Tidbits

January 2024 - AI Tidbits Monthly Roundup

Generative models that turn text and images into high-quality videos, cheaper GPT, a powerful non-transformer language model, and novel open-source multimodal models that understand documents and images

Arthur Mor
Feb 04, 2024 ∙ Paid


Welcome to the monthly round-up, where we curate the firehose of AI research papers and tools so you won't have to. If you're pressed for time and can only catch one AI Tidbits edition, this is the one to read, featuring the absolute must-knows.


This first monthly edition of AI Tidbits Monthly unravels the latest and greatest in AI. January kicked off 2024 with a strong start, delivering breakthroughs across generative AI modalities, from text to video and audio.

Most notably, January was the month of image and video generation. Google unveiled Lumiere, its new text-to-video generative model, TikTok open-sourced a new state-of-the-art depth estimation model, and an image synthesis tool called InstantID took the internet by storm.

We also saw substantial progress in multimodal AI with the release of LLaVA 1.6 and Qwen-VL-Max, two commercially permissive models capable of understanding images and reading documents.

On the open-source front, Microsoft allowed the commercial use of its powerful and small Phi-2 model, Meta open-sourced the 70B version of its powerful coding language model Code Llama, and a non-transformer model rivaled large transformer-based SOTA models while being cheaper and faster.

These and many other exciting updates across generative audio, autonomous agents, useful AI-powered tools, and open-source repositories are part of this month's roundup.

Let's dive in!


Overview

  • Industry announcements (6 entries)

  • Large Language Models

    • Open-source (10 entries)

    • Research (16 entries)

  • Autonomous Agents (2 entries)

  • Image and Video (12 entries)

  • Audio (4 entries)

  • Multimodal (5 entries)

  • Open-source packages (5 entries)

  • AI tools (7 entries)

Recent Deep Dives

  • [cross-post] 7 methods to secure LLM apps from prompt injections and jailbreaks (Sahar Mor, February 9, 2024)

  • 12 techniques to reduce your LLM API bill and launch blazingly fast products (Sahar Mor, January 13, 2024)

  • AI Tidbits 2023 SOTA Report (Sahar Mor, December 28, 2023)

  • Most popular and upcoming Generative AI tools and APIs (Sahar Mor, December 19, 2023)

  • Harnessing research-backed prompting techniques for enhanced LLM performance (Sahar Mor, December 10, 2023)

Industry announcements

  1. OpenAI announces new embedding models with superior performance and affordability, alongside reduced pricing for GPT-3.5 Turbo and an improved GPT-4 Turbo preview addressing response "laziness"

  2. OpenAI launches the GPT Store, its platform for sharing and monetizing custom GPTs

  3. Rabbit introduces R1 - a $199 standalone AI device featuring voice control and a Large Action Model for universal application control

  4. Microsoft launches a Pro plan for its Copilot chatbot, infusing its Microsoft 365 suite of apps (Outlook, Word, Excel, etc.) with ChatGPT-like capabilities

  5. Meta's Mark Zuckerberg announces plans to develop AGI and Llama 3, merging the FAIR and GenAI teams and significantly expanding GPU capacity to advance AI capabilities

  6. Stability AI introduces Stable LM 2 1.6B - an efficient multilingual language model, setting a new benchmark in small-scale LMs

Microsoft’s Copilots are now everywhere

Large Language Models (LLMs)

Open-source

  1. Microsoft changes the license for Phi-2, its small but powerful open-source language model, allowing commercial use

  2. Meta open-sources Code Llama 70B - a coding language model outperforming GPT-4 in complex coding tasks, available for commercial use

  3. Alibaba open-sources Qwen-VL-Max - a large vision language model outperforming all previous open-source models and performing on par with Gemini Ultra and GPT-4V

  4. Abacus AI releases SMAUG, the world-leading 30B open-source LLM, with a top MMLU score of 76%, inching towards GPT-4 performance

  5. RWKV releases Eagle 7B, a non-transformer multilingual language model rivaling larger models in performance while being faster and cheaper, available for open commercial use 

  6. Moondream1 - a tiny 1.6B parameter vision language model that punches above its weight

  7. Researchers open source TinyLlama - a 1.1B parameter language model, showcasing superior performance over similar-sized models

  8. Researchers release Orion-14B - a collection of state-of-the-art multilingual LLMs achieving superior performance in diverse tasks

  9. Researchers release Medusa-2 - combining parallel token predictions and tree-based attention to achieve up to 3.6x faster inference without compromising quality or accuracy

  10. A new leaderboard on Hugging Face to track and evaluate hallucinations in open-source language models

Qwen-VL-Max is better at recognizing, extracting, and analyzing details within images and texts

Become a premium member to get full access to my content and $1k in free credits for leading AI tools and APIs. It’s common to expense the paid membership from your company’s learning and development education stipend.


Research

  1. Mistral published the Mixtral MoE 8x7B paper, shedding light on its architecture and training process

  2. Anthropic uncovers that LLMs can learn and retain deceptive behaviors, with standard safety techniques proving ineffective

  3. University of Maryland introduces Binoculars - a method that detects LLM-generated text with 90% accuracy, outperforming existing tools with minimal false positives

Keep reading with a 7-day free trial

Subscribe to AI Tidbits to keep reading this post and get 7 days of free access to the full post archives.

Already a paid subscriber? Sign in
© 2025 Substack Inc
Privacy ∙ Terms ∙ Collection notice
Start writingGet the app
Substack is the home for great culture

Share

Copy link
Facebook
Email
Notes
More