AI Roundup 08/22 -> 09/05/2024
A new text-to-video model flooding the internet, Alibaba's new Qwen2-VL to dominate vision tasks, an AI coding startup featuring a 100M-token context window, new techniques for RAG, and $1B for safe AI
Welcome to the weekly edition of AI Tidbits, where I curate the firehose of AI research papers and tools every week so you won’t have to.
📩 Published a new breakthrough paper? Just released an open-source package? Submit it here to ensure we don’t miss it and that it gets featured in next week’s post.
Overview
✨ Highlights (7 entries)
Language Models (17 entries)
Multimodal (6 entries)
Vision (8 entries)
Audio (4 entries)
Open-source Packages (4 entries)
Recent Deep Dives
✨ Highlights
Tsinghua University openly releases its state-of-the-art text-to-video generation model, CogVideoX-5B (Hugging Face)
Ex-OpenAI chief scientist Ilya Sutskever raises $1B at a $5B valuation to develop safe superintelligence (TechCrunch)
Language Models
Multimodal
Vision
Audio
Open-source Packages
Kotaemon - a RAG-based tool for chatting with your documents
v1 - build a production-ready SaaS using the latest Next.js framework
Plus >70 more open-source packages for AI engineers
Last week’s AI Tidbits roundup
Reach AI builders, researchers, and entrepreneurs by partnering with AI Tidbits
If you find AI Tidbits valuable, share it with a friend and consider showing your support.





![361438165-c182f606-8f8c-421d-b414-8487070fcfcb.mp4](https://substackcdn.com/image/fetch/$s_!sOtm!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0679774-23a8-43f4-925c-80b1e531dbc6_600x400.gif)


![temp.mov [video-to-gif output image] temp.mov [video-to-gif output image]](https://substackcdn.com/image/fetch/$s_!DIZO!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F25b3dff8-b331-45a9-868d-56b98eb3b94a_600x362.gif)
![temp.mov [video-to-gif output image] temp.mov [video-to-gif output image]](https://substackcdn.com/image/fetch/$s_!oGiK!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd37e0724-b129-43a1-b5b9-24de6ad1befc_600x133.gif)


















![Show-o.mp4](https://substackcdn.com/image/fetch/$s_!n2vP!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2be23604-251e-4a98-bd36-fdaacd31b7d7_600x302.gif)





![temp.mov [optimize output image] temp.mov [optimize output image]](https://substackcdn.com/image/fetch/$s_!AcXP!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcadb1ac9-454f-45bf-b626-0a31a7da04e5_600x336.gif)

![temp.mov [optimize output image] temp.mov [optimize output image]](https://substackcdn.com/image/fetch/$s_!Pf8M!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa53c25f2-f172-4782-94ac-ef6c1650435e_600x178.gif)
![KITTI_02.mp4](https://substackcdn.com/image/fetch/$s_!YQW_!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F642884d7-25f9-4e5d-8a7d-eedfbbf4011b_600x432.gif)



![temp.mov [optimize output image] temp.mov [optimize output image]](https://substackcdn.com/image/fetch/$s_!-x76!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa0ae5d67-b7b1-40f9-ba4c-18e00c12caf1_600x338.gif)
