7 Comments
Daniel Nest's avatar

Great round-up!

I'm a huge fan of Midjourney and agree that it currently leads the text-to-image pack, especially when it comes to photographic images. But it strikes me just how fast text-to-image models are converging. I recently did a round-up of 7 available image models and one major conclusion is that most of them offer robust, free alternatives to Midjourney, which by now is the only fully paid model around.

Sahar Mor's avatar

Fully agree. I'm constantly amazed by the results I get with the open-source Fooocus: https://github.com/lllyasviel/Fooocus

Jelena's avatar

Wow, I learned so much! Excited to read another newsletter today! High five, Sahar!!

Andreas's avatar

Nice highlight. I think the next thing after LLMs is multimodal LLMs, which are starting to rise as of the end of 2023.

Sahar Mor's avatar

That is especially true when combined with wearables. Meta plans to equip its Ray-Ban smart glasses with its multimodal AI as early as this quarter: https://www.theverge.com/2023/12/12/23998780/ray-ban-smart-glasses-hey-meta-multimodal-ai-features

Joseph Carley's avatar

Excellent overview. Love the depth, and it's interesting to see your take on the best open-source models at different parameter levels, particularly the lower ones. The recent developments in the <7B arena are very exciting!

Sahar Mor's avatar

My 2024 bet: Mistral open-sourcing a GPT-4 level small language model, igniting a wave of new language models akin to the surge that followed the release of Mistral 7B.
