7 Comments

Great round-up!

I'm a huge fan of Midjourney and agree that it currently leads the text-to-image pack, especially when it comes to photographic images. But it strikes me just how fast text-to-image models are converging. I recently did a round-up of 7 available image models and one major conclusion is that most of them offer robust, free alternatives to Midjourney, which by now is the only fully paid model around.

author

Fully agree. I'm constantly amazed by the results I get with the open-source Fooocus https://github.com/lllyasviel/Fooocus

Jan 4 · Liked by Sahar Mor

Wow, I learned so much! Excited to read another newsletter today! High five, Sahar!!

Jan 3 · Liked by Sahar Mor

Nice highlight. I think the next thing after LLMs is multimodal LLMs, which were starting to take off by the end of 2023.

author

That is especially true when combined with wearables—Meta plans to equip its Ray-Ban smart glasses with its multimodal AI as early as this quarter https://www.theverge.com/2023/12/12/23998780/ray-ban-smart-glasses-hey-meta-multimodal-ai-features


Excellent overview. Love the depth, and it's interesting to see your take on the best open-source models at different parameter counts, particularly the smaller ones. The recent developments in the <7B arena are very exciting!

author

My 2024 bet: Mistral open-sources a GPT-4-level small language model, igniting a wave of new language models akin to the surge that followed the release of Mistral 7B.
