The Multiprocessor of Language Models
How having a centralized LLM API can leapfrog companies ahead of the competition
Welcome to Deep Dives - an AI Tidbits section providing editorial takes and insights to make sense of the latest in AI. Let’s go!
I was somewhat new to the payments space when I joined Stripe. I remember being dazzled by the sheer complexity set in motion when someone hits the “Book a Ride” button to catch an Uber. It was at Stripe that I learned about a new concept: the multiprocessor.
Imagine you are the Head of Payments at Amazon or Uber—you operate across dozens of countries, serve multiple types of customers, and employ different pricing structures, such as subscription-based and one-time payments. How do you maximize your approval rate, i.e., ensure every legitimate payment goes through? How do you avoid losing millions of dollars for every minute your partnering payment processor is down?
Companies like Amazon partner with more than one payment processor (Stripe is one). That way, they ensure redundancy by falling back to another processor if their main one is down. They also maximize approval rates by routing payments to regional processors based on geography, e.g., using a European processor for a Norway-based cardholder. Beyond better payment performance, this multiprocessor setup gives them leverage over their partners: they can drive down costs and push for better results, or else churn some of their payment traffic to a competitor.
Now let’s talk about language models. Today, there are five major proprietary LLM providers: OpenAI (GPT), Google (PaLM), Anthropic (Claude), Inflection (Pi), and Cohere. There are dozens more open-source language models, small and large, with the best-performing ones including Meta’s Llama 2, Stability AI’s StableBeluga2, and the recently introduced Platypus.
Each of these models has its own traits and quirks: some are faster, cheaper, easier to experiment with, or small enough to run on consumer hardware; others are better suited to specific language tasks. Numerous guides teach how to choose the right LLM, and most state the obvious: start with OpenAI and go from there. The truth is that everyone defaults to GPT-4 anyway. “We’ll take care of costs and latency later” is what I usually hear.
There is merit to that approach, but why choose just one?
A unified LLM API layer, à la LLM Multiprocessor
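To make the analogy concrete, here is a minimal sketch of what such a layer could look like: a router that tries providers in priority order and falls back to the next one when a call fails, mirroring how a payments multiprocessor reroutes traffic during an outage. The provider functions below are hypothetical stand-ins for real SDK calls (OpenAI, Anthropic, etc.); the names, signatures, and simulated failure are all assumptions for illustration.

```python
# A toy "LLM multiprocessor": ordered fallback across providers.
# The provider callables are fake stand-ins, not real SDK APIs.

def call_gpt4(prompt: str) -> str:
    # Simulate the primary provider being down.
    raise TimeoutError("provider outage")

def call_claude(prompt: str) -> str:
    # Simulate a healthy secondary provider.
    return f"claude: {prompt}"

def complete(prompt: str, providers, retries_per_provider: int = 1) -> str:
    """Try each (name, fn) provider in priority order; fall back on failure."""
    last_error = None
    for name, fn in providers:
        for _ in range(retries_per_provider):
            try:
                return fn(prompt)
            except Exception as err:
                last_error = err  # in practice: log, emit metrics, back off
    raise RuntimeError("all providers failed") from last_error

providers = [("gpt-4", call_gpt4), ("claude", call_claude)]
print(complete("Summarize my payments report.", providers))
```

A production version would add the routing rules discussed above—per-region or per-task model selection, cost- and latency-aware ordering—on top of this same fallback skeleton.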