LinkedIn Highlights, Aug 2024

AI models for PDF extraction, Mistral's fine-tuning SDK, an open-source Perplexity clone, HippoRAG's 20% performance boost over RAG methods, and OpenAI's new evaluation library for benchmarking LLMs

Sep 22, 2024

∙ Paid

Introducing: AI Tidbits LinkedIn Highlights

Welcome to a new AI Tidbits series! Each month, I'll share my five top-performing LinkedIn posts, bringing you the best of AI straight from the frontlines of academia and industry.

As a frequent LinkedIn contributor, I regularly share insights on groundbreaking papers, promising open-source packages, and significant AI product launches. These posts offer more depth and detail than our weekly snippets, providing a comprehensive look at the latest AI developments.

Whether you're not on LinkedIn or simply missed a post, this monthly roundup ensures you stay informed about the most impactful AI news and innovations.

Become a premium member to get full access to my content and $1k in free credits for leading AI tools and APIs like Claude, Replicate, and Hugging Face. It’s common to expense the paid membership from your company’s learning and development education stipend.

Upgrade to Premium

1. PDF Extract Kit

Extracting information from documents has been one of AI’s holy grails. A new open-source project deploys specialized AI models to tackle this challenge head-on.

PDF-Extract-Kit is a comprehensive pipeline that breaks down PDF content extraction into several components:

Layout detection - leveraging LayoutLMv3 to precisely identify regions like images, tables, titles, and text
Table recognition - featuring StructEqTable for converting complex tables into LaTeX
OCR - utilizing PaddleOCR for high-performance text extraction in multiple languages
Formula detection - using YOLOv8 to accurately detect inline and isolated formulas
Formula recognition - employing UniMERNet to rival commercial software in formula recognition quality

Trained on diverse datasets, these models handle various document types, from academic papers to financial reports.

GitHub repo https://github.com/opendatalab/PDF-Extract-Kit

Deep Dives

Revolutionizing document processing with multimodal GPT

Sahar Mor

October 30, 2023

Revolutionizing document processing with multimodal GPT

Welcome to Deep Dives - an AI Tidbits section providing editorial takes and insights to make sense of the latest in AI. Let’s go!

Read full story

2. Mistral fine-tuning API + SDK

Mistral just dropped a game-changing fine-tuning API and SDK to help developers easily fine-tune Mistral variants on a single GPU. Clone, prep, train.

The SDK is a lightweight GitHub repository that leverages LoRA, allowing for memory-efficient training by freezing most model weights and only updating 1-2% with low-rank matrix perturbations. It's optimized for multi-GPU setups but can also be used with a single GPU for smaller models like the 7B.

To get started:

Clone the repo and install dependencies
Download and prepare your model and data
Validate and start training with a few simple commands

This repository is opinionated to simplify the finetuning process, focusing on Mistral models and specific hardware. It also includes a Colab notebook to hit the ground running.

Full details and setup instructions are in the GitHub repo https://github.com/mistralai/mistral-finetune

3. Perplexica

A new open-source project called Perplexica replicates the $2.5B startup Perplexity so developers can easily launch AI-powered search tools.

Perplexica is an open-source AI-powered search tool that dives deep into the internet to find precise answers. Perplexica not only searches the web but also understands your questions, delivering clear answers with cited sources.

It supports local models such as Llama3 and Mixtral for faster and cheaper inference and has six specialized modes tailored to answer specific types of questions:

All Mode - searches the entire web for the best results
Writing Assistant Mode - assists with writing tasks without web searches
Academic Search Mode - ideal for finding articles and papers for academic research
YouTube Search Mode - finds YouTube videos based on search queries
Wolfram Alpha Search Mode - uses Wolfram Alpha for calculations and data analysis
Reddit Search Mode - searches Reddit for discussions and opinions

Unlike other tools that use outdated data, Perplexica provides the latest information using a metasearch engine called SearxNG.

GitHub repo https://github.com/ItzCrazyKns/Perplexica

4. HippoRAG

Keep reading with a 7-day free trial

Subscribe to AI Tidbits to keep reading this post and get 7 days of free access to the full post archives.