kreuzberg — Skillopedia

Kreuzberg Document Extraction

Kreuzberg is a high-performance document intelligence library with a Rust core and native bindings for Python, Node.js/TypeScript, Ruby, Go, Java, C#, PHP, and Elixir. It extracts text, tables, metadata, and images from 91+ file formats including PDF, Office documents, images (with OCR), HTML, email, archives, and academic formats.

Use this skill when writing code that:

- Extracts text or metadata from documents
- Performs OCR on scanned documents or images
- Batch-processes multiple files
- Configures extraction options (output format, chunking, OCR, language de…