Kreuzberg

High-performance document extraction library supporting over 62 formats.

Growing · Open Source · Low lock-in

Pricing: Free tier · Flat rate

Adoption: Rising

License: Open Source

Data freshness: Verified · Jul 16, 2026

Overview

What is Kreuzberg?

Kreuzberg is a high-performance document extraction library with a Rust core, supporting over 62 formats including PDF, Office documents, images with OCR, HTML, email, and archives. It's ideal for developers needing to extract text and metadata from various file types efficiently.

Key differentiator

“Kreuzberg stands out with its high-performance Rust core and support for over 62 document formats, making it an ideal choice for developers needing efficient extraction capabilities in various file types without the need for cloud services.”

Honest assessment

Strengths & Weaknesses

↑ Strengths

Supports over 62 document formats, including PDF, Office documents, images with OCR, HTML, email, and archives. (medium)

High-performance extraction using a Rust core. (medium)

Extensive documentation and community support. (medium)

↓ Weaknesses

Limited direct support for non-Rust languages (high)

The primary language is Rust. Community-maintained SDKs exist for other languages such as TypeScript, but they may not be as robust or up to date.

Complex setup process (medium)

Setting up Kreuzberg requires a solid understanding of Rust package management and dependencies, which can be challenging for developers unfamiliar with the ecosystem.

Documentation lacks examples for advanced use cases (low)

The documentation is extensive but focuses on basic usage rather than the complex scenarios and edge cases that some users encounter.

Fit analysis

Who is it for?

✓ Best for

Developers needing high-performance extraction from various file types in Rust or Python environments.

Projects requiring efficient handling and parsing of multiple document formats without cloud dependency.

✕ Not a fit for

Applications that require real-time streaming, as Kreuzberg is designed for batch processing.

Teams looking for a managed service solution, as it requires self-hosting.
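
The batch-oriented, self-hosted fit described above can be illustrated with a minimal Python sketch. It does not use Kreuzberg's actual API: `extract_fn` is a hypothetical placeholder for whatever extraction call the library provides, and `plain_text_extractor` is a stand-in used only so the example runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Iterable

# Hypothetical signature: takes a file path, returns the extracted text.
# This is an assumption for illustration, not a documented library type.
ExtractFn = Callable[[Path], str]


def extract_batch(paths: Iterable[Path], extract_fn: ExtractFn, workers: int = 4) -> dict:
    """Run extract_fn over every path and record text or an error per file."""
    paths = list(paths)
    results = {}

    def run_one(path: Path):
        # A failure on one document should not abort the whole batch.
        try:
            return path, {"text": extract_fn(path), "error": None}
        except Exception as exc:
            return path, {"text": None, "error": str(exc)}

    # Files are processed locally in a thread pool; no cloud service is involved.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for path, outcome in pool.map(run_one, paths):
            results[path.name] = outcome
    return results


def plain_text_extractor(path: Path) -> str:
    """Stand-in extractor for the demo: reads the file as UTF-8 text."""
    return path.read_text(encoding="utf-8")
```

In real use, the stand-in would be replaced by a call into the extraction library, and the resulting dictionary could be handed to a scheduler task or loaded into a table for analysis.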

Cost structure

Pricing

Free Tier: Available (open source, free to use)

Starts at: not listed

Model: Flat rate

Enterprise: None

Ecosystem

Relationships

Alternatives

pdfminer.six

Works well with

Airflow, Pandas
