Kreuzberg

High-performance document extraction library supporting over 62 formats.

Established · Open Source · Low lock-in

Pricing: See website (flat rate)

Adoption: Stable

License: Open Source

Overview

What is Kreuzberg?

Kreuzberg is a high-performance document extraction library built on a Rust core. It supports over 62 formats, including PDF, Office documents, images (via OCR), HTML, email, and archives, and is aimed at developers who need to extract text and metadata from many file types efficiently.

Key differentiator

Kreuzberg combines a high-performance Rust core with support for over 62 document formats, and it runs locally, so extraction does not depend on cloud services.

Honest assessment

Strengths & Weaknesses

↑ Strengths

Supports over 62 document formats including PDF, Office documents, images with OCR, HTML, email, and archives.

High-performance extraction using Rust core.

Extensive documentation and community support.

Fit analysis

Who is it for?

✓ Best for

Developers needing high-performance extraction from various file types in Rust or Python environments.

Projects requiring efficient handling and parsing of multiple document formats without cloud dependency.

✕ Not a fit for

Applications that require real-time streaming, as Kreuzberg is designed for batch processing.

Teams looking for a managed service solution, as it requires self-hosting.
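The batch-oriented, self-hosted usage described above can be sketched roughly as follows. This is a minimal illustration, not Kreuzberg's API: `extract_batch`, the `Extractor` type, and the stand-in `read_plain_text` function are all hypothetical names introduced here, and the sketch assumes the extraction call can be wrapped as a plain function that takes a path and returns text.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict, Iterable, Union

# Hypothetical signature: any function that takes a file path and returns text.
Extractor = Callable[[Path], str]


def extract_batch(
    paths: Iterable[Path],
    extract: Extractor,
    max_workers: int = 4,
) -> Dict[Path, Union[str, Exception]]:
    """Run `extract` over every path locally; map each path to its text or its error."""
    path_list = list(paths)

    def safe_extract(path: Path) -> Union[str, Exception]:
        try:
            return extract(path)
        except Exception as exc:
            # Record the failure so one bad file does not stop the batch.
            return exc

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with path_list.
        results = list(pool.map(safe_extract, path_list))
    return dict(zip(path_list, results))


def read_plain_text(path: Path) -> str:
    """Stand-in extractor for illustration only; it just reads UTF-8 text."""
    return path.read_text(encoding="utf-8")
```

In a real pipeline, `read_plain_text` would be replaced by a thin wrapper around the library's own extraction call; the exact function names should be taken from the Kreuzberg documentation rather than from this sketch.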

Cost structure

Pricing

Free Tier: None

Starts at: See website

Model: Flat rate

Enterprise: None

Next step

Get Started with Kreuzberg

Step-by-step setup guide with code examples and common gotchas.
