lakeFS

Versioned data lake on top of object storage.

Established · Open Source · Low lock-in

Pricing: See website (flat rate)

Adoption: Stable

License: Open Source

Overview

What is lakeFS?

lakeFS provides a repeatable, atomic and versioned layer for your data lake, built on top of existing object storage systems. It enables teams to manage their data with the same rigor as code, ensuring reproducibility and collaboration.
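As a rough illustration of what a "versioned layer" over object storage means for addressing: lakeFS locations are written as URIs of the form lakefs://<repository>/<ref>/<path>, where the ref is a branch, tag, or commit ID. The sketch below splits such a URI into its parts. The names LakeFSRef and parse_lakefs_uri, and the repository name example-repo, are hypothetical helpers for this page and are not part of any lakeFS library.

```python
from typing import NamedTuple


class LakeFSRef(NamedTuple):
    """Parts of a lakefs:// URI (hypothetical helper type, not a lakeFS API)."""
    repository: str
    ref: str   # branch name, tag, or commit ID
    path: str  # object path inside the repository; may be empty


def parse_lakefs_uri(uri: str) -> LakeFSRef:
    """Split 'lakefs://<repository>/<ref>/<path>' into its three parts."""
    prefix = "lakefs://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a lakefs:// URI: {uri!r}")
    parts = uri[len(prefix):].split("/", 2)
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError(f"URI needs a repository and a ref: {uri!r}")
    path = parts[2] if len(parts) == 3 else ""
    return LakeFSRef(parts[0], parts[1], path)


# The same logical path can be read from different refs of one repository.
on_main = parse_lakefs_uri("lakefs://example-repo/main/tables/users.parquet")
on_branch = parse_lakefs_uri("lakefs://example-repo/experiment/tables/users.parquet")
```

The point of the ref segment is that two readers can address the same path on different branches and see different versions of the data.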

Key differentiator

lakeFS brings Git-style version control (commits, branches, merges, and rollbacks) to data lakes while the data itself stays in existing object storage.

Capability profile

Strength Radar

[Radar chart: one axis for each of the five strengths listed under Strengths & Weaknesses]

Honest assessment

Strengths & Weaknesses

↑ Strengths

Version control for data lakes

Atomic commits and rollbacks

Branching and merging capabilities

Integration with existing object storage systems

Audit trails and reproducibility
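The strengths above can be made concrete with a toy model. The sketch below is an in-memory illustration of the semantics only (immutable commits, branches as pointers, merge, rollback); it is not the lakeFS API and says nothing about how lakeFS is implemented. ToyRepo, its methods, and the s3://bucket/... addresses are all hypothetical, the merge rule ("source wins") is a deliberate simplification, and deletions are not modeled.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Commit:
    """Immutable snapshot mapping logical paths to object-store addresses."""
    id: str
    parent: Optional[str]
    message: str
    snapshot: Dict[str, str]


class ToyRepo:
    """Toy in-memory model of versioned-data semantics (not the lakeFS API)."""

    def __init__(self) -> None:
        self.commits: Dict[str, Commit] = {}
        self.branches: Dict[str, str] = {}            # branch name -> head commit id
        self.staging: Dict[str, Dict[str, str]] = {}  # uncommitted writes per branch
        self.branches["main"] = self._store(None, "initial commit", {})
        self.staging["main"] = {}

    def _store(self, parent: Optional[str], message: str, snapshot: Dict[str, str]) -> str:
        # Derive the commit id from its content, so history is tamper-evident.
        payload = repr((parent, message, sorted(snapshot.items())))
        cid = hashlib.sha256(payload.encode()).hexdigest()[:12]
        self.commits[cid] = Commit(cid, parent, message, dict(snapshot))
        return cid

    def create_branch(self, name: str, source: str) -> None:
        # Branching copies a pointer, not the data.
        self.branches[name] = self.branches[source]
        self.staging[name] = {}

    def put(self, branch: str, path: str, address: str) -> None:
        # Writes are staged and invisible to committed readers.
        self.staging[branch][path] = address

    def read(self, branch: str) -> Dict[str, str]:
        """Return the committed view of a branch."""
        return dict(self.commits[self.branches[branch]].snapshot)

    def commit(self, branch: str, message: str) -> str:
        head = self.branches[branch]
        snapshot = {**self.commits[head].snapshot, **self.staging[branch]}
        cid = self._store(head, message, snapshot)
        self.branches[branch] = cid  # one pointer update: all staged writes or none
        self.staging[branch] = {}
        return cid

    def merge(self, source: str, dest: str, message: str) -> str:
        # Simplification: on conflicting paths the source branch wins.
        snapshot = {**self.read(dest), **self.read(source)}
        cid = self._store(self.branches[dest], message, snapshot)
        self.branches[dest] = cid
        return cid

    def rollback(self, branch: str) -> None:
        """Move the branch head back to the parent of its current commit."""
        parent = self.commits[self.branches[branch]].parent
        if parent is None:
            raise ValueError("cannot roll back the initial commit")
        self.branches[branch] = parent


# Usage: work on an isolated branch, merge it, then undo the merge.
repo = ToyRepo()
repo.create_branch("experiment", "main")
repo.put("experiment", "tables/users.parquet", "s3://bucket/obj-001")
repo.commit("experiment", "add users table")
repo.merge("experiment", "main", "merge experiment")
repo.rollback("main")
```

In this model the audit trail is simply the chain of parent pointers: every state a branch has ever pointed at remains addressable by commit id, which is what makes a past result reproducible.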

Fit analysis

Who is it for?

✓ Best for

Teams managing large-scale data lakes who need version control and reproducibility

Organizations with complex data workflows that require collaboration and audit trails

✕ Not a fit for

Projects requiring real-time data processing (lakeFS is optimized for batch operations)

Use cases where the overhead of a versioned system would be prohibitive

Cost structure

Pricing

Free Tier: None

Starts at: See website

Model: Flat rate

Enterprise: None
