Git for Data - lakeFS allows you to manage your data as code, enabling reproducible and high-quality data pipelines. It supports all standard computation engines and is format agnostic, making it versatile for any data type. With support for major object stores and seamless integration with your data stack, lakeFS is the ideal solution for data engineers and data scientists. Visit lakefs.io for more information.
Manage your data as code with Git-like operations to build reproducible, high-quality data pipelines. Available as open source or in the cloud.
Take control of your data
Compute Engines
lakeFS supports all standard computation engines.
lakeFS
lakeFS uses metadata to manage data versions. Its versioning engine is highly scalable, with only a minor impact on storage performance.
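To make the idea of metadata-based versioning concrete, here is a minimal in-memory sketch. It is not lakeFS's implementation or API: the `ToyRepo` class and all of its methods are hypothetical names invented for illustration. It only shows the general principle that a branch or commit can be a small mapping from paths to immutable objects, so creating a branch copies no data.

```python
import hashlib


class ToyRepo:
    """Illustrative toy model only; not lakeFS's data model or API."""

    def __init__(self):
        self.objects = {}               # content hash -> bytes, stored once
        self.commits = []               # each commit is {path: content hash}
        self.branches = {"main": None}  # branch name -> index of head commit
        self.staged = {"main": {}}      # uncommitted {path: hash} per branch

    def put(self, branch, path, data):
        # Store the bytes once, then record only a pointer on the branch.
        digest = hashlib.sha256(data).hexdigest()
        self.objects.setdefault(digest, data)
        self.staged[branch][path] = digest

    def _snapshot(self, branch):
        # Copy of the path -> hash mapping at the branch's head commit.
        head = self.branches[branch]
        return dict(self.commits[head]) if head is not None else {}

    def commit(self, branch):
        # A commit is a new metadata mapping; no object data is copied.
        snapshot = self._snapshot(branch)
        snapshot.update(self.staged[branch])
        self.commits.append(snapshot)
        self.branches[branch] = len(self.commits) - 1
        self.staged[branch] = {}
        return self.branches[branch]

    def create_branch(self, name, source):
        # Branching copies one pointer, regardless of how much data exists.
        self.branches[name] = self.branches[source]
        self.staged[name] = {}

    def get(self, branch, path):
        # Reads resolve committed metadata first, then fetch the object.
        return self.objects[self._snapshot(branch)[path]]


repo = ToyRepo()
repo.put("main", "data/a.csv", b"v1")
repo.commit("main")
repo.create_branch("dev", "main")
repo.put("dev", "data/a.csv", b"v2")
repo.commit("dev")
```

In this sketch, `main` still reads the original bytes after `dev` is changed, because each branch resolves paths through its own commit metadata.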
Formats
lakeFS is format agnostic. It works with structured data, unstructured data, open table formats, or anything else.
Object Storage
lakeFS supports data in all object stores, including the major cloud providers (S3, Azure Blob, GCP) and on-prem systems such as MinIO, Ceph, Dell EMC, and any other S3-compatible storage.
Use Cases
lakeFS helps data engineers and data scientists in every field manage their data like code, at scale.
- Data Science
- Data Engineering
- Data Ops
lakeFS is already helping thousands of developers
UP TO 80%
Reduce storage costs
2X
Double efficiency
UP TO 99%
Increase production outage recovery
Here's what ML and Data Engineers using lakeFS have to say
Official partners
Seamless integration with your entire data stack