Git for Data - lakeFS

Data Stack
Data Lake, Data Modeling, Data Governance
Paying Customers

Git for Data - lakeFS allows you to manage your data as code, enabling reproducible and high-quality data pipelines. It supports all standard computation engines and is format agnostic, so it works with any data type. With support for the major object stores and seamless integration with your data stack, lakeFS is built for data engineers and data scientists.

Who's using?
Volvo, Windward, Epcor, Paige, enigma, Karius, Toyota, BAE Systems, AppsFlyer, Air Asia, Netflix, Tomra, Daimler, apex, similarweb, Context Labs, Terramera, Proton Mail

Manage your data as code using Git-like operations and achieve reproducible, high-quality data pipelines. Available as open source or in the cloud.

Run open source locally

Take control of your data


lakeFS supports all standard computation engines.


lakeFS uses metadata to manage data versions. Its versioning engine is highly scalable, with minor impact on storage performance.
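To make the idea of metadata-based versioning concrete, here is a minimal, in-memory sketch in Python. It is a toy model and not lakeFS code or the lakeFS API: the `ToyRepo` class and all of its method names are hypothetical, chosen only for illustration. The sketch assumes that a commit is a mapping from paths to content hashes and that a branch is a pointer to a commit, so creating a branch copies no object data.

```python
import hashlib


class ToyRepo:
    """Toy in-memory model of metadata-based versioning (hypothetical, not lakeFS code)."""

    def __init__(self):
        self.objects = {}               # content hash -> bytes, stored once
        self.commits = {}               # commit id -> (parent id, {path: content hash})
        self.branches = {"main": None}  # branch name -> head commit id
        self.staged = {"main": {}}      # branch name -> uncommitted {path: content hash}

    def _snapshot(self, branch):
        # Committed state of the branch head, overlaid with staged changes.
        head = self.branches[branch]
        view = dict(self.commits[head][1]) if head is not None else {}
        view.update(self.staged[branch])
        return view

    def put(self, branch, path, data):
        digest = hashlib.sha256(data).hexdigest()
        self.objects[digest] = data     # identical content is stored only once
        self.staged[branch][path] = digest

    def get(self, branch, path):
        return self.objects[self._snapshot(branch)[path]]

    def commit(self, branch, message):
        snapshot = self._snapshot(branch)
        parent = self.branches[branch]
        raw = repr((parent, sorted(snapshot.items()), message)).encode()
        commit_id = hashlib.sha256(raw).hexdigest()[:12]
        self.commits[commit_id] = (parent, snapshot)
        self.branches[branch] = commit_id
        self.staged[branch] = {}
        return commit_id

    def create_branch(self, name, source):
        # Only a pointer is copied; no object data is duplicated.
        self.branches[name] = self.branches[source]
        self.staged[name] = {}


repo = ToyRepo()
repo.put("main", "events/day1.csv", b"a,b\n1,2\n")
first = repo.commit("main", "initial load")

repo.create_branch("experiment", "main")
repo.put("experiment", "events/day1.csv", b"a,b\n9,9\n")
repo.commit("experiment", "try a fix")

print(repo.get("main", "events/day1.csv"))        # main is unchanged
print(repo.get("experiment", "events/day1.csv"))  # the branch sees its own version
```

In this sketch, branching is cheap because only metadata changes, and the data on `main` stays isolated from work done on `experiment`. A real versioning engine has to handle persistence, merges, and concurrency, none of which are modeled here.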


lakeFS is format agnostic: it works with structured data, unstructured data, open table formats, or anything else.

Object Storage

lakeFS supports data in all object stores, including those of the major cloud providers (S3, Azure Blob, GCP) and on-prem systems (MinIO, Ceph, Dell EMC), as well as any other S3-compatible storage.

Use Cases

lakeFS helps data engineers and data scientists in every field manage their data like code, at scale.

  • Data Science
  • Data Engineering
  • Data Ops

lakeFS is already helping thousands of developers

  • Up to 80%: reduce storage costs
  • Double efficiency
  • Up to 99%: increase production outage recovery

Here's what ML and Data Engineers using lakeFS have to say


Official partners

Seamless integration with your entire data stack
