Brains Up Analytics
Python | Data Engineering | Tools

uv: the Python manager every Data Engineer should know

How uv, from Astral, replaced pip, venv and pyenv in our Databricks and Azure projects — with a 10–100× speed boost.

If you still run pip install in your day-to-day Data Engineering work, you're leaving time and reliability on the table. uv, built by Astral (the same team behind the Ruff linter), is the tool that changed my workflow in 2026 — and it will probably change yours too.

The classic Python environment problem

Anyone working with Python pipelines knows the pain: installing a typical Data Engineering stack (polars, dbt-databricks, azure-storage-blob, pyarrow) with pip takes 45 to 60 seconds on a clean machine. In CI/CD, that time multiplies with every build.

On top of that, managing Python versions across projects, keeping requirements.txt up to date and guaranteeing that the dev environment is identical to production are constant challenges — even with Poetry or Conda.

What is uv?

uv is a Python package and project manager written in Rust, developed by Astral. In a single tool, it replaces:

  • pip — to install packages
  • venv — to create virtual environments
  • pyenv — to manage Python versions
  • pip-compile / pip-tools — to lock dependencies

Performance is the main differentiator: 10 to 100× faster than pip, thanks to smart caching and Rust-powered parallelism.

Installation

curl -LsSf https://astral.sh/uv/install.sh | sh

That's it. uv installs as a single binary, with no external dependencies.

Data Engineering workflow

Create a new project

uv init meu_pipeline
cd meu_pipeline

uv automatically creates, among other files:

  • pyproject.toml — project configuration
  • .python-version — the pinned Python version
  • main.py — a starter file

Add dependencies

uv add polars dbt-databricks azure-storage-blob pyarrow

uv resolves and installs everything in parallel. On first use, with a cold cache, the gain is already visible. With a warm cache (subsequent builds), the time drops to under 1 second.

A uv.lock file is generated automatically with all the exact versions — including transitive dependencies. This file should be committed to Git to guarantee full reproducibility.

Run scripts

uv run python etl.py

Before running the command, uv checks the project's virtual environment. If the environment doesn't exist or is out of date relative to the project's dependencies, uv creates or updates it first.
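One way to see which environment a command actually ran in is a small diagnostic script. This is a minimal sketch using only the standard library. The file name check_env.py is hypothetical, and the expectation that the prefix points at the project's .venv directory assumes uv's default project layout.

```python
import sys


def environment_report() -> dict[str, object]:
    """Describe the interpreter that is executing this script."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # Inside a virtual environment, sys.prefix differs from sys.base_prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        "python": ".".join(map(str, sys.version_info[:3])),
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Saved as check_env.py and started with uv run python check_env.py from the project directory, the report should show in_virtualenv as True. Started with a system interpreter directly, it would typically show False.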

Manage Python versions

uv python install 3.12
uv python pin 3.12

No pyenv or manual PATH configuration required.
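A pipeline can also guard against running under the wrong interpreter by comparing the running version with the pin. The sketch below assumes the .python-version file holds a plain dotted version such as 3.12 or 3.12.4. Other pin formats are reported as unsupported rather than interpreted, and the helper name matches_pin is hypothetical.

```python
import sys
from pathlib import Path


def matches_pin(pin: str, version: tuple[int, ...]) -> bool:
    """Return True if `version` starts with the components of a pin like "3.12"."""
    wanted = tuple(int(part) for part in pin.strip().split("."))
    return tuple(version[: len(wanted)]) == wanted


if __name__ == "__main__":
    pin_file = Path(".python-version")
    if pin_file.exists():
        pin = pin_file.read_text().strip()
        try:
            status = "ok" if matches_pin(pin, sys.version_info[:3]) else "mismatch"
        except ValueError:
            # Raised by int() when the pin is not a plain dotted version.
            status = "unsupported pin format"
        print(f"pinned {pin}, running {sys.version.split()[0]}: {status}")
```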

Use in Databricks and Azure projects

In our Databricks and Azure Data Factory projects, uv brought specific benefits:

Local development environments: With uv sync, any developer replicates the exact project environment with a single command — no setup README, no makefile.

CI/CD: On GitHub Actions and Azure DevOps, uv cuts environment setup time from 2–3 minutes to under 20 seconds.
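For the CI case, one possible pattern is to have the build fail when uv.lock no longer matches pyproject.toml, rather than letting the lock file be rewritten during the build. The helper below is a sketch, not our exact setup: the function names are hypothetical, and it assumes the CI system exports CI=true (as GitHub Actions does) or TF_BUILD=True (as Azure Pipelines does).

```python
import os
import shutil
import subprocess


def sync_command(in_ci: bool) -> list[str]:
    """Build the uv command that installs the project's dependencies."""
    command = ["uv", "sync"]
    if in_ci:
        # --locked makes uv exit with an error if uv.lock would need to change.
        command.append("--locked")
    return command


def running_in_ci(env: dict[str, str]) -> bool:
    """Detect CI from environment variables (assumed: CI or TF_BUILD)."""
    return env.get("CI", "").lower() == "true" or env.get("TF_BUILD", "").lower() == "true"


def run_sync() -> None:
    """Run the sync step, failing early if uv is not installed."""
    if shutil.which("uv") is None:
        raise SystemExit("uv is not on PATH")
    subprocess.run(sync_command(running_in_ci(dict(os.environ))), check=True)
```

Calling run_sync() as the first step of a build script keeps local runs flexible while making CI strict about the committed lock file.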

Isolated ETL scripts: uv supports inline dependencies in scripts:

# /// script
# dependencies = ["polars", "azure-storage-blob"]
# ///

import polars as pl
# ...

When you run the script with uv run etl_script.py, uv installs the script's dependencies automatically in an isolated environment, without polluting the global environment.

Important limitations

uv does not replace Conda for projects that depend on native system libraries (CUDA, cuDNN, ffmpeg). For ML pipelines with GPU or video processing, Conda is still needed for the system dependencies.

For pure Data Engineering (ETL, orchestration, transformation with Polars/DuckDB/dbt), uv has covered every use case we have run into.

Conclusion

uv is the natural evolution of the Python ecosystem for Data Engineers. With superior speed, automatic dependency locking and a unified interface that replaces several tools, it eliminates an entire class of environment problems that eat up time in projects.

If you work with Python on data projects — especially with Databricks, Azure or ETL pipelines — it's worth spending 15 minutes trying uv on your next project.


Official documentation: docs.astral.sh/uv
