Reproducible software environments
Last updated on 2025-10-20
Overview
Questions
- What are virtual environments and why do we use them?
- How can we manage Python and external libraries reliably and repeatably?
Objectives
- Understand what a virtual environment is and why it supports reproducible work.
- Use uv to create, manage, and share a reproducible project environment.
- Learn what information lives in pyproject.toml and uv.lock.
- Practice recreating an environment and verifying that results match.
So far, we have created a local Git repository to track changes in our software project and pushed it to GitHub to enable others to see and contribute to it. We now want to start developing the code further — and ensure that it runs the same way every time.
At this point, the code in your local software project’s directory should be as in: https://github.com/carpentries-incubator/bbrs-software-project/tree/03-reproducible-dev-environment
Software dependencies
If we look at our script, we may notice a few import lines such as import json, import csv, import datetime as dt, and import matplotlib.pyplot as plt.
This means that our code depends on several libraries — some built in (standard library) and some external.
json, csv, and datetime come
with Python.
Packages like matplotlib or pandas must be
installed separately.
Each project often needs specific versions of libraries. To avoid one project breaking another, we isolate each in its own environment.
Key terms
| Term | Meaning |
|---|---|
| Environment | A self-contained directory holding a Python interpreter and its installed packages. |
| Dependency | A library your code needs to run. |
| Package manager | A tool that installs and tracks dependencies (e.g., uv, pip, or conda). |
| Lockfile | A record of exact versions of dependencies for reproducibility. |
What are virtual software environments?
A Python virtual environment is an isolated working
copy of Python plus the packages your project needs.
This isolation prevents version conflicts and makes collaboration
easier.
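One way to see this isolation in practice is to ask the running interpreter where it lives. The sketch below relies only on the standard library; the function name in_virtual_environment is a hypothetical helper, not part of any tool used in this lesson.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a virtual environment, sys.prefix points at the environment directory,
    # while sys.base_prefix points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Inside a virtual environment:", in_virtual_environment())
```

Running this with the system Python and again from inside a project environment should print different interpreter paths.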
We can visualize multiple projects on the same computer, each with its own environment:

True reproducibility is complex
Creating isolated environments makes your work much more reproducible, but full computational reproducibility can also depend on the OS, CPU/GPU, and system libraries. For most research projects, environment management like this is good enough — and a major improvement over doing nothing.
Note on environment managers
Python packaging and environment management tools are changing
quickly.
This lesson uses uv because it provides a simple,
modern interface that unifies dependency installation, environment
creation, and project management in one tool.
The broader Python community is still exploring which tools will become the long-term standard. Tools such as Pixi, Poetry, and PDM offer similar goals, and each has its own strengths and trade-offs:
- Pixi builds on the Conda package ecosystem and supports multiple languages.
- Poetry and PDM emphasize packaging and publishing workflows.
- uv is new and fast, but it’s developed by a company (Astral) rather than a community-governed project, which some researchers consider when choosing tools.
The core idea stays the same: isolate dependencies and record exact versions for reproducibility. As the ecosystem matures, you may wish to test more than one tool and see which fits your workflow best.
2. Add dependencies
Explore the generated files
Open pyproject.toml in VS Code.
Find the line listing your dependencies.
Now open uv.lock — this file records exact
versions so anyone can reproduce your setup later.
Collaborators can now reproduce your setup exactly:
BASH
git clone <your-repo>
cd <your-repo>
uv python install
uv sync
uv run python eva_data_analysis.py
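After recreating the environment, one simple way to verify that results match is to compare checksums of the output files. The sketch below is an assumption about how this could be done, not a step from the lesson: the helper names and the stand-in CSV contents are placeholders so the example runs on its own.

```python
import hashlib
import tempfile
from pathlib import Path


def file_sha256(path):
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def outputs_match(path_a, path_b):
    """Return True when two files are byte-for-byte identical."""
    return file_sha256(path_a) == file_sha256(path_b)


# Stand-in files with placeholder content; in practice, point these at the
# output of your own run and the output of a collaborator's run.
with tempfile.TemporaryDirectory() as tmp:
    mine = Path(tmp) / "mine.csv"
    theirs = Path(tmp) / "theirs.csv"
    changed = Path(tmp) / "changed.csv"
    mine.write_bytes(b"year,hours\n1965,0.2\n")
    theirs.write_bytes(b"year,hours\n1965,0.2\n")
    changed.write_bytes(b"year,hours\n1965,0.3\n")

    identical = outputs_match(mine, theirs)
    different = outputs_match(mine, changed)

print("Same results:", identical)
print("Changed file matches:", different)
```

A byte-for-byte comparison is strict: differences in line endings between operating systems, or timestamps embedded in image files, can change the checksum even when the underlying results agree.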
Handling encoding issues (Windows example)
When running your script, you might see this error on Windows:
This happens because Windows often uses a different default text
encoding (commonly cp1252) than Linux and macOS (usually UTF-8).
Fix it by explicitly setting the encoding when reading and writing
files:
PYTHON
data_f = open('./eva-data.json', 'r', encoding='ascii')
data_t = open('./eva-data.csv', 'w', encoding='utf-8')
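The effect of a mismatched encoding can be reproduced on any operating system. This is a small illustrative sketch, separate from the lesson's script: it writes a temporary file as UTF-8 and reads it back twice, once with the matching encoding and once with cp1252, which is assumed here as a typical Windows default.

```python
import locale
import tempfile
from pathlib import Path

# The encoding open() falls back to when none is given; it varies by platform and settings.
print("Default text encoding here:", locale.getpreferredencoding(False))

text = "café"  # contains one non-ASCII character

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sample.txt"

    # Write with an explicit encoding so the bytes on disk are the same everywhere.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    # Reading with the same explicit encoding recovers the original text.
    with open(path, "r", encoding="utf-8") as f:
        same = f.read()

    # Reading with a different encoding silently produces garbled text.
    with open(path, "r", encoding="cp1252") as f:
        garbled = f.read()

print("Read as utf-8 matches:", same == text)
print("Read as cp1252 matches:", garbled == text)
```

Depending on the bytes involved, a mismatch either garbles the text as it does here or raises a UnicodeDecodeError, which is why stating the encoding explicitly in every open() call is the more portable choice.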
Then commit your fix:
Reflection
How might using an environment manager like uv help your future self or a collaborator reproduce your work?
Further reading
- Official Python Packaging: pyproject.toml
- uv documentation
- Python Tutorial: Virtual Environments and Packages
Other environment managers
You may encounter other tools that manage environments differently or focus on different workflows:
| Tool | Distinguishing feature |
|---|---|
| Pixi | Rust-based, cross-language manager built on the Conda package ecosystem |
| Poetry | Long-standing project manager emphasizing packaging and publishing |
| PDM | Modern, PEP-compliant tool similar to Poetry but lighter-weight |
| Conda / Mamba | Popular in data science for compiled packages and cross-language use |
Each follows the same principle: isolate dependencies and record them for reproducibility.
This workshop focuses on uv for its straightforward, fast setup and beginner-friendly interface, but we encourage exploring these other tools as you grow more comfortable with Python project management.
- A virtual environment isolates your project’s Python version and dependencies.
- Tools such as uv, pixi, and poetry automate this process in different ways.
- pyproject.toml declares what you need; uv.lock (or equivalent) records exact versions.
- Reproducibility depends on capturing this metadata, not on which specific tool you use.
- uv run executes code safely inside the environment.
- Share pyproject.toml + uv.lock; ignore .venv/.
- Recreate an environment anytime with uv sync or the equivalent for your tool of choice.