Production-grade notebooks¶
This page summarizes some external tools for productionizing Jupyter notebooks, all of which can be run easily within bluprint projects.
Version control¶
We can omit certain metadata, such as cell timestamps and execution counts, as well as cell output, from Jupyter notebooks under version control by using nbstripout. To install the nbstripout package in your project, run this one-time setup:
uv add nbstripout
and then install a git filter:
uv run nbstripout --install
To keep cell output under version control, add the --keep-output argument to .git/config:
...
[filter "nbstripout"]
clean = \"/path/to/project/.venv/bin/python\" -m nbstripout --keep-output
smudge = cat
[diff "ipynb"]
textconv = \"/path/to/project/.venv/bin/python\" -m nbstripout --keep-output -t
For more details see nbstripout instructions.
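To make concrete what gets removed, here is a minimal sketch of the idea using only the Python standard library. It is not nbstripout's implementation: the function name `strip_notebook`, its `keep_output` parameter, and the sample notebook are hypothetical and only mirror the behaviour described above. The `execution` metadata key is where timing information is assumed to be stored.

```python
import copy
import json


def strip_notebook(notebook: dict, keep_output: bool = False) -> dict:
    """Return a copy of a notebook dict without execution counts, timing
    metadata and (unless keep_output is True) cell outputs."""
    stripped = copy.deepcopy(notebook)
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        cell["execution_count"] = None
        # Assumption: cell timestamps live under the "execution" metadata key.
        cell.get("metadata", {}).pop("execution", None)
        if keep_output:
            # Keep the outputs, but reset any execution counts inside them.
            for output in cell.get("outputs", []):
                if "execution_count" in output:
                    output["execution_count"] = None
        else:
            cell["outputs"] = []
    return stripped


# Hypothetical one-cell notebook used only for illustration.
example = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {"execution": {"iopub.execute_input": "2024-01-01T00:00:00Z"}},
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["hi\n"]}],
            "source": ["print('hi')"],
        }
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

print(json.dumps(strip_notebook(example)["cells"][0], indent=2))
```

Because the git filter applies this kind of cleaning only to what is committed, the notebook in your working directory keeps its outputs either way.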
Best coding practices¶
Install nbqa with:
uv add 'nbqa[toolchain]'
nbqa allows you to use tools usually used for Python package development in Jupyter notebooks. Several tools are worth mentioning that can help you write production-grade code in notebooks.
Linting¶
Flake8 is a linter, a tool used to analyze code and detect potential errors, bugs, and code style violations. To run flake8 on the notebooks/example_jupyter.ipynb notebook, run:
uv run nbqa flake8 notebooks/example_jupyter.ipynb
You can look up the details of each violation in the wemake-python-styleguide documentation by using the search box on the left.
There are a few ways to ignore violations:
1. Ignoring them on one specific line (for example, for a WPS221 violation); use:
<python code that violates WPS221> # noqa: WPS221
2. Ignoring them across an entire file; to achieve this, add the following to pyproject.toml:
[tool.flake8]
per-file-ignores = ['notebooks/example_jupyter.ipynb: WPS211']
3. Ignoring them in the entire project; for this, add the following to pyproject.toml:
[tool.flake8]
ignore = ['WPS211']
For more details check the flake8 documentation.
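As an illustration of the line-level option, the sketch below shows where the `# noqa` comment goes in a notebook cell. The function `order_total` and its data are hypothetical, and whether flake8 actually reports WPS221 for this line depends on the complexity threshold configured in your project.

```python
def order_total(items):
    """Sum price * quantity over items, applying an optional per-item discount."""
    # The trailing comment silences WPS221 on this one line only;
    # the same violation elsewhere in the notebook is still reported.
    return sum(item["price"] * item["quantity"] * (1 - item.get("discount", 0)) for item in items)  # noqa: WPS221


total = order_total(
    [
        {"price": 10.0, "quantity": 2},
        {"price": 5.0, "quantity": 1, "discount": 0.5},
    ]
)
print(total)  # 22.5
```

A line-level `# noqa` keeps the exception visible next to the code it applies to, which is usually easier to review than a file-wide or project-wide ignore.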
Note
Flake8 imposes a very strict set of rules that most authors do not follow to the letter, and this is even more true for notebooks. However, it is still a valuable tool that can help you write better code and avoid having to rewrite notebook code as separate Python scripts or packages.
Sorting imports¶
Since notebooks tend to import many functions, objects, or modules, I recommend using isort to automatically sort your imports and group them into sections:
uv run nbqa isort notebooks/example_jupyternb.ipynb
This will update your notebook in-place.
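As an illustration of the result, the cell below shows imports in the order isort would typically produce under its default settings. This is an assumption: a project-level isort configuration can change the ordering. The unsorted version is shown in comments, and the modules are chosen only for illustration.

```python
# Before sorting, the cell might have looked like this:
#   import sys
#   from pathlib import Path
#   import json

# After sorting (default isort settings assumed): plain `import` statements
# come first in alphabetical order, followed by `from ... import` statements.
import json
import sys
from pathlib import Path

# Third-party imports (for example pandas) would be placed in their own
# section below the standard-library block, separated by a blank line.

summary = {"python": sys.version_info.major, "cwd": Path.cwd().name}
print(json.dumps(summary))
```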
Python scripts¶
You can run flake8, isort, etc. on Python scripts as well; just omit nbqa from the commands above. For example, to run the flake8 linter:
uv run flake8 project_name/example.py
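If you want to run the same tool over both notebooks and scripts, the rule above can be sketched as a small helper that adds nbqa only for .ipynb files. The function name `build_command` is hypothetical, and the paths are the example paths used on this page.

```python
from pathlib import Path


def build_command(tool: str, target: str) -> list[str]:
    """Build the uv command for running a tool on a notebook or a script."""
    command = ["uv", "run"]
    if Path(target).suffix == ".ipynb":
        # Notebooks go through nbqa; plain Python scripts do not need it.
        command.append("nbqa")
    return command + [tool, target]


for target in ["notebooks/example_jupyter.ipynb", "project_name/example.py"]:
    print(" ".join(build_command("flake8", target)))
```

The returned list can be passed to `subprocess.run` if you prefer to execute the commands from a script instead of typing them by hand.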