Production-grade notebooks

This page summarizes some external tools for productionizing Jupyter notebooks, all of which can be run easily within bluprint projects.

Version control

For the purposes of version control, nbstripout can omit certain metadata (such as cell timestamps and execution counts) and cell output from Jupyter notebooks. To install the nbstripout package in your project, run this one-time setup:

uv add nbstripout

and then install a git filter:

uv run nbstripout --install

To keep cell output under version control, add the --keep-output argument to the nbstripout commands in .git/config:

...
[filter "nbstripout"]
  clean = \"/path/to/project/.venv/bin/python\" -m nbstripout --keep-output
  smudge = cat
[diff "ipynb"]
  textconv = \"/path/to/project/.venv/bin/python\" -m nbstripout --keep-output -t

For more details, see the nbstripout instructions.
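To make concrete what the filter removes, here is a minimal standard-library sketch of the idea. It is not nbstripout's implementation: the function name strip_notebook and its keep_output parameter are hypothetical and only mirror the --keep-output behaviour described above. It assumes the nbformat 4 JSON layout, in which code cells carry outputs and execution_count fields, and it assumes that cell timestamps live under the cell's metadata "execution" key.

```python
import json


def strip_notebook(nb_json: str, keep_output: bool = False) -> str:
    """Clear volatile fields from notebook JSON (illustrative sketch only)."""
    notebook = json.loads(nb_json)
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        # Execution counts change on every run and produce noisy diffs.
        cell["execution_count"] = None
        # Assumption: timing data is stored under metadata["execution"].
        cell.get("metadata", {}).pop("execution", None)
        if not keep_output:
            cell["outputs"] = []
    return json.dumps(notebook, indent=1)


# A tiny hypothetical notebook with one executed code cell.
example = json.dumps({
    "cells": [{
        "cell_type": "code",
        "execution_count": 7,
        "metadata": {"execution": {"iopub.execute_input": "2024-01-01T00:00:00Z"}},
        "outputs": [{"output_type": "stream", "name": "stdout", "text": ["42\n"]}],
        "source": ["print(42)"],
    }],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
})

stripped = json.loads(strip_notebook(example))
kept = json.loads(strip_notebook(example, keep_output=True))
print(stripped["cells"][0]["outputs"])
print(len(kept["cells"][0]["outputs"]))
```

In a real project the git filter does this work automatically on every commit, so a helper like this is only useful for understanding what ends up in the repository.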

Best coding practices

Install nbqa with:

uv add 'nbqa[toolchain]'

nbqa lets you run tools normally used for Python package development on Jupyter notebooks. Several of these tools can help you write production-grade code in notebooks.

Linting

Flake8 is a linter, a tool that analyzes code to detect potential errors, bugs, and code style violations. To run flake8 on the notebooks/example_jupyter.ipynb notebook, run:

uv run nbqa flake8 notebooks/example_jupyter.ipynb

You can look up the details of each violation in the wemake-python-styleguide documentation by using the search box on the left.

There are a few ways to ignore violations:

  1. Ignoring them in one specific line (for example, for a WPS221 violation):

    <python code that violates WPS221>  # noqa: WPS221

  2. Ignoring them across an entire file; to achieve this, add the following to pyproject.toml:

    [tool.flake8]
    per-file-ignores = ['notebooks/example_jupyter.ipynb: WPS211']

  3. Ignoring them in the entire project; for this, add the following to pyproject.toml:

    [tool.flake8]
    ignore = ['WPS211']

For more details check the flake8 documentation.
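As an illustration of the inline option, the sketch below suppresses one violation on the line where it is reported. It assumes that WPS211 is the wemake-python-styleguide check for too many function arguments and that seven arguments exceed the configured limit. The function build_report and its parameters are hypothetical, and other checks may still fire on the same line.

```python
def build_report(title, rows, cols, units, author, date, notes):  # noqa: WPS211
    """Collect report fields; the long signature is deliberate here."""
    return {
        "title": title,
        "shape": (rows, cols),
        "units": units,
        "author": author,
        "date": date,
        "notes": notes,
    }


report = build_report("Sales", 10, 3, "USD", "me", "2024-01-01", "draft")
print(report["shape"])
```

The comment applies only to the line it sits on, so the same violation elsewhere in the notebook is still reported.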

Note

Flake8 imposes a very strict set of rules that most authors do not follow to the letter, and even less so in notebooks, so keep this in mind. However, it is still a valuable tool that can help you write better code and avoid having to rewrite notebook code as separate Python scripts or packages.

Sorting imports

Since notebooks tend to import many functions, objects, or modules, I recommend using isort to automatically sort your imports and group them into sections:

uv run nbqa isort notebooks/example_jupyter.ipynb

This will update your notebook in place.
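For a sense of the result, the sketch below shows how a cell's imports might look after sorting, assuming isort's default settings. Only standard-library modules are used so that the cell runs anywhere; the summary variable is hypothetical and exists only so that every import is used.

```python
# Before sorting, the cell might have listed its imports in arbitrary order:
#   from pathlib import Path
#   import sys
#   from collections import Counter
#   import json

# After sorting: alphabetical order within the section, with plain
# `import x` lines ahead of `from x import y` lines.
import json
import sys
from collections import Counter
from pathlib import Path

# Third-party and local imports, if present, would follow in their own
# sections, each separated by a blank line.
summary = {
    "python": Path(sys.executable).name,
    "letters": Counter("isort"),
}
print(json.dumps(summary))
```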

Python scripts

You can run flake8, isort, and similar tools on Python scripts as well: just omit nbqa from the commands above. For example, to run the flake8 linter:

uv run flake8 project_name/example.py
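The rule above (nbqa for notebooks, the bare tool for scripts) can be sketched as a small helper that builds the right command for a given file. The function lint_command is hypothetical and not part of any of the tools mentioned here. It only assembles the argument list and does not run anything.

```python
from pathlib import Path


def lint_command(tool: str, target: str) -> list[str]:
    """Build the uv command for a tool, adding nbqa only for notebooks."""
    command = ["uv", "run"]
    if Path(target).suffix == ".ipynb":
        # Notebooks need the nbqa wrapper; plain scripts do not.
        command.append("nbqa")
    return [*command, tool, target]


notebook_cmd = lint_command("flake8", "notebooks/example_jupyter.ipynb")
script_cmd = lint_command("flake8", "project_name/example.py")
print(" ".join(notebook_cmd))
print(" ".join(script_cmd))
```

The resulting list could be passed to subprocess.run if you want to lint a mix of notebooks and scripts from a single Python entry point.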