Clearing Jupyter Notebooks Before Committing

While versioning Jupyter notebooks, I needed to clear the output cells before committing, to reduce noise and make the changes easier for reviewers to follow.

For me, reviewing notebooks is still a fairly manual process: I look at the two revisions of the notebook, typically as rendered by GitLab, and it is hard to establish whether the changes in output cells are significant. We therefore agreed to clear output cells before committing.

nbconvert has an option to clear output cells during a conversion, and the conversion can be done with --inplace, which overwrites the notebook file itself:

jupyter nbconvert --ClearOutputPreprocessor.enabled=True --inplace notebook.ipynb

This I combine with a pre-commit hook, which is effectively a bash script, residing in the .git/hooks/ directory of a repo:

#!/bin/bash
# Clear output cells in place, then re-stage the cleaned notebooks.
jupyter nbconvert --ClearOutputPreprocessor.enabled=True --inplace *.ipynb
git add *.ipynb

This re-adds the files right before the commit is made. Since the files are changed in place, Jupyter will notify you that the file has been modified externally.
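
The hook above touches every notebook matched by *.ipynb in the directory where the hook runs, whether or not it is part of the commit. A possible variation, sketched here under a few assumptions, limits the work to notebooks that are actually staged. The function name clear_staged_notebooks is made up for this example, and the loop assumes file names without newlines or other characters that Git would quote in its output.

```shell
#!/bin/bash
# Hypothetical helper: clear outputs only for notebooks staged for this commit.
clear_staged_notebooks() {
    # List staged files that were added, copied, or modified and end in .ipynb.
    # Assumes file names that Git prints unquoted, one per line.
    git diff --cached --name-only --diff-filter=ACM -- '*.ipynb' |
    while IFS= read -r nb; do
        # Abort the commit if nbconvert fails on a notebook.
        jupyter nbconvert --ClearOutputPreprocessor.enabled=True --inplace "$nb" || exit 1
        # Re-stage the cleaned notebook.
        git add "$nb"
    done
}

# In the hook itself, call the function as the final line:
# clear_staged_notebooks
```

Because git add stages the whole file, a notebook that was only partially staged ends up fully staged after the hook has run, so this sketch fits best when notebooks are committed as a whole.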

Hooks can be added to the repo so that they can be shared between developers. However, they are not executable when freshly cloned, and need to be marked as such (for example with chmod +x).
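
One way to set this up after cloning is a small install script. The following is only a sketch: it assumes the shared hooks live in a tracked directory called hooks/ at the top of the repo, and both that directory name and the function name install_hooks are assumptions made for this example.

```shell
#!/bin/bash
# Hypothetical helper: copy shared hooks from a tracked hooks/ directory
# into .git/hooks/ and mark them executable.
install_hooks() {
    repo_root="${1:-.}"
    src="$repo_root/hooks"       # assumed location of the shared hooks
    dst="$repo_root/.git/hooks"  # where Git looks for hooks by default
    mkdir -p "$dst"
    for hook in "$src"/*; do
        # Skip anything that is not a regular file (including an unmatched glob).
        [ -f "$hook" ] || continue
        cp "$hook" "$dst/"
        chmod +x "$dst/$(basename "$hook")"
    done
}

# After cloning, run from the top of the repo:
# install_hooks .
```

An alternative that avoids copying is to point Git at the tracked directory with git config core.hooksPath hooks (available since Git 2.9); the hook files still need their executable bit set.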