Foreword
This post may be updated regularly after its initial publication.
Background
A repository with a couple of Python packages for handling ensemble forecast time series needs a refresh and refactor. It currently uses packaging recipes that were devised quite a few years ago and that (perhaps) need to be reassessed, given the zoo of software tools contending for Python packaging.
As it stands “now”, the packages rely on requirements.txt, setuptools, pytest, and a few other patterns and practices that may still work but are (allegedly) superseded.
Some months ago I migrated the refcount package to use poetry. I landed on some semi-manual recipes by adopting and adapting https://py-pkgs.org. I recall I was rather underwhelmed by the added value of poetry in this context; perhaps because the package was not suffering from dependency issues, rife in the Python ecosystem.
A quote from the Zen of Python is “In the face of ambiguity, refuse the temptation to guess. There should be one– and preferably only one –obvious way to do it. Although that way may not be obvious at first unless you’re Dutch. Now is better than never.” I acknowledge I am using it way out of its initial and historical context, but the state of Python packaging is still at odds with this motto.
Let’s first scan in this post what the landscape is these days, with a view to devising one or more (preferably one) packaging templates that I intend to use, and require the use of, in upcoming projects.
Packaging templates
Some candidates amongst many:
- microsoft/python-package-template
- Kwpolska/python-project-template
- rochacbruno/python-project-template
- an_extremely_modern_and_configurable_python template
- copier-uv by the author of mkdocs.
- nbdev may not be thought of as a python packaging template, but in a way it is. See also a previous post on nbdev.
Other templates internal to my organisation in e.g. the energy domain.
While adopting a template wholesale would be ideal, scanning these options quickly shows that they cannot be expected to be a perfect match for our needs. It is worth having a look at the options available in various aspects of packaging, to assess whether templates offer a good balance of features versus simplicity of use.
Options for various aspects of python packaging
Environment Managers and dependency management
Unit Testing
- pytest
- unittest
- coverage
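As a rough illustration of how the first two options differ in style, here is the same check written once for unittest and once in the plain-function form that pytest collects. The function `ensemble_mean` is a hypothetical stand-in and is not taken from the packages discussed in this post.

```python
import unittest
from statistics import fmean


def ensemble_mean(members: list[float]) -> float:
    """Hypothetical helper: average of the ensemble members at one time step."""
    if not members:
        raise ValueError("an ensemble needs at least one member")
    return fmean(members)


class TestEnsembleMean(unittest.TestCase):
    """unittest style: a TestCase subclass using assert* methods."""

    def test_mean(self):
        self.assertAlmostEqual(ensemble_mean([1.0, 2.0, 3.0]), 2.0)

    def test_empty(self):
        with self.assertRaises(ValueError):
            ensemble_mean([])


def test_mean_pytest_style():
    """pytest style: a plain function and a bare assert."""
    assert ensemble_mean([1.0, 2.0, 3.0]) == 2.0
```

pytest also collects unittest.TestCase classes, so both styles can coexist while a code base is being migrated, and coverage can wrap either runner (for instance `coverage run -m pytest`).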
Template Installation
Continuous Integration
- GitHub Actions
- Azure
- etc?
Doc Generation
- sphinx
- mkdocs
- jupyter-books
- quarto via quartodoc, and also when using nbdev.
Licensing
- pip-licenses
- license checks (python-license-check) etc.
Code Quality
- pre-commit hooks
- formatting (autopep8, isort)
- linting (flake8)
- ruff
- black
Type safety
- static type checking (mypy)
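A minimal sketch of what static checking buys, assuming a hypothetical function `lead_time_hours`. The code runs as written; the commented-out call is the kind of mistake a checker such as mypy is designed to report before anything is executed.

```python
from datetime import datetime, timedelta


def lead_time_hours(issue_time: datetime, valid_time: datetime) -> float:
    """Hypothetical helper: forecast lead time in hours."""
    delta: timedelta = valid_time - issue_time
    return delta.total_seconds() / 3600.0


issued = datetime(2024, 1, 1, 0, 0)
valid = datetime(2024, 1, 1, 6, 0)
print(lead_time_hours(issued, valid))

# A type checker is expected to flag the next line, because a str is passed
# where a datetime is annotated. At runtime it would fail with a TypeError.
# lead_time_hours("2024-01-01", valid)
```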
Code Security
Check code for common security flaws.
- bandit
Containerisation
- docker
Cloud Deployment
- terraform
Data Tracking
- git-lfs
Build Tasks
- tox
- nox
- make
- rake
- invoke
- hatch
- doit
- duty
- poethepoet
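Most of these tools share one idea: named tasks that wrap the commands one would otherwise type by hand. As a sketch only, here is what that could look like with doit, which reads task definitions from a `dodo.py` file. The commands and the `tests` and `src` folder names are assumptions about a typical project layout, not a description of the repository discussed here.

```python
# dodo.py: doit discovers functions whose names start with "task_".
# The shell commands below are assumptions about a typical project layout.


def task_test():
    """Run the unit tests."""
    return {"actions": ["pytest tests"], "verbosity": 2}


def task_lint():
    """Lint the sources."""
    return {"actions": ["ruff check src"]}


def task_build():
    """Build the sdist and wheel once the test and lint tasks have run."""
    return {
        "actions": ["python -m build"],
        "task_dep": ["test", "lint"],
    }
```

With such a file in place, `doit build` would run the three tasks in dependency order. The other tools in the list express the same thing with their own syntax, so the choice is largely a matter of which configuration format the team prefers to maintain.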
Deployment
- pipx: allows your project to be installed with global launch scripts in a managed, sandboxed virtualenv
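For pipx to have a launch script to expose, the package needs to declare an entry point, typically under `[project.scripts]` in pyproject.toml, pointing at a callable. The sketch below shows such a callable; the command name `mytool` and the `--members` option are hypothetical and only there for illustration.

```python
import argparse
from typing import Optional, Sequence


def main(argv: Optional[Sequence[str]] = None) -> int:
    """Hypothetical entry point, e.g. declared as mytool = "mypackage.cli:main"."""
    parser = argparse.ArgumentParser(prog="mytool", description="Demo launcher.")
    parser.add_argument("--members", type=int, default=5, help="ensemble size")
    # argv=None makes argparse fall back to the real command line arguments.
    args = parser.parse_args(argv)
    print(f"ensemble size: {args.members}")
    return 0
```

Accepting an optional `argv` keeps the function easy to call from tests, while the installed script simply calls it with no arguments.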
Conclusion
The downside of all these choices is the multitude of templates one can come up with, for better or worse. It is usually not possible to really grasp the pros and cons of tools without sustained use, and possibly without running into a particularly annoying situation such as a broken environment or breaking changes in dependencies.