Python packaging

Taking stock of the Python packaging landscape to refresh practices
python
packaging
template
practices
Author

J-M

Published

August 9, 2024

Foreword

This post may be updated regularly after its initial publication.

Background

A repository with a couple of Python packages for handling ensemble forecast time series needs a refresh and a refactor. It currently uses packaging recipes that were devised quite a few years ago and that (perhaps) need to be reassessed, given the zoo of software tools contending for Python packaging.

As it stands “now”, the packages rely on requirements.txt, setuptools, pytest, and a few other patterns and practices that may still work but are (allegedly) superseded.
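To give an idea of what moving away from requirements.txt might involve, here is a minimal sketch that turns a simple requirements file into a `dependencies` fragment for a pyproject.toml. The function names are hypothetical, and the sketch assumes a flat file: option lines starting with `-` (such as `-r` or `-e`) are skipped rather than resolved.

```python
import json


def parse_requirements(text: str) -> list[str]:
    """Return the requirement specifiers found in requirements.txt-style text."""
    deps = []
    for raw in text.splitlines():
        # A " #" sequence starts a trailing comment on a requirement line.
        line = raw.split(" #", 1)[0].strip()
        # Skip blank lines, full-line comments and option lines (-r, -e, ...).
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        deps.append(line)
    return deps


def to_pyproject_dependencies(deps: list[str]) -> str:
    """Render the specifiers as a [project] dependencies fragment (TOML text)."""
    lines = ["[project]", "dependencies = ["]
    # json.dumps produces a double-quoted, escaped string that is also
    # a valid TOML basic string.
    lines += [f"    {json.dumps(dep)}," for dep in deps]
    lines.append("]")
    return "\n".join(lines)
```

The output is only a fragment: a real `[project]` table also needs at least a name and a version, and pinned versions in a requirements file are often looser in package metadata, so the result would still need a manual review.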

Some months ago I migrated the refcount package to use poetry. I landed on some semi-manual recipes by adopting and adapting https://py-pkgs.org. I recall being rather underwhelmed by the added value of poetry in this context, perhaps because the package was not suffering from the dependency issues that are rife in the Python ecosystem.

A quote from the Zen of Python is “In the face of ambiguity, refuse the temptation to guess. There should be one– and preferably only one –obvious way to do it. Although that way may not be obvious at first unless you’re Dutch. Now is better than never.” I acknowledge that I am using it well outside its original, historical context, but the state of Python packaging is still at odds with this motto.

Let’s first scan in this post what the landscape looks like these days, with a view to devising one or more (preferably one) packaging templates that I intend to use, and to require, in upcoming projects.

Packaging templates

Some candidates amongst many:

Other templates internal to my organisation in e.g. the energy domain.

While adopting a template wholesale would be ideal, scanning these options quickly shows that none can be expected to be a perfect match for our needs. It is worth looking at the options available for the various aspects of packaging, to assess whether templates offer a good balance of features versus simplicity of use.

Options for various aspects of Python packaging

Environment Managers and dependency management

Unit Testing

  • pytest
  • unittest
  • coverage
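As a point of reference for what a template needs to accommodate, here is a minimal sketch of tests in the plain-function style that pytest discovers (functions named `test_*` using bare `assert`). The `ensemble_mean` helper is hypothetical and only stands in for code from the packages; it is not taken from them.

```python
from statistics import fmean


def ensemble_mean(members: list[list[float]]) -> list[float]:
    """Average ensemble members time step by time step (hypothetical helper)."""
    if not members:
        raise ValueError("at least one ensemble member is required")
    # zip(*members) groups the values of all members for each time step.
    return [fmean(step) for step in zip(*members)]


def test_ensemble_mean_averages_each_time_step():
    assert ensemble_mean([[1.0, 2.0], [3.0, 4.0]]) == [2.0, 3.0]


def test_ensemble_mean_rejects_empty_input():
    # Written without importing pytest so the file runs on its own;
    # with pytest available, pytest.raises(ValueError) is the usual idiom.
    try:
        ensemble_mean([])
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for empty input")
```

Saved in a file named along the lines of `test_ensemble.py`, these functions would be collected by pytest, and coverage can then report which lines of the helper the tests exercised.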

Template Installation

Continuous Integration

  • GitHub Actions
  • Azure
  • etc?

Doc Generation

Licensing

Code Quality

pre-commit hooks

formatting

  • (autopep8, isort)
  • linting (flake8)
  • ruff
  • black

Type safety

static type checking (mypy)
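A small sketch of the kind of issue static type checking is meant to catch. The function name is hypothetical; the point is that the annotations let a checker such as mypy verify that the `None` case is handled before the code ever runs.

```python
from typing import Optional


def parse_lead_time(value: Optional[str]) -> int:
    """Convert a lead time in hours, given as text, to an integer."""
    # Without this check, int(value) would be called with a possibly-None
    # argument, which a static type checker is expected to report.
    if value is None:
        return 0
    return int(value)
```

The annotations have no effect at runtime, so adopting type checking in a template is mostly a matter of adding the checker to the development dependencies and to the automated checks.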

Code Security

Check code for common security flaws.

  • bandit
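One example of a common flaw in this category is calling `eval` on text that comes from outside the program, which bandit reports. The sketch below, with a hypothetical function name, shows the usual safer alternative from the standard library for the case where only Python literals are expected.

```python
import ast


def parse_config_value(text: str):
    """Parse a Python literal (number, string, list, dict, ...) from text."""
    # eval(text) would execute arbitrary expressions found in the input.
    # ast.literal_eval only accepts literal syntax and raises ValueError
    # (or SyntaxError) for anything else, such as function calls.
    return ast.literal_eval(text)
```

For instance, `parse_config_value("[1, 2]")` returns a list, whereas a string containing a call to `__import__` is rejected instead of being executed.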

Containerisation

  • docker

Cloud Deployment

  • terraform

Data Tracking

  • git-lfs

Build Tasks

  • tox
  • nox
  • make
  • rake
  • invoke
  • hatch
  • doit
  • duty
  • poethepoet

See also “Configuration and build tools” on the Python wiki.

Deployment

  • pipx: allows your project to be installed with global launch scripts in a managed, sandboxed virtualenv
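For pipx to have a launch script to expose, the package has to declare a console entry point, typically an entry under `[project.scripts]` in pyproject.toml that maps a command name to a function such as `package.module:main`. Below is a minimal sketch of such a function; the command name `efts-info` and its `--name` option are hypothetical.

```python
import argparse
from typing import Optional, Sequence


def main(argv: Optional[Sequence[str]] = None) -> int:
    """Entry point for a hypothetical command-line script."""
    parser = argparse.ArgumentParser(
        prog="efts-info", description="Print a short greeting."
    )
    parser.add_argument("--name", default="world", help="who to greet")
    # argv=None makes argparse read sys.argv, which is what happens when
    # the installed launch script calls main() without arguments.
    args = parser.parse_args(argv)
    print(f"Hello, {args.name}")
    return 0
```

Accepting an optional `argv` keeps the function easy to call from tests, for example `main(["--name", "pipx"])`, without touching `sys.argv`.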

Conclusion

The downside of all these choices is the multitude of templates one can come up with, for better or worse. It is usually not possible to really grasp the pros and cons of a tool without sustained use, and possibly without running into a particularly annoying situation such as a broken environment or breaking changes in dependencies.