Software Management#
This guide asks a range of questions to consider when planning your software management. Each question has resources for more information.
The examples are not exhaustive, and many are Python-specific.
Checklist
Use version control for all of your code (e.g., Git, hosted on GitHub).
Create one environment for each project (e.g., conda).
Choose the most appropriate method for capturing your computational environment.
Capture your computational environment.
Share your captured computational environment (along with your results/analysis) with a citable DOI and license.
Ensure that (at least) the key functionality is correct by writing tests and running them regularly.
Ensure (at least) the key functionality is documented.
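As an illustration of the "capture your computational environment" item, the sketch below records the interpreter version, the platform, and a pinned list of installed packages using only the Python standard library. The function names `capture_environment` and `environment_report` are hypothetical, and the output only resembles a requirements file. Dedicated tools such as conda or pip remain the usual choice.

```python
import platform
from importlib import metadata


def capture_environment():
    """Return sorted 'name==version' lines for the installed distributions."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata has no name
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


def environment_report():
    """Combine interpreter and platform details with the pinned package list."""
    header = [
        f"# python {platform.python_version()}",
        f"# platform {platform.platform()}",
    ]
    return "\n".join(header + capture_environment())


if __name__ == "__main__":
    # Redirect the output to a file to keep a record alongside your results.
    print(environment_report())
```

Saving this report next to the analysis outputs gives a simple record of what was installed when the results were produced.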
Questions#
What is the software environment setup?#
Integrated Development Environment (IDE)#
e.g., Jupyter Lab, Visual Studio Code, PyCharm (Python)
Version control#
Version control for data
e.g., Dolt (DoltHub), Git Large File Storage (LFS), Datalad, git-annex
Resources
Reproducibility#
Package management#
User-level package management systems
e.g., Python: pip and venv, virtualenv, pipenv, poetry
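A minimal sketch of creating one environment per project with the standard-library `venv` module, assuming each project lives in its own directory. The helper name `create_project_env` and the `.venv` directory name are illustrative choices, not requirements.

```python
import venv
from pathlib import Path


def create_project_env(project_dir, env_name=".venv", with_pip=True):
    """Create an isolated virtual environment inside the project directory."""
    env_path = Path(project_dir) / env_name
    # venv.create builds the environment; with_pip=True also bootstraps pip into it.
    venv.create(env_path, with_pip=with_pip)
    return env_path


if __name__ == "__main__":
    print(create_project_env("."))
```

Keeping the environment inside the project directory makes it clear which environment belongs to which project. The environment directory itself is normally excluded from version control.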
Entire computational environment#
Containers
e.g., Singularity (guide), Docker, repo2docker
Manage and deploy containers e.g., Kubernetes
Virtual machines
e.g., Vagrant (guide), VirtualBox
Workflow management#
e.g., SnakeMake, Luigi, Nextflow (with Singularity), Make, Research Compendium, protocols.io, Ploomber
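The core idea these tools share is that each step declares what it depends on, and the tool works out a valid order. The sketch below illustrates that idea with the standard-library `graphlib` module (Python 3.9+). It is not the API of any tool listed above, and the step names and the `run_workflow` helper are hypothetical.

```python
from graphlib import TopologicalSorter


def run_workflow(steps, dependencies):
    """Run every step after the steps it depends on and return the order used.

    steps: mapping of step name to a callable taking no arguments.
    dependencies: mapping of step name to the set of steps it depends on.
    """
    order = list(TopologicalSorter(dependencies).static_order())
    for name in order:
        steps[name]()
    return order


if __name__ == "__main__":
    steps = {
        "download": lambda: print("downloading data"),
        "clean": lambda: print("cleaning data"),
        "analyse": lambda: print("running analysis"),
    }
    dependencies = {"clean": {"download"}, "analyse": {"clean"}}
    run_workflow(steps, dependencies)
```

Real workflow managers add much more on top of this, such as skipping steps whose outputs are already up to date and running independent steps in parallel.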
Resources#
How will software correctness be verified?#
Testing#
e.g., pytest (Python), unittest (Python), NumPy testing (Python)
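A minimal sketch of tests for one piece of key functionality, written in the plain-`assert` style that pytest collects from functions named `test_*`. The `rescale` function is a hypothetical example, not taken from any project mentioned here. The second test uses `try`/`except` so that the sketch runs without pytest installed; with pytest, `pytest.raises` is the more idiomatic choice.

```python
import math


def rescale(values):
    """Linearly rescale a sequence of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("cannot rescale constant input")
    return [(v - lo) / (hi - lo) for v in values]


def test_rescale_bounds():
    result = rescale([2.0, 4.0, 6.0])
    assert result[0] == 0.0
    assert result[-1] == 1.0
    assert math.isclose(result[1], 0.5)


def test_rescale_rejects_constant_input():
    try:
        rescale([3.0, 3.0])
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for constant input")
```

Saved in a file named like `test_rescale.py`, these functions are found and run by the `pytest` command, which makes them easy to run regularly or from continuous integration.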
Debugging
pdb (Python)
Resources
Continuous integration (CI)#
e.g., GitHub Actions (Guide), Travis CI, GitLab CI, Jenkins, Azure Pipelines, Circle CI
Code quality#
Code review#
e.g., Codacy, CodeFactor
Code coverage
How to manage the software?#
Project structure
e.g., cookiecutter (Data Science, Jupyter Book, EasyData)
Documentation
e.g., Jupyter Book, Read the Docs, Sphinx, nbdev, fastdoc
Publish online using GitHub Pages (for a Jupyter Book)
Security
e.g., GitHub Workflows and Essentials
Training#
SWD3: Software development practices for Research, Research Computing, University of Leeds.
SWD7: Introduction to reproducible workflows in Python, Research Computing, University of Leeds.
Intermediate Research Software Development in Python, The Carpentries.
Other general resources#
Examples#
SHAP (SHapley Additive exPlanations)
Version control with GitHub
Continuous integration with GitHub Actions, Travis CI
Capture and share project environment via conda or pip
MIT license
Documentation with examples
Tests with pytest
Reproducible via Binder
TPOT (Tree-based Pipeline Optimization Tool)
Version control with GitHub
Continuous integration with GitHub Actions, Travis CI, AppVeyor
Capture and share project environment via conda or pip
LGPL-3.0 license
Documentation with examples
Tests with pytest
Citable DOI through Zenodo
Code coverage with Coveralls
PyHealth (A Python Library for Health Predictive Models)
Version control with GitHub
Continuous integration with Circle CI, Travis CI, AppVeyor
Capture and share project environment via pip
BSD-2-Clause license
Documentation with examples
Tests with pytest
Reproducible via Binder
VEROS (versatile ocean simulator in Python / JAX)
Version control with GitHub
Continuous integration with GitHub Actions
Capture and share project environment via conda or pip
MIT license
Documentation with examples
Tests with pytest
Code coverage with codecov