SciPy Developer Guide¶
Decision making process¶
SciPy has a formal governance model, documented in SciPy project governance. The section below documents in an informal way what happens in practice for decision making about code and commit rights. The formal governance model takes precedence; the section below is provided only for context.
Code¶
Any significant decisions on adding (or not adding) new features, breaking backwards compatibility or making other significant changes to the codebase should be made on the scipy-dev mailing list after a discussion (preferably with full consensus).
Any non-trivial change (where trivial means a typo, or a one-liner maintenance commit) has to go in through a pull request (PR). It has to be reviewed by another developer. In case review doesn’t happen quickly enough and it is important that the PR is merged quickly, the submitter of the PR should send a message to the mailing list saying they intend to merge that PR without review at time X for reason Y unless someone reviews it before then.
Changes and new additions should be tested. Untested code is broken code.
Commit rights¶
Who gets commit rights is decided by the SciPy Steering Council; changes in commit rights will then be announced on the scipy-dev mailing list.
Deciding on new features¶
The general decision rule to accept a proposed new feature has so far been conditional on:
- The method is applicable in many fields and “generally agreed” to be useful,
- Fits the topic of the submodule, and does not require extensive support frameworks to operate,
- The implementation looks sound and unlikely to need much tweaking in the future (e.g., limited expected maintenance burden), and
- Someone wants to do it.
Although it’s difficult to give hard rules on what “generally useful and generally agreed to work” means, it may help to weigh the following against each other:
- Is the method used/useful in different domains in practice? How much domain-specific background knowledge is needed to use it properly?
- Consider the code already in the module. Is what you are adding an omission? Does it solve a problem that you’d expect the module to be able to solve? Does it supplement an existing feature in a significant way?
- Consider the equivalence class of similar methods / features usually expected. Among them, what would in principle be the minimal set so that there’s not a glaring omission in the offered features remaining? How much stuff would that be? Does including a representative one of them cover most use cases? Would it in principle sound reasonable to include everything from the minimal set in the module?
- Is what you are adding something that is well understood in the literature? If not, how sure are you that it will turn out well? Does the method perform well compared to other similar ones?
- Note that the twice-a-year release cycle and backward-compatibility policy make correcting things later on more difficult.
The scopes of the submodules also vary, so it’s probably best to consider each as if it’s a separate project - “numerical evaluation of special functions” is relatively well-defined, but “commonly needed optimization algorithms” less so.
Development on GitHub¶
SciPy development largely takes place on GitHub; this section describes the expected way of working for issues, pull requests and managing the main scipy repository.
Labels and Milestones¶
Each issue and pull request normally gets at least two labels: one for the topic or component (scipy.stats, Documentation, etc.), and one for the nature of the issue or pull request (enhancement, maintenance, defect, etc.). Other labels that may be added depending on the situation:
- easy-fix: for issues suitable to be tackled by new contributors.
- needs-work: for pull requests that have review comments that haven’t been addressed for a while.
- needs-decision: for issues or pull requests that need a decision.
- needs-champion: for pull requests that were not finished by the original author, but are worth resurrecting.
- backport-candidate: bugfixes that should be considered for backporting by the release manager.
A milestone is created for each version number for which a release is planned. Issues that need to be addressed and pull requests that need to be merged for a particular release should be set to the corresponding milestone. After a pull request is merged, its milestone (and that of the issue it closes) should be set to the next upcoming release - this makes it easy to get an overview of changes and to add a complete list of those to the release notes.
Dealing with pull requests¶
- When merging contributions, a committer is responsible for ensuring that those meet the requirements outlined in Contributing to SciPy. Also check that new features and backwards compatibility breaks were discussed on the scipy-dev mailing list.
- New code goes in via a pull request (PR).
- Merge new code with the green button. In case of merge conflicts, ask the PR submitter to rebase (this may require providing some git instructions).
- Backports and trivial additions to finish a PR (really trivial, like a typo or PEP8 fix) can be pushed directly.
- For PRs that add new features or are in some way complex, wait at least a day or two before merging it. That way, others get a chance to comment before the code goes in.
- Squashing commits or cleaning up commit messages of a PR that you consider too messy is OK. Make sure though to retain the original author name when doing this.
- Make sure that the labels and milestone on a merged PR are set correctly.
- When you want to reject a PR: if the reason is very obvious, you can just close it and explain why. If it is not obvious, it’s a good idea to first explain why you think the PR is not suitable for inclusion in Scipy and then let a second committer comment or close.
Backporting¶
All pull requests (whether they contain enhancements, bug fixes or something else) should be made against master. Only bug fixes are candidates for backporting to a maintenance branch. The backport strategy for SciPy is to (a) only backport fixes that are important, and (b) only backport when it’s reasonably sure that a new bugfix release on the relevant maintenance branch will be made. Typically, the developer who merges an important bugfix adds the backport-candidate label and pings the release manager, who decides on whether and when the backport is done. After the backport is completed, the backport-candidate label has to be removed again.
Release notes¶
When a PR gets merged, consider if the changes need to be mentioned in the release notes. What needs mentioning: new features, backwards incompatible changes, deprecations, and “other changes” (anything else noteworthy enough, see older release notes for the kinds of things worth mentioning).
Release note entries are maintained on the wiki (e.g. https://github.com/scipy/scipy/wiki/Release-note-entries-for-SciPy-1.1.0). The release manager will gather content from there and integrate it into the html docs. We use this mechanism to avoid merge conflicts that would happen if every PR touched the same file under doc/release/ directly.
Changes can be monitored (Atom feed) and pulled (the wiki is a git repo: https://github.com/scipy/scipy.wiki.git).
Other¶
PR status page: When new commits get added to a pull request, GitHub doesn’t send out any notifications, so a needs-work label may no longer be justified even though nobody was told. This page gives an overview of PRs that were updated, need review, need a decision, etc.
Cross-referencing: Cross-referencing issues and pull requests on GitHub is often useful. GitHub allows doing that by using gh-xxxx or #xxxx, with xxxx the issue/PR number. The gh-xxxx format is strongly preferred, because it’s clear that it is a GitHub link. In older issues, #xxxx refers to Trac tickets (Trac is what we used pre-GitHub).
PR naming convention: Pull requests, issues and commit messages usually start with a three-letter abbreviation like ENH: or BUG:. This is useful to quickly see what the nature of the commit/PR/issue is. For the full list of abbreviations, see writing the commit message.
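As a rough illustration of the naming convention, the sketch below extracts the leading abbreviation from a PR title or commit message. The function name and the pattern are assumptions made for this example (the pattern accepts 3 to 5 capital letters so that slightly longer prefixes such as MAINT: also pass); the authoritative list of abbreviations is the one in the commit message guidelines.

```python
import re

# Assumption for this sketch: a prefix is 3-5 capital letters, then a colon,
# a space, and the start of the summary. Only ENH: and BUG: are named in this
# guide; the full list lives in the commit message guidelines.
PREFIX_RE = re.compile(r"^([A-Z]{3,5}): \S")


def title_prefix(title):
    """Return the leading abbreviation of a title, or None if it has none."""
    match = PREFIX_RE.match(title)
    return match.group(1) if match else None


if __name__ == "__main__":
    for title in ("BUG: fix off-by-one in interpolation",
                  "fix off-by-one in interpolation"):
        print(repr(title), "->", title_prefix(title))
```

A check like this only tests the shape of the prefix, not whether the abbreviation is one of the accepted ones.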
Licensing¶
Scipy is distributed under the modified (3-clause) BSD license. All code, documentation and other files added to Scipy by contributors is licensed under this license, unless another license is explicitly specified in the source code. Contributors keep the copyright for code they wrote and submit for inclusion to Scipy.
Other licenses that are compatible with the modified BSD license that Scipy uses are 2-clause BSD, MIT and PSF. Incompatible licenses are GPL, Apache and custom licenses that require attribution/citation or prohibit use for commercial purposes.
It regularly happens that PRs are submitted with content copied or derived from unlicensed code. Such contributions cannot be accepted for inclusion in Scipy. What is needed in such cases is to contact the original author and ask them to relicense their code under the modified BSD (or a compatible) license. If the original author agrees to this, add a comment saying so to the source files and forward the relevant email to the scipy-dev mailing list.
What also regularly happens is that code is translated or derived from code in R, Octave (both GPL-licensed) or a commercial application. Such code also cannot be included in Scipy. Simply implementing functionality with the same API as found in R/Octave/… is fine though, as long as the author doesn’t look at the original incompatibly-licensed source code.
Version numbering¶
Scipy version numbering complies with PEP 440. Released final versions, which are the only versions appearing on PyPI, are numbered MAJOR.MINOR.MICRO where:
- MAJOR is an integer indicating the major version. It changes very rarely; a change in MAJOR indicates large (possibly backwards-incompatible) changes.
- MINOR is an integer indicating the minor version. Minor versions are typically released twice a year and can contain new features, deprecations and bug-fixes.
- MICRO is an integer indicating a bug-fix version. Bug-fix versions are released when needed, typically one or two per minor version. They cannot contain new features or deprecations.
Released alpha, beta and rc (release candidate) versions are numbered like final versions but with postfixes a#, b# and rc# respectively, with # an integer. Development versions are postfixed with .dev0+<git-commit-hash>.
Examples of valid Scipy version strings are:
0.16.0
0.15.1
0.14.0a1
0.14.0b2
0.14.0rc1
0.17.0.dev0+ac53f09
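The numbering scheme above can be sketched as a regular expression. This is only an illustration of the rules as stated in this section, not the code Scipy itself uses; the names VERSION_RE and parse_version are made up for the example, and for real work a full PEP 440 parser is the safer choice.

```python
import re

# MAJOR.MINOR.MICRO, an optional a#/b#/rc# postfix, and an optional
# .dev0+<git-commit-hash> postfix, following the description above.
VERSION_RE = re.compile(
    r"^(?P<major>\d+)\.(?P<minor>\d+)\.(?P<micro>\d+)"
    r"(?:(?P<pre>a|b|rc)(?P<pre_num>\d+))?"
    r"(?:\.dev0\+(?P<git_hash>[0-9a-f]+))?$"
)


def parse_version(version):
    """Split a version string into release tuple, pre-release and git hash."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"not a valid version string: {version!r}")
    parts = match.groupdict()
    release = (int(parts["major"]), int(parts["minor"]), int(parts["micro"]))
    pre = (parts["pre"], int(parts["pre_num"])) if parts["pre"] else None
    return {"release": release, "pre": pre, "git_hash": parts["git_hash"]}


if __name__ == "__main__":
    for example in ("0.16.0", "0.14.0a1", "0.14.0rc1", "0.17.0.dev0+ac53f09"):
        print(example, "->", parse_version(example))
```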
An installed Scipy version contains these version identifiers:
scipy.__version__ # complete version string, including git commit hash for dev versions
scipy.version.short_version # string, only major.minor.micro
scipy.version.version # string, same as scipy.__version__
scipy.version.full_version # string, same as scipy.__version__
scipy.version.release # bool, development or (alpha/beta/rc/final) released version
scipy.version.git_revision # string, git commit hash from which scipy was built
Deprecations¶
There are various reasons for wanting to remove existing functionality: it’s buggy, the API isn’t understandable, it’s superseded by functionality with better performance, it needs to be moved to another Scipy submodule, etc.
In general it’s not a good idea to remove something without warning users about that removal first. Therefore this is what should be done before removing something from the public API:
- Propose to deprecate the functionality on the scipy-dev mailing list and get agreement that that’s OK.
- Add a DeprecationWarning for it, which states that the functionality was deprecated, and in which release.
- Mention the deprecation in the release notes for that release.
- Wait till at least 6 months after the release date of the release that introduced the DeprecationWarning before removing the functionality.
- Mention the removal of the functionality in the release notes.
The 6 months waiting period in practice usually means waiting two releases. When introducing the warning, also ensure that those warnings are filtered out when running the test suite so they don’t pollute the output.
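A minimal sketch of the warning and of filtering it in a test, using only the standard library warnings module. The functions old_func and new_func, the helper call_quietly and the version placeholder 1.x.0 are hypothetical and stand in for whatever is actually being deprecated.

```python
import warnings


def new_func(x):
    """Hypothetical replacement for the deprecated function."""
    return 2 * x


def old_func(x):
    """Hypothetical deprecated function that forwards to its replacement."""
    warnings.warn(
        "old_func is deprecated as of version 1.x.0 and will be removed in "
        "a later release; use new_func instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return new_func(x)


def call_quietly(func, *args):
    """Call func with DeprecationWarning filtered out, as a test might."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        return func(*args)


if __name__ == "__main__":
    print(call_quietly(old_func, 3))
```

The message names both the release that introduced the deprecation and the replacement, which is the information users need in order to migrate.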
It’s possible that there is reason to want to ignore this deprecation policy for a particular deprecation; this can always be discussed on the scipy-dev mailing list.
Distributing¶
Distributing Python packages is nontrivial - especially for a package with complex build requirements like Scipy - and subject to change. For an up-to-date overview of recommended tools and techniques, see the Python Packaging User Guide. This document discusses some of the main issues and considerations for Scipy.
Dependencies¶
Dependencies are things that a user has to install in order to use (or build/test) a package. They usually cause trouble, especially if they’re not optional. Scipy tries to keep its dependencies to a minimum; currently they are:
Unconditional run-time dependencies:
- Numpy
Conditional run-time dependencies:
- nose (to run the test suite)
- asv (to run the benchmarks)
- matplotlib (for some functions that can produce plots)
- Pillow (for image loading/saving)
- scikits.umfpack (optionally used in sparse.linalg)
- mpmath (for more extended tests in special)
Unconditional build-time dependencies:
- Numpy
- A BLAS and LAPACK implementation (reference BLAS/LAPACK, ATLAS, OpenBLAS, MKL, Accelerate are all known to work)
- (for development versions) Cython
Conditional build-time dependencies:
- setuptools
- wheel (python setup.py bdist_wheel)
- Sphinx (docs)
- matplotlib (docs)
- LaTeX (pdf docs)
- Pillow (docs)
Furthermore, one of course needs C, C++ and Fortran compilers to build Scipy, but we don’t consider those to be dependencies, so they are not discussed here. For details, see https://scipy.github.io/devdocs/building/.
When a package provides useful functionality and it’s proposed as a new dependency, consider also if it makes sense to vendor (i.e. ship a copy of it with scipy) the package instead. For example, six and decorator are vendored in scipy._lib.
The only dependency that is reported to pip is Numpy, see install_requires in Scipy’s main setup.py. The other dependencies aren’t needed for Scipy to function correctly, and the one unconditional build dependency that pip knows how to install (Cython) we prefer to treat like a compiler rather than a Python package that pip is allowed to upgrade.
Issues with dependency handling¶
There are some serious issues with how Python packaging tools handle dependencies reported by projects. Because Scipy gets regular bug reports about this, we go into a bit of detail here.
Scipy only reports its dependency on Numpy via install_requires if Numpy isn’t installed at all on a system. This will only change when there are either 32-bit and 64-bit Windows wheels for Numpy on PyPI or when pip upgrade becomes available (with sane behavior, unlike pip install -U, see this PR). For more details, see this summary.
The situation with setup_requires is even worse; pip doesn’t handle that keyword at all, while setuptools has issues (here’s a current one) and invokes easy_install which comes with its own set of problems (note that Scipy doesn’t support easy_install at all anymore; issues specific to it will be closed as “wontfix”).
Supported Python and Numpy versions¶
The Python versions that Scipy supports are listed in the list of PyPI classifiers in setup.py, and mentioned in the release notes for each release. All newly released Python versions will be supported as soon as possible. The general policy on dropping support for a Python version is that (a) usage of that version has to be quite low (say <5% of users) and (b) the version isn’t included in an active long-term support release of one of the main Linux distributions anymore. Scipy typically follows Numpy, which has a similar policy. The final decision on dropping support is always taken on the scipy-dev mailing list.
The lowest supported Numpy version for a Scipy version is mentioned in the release notes and is encoded in scipy/__init__.py and the install_requires field of setup.py. Typically the latest Scipy release supports 3 or 4 minor versions of Numpy. That may become more if the frequency of Numpy releases increases (it’s about 1x/year at the time of writing).
Support for a particular Numpy version is typically dropped if (a) that Numpy version is several years old, and (b) the maintenance cost of keeping support is starting to outweigh the benefits. The final decision on dropping support is always taken on the scipy-dev mailing list.
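To illustrate what encoding a lowest supported version can look like, here is a sketch of a minimum-version comparison. It is not the check in scipy/__init__.py; the function names and the version numbers used are hypothetical. The point it shows is that versions have to be compared as integer tuples, since a plain string comparison would rank 1.13.0 below 1.8.2.

```python
import re


def release_tuple(version):
    """Return the leading MAJOR.MINOR[.MICRO] numbers as a tuple of ints.

    Any postfix (a#, b#, rc#, .dev0+hash) is ignored in this sketch.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def meets_minimum(installed, minimum):
    """True if the installed version is at least the minimum version."""
    return release_tuple(installed) >= release_tuple(minimum)


if __name__ == "__main__":
    # Hypothetical version numbers, chosen only to show the comparison.
    print(meets_minimum("1.13.0", "1.8.2"))
    print(meets_minimum("1.7.1", "1.8.2"))
```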
Supported versions of optional dependencies and compilers are less clearly documented, and also aren’t tested well or at all by Scipy’s Continuous Integration setup. Issues regarding this are dealt with as they come up in the issue tracker or mailing list.
Building binary installers¶
Note
This section is only about building Scipy binary installers to distribute. For info on building Scipy on the same machine as where it will be used, see this scipy.org page.
There are a number of things to take into consideration when building binaries and distributing them on PyPI or elsewhere.
General
- A binary is specific for a single Python version (because different Python versions aren’t ABI-compatible, at least up to Python 3.4).
- Build against the lowest Numpy version that you need to support, then it will work for all Numpy versions with the same major version number (Numpy does maintain backwards ABI compatibility).
Windows
- The currently most easily available toolchain for building Python.org compatible binaries for Scipy is installing MSVC (see https://wiki.python.org/moin/WindowsCompilers) and mingw64-gfortran. Support for this configuration requires numpy.distutils from Numpy >= 1.14.dev and a gcc/gfortran-compiled static openblas.a. This configuration is currently used in the Appveyor configuration for https://github.com/MacPython/scipy-wheels
- For 64-bit Windows installers built with a free toolchain, use the method documented at https://github.com/numpy/numpy/wiki/Mingw-static-toolchain. That method will likely be used for Scipy itself once it’s clear that the maintenance of that toolchain is sustainable long-term. See the MingwPy project and this thread for details.
- The other way to produce 64-bit Windows installers is with icc, ifort plus MKL (or MSVC instead of icc). For Intel toolchain instructions see this article and for (partial) MSVC instructions see this wiki page.
- Older Scipy releases contained a .exe “superpack” installer. Those contain 3 complete builds (no SSE, SSE2, SSE3), and were built with https://github.com/numpy/numpy-vendor. That build setup is known to not work well anymore and is no longer supported. It used g77 instead of gfortran, due to complex DLL distribution issues (see gh-2829). Because the toolchain is no longer supported, g77 support isn’t needed anymore and Scipy can now include Fortran 90/95 code.
OS X
- To produce OS X wheels that work with various Python versions (from python.org, Homebrew, MacPython), use the build method provided by https://github.com/MacPython/scipy-wheels.
- DMG installers for the Python from python.org on OS X can still be produced by tools/scipy-macosx-installer/. Scipy doesn’t distribute those installers anymore though, now that there are binary wheels on PyPI.
Linux
- PyPI-compatible Linux wheels can be produced via the manylinux project. The corresponding build setup for TravisCI for Scipy is set up in https://github.com/MacPython/scipy-wheels.
Other Linux build setups result in PyPI-incompatible wheels, which would need to be distributed via custom channels, e.g. in a Wheelhouse; see the wheel and Wheelhouse docs.
Making a SciPy release¶
At the highest level, this is what the release manager does to release a new Scipy version:
- Propose a release schedule on the scipy-dev mailing list.
- Create the maintenance branch for the release.
- Tag the release.
- Build all release artifacts (sources, installers, docs).
- Upload the release artifacts.
- Announce the release.
- Port relevant changes to release notes and build scripts to master.
In this guide we attempt to describe in detail how to perform each of the above steps. In addition to those steps, which have to be performed by the release manager, here are descriptions of release-related activities and conventions of interest:
- Backporting
- Labels and Milestones
- Version numbering
- Supported Python and Numpy versions
- Deprecations
Proposing a release schedule¶
A typical release cycle looks like:
- Create the maintenance branch
- Release a beta version
- Release a “release candidate” (RC)
- If needed, release one or more new RCs
- Release the final version once there are no issues with the last release candidate
There’s usually at least one week between each of the above steps. Experience shows that a cycle takes between 4 and 8 weeks for a new minor version. Bug-fix versions don’t need a beta or RC, and can be done much quicker.
Ideally the final release is identical to the last RC, however there may be minor differences - it’s up to the release manager to judge the risk of that. Typically, if compiled code or complex pure Python code changes then a new RC is needed, while a simple bug-fix that’s backported from master doesn’t require a new RC.
To propose a schedule, send a list with estimated dates for branching and beta/rc/final releases to scipy-dev. In the same email, ask everyone to check if there are important issues/PRs that need to be included and aren’t tagged with the Milestone for the release or the “backport-candidate” label.
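The list of estimated dates can be drafted with a few lines of Python. This is only a sketch: the step names, the function draft_schedule and the start date are made up for the example, and the fixed gap between steps is a simplification of the “at least one week” rule of thumb above.

```python
from datetime import date, timedelta

# Steps of a typical cycle for a new minor version, as listed above.
DEFAULT_STEPS = ("branch", "beta", "rc1", "final")


def draft_schedule(start, gap_weeks=1, steps=DEFAULT_STEPS):
    """Return (step, date) pairs spaced gap_weeks apart, starting at start."""
    return [
        (step, start + timedelta(weeks=gap_weeks * index))
        for index, step in enumerate(steps)
    ]


if __name__ == "__main__":
    # Hypothetical start date; two-week gaps give a six-week cycle.
    for step, day in draft_schedule(date(2017, 1, 9), gap_weeks=2):
        print(f"{step:<6} {day.isoformat()}")
```

If a second RC turns out to be needed, the later dates shift by at least one more gap.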
Creating the maintenance branch¶
Before branching, ensure that the release notes are updated as far as possible. Include the output of tools/gh_lists.py and tools/authors.py in the release notes.
Maintenance branches are named maintenance/<major>.<minor>.x (e.g. maintenance/0.19.x). To create one, simply push a branch with the correct name to the scipy repo. Immediately after, push a commit where you increment the version number on the master branch and add release notes for that new version. Send an email to scipy-dev to let people know that you’ve done this.
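The branch naming rule and the version increment on master can be written down as two small helpers. Both function names are hypothetical, and next_minor_version assumes the next release is another minor version rather than a new major one.

```python
def maintenance_branch(version):
    """Return the maintenance branch name for a release such as '0.19.0'."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return f"maintenance/{major}.{minor}.x"


def next_minor_version(version):
    """Return the version master moves to after branching (assumes no major bump)."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return f"{major}.{minor + 1}.0"


if __name__ == "__main__":
    print(maintenance_branch("0.19.0"))
    print(next_minor_version("0.19.0"))
```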
Tagging a release¶
First ensure that you have set up GPG correctly. See https://github.com/scipy/scipy/issues/4919 for a discussion of signing release tags, and https://keyring.debian.org/creating-key.html for instructions on creating a GPG key if you do not have one.
To make your key more readily identifiable as you, consider sending your key to public keyservers, with a command such as:
gpg --send-keys <yourkeyid>
Check that all relevant commits are in the branch. In particular, check issues and PRs under the Milestone for the release (https://github.com/scipy/scipy/milestones), PRs labeled “backport-candidate”, and that the release notes are up-to-date and included in the html docs.
Then edit setup.py to get the correct version number (set ISRELEASED = True) and commit it with a message like REL: set version to <version-number>. Don’t push this commit to the Scipy repo yet.
Finally tag the release locally with git tag -s <v1.x.y> (the -s ensures the tag is signed). Continue with building release artifacts (next section). Only push the release commit to the scipy repo once you have built the sdists and docs successfully. Then continue with building wheels. Only push the release tag to the repo once all wheels have been built successfully on TravisCI and Appveyor (if a build fails after the tag has been pushed, you would have to move the tag, which is bad practice). Finally, after pushing the tag, also push a second commit which increments the version number and sets ISRELEASED to False again.
Building release artifacts¶
Here is a complete list of artifacts created for a release:
- Source archives (.tar.gz, .zip and .tar.xz for GitHub Releases, only .tar.gz is uploaded to PyPI)
- Binary wheels for Windows, Linux and OS X
- Documentation (html, pdf)
- A README file
- A Changelog file
Source archives, Changelog and README are built by running paver release in the repo root, and end up in REPO_ROOT/release/. Do this after you’ve created the signed tag locally. If this completes without issues, push the release commit (not the tag, see section above) to the scipy repo.
To build wheels, push a commit to the master branch of https://github.com/MacPython/scipy-wheels . This triggers builds for all needed Python versions on TravisCI. Update the .travis.yml and appveyor.yml config files to specify which commit to build, and check which Python and Numpy versions are used for the builds (it needs to be the lowest supported Numpy version for each Python version). See the README file in the scipy-wheels repo for more details.
The TravisCI and Appveyor builds run the tests from the built wheels and, if they pass, upload the wheels to a container whose address is given at https://github.com/MacPython/scipy-wheels.
From there you can download them for uploading to PyPI. This can be done in an automated fashion with terryfy (note the -n switch which makes it only download the wheels and skip the upload to PyPI step - we want to be able to check the wheels and put their checksums into README first):
$ python wheel-uploader -n -v -c -u https://3f23b170c54c2533c070-1c8a9b3114517dc5fe17b7c3f8c63a43.ssl.cf2.rackcdn.com -w REPO_ROOT/release/installers -t win scipy 0.19.0
$ python wheel-uploader -n -v -c -u https://3f23b170c54c2533c070-1c8a9b3114517dc5fe17b7c3f8c63a43.ssl.cf2.rackcdn.com -w REPO_ROOT/release/installers -t macosx scipy 0.19.0
$ python wheel-uploader -n -v -c -u https://3f23b170c54c2533c070-1c8a9b3114517dc5fe17b7c3f8c63a43.ssl.cf2.rackcdn.com -w REPO_ROOT/release/installers -t manylinux1 scipy 0.19.0
The correct URL to use is shown in https://github.com/MacPython/scipy-wheels and should agree with the above one.
After this, we want to regenerate the README file, in order to have the MD5 and SHA256 checksums of the just downloaded wheels in it. Run:
$ paver write_release_and_log
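To double-check the checksums that end up in the README, they can be recomputed independently with the standard library. The sketch below is not what the paver task does internally; the function names and the output line format are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def file_checksums(path, chunk_size=1 << 20):
    """Return (md5, sha256) hex digests of a file, read in chunks."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return md5.hexdigest(), sha256.hexdigest()


def checksum_lines(directory, pattern="*.whl"):
    """Return one 'md5  sha256  filename' line per matching file, sorted by name."""
    lines = []
    for path in sorted(Path(directory).glob(pattern)):
        md5, sha256 = file_checksums(path)
        lines.append(f"{md5}  {sha256}  {path.name}")
    return lines


if __name__ == "__main__":
    # REPO_ROOT/release/installers is where the wheels were downloaded above.
    for line in checksum_lines("release/installers"):
        print(line)
```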
Uploading release artifacts¶
For a release there are currently four places on the web to upload things to:
- PyPI (tarballs, wheels)
- Github releases (tarballs, release notes, Changelog)
- scipy.org (an announcement of the release)
- docs.scipy.org (html/pdf docs)
PyPI:
Upload first the wheels and then the sdist:
twine upload -s REPO_ROOT/release/installers/*.whl
twine upload -s REPO_ROOT/release/installers/scipy-1.x.y.tar.gz
Github Releases:
Use the GUI at https://github.com/scipy/scipy/releases to create the release and upload all release artifacts.
scipy.org:
Sources for the site are in https://github.com/scipy/scipy.org.
Update the News section in www/index.rst and then do make upload USERNAME=yourusername.
docs.scipy.org:
First build the scipy docs, by running make dist in scipy/doc/. Verify that they look OK, then upload them to the doc server with make upload USERNAME=rgommers RELEASE=0.19.0. Note that SSH access to the doc server is needed; ask @pv (server admin) or @rgommers (can upload) if you don’t have that.
The sources for the website itself are maintained in https://github.com/scipy/docs.scipy.org/. Add the new Scipy version in the table of releases in index.rst. Push that commit, then do make upload USERNAME=yourusername.
Wrapping up¶
Send an email announcing the release to the following mailing lists:
- scipy-dev
- scipy-user
- numpy-discussion
- python-announce (not for beta/rc releases)
For beta and rc versions, ask people in the email to test (run the scipy tests and test against their own code) and report issues on Github or scipy-dev.
After the final release is done, port relevant changes to release notes, build scripts, author name mapping in tools/authors.py and any other changes that were only made on the maintenance branch to master.
Module-Specific Instructions¶
Some SciPy modules have specific development workflows that it is useful to be aware of while contributing.
scipy.special¶
Many of the functions in special are vectorized versions of scalar functions. The scalar functions are written by hand and the necessary loops for vectorization are generated automatically. This section discusses the steps necessary to add a new vectorized special function.
The first step in adding a new vectorized function is writing the corresponding scalar function. This can be done in Cython, C, C++, or Fortran. If starting from scratch then Cython should be preferred because the code is easier to maintain for developers only familiar with Python. If the primary code is in Fortran then it is necessary to write a C wrapper around the code; for examples of such wrappers see specfun_wrappers.c.
After implementing the scalar function, register the new function by adding a line to the FUNC string in generate_ufuncs.py. The docstring for that file explains the format. Also add documentation for the new function by adding an entry to add_newdocs.py; look in the file for examples.