SciPy Roadmap
This roadmap page contains only the most important ideas and needs for SciPy going forward. For a more detailed roadmap, including per-submodule status, many more ideas, API stability and more, see Detailed SciPy Roadmap.
Evolve BLAS and LAPACK support
The Python and Cython interfaces to BLAS and LAPACK in scipy.linalg are one of the most important things that SciPy provides (a sketch of the Python interface follows the list below). In general scipy.linalg is in good shape; however, we can make a number of improvements:
1. Library support. Our released wheels now ship with OpenBLAS, which is currently the only feasible performant option (ATLAS is too slow, MKL cannot be the default due to licensing issues, and Accelerate support was dropped because Apple no longer updates Accelerate). OpenBLAS is not very stable, though: its releases sometimes break things, and it has threading issues (currently the only blocker for using SciPy with PyPy3). At the very least we need better support for debugging OpenBLAS issues, and better documentation on how to build SciPy with it. An option is to use BLIS for the BLAS interface (see numpy gh-7372).
2. Support for newer LAPACK features. In SciPy 1.2.0 we increased the minimum supported LAPACK version to 3.4.0. Now that we have dropped Python 2.7, we can raise that minimum further (MKL combined with Python 2.7 was previously the blocker for going beyond 3.4.0) and start adding support for newer features in LAPACK.
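For context, here is a minimal sketch of the existing Python interface: calling a BLAS routine directly through scipy.linalg.blas. The input matrices are arbitrary example data.

    import numpy as np
    from scipy.linalg import blas

    # Fortran-ordered inputs avoid internal copies in the wrapped routines.
    a = np.asfortranarray(np.random.rand(4, 3))
    b = np.asfortranarray(np.random.rand(3, 5))

    # dgemm computes alpha * (a @ b); here a plain double-precision
    # matrix product.
    c = blas.dgemm(alpha=1.0, a=a, b=b)

The Cython interface in scipy.linalg.cython_blas and scipy.linalg.cython_lapack exposes the same routines for use from compiled code.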
Implement sparse arrays in addition to sparse matrices
The sparse matrix formats are mostly feature-complete; however, the main issue is that they act like numpy.matrix (which will be deprecated in NumPy at some point). What we want is sparse arrays that act like numpy.ndarray. This is being worked on in https://github.com/pydata/sparse, which is quite far along. The tentative plan is:
1. Start depending on pydata/sparse once it is feature-complete enough (it still needs a CSC/CSR equivalent) and acceptable performance-wise.
2. Add support for pydata/sparse to scipy.sparse.linalg (and perhaps to scipy.sparse.csgraph after that).
3. Indicate in the documentation that users should prefer pydata/sparse over sparse matrices for new code.
4. When NumPy deprecates numpy.matrix, vendor it or maintain it as a stand-alone package.
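To illustrate the behavioral difference that motivates this plan, here is a small sketch (assuming the third-party sparse package from pydata/sparse is installed): scipy.sparse matrices follow numpy.matrix semantics, while pydata/sparse arrays follow numpy.ndarray semantics.

    import numpy as np
    import scipy.sparse
    import sparse  # pydata/sparse

    x = np.eye(3)
    m = scipy.sparse.csr_matrix(x)
    s = sparse.COO.from_numpy(x)

    # numpy.matrix semantics: * is matrix multiplication.
    print((m * m).toarray())
    # numpy.ndarray semantics: * is elementwise multiplication.
    print((s * s).todense())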
Fourier transform enhancements
We want to integrate PocketFFT into scipy.fftpack for significant performance improvements (see this NumPy PR for details), add a backend system to support PyFFTW and mkl-fft, and align the function signatures of numpy.fft and scipy.fftpack.
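As one concrete example of the conventions to be aligned (shown here as a small sketch): the two rfft functions return differently packed results.

    import numpy as np
    import scipy.fftpack

    x = np.arange(8, dtype=float)

    # numpy.fft.rfft: complex output of length n//2 + 1.
    print(np.fft.rfft(x))
    # scipy.fftpack.rfft: real output of length n, with the real and
    # imaginary parts of each nonzero-frequency term interleaved.
    print(scipy.fftpack.rfft(x))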
Support for distributed arrays and GPU arrays
NumPy is splitting its API from its execution engine with __array_function__ and __array_ufunc__. This will enable parts of SciPy to accept distributed arrays (e.g. dask.array.Array) and GPU arrays (e.g. cupy.ndarray) that implement the ndarray interface. At the moment it is not yet clear which algorithms will work out of the box, or whether there are significant performance gains when they do. We want to create a map of which parts of the SciPy API work, and improve support over time.
In addition to consuming NumPy protocols like __array_function__, SciPy can define such protocols itself. That will make it possible to (re)implement SciPy functions, e.g. those in scipy.signal, for Dask or GPU arrays (see NEP 18 - use outside of NumPy).
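As a rough sketch of how the dispatch protocol works (the DuckArray class below is hypothetical, and a NumPy version with __array_function__ enabled is assumed):

    import numpy as np

    class DuckArray:
        # Hypothetical minimal array wrapper implementing NEP 18 dispatch.
        def __init__(self, data):
            self.data = np.asarray(data)

        def __array_function__(self, func, types, args, kwargs):
            # NumPy routes API calls here instead of running its own
            # implementation; a distributed or GPU library would execute
            # func on its own backend. Here we simply unwrap and delegate.
            unwrapped = tuple(
                a.data if isinstance(a, DuckArray) else a for a in args
            )
            return func(*unwrapped, **kwargs)

    # np.mean is dispatched to DuckArray.__array_function__.
    print(np.mean(DuckArray([1.0, 2.0, 3.0])))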
Improve source builds on Windows
SciPy critically relies on Fortran code, and building that code is still problematic on Windows. There are currently only two options: using Intel Fortran, or using MSVC plus gfortran. The former is expensive, while the latter works (it is what we use for releases) but is quite hard to do correctly. To allow contributors and end users to reliably build SciPy on Windows, using the Flang compiler looks like the best way forward long-term. Until Flang support materializes, we need to streamline and better document the MSVC + gfortran build.
Improve benchmark system for optimize
scipy.optimize has an extensive set of benchmarks for the accuracy and speed of its global optimizers. Those benchmarks have allowed adding new optimizers (shgo and dual_annealing) with significantly better performance than the existing ones. However, the optimize benchmark system itself is slow and hard to use; we need to make it faster and make it easier to compare the performance of optimizers via plotting performance profiles.
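For reference, a quick sketch of the newer global optimizers on a standard test function (the Rosenbrock function), the kind of problem the benchmark suite measures:

    from scipy.optimize import rosen, shgo, dual_annealing

    bounds = [(-5.0, 5.0)] * 3

    # Both return an OptimizeResult; the global minimum of the Rosenbrock
    # function is at (1, 1, 1) with value 0.
    print(shgo(rosen, bounds).x)
    print(dual_annealing(rosen, bounds).x)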
Linear programming enhancements
Recently, all known issues with optimize.linprog have been solved. We now have many ideas for additional functionality (e.g. integer constraints, sparse matrix support, performance improvements); see gh-9269.
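For reference, a minimal sketch of the current linprog interface that such enhancements would extend (the example data is arbitrary):

    from scipy.optimize import linprog

    # Minimize c @ x subject to A_ub @ x <= b_ub and the given bounds.
    c = [-1.0, 4.0]
    A_ub = [[-3.0, 1.0], [1.0, 2.0]]
    b_ub = [6.0, 4.0]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None), (-3.0, None)])
    print(res.x, res.fun)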