Add an FAQ entry for limiting NumPy threads#1401

Closed
StefansM wants to merge 3 commits into biopython:master from StefansM:numpy-threads

@StefansM

This pull request addresses issue #1397 by adding an entry to the FAQ giving some workarounds to the problem.

I won't be offended if you ask me to rewrite it: I tend to be a bit too verbose when it comes to this sort of thing.


I hereby agree to dual licence this and any previous contributions under both
the Biopython License Agreement AND the BSD 3-Clause License.

I am happy to be thanked by name in the NEWS.rst and
CONTRIB.rst files.

I have read the CONTRIBUTING.rst file and understand that AppVeyor and
TravisCI will be used to confirm the Biopython unit tests and flake8 style
checks pass with these changes.

@codecov
codecov bot commented Sep 26, 2017

Codecov Report

Merging #1401 into master will decrease coverage by 1.39%.
The diff coverage is n/a.

@@            Coverage Diff            @@
##           master    #1401     +/-   ##
=========================================
- Coverage   84.79%   83.39%   -1.4%     
=========================================
  Files         319      318      -1     
  Lines       49050    49002     -48     
=========================================
- Hits        41591    40866    -725     
- Misses       7459     8136    +677
| Impacted Files | Coverage Δ |
| --- | --- |
| `Bio/Phylo/CDAOIO.py` | 5.42% <0%> (-80.1%) ⬇️ |
| `Bio/PDB/mmtf/__init__.py` | 29.41% <0%> (-58.83%) ⬇️ |
| `Bio/Phylo/CDAO.py` | 42.85% <0%> (-57.15%) ⬇️ |
| `Bio/PopGen/GenePop/EasyController.py` | 22.68% <0%> (-54.64%) ⬇️ |
| `Bio/PopGen/GenePop/Controller.py` | 10.42% <0%> (-54.25%) ⬇️ |
| `Bio/DocSQL.py` | 7.58% <0%> (-31.04%) ⬇️ |
| `Bio/Wise/__init__.py` | 50.87% <0%> (-29.83%) ⬇️ |
| `Bio/Phylo/Applications/_Phyml.py` | 71.42% <0%> (-28.58%) ⬇️ |
| `Bio/Wise/psw.py` | 48.05% <0%> (-22.08%) ⬇️ |
| `Bio/GA/Evolver.py` | 54.54% <0%> (-18.19%) ⬇️ |
| ... and 17 more | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 7df15de...0b52714.

\begin{verbatim}
import os
try:
os.environ["OMP_NUM_THREADS"] = "1"

Four space indentation please

We deprecated the \verb|Bio.Fasta| module in Biopython 1.51 (August 2009) and removed it in Biopython 1.55 (August 2010). There is a brief example showing how to convert old code to use \verb|Bio.SeqIO| instead in the \href{https://github.com/biopython/biopython/blob/master/DEPRECATED.rst}{DEPRECATED.rst} file.

\item \emph{Why does Biopython start so many threads?} \\
Some parts of Biopython make use of \href{http://www.numpy.org}{NumPy}, which in turn makes use of a system-dependent BLAS (Basic Linear Algebra Subprograms) library. Some BLAS libraries automatically spawn one thread for each available CPU core. On systems where this is the case, simply importing NumPy (or any Biopython module relying on NumPy) will spawn several threads. The number of threads can be changed by setting an environment variable before NumPy is loaded: if your version of NumPy is linked to the Intel MKL, set \verb|MKL_NUM_THREADS| to the desired number; for OpenBLAS, set \verb|OMP_NUM_THREADS|; otherwise, consult the documentation for your BLAS library.

Should we say "... which in turn may make use of ..." here (i.e. "may")? After all, NumPy might be built with ATLAS instead of BLAS.

@peterjc

peterjc commented Sep 27, 2017

@StefansM Other than those very minor tweaks, this looks good. I've double checked the LaTeX rendering locally (since the verbatim section can be tricky given all the special characters), and that's perfect.

@mdehoon you do a lot more with NumPy than me - any comments on this?

@StefansM

@peterjc I've fixed the indentation of the Python sample. I should know better than to trust autoindent for code embedded in LaTeX documents.

I also reworded the text slightly following your suggestion. The Intel MKL isn't just a BLAS library, so I changed that to "mathematics library".

@peterjc

peterjc commented Sep 27, 2017

Looks good. If there are no objections or further suggestions, I'll merge this later this week.

@peterjc

peterjc commented Sep 28, 2017

Apparently it can be $OPENBLAS_NUM_THREADS which needs setting, see https://mail.python.org/pipermail/numpy-discussion/2017-September/077224.html

@StefansM

It looks like it can be OPENBLAS_NUM_THREADS, GOTO_NUM_THREADS or OMP_NUM_THREADS, depending on how OpenBLAS was linked: https://github.com/xianyi/OpenBLAS#set-the-number-of-threads-with-environment-variables

I think these are the most common libraries that NumPy is linked with:

  • Netlib LAPACK/BLAS: Single-threaded only, nothing to set.
  • ATLAS: Set at compile time, nothing that can be done.
  • BLIS: Set BLIS_NUM_THREADS
  • Intel MKL: Set MKL_NUM_THREADS
  • OpenBLAS:
    • Compiled with OpenMP: Set OMP_NUM_THREADS
    • Otherwise, set OPENBLAS_NUM_THREADS or GOTO_NUM_THREADS.
    • Practically speaking, it's probably best to set both OPENBLAS_NUM_THREADS and OMP_NUM_THREADS.

I've not actually tested whether this problem occurs with these other libraries, but it seems likely. I'm happy to add these to the FAQ entry if you think it wouldn't bulk it out too much.
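
For illustration, the list above could be turned into a small helper along these lines. This is only a sketch: `THREAD_ENV_VARS` and `limit_blas_threads` are hypothetical names, not part of NumPy or Biopython, and it assumes the variables are read when the library is loaded, which is why the call has to come before the NumPy import.

```python
import os

# Variable names taken from the list above. Which one actually takes effect
# depends on the library NumPy was built against.
THREAD_ENV_VARS = (
    "OMP_NUM_THREADS",
    "OPENBLAS_NUM_THREADS",
    "GOTO_NUM_THREADS",
    "MKL_NUM_THREADS",
    "BLIS_NUM_THREADS",
)


def limit_blas_threads(n_threads=1, environ=None):
    """Set every listed thread-count variable to n_threads (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    for name in THREAD_ENV_VARS:
        environ[name] = str(n_threads)
    return environ


# Call this before NumPy (or any Biopython module that uses NumPy) is imported.
limit_blas_threads(1)
# import numpy  # only after the variables are set
```

Setting all of them is harmless for libraries that ignore a given variable, so it avoids having to work out how NumPy was linked.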

@peterjc

peterjc commented Sep 28, 2017

Let's give it a day or two in case that thread on the NumPy discussion list has a more authoritative answer (e.g. in some cases is it function call time rather than import time which matters?)
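
Until that is settled, one way to sidestep the import-time versus call-time question is to set the variables before the interpreter even starts, so they are in place whichever moment turns out to matter. A rough sketch, where `run_with_thread_limit` is a hypothetical helper and the three variable names are just the ones mentioned in this thread:

```python
import os
import subprocess
import sys


def run_with_thread_limit(code, n_threads=1):
    """Run `code` in a fresh Python process with thread-count variables preset."""
    # Copy the current environment so the parent process is left untouched.
    env = dict(os.environ)
    for name in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS"):
        env[name] = str(n_threads)
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

The same effect can be had from a shell by exporting the variables before launching Python; the helper just makes it scriptable.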

@peterjc

peterjc commented Jan 3, 2025

We never did merge this - and surprisingly I don't recall any queries in the years since about threads via numpy. I'm going to just close this, sorry, rebasing is non-trivial.

Note the tutorial has since moved from LaTeX to reStructuredText and the FAQ is now here:
https://github.com/biopython/biopython/blob/biopython-184/Doc/Tutorial/chapter_introduction.rst

@peterjc peterjc closed this Jan 3, 2025