
Limit threads usage in numpy during test to avoid time-out #4584

Draft
wants to merge 11 commits into base: develop
7 changes: 7 additions & 0 deletions .github/workflows/gh-ci.yaml
@@ -108,6 +108,13 @@ jobs:
- name: run_tests
if: contains(matrix.name, 'asv_check') != true
run: |
export OPENBLAS_NUM_THREADS=1
Member:

todo: there's another place to put this so it applies everywhere

export GOTO_NUM_THREADS=1
export OMP_NUM_THREADS=1
export MKL_NUM_THREADS=1
# limit to 2 workers to avoid overloading the CI
export PYTEST_XDIST_AUTO_NUM_WORKERS=2
Member:

Could this be set to auto, since that's what we are already using? Does auto not work correctly?

If auto does not work, then I'd prefer we define a variable GITHUB_CI_MAXCORES or similar and use it when invoking pytest: pytest -n $GITHUB_CI_MAXCORES. I prefer command-line arguments over env vars for determining code behavior because you immediately see what affects the command itself.

Contributor Author:

I was planning to determine the default number of workers pytest would use and reduce it by one, since the Ubuntu runner has 4 cores and the Mac runner has 3. I ended up setting it to 2 and found the performance acceptable.
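The cores-minus-one heuristic described in the comment above could be sketched as follows; `ci_worker_count` is a hypothetical helper for illustration, not code from this PR:

```python
import os

def ci_worker_count() -> int:
    """Return one fewer worker than the CPU count, but at least 1."""
    cores = os.cpu_count() or 1
    return max(1, cores - 1)

# On a 4-core Ubuntu runner this yields 3; on a 3-core Mac runner, 2.
print(ci_worker_count())
```

A fixed value of 2 (as the PR ended up using) avoids depending on runner hardware at all.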

Member:

ok

Member:

> Ubuntu runner has 4 cores and Mac has 3 cores.

It's unclear from this conversation whether altering the value of -n actually makes any difference to the issue we're seeing with the one failing multiprocessing test (i.e. the only cause of timeouts); could you confirm this please?

If it's not affecting things, then my suggestion is to stick with auto unless there's a substantial difference in performance that we've missed.
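For context on why the exports change the worker count without touching the pytest command line: pytest-xdist's `-n auto` honours the `PYTEST_XDIST_AUTO_NUM_WORKERS` environment variable, which this PR sets. A simplified sketch of that lookup (not xdist's actual implementation):

```python
import os

def resolve_auto_workers(default: int) -> int:
    # Simplified version of the lookup `-n auto` performs: an explicit
    # PYTEST_XDIST_AUTO_NUM_WORKERS overrides the detected CPU count.
    env = os.environ.get("PYTEST_XDIST_AUTO_NUM_WORKERS")
    return int(env) if env else default

os.environ["PYTEST_XDIST_AUTO_NUM_WORKERS"] = "2"
print(resolve_auto_workers(default=os.cpu_count() or 1))  # → 2
```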


PYTEST_FLAGS="--disable-pytest-warnings --durations=50"
if [ ${{ matrix.codecov }} = "true" ]; then
PYTEST_FLAGS="${PYTEST_FLAGS} --cov-config=.coveragerc --cov=MDAnalysis --cov-report=xml"
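As a hedged illustration of why placement of these exports matters (the reviewer's note about "another place to put this so it applies everywhere"): BLAS and OpenMP runtimes read the thread-cap variables when they first initialise, so in pure Python they would have to be set before numpy is imported. The variable names below are the ones from the diff; the rest is a sketch, not code from the PR.

```python
import os

# The four thread caps exported in the CI step above. Setting them in the
# process environment before numpy (and its bundled BLAS) is imported has
# the same effect as exporting them in the shell step that launches pytest.
THREAD_CAPS = {
    "OPENBLAS_NUM_THREADS": "1",
    "GOTO_NUM_THREADS": "1",
    "OMP_NUM_THREADS": "1",
    "MKL_NUM_THREADS": "1",
}
for var, val in THREAD_CAPS.items():
    os.environ[var] = val  # must happen before `import numpy`

print(os.environ["OPENBLAS_NUM_THREADS"])  # → 1
```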
7 changes: 7 additions & 0 deletions azure-pipelines.yml
@@ -128,6 +128,13 @@ jobs:
displayName: 'Check installed packages'
- powershell: |
cd testsuite
$env:OPENBLAS_NUM_THREADS=1
$env:GOTO_NUM_THREADS=1
$env:OMP_NUM_THREADS=1
$env:MKL_NUM_THREADS=1
# limit to 2 workers to avoid overloading the CI
$env:PYTEST_XDIST_AUTO_NUM_WORKERS=1
Member:

The comment is confusing: you set workers to 1 but say the limit is 2 workers. Change the comment; otherwise same as above: perhaps just add it to the command-line args?

Contributor Author:

I forgot to correct the comment line when I realized Azure has only 2 cores, so I changed the number of workers to 1.

Member:

As above, is this actually affecting timeouts or is this some separate optimization?


pytest MDAnalysisTests --disable-pytest-warnings -n auto --timeout=200 -rsx --cov=MDAnalysis
orbeckst marked this conversation as resolved.
displayName: 'Run MDAnalysis Test Suite'
- script: |
17 changes: 9 additions & 8 deletions testsuite/MDAnalysisTests/analysis/test_encore.py
@@ -125,10 +125,10 @@ def test_triangular_matrix(self):
reason="Not yet supported on Windows.")
def test_parallel_calculation(self):

arguments = [tuple([i]) for i in np.arange(0,100)]
arguments = [tuple([i]) for i in np.arange(0,10)]

parallel_calculation = encore.utils.ParallelCalculation(function=function,
n_jobs=4,
n_jobs=2,
orbeckst marked this conversation as resolved.
args=arguments)
results = parallel_calculation.run()

@@ -142,12 +142,12 @@ def test_rmsd_matrix_with_superimposition(self, ens1):
conf_dist_matrix = encore.confdistmatrix.conformational_distance_matrix(
ens1,
encore.confdistmatrix.set_rmsd_matrix_elements,
select="name CA",
select="name CA and resnum 1:3",
orbeckst marked this conversation as resolved.
pairwise_align=True,
weights='mass',
n_jobs=1)

reference = rms.RMSD(ens1, select="name CA")
reference = rms.RMSD(ens1, select="name CA and resnum 1:3")
reference.run()
err_msg = (
"Calculated RMSD values differ from "
@@ -159,24 +159,25 @@ def test_rmsd_matrix_with_superimposition_custom_weights(self, ens1):
conf_dist_matrix = encore.confdistmatrix.conformational_distance_matrix(
ens1,
encore.confdistmatrix.set_rmsd_matrix_elements,
select="name CA",
select="name CA and resnum 1:3",
pairwise_align=True,
weights='mass',
n_jobs=1)

conf_dist_matrix_custom = encore.confdistmatrix.conformational_distance_matrix(
ens1,
encore.confdistmatrix.set_rmsd_matrix_elements,
select="name CA",
select="name CA and resnum 1:3",
pairwise_align=True,
weights=(ens1.select_atoms('name CA').masses, ens1.select_atoms('name CA').masses),
weights=(ens1.select_atoms("name CA and resnum 1:3").masses,
ens1.select_atoms("name CA and resnum 1:3").masses),
n_jobs=1)

for i in range(conf_dist_matrix_custom.size):
assert_allclose(conf_dist_matrix_custom[0, i], conf_dist_matrix[0, i], rtol=0, atol=1.5e-7)

def test_rmsd_matrix_without_superimposition(self, ens1):
selection_string = "name CA"
selection_string = "name CA and resnum 1:3"
selection = ens1.select_atoms(selection_string)
reference_rmsd = []
coordinates = ens1.trajectory.timeseries(selection, order='fac')
14 changes: 7 additions & 7 deletions testsuite/MDAnalysisTests/parallelism/test_multiprocessing.py
@@ -81,8 +81,8 @@
),
(NCDF,),
(np.arange(150).reshape(5, 10, 3).astype(np.float64),),
(GRO, [GRO, GRO, GRO, GRO, GRO]),
(PDB, [PDB, PDB, PDB, PDB, PDB]),
(GRO, [GRO, GRO]),
(PDB, [PDB, PDB]),
(GRO, [XTC, XTC]),
(TRC_PDB_VAC, TRC_TRAJ1_VAC),
(TRC_PDB_VAC, [TRC_TRAJ1_VAC, TRC_TRAJ2_VAC]),
@@ -121,11 +121,11 @@ def test_multiprocess_COG(u):
ag = u.atoms[2:5]

ref = np.array([cog(u, ag, i)
for i in range(3)])
for i in range(2)])

p = multiprocessing.Pool(2)
res = np.array([p.apply(cog, args=(u, ag, i))
for i in range(3)])
for i in range(2)])
p.close()
assert_equal(ref, res)

@@ -198,9 +198,9 @@ def test_creating_multiple_universe_without_offset(temp_xtc, ncopies=3):
('memory', np.arange(60).reshape(2, 10, 3).astype(np.float64), dict()),
('TRC', TRC_TRAJ1_VAC, dict()),
('CHAIN', [TRC_TRAJ1_VAC, TRC_TRAJ2_VAC], dict()),
('CHAIN', [GRO, GRO, GRO], dict()),
('CHAIN', [PDB, PDB, PDB], dict()),
('CHAIN', [XTC, XTC, XTC], dict()),
('CHAIN', [GRO, GRO], dict()),
('CHAIN', [PDB, PDB], dict()),
('CHAIN', [XTC, XTC], dict()),
])
def ref_reader(request):
fmt_name, filename, extras = request.param