Submission of a batch job fails when the "work_dir" argument contains a "_" #294

Open
Baohua-Chen opened this issue May 20, 2023 · 1 comment

Comments


Baohua-Chen commented May 20, 2023

Details

  • Slurm Version: 20.02.5
  • Python Version: 3.10.8
  • Cython Version: 0.29.34
  • PySlurm Branch: 20-02-5
  • Linux Distribution: Linux version 3.10.0-1160.el7.x86_64 CentOS Linux release 7.9.2009

UPDATE

This bug does not seem to be caused only by the values in the slurm_job dict. I got the same error after deleting "work_dir" from the dict.
Maybe it has something to do with submitting the job from Jupyter Lab? I do not know.

Issue

When I submit a batch job with the job().submit_batch_job function and specify a "work_dir" value containing underscores (_), the job is submitted but fails immediately. Checking the submitted job with the job().find_id function shows that the "work_dir" attribute has been turned into garbled text such as "wly�U". When I resubmitted the job with the underscores removed from work_dir, the issue did not recur. I suspect this is caused by "_" being replaced with "-" when the Slurm interface is called.

An example which reproduces this bug:
from pyslurm import job
Job1 = {'wrap': 'echo a;sleep 15; echo b', 'job_name': 'test', 'partition': 'all', 'ntasks': 1, 'cpus_per_task': 1, 'work_dir': '/home/boo/slurm_jobs'}
job().submit_batch_job(Job1)

And an example which works well:
Job2 = {'wrap': 'echo a;sleep 15; echo b', 'job_name': 'test', 'partition': 'all', 'ntasks': 1, 'cpus_per_task': 1, 'work_dir': '/home/boo/slurmjobs'}
job().submit_batch_job(Job2)
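
The check described above (comparing the submitted work_dir with what find_id reports) could be sketched roughly as follows. This is only a minimal sketch: work_dir_is_garbled and check_job are hypothetical helper names, and it assumes that find_id returns a list of dicts with a 'work_dir' key, which may differ between pyslurm versions.

```python
def work_dir_is_garbled(submitted, reported):
    """Return True when the reported work_dir differs from the submitted one."""
    if isinstance(reported, bytes):
        # Decode leniently so undecodable bytes become U+FFFD instead of raising.
        reported = reported.decode("utf-8", errors="replace")
    return reported != submitted


def check_job(job_id, submitted_work_dir):
    """Hypothetical wrapper; needs pyslurm and a reachable Slurm controller."""
    # Imported lazily so the pure helper above can be used without pyslurm.
    from pyslurm import job

    # Assumption: find_id returns a list of dicts that carry a 'work_dir' key.
    records = job().find_id(str(job_id))
    return [
        work_dir_is_garbled(submitted_work_dir, record.get("work_dir"))
        for record in records
    ]
```

Calling check_job right after submit_batch_job for both Job1 and Job2 would show whether the mismatch really depends on the underscore or also appears without it.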

tazend (Member) commented May 20, 2023

Hi,

You are probably seeing an issue similar to the one mentioned in #260.

In newer versions of pyslurm (starting with 21.08), the Job-Submission API was substantially reworked (see the docs here), and the pyslurm.job class has been declared deprecated.

Since the new API is not available for 20.02 yet, I can try to backport it. It may take some time, though, because of changes introduced in newer Slurm versions over the years.
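
Until a backport exists, one possible stopgap (not something proposed in this thread, just a sketch) is to bypass the deprecated pyslurm.job class and call the sbatch command directly. The helper names and the key-to-option mapping below are assumptions made for illustration; only the dict keys from the examples above are covered.

```python
import subprocess

# Assumed mapping from the dict keys used in this issue to sbatch long options.
_SBATCH_FLAGS = {
    "job_name": "--job-name",
    "partition": "--partition",
    "ntasks": "--ntasks",
    "cpus_per_task": "--cpus-per-task",
    "work_dir": "--chdir",
    "wrap": "--wrap",
}


def build_sbatch_argv(job_dict):
    """Translate a pyslurm-style job dict into an sbatch argument list."""
    argv = ["sbatch", "--parsable"]
    for key, value in job_dict.items():
        flag = _SBATCH_FLAGS.get(key)
        if flag is None:
            raise KeyError(f"no sbatch option known for key {key!r}")
        # One argv element per option, so no shell quoting is involved.
        argv.append(f"{flag}={value}")
    return argv


def submit_with_sbatch(job_dict):
    """Run sbatch and return the job id (requires sbatch on PATH)."""
    result = subprocess.run(
        build_sbatch_argv(job_dict), check=True, capture_output=True, text=True
    )
    # With --parsable, sbatch prints "jobid" or "jobid;cluster".
    return int(result.stdout.strip().split(";")[0])
```

Because the arguments are passed as a list rather than through a shell, a work_dir such as /home/boo/slurm_jobs reaches sbatch unchanged.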
