[slurm-users] job_submit.lua - choice of error on failure / job_desc.gpus?
Loris Bennett
loris.bennett at fu-berlin.de
Fri Dec 4 12:58:59 UTC 2020
Hi,
I want to reject jobs that don't specify any GPUs when accessing our GPU
partition and have the following in job_submit.lua:
    if (job_desc.partition == "gpu" and job_desc.gres == nil ) then
       slurm.log_user(string.format("Please request GPU resources in the partition 'gpu', " ..
                                    "e.g. '#SBATCH --gres=gpu:1' " ..
                                    "Please see 'man sbatch' for more details)"))
       slurm.log_info(string.format("check_parameters: user '%s' did not request GPUs in partition 'gpu'",
                                    username))
       return slurm.ERROR
    end
If GRES is not given for the GPU partition, this produces
sbatch: error: Please request GPU resources in the partition 'gpu', e.g. '#SBATCH --gres=gpu:1' Please see 'man sbatch' for more details)
sbatch: error: Batch job submission failed: Unspecified error
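To try the check outside slurmctld, something like the following minimal sketch should work. Note that the 'slurm' table here is only a stand-in I have written for the one slurmctld provides, the helper name 'check_gpu_request' and the user name are made up for illustration, and plain Lua tables take the place of the real job descriptor:

```lua
-- Stand-in for the 'slurm' table that slurmctld passes to job_submit.lua.
-- This stub (including the captured 'messages' list) is an assumption made
-- for this sketch so the check can be run with a plain Lua interpreter.
local slurm = {
  SUCCESS = 0,
  ERROR = -1,
  messages = {},
}

-- Record user-facing messages instead of sending them to sbatch.
function slurm.log_user(msg)
  table.insert(slurm.messages, msg)
end

-- Discard log messages; the real function writes to the slurmctld log.
function slurm.log_info(msg)
end

-- Hypothetical helper wrapping the check quoted above.
local function check_gpu_request(job_desc, username)
  if job_desc.partition == "gpu" and job_desc.gres == nil then
    slurm.log_user("Please request GPU resources in the partition 'gpu', " ..
                   "e.g. '#SBATCH --gres=gpu:1'")
    slurm.log_info(string.format(
      "check_parameters: user '%s' did not request GPUs in partition 'gpu'",
      username))
    return slurm.ERROR
  end
  return slurm.SUCCESS
end

-- Plain tables stand in for job_desc: no GRES, GRES given, other partition.
print(check_gpu_request({ partition = "gpu" }, "alice"))
print(check_gpu_request({ partition = "gpu", gres = "gpu:1" }, "alice"))
print(check_gpu_request({ partition = "main" }, "alice"))
```

This only exercises the logic of the condition. It says nothing about which fields the real 'job_desc' exposes or which error codes sbatch will translate into a readable message.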
My questions are:
1. Is there a better error to return? The 'slurm.ERROR' produces the
generic second error line above (slurm_errno.h just seems to have
ESLURM_MISSING_TIME_LIMIT and ESLURM_INVALID_KNL as errors a plugin
might raise). This is misleading, since the error is in fact known
and specific.
2. Am I right in thinking that 'job_desc' does not, as of 20.02.06, have
a 'gpus' field corresponding to the sbatch/srun option '--gpus'?
Cheers,
Loris
--
Dr. Loris Bennett (Hr./Mr.)
ZEDAT, Freie Universität Berlin Email loris.bennett at fu-berlin.de