[slurm-users] Feature request: create a job id before job submission
Mark Hahn
hahn at mcmaster.ca
Tue May 7 18:16:45 UTC 2019
>> puzzled two ways: why not use the numeric jobid,
>
> I don't know the job id before the actual job submission. Hence I would like to get some kind of placeholder, and `scommit` the job later with the actual resource requirements as comments in a usual job script.
>
OK, in that case, you can make up an arbitrary identifier
(hash of user and time, etc), and simply pass that into the
job in the environment. (there is a religion of not passing state,
such as environment variables, into jobs, but it's just dogma...)
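a minimal sketch of that idea, assuming bash and GNU coreutils on the login node. the names make_job_tag, submit_with_tag, JOB_TAG and job.sh are made up for illustration; only `sbatch --export` is an actual Slurm option. note that hashing user plus epoch seconds gives the same tag if the same user asks twice within one second.

```shell
#!/usr/bin/env bash
# Sketch: build an arbitrary identifier from user + time and pass it to the
# job through the environment. JOB_TAG and job.sh are hypothetical names.

make_job_tag() {
    # Hash "user-epochseconds" and keep the first 12 hex characters.
    local user="${1:-${USER:-unknown}}"
    local stamp="${2:-$(date +%s)}"
    printf '%s-%s' "$user" "$stamp" | sha1sum | cut -c1-12
}

submit_with_tag() {
    # --export=ALL,NAME=value keeps the submitting environment and adds NAME.
    local tag="$1" script="$2"
    sbatch --export=ALL,JOB_TAG="$tag" "$script"
}

# Usage on a login node (paths are examples only):
#   tag=$(make_job_tag)
#   mkdir -p "/scratch/$tag" && cp input.dat "/scratch/$tag/"
#   submit_with_tag "$tag" job.sh     # job.sh can then read "$JOB_TAG"
```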
>> and why would configuring
>> the scratch space be too slow to perform in the job prolog?
>
> Access to /home from the nodes is highly discouraged; instead, users should prepare an area in /scratch beforehand (copying all the files for the job there) and submit the job from there. The working directory of the job is then automatically in the /scratch area (fast parallel file system), so no further file staging is needed. Essentially the nodes could work without a mounted /home.
>
there's no reason the prolog can't call standardized code that looks
for the relevant information and performs any staging (without human
intervention).
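as a rough illustration of what such standardized code might look like: the function below creates a per-job directory and copies in whatever a manifest file lists. SLURM_JOB_ID and SLURM_JOB_USER are variables Slurm sets for prolog scripts; the function name stage_job, the /scratch/job-<id> layout and the one-path-per-line manifest are assumptions for this sketch, not anything Slurm provides.

```shell
#!/usr/bin/env bash
# Prolog-style staging sketch. The manifest convention (one source path per
# line) and the directory layout are hypothetical.

stage_job() {
    local base="$1" jobid="$2" user="$3" manifest="$4"
    local dest="$base/job-$jobid" src
    mkdir -p "$dest" || return 1
    # Copy every path named in the manifest, if a readable manifest exists.
    if [ -r "$manifest" ]; then
        while IFS= read -r src || [ -n "$src" ]; do
            [ -z "$src" ] && continue
            cp -r -- "$src" "$dest/" || return 1
        done < "$manifest"
    fi
    # Hand the staged area over to the job owner.
    chown -R "$user" "$dest" || return 1
    printf '%s\n' "$dest"
}

# In a prolog this might be called as (manifest path is an example only):
#   stage_job /scratch "$SLURM_JOB_ID" "$SLURM_JOB_USER" "/path/to/stage.list"
```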
> Sure, `sblank` which would provide a reserved job id could have some prolog and prepare the workspace to tell the user: please put your files in /scratch/job-id-task-id. For the users this would mean to issue:
>
> sblank
> copy files to the given location(s) from the login node
> scommit
I don't see any harm in doing this, and it would require no assistance
from slurm: "sbatch --hold ...", then do your prep, then "scontrol release".
but I'm not sure what it really gets you.
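spelled out as a sketch: `sbatch --hold --parsable` submits the job held and prints just the job id (optionally followed by ";cluster"), so the id is known before anything runs. the function name hold_stage_release and the /scratch/job-<id> layout are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Sketch of the hold / prepare / release sequence.

hold_stage_release() {
    local script="$1" src="$2" base="${3:-/scratch}"
    local jobid dest
    # Submit in a held state so the job cannot start during staging.
    jobid=$(sbatch --hold --parsable "$script") || return 1
    jobid="${jobid%%;*}"              # drop an optional ";cluster" suffix
    dest="$base/job-$jobid"
    # Copy the contents of the source directory into the per-job area.
    mkdir -p "$dest" && cp -r -- "$src"/. "$dest"/ || return 1
    # Staging is done: let the scheduler consider the job.
    scontrol release "$jobid" || return 1
    printf '%s\n' "$dest"
}

# Usage (paths are examples only):
#   hold_stage_release job.sh "$HOME/case42" /scratch
```

the job script itself would still need to cd into that directory (for example via "$SLURM_JOB_ID"), since the working directory is fixed at submission time in this sketch.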
another approach would be to submit a dependent pair (or even triplet)
of jobs: data movement on either end and compute in the middle. one
attractive thing about this is that since the data movement would be
in dedicated jobs, you could handle them specially (run them on dedicated
nodes, rate-limit them, etc).
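a sketch of the triplet using job dependencies. `--dependency=afterok:<id>` and `--dependency=afterany:<id>` are standard sbatch options; the function name submit_pipeline and the three script names are placeholders, and using afterany for the last step (so results are copied back even if compute fails) is a choice made here, not a requirement.

```shell
#!/usr/bin/env bash
# Sketch: stage-in -> compute -> stage-out as three dependent jobs.

submit_pipeline() {
    local stage_in="$1" compute="$2" stage_out="$3"
    local in_id run_id out_id
    in_id=$(sbatch --parsable "$stage_in") || return 1
    in_id="${in_id%%;*}"
    # Compute only starts if stage-in finished successfully.
    run_id=$(sbatch --parsable --dependency=afterok:"$in_id" "$compute") || return 1
    run_id="${run_id%%;*}"
    # Stage-out runs once compute has ended, whatever its exit status.
    out_id=$(sbatch --parsable --dependency=afterany:"$run_id" "$stage_out") || return 1
    out_id="${out_id%%;*}"
    printf '%s %s %s\n' "$in_id" "$run_id" "$out_id"
}

# Usage (script names are examples only):
#   submit_pipeline stage_in.sh compute.sh stage_out.sh
```

the data-movement scripts can carry their own #SBATCH directives (partition, QOS, and so on), which is where any special handling of those jobs would go.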
of course, user pro/epilog is very much like this, but somewhat less
structured (and potentially less queue time). possibly more wasteful.
> and the users can be sure to find the job's files in /scratch/job-id-task-id, and the admins can be sure that there is no access to /home slowing down the cluster and interactive work on the login node.
sure, though you could make the name somewhat richer (username, account, etc)
regards, mark hahn.