[slurm-users] What is the complete logic to calculate node number in job_submit.lua

Loris Bennett loris.bennett at fu-berlin.de
Mon Sep 26 14:02:06 UTC 2022


Hi Ole,

Ole Holm Nielsen <Ole.H.Nielsen at fysik.dtu.dk> writes:

> Hi Loris,
>
> On 9/26/22 12:51, Loris Bennett wrote:
>>> When designing restrictions in job_submit.lua, I found that no member of the job_desc struct can be used directly to determine the number of nodes finally allocated to a job. job_desc.min_nodes seems to
>>> be the closest match, but it is 0xFFFFFFFE when the user does not specify the --nodes option. In that case we thought we could use job_desc.num_tasks and job_desc.ntasks_per_node to calculate the node count.
>>> But ntasks_per_node may also hold its default value, 0xFFFE, if the user does not specify the related option.
>>>
>>> So what is the complete and elegant way to predict a job's node count in job_submit.lua in all cases, no matter how users write their submit options?
>> I don't think you can expect to know the node(s) a job will eventually
>> run on at submission time.  How would this work?  Resources will become
>> available earlier than Slurm expects, if jobs finish before the given
>> time limit (or if they crash).  If you are using fairshare, jobs can be
>> scheduled which have a higher priority than the currently waiting jobs.
>> What is your use-case for needing to know the node the job will run on?
>
> I think he meant the *number of nodes*, and not the *hostnames* of the compute
> nodes selected by Slurm at a later time.

Ah, OK, you may be right.  However, unless the user restricts the job to
an exact number of nodes, the actual number which will be used in the end
is also unknowable at the time of submission, isn't it?
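
To make that concrete, here is a minimal sketch of what a submit filter
could still infer, namely a lower bound on the node count plus a flag
saying whether the user pinned it exactly.  The function names are
hypothetical.  The sentinel values for min_nodes and ntasks_per_node are
the ones quoted above; that num_tasks and max_nodes use the same 32-bit
sentinel when unset is an assumption you should check on your own system.

```lua
-- Sentinel values reported in this thread for unset job_desc fields.
local NO_VAL   = 0xFFFFFFFE  -- min_nodes when --nodes is not given
local NO_VAL16 = 0xFFFE      -- ntasks_per_node when the option is not given

-- True if the field carries a real value rather than nil or a sentinel.
local function is_set(value, sentinel)
   return value ~= nil and value ~= sentinel
end

-- Hypothetical helper: lower bound on the node count, or nil if nothing
-- can be inferred from the submit options.
function min_nodes_lower_bound(job_desc)
   if is_set(job_desc.min_nodes, NO_VAL) then
      return job_desc.min_nodes
   end
   -- Assumption: an unset num_tasks also shows up as the 32-bit sentinel.
   if is_set(job_desc.num_tasks, NO_VAL)
      and is_set(job_desc.ntasks_per_node, NO_VAL16)
      and job_desc.ntasks_per_node > 0 then
      return math.ceil(job_desc.num_tasks / job_desc.ntasks_per_node)
   end
   return nil
end

-- Hypothetical helper: true only if the user fixed the node count, i.e.
-- minimum and maximum are both set and equal.
-- Assumption: an unset max_nodes also shows up as the 32-bit sentinel.
function node_count_is_exact(job_desc)
   return is_set(job_desc.min_nodes, NO_VAL)
      and is_set(job_desc.max_nodes, NO_VAL)
      and job_desc.min_nodes == job_desc.max_nodes
end
```

In a real job_submit.lua you would presumably call these from
slurm_job_submit() and compare against the constants exported by the
plugin rather than the literals above.  When the lower bound is nil, or
the count is not exact, the filter cannot know the final number and has
to decide how to treat the job on some other basis.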

Cheers,

Loris


