[slurm-users] strange resource allocation issue - thoughts?

Wed Mar 27 15:02:30 UTC 2019

On 3/23/19 2:16 PM, Sharma, M D wrote:
> Hi folks,
>
> By default, Slurm allocates the whole node for a job (even if it 
> specifically requested a single core). This is usually taken care of 
> by adding SelectType=select/cons_res along with an appropriate 
> parameter such as SelectTypeParameters=CR_Core_Memory.
>
> When testing the job submission and resource allocation, we can see 
> things work as intended when using srun:
>
> srun -N1 -n1 -p fxq --mem=1000 sleep 60 &
>
> # The command above, submitted 20 times, launches 20 jobs on a 
> single 40-core node as intended.
>
> However, if the same request is submitted via sbatch, the entire node 
> gets into an "allocated" state and does not accept any other jobs 
> until completion of the single core job.
>
> Has anyone else seen this behaviour / have thoughts on a fix?
>
>
I believe that by requesting -N1, you are requesting that the entire node 
be allocated to that job. Remove -N1 from that command, like this:

srun -n1 -p fxq --mem=1000 sleep 60 &

That should allow you to run multiple jobs on the same node.
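
For the sbatch case from the original question, the same request could 
be written as a batch script along these lines. This is only a sketch: 
the file name single_core.sbatch and the count of three submissions are 
made up for illustration, and the partition and memory values are copied 
from the srun line above. Following the suggestion above, it leaves out 
-N1. It only submits if sbatch is actually on the PATH.

```shell
#!/bin/bash
# Write a batch script mirroring the srun request from this thread
# (one task, partition fxq, 1000 MB of memory), without -N1.
# The file name single_core.sbatch is a hypothetical choice.
cat > single_core.sbatch <<'EOF'
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --partition=fxq
#SBATCH --mem=1000
sleep 60
EOF

# Only talk to the scheduler when Slurm's client tools are installed.
if command -v sbatch >/dev/null 2>&1; then
    # Show which select plugin and parameters the controller is using.
    scontrol show config | grep -i '^SelectType' || true

    # Submit a few copies of the job (three is an arbitrary number).
    for i in 1 2 3; do
        sbatch single_core.sbatch
    done

    # List this user's jobs; the NODELIST column shows whether they
    # ended up sharing a node.
    squeue -u "$(id -un)"
fi
```

If the jobs still land one per node, comparing the SelectType lines 
printed by scontrol against what is in slurm.conf would be a reasonable 
next check.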

--
Prentice
