[slurm-users] Preempting not working for GPU nodes

Zheng Gong gongzheng.xw at gmail.com
Thu May 10 04:50:04 MDT 2018


Hi everyone,

We have a heterogeneous cluster. Some of the nodes have two NVIDIA GPU
cards.

slurm.conf looks like:

NodeName=compute-0-[0-6] CPUs=32
NodeName=compute-1-[0-5] CPUs=16 Gres=gpu:2 Weight=100

PartitionName=hipri DefaultTime=00-1 MaxTime=00-1 MaxNodes=1 PriorityTier=9 Nodes=compute-0-[0-6]
PartitionName=higtx DefaultTime=00-1 MaxTime=00-1 MaxNodes=1 PriorityTier=9 Nodes=compute-1-[0-5]
PartitionName=cpu   DefaultTime=01-0 MaxTime=15-0 MaxNodes=1 Nodes=compute-0-[0-6]
PartitionName=gtx   DefaultTime=01-0 MaxTime=15-0 MaxNodes=1 Nodes=compute-1-[0-5]

gres.conf looks like:

NodeName=compute-1-[0-5] Name=gpu File=/dev/nvidia0
NodeName=compute-1-[0-5] Name=gpu File=/dev/nvidia1
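
The cluster-wide preemption settings are not shown above. For reference, a minimal sketch of the kind of lines that suspend-based, partition-priority preemption usually relies on in slurm.conf would be (the values below are illustrative assumptions, not copied from our actual config):

PreemptType=preempt/partition_prio
PreemptMode=SUSPEND,GANG
GresTypes=gpu

If I understand the Slurm preemption docs correctly, with PreemptMode=SUSPEND,GANG the lower-priority partitions (cpu and gtx here) also generally need OverSubscribe=FORCE:1 so that gang scheduling can place the preempting job alongside the suspended one.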
>
Preemption works for the hipri partition, but not for the higtx partition.

By the way, if I suspend a job in the cpu partition, the pending jobs in the
cpu partition start to run. But if I suspend a job in the gtx partition,
nothing happens.
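
For what it's worth, the way I check the partition settings and reproduce the suspend test looks roughly like this (standard scontrol/squeue commands; the job ID is just a placeholder):

scontrol show partition cpu
scontrol show partition gtx      # compare PreemptMode / OverSubscribe between the two
scontrol suspend 12345           # 12345 stands in for a running job ID in the gtx partition
squeue -p gtx -t PENDING         # pending jobs I would expect to start after the suspend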

-- 
Gong, Zheng (龚正)
Doctoral Candidate in Physical Chemistry
School of Chemistry and Chemical Engineering
Shanghai Jiao Tong University
http://sun.sjtu.edu.cn

