[slurm-users] Query about Compute + GPUs

Markus Köberl markus.koeberl at tugraz.at
Tue Nov 21 07:05:58 MST 2017


On Tuesday, 21 November 2017 10:26:53 CET Merlin Hartley wrote:
> Could you give us your submission command?
> It may be that you are requesting the wrong partition - i.e. relying on the
> default partition selection… try with "--partition cpu"

I ran the following commands:

srun --gres=gpu --mem-per-cpu="5G" -w gpu1 --pty /bin/bash
-> works; the job runs in partition gpu

srun --mem-per-cpu="5G" -p cpu --pty /bin/bash
-> works; I get a slot on another node, which has only one NodeName entry.

srun --mem-per-cpu="5G" -p cpu -w gpu1-cpu --pty /bin/bash
-> error: Invalid job credential...

srun --mem-per-cpu="5G" -p cpu -w gpu1 --pty /bin/bash
-> error: not in partition...


I am using the following slurm.conf options:

EnforcePartLimits=ANY
GresTypes=gpu
JobSubmitPlugins=all_partitions
ProctrackType=proctrack/cgroup
ReturnToService=2
TaskPlugin=task/cgroup
TrackWCKey=yes
InactiveLimit=3600
KillWait=1800
MinJobAge=600
OverTimeLimit=600
SlurmctldTimeout=120
SlurmdTimeout=300
Waittime=0
DefMemPerCPU=1000
FastSchedule=1
SchedulerType=sched/backfill
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory
PriorityFlags=ACCRUE_ALWAYS,FAIR_TREE,SMALL_RELATIVE_TO_TIME
PriorityType=priority/multifactor
PriorityDecayHalfLife=7-0
PriorityFavorSmall=YES
PriorityWeightAge=50
PriorityWeightFairshare=25
PriorityWeightJobSize=50
PriorityWeightPartition=100
PriorityWeightTRES=CPU=1000,Mem=2000,Gres/gpu=3000
AccountingStorageEnforce=associations,limits,qos,WCKey
AccountingStorageType=accounting_storage/slurmdbd
AccountingStoreJobComment=YES
AccountingStorageTRES=CPU,Mem,Gres/gpu
JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/cgroup


Regards,
Markus Köberl
-- 
Markus Koeberl
Graz University of Technology
Signal Processing and Speech Communication Laboratory
E-mail: markus.koeberl at tugraz.at


