[slurm-users] Query about Compute + GPUs
Markus Köberl
markus.koeberl at tugraz.at
Tue Nov 21 07:05:58 MST 2017
On Tuesday, 21 November 2017 10:26:53 CET Merlin Hartley wrote:
> Could you give us your submission command?
> It may be that you are requesting the wrong partition - i.e. relying on the
> default partition selection… try with “--partition cpu”
I ran the following commands:
srun --gres=gpu --mem-per-cpu="5G" -w gpu1 --pty /bin/bash
-> works; the job runs in partition gpu
srun --mem-per-cpu="5G" -p cpu --pty /bin/bash
-> works; I get a slot on another node, one that has only a single NodeName entry.
srun --mem-per-cpu="5G" -p cpu -w gpu1-cpu --pty /bin/bash
-> error: Invalid job credential...
srun --mem-per-cpu="5G" -p cpu -w gpu1 --pty /bin/bash
-> error not in partition...
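To compare the controller's view of the partition and the node with the names passed to srun above, a minimal sketch along these lines could be used. It only wraps `scontrol show partition`, `scontrol show node` and `sinfo -N -p`. The function name `diagnose_partition` is made up for illustration and is not part of Slurm; the partition and node names are the ones from the commands above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): print the controller's view of one
# partition and one node, so the partition's node list can be compared with
# the node name given to "srun -w".
diagnose_partition() {
    part="$1"   # partition name, e.g. cpu
    node="$2"   # node name, e.g. gpu1 or gpu1-cpu

    # Bail out cleanly when the Slurm client tools are not installed.
    if ! command -v scontrol >/dev/null 2>&1; then
        printf '%s\n' "scontrol not found; run this on a host with the Slurm client tools"
        return 0
    fi

    # Partition definition, including its Nodes= list.
    scontrol show partition "$part"
    # Node definition, including NodeAddr, NodeHostName and Gres.
    scontrol show node "$node"
    # One line per node in the partition, with its current state.
    sinfo -N -p "$part"
}
```

For the third command above, which fails with the credential error, it would be called as `diagnose_partition cpu gpu1-cpu`.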
I am using the following options in slurm.conf:
EnforcePartLimits=ANY
GresTypes=gpu
JobSubmitPlugins=all_partitions
ProctrackType=proctrack/cgroup
ReturnToService=2
TaskPlugin=task/cgroup
TrackWCKey=yes
InactiveLimit=3600
KillWait=1800
MinJobAge=600
OverTimeLimit=600
SlurmctldTimeout=120
SlurmdTimeout=300
Waittime=0
DefMemPerCPU=1000
FastSchedule=1
SchedulerType=sched/backfill
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory
PriorityFlags=ACCRUE_ALWAYS,FAIR_TREE,SMALL_RELATIVE_TO_TIME
PriorityType=priority/multifactor
PriorityDecayHalfLife=7-0
PriorityFavorSmall=YES
PriorityWeightAge=50
PriorityWeightFairshare=25
PriorityWeightJobSize=50
PriorityWeightPartition=100
PriorityWeightTRES=CPU=1000,Mem=2000,Gres/gpu=3000
AccountingStorageEnforce=associations,limits,qos,WCKey
AccountingStorageType=accounting_storage/slurmdbd
AccountingStoreJobComment=YES
AccountingStorageTRES=CPU,Mem,Gres/gpu
JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/cgroup
Regards,
Markus Köberl
--
Markus Koeberl
Graz University of Technology
Signal Processing and Speech Communication Laboratory
E-mail: markus.koeberl at tugraz.at