[slurm-users] Backfill isn’t working for a node with two GPUs that have different GRES types.
Randall Radmer
radmer at gmail.com
Mon Apr 1 15:31:16 UTC 2019
I can’t get backfill to work for a machine with two GPUs (one is a P4 and
the other a T4).
Submitting jobs works as expected: if the GPU I request is free, then my
job runs, otherwise it goes into a pending state. But if I have pending
jobs for one GPU ahead of pending jobs for the other GPU, I see blocking
issues.
More specifically, I can create a case where I am running a job on each of
the GPUs and have a pending job waiting for the P4 followed by a pending
job waiting for a T4. I would expect that if I exit the running T4 job,
then backfill would start the pending T4 job, even though it would have to
jump ahead of the pending P4 job. This does not happen.
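For reference, the four jobs were submitted roughly like this (gpu.sh is a
stand-in for my actual batch script; the partition name is the one that
appears in the logs below):

```shell
# One job per GRES type fills both GPUs (both start immediately),
# then one more job of each type goes to PENDING with Reason=Resources.
# "gpu.sh" is a placeholder for my real batch script.
sbatch -p test-backfill --gres=gpu:gv100:1 gpu.sh   # runs    (became 100091)
sbatch -p test-backfill --gres=gpu:tu104:1 gpu.sh   # runs    (became 100092)
sbatch -p test-backfill --gres=gpu:gv100:1 gpu.sh   # pending (became 100093)
sbatch -p test-backfill --gres=gpu:tu104:1 gpu.sh   # pending (became 100094)
# After the running tu104 job (100092) exits, I expect backfill to start
# 100094 even though 100093 sits ahead of it in the queue; it does not.
```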
The following shows my jobs after I exited from a running T4 job, which had
ID 100092:
$ squeue --noheader -u rradmer --Format=jobid,state,gres,nodelist,reason |
sed 's/ */ /g' | sort
100091 RUNNING gpu:gv100:1 computelab-134 None
100093 PENDING gpu:gv100:1 Resources
100094 PENDING gpu:tu104:1 Resources
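As a sanity check, filtering that output for pending jobs shows exactly one
job per GRES type still waiting (the sample lines are inlined here so the
one-liner stands alone):

```shell
# The squeue lines from above, inlined as sample data.
squeue_sample='100091 RUNNING gpu:gv100:1 computelab-134 None
100093 PENDING gpu:gv100:1 Resources
100094 PENDING gpu:tu104:1 Resources'

# Print each pending job ID with its requested GRES type; a pending type
# with no matching RUNNING line should be a backfill candidate.
echo "$squeue_sample" | awk '$2 == "PENDING" { print $1, $3 }'
# -> 100093 gpu:gv100:1
# -> 100094 gpu:tu104:1
```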
I can find no reason why 100094 doesn’t start running (I’ve waited up to
an hour, just to make sure).
System config info and log snippets shown below.
Thanks much,
Randy
Node state corresponding to the squeue output shown above:
$ scontrol show node computelab-134 | grep -i [gt]res
Gres=gpu:gv100:1,gpu:tu104:1
CfgTRES=cpu=12,mem=64307M,billing=12,gres/gpu=2,gres/gpu:gv100=1,gres/gpu:tu104=1
AllocTRES=cpu=6,mem=32148M,gres/gpu=1,gres/gpu:gv100=1
Slurm config follows:
$ scontrol show conf | grep -Ei '(gres|^Sched|prio|vers)'
AccountingStorageTRES =
cpu,mem,energy,node,billing,gres/gpu,gres/gpu:gp100,gres/gpu:gp104,gres/gpu:gv100,gres/gpu:tu102,gres/gpu:tu104,gres/gpu:tu106
GresTypes = gpu
PriorityParameters = (null)
PriorityDecayHalfLife = 7-00:00:00
PriorityCalcPeriod = 00:05:00
PriorityFavorSmall = No
PriorityFlags =
PriorityMaxAge = 7-00:00:00
PriorityUsageResetPeriod = NONE
PriorityType = priority/multifactor
PriorityWeightAge = 0
PriorityWeightFairShare = 0
PriorityWeightJobSize = 0
PriorityWeightPartition = 0
PriorityWeightQOS = 0
PriorityWeightTRES = (null)
PropagatePrioProcess = 0
SchedulerParameters =
default_queue_depth=2000,bf_continue,bf_ignore_newly_avail_nodes,bf_max_job_test=1000,bf_window=10080,kill_invalid_depend
SchedulerTimeSlice = 30 sec
SchedulerType = sched/backfill
SLURM_VERSION = 17.11.9-2
GPUs on node:
$ nvidia-smi --query-gpu=index,name,gpu_bus_id --format=csv
index, name, pci.bus_id
0, Tesla T4, 00000000:82:00.0
1, Tesla P4, 00000000:83:00.0
The gres.conf file on the node:
$ cat /etc/slurm/gres.conf
Name=gpu Type=tu104 File=/dev/nvidia0 Cores=0,1,2,3,4,5
Name=gpu Type=gp104 File=/dev/nvidia1 Cores=6,7,8,9,10,11
Last few lines of the SlurmSchedLogFile:
$ sudo tail -3 slurm.sched.log
[2019-04-01T08:14:23.727] sched: Running job scheduler
[2019-04-01T08:14:23.728] sched: JobId=100093. State=PENDING.
Reason=Resources. Priority=1. Partition=test-backfill.
[2019-04-01T08:14:23.728] sched: JobId=100094. State=PENDING.
Reason=Resources. Priority=1. Partition=test-backfill.
Recent backfill entries from the SlurmctldLogFile:
$ sudo grep backfill slurmctld.log | tail -5
[2019-04-01T08:16:53.281] backfill: beginning
[2019-04-01T08:16:53.281] backfill test for JobID=100093 Prio=1
Partition=test-backfill
[2019-04-01T08:16:53.281] backfill test for JobID=100094 Prio=1
Partition=test-backfill
[2019-04-01T08:16:53.281] backfill: reached end of job queue
[2019-04-01T08:16:53.281] backfill: completed testing 2(2) jobs, usec=707