[slurm-users] Scheduling GPUS
Mike Mosley
Mike.Mosley at uncc.edu
Thu Nov 7 21:19:59 UTC 2019
Greetings all:
I'm attempting to configure the scheduler to place jobs on our GPU boxes, but
I've run into a bit of a snag.
I have a box with two Tesla K80s. With my current configuration, the
scheduler will schedule one job on the box, but if I submit a second job,
it queues up until the first one finishes:
My submit script:
#SBATCH --partition=NodeSet1
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --gres=gpu:k80:1
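For completeness, a minimal full version of a submit script like the one above might look as follows (the echo of CUDA_VISIBLE_DEVICES is my addition, not part of the original script; Slurm's gres/gpu plugin normally sets that variable to the devices bound to the job, which makes it easy to confirm which GPU each job actually received):

```shell
#!/bin/bash
#SBATCH --partition=NodeSet1
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --gres=gpu:k80:1

# Slurm's gres/gpu plugin exports CUDA_VISIBLE_DEVICES for the devices
# bound to this job; printing it shows which GPU the job was given.
# Outside of a Slurm allocation the variable is unset.
echo "Allocated GPUs: ${CUDA_VISIBLE_DEVICES:-none}"
```

If two such jobs run concurrently on cph-gpu1, each should report a different device index.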
My slurm.conf (the entries I believe are relevant):
GresTypes=gpu
SelectType=select/cons_tres
PartitionName=NodeSet1 Nodes=cht-c[1-4],cph-gpu1 Default=YES MaxTime=INFINITE OverSubscribe=FORCE State=UP
NodeName=cph-gpu1 CPUs=16 Sockets=2 CoresPerSocket=8 ThreadsPerCore=1 RealMemory=257541 Gres=gpu:k80:2 Feature=gpu State=UNKNOWN
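For reference, select/cons_tres is usually paired with an explicit SelectTypeParameters line so that CPUs (and optionally memory) are treated as consumable resources rather than whole nodes. A sketch of how that might look here (CR_Core_Memory is an assumption on my part, not taken from the config above):

```
# slurm.conf — sketch; CR_Core_Memory is an assumed choice, not the
# site's confirmed setting. CR_Core (without memory tracking) is the
# other common option.
SelectType=select/cons_tres
SelectTypeParameters=CR_Core_Memory
```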
My gres.conf:
NodeName=cph-gpu1 Name=gpu Type=k80 File=/dev/nvidia[0-1]
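For reference, the same gres.conf entry can also be written one device per line, which makes it easier to add per-device options later (e.g. Cores= bindings). A sketch of the equivalent form:

```
# gres.conf — equivalent per-device form (sketch); any Cores= values
# added here would be site-specific and are not shown.
NodeName=cph-gpu1 Name=gpu Type=k80 File=/dev/nvidia0
NodeName=cph-gpu1 Name=gpu Type=k80 File=/dev/nvidia1
```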
and finally, the results of squeue:
$ squeue
  JOBID PARTITION   NAME     USER ST  TIME NODES NODELIST(REASON)
    208  NodeSet1 job.sh jmmosley PD  0:00     1 (Resources)
    207  NodeSet1 job.sh jmmosley  R  4:12     1 cph-gpu1
Any idea what I am missing or have misconfigured?
Thanks in advance.
Mike
--
J. Michael Mosley
University Research Computing
The University of North Carolina at Charlotte
9201 University City Blvd
Charlotte, NC 28223
704.687.7065 | jmmosley at uncc.edu