[slurm-users] [External] Scheduling GPUS

Loris Bennett loris.bennett at fu-berlin.de
Tue Nov 12 07:31:11 UTC 2019


Hi,

I don't think the statement below about --nodes=1 is true.  It just
means you want exactly one node and not more than one.  This can be
important when multiple cores are requested, but the program is not,
say, an MPI program and so cannot run across several nodes.
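
As a rough sketch of that case, a threaded (non-MPI) job could keep all
of its cores on a single node like this.  The core count, the use of
OpenMP and the program name are assumptions made up for illustration;
they are not taken from the original submit script.

```shell
#!/bin/bash
#SBATCH --nodes=1            # exactly one node, not the whole node
#SBATCH --ntasks=1           # a single task (process)
#SBATCH --cpus-per-task=4    # assumed core count for that one task

# Slurm sets SLURM_CPUS_PER_TASK when --cpus-per-task is given;
# fall back to 1 if the script is run outside of Slurm.
threads="${SLURM_CPUS_PER_TASK:-1}"

# Assumption: the program is threaded with OpenMP.
export OMP_NUM_THREADS="$threads"
echo "Using $threads thread(s)"

# ./my_threaded_program    # hypothetical program name, replace with your own
```

Without --nodes=1, Slurm would be free to spread the requested cores
over more than one node, which a program like this could not use.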

You can see which cores a running job is using with

  scontrol show job --details <job id>

HTH

Loris

Prentice Bisbal <pbisbal at pppl.gov> writes:

> Remove this line: 
>
> #SBATCH --nodes=1
>
> Slurm assumes you're requesting the whole node. --ntasks=1 should be adequate. 
>
>
> On 11/7/19 4:19 PM, Mike Mosley wrote:
>
>  Greetings all:
>
>  I'm attempting to configure the scheduler to schedule our GPU boxes but have run into a bit of a snag. 
>
>  I have a box with two Tesla K80s.  With my current configuration, the scheduler will schedule one job on the box, but if I submit a second job, it queues up until the first
>  one finishes:
>
>  My submit script:
>
>  #SBATCH --partition=NodeSet1
>
>  #SBATCH --nodes=1
>
>  #SBATCH --ntasks=1
>
>  #SBATCH --gres=gpu:k80:1
>
>  My slurm.conf (the things I think are relevant)
>
>  GresTypes=gpu
>
>  SelectType=select/cons_tres
>
>  PartitionName=NodeSet1 Nodes=cht-c[1-4],cph-gpu1 Default=YES MaxTime=INFINITE OverSubscribe=FORCE State=UP
>
>  NodeName=cph-gpu1 CPUs=16 Sockets=2 CoresPerSocket=8 ThreadsPerCore=1 RealMemory=257541 Gres=gpu:k80:2 Feature=gpu State=UNKNOWN
>
>  My gres.conf:
>
>  NodeName=cph-gpu1 Name=gpu Type=k80 File=/dev/nvidia[0-1]
>
>  and finally, the results of squeue:
>
>  $ squeue
>
>               JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
>
>                 208  NodeSet1   job.sh jmmosley PD       0:00      1 (Resources)
>
>                 207  NodeSet1   job.sh jmmosley  R       4:12      1 cph-gpu1
>
>  Any idea what I am missing or have misconfigured?
>
>  Thanks in advance.
>
>  Mike
>
>  -- 
>
>  J. Michael Mosley
>  University Research Computing
>  The University of North Carolina at Charlotte
>  9201 University City Blvd
>  Charlotte, NC  28223
>  704.687.7065      jmmosley at uncc.edu
>
-- 
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin         Email loris.bennett at fu-berlin.de


