[slurm-users] NumNodes=1 NumCPUs=2 NumTasks=1 CPUs/Task=1 ???

Fri Feb 8 17:37:34 UTC 2019

Xiang,

From what I've read of the original question, gres.conf may be another place to verify that only one core is being allocated per GPU request: https://slurm.schedmd.com/gres.conf.html

Seeing the job submission line and gres.conf might help others give you further advice.

To Jeffrey's email: the concept of oversubscription may be beneficial versus changing resource inventories: https://slurm.schedmd.com/cons_res_share.html

Best,

Cyrus

On 2/8/19 9:44 AM, Jeffrey Frey wrote:
Documentation for CR_CPU:

CR_CPU
CPUs are consumable resources. Configure the number of CPUs on each node, which may be equal to the count of cores or hyper-threads on the node depending upon the desired minimum resource allocation. The node's Boards, Sockets, CoresPerSocket and ThreadsPerCore may optionally be configured and result in job allocations which have improved locality; however doing so will prevent more than one job from being allocated on each core.

So once you've configured node(s) with ThreadsPerCore=N, the cons_res plugin still forces tasks to span all threads on a core.  Elsewhere in the documentation it is stated:

Note that the Slurm can allocate resources to jobs down to the resolution of a core.

So you MUST treat a thread as a core if you want to schedule individual threads.  I can confirm this using the config:

SelectTypeParameters = CR_CPU_MEMORY
NodeName=n[003,008] CPUS=16 Sockets=2 CoresPerSocket=4 ThreadsPerCore=2

Submitting a 1-cpu job, if I check the cpuset assigned to a job on n003:

$ cat /sys/fs/cgroup/cpuset/slurm/{uid}/{job}/cpuset.cpus
4,12

If I instead configure as:

SelectTypeParameters = CR_Core_Memory
NodeName=n[003,008] CPUS=16 Sockets=2 CoresPerSocket=8 ThreadsPerCore=1

Slurm will schedule "cores" 0-15 to jobs, which the cpuset cgroup happily accepts.  A 1-cpu job then shows:

$ cat /sys/fs/cgroup/cpuset/slurm/{uid}/{job}/cpuset.cpus
2

and a 2-cpu job shows:

$ cat /sys/fs/cgroup/cpuset/slurm/{uid}/{job}/cpuset.cpus
4,12
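For anyone who wants to check these cpuset files from a script, here is a minimal sketch of a parser for the comma-and-range list format that cpuset.cpus uses. The function name parse_cpulist is only illustrative, and the sketch assumes plain "a" and "a-b" entries (it does not handle stride syntax):

```python
def parse_cpulist(text):
    """Expand a cpuset-style list such as "0-3,8" into a list of CPU ids.

    Assumption: entries are single ids or "lo-hi" ranges only.
    """
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # empty file or trailing comma
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


# The values shown above: a job holding "4,12" has two logical CPUs,
# a job holding "2" has one.
print(len(parse_cpulist("4,12")))  # 2
print(len(parse_cpulist("2")))     # 1
```

Counting the entries this way makes it easy to compare what the cgroup actually holds against the NumCPUs value reported by scontrol.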

On Feb 8, 2019, at 5:09 AM, Antony Cleave <antony.cleave at gmail.com> wrote:

If you want Slurm to just ignore the difference between physical and logical cores, you can change
SelectTypeParameters=CR_Core
to
SelectTypeParameters=CR_CPU

Slurm will then treat threads as CPUs and let you start the number of tasks you expect.

Antony

On Thu, 7 Feb 2019 at 18:04, Jeffrey Frey <frey at udel.edu> wrote:
Your nodes are hyperthreaded (ThreadsPerCore=2).  Slurm always allocates _all threads_ associated with a selected core to jobs.  So you're being assigned both threads on core N.
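The arithmetic behind NumCPUs=2 can be sketched as follows. This is a simplified model of whole-core allocation, not Slurm's actual code: the function names are made up for illustration, and it ignores memory, GPUs and every other limit.

```python
import math


def allocated_cpus(requested_cpus, threads_per_core):
    """Logical CPUs a job ends up with when whole cores are the allocation unit."""
    cores = math.ceil(requested_cpus / threads_per_core)
    return cores * threads_per_core


def max_concurrent_jobs(total_cpus, requested_cpus, threads_per_core):
    """How many identical jobs fit on one node under the same simplified model."""
    return total_cpus // allocated_cpus(requested_cpus, threads_per_core)


# A 1-CPU request on a node with ThreadsPerCore=2 is rounded up to one full core.
print(allocated_cpus(1, 2))           # 2
# With 16 logical CPUs, only 8 such jobs fit at once.
print(max_concurrent_jobs(16, 1, 2))  # 8
```

Under this model, the 8-job ceiling reported in the original question is what you would expect from a 16-CPU node with two threads per core.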

On our development-partition nodes we configure the threads as cores, e.g.

NodeName=moria CPUs=16 Boards=1 SocketsPerBoard=2 CoresPerSocket=8 ThreadsPerCore=1

to force Slurm to schedule the threads separately.
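As a rough sketch of how that node line relates to the real topology: multiply CoresPerSocket by the hardware ThreadsPerCore and declare ThreadsPerCore=1. The helper names below are hypothetical and the code only formats text; it does not touch Slurm.

```python
def threads_as_cores(sockets, cores_per_socket, threads_per_core, boards=1):
    """Describe every hardware thread to Slurm as if it were its own core."""
    return {
        "CPUs": boards * sockets * cores_per_socket * threads_per_core,
        "Boards": boards,
        "SocketsPerBoard": sockets,
        "CoresPerSocket": cores_per_socket * threads_per_core,
        "ThreadsPerCore": 1,
    }


def node_line(name, topology):
    """Format the topology as a slurm.conf NodeName line."""
    fields = " ".join(f"{key}={value}" for key, value in topology.items())
    return f"NodeName={name} {fields}"


# Real hardware: 2 sockets x 4 cores x 2 threads.
print(node_line("moria", threads_as_cores(2, 4, 2)))
# NodeName=moria CPUs=16 Boards=1 SocketsPerBoard=2 CoresPerSocket=8 ThreadsPerCore=1
```

The total CPU count stays at 16; only the way it is split between cores and threads changes.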

On Feb 7, 2019, at 12:10 PM, Xiang Gao <qasdfgtyuiop at gmail.com> wrote:

Hi All,

We configured Slurm on a server with 8 GPUs and 16 CPUs and want to use it to schedule both CPU and GPU jobs. We observed unexpected behavior: although there are 16 CPUs, Slurm only schedules 8 jobs to run, even when there are jobs that do not request any GPU. If I inspect the details using `scontrol show job`, I see something strange for a job that asks for just 1 CPU:

NumNodes=1 NumCPUs=2 NumTasks=1 CPUs/Task=1

If I understand these concepts correctly, since the number of nodes is 1, the number of tasks is 1, and the number of CPUs per task is 1, there should in principle be no way for the final number of CPUs to be 2. I'm not sure whether I misunderstand the concepts, have configured Slurm incorrectly, or this is a bug, so I'm asking for help.

Some of the related configuration:

# COMPUTE NODES
NodeName=moria CPUs=16 Boards=1 SocketsPerBoard=2 CoresPerSocket=4 ThreadsPerCore=2 RealMemory=120000 Gres=gpu:gtx1080ti:2,gpu:titanv:3,gpu:v100:1,gpu:gp100:2 State=UNKNOWN
PartitionName=queue Nodes=moria Default=YES MaxTime=INFINITE State=UP

# SCHEDULING
FastSchedule=1
SchedulerType=sched/backfill
GresTypes=gpu
SelectType=select/cons_res
SelectTypeParameters=CR_Core

Best,
Xiang Gao

::::::::::::::::::::::::::::::::::::::::::::::::::::::
Jeffrey T. Frey, Ph.D.
Systems Programmer V / HPC Management
Network & Systems Services / College of Engineering
University of Delaware, Newark DE  19716
Office: (302) 831-6034  Mobile: (302) 419-4976
::::::::::::::::::::::::::::::::::::::::::::::::::::::
