[slurm-users] Use all cores with HT node

Sidiney Crescencio sidiney.crescencio at clustervision.com
Fri Dec 7 07:34:23 MST 2018


I've found the problem: in my case I had set DefMemPerCPU too high, so when
I requested 80 CPUs, for instance, there was not enough memory on the node
to cover the request.
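
In case it helps anyone else: the node reports RealMemory=380000, so a job
asking for all 80 CPUs needs 80 x DefMemPerCPU to fit in memory, i.e.
DefMemPerCPU can be at most 380000 / 80 = 4750 MB. Something along these
lines in slurm.conf (the exact value here is just an illustration):

DefMemPerCPU=4000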

It seems to be working fine now; I'm still testing.

Thanks, though.



On Fri, 7 Dec 2018 at 15:04, Jeffrey Frey <frey at udel.edu> wrote:

> I ran into this myself.  By default Slurm allocates hyperthreads in pairs
> (both threads belonging to a single core).  The only adequate way I found
> to force HT = core is to declare them as full-fledged cores in the config:
>
>
> NodeName=csk007 CPUs=80 Boards=1 SocketsPerBoard=2 CoresPerSocket=40
> ThreadsPerCore=1 RealMemory=385630 TmpDisk=217043
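>
> (Note that after changing a node definition like this, the updated
> slurm.conf has to be pushed out and the daemons restarted, e.g.:
>
> systemctl restart slurmctld    # on the controller
> systemctl restart slurmd       # on the affected node(s)
>
> "scontrol reconfigure" alone may not pick up node-definition changes.)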
>
>
>
> and then I make sure those nodes have an "HT" feature on them to remind me
> they're configured with HT enabled -- it also lets users request nodes with
> or without the "HT" feature.
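>
> For example (a hypothetical snippet; adapt to your own slurm.conf):
>
> NodeName=csk007 CPUs=80 Boards=1 SocketsPerBoard=2 CoresPerSocket=40 \
>    ThreadsPerCore=1 RealMemory=385630 TmpDisk=217043 Feature=HT
>
> which users can then select with:
>
> srun --constraint=HT ...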
>
>
>
>
>
> On Dec 7, 2018, at 6:12 AM, Sidiney Crescencio <
> sidiney.crescencio at clustervision.com> wrote:
>
> Hello All,
>
> I'm facing some issues using HT on my compute nodes; I'm running Slurm
> 17.02.7.
>
> SelectTypeParameters    = CR_CORE_MEMORY
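>
> (For context: CR_Core_Memory is a parameter of the consumable-resource
> select plugin, so the relevant slurm.conf lines are, assuming the usual
> cons_res setup:
>
> SelectType=select/cons_res
> SelectTypeParameters=CR_Core_Memory
>
> meaning cores and memory are the consumable resources.)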
>
> cgroup.conf
>
> CgroupAutomount=yes
> CgroupReleaseAgentDir="/etc/slurm/cgroup"
>
> # cpuset subsystem
> ConstrainCores=yes
> TaskAffinity=no
>
> # memory subsystem
> ConstrainRAMSpace=yes
> ConstrainSwapSpace=yes
>
> # device subsystem
> ConstrainDevices=yes
>
> If I try to allocate all 80 CPUs it will not work, and I couldn't find out
> why. Do you have any ideas about what could cause this issue? I've been
> playing with several different parameters in the node definition, and also
> with --threads-per-core, etc., but I still can't allocate the 80 CPUs.
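>
> For example, variations along these lines (illustrative; the exact flags I
> tried varied):
>
> srun --reservation=test_ht -p defq -n 40 -c 2 sleep 100   # 40 tasks x 2 threads
> srun --reservation=test_ht -p defq -n 80 --ntasks-per-core=2 sleep 100
> srun --reservation=test_ht -p defq -n 80 --hint=multithread sleep 100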
>
> Thanks in advance.
>
> srun --reservation=test_ht -p defq -n 80 sleep 100
> srun: error: Unable to allocate resources: Requested node configuration is
> not available
>
> --------------
>
> [root@csk007 ~]# slurmd -C
> NodeName=csk007 CPUs=80 Boards=1 SocketsPerBoard=2 CoresPerSocket=20
> ThreadsPerCore=2 RealMemory=385630 TmpDisk=217043
> UpTime=84-00:36:44
> [root@csk007 ~]# scontrol show node csk007
> NodeName=csk007 Arch=x86_64 CoresPerSocket=20
>    CPUAlloc=0 CPUErr=0 CPUTot=80 CPULoad=4.03
>    AvailableFeatures=(null)
>    ActiveFeatures=(null)
>    Gres=(null)
>    NodeAddr=csk007 NodeHostName=csk007 Version=17.02
>    OS=Linux RealMemory=380000 AllocMem=0 FreeMem=338487 Sockets=2 Boards=1
>    State=RESERVED ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A
> MCS_label=N/A
>    Partitions=defq
>    BootTime=2018-09-14T12:31:05 SlurmdStartTime=2018-11-29T15:25:03
>    CfgTRES=cpu=80,mem=380000M
>    AllocTRES=
>    CapWatts=n/a
>    CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
>    ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
>
> -----------------------
>
> --
> Best Regards,
> Sidiney
>
>
>
> ::::::::::::::::::::::::::::::::::::::::::::::::::::::
> Jeffrey T. Frey, Ph.D.
> Systems Programmer V / HPC Management
> Network & Systems Services / College of Engineering
> University of Delaware, Newark DE  19716
> Office: (302) 831-6034  Mobile: (302) 419-4976
> ::::::::::::::::::::::::::::::::::::::::::::::::::::::
>
>
>
>
>

-- 
Best Regards,

Sidiney Crescencio
Technical Support Engineer


Direct: +31 20 407 7550
Skype: sidiney.crescencio_1
sidiney.crescencio at clustervision.com

ClusterVision BV
Gyroscoopweg 56
1042 AC Amsterdam
The Netherlands
Tel: +31 20 407 7550
Fax: +31 84 759 8389
www.clustervision.com