[slurm-users] Use all cores with HT node
Sidiney Crescencio
sidiney.crescencio at clustervision.com
Fri Dec 7 07:34:23 MST 2018
I've found the problem: in my case I had set too high a value for
DefMemPerCPU, so when I requested 80 CPUs, for instance, there was not
enough memory on the node to satisfy the allocation.
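For reference, the arithmetic (with CR_CORE_MEMORY, Slurm reserves
DefMemPerCPU MB for every allocated CPU, so with the numbers from csk007
below):

    80 CPUs x DefMemPerCPU must be <= RealMemory = 380000 MB
    => DefMemPerCPU <= 380000 / 80 = 4750 MB

Anything above that makes an 80-CPU request unsatisfiable, and srun fails
with "Requested node configuration is not available". A slurm.conf sketch
with an example value (not our production setting):

DefMemPerCPU=4000    # 80 x 4000 MB = 320000 MB, fits in 380000 MB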
It seems to be working fine now; I'm still testing.
Thanks, though.
On Fri, 7 Dec 2018 at 15:04, Jeffrey Frey <frey at udel.edu> wrote:
> I ran into this myself. By default Slurm allocates hardware threads in
> pairs (the two threads of a single core). The only reliable way I found
> to schedule each hardware thread as its own core is to declare them as
> full-fledged cores in the config:
>
>
> NodeName=csk007 CPUs=80 Boards=1 SocketsPerBoard=2 CoresPerSocket=40
> ThreadsPerCore=1 RealMemory=385630 TmpDisk=217043
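>
> (One caveat worth double-checking against your version: node definition
> changes like this are generally not picked up by "scontrol reconfigure",
> so I restart the daemons after editing slurm.conf, e.g.:
>
> systemctl restart slurmctld    # on the controller
> systemctl restart slurmd       # on the node, csk007 here
> )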
>
> and then I make sure those nodes carry an "HT" feature to remind me that
> they're configured with HT enabled -- which also lets users request nodes
> with or without the "HT" feature.
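>
> Roughly, in slurm.conf (illustrative -- adjust to your naming scheme):
>
> NodeName=csk007 CPUs=80 ... ThreadsPerCore=1 Feature=HT
>
> so users can then request those nodes with e.g.:
>
> srun --constraint=HT -n 80 ...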
>
> On Dec 7, 2018, at 6:12 AM, Sidiney Crescencio <
> sidiney.crescencio at clustervision.com> wrote:
>
> Hello All,
>
> I'm facing some issues using HT on my compute nodes. I'm running
> Slurm 17.02.7.
>
> SelectTypeParameters = CR_CORE_MEMORY
>
> cgroup.conf
>
> CgroupAutomount=yes
> CgroupReleaseAgentDir="/etc/slurm/cgroup"
>
> # cpuset subsystem
> ConstrainCores=yes
> TaskAffinity=no
>
> # memory subsystem
> ConstrainRAMSpace=yes
> ConstrainSwapSpace=yes
>
> # device subsystem
> ConstrainDevices=yes
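>
> (With ConstrainCores=yes the job should be confined to its allocated
> cores via the cpuset cgroup; as a sanity check I look at something like
> this on the node -- the exact path may differ per setup:
>
> cat /sys/fs/cgroup/cpuset/slurm/uid_*/job_*/cpuset.cpus
> )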
>
> If I try to allocate all 80 CPUs it does not work, and I couldn't find
> out why. Do you have any idea what could be causing this? I've been
> playing with several different parameters in the node definition, and
> also with --threads-per-core, etc., but I still can't allocate the
> 80 CPUs.
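>
> The sort of variants I've tried (among others):
>
> srun --reservation=test_ht -p defq -n 80 --threads-per-core=2 sleep 100
> srun --reservation=test_ht -p defq -n 80 --ntasks-per-core=2 sleep 100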
>
> Thanks in advance.
>
> srun --reservation=test_ht -p defq -n 80 sleep 100
> srun: error: Unable to allocate resources: Requested node configuration is
> not available
>
> --------------
>
> [root at csk007 ~]# slurmd -C
> NodeName=csk007 CPUs=80 Boards=1 SocketsPerBoard=2 CoresPerSocket=20
> ThreadsPerCore=2 RealMemory=385630 TmpDisk=217043
> UpTime=84-00:36:44
> [root at csk007 ~]# scontrol show node csk007
> NodeName=csk007 Arch=x86_64 CoresPerSocket=20
> CPUAlloc=0 CPUErr=0 CPUTot=80 CPULoad=4.03
> AvailableFeatures=(null)
> ActiveFeatures=(null)
> Gres=(null)
> NodeAddr=csk007 NodeHostName=csk007 Version=17.02
> OS=Linux RealMemory=380000 AllocMem=0 FreeMem=338487 Sockets=2 Boards=1
> State=RESERVED ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A
> MCS_label=N/A
> Partitions=defq
> BootTime=2018-09-14T12:31:05 SlurmdStartTime=2018-11-29T15:25:03
> CfgTRES=cpu=80,mem=380000M
> AllocTRES=
> CapWatts=n/a
> CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
> ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
>
> -----------------------
>
> --
> Best Regards,
> Sidiney
>
>
>
> ::::::::::::::::::::::::::::::::::::::::::::::::::::::
> Jeffrey T. Frey, Ph.D.
> Systems Programmer V / HPC Management
> Network & Systems Services / College of Engineering
> University of Delaware, Newark DE 19716
> Office: (302) 831-6034 Mobile: (302) 419-4976
> ::::::::::::::::::::::::::::::::::::::::::::::::::::::
>
--
Best Regards,
Sidiney Crescencio
Technical Support Engineer
Direct: +31 20 407 7550
Skype: sidiney.crescencio_1
sidiney.crescencio at clustervision.com
ClusterVision BV
Gyroscoopweg 56
1042 AC Amsterdam
The Netherlands
Tel: +31 20 407 7550
Fax: +31 84 759 8389
www.clustervision.com