[slurm-users] CPUSpecList confusion

Paul Raines raines at nmr.mgh.harvard.edu
Tue Dec 13 14:45:23 UTC 2022


Hmm.  Actually, this looks like confusion between the CPU IDs on the system
and what SLURM thinks the IDs are:

# scontrol -d show job 8
...
      Nodes=foobar CPU_IDs=14-21 Mem=25600 GRES=
...

# cat /sys/fs/cgroup/system.slice/slurmstepd.scope/job_8/cpuset.cpus.effective
7-10,39-42
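
A minimal sketch of how the two numberings could line up. It assumes
(this is not confirmed anywhere in the thread) that Slurm's abstract IDs
number the two threads of each core consecutively, while the kernel
numbers thread 0 of every core first (0-31) and thread 1 of every core
second (32-63). It also assumes job 8 ran with the CpuSpecList=0-13
setting from the quoted message. The helper names are hypothetical.
The actual sibling layout can be checked with `lscpu -e` or
/sys/devices/system/cpu/cpu7/topology/thread_siblings_list.

```python
# Topology taken from the quoted slurm.conf node definition.
SOCKETS = 2
CORES_PER_SOCKET = 16
THREADS_PER_CORE = 2
TOTAL_CORES = SOCKETS * CORES_PER_SOCKET  # 32 physical cores


def abstract_to_physical(abstract_id):
    """Map a Slurm abstract CPU ID to a Linux CPU number.

    ASSUMPTION: abstract IDs enumerate the threads of a core back to
    back, and the kernel places thread 1 of core c at c + TOTAL_CORES.
    """
    core, thread = divmod(abstract_id, THREADS_PER_CORE)
    return core + thread * TOTAL_CORES


def expand(spec):
    """Expand a list such as '7-10,39-42' into a sorted list of ints."""
    ids = set()
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        ids.update(range(int(lo), int(hi or lo) + 1))
    return sorted(ids)


# CPU_IDs reported by "scontrol -d show job 8"
job_physical = sorted(abstract_to_physical(i) for i in expand("14-21"))
# CpuSpecList=0-13, read as abstract IDs
spec_physical = sorted(abstract_to_physical(i) for i in expand("0-13"))

# Does the mapped job allocation match cpuset.cpus.effective?
print(job_physical == expand("7-10,39-42"))
# Does the job overlap the specialized CPUs after mapping?
print(set(job_physical) & set(spec_physical))
```

Under those assumptions, CPU_IDs=14-21 maps to exactly 7-10,39-42, and
CpuSpecList=0-13 maps to 0-6,32-38, which the job does not touch. The
numbers in CpuSpecList would then be abstract IDs rather than the IDs
the kernel reports.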


-- Paul Raines (http://help.nmr.mgh.harvard.edu)



On Tue, 13 Dec 2022 9:40am, Paul Raines wrote:

>
> Oh but that does explain the CfgTRES=cpu=14.  With the CpuSpecList
> below and SlurmdOffSpec I do get CfgTRES=cpu=50 so that makes sense.
>
> The issue remains that, though the number of CPUs in CpuSpecList
> is taken into account, the exact IDs seem to be ignored.
>
>
> -- Paul Raines (http://help.nmr.mgh.harvard.edu)
>
>
>
> On Tue, 13 Dec 2022 9:34am, Paul Raines wrote:
>
>>
>>  I have tried it both ways with the same result.  The assigned CPUs
>>  will be both in and out of the range given to CpuSpecList.
>>
>>  I tried using commas instead of ranges, so I used
>>
>>  CpuSpecList=0,1,2,3,4,5,6,7,8,9,10,11,12,13
>>
>>  But it still does not work:
>>
>>  $ srun -p basic -N 1 --ntasks-per-node=1 --mem=25G \
>>  --time=10:00:00 --cpus-per-task=8 --pty /bin/bash
>>  $ grep -i ^cpu /proc/self/status
>>  Cpus_allowed:   00000780,00000780
>>  Cpus_allowed_list:      7-10,39-42
>> 
>>
>>  -- Paul Raines (http://help.nmr.mgh.harvard.edu)
>> 
>> 
>>
>>  On Mon, 12 Dec 2022 10:21am, Sean Maxwell wrote:
>>
>>>   Hi Paul,
>>>
>>>>   Nodename=foobar \
>>>>      CPUs=64 Boards=1 SocketsPerBoard=2 CoresPerSocket=16 ThreadsPerCore=2 \
>>>>      RealMemory=256312 MemSpecLimit=32768 CpuSpecList=14-63 \
>>>>      TmpDisk=6000000 Gres=gpu:nvidia_rtx_a6000:1
>>>>
>>>>   The slurm.conf also has:
>>>>
>>>>   ProctrackType=proctrack/cgroup
>>>>   TaskPlugin=task/affinity,task/cgroup
>>>>   TaskPluginParam=Cores,SlurmdOffSpec,Verbose
>>>> 
>>>
>>>   Doesn't setting SlurmdOffSpec tell Slurmd that it should NOT use the
>>>   CPUs in the spec list?
>>>   (https://slurm.schedmd.com/slurm.conf.html#OPT_SlurmdOffSpec)
>>>   In this case, I believe it uses what is left, which is 0-13. We are
>>>   just starting to work on this ourselves, and were looking at this
>>>   setting.
>>>
>>>   Best,
>>>
>>>   -Sean
>>> 
>> 
>