[slurm-users] require info on merging diff core count nodes under single queue or partition

Sudeep Narayan Banerjee snbanerjee at iitgn.ac.in
Mon May 18 14:17:57 UTC 2020


Dear Loris: Many thanks for your response.

I changed State=IDLE to State=UNKNOWN in the NodeName configuration and 
reloaded *slurmctld*, after which two GPU nodes (gpu3 and gpu4) came up 
in DRAIN mode. I have since manually set them back to IDLE.
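
For reference, returning a drained node to service is usually done with 
scontrol; a minimal sketch, assuming gpu3 and gpu4 are the drained nodes 
(RESUME clears the DRAIN flag so the nodes can go back to IDLE):

    # clear the DRAIN state on the two GPU nodes
    scontrol update NodeName=gpu[3-4] State=RESUME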

But how do I change the CoresPerSocket and ThreadsPerCore values on a 
NodeName line?
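
For illustration, a minimal sketch of one way to do this; the topology 
values below are assumptions and should be replaced with whatever each 
node itself reports via "slurmd -C":

    # on the node, print the detected hardware topology in slurm.conf syntax
    slurmd -C

    # slurm.conf: declare the topology explicitly (Sockets/CoresPerSocket
    # values assumed here for the 32-CPU GPU nodes)
    NodeName=gpu[3-4] Sockets=2 CoresPerSocket=16 ThreadsPerCore=1 Gres=gpu:1 State=UNKNOWN

    # changes to node definitions generally require restarting the daemons,
    # not just "scontrol reconfigure"
    systemctl restart slurmctld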


Thanks & Regards,
Sudeep Narayan Banerjee

On 18/05/20 7:29 pm, Loris Bennett wrote:
> Hi Sudeep,
>
> I am not sure if this is the cause of the problem but in your slurm.conf
> you have
>
>    # COMPUTE NODES
>
>    NodeName=node[1-10] Sockets=2 CoresPerSocket=8 ThreadsPerCore=1 Procs=16  RealMemory=60000  State=IDLE
>    NodeName=gpu[1-2] CPUs=16 Gres=gpu:2 State=IDLE
>
>    NodeName=node[11-22] Sockets=2 CoresPerSocket=16 ThreadsPerCore=1 Procs=32 State=IDLE
>    NodeName=node[23-24] Sockets=2 CoresPerSocket=20 ThreadsPerCore=1 Procs=40 State=IDLE
>    NodeName=gpu[3-4] CPUs=32 Gres=gpu:1 State=IDLE
>
> But if you read
>
>    man slurm.conf
>
> you will find the following under the description of the parameter
> "State" for nodes:
>
>    "IDLE" should not be specified in the node configuration, but set the
>    node state to "UNKNOWN" instead.
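>
>    Applied to the first NodeName line above, that would give, for example:
>
>    NodeName=node[1-10] Sockets=2 CoresPerSocket=8 ThreadsPerCore=1 Procs=16 RealMemory=60000 State=UNKNOWN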
>
> Cheers,
>
> Loris
>
>
> Sudeep Narayan Banerjee <snbanerjee at iitgn.ac.in> writes:
>
>> Dear Loris: I am very sorry for addressing you as Support; it has
>> become a bad habit of mine, which I will change. Sincere apologies!
>>
>> Yes, I tried this while adding the hybrid hardware architecture, but
>> when starting slurmctld it reports a core-count mismatch; the existing
>> 32-core nodes go into DOWN/DRAINING mode, while the new 40-core nodes
>> are set to IDLE.
>>
>> Any help/guide to some link will be highly appreciated!
>>
>> Thanks & Regards,
>> Sudeep Narayan Banerjee
>> System Analyst | Scientist B
>> Information System Technology Facility
>> Academic Block 5 | Room 110
>> Indian Institute of Technology Gandhinagar
>> Palaj, Gujarat 382355 INDIA
>> On 18/05/20 6:30 pm, Loris Bennett wrote:
>>
>>   Dear Sudeep,
>>
>> Sudeep Narayan Banerjee <snbanerjee at iitgn.ac.in> writes:
>>
>>   Dear Support,
>>
>>
>> This mailing list is not really the Slurm support list.  It is just the
>> Slurm User Community List, so basically a bunch of people just like you.
>>
>>   node11-22 have 2 sockets x 16 cores each and node23-24 have 2
>> sockets x 20 cores each. In the slurm.conf file (attached), can we
>> merge all the nodes 11-24 (which have different core counts) under a
>> single queue or partition name?
>>
>>
>> Yes, you can have a partition consisting of heterogeneous nodes.  Have
>> you tried this?  Was there a problem?
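>>
>> A minimal sketch of such a partition definition (the partition name
>> "batch" is hypothetical):
>>
>>   PartitionName=batch Nodes=node[11-24] Default=YES MaxTime=INFINITE State=UP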
>>
>> Cheers,
>>
>> Loris
>>