[slurm-users] [ext] Re: bufferoverflow in slurmd with acct_gather_energy plugin

Ole Holm Nielsen Ole.H.Nielsen at fysik.dtu.dk
Wed Aug 30 09:37:28 UTC 2023


Hi Magnus,

On 8/30/23 11:17, Hagdorn, Magnus Karl Moritz wrote:
> On Wed, 2023-08-30 at 10:38 +0200, Ole Holm Nielsen wrote:
>> This is a very useful example!  I guess that you have also defined
>> EnergyIPMIUsername and EnergyIPMIPassword in acct_gather.conf?  How
>> is the
>> EnergyIPMIPassword protected from normal users if the
>> /etc/slurm/acct_gather.conf file exists?
>>
> it talks to the BMC via the OS, so no password/user required.

Ah, of course, the slurmd on your nodes can do local IPMI commands :-)

>> An EnergyIPMIFrequency of 10 seconds sounds like it could put a high
>> load
>> on the BMC and the server?
>>
> that might be my problem - I haven't checked that.

That could well be the problem.  In any case, it's better to avoid "OS 
jitter" on HPC compute nodes caused by system tasks running too frequently.

>> I have never tested IPMI DCMI_ENHANCED commands.  Do you have some
>> FreeIPMI commands which can be used to verify the basic IPMI
>> DCMI_ENHANCED
>> functionality?
>>
> I checked the spec sheet of our BMC which suggested that it should be
> able to do DCMI_ENHANCED

That's good to know.  Our servers from Huawei don't seem to support 
DCMI_ENHANCED.

The following ipmitool command works locally on a node, but I can't figure 
out the corresponding command to use with FreeIPMI.

# ipmitool dcmi power reading

     Instantaneous power reading:                   689 Watts
     Minimum during sampling period:                 19 Watts
     Maximum during sampling period:                905 Watts
     Average power reading over sample period:      682 Watts
     IPMI timestamp:                           Wed Aug 30 09:35:28 2023
     Sampling period:                          00000001 Seconds.
     Power reading state is:                   activated
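
In case it helps with checking such readings from a script, here is a minimal Python sketch that pulls the Watt values out of the output above.  The function name parse_dcmi_power_reading and the regular expression are my own assumptions for illustration, based only on the output format shown here; other BMCs or ipmitool versions may print something different.

```python
import re

# Sample text copied from the `ipmitool dcmi power reading` output above.
SAMPLE = """
     Instantaneous power reading:                   689 Watts
     Minimum during sampling period:                 19 Watts
     Maximum during sampling period:                905 Watts
     Average power reading over sample period:      682 Watts
     IPMI timestamp:                           Wed Aug 30 09:35:28 2023
     Sampling period:                          00000001 Seconds.
     Power reading state is:                   activated
"""

# Matches only "<label>: <integer> Watts" lines; the timestamp, sampling
# period and state lines are skipped because they do not end in "Watts".
_WATTS_LINE = re.compile(r"^[ \t]*(.+?):[ \t]+(\d+)[ \t]+Watts[ \t]*$", re.MULTILINE)


def parse_dcmi_power_reading(text):
    """Return a dict mapping each Watt-valued label to its integer reading.

    Hypothetical helper, not part of Slurm, ipmitool or FreeIPMI.
    """
    return {label: int(watts) for label, watts in _WATTS_LINE.findall(text)}


readings = parse_dcmi_power_reading(SAMPLE)
for label, watts in readings.items():
    print(f"{label}: {watts} W")
```

On a live node the same function could presumably be fed the captured output of the ipmitool command instead of the SAMPLE string, but I have only tried it against the text pasted here.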


Best regards,
Ole


