[slurm-users] [ext] Re: bufferoverflow in slurmd with acct_gather_energy plugin
Ole Holm Nielsen
Ole.H.Nielsen at fysik.dtu.dk
Wed Aug 30 09:37:28 UTC 2023
On 8/30/23 11:17, Hagdorn, Magnus Karl Moritz wrote:
> On Wed, 2023-08-30 at 10:38 +0200, Ole Holm Nielsen wrote:
>> This is a very useful example! I guess that you have also defined
>> EnergyIPMIUsername and EnergyIPMIPassword in acct_gather.conf? How
>> is the
>> EnergyIPMIPassword protected from normal users if the
>> /etc/slurm/acct_gather.conf file exists?
> it talks to the BMC via the OS, so no password/user required.
Ah, of course, the slurmd on your nodes can do local IPMI commands :-)
>> An EnergyIPMIFrequency of 10 seconds sounds like it could put a high
>> load on the BMC and the server?
> that might be my problem - I haven't checked that.
This could be a problem. In any case, it's better to avoid "OS jitter" on
HPC compute nodes caused by system tasks executing too frequently.
>> I have never tested IPMI DCMI_ENHANCED commands. Do you have some
>> FreeIPMI commands which can be used to verify the basic IPMI
> I checked the spec sheet of our BMC which suggested that it should be
> able to do DCMI_ENHANCED
That's good to know. Our servers from Huawei don't seem to support
The following ipmitool command works locally on a node, but I can't figure
out the corresponding command to use with FreeIPMI.
# ipmitool dcmi power reading
Instantaneous power reading: 689 Watts
Minimum during sampling period: 19 Watts
Maximum during sampling period: 905 Watts
Average power reading over sample period: 682 Watts
IPMI timestamp: Wed Aug 30 09:35:28 2023
Sampling period: 00000001 Seconds.
Power reading state is: activated
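
For a quick sanity check of such readings outside of Slurm, the output above could be parsed with a few lines of Python. This is only a minimal sketch: the function name parse_dcmi_power_reading and the dictionary keys are made up for illustration, and the regular expressions assume the exact output format shown above, which may differ between ipmitool versions.

```python
import re

# Sample text copied from the ipmitool output shown above.
SAMPLE = """\
Instantaneous power reading: 689 Watts
Minimum during sampling period: 19 Watts
Maximum during sampling period: 905 Watts
Average power reading over sample period: 682 Watts
IPMI timestamp: Wed Aug 30 09:35:28 2023
Sampling period: 00000001 Seconds.
Power reading state is: activated
"""

# Hypothetical key names mapped to the labels printed by ipmitool.
WATT_FIELDS = {
    "instantaneous_watts": "Instantaneous power reading",
    "minimum_watts": "Minimum during sampling period",
    "maximum_watts": "Maximum during sampling period",
    "average_watts": "Average power reading over sample period",
}


def parse_dcmi_power_reading(text):
    """Extract the Watt values and the reading state from the given text."""
    result = {}
    for key, label in WATT_FIELDS.items():
        match = re.search(re.escape(label) + r":\s*(\d+)\s*Watts", text)
        if match:
            result[key] = int(match.group(1))
    state = re.search(r"Power reading state is:\s*(\S+)", text)
    if state:
        result["state"] = state.group(1)
    return result


reading = parse_dcmi_power_reading(SAMPLE)
print(reading)
```

On a node, the same function could be fed the text returned by subprocess.run(["ipmitool", "dcmi", "power", "reading"], capture_output=True, text=True).stdout instead of the sample string, assuming ipmitool is installed and the caller has access to the local IPMI device.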