[slurm-users] 4 sockets but "

Ole Holm Nielsen Ole.H.Nielsen at fysik.dtu.dk
Fri Jul 23 06:30:14 UTC 2021


Hi Diego,

On 7/23/21 8:16 AM, Diego Zuccato wrote:
>> The Configless Slurm (https://slurm.schedmd.com/configless_slurm.html) 
>> from 20.02 makes distribution of slurm.conf really simple.
> Eager to see it in Debian :)

IMHO, there ought to be a community effort to provide up-to-date Slurm 
packages for Debian (and Ubuntu), just like a colleague did for the EPEL 
repository for RHEL and derivatives ;-)  We run CentOS and can trivially 
build new RPMs from the Slurm source tarballs.

>> For monitoring the state of compute nodes and their jobs, I recommend 
>> "pestat" from 
>> https://github.com/OleHolmNielsen/Slurm_tools/tree/master/pestat
>> I use "pestat -F" many times every day to see if any jobs are 
>> misbehaving.
> I'll have a look. I'm also setting up Zabbix for more general monitoring 
> but I'm not really OK with it yet (for example I still can't understand 
> how I can exclude some metrics from a host that got 'em added by a 
> template... When I have enough time I'll find a way :) ). Maybe pestat 
> can be added to the Zabbix metrics...

Did you check out what pestat can do (and maybe not do) for you?  If you 
have any suggestions for improving pestat, I'd be glad to see what I can do.

/Ole


