[slurm-users] Updated "pestat" tool for printing Slurm nodes status including GRES/GPU

Ole Holm Nielsen Ole.H.Nielsen at fysik.dtu.dk
Mon Dec 13 14:16:36 UTC 2021


Hi Loris,

I fixed errors in the hostnamelength calculation and formatting.
Could you grab the latest pestat and test it?
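
For illustration only, here is a minimal Python sketch of the general idea
behind this kind of fix. It is not the actual pestat code (pestat is a shell
script), and the function name format_table, the min_host_width parameter and
the reduced three-column layout are all hypothetical. The sketch sizes the
hostname column from the longest hostname and builds the header and the data
rows from one shared format string, so the columns stay aligned:

```python
# Hypothetical illustration; pestat itself is a shell script and may
# compute its column widths differently.
def format_table(rows, min_host_width=8):
    """Format (hostname, partition, state) rows under an aligned header."""
    # Size each column from its longest entry, with a floor for the header.
    host_width = max([min_host_width] + [len(r[0]) for r in rows])
    part_width = max([len("Partition")] + [len(r[1]) for r in rows])
    # One format string serves both header and data rows, so the two
    # cannot drift apart when a width changes.
    fmt = f"{{:<{host_width}}} {{:<{part_width}}} {{:>5}}"
    lines = [fmt.format("Hostname", "Partition", "State")]
    lines += [fmt.format(*r) for r in rows]
    return lines


if __name__ == "__main__":
    for line in format_table([("g001", "gpu", "mix"),
                              ("gpu-node-0001", "gpu", "idle")]):
        print(line)
```

If the header were printed with a fixed width while the rows used the computed
width, the 'Partition' heading would shift relative to its values, which is
similar to the symptom reported below.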

Thanks,
Ole

On 12/13/21 13:56, Loris Bennett wrote:
> Hi Ole,
> 
> Ole Holm Nielsen <Ole.H.Nielsen at fysik.dtu.dk> writes:
> 
>> Hi Slurm users,
>>
>> I have updated the "pestat" tool for printing Slurm nodes status with 1 line per
>> node including job info.  The download page is
>> https://github.com/OleHolmNielsen/Slurm_tools/tree/master/pestat
>> (also listed in https://slurm.schedmd.com/download.html).
>>
>> Improvements:
>>
>> * The GRES/GPU output option "pestat -G" now prints the job gres/gpu information
>> as obtained from squeue's tres-alloc output option, which should contain the
>> most correct GRES/GPU information.
>>
>> If you have a cluster with GPUs, could you try out the latest version and send
>> me any feedback?
>>
>> Thanks to René Sitt for helpful suggestions and testing.
>>
>> The pestat tool can print a large variety of node and job information, and is
>> generally useful for monitoring nodes and jobs on Slurm clusters.  For command
>> options and examples please see the download page.  My own favorite usage is
>> "pestat -F".
> 
> Thanks for the update - the GPU information is a good addition.
> However, the alignment of the columns with the headers seems a bit off:
> 
> 
> $ pestat -p gpu -G
> Print only nodes in partition gpu
> GRES (Generic Resource) is printed after each jobid
> Hostname       Partition     Node Num_CPU  CPUload  Memsize  Freemem  GRES/node              Joblist
>                      State Use/Tot  (15min)     (MB)     (MB)                         JobID(JobArrayID) User GRES/job ...
> g001             gpu      mix   1  32    0.06*    95200    89990  gpu:gtx1080ti:2(S:0-1) 8692106 joesnow gpu=2
> g002             gpu      mix   6  32    1.70*    95200    71692  gpu:gtx1080ti:2(S:0-1) 8692181(8536946_566) gailhail gpu=1 8692131(8536946_563) gailhail gpu=1
> g003             gpu      mix   1  32    0.06*    95200    87622  gpu:gtx1080ti:2(S:0-1) 8692111 joesnow gpu=2
> g004             gpu      mix   6  32    1.74*    95200    65647  gpu:gtx1080ti:2(S:0-1) 8692124(8536946_562) gailhail gpu=1 8692122(8536946_561) gailhail gpu=1
> 
> 
> It looks as if the column 'Partition' needs to be four spaces wider.
> 
> Cheers,
> 
> Loris
> 

-- 
Ole Holm Nielsen
PhD, Senior HPC Officer
Department of Physics, Technical University of Denmark,
Fysikvej Building 309, DK-2800 Kongens Lyngby, Denmark
E-mail: Ole.H.Nielsen at fysik.dtu.dk
Homepage: http://dcwww.fysik.dtu.dk/~ohnielse/
Mobile: (+45) 5180 1620


