[slurm-users] Updated "pestat" tool for printing Slurm nodes status including GRES/GPU
Loris Bennett
loris.bennett at fu-berlin.de
Mon Dec 13 14:31:15 UTC 2021
Hi Ole,
The new version looks good to me.
Cheers,
Loris
Ole Holm Nielsen <Ole.H.Nielsen at fysik.dtu.dk> writes:
> Hi Loris,
>
> I fixed errors in the hostnamelength calculation and formatting.
> Could you grab the latest pestat and test it?
>
> Thanks,
> Ole
>
> On 12/13/21 13:56, Loris Bennett wrote:
>> Hi Ole,
>>
>> Ole Holm Nielsen <Ole.H.Nielsen at fysik.dtu.dk> writes:
>>
>>> Hi Slurm users,
>>>
>>> I have updated the "pestat" tool for printing Slurm nodes status with 1 line per
>>> node including job info. The download page is
>>> https://github.com/OleHolmNielsen/Slurm_tools/tree/master/pestat
>>> (also listed in https://slurm.schedmd.com/download.html).
>>>
>>> Improvements:
>>>
>>> * The GRES/GPU output option "pestat -G" now prints the job gres/gpu information
>>> as obtained from squeue's tres-alloc output option, which should give the
>>> most accurate GRES/GPU information.
>>>
>>> If you have a cluster with GPUs, could you try out the latest version and send
>>> me any feedback?
>>>
>>> Thanks to René Sitt for helpful suggestions and testing.
>>>
>>> The pestat tool can print a large variety of node and job information, and is
>>> generally useful for monitoring nodes and jobs on Slurm clusters. For command
>>> options and examples please see the download page. My own favorite usage is
>>> "pestat -F".
>>
>> Thanks for the update - the GPU information is a good addition.
>> However, the alignment of the columns with the headers seems a bit off:
>>
>>
>> $ pestat -p gpu -G
>> Print only nodes in partition gpu
>> GRES (Generic Resource) is printed after each jobid
>> Hostname Partition Node Num_CPU CPUload Memsize Freemem GRES/node Joblist
>> State Use/Tot (15min) (MB) (MB) JobID(JobArrayID) User GRES/job ...
>> g001 gpu mix 1 32 0.06* 95200 89990 gpu:gtx1080ti:2(S:0-1) 8692106 joesnow gpu=2
>> g002 gpu mix 6 32 1.70* 95200 71692 gpu:gtx1080ti:2(S:0-1) 8692181(8536946_566) gailhail gpu=1 8692131(8536946_563) gailhail gpu=1
>> g003 gpu mix 1 32 0.06* 95200 87622 gpu:gtx1080ti:2(S:0-1) 8692111 joesnow gpu=2
>> g004 gpu mix 6 32 1.74* 95200 65647 gpu:gtx1080ti:2(S:0-1) 8692124(8536946_562) gailhail gpu=1 8692122(8536946_561) gailhail gpu=1
>>
>>
>> It looks as if the column 'Partition' needs to be four spaces wider.
>>
>> Cheers,
>>
>> Loris
>>
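
To illustrate the alignment point above: one common way to keep headers and
data lined up is to size each column from its longest entry, header included.
The following is only a minimal sketch in Python, not pestat's own code; the
function names column_widths and format_table and the reduced set of columns
are assumptions made for the example.

```python
def column_widths(header, rows):
    """Return, for each column, the length of its longest cell (header included)."""
    widths = [len(h) for h in header]
    for row in rows:
        for i, cell in enumerate(row):
            widths[i] = max(widths[i], len(cell))
    return widths


def format_table(header, rows, gap=2):
    """Left-justify every cell to its column width and join with `gap` spaces."""
    widths = column_widths(header, rows)
    lines = []
    for row in [header] + rows:
        padded = [cell.ljust(w) for cell, w in zip(row, widths)]
        lines.append((" " * gap).join(padded).rstrip())
    return lines


# Sample data taken from the listing above, reduced to four columns.
header = ["Hostname", "Partition", "State", "GRES/node"]
rows = [
    ["g001", "gpu", "mix", "gpu:gtx1080ti:2(S:0-1)"],
    ["g002", "gpu", "mix", "gpu:gtx1080ti:2(S:0-1)"],
]

for line in format_table(header, rows):
    print(line)
```

Because the widths are derived from the data, a long hostname or partition
name widens its own column instead of pushing the later columns out of line.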
--
Dr. Loris Bennett (Herr/Mr)
ZEDAT, Freie Universität Berlin Email loris.bennett at fu-berlin.de