[slurm-users] Calculate the GPU usages

Tina Friedrich tina.friedrich at it.ox.ac.uk
Wed Sep 1 13:33:42 UTC 2021


...or maybe

sacct -p --format=jobid,user,ElapsedRaw,state,AllocGRES,ncpus

i.e. make the output parsable (non-truncated and delimiter-separated),
which is handy if you want or need to do further work with the data.
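
As a rough sketch of that "further work": the function below reads
pipe-delimited 'sacct -p' output on stdin and totals GPU-seconds. It
assumes the column order from the --format list above
(jobid|user|ElapsedRaw|state|AllocGRES|ncpus) and that AllocGRES entries
look like "gpu:2" or "gpu:v100:2"; both are assumptions, so check them
against your own output. Multiplying the elapsed time by the GPU count is
also my assumption about what you want to compare with sreport. The name
sum_gpu_seconds is made up for illustration.

```shell
# Sum GPU-seconds from pipe-delimited "sacct -p" output read on stdin.
# Assumed columns: jobid|user|ElapsedRaw|state|AllocGRES|ncpus|
sum_gpu_seconds() {
    awk -F'|' '
        NR == 1 { next }                      # skip the header line
        {
            n = split($5, gres, ",")          # AllocGRES may list several entries
            for (i = 1; i <= n; i++) {
                if (gres[i] !~ /^gpu/) continue
                m = split(gres[i], parts, ":")
                # assumed shape: the GPU count is the last ":"-separated field
                count = (parts[m] ~ /^[0-9]+$/) ? parts[m] : 1
                sum += $3 * count             # ElapsedRaw (seconds) x GPUs
            }
        }
        END { print sum + 0 }
    '
}
```

It could be fed with something like

   sacct -X -p --account=chemistry --user=j.mira --format=jobid,user,ElapsedRaw,state,AllocGRES,ncpus --starttime=2021-05-01 --endtime=2021-08-31 | sum_gpu_seconds

and the result divided by 60 to get minutes.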

Tina

On 01/09/2021 14:24, Loris Bennett wrote:
> Dear Jeherul,
> 
> Jeherul Islam <jeherul at gmail.com> writes:
> 
>> Dear Loris,
>>
>> Grepping for the user name "j.mira" already removes the multiple counts. Even so, sacct still shows fewer GPU minutes than sreport.
> 
> Yes, you are right, although instead of
> 
>    sacct --account=chemistry --format=jobid,user,ElapsedRaw,state,AllocGRES,ncpus --starttime=2021-05-01 --endtime=2021-08-31  | grep j.mira
> 
> it would be more elegant just to write
> 
>    sacct --account=chemistry --user=j.mira  --format=jobid,user,ElapsedRaw,state,AllocGRES,ncpus --starttime=2021-05-01 --endtime=2021-08-31 --noheader
> 
> However, your problem might be caused by the fact that the default width
> of the 'AllocGRES' field is too small for the values.  This will cause
> the values to be truncated, so your 'grep gpu' might miss some entries.
> You might need something like
> 
>     --format=jobid,user,ElapsedRaw,state,AllocGRES%60,ncpus
> 
> Cheers,
> 
> Loris
> 
> 
>> On Wed, 1 Sep, 2021, 6:03 PM Loris Bennett, <loris.bennett at fu-berlin.de> wrote:
>>
>>   Dear Jeherul,
>>
>>   Jeherul Islam <jeherul at gmail.com> writes:
>>
>>   > Dear Loris,
>>   >
>>   > Thanks for your reply. Here is the output for the same period, but the results still do not match.
>>   >
>>   > #sacct --account=chemistry --format=jobid,user,ElapsedRaw,state,AllocGRES,ncpus --starttime=2021-05-01 --endtime=2021-08-31  | grep j.mira| grep gpu| awk '{sum += $3} END {print sum}'
>>
>>   I think you need the option '-X' for 'sacct'.  This will give you one
>>   line per job rather than including the steps.  Without '-X' you are
>>   counting the usage multiple times for each job.
>>
>>   Cheers,
>>
>>   Loris
>>
>>   > 6835053          (6835053/60 = 113917 )
>>   >
>>   > # sreport cluster AccountUtilizationByUser cluster=**** user=j.mira start=2021-05-01 end=2021-08-31 --tres="gres/gpu"
>>   > --------------------------------------------------------------------------------
>>   > Cluster/Account/User Utilization 2021-05-01T00:00:00 - 2021-08-30T23:59:59 (10540800 secs)
>>   > Usage reported in TRES Minutes
>>   > --------------------------------------------------------------------------------
>>   >   Cluster         Account     Login     Proper Name      TRES Name     Used
>>   > --------- --------------- --------- --------------- -------------- --------
>>   > ********       chemistry    j.mira          j.mira       gres/gpu   149434
>>   >
>>   > On Wed, Sep 1, 2021 at 5:27 PM Loris Bennett <loris.bennett at fu-berlin.de> wrote:
>>   >
>>   >  Dear Jeherul,
>>   >
>>   >  Jeherul Islam <jeherul at gmail.com> writes:
>>   >
>>   >  > Dear All,
>>   >  >
>>   >  > Please share the correct way of calculating GPU usage.
>>   >  > I am confused by the sreport and sacct commands. They give me different results.
>>   >  >
>>   >  > # sreport cluster AccountUtilizationByUser cluster=**** user=j.mira start=2021-05-01 end=2021-08-31 --tres="gres/gpu"
>>   >
>>   >  Here you have:
>>   >
>>   >    end=2021-08-31
>>   >
>>   >  > --------------------------------------------------------------------------------
>>   >  > Cluster/Account/User Utilization 2021-05-01T00:00:00 - 2021-08-30T23:59:59 (10540800 secs)
>>   >  > Usage reported in TRES Minutes
>>   >  > --------------------------------------------------------------------------------
>>   >  >   Cluster         Account     Login     Proper Name      TRES Name     Used
>>   >  > --------- --------------- --------- --------------- -------------- --------
>>   >  > ****       chemistry    j.mira          j.mira       gres/gpu   149434
>>   >  >
>>   >  > # sacct --account=chemistry --format=jobid,user,ElapsedRaw,state,AllocGRES,ncpus --starttime=2021-05-01 --endtime=2021-08-01  | grep j.mira| grep gpu| awk '{sum += $3} END {print sum}'
>>   >
>>   >  whereas here you have
>>   >
>>   >    --endtime=2021-08-01
>>   >
>>   >  > 4957060
>>   >  >
>>   >  > Please share the correct way.
>>   >  >
>>   >  > With Thanks and regards
>>   >
>>   >  so, without having checked your sacct/awk logic I would not expect the results to be the same.
>>   >
>>   >  Cheers,
>>   >
>>   >  Loris
>>   >
>>   >  --
>>   >  Dr. Loris Bennett (Hr./Mr.)
>>   >  ZEDAT, Freie Universität Berlin         Email loris.bennett at fu-berlin.de
>>   --
>>   Dr. Loris Bennett (Hr./Mr.)
>>   ZEDAT, Freie Universität Berlin         Email loris.bennett at fu-berlin.de
>>

-- 
Tina Friedrich, Advanced Research Computing Snr HPC Systems Administrator

Research Computing and Support Services
IT Services, University of Oxford
http://www.arc.ox.ac.uk http://www.it.ox.ac.uk


