[slurm-users] Accounting CPU time with SLURM sacct

Jean-Michel Barbet jean-michel.barbet at subatech.in2p3.fr
Fri Apr 6 06:56:56 MDT 2018


Hello,

I have access to a SLURM cluster but I am not the administrator. I am
using the sacct command to collect accounting data each night for the
previous day.
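
For reference, the nightly collection can be sketched roughly as below. This
is only an illustration, not my exact script: the function name
sacct_args_for_previous_day and the chosen field list are assumptions made
for the example, and it only builds the argument list without running sacct.

```python
from datetime import date, timedelta


def sacct_args_for_previous_day(today):
    """Build a sacct argument list covering the whole day before `today`.

    Hypothetical helper for illustration; the field list is an assumption.
    """
    yesterday = today - timedelta(days=1)
    return [
        "sacct",
        # Time window: midnight yesterday up to midnight today.
        "--starttime", yesterday.isoformat() + "T00:00:00",
        "--endtime", today.isoformat() + "T00:00:00",
        # Fields needed to compute a CPU efficiency afterwards.
        "--format=JobID,Elapsed,CPUTimeRAW,TotalCPU",
        # Pipe-delimited output without a header line, easier to parse.
        "--parsable2",
        "--noheader",
    ]


if __name__ == "__main__":
    print(" ".join(sacct_args_for_previous_day(date.today())))
```

The resulting list can be handed to subprocess.run() on a host where sacct
is available.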

My problem is that the CPU time does not look quite right. After reading
the man page for sacct, I decided that what I want is the metric
TotalCPU and I am computing the CPU efficiency as the ratio:

TotalCPU (converted to seconds) / CPUTimeRAW (elapsed time x allocated CPUs, in seconds).
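
A minimal sketch of this computation, assuming TotalCPU is printed in the
usual sacct time layout ([DD-[HH:]]MM:SS, possibly with fractional seconds).
The helper names slurm_time_to_seconds and cpu_efficiency are made up for
this example:

```python
import re

# Matches sacct-style durations such as "1-02:03:04", "00:01:23" or "02:03.456".
_TIME_RE = re.compile(
    r"^(?:(?P<days>\d+)-)?"
    r"(?:(?P<hours>\d+):)?"
    r"(?P<minutes>\d+):"
    r"(?P<seconds>\d+(?:\.\d+)?)$"
)


def slurm_time_to_seconds(value):
    """Convert a [DD-[HH:]]MM:SS[.fff] string to a number of seconds."""
    match = _TIME_RE.match(value.strip())
    if match is None:
        raise ValueError("unrecognised time format: %r" % value)
    days = int(match.group("days") or 0)
    hours = int(match.group("hours") or 0)
    minutes = int(match.group("minutes"))
    seconds = float(match.group("seconds"))
    return days * 86400 + hours * 3600 + minutes * 60 + seconds


def cpu_efficiency(total_cpu, cputime_raw):
    """Return TotalCPU / CPUTimeRAW, or None when CPUTimeRAW is not positive."""
    if cputime_raw <= 0:
        return None
    return slurm_time_to_seconds(total_cpu) / cputime_raw


if __name__ == "__main__":
    # A job that used 30 minutes of CPU over 3600 allocated CPU-seconds.
    print(cpu_efficiency("00:30:00", 3600))
```

For a 1-core job, CPUTimeRAW equals the elapsed wall-clock time in seconds,
so a fully CPU-bound job should give a ratio close to 1.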

On a simple test it works OK, but the workload I want to account for
comes from the LHC ALICE experiment and the process tree looks
a bit complicated. I am wondering if this is the reason why
TotalCPU is so low even though I know these are CPU-intensive jobs.

The jobs are simple: 1 CPU core per job.

Here is an example of the process tree:

   |-slurmstepd,40799
   |   |-AliEn-slurm-tmp,40804 /tmp/AliEn-slurm-tmp-416338.sh
   |   |   `-perl,40896 -w -I/cvmfs/alice.cern.ch/x86_64-2.6-gnu-4.1.2/Packages/AliEn/v2-19-395/lib/perl5/site_perl-I/cvm
   |   |       |-perl,395 -w -I/cvmfs/alice.cern.ch/x86_64-2.6-gnu-4.1.2/Packages/AliEn/v2-19-395/lib/perl5/site_perl-I/cvm
   |   |       |   `-perl,36727 -w -I/cvmfs/alice.cern.ch/x86_64-2.6-gnu-4.1.2/Packages/AliEn/v2-19-395/lib/perl5/site_perl-I/cvm
   |   |       |       `-sh,37534 -c...
   |   |       |           `-command,37535 /tmp/ALICE/alien-job-1135310370/command --run 246153 --mode full --uid 267 --nevents 10 --generator PWGLF:Hijing_Rsn003:b ...
   |   |       |               `-dpgsim.sh,37541 /cvmfs/alice.cern.ch/x86_64-2.6-gnu-4.1.2/Packages/AliDPG/v5-08-XX-48/MC/dpgsim.sh --run 246153 --mode full --uid 267 ...
   |   |       |                   `-aliroot,38000 -b -q -x /cvmfs/alice.cern.ch/x86_64-2.6-gnu-4.1.2/Packages/AliDPG/v5-08-XX-48/MC/sim.C
   |   |       `-perl,39475 -w -I/cvmfs/alice.cern.ch/x86_64-2.6-gnu-4.1.2/Packages/AliEn/v2-19-395/lib/perl5/site_perl-I/cvm

=> Has anyone on this list faced a similar issue, or does anyone know
    why the CPU time collected for such jobs would be so low?

BTW: It is not clear to me what MinCPU is supposed to be.

Thank you

JM Barbet

-- 
------------------------------------------------------------------------
Jean-michel BARBET                    | Tel: +33 (0)2 51 85 84 86
Laboratoire SUBATECH Nantes France    | Fax: +33 (0)2 51 85 84 79
CNRS-IN2P3/IMT-Atlantique/Univ.Nantes | E-Mail: barbet at subatech.in2p3.fr
------------------------------------------------------------------------


