[slurm-users] jobacct_gather/linux vs jobacct_gather/cgroup

Juergen Salk juergen.salk at uni-ulm.de
Tue Oct 22 17:49:35 UTC 2019


Dear Chris,

I could not find this warning in the slurm.conf man page, so I googled
it and found a reference in the Slurm developer documentation:

https://slurm.schedmd.com/jobacct_gatherplugins.html

However, the footer of that web page says "Last modified 27 March 2015",
so maybe (read: hopefully) this caveat is outdated by now.

I also have JobAcctGatherType=jobacct_gather/cgroup in my slurm.conf,
but for no deeper reason than that we also use cgroups for
process tracking (i.e. ProctrackType=proctrack/cgroup) and to limit
the resources used by users. So it just felt more consistent to me to
use cgroups for the jobacct_gather plugin as well, even though SchedMD
recommends jobacct_gather/linux (according to the slurm.conf man page).
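
For illustration, the two settings mentioned above would appear in
slurm.conf roughly as follows. This is only a sketch: all other
settings are omitted, and the commented-out line merely shows the
alternative that the man page recommends.

```
# slurm.conf (excerpt, illustrative only)
ProctrackType=proctrack/cgroup
JobAcctGatherType=jobacct_gather/cgroup
# Alternative recommended by the slurm.conf man page:
#JobAcctGatherType=jobacct_gather/linux
```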

That said, I'd also be interested in the pros and cons of jobacct_gather/cgroup 
versus jobacct_gather/linux and also why jobacct_gather/linux is the recommended
one.

Best regards
Jürgen

-- 
Jürgen Salk
Scientific Software & Compute Services (SSCS)
Kommunikations- und Informationszentrum (kiz)
Universität Ulm
Telefon: +49 (0)731 50-22478
Telefax: +49 (0)731 50-22471





* Christopher Benjamin Coffey <Chris.Coffey at nau.edu> [191022 16:26]:
> Hi,
> 
> We've been using jobacct_gather/cgroup for quite some time and haven't had any issues (I think). We do see some lengthy job cleanup times when lots of small jobs complete at once; maybe that is due to the cgroup plugin. At SLUG19 a Slurm developer presented information that the jobacct_gather/cgroup plugin has a significant performance hit and that jobacct_gather/linux should be set instead.
> 
> Can someone help me with the difference between these two gather plugins? If one were to switch to jobacct_gather/linux, what are the cons? Do you lose some job resource usage information?
> 
> Checking the docs on the SchedMD site again regarding the jobacct_gather plugins, I see:
> 
> cgroup — Gathers information from Linux cgroup infrastructure and adds this information to the standard rusage information also gathered for each job. (Experimental, not to be used in production.)
> 
> I don't believe I saw that before: "Experimental" ! Hah.
> 
> Thanks!
> 
> Best,
> Chris
>  
> -- 
> Christopher Coffey
> High-Performance Computing
> Northern Arizona University
> 928-523-1167
>  
>  
> 

-- 
GPG A997BA7A | 87FC DA31 5F00 C885 0DC3  E28F BD0D 4B33 A997 BA7A


