[slurm-users] jobacct_gather/linux vs jobacct_gather/cgroup

Christopher Benjamin Coffey Chris.Coffey at nau.edu
Thu Oct 24 22:54:19 UTC 2019


Hi Juergen,

From what I have seen so far, nothing is missing from the accounting data of the jobacct_gather/linux plugin compared with the cgroup version. In fact, the extern step now has data, whereas it is empty when using the cgroup version.
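
A comparison like this could be scripted roughly as follows. This is only a sketch: it assumes accounting is queried with "sacct -P -n" (pipe-delimited output, no header) and that an empty MaxRSS field is a reasonable sign that a step has no gathered data. The field list and the helper names (parse_sacct, steps_missing_usage, query) are made up for illustration and are not part of Slurm.

```python
import subprocess

# Fields requested from sacct; this selection is an assumption for illustration.
FIELDS = ["JobID", "JobName", "MaxRSS", "AveCPU"]


def parse_sacct(text):
    """Parse pipe-delimited `sacct -P -n` output into a list of dicts."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        rows.append(dict(zip(FIELDS, line.split("|"))))
    return rows


def steps_missing_usage(rows):
    """Return the JobIDs of job steps whose MaxRSS field is empty.

    Only step records (JobID containing a dot, e.g. 1234.extern) are
    checked; the job-level record is skipped.
    """
    return [r["JobID"] for r in rows if "." in r["JobID"] and not r.get("MaxRSS")]


def query(jobid):
    """Run sacct for one job and return the parsed rows (needs a Slurm host)."""
    cmd = ["sacct", "-j", str(jobid), "-P", "-n", "--format=" + ",".join(FIELDS)]
    out = subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
    return parse_sacct(out)
```

Running steps_missing_usage(query(jobid)) for the same kind of job under each plugin would list the steps (such as extern) that came back without memory data.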

Anyone know the differences?

Best,
Chris
 
-- 
Christopher Coffey
High-Performance Computing
Northern Arizona University
928-523-1167
 
 

On 10/22/19, 10:52 AM, "slurm-users on behalf of Juergen Salk" <slurm-users-bounces at lists.schedmd.com on behalf of juergen.salk at uni-ulm.de> wrote:

    Dear Chris,
    
    I could not find this warning in the slurm.conf man page. So I googled
    it and found a reference in the Slurm developers documentation: 
    
    https://slurm.schedmd.com/jobacct_gatherplugins.html
    
    However, this web page says in its footer: "Last modified 27 March 2015". 
    So maybe (read: hopefully) this caveat is somewhat outdated by now. 
    
    I also have "JobAcctGatherType=jobacct_gather/cgroup" in my slurm.conf,
    but for no deeper reason than that we also use cgroups for
    process tracking (i.e. ProctrackType=proctrack/cgroup) and to limit 
    resources used by users. So it just felt more consistent to me to 
    use cgroups for the jobacct_gather plugin as well, even though SchedMD 
    recommends jobacct_gather/linux (according to the slurm.conf man page).
    
    That said, I'd also be interested in the pros and cons of jobacct_gather/cgroup 
    versus jobacct_gather/linux and also why jobacct_gather/linux is the recommended
    one.
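    
    For reference, the two variants under discussion would look roughly
    like this in slurm.conf. This is only a sketch showing the two
    parameters named in this thread; a real configuration contains many
    more settings, and which plugin to pick is exactly the open question.
    
    ```
    # slurm.conf excerpt (illustrative only, not a complete configuration)
    ProctrackType=proctrack/cgroup
    
    # Variant currently in use here:
    JobAcctGatherType=jobacct_gather/cgroup
    
    # Variant recommended according to the slurm.conf man page:
    #JobAcctGatherType=jobacct_gather/linux
    ```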
    
    Best regards
    Jürgen
    
    -- 
    Jürgen Salk
    Scientific Software & Compute Services (SSCS)
    Kommunikations- und Informationszentrum (kiz)
    Universität Ulm
    Telefon: +49 (0)731 50-22478
    Telefax: +49 (0)731 50-22471
    
    
    
    
    
    * Christopher Benjamin Coffey <Chris.Coffey at nau.edu> [191022 16:26]:
    > Hi,
    > 
    > We've been using jobacct_gather/cgroup for quite some time and haven't had any issues (I think). We do see some lengthy job cleanup times when lots of small jobs complete at once; maybe that is due to the cgroup plugin. At SLUG19 a Slurm developer presented information that the jobacct_gather/cgroup plugin incurs a considerable performance hit and that jobacct_gather/linux should be set instead.
    > 
    > Can someone help me with the difference between these two gather plugins? If one were to switch to jobacct_gather/linux, what are the cons? Do you lose some job resource usage information?
    > 
    > Checking the docs on the SchedMD site again regarding the jobacct_gather plugins, I see:
    > 
    > cgroup — Gathers information from Linux cgroup infrastructure and adds this information to the standard rusage information also gathered for each job. (Experimental, not to be used in production.)
    > 
    > I don't believe I saw that before: "Experimental" ! Hah.
    > 
    > Thanks!
    > 
    > Best,
    > Chris
    >  
    > -- 
    > Christopher Coffey
    > High-Performance Computing
    > Northern Arizona University
    > 928-523-1167
    >  
    >  
    > 
    
    -- 
    GPG A997BA7A | 87FC DA31 5F00 C885 0DC3  E28F BD0D 4B33 A997 BA7A
    
    


