[slurm-users] Seff error with Slurm-18.08.1

Marcus Wagner wagner at itc.rwth-aachen.de
Thu Nov 8 03:54:59 MST 2018


Hi Miguel,


This is because SchedMD changed the stats field: rss_max no longer 
exists (cf. line 225 of seff).
You need to evaluate the field stats{tres_usage_in_max} instead and take 
the value after '2='. That value is the memory in bytes rather than 
kbytes, so it must additionally be divided by 1024.
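
As a rough sketch of that conversion (written in Python for brevity, 
whereas seff itself is Perl): the function name and the sample string 
are made up for illustration, and I am assuming the field is a 
comma-separated list of id=value pairs in which id 2 is memory.

```python
def tres_mem_kb(tres_usage_in_max: str) -> float:
    """Return the memory entry (TRES id 2) of a TRES usage string, in KB.

    Assumption: the string looks like "1=...,2=...,..." and the value
    after "2=" is the maximum memory in bytes.
    """
    for entry in tres_usage_in_max.split(","):
        tres_id, sep, value = entry.partition("=")
        if sep and tres_id.strip() == "2":
            # Bytes -> kbytes, the unit the old rss_max field used.
            return int(value) / 1024
    # No memory entry present: report zero rather than an undefined value.
    return 0.0


# Hypothetical sample value, not taken from a real job record.
sample = "1=5973,2=2620000256,6=0"
mem_kb = tres_mem_kb(sample)
print(f"{mem_kb:.0f} KB")
```

Returning 0 when the '2=' entry is missing avoids the kind of 
"uninitialized value" warning shown in the quoted output below.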


Best
Marcus

On 11/08/2018 11:06 AM, Miguel A. Sánchez wrote:
>
> Hi, and thanks for all your answers, and sorry for the delay in my 
> reply. Yesterday I installed Slurm-18.08.3 on the controller machine 
> to check whether the seff command works correctly with this latest 
> release. The behavior has improved, but I still receive an error 
> message:
>
>
> # /usr/local/slurm-18.08.3/bin/seff 1694112
> *Use of uninitialized value $lmem in numeric lt (<) at 
> /usr/local/slurm-18.08.3/bin/seff line 130, <DATA> line 624.*
> Job ID: 1694112
> Cluster: XXXXX
> User/Group: XXXXX
> State: COMPLETED (exit code 0)
> Nodes: 1
> Cores per node: 2
> CPU Utilized: 01:39:33
> CPU Efficiency: 4266.43% of 00:02:20 core-walltime
> Job Wall-clock time: 00:01:10
> Memory Utilized: 0.00 MB (estimated maximum)
> Memory Efficiency: 0.00% of 3.91 GB (3.91 GB/node)
> [root at hydra ~]#
>
>
> Because of this problem, every job reports 0.00 MB as its memory 
> utilized.
>
>
> With slurm-17.11.1 it works fine:
>
>
> # /usr/local/slurm-17.11.0/bin/seff 1694112
> Job ID: 1694112
> Cluster: XXXXX
> User/Group: XXXXX
> State: COMPLETED (exit code 0)
> Nodes: 1
> Cores per node: 2
> CPU Utilized: 01:39:33
> CPU Efficiency: 4266.43% of 00:02:20 core-walltime
> Job Wall-clock time: 00:01:10
> Memory Utilized: 2.44 GB
> Memory Efficiency: 62.57% of 3.91 GB
> [root at hydra bin]#
>
>
>
>
> Miguel A. Sánchez Gómez
> System Administrator
> Research Programme on Biomedical Informatics - GRIB (IMIM-UPF)
>
> Barcelona Biomedical Research Park (office 4.80)
> Doctor Aiguader 88 | 08003 Barcelona (Spain)
> Phone: +34/ 93 316 0522 | Fax: +34/ 93 3160 550
> e-mail:miguelangel.sanchez at upf.edu
> On 11/06/2018 06:30 PM, Mike Cammilleri wrote:
>>
>> Thanks for this. We'll try the workaround script. It is not 
>> mission-critical but our users have gotten accustomed to seeing these 
>> metrics at the end of each run and it's nice to have. We are currently 
>> doing this in a test VM environment, so by the time we actually do 
>> the upgrade to the cluster perhaps the fix will be available then.
>>
>>
>> Mike Cammilleri
>>
>> Systems Administrator
>>
>> Department of Statistics | UW-Madison
>>
>> 1300 University Ave | Room 1280
>> 608-263-6673 | mikec at stat.wisc.edu
>>
>>
>>
>> ------------------------------------------------------------------------
>> *From:* slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf 
>> of Chris Samuel <chris at csamuel.org>
>> *Sent:* Tuesday, November 6, 2018 5:03 AM
>> *To:* slurm-users at lists.schedmd.com
>> *Subject:* Re: [slurm-users] Seff error with Slurm-18.08.1
>> On 6/11/18 7:49 pm, Baker D.J. wrote:
>>
>> > The good news is that I am assured by SchedMD that the bug has been
>> > fixed in v18.08.3.
>>
>> Looks like it's fixed in this commit.
>>
>> commit 3d85c8f9240542d9e6dfb727244e75e449430aac
>> Author: Danny Auble <da at schedmd.com>
>> Date:   Wed Oct 24 14:10:12 2018 -0600
>>
>>      Handle symbol resolution errors in the 18.08 slurmdbd.
>>
>>      Caused by b1ff43429f6426c when moving the slurmdbd agent internals.
>>
>>      Bug 5882.
>>
>>
>> > Having said that we will probably live with this issue
>> > rather than disrupt users with another upgrade so soon.
>>
>> An upgrade to 18.08.3 from 18.08.1 shouldn't be disruptive though,
>> should it?  We just flip a symlink and the users see the new binaries,
>> libraries, etc immediately, we can then restart daemons as and when we
>> need to (in the right order of course, slurmdbd, slurmctld and then
>> slurmd's).
>>
>> All the best,
>> Chris
>> -- 
>>   Chris Samuel  : http://www.csamuel.org/ :  Melbourne, VIC
>>
>

-- 
Marcus Wagner, Dipl.-Inf.

IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wagner at itc.rwth-aachen.de
www.itc.rwth-aachen.de
