[slurm-users] why sacct display wrong username while the UID is right?

taleintervenor at sjtu.edu.cn taleintervenor at sjtu.edu.cn
Sun Mar 13 03:59:56 UTC 2022


Hi all:

We encountered a strange bug when querying job history with sacct. As shown
below, we tried to list user hpczbzt's jobs, and sacct does select the right
jobs belonging to this user, but their username is displayed as phywht.

> sacct -X --user=hpczbzt --format=jobid%16,jobidraw,user,uid,partition,start,end,AllocCPUS,state%20

           JobID JobIDRaw          User    UID  Partition               Start                 End  AllocCPUS                State
---------------- ------------ --------- ------ ---------- ------------------- ------------------- ---------- --------------------
         9882328 9882328         phywht   6270       dgx2 2022-03-13T04:50:12             Unknown          6              RUNNING
         9882330 9882330         phywht   6270       dgx2 2022-03-13T04:50:12             Unknown          6              RUNNING
         9882332 9882332         phywht   6270       dgx2 2022-03-13T04:50:12             Unknown          6              RUNNING
         9882335 9882335         phywht   6270       dgx2 2022-03-13T04:50:12             Unknown          6              RUNNING
         9882337 9882337         phywht   6270       dgx2 2022-03-13T04:50:12             Unknown          6              RUNNING
         9884211 9884211         phywht   6270       a100 2022-03-12T23:56:02 2022-03-13T00:13:43          8    CANCELLED by 6270
         9884265 9884265         phywht   6270       a100 2022-03-13T00:14:22             Unknown          8              RUNNING
         9884308 9884308         phywht   6270    64c512g 2022-03-13T01:18:44 2022-03-13T01:37:04          4    CANCELLED by 6270
         9884413 9884413         phywht   6270    64c512g 2022-03-13T04:52:06 2022-03-13T05:59:49         40            COMPLETED
         9884431 9884431         phywht   6270       a100 2022-03-13T06:09:02 2022-03-13T09:32:45          8            COMPLETED
         9887011 9887011         phywht   6270 debug64c5+ 2022-03-13T11:06:44 2022-03-13T11:07:41          1    CANCELLED by 6270

The UID shown by sacct is correct: 6270 is hpczbzt, while the actual UID of
phywht is 6272, as shown below:

> id phywht
uid=6272(phywht) gid=6272(phywht) groups=6272(phywht)

> id hpczbzt
uid=6270(hpczbzt) gid=6270(hpczbzt) groups=6270(hpczbzt)
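
The mismatch between the User and UID columns can also be checked mechanically,
which may help to see how many accounting records are affected. The following is
only a minimal sketch: the helper names (lookup_name, find_mismatches,
check_user) are hypothetical, and it assumes Python 3.7+ with sacct on the PATH.
It compares the username printed by sacct with what the local name service
returns for the same UID, so the result reflects the host it is run on.

```python
import pwd
import subprocess


def lookup_name(uid):
    """Return the login name the local name service maps to uid, or None."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return None


def find_mismatches(sacct_text, resolve=lookup_name):
    """Yield (jobid, sacct_user, resolved_user) for rows where the names differ.

    sacct_text is expected to hold 'jobid|user|uid' lines, as printed by
    sacct -n -P --format=jobid,user,uid.
    """
    for line in sacct_text.splitlines():
        fields = line.strip().split("|")
        if len(fields) != 3 or not fields[2].isdigit():
            continue  # skip blank or malformed lines
        jobid, user, uid = fields
        resolved = resolve(int(uid))
        if resolved != user:
            yield jobid, user, resolved


def check_user(user):
    """Run sacct for one user and print every row with a User/UID mismatch."""
    out = subprocess.run(
        ["sacct", "-X", "-n", "-P", f"--user={user}",
         "--format=jobid,user,uid"],
        check=True, capture_output=True, text=True,
    ).stdout
    for jobid, shown, resolved in find_mismatches(out):
        print(f"{jobid}: sacct shows {shown!r}, local lookup gives {resolved!r}")
```

Calling check_user("hpczbzt") would then list each job whose displayed username
differs from the local lookup of its UID.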

 

Both system accounts are stored in LDAP, and we have checked that they are
consistent on both the slurmctld and slurmdbd nodes. Moreover, scontrol and
squeue show the correct username, hpczbzt:

> scontrol show job 9884265
JobId=9884265 JobName=af_test_session
   UserId=hpczbzt(6270) GroupId=hpczbzt(6270) MCS_label=N/A
   Priority=519 Nice=0 Account=acct-phywht QOS=normal
   JobState=RUNNING Reason=None Dependency=(null)
..

> squeue --user=hpczbzt
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
           9884265      a100 af_test_  hpczbzt  R   11:43:46      1 gpu04
           9882328      dgx2 repeat_V  hpczbzt  R    7:07:56      1 vol05
..

So does anyone have an idea why only sacct displays the wrong username?
