[slurm-users] why sacct display wrong username while the UID is right?
taleintervenor at sjtu.edu.cn
Sun Mar 13 03:59:56 UTC 2022
Hi all:
We encountered a strange bug when querying job history with sacct. As shown
below, we tried to list user hpczbzt's jobs, and sacct does select the right
jobs belonging to this user. But their username is displayed as phywht.
> sacct -X --user=hpczbzt --format=jobid%16,jobidraw,user,uid,partition,start,end,AllocCPUS,state%20
JobID            JobIDRaw     User      UID    Partition  Start               End                 AllocCPUS  State
---------------- ------------ --------- ------ ---------- ------------------- ------------------- ---------- --------------------
9882328          9882328      phywht    6270   dgx2       2022-03-13T04:50:12 Unknown             6          RUNNING
9882330          9882330      phywht    6270   dgx2       2022-03-13T04:50:12 Unknown             6          RUNNING
9882332          9882332      phywht    6270   dgx2       2022-03-13T04:50:12 Unknown             6          RUNNING
9882335          9882335      phywht    6270   dgx2       2022-03-13T04:50:12 Unknown             6          RUNNING
9882337          9882337      phywht    6270   dgx2       2022-03-13T04:50:12 Unknown             6          RUNNING
9884211          9884211      phywht    6270   a100       2022-03-12T23:56:02 2022-03-13T00:13:43 8          CANCELLED by 6270
9884265          9884265      phywht    6270   a100       2022-03-13T00:14:22 Unknown             8          RUNNING
9884308          9884308      phywht    6270   64c512g    2022-03-13T01:18:44 2022-03-13T01:37:04 4          CANCELLED by 6270
9884413          9884413      phywht    6270   64c512g    2022-03-13T04:52:06 2022-03-13T05:59:49 40         COMPLETED
9884431          9884431      phywht    6270   a100       2022-03-13T06:09:02 2022-03-13T09:32:45 8          COMPLETED
9887011          9887011      phywht    6270   debug64c5+ 2022-03-13T11:06:44 2022-03-13T11:07:41 1          CANCELLED by 6270
The UID shown by sacct (6270) is right: it belongs to hpczbzt. The actual UID
of phywht is 6272, as shown below:
> id phywht
uid=6272(phywht) gid=6272(phywht) groups=6272(phywht)
> id hpczbzt
uid=6270(hpczbzt) gid=6270(hpczbzt) groups=6270(hpczbzt)
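
The manual comparison above (sacct's User column against what id reports for
the same UID) could be scripted to see which UIDs are affected. This is only a
rough sketch: find_user_mismatches is a made-up helper name, the sacct rows are
assumed to come from "sacct -X -n -P --user=hpczbzt --format=user,uid", and the
passwd-format file is assumed to be a dump such as "getent passwd > passwd.txt"
taken on the node being checked.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): report rows where the user name
# printed by sacct differs from the name the passwd database gives for the
# same UID.
#   stdin : "User|UID" lines, e.g. from
#           sacct -X -n -P --user=hpczbzt --format=user,uid
#   $1    : file in passwd(5) format, e.g. from: getent passwd > passwd.txt
find_user_mismatches() {
    awk -F'|' -v passwd_file="$1" '
        BEGIN {
            # Build a UID -> name map from the passwd-format file.
            while ((getline line < passwd_file) > 0) {
                split(line, f, ":")
                name_of[f[3]] = f[1]
            }
        }
        # Ignore rows without a UID; report each mismatching pair once.
        $2 != "" {
            expected = ($2 in name_of) ? name_of[$2] : "(unknown)"
            if (expected != $1 && !seen[$0]++)
                printf "uid %s: sacct=%s passwd=%s\n", $2, $1, expected
        }
    '
}
```

Usage would be along the lines of
"sacct -X -n -P --user=hpczbzt --format=user,uid | find_user_mismatches passwd.txt",
run separately on the slurmctld and slurmdbd nodes so the two results can be
compared.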
Both system accounts are stored in LDAP, and we have checked that they are
consistent on both the slurmctld and slurmdbd nodes. What's more, scontrol and
squeue show the right username, hpczbzt:
> scontrol show job 9884265
JobId=9884265 JobName=af_test_session
UserId=hpczbzt(6270) GroupId=hpczbzt(6270) MCS_label=N/A
Priority=519 Nice=0 Account=acct-phywht QOS=normal
JobState=RUNNING Reason=None Dependency=(null)
..
> squeue --user=hpczbzt
JOBID    PARTITION  NAME      USER     ST  TIME      NODES  NODELIST(REASON)
9884265  a100       af_test_  hpczbzt  R   11:43:46  1      gpu04
9882328  dgx2       repeat_V  hpczbzt  R   7:07:56   1      vol05
..
So does anyone have a guess as to why only sacct displays the wrong username?