Hi,
Thanks for the reply. I already went through this 🙁. I checked all nodes: id works on each, as does an ssh login.
[root@node4 ~]# id xxxjonesst@xxx.ac.nz
uid=1204805830(xxxjonesst@xxx.ac.nz) gid=1204805830(xxxjonesst@xxx.ac.nz)
8><---
Connection to node1 closed.
[root@xxxunicobuildt1 warewulf]# ssh xxxjonesst@xxx.ac.nz@node4
(xxxjonesst@xxx.ac.nz@node4) Password:
[xxxjonesst@xxx.ac.nz@node4 ~]$ whoami
xxxjonesst@xxx.ac.nz
[xxxjonesst@xxx.ac.nz@node4 ~]$
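To make the same check repeatable on every node, here is a minimal sketch, assuming Python 3 is available on the controller and on the compute nodes. It looks the user up by name and then looks the resulting UID up again, so the output can be compared host by host. The function name check_user is just an illustration, and this only exercises the normal passwd lookup; I have not confirmed it is the same path Slurm takes.

```python
import pwd
import sys


def check_user(name):
    """Return (uid, gid, reverse_name) for name, or None if the name does not resolve.

    reverse_name is the name the UID maps back to, or None if that lookup fails.
    """
    try:
        entry = pwd.getpwnam(name)  # forward lookup: name -> passwd entry
    except KeyError:
        return None
    try:
        reverse = pwd.getpwuid(entry.pw_uid).pw_name  # reverse lookup: uid -> name
    except KeyError:
        reverse = None
    return entry.pw_uid, entry.pw_gid, reverse


if __name__ == "__main__":
    # Usage: python3 check_user.py <username> [<username> ...]
    for user in sys.argv[1:]:
        print(user, check_user(user))
```

Running it with the same username on the build host and on node4 should print the same UID and GID on both, with the reverse name matching the name passed in.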
regards
Steven
________________________________
From: Chris Samuel via slurm-users <slurm-users@lists.schedmd.com>
Sent: Monday, 3 February 2025 10:00 am
To: slurm-users@lists.schedmd.com <slurm-users@lists.schedmd.com>
Subject: [slurm-users] Re: RHEL8.10 V slurmctld
On 29/1/25 10:44 am, Steven Jones via slurm-users wrote:
"[2025-01-28T21:48:50.271] sched: Allocate JobId=4 NodeList=node4 #CPUs=1 Partition=debug
[2025-01-28T21:48:50.280] Killing non-startable batch JobId=4: Invalid user id"
Looking at the source code, it looks like that second error is reported by slurmctld after it sends the RPC out to the compute node and gets a response back. So I would look at what's going on with node4 to see what's being reported there.
All the best,
Chris