[slurm-users] error: user <UID> not found
Diego Zuccato
diego.zuccato at unibo.it
Wed Sep 30 06:37:44 UTC 2020
On 30/09/20 03:49, Brian Andrus wrote:
Thanks for the answer.
> That means the system has no idea who that user is.
But which system? Since the message is generated by slurmctld, I assumed
it must be the frontend node. But, as I wrote, that system correctly
identifies the user (he's logged in, and both 'id' and 'getent passwd'
resolve the name and the UID).
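
For reference, the same check that 'id' and 'getent passwd' perform can be scripted and run on whichever host is suspected. This is only a minimal sketch: it assumes Python 3 is available there, and the helper names `lookup` and `round_trip_ok` are made up for illustration. It goes through the same NSS stack as 'getent', so it only shows what that one host can resolve.

```python
import pwd


def lookup(name_or_uid):
    """Return (name, uid) as resolved through NSS, or None if unknown."""
    try:
        if isinstance(name_or_uid, int):
            entry = pwd.getpwuid(name_or_uid)   # UID -> name (reverse mapping)
        else:
            entry = pwd.getpwnam(name_or_uid)   # name -> UID (forward mapping)
    except KeyError:
        # The pwd module raises KeyError when no entry is found.
        return None
    return entry.pw_name, entry.pw_uid


def round_trip_ok(uid):
    """True if the UID resolves to a name and that name maps back to the same UID."""
    forward = lookup(uid)
    if forward is None:
        return False
    back = lookup(forward[0])
    return back is not None and back[1] == uid
```

Calling `round_trip_ok(<UID>)` with the UID from the error message on each candidate host would show whether the reverse mapping fails anywhere, even though the forward lookup succeeds.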
> If you are part of a domain or other shared directory (ldap, etc), your
> master is likely not configured right.
The frontend is an AD member, using PBIS-open. It's been working as-is
for at least the last 6 years :) and other users from the same domain
are able to submit jobs.
> If you are using SSSD, it is also possible your sssd has too long of a
> cache time. Run "sss_cache -E" to clear everything.
To partially work around an issue with conflicting UIDs/GIDs (the
PBIS-assigned range is too short for our forest), I already clear the
PBIS cache every 5 minutes and repopulate it by forcing an 'id' lookup of
every /home/{PERSONALE,STUDENTI}/*.* entry (this forces PBIS to pull
the right name from AD when first resolving the UID, so the GUID is
already cached and associated with the UID when the reverse mapping is
required).
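
The repopulation step can be sketched roughly as follows. This is an illustration, not the actual script in use: it assumes that the basename of each home directory entry is the login name, the function name `warm_cache` is hypothetical, and the PBIS cache-clearing step is left out because it is site-specific.

```python
import glob
import os
import pwd


def warm_cache(patterns):
    """Force a name lookup for every path matching the given glob patterns.

    Returns (resolved, failed): the names that NSS could resolve and
    the names it could not.
    """
    resolved, failed = [], []
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            # Assumption: the directory name is the user's login name.
            name = os.path.basename(path)
            try:
                pwd.getpwnam(name)   # name -> UID lookup populates the cache
            except KeyError:
                failed.append(name)
            else:
                resolved.append(name)
    return resolved, failed
```

For example, `warm_cache(["/home/PERSONALE/*.*", "/home/STUDENTI/*.*"])` would walk both trees. Logging the `failed` list on each run would show whether the affected user ever drops out of the cache.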
> If you have a forest, it could be the information has not propagated to
> all the servers, so you have to wait.
> I've been places where that can take 24 hours.
It's been more than a week since the first failure :( And our forest
usually propagates changes in just a few minutes (more often in seconds).
--
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786