I compiled Slurm 24.05.4 manually; slurmctld and slurmd run on the same server. Everything works, but all test jobs end up pending with the reason InvalidAccount. I do not use the Slurm database and have not enabled accounting. I cannot find an explanation for this behavior, or a misconfiguration that would cause it. The slurm.conf file was generated using the easy config tool. Any ideas how to fix this?
Thx,
-Henk
## looks like all users have access to test queue
[hmeij@sharptail2 slurm]$ sinfo -o "%g %.10R %.20l"
GROUPS  PARTITION            TIMELIMIT
all          test             infinite
[hmeij@sharptail2 slurm]$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
test*     up    infinite  1     idle  sharptail2
## simple sleep job
[hmeij@sharptail2 slurm]$ sbatch sleep
Submitted batch job 8
[hmeij@sharptail2 slurm]$ squeue
JOBID PARTITION NAME  USER  ST TIME NODES CPUS MIN_MEMORY NODELIST(REASON)
    8 test      sleep hmeij PD 0:00     1    1 1G         (InvalidAccount)
[hmeij@sharptail2 slurm]$ scontrol show job 8
JobId=8 JobName=sleep
   UserId=hmeij(8216) GroupId=its(623) MCS_label=N/A
   Priority=1 Nice=0 Account=(null) QOS=(null)
   JobState=PENDING Reason=InvalidAccount Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:00:00 TimeLimit=UNLIMITED TimeMin=N/A
   SubmitTime=2024-11-11T13:27:14 EligibleTime=2024-11-11T13:27:14
   AccrueTime=2024-11-11T13:27:14
   StartTime=Unknown EndTime=Unknown Deadline=N/A
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2024-11-11T13:27:14 Scheduler=Main
   Partition=test AllocNode:Sid=sharptail2:644662
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=
   NumNodes=1-1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:1:1
   ReqTRES=cpu=1,mem=1G,node=1,billing=1
   AllocTRES=(null)
   Socks/Node=1 NtasksPerN:B:S:C=1:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryNode=1G MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/zfshomes/hmeij/slurm/sleep
   WorkDir=/zfshomes/hmeij/slurm
   StdErr=/zfshomes/hmeij/slurm/err
   StdIn=/dev/null
   StdOut=/zfshomes/hmeij/slurm/out
   TresPerTask=cpu=1
## within a minute or so that InvalidAccount changes to None
## (but the job remains pending; jobs 1-7 were stuck over the weekend)
[hmeij@sharptail2 slurm]$ squeue
JOBID PARTITION NAME  USER  ST TIME NODES CPUS MIN_MEMORY NODELIST(REASON)
    8 test      sleep hmeij PD 0:00     1    1 1G         (None)
## in the slurmctld.log
slurmctld: sched: JobId=8 has invalid account
slurmctld: debug: set_job_failed_assoc_qos_ptr: Filling in assoc for JobId=8 Assoc=0
slurmctld: debug: sched: Running job scheduler for full queue.
slurmctld: error: _refresh_assoc_mgr_qos_list: no new list given back keeping cached one.
## and the slurm.conf accounting section (both AccountingStorageType lines yield the same behavior)
#AccountingStorageType=
AccountingStorageType=accounting_storage/none
#JobAcctGatherFrequency=30
#JobAcctGatherType=
## using
SchedulerType = sched/builtin