Manual compilation of Slurm 24.05.4; slurmctld and slurmd run on the same server. All works OK, but all test jobs end up pending with an InvalidAccount message. I do not use the Slurm database and have not enabled accounting. I cannot find an explanation for this behavior or a misconfiguration. The slurm.conf file was generated using the easy config tool. Any ideas how to fix this? Thx,
-Henk
## looks like all users have access to test queue
[hmeij@sharptail2 slurm]$ sinfo -o "%g %.10R %.20l"
GROUPS  PARTITION            TIMELIMIT
all          test             infinite
[hmeij@sharptail2 slurm]$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
test*        up   infinite      1   idle sharptail2
## simple sleep job
[hmeij@sharptail2 slurm]$ sbatch sleep
Submitted batch job 8
[hmeij@sharptail2 slurm]$ squeue
JOBID PARTITION  NAME   USER ST  TIME NODES CPUS MIN_MEMORY NODELIST(REASON)
    8      test sleep  hmeij PD  0:00     1    1         1G (InvalidAccount)
[hmeij@sharptail2 slurm]$ scontrol show job 8
JobId=8 JobName=sleep
   UserId=hmeij(8216) GroupId=its(623) MCS_label=N/A
   Priority=1 Nice=0 Account=(null) QOS=(null)
   JobState=PENDING Reason=InvalidAccount Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:00:00 TimeLimit=UNLIMITED TimeMin=N/A
   SubmitTime=2024-11-11T13:27:14 EligibleTime=2024-11-11T13:27:14
   AccrueTime=2024-11-11T13:27:14
   StartTime=Unknown EndTime=Unknown Deadline=N/A
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2024-11-11T13:27:14 Scheduler=Main
   Partition=test AllocNode:Sid=sharptail2:644662
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=
   NumNodes=1-1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:1:1
   ReqTRES=cpu=1,mem=1G,node=1,billing=1
   AllocTRES=(null)
   Socks/Node=1 NtasksPerN:B:S:C=1:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryNode=1G MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/zfshomes/hmeij/slurm/sleep
   WorkDir=/zfshomes/hmeij/slurm
   StdErr=/zfshomes/hmeij/slurm/err
   StdIn=/dev/null
   StdOut=/zfshomes/hmeij/slurm/out
   TresPerTask=cpu=1
## within a minute or so that InvalidAccount changes to None
## (but job remains pending; 1-7 stuck over the weekend)
[hmeij@sharptail2 slurm]$ squeue
JOBID PARTITION  NAME   USER ST  TIME NODES CPUS MIN_MEMORY NODELIST(REASON)
    8      test sleep  hmeij PD  0:00     1    1         1G (None)
## in the slurmctld.log
slurmctld: sched: JobId=8 has invalid account
slurmctld: debug: set_job_failed_assoc_qos_ptr: Filling in assoc for JobId=8 Assoc=0
slurmctld: debug: sched: Running job scheduler for full queue.
slurmctld: error: _refresh_assoc_mgr_qos_list: no new list given back keeping cached one.
## and the slurm.conf accounting section (both AccountingStorageType lines yield same behavior)
#AccountingStorageType=
AccountingStorageType=accounting_storage/none
#JobAcctGatherFrequency=30
#JobAcctGatherType=
## using
SchedulerType = sched/builtin
Hi Henk,
On 11-11-2024 20:06, hmeij--- via slurm-users wrote:
> Manual compilation of 24.05.4. slurmctld and slurmd run on same server. All works ok but all test jobs end up pending with InvalidAccount message. I do not use slurm database and have not enabled accounting. Can not find an answer for this behavior or a misconfiguration. slurm.conf file was generated using easy config tool. Any ideas how to fix this? Thx,
In the slurm.conf manual page for 24.05 the accounting options are listed:
AccountingStorageType The accounting storage mechanism type. Acceptable values at present "accounting_storage/slurmdbd". The "accounting_storage/slurmdbd" value indicates that accounting records will be written to the Slurm DBD, which manages an underlying MySQL database. See "man slurmdbd" for more information. When this is not set it indicates that account records are not maintained.
In other words, the use of slurmdbd seems to be *required* as of Slurm 24.05! The use of AccountingStorageType=accounting_storage/none seems to be deprecated, but I can't offhand find this to be documented. Can anyone else help?
When account records are not maintained, it would make sense that you get InvalidAccount messages.
Perhaps you may find some usable Slurm setup guidance in this Wiki page: https://wiki.fysik.dtu.dk/Niflheim_system/
Best regards, Ole
On 11/11/24 21:39, Ole Holm Nielsen wrote:
> In other words, the use of slurmdbd seems to be *required* as of Slurm 24.05! The use of AccountingStorageType=accounting_storage/none seems to be deprecated, but I can't offhand find this to be documented. Can anyone else help?
It seems that accounting_storage/none was removed (deprecated) from 23.11, but it was still documented until 23.02:
https://github.com/SchedMD/slurm/blob/slurm-23.02/doc/man/man5/slurm.conf.5#...
Note that in 22.05 the "accounting_storage/none" still implied that account records would not work, as you have experienced when getting InvalidAccount messages:
The default value is "accounting_storage/none" and indicates that account records are not maintained.
IHTH, Ole
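The diagnosis above hinges on what AccountingStorageType is set to in slurm.conf. As a minimal sketch (not part of the original thread), the shell function below prints the last uncommented AccountingStorageType value found in a given file, or "unset" if there is none. The function name and the default path /etc/slurm/slurm.conf are assumptions; a manual build may keep slurm.conf elsewhere.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the last uncommented
# AccountingStorageType value in a slurm.conf file, or "unset" if absent.
# The default path is an assumption; adjust it for a manual build.
accounting_storage_type() {
    conf="${1:-/etc/slurm/slurm.conf}"
    # Select uncommented lines that set the parameter (matched
    # case-insensitively), keep the last one, then strip any trailing
    # comment, the "Name=" prefix, and all whitespace.
    value=$(grep -i '^[[:space:]]*AccountingStorageType[[:space:]]*=' "$conf" \
        | tail -n 1 \
        | sed -e 's/#.*$//' -e 's/^[^=]*=//' -e 's/[[:space:]]//g')
    if [ -n "$value" ]; then
        printf '%s\n' "$value"
    else
        printf 'unset\n'
    fi
}
```

Run against the accounting section Henk posted, this would print accounting_storage/none; with only the commented-out line present it would print unset. On a running system, `scontrol show config | grep -i AccountingStorageType` shows the value slurmctld actually loaded.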
Ole, I had not made that connection yet ... The required part. Could be documented a bit more clearly, if true.
Small institutions like us are not interested in managing Slurm accounts and projects. Also weird that job Reason changes from InvalidAccount to None in minutes but job is not released, while sinfo reports partition is available to Group 'all'.
-Henk
________________________________
From: Ole Holm Nielsen via slurm-users <slurm-users@lists.schedmd.com>
Sent: Tuesday, November 12, 2024 2:23 AM
To: slurm-users@lists.schedmd.com <slurm-users@lists.schedmd.com>
Subject: [External] [slurm-users] Re: InvalidAccount
-- slurm-users mailing list -- slurm-users@lists.schedmd.com To unsubscribe send an email to slurm-users-leave@lists.schedmd.com
Hi Henk,
On 11/12/24 15:36, Henk Meij wrote:
> Ole, I had not made that connection yet ... The *required* part. Could be documented a bit more clearly, if true.
I've opened a case with SchedMD to make the documentation of AccountingStorageType clearer - may be in Slurm 24.11.
> Small institutions like us are not interested in managing slurm accounts and projects.
You don't need to manage a lot, but the slurmdbd *is* required for proper Slurm functionality.
> Also weird that job Reason changes from InvalidAccount to None in minutes but job is not released. While sinfo reports partition is available to Group 'all'
Ah yes, when the scheduler loop is running, squeue will report a Reason of "None" for some seconds before returning to the normal value. This can be confusing, I have to admit...
/Ole
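For a small site, the account management Ole refers to can be kept very small. A minimal sketch, assuming slurmdbd is already running and slurm.conf sets AccountingStorageType=accounting_storage/slurmdbd: one cluster record, one catch-all account, and one association per user. The function below only prints the sacctmgr commands so they can be reviewed before being run by a Slurm administrator. The function name and the example names used with it (cluster "mycluster", account "general") are hypothetical and not from the thread; the cluster name has to match ClusterName in slurm.conf.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print, without executing,
# the sacctmgr commands for a minimal accounting setup.
#   $1    = cluster name (must match ClusterName in slurm.conf)
#   $2    = one account shared by all users (example value, an assumption)
#   $3... = user names to associate with that account
print_minimal_accounting_setup() {
    cluster="$1"
    account="$2"
    shift 2
    # -i (--immediate) commits without the interactive confirmation prompt.
    printf 'sacctmgr -i add cluster %s\n' "$cluster"
    printf 'sacctmgr -i add account %s Cluster=%s Description="catch-all account"\n' \
        "$account" "$cluster"
    for user in "$@"; do
        printf 'sacctmgr -i add user %s Account=%s\n' "$user" "$account"
    done
}
```

For example, `print_minimal_accounting_setup mycluster general hmeij` prints three commands: one for the cluster, one for the account, and one for the user. Whether this alone clears the InvalidAccount state was not confirmed in the thread; Henk only reported that a test slurmdbd was up.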
Ole, you wrote
"Perhaps you may find some usable Slurm setup guidance in this Wiki page: https://wiki.fysik.dtu.dk/Niflheim_system/ "
Just want to put it out there, that documentation is awesome. With a high level of details. Thanks! Our test slurmdbd is up.
-Henk
________________________________
From: Henk Meij via slurm-users <slurm-users@lists.schedmd.com>
Sent: Tuesday, November 12, 2024 9:36 AM
To: slurm-users@lists.schedmd.com <slurm-users@lists.schedmd.com>; Ole.H.Nielsen@fysik.dtu.dk <Ole.H.Nielsen@fysik.dtu.dk>
Subject: [slurm-users] Re: [External] Re: InvalidAccount