Hello,
*Background:* I am working on a small cluster managed by Base Command Manager v10.0, running Slurm 23.02.7 on Ubuntu 22.04.2. I have a small test script that simply consumes memory and CPU.
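For reference, the test script is essentially equivalent to the sketch below (illustrative only, not the exact file; the script name, the stress-ng invocation and the sizes are placeholders, chosen so the job allocates more memory than it was granted while spinning a few CPUs):

#!/bin/bash
# oom_test.sh (illustrative sketch): burn 4 CPUs and allocate more
# memory than the job allocation so the kernel OOM killer fires.
stress-ng --cpu 4 --vm 1 --vm-bytes 14G --vm-keep --timeout 120s

and it is submitted along the lines of:

$ sbatch -N1 --wrap="bash ./oom_test.sh"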
When I run the test script, it consumes more memory than Slurm allocated and, as expected, it is killed by the OOM killer. In /var/log/slurmd I see entries like:
[2024-05-29T08:53:04.975] Launching batch job 65 for UID 1001
[2024-05-29T08:53:05.016] [65.batch] task/cgroup: _memcg_initialize: job: alloc=10868MB mem.limit=10868MB memsw.limit=10868MB job_swappiness=1
[2024-05-29T08:53:05.016] [65.batch] task/cgroup: _memcg_initialize: step: alloc=10868MB mem.limit=10868MB memsw.limit=10868MB job_swappiness=1
[2024-05-29T08:53:19.530] [65.batch] task/cgroup: task_cgroup_memory_check_oom: StepId=65.batch hit memory+swap limit at least once during execution. This may or may not result in some failure.
[2024-05-29T08:53:19.563] [65.batch] done with job
Inspecting with sacct, I see:

$ sacct -j 65 --format="jobid,jobname,state,exitcode"
JobID           JobName      State ExitCode
------------ ---------- ---------- --------
65                 wrap     FAILED      9:0
65.batch          batch     FAILED      9:0
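If it helps, the same job can also be queried with additional standard sacct fields (ReqMem, MaxRSS, DerivedExitCode), e.g.:

$ sacct -j 65 --format="jobid,jobname,state,exitcode,derivedexitcode,reqmem,maxrss"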
Based on my previous experience with a RHEL-based Slurm cluster, I would expect the state to be listed as OUT_OF_MEMORY and the exit code to be 0:125.
*Question:* How do I configure slurm.conf and/or cgroup.conf so that, when Slurm kills a job for exceeding its allocated memory, sacct reports the state as OUT_OF_MEMORY?
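In case it is relevant: my (possibly wrong) understanding is that Slurm derives the OUT_OF_MEMORY state from the cgroup's oom_kill counter rather than from the "hit memory+swap limit" counter shown in the log above. A check along these lines on the compute node, while the step's cgroup still exists, would show whether the kernel actually recorded an OOM kill for the step (illustrative; the path assumes a cgroup v1 memory hierarchy and the uid/job/step values from the log, and may differ on this system):

~# cat /sys/fs/cgroup/memory/slurm/uid_1001/job_65/step_batch/memory.oom_control

Under cgroup v2 the equivalent counter would be the oom_kill field in the step cgroup's memory.events.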
See below for my configuration.

*Configuration:*

/etc/default/grub:

~# grep -v "^#" /etc/default/grub
GRUB_DEFAULT=0
GRUB_TIMEOUT_STYLE=hidden
GRUB_TIMEOUT=10
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT=""
GRUB_CMDLINE_LINUX="biosdevname=0 cgroup_enable=memory swapaccount=1"
GRUB_GFXMODE="1024x768,800x600,auto"
GRUB_BACKGROUND="/boot/grub/bcm.png"
slurm.conf:

~# grep -v "^#" /cm/shared/apps/slurm/var/etc/bcm10-slurm/slurm.conf
SlurmUser=slurm
SlurmctldPort=6817
SlurmdPort=6818
AuthType=auth/munge
SlurmdSpoolDir=/cm/local/apps/slurm/var/spool
SwitchType=switch/none
MpiDefault=pmix
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmdPidFile=/var/run/slurmd.pid
ProctrackType=proctrack/cgroup
ReturnToService=2
TaskPlugin=task/cgroup
InactiveLimit=0
KillWait=30
MinJobAge=300
SlurmctldTimeout=300
SlurmdTimeout=300
Waittime=0
SlurmctldDebug=info
SlurmctldLogFile=/var/log/slurmctld
SlurmdDebug=info
SlurmdLogFile=/var/log/slurmd
JobAcctGatherType=jobacct_gather/cgroup
JobAcctGatherFrequency=30
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageUser=slurm
SlurmctldHost=bcm10-h01
AccountingStorageHost=master
NodeName=bcm10-n[01,02] Procs=4 CoresPerSocket=4 RealMemory=15988 SocketsPerBoard=1 ThreadsPerCore=1 Boards=1 MemSpecLimit=5120 Feature=location=local
PartitionName="defq" Default=YES MinNodes=1 DefaultTime=UNLIMITED MaxTime=UNLIMITED AllowGroups=ALL PriorityJobFactor=1 PriorityTier=1 OverSubscribe=NO PreemptMode=OFF AllowAccounts=ALL AllowQos=ALL Nodes=bcm10-n[01,02]
ClusterName=bcm10-slurm
SchedulerType=sched/backfill
StateSaveLocation=/cm/shared/apps/slurm/var/cm/statesave/bcm10-slurm
PrologFlags=Alloc
GresTypes=gpu
Prolog=/cm/local/apps/cmd/scripts/prolog
Epilog=/cm/local/apps/cmd/scripts/epilog
SelectType=select/cons_tres
SelectTypeParameters=CR_Core_Memory
cgroup.conf:

~# grep -v "^#" /cm/shared/apps/slurm/var/etc/bcm10-slurm/cgroup.conf
CgroupMountpoint="/sys/fs/cgroup"
CgroupAutomount=no
ConstrainCores=yes
ConstrainRAMSpace=yes
ConstrainSwapSpace=yes
ConstrainDevices=yes
AllowedRamSpace=100.00
AllowedSwapSpace=0.00
MemorySwappiness=1
MaxRAMPercent=100.00
MaxSwapPercent=100.00
MinRAMSpace=30
Best regards,
Lee