Slurm versions 24.11.5, 24.05.8, and 23.11.11 are now available (CVE-2025-43904)
Slurm versions 24.11.5, 24.05.8, and 23.11.11 are now available and include a fix for a recently discovered security issue.

SchedMD customers were informed on April 23rd and provided a patch on request; this process is documented in our security policy. [1]

A mistake with permission handling for Coordinators within Slurm's accounting system can allow a Coordinator to promote a user to Administrator. (CVE-2025-43904)

Thank you to Sekou Diakite (HPE) for reporting this.

Downloads are available at https://www.schedmd.com/downloads.php .

Release notes follow below.

- Tim

[1] https://www.schedmd.com/security-policy/

-- 
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support
* Changes in Slurm 24.11.5
==========================
 -- Return error to scontrol reboot on bad nodelists.
 -- slurmrestd - Report an error when QOS resolution fails for v0.0.40
    endpoints.
 -- slurmrestd - Report an error when QOS resolution fails for v0.0.41
    endpoints.
 -- slurmrestd - Report an error when QOS resolution fails for v0.0.42
    endpoints.
 -- data_parser/v0.0.42 - Added +inline_enums flag which modifies the output
    when generating OpenAPI specification. It causes enum arrays to not be
    defined in their own schema with references ($ref) to them. Instead they
    will be dumped inline.
 -- Fix binding error with tres-bind map/mask on partial node allocations.
 -- Fix stepmgr enabled steps being able to request features.
 -- Reject step creation if requested feature is not available in job.
 -- slurmd - Restrict listening for new incoming RPC requests further into
    startup.
 -- slurmd - Avoid auth/slurm related hangs of CLI commands during startup
    and shutdown.
 -- slurmctld - Restrict processing new incoming RPC requests further into
    startup. Stop processing requests sooner during shutdown.
 -- slurmctld - Avoid auth/slurm related hangs of CLI commands during startup
    and shutdown.
 -- slurmctld - Avoid race condition during shutdown or reconfigure that
    could result in a crash due to delayed processing of a connection while
    plugins are unloaded.
 -- Fix small memleak when getting the job list from the database.
 -- Fix incorrect printing of % escape characters when printing stdio fields
    for jobs.
 -- Fix padding parsing when printing stdio fields for jobs.
 -- Fix printing %A array job id when expanding patterns.
 -- Fix reservations causing jobs to be held for Bad Constraints.
 -- switch/hpe_slingshot - Prevent potential segfault on failed curl request
    to the fabric manager.
 -- Fix printing incorrect array job id when expanding stdio file names.
    The %A will now be substituted by the correct value.
 -- switch/hpe_slingshot - Fix vni range not updating on slurmctld restart or
    reconfigure.
 -- Fix steps not being created when using certain combinations of -c and -n
    lower than the job's requested resources, when using stepmgr and nodes
    are configured with CPUs == Sockets*CoresPerSocket.
 -- Permit configuring the number of retry attempts to destroy CXI service
    via the new destroy_retries SwitchParameter.
 -- Do not reset memory.high and memory.swap.max in slurmd startup or
    reconfigure as we are never really touching this in slurmd.
 -- Fix reconfigure failure of slurmd when it has been started manually and
    the CoreSpecLimits have been removed from slurm.conf.
 -- Set or reset CoreSpec limits when slurmd is reconfigured and it was
    started with systemd.
 -- switch/hpe_slingshot - Make sure the slurmctld can free step VNIs after
    the controller restarts or reconfigures while the job is running.
 -- Fix backup slurmctld failure on 2nd takeover.
 -- Testsuite - fix python test 130_2.
 -- Fix security issue where a coordinator could add a user with elevated
    privileges. CVE-2025-43904.

* Changes in Slurm 24.05.8
==========================
 -- Testsuite - fix python test 130_2.
 -- Fix security issue where a coordinator could add a user with elevated
    privileges. CVE-2025-43904.

* Changes in Slurm 23.11.11
===========================
 -- Fixed a job requeuing issue that merged job entries into the same SLUID
    when all nodes in a job failed simultaneously.
 -- Add ABORT_ON_FATAL environment variable to capture a backtrace from any
    fatal() message.
 -- Testsuite - fix python test 130_2.
 -- Fix security issue where a coordinator could add a user with elevated
    privileges. CVE-2025-43904.
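
For sites that want to script a check against the three fixed releases listed above, here is a minimal Python sketch. It is an illustration only, not part of Slurm: the names FIXED, parse_version and has_fix are hypothetical. It assumes the installed version is available as a string containing a dotted triple such as "slurm 24.11.5". Release series not named in this announcement return None, since the announcement does not say anything about them.

```python
import re

# Fixed releases named in this announcement, keyed by (major, minor) series.
FIXED = {
    (24, 11): (24, 11, 5),
    (24, 5): (24, 5, 8),
    (23, 11): (23, 11, 11),
}


def parse_version(text):
    """Extract (major, minor, micro) from a string such as 'slurm 24.11.5'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def has_fix(text):
    """Return True or False for a listed series, None for an unlisted one."""
    version = parse_version(text)
    fixed = FIXED.get(version[:2])
    if fixed is None:
        # Series not covered by this announcement; do not guess.
        return None
    # Tuples compare element by element, so 23.11.10 < 23.11.11 as intended.
    return version >= fixed
```

For example, has_fix("slurm 24.11.4") evaluates to False and has_fix("slurm 24.11.5") to True. Comparing integer tuples rather than raw strings avoids ordering "23.11.9" after "23.11.11".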