We are pleased to announce the availability of Slurm version 23.11.5.
The 23.11.5 release includes some important fixes related to newer features as well as some database fixes. The most noteworthy are a fix for the sattach command (which has only worked for root and SlurmUser since 23.11.0) and a fix for an issue in constructing the new lineage database entries. This last change will also perform a query during the upgrade from any prior 23.11 version to fix existing databases.
Slurm can be downloaded from https://www.schedmd.com/downloads.php .
-Tim
- Changes in Slurm 23.11.5
==========================
 -- Fix Debian package build on systems that are not able to query the systemd package.
 -- data_parser/v0.0.40 - Emit a warning instead of an error if a disabled parser is invoked.
 -- slurmrestd - Improve handling when content plugins rely on parsers that haven't been loaded.
 -- Fix old pending jobs dying (Slurm version 21.08.x and older) when upgrading Slurm due to "Invalid message version" errors.
 -- Have client commands sleep for progressively longer periods when backed off by the RPC rate limiting system.
 -- slurmctld - Ensure agent queue is flushed correctly at shutdown time.
 -- slurmdbd - Correct lineage construction during assoc table conversion for partition based associations.
 -- Add new RPCs and API call for faster querying of job states from slurmctld.
 -- slurmrestd - Add endpoint '/slurm/{data_parser}/jobs/state'.
 -- squeue - Add `--only-job-state` argument to use faster query of job states.
 -- Make a job requesting --no-requeue, or JobRequeue=0 in the slurm.conf, supersede RequeueExit[Hold].
 -- Add sackd man page to the Debian package.
 -- Fix issues with tasks when a job was shrunk more than once.
 -- Fix reservation update validation that rejected valid updates to a reservation while the reservation had running jobs.
 -- Fix possible segfault when the backup slurmctld is asserting control.
 -- Fix regression introduced in 23.02.4 where slurmctld was not properly tracking the total GRES selected for exclusive multi-node jobs, potentially and incorrectly bypassing limits.
 -- Fix tracking of a job's typeless GRES count when multiple typed GRES with the same name are also present in the job allocation. Otherwise, the job could bypass limits configured for the typeless GRES.
 -- Fix tracking of a job's typeless GRES count when the request specification has a typeless GRES name first and then typed GRES of different names (e.g. --gres=gpu:1,tmpfs:foo:2,tmpfs:bar:7). Otherwise, the job could bypass limits configured for the generic of the typed one (tmpfs in the example).
 -- Fix batch step not having SLURM_CLUSTER_NAME filled in.
 -- slurmstepd - Avoid error during `--container` job cleanup about RunTimeQuery never being configured. This occurs during cleanup where job steps have not fully started.
 -- Fix nodes not being rebooted when using salloc/sbatch/srun "--reboot" flag.
 -- Send scrun.lua in configless mode.
 -- Fix rejecting an interactive job whose extra constraint request cannot immediately be satisfied.
 -- Fix regression in 23.11.0 when parsing LogTimeFormat=iso8601_ms that prevented milliseconds from being printed.
 -- Fix issue where you could have a gpu allocated as well as a shard on that gpu allocated at the same time.
 -- Fix slurmctld crashes when using extra constraints with job arrays.
 -- sackd/slurmrestd/scrun - Avoid memory leak on new unix socket connection.
 -- The failed node field is filled when a node fails but does not time out.
 -- slurmrestd - Remove requiring job script field and job component script fields to both be populated in the `POST /slurm/v0.0.40/job/submit` endpoint as there can only be one batch step script for a job.
 -- slurmrestd - When job script is provided in '.jobs[].script' and '.script' fields, the '.script' field's value will be used in the `POST /slurm/v0.0.40/job/submit` endpoint.
 -- slurmrestd - Reject HetJob submission missing or empty batch script for first Het component in the `POST /slurm/v0.0.40/job/submit` endpoint.
 -- slurmrestd - Reject job when empty batch script submitted to the `POST /slurm/v0.0.40/job/submit` endpoint.
 -- Fix pam_slurm and pam_slurm_adopt when using auth/slurm.
 -- slurmrestd - Add 'cores_per_socket' field to `POST /slurm/v0.0.40/job/submit` endpoint.
 -- Fix srun and other Slurm commands running within a "configless" salloc when salloc itself fetched the config.
 -- Enforce binding with shared gres selection if requested.
 -- Fix job allocation failures when the requested tres type or name ends in "gres" or "license".
 -- accounting_storage/mysql - Fix lineage string construction when adding a user association with a partition.
 -- Fix sattach command.
 -- Fix ReconfigFlags. Due to how reconfig was changed in 23.11, they also influence slurmctld startup.
 -- Fix starting slurmd in configless mode if MUNGE support was disabled.
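
For anyone adjusting a client of the `POST /slurm/v0.0.40/job/submit` endpoint, the script-handling entries above can be read roughly as the following rule. This is only a sketch of the behavior as described in the notes, not slurmrestd's actual code: the helper name `resolve_batch_script` is hypothetical, the payload is assumed to be an already-parsed JSON object, and treating an empty '.script' as absent is an assumption.

```python
from typing import Any, Mapping, Optional


def resolve_batch_script(payload: Mapping[str, Any]) -> str:
    """Pick the batch script per the 23.11.5 notes (illustrative helper only)."""
    top_level: Optional[str] = payload.get("script")

    # For a HetJob, the notes say the first component must carry the script.
    jobs = payload.get("jobs") or []
    first_component: Optional[str] = jobs[0].get("script") if jobs else None

    # '.script' takes precedence when both fields are populated.
    # Assumption: an empty '.script' is treated the same as a missing one.
    script = top_level if top_level else first_component

    # A missing or empty batch script is rejected.
    if not script:
        raise ValueError("missing or empty batch script")
    return script


if __name__ == "__main__":
    payload = {
        "script": "#!/bin/sh\nsrun hostname\n",
        "jobs": [{"script": "#!/bin/sh\nsrun date\n"}],
    }
    print(resolve_batch_script(payload))
```

In practice this means a client only needs to populate one of the two fields; if it sends both, it should expect the top-level '.script' value to be the one that runs.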