[slurm-announce] Slurm version 23.02.5 is now available

Tim McMullan mcmullan at schedmd.com
Thu Sep 7 19:45:25 UTC 2023

We are pleased to announce the availability of Slurm version 23.02.5.

The 23.02.5 release includes a number of stability fixes and some fixes 
for notable regressions.

The SLURM_NTASKS environment variable, which as of 23.02.0 was no longer 
set when using --ntasks-per-node, has been restored to its 22.05 behavior 
of being set. The method by which it is set, however, is different and 
should be more accurate in more situations.
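For job scripts that need to work across the affected releases, one defensive pattern is to read SLURM_NTASKS when present and otherwise derive the task count from SLURM_NNODES and SLURM_NTASKS_PER_NODE. This is an illustrative sketch, assuming the usual relationship (total tasks = nodes x tasks-per-node) that holds when --ntasks-per-node is used without an explicit --ntasks:

```python
import os

def total_tasks(env=os.environ):
    """Return the job's total task count.

    Prefers SLURM_NTASKS (set again as of 23.02.5 when --ntasks-per-node
    is used), falling back to nodes x tasks-per-node for releases where
    the variable was missing (23.02.0 through 23.02.4).
    """
    ntasks = env.get("SLURM_NTASKS")
    if ntasks is not None:
        return int(ntasks)
    nnodes = env.get("SLURM_NNODES")
    per_node = env.get("SLURM_NTASKS_PER_NODE")
    if nnodes is not None and per_node is not None:
        return int(nnodes) * int(per_node)
    raise KeyError("cannot determine task count from environment")
```

For example, a job submitted with -N 2 --ntasks-per-node=4 would yield 8 either way.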

The mpi/pmi2 plugin now respects the SrunPortRange option, which matches 
the behavior of the mpi/pmix plugin as of 23.02.0.
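Sites that restrict srun communication ports should note that the same range now also bounds the pmi2 plugin's sockets. A hypothetical slurm.conf fragment (the range values are examples only, not recommendations):

```
# slurm.conf -- example values only
# Ports that srun (and, as of 23.02.5, mpi/pmi2) may listen on:
SrunPortRange=60001-63000
# Note: neither mpi/pmi2 nor mpi/pmix consults MpiParams=ports=.
```

Make sure the range is wide enough for the number of concurrent steps expected on a node; previously these plugins were limited only by the system's ephemeral port range.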

The --uid and --gid options for salloc and srun have been removed. These 
options had not worked correctly since the CVE-2022-29500 fix, in 
combination with some changes made in 23.02.0.

Slurm can be downloaded from https://www.schedmd.com/downloads.php.


Tim McMullan
Release Management, Support, and Development
SchedMD LLC - Commercial Slurm Development and Support

> * Changes in Slurm 23.02.5
> ==========================
>  -- Add the JobId to debug() messages indicating when cpus_per_task/mem_per_cpu
>     or pn_min_cpus are being automatically adjusted.
>  -- Fix regression in 23.02.2 that caused slurmctld -R to crash on startup if
>     a node features plugin is configured.
>  -- Fix and prevent recurring reservations from overlapping.
>  -- job_container/tmpfs - Avoid attempts to share BasePath between nodes.
>  -- Change the log message warning for rate limited users from verbose to info.
>  -- With CR_Cpu_Memory, fix node selection for jobs that request gres and
>     --mem-per-cpu.
>  -- Fix a regression from 22.05.7 in which some jobs were allocated too few
>     nodes, thus overcommitting cpus to some tasks.
>  -- Fix a job being stuck in the completing state if the job ends while the
>     primary controller is down or unresponsive and the backup controller has
>     not yet taken over.
>  -- Fix slurmctld segfault when a node registers with a configured CpuSpecList
>     while slurmctld configuration has the node without CpuSpecList.
>  -- Fix cloud nodes getting stuck in POWERED_DOWN+NO_RESPOND state after not
>     registering by ResumeTimeout.
>  -- slurmstepd - Avoid skipping spooldir cleanup for containers that lack a
>     config.json.
>  -- slurmstepd - Cleanup per task generated environment for containers in
>     spooldir.
>  -- Fix scontrol segfault when 'completing' command requested repeatedly in
>     interactive mode.
>  -- Properly handle a race condition between bind() and listen() calls in the
>     network stack when running with SrunPortRange set.
>  -- Federation - Fix revoked jobs being returned regardless of the -a/--all
>     option for privileged users.
>  -- Federation - Fix canceling pending federated jobs from non-origin clusters
>     which could leave federated jobs orphaned from the origin cluster.
>  -- Fix sinfo segfault when printing multiple clusters with --noheader option.
>  -- Federation - fix clusters not syncing if clusters are added to a federation
>     before they have registered with the dbd.
>  -- Change pmi2 plugin to honor the SrunPortRange option. This matches the new
>     behavior of the pmix plugin in 23.02.0. Note that neither of these plugins
>     makes use of the "MpiParams=ports=" option, and previously were only limited
>     by the system's ephemeral port range.
>  -- node_features/helpers - Fix node selection for jobs requesting changeable
>     features with the '|' operator, which could prevent jobs from running on
>     some valid nodes.
>  -- node_features/helpers - Fix inconsistent handling of '&' and '|', where an
>     AND'd feature was sometimes AND'd to all sets of features instead of just
>     the current set. E.g. "foo|bar&baz" was interpreted as {foo,baz} or
>     {bar,baz} instead of how it is documented: "{foo} or {bar,baz}".
>  -- Fix job accounting so that when a job is requeued its allocated node count
>     is cleared. After the requeue, sacct will correctly show that the job has
>     0 AllocNodes while it is pending or if it is canceled before restarting.
>  -- sacct - AllocCPUS now correctly shows 0 if a job has not yet received an
>     allocation or if the job was canceled before getting one.
>  -- Fix intel oneapi autodetect: detect the /dev/dri/renderD[0-9]+ gpus, and do
>     not detect /dev/dri/card[0-9]+.
>  -- Format batch, extern, interactive, and pending step ids into strings that
>     are human readable.
>  -- Fix node selection for jobs that request --gpus and a number of tasks fewer
>     than gpus, which resulted in incorrectly rejecting these jobs.
>  -- Remove MYSQL_OPT_RECONNECT completely.
>  -- Fix cloud nodes in POWERING_UP state disappearing (getting set to FUTURE)
>     when an `scontrol reconfigure` happens.
>  -- openapi/dbv0.0.39 - Avoid assert / segfault on missing coordinators list.
>  -- slurmrestd - Correct memory leak while parsing OpenAPI specification
>     templates with server overrides.
>  -- slurmrestd - Reduce memory usage when printing out job CPU frequency.
>  -- Fix overwriting user node reason with system message.
>  -- Remove --uid / --gid options from salloc and srun commands.
>  -- Prevent deadlock when rpc_queue is enabled.
>  -- slurmrestd - Correct OpenAPI specification generation bug where fields with
>     overlapping parent paths would not get generated.
>  -- Fix memory leak as a result of a partition info query.
>  -- Fix memory leak as a result of a job info query.
>  -- slurmrestd - For 'GET /slurm/v0.0.39/node[s]', change format of node's
>     energy field "current_watts" to a dictionary to account for unset value
>     instead of dumping 4294967294.
>  -- slurmrestd - For 'GET /slurm/v0.0.39/qos', change format of QOS's
>     field "priority" to a dictionary to account for unset value instead of
>     dumping 4294967294.
>  -- slurmrestd - For 'GET /slurm/v0.0.39/job[s]', the 'return_code' field
>     in v0.0.39_job_exit_code will be set to -127 instead of being left unset
>     when a job does not have a relevant return code.
>  -- data_parser/v0.0.39 - Add required/memory_per_cpu and
>     required/memory_per_node to `sacct --json` and `sacct --yaml` and
>     'GET /slurmdb/v0.0.39/jobs' from slurmrestd.
>  -- For step allocations, fix --gres=none sometimes not ignoring gres from the
>     job.
>  -- Fix --exclusive jobs being incorrectly gang-scheduled.
>  -- Fix allocations with CR_SOCKET, gres not assigned to a specific socket, and
>     block core distribution potentially allocating more sockets than required.
>  -- gpu/oneapi - Store cores correctly so CPU affinity is tracked.
>  -- Revert a change in 23.02.3 where Slurm would kill a script's process group
>     as soon as the script ended instead of waiting as long as any process in
>     that process group held the stdout/stderr file descriptors open. That change
>     broke some scripts that relied on the previous behavior. Setting time limits
>     for scripts (such as PrologEpilogTimeout) is strongly encouraged to avoid
>     Slurm waiting indefinitely for scripts to finish.
>  -- Allow slurmdbd -R to work if the root assoc id is not 1.
>  -- Fix slurmdbd -R not returning an error under certain conditions.
>  -- slurmdbd - Avoid potential NULL pointer dereference in the mysql plugin.
>  -- Revert a change in 23.02 where SLURM_NTASKS was no longer set in the job's
>     environment when --ntasks-per-node was requested.
>  -- Limit periodic node registrations to 50 instead of the full TreeWidth.
>     Since unresolvable cloud/dynamic nodes must disable fanout by setting
>     TreeWidth to a large number, this would cause all nodes to register at
>     once.
>  -- Fix regression in 23.02.3 which broke X11 forwarding for hosts when
>     MUNGE sends a localhost address in the encode host field. This occurs
>     when the node hostname is mapped to a localhost address (or similar)
>     in /etc/hosts.
>  -- openapi/[db]v0.0.39 - fix memory leak on parsing error.
>  -- data_parser/v0.0.39 - fix updating qos for associations.
>  -- openapi/dbv0.0.39 - fix updating values for associations with null users.
>  -- Fix minor memory leak with --tres-per-task and licenses.
>  -- Fix cyclic socket cpu distribution for tasks in a step where
>     --cpus-per-task < usable threads per core.
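The node_features/helpers fix above restores the documented precedence for changeable-feature expressions: '&' binds tighter than '|', so "foo|bar&baz" means {foo} or {bar,baz}. A minimal sketch of that expansion, purely illustrative and not Slurm's actual parser (no parentheses or feature counts are handled):

```python
def expand_feature_sets(expr):
    """Expand a feature request into its alternative feature sets.

    Documented precedence: '&' binds tighter than '|', so
    "foo|bar&baz" yields the alternatives {foo} and {bar, baz}.
    """
    return [frozenset(term.strip() for term in alt.split("&"))
            for alt in expr.split("|")]

def node_satisfies(expr, node_features):
    """True if the node's feature set contains any one alternative."""
    return any(alt <= node_features for alt in expand_feature_sets(expr))
```

Under these semantics a node with only "bar" does not match "foo|bar&baz", while nodes with "foo" alone, or with both "bar" and "baz", do.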
