[slurm-users] Slurm versions 20.02.3 and 19.05.7 are now available (CVE-2020-12693)

Tim Wickberg tim at schedmd.com
Thu May 21 20:54:58 UTC 2020


Slurm versions 20.02.3 and 19.05.7 are now available, and include a 
series of recent bug fixes, as well as a fix for a security issue with 
the optional message aggregation feature.

SchedMD customers were informed on May 7th and provided a patch on 
request; this process is documented in our security policy [1].

CVE-2020-12693:

A review of what was intended to be a minor cleanup patch uncovered an 
underlying race condition for systems with Message Aggregation enabled. 
This race condition could allow a user to launch a process as an 
arbitrary user.

This is only an issue for systems with Message Aggregation enabled, 
which we expect to be a small number of Slurm installations in practice.

Message Aggregation is off in Slurm by default, and is only enabled by 
MsgAggregationParams=WindowMsgs=<msgs>, where <msgs> is greater than 1. 
(Using Message Aggregation on your systems is not a recommended 
configuration at this time, and we may retire this subsystem in a future 
Slurm release in favor of other RPC aggregation techniques. However, care 
must be taken before disabling it, to avoid communication issues.)
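
As a rough aid for checking whether a site is affected, the condition above (MsgAggregationParams with WindowMsgs greater than 1) can be sketched in Python. This is a minimal sketch, not an official SchedMD tool: the function name is hypothetical, it only inspects the text it is given, it does not follow any files pulled in from that text, and it assumes that parameter names are matched case-insensitively and that '#' starts a comment.

```python
import re

# Hypothetical helper, not part of Slurm. It scans configuration text for a
# MsgAggregationParams setting whose WindowMsgs value is greater than 1.
_PARAM_RE = re.compile(r"(?i)\bMsgAggregationParams\s*=\s*(\S+)")


def message_aggregation_enabled(conf_text: str) -> bool:
    """Return True if any non-comment line sets WindowMsgs to more than 1."""
    for raw_line in conf_text.splitlines():
        # Assumption: everything after '#' on a line is a comment.
        line = raw_line.split("#", 1)[0]
        match = _PARAM_RE.search(line)
        if match is None:
            continue
        # The value is a comma-separated list such as "WindowMsgs=10,WindowTime=100".
        for item in match.group(1).split(","):
            key, _, value = item.partition("=")
            if key.lower() == "windowmsgs" and value.isdigit() and int(value) > 1:
                return True
    return False
```

The function takes the contents of slurm.conf as a string; the location of that file varies between installations. A True result only suggests that Message Aggregation is enabled, so the running configuration should still be confirmed on the cluster itself.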

Downloads are available at https://www.schedmd.com/downloads.php .

Release notes follow below.

- Tim

[1] https://www.schedmd.com/security.php

-- 
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support

> * Changes in Slurm 20.02.3
> ==========================
>  -- Factor in ntasks-per-core=1 with cons_tres.
>  -- Fix formatting in error message in cons_tres.
>  -- Fix calling stat on a NULL variable.
>  -- Fix minor memory leak when using reservations with flags=first_cores.
>  -- Fix gpu bind issue when CPUs=Cores and ThreadsPerCore > 1 on a node.
>  -- Fix --mem-per-gpu for heterogeneous --gres requests.
>  -- Fix slurmctld load order in load_all_part_state().
>  -- Fix race condition not finding jobacct gather task cgroup entry.
>  -- Suppress error message when selecting nodes on disjoint topologies.
>  -- Improve performance of _pack_default_job_details() with large number of job
>     arguments.
>  -- Fix archive loading of per-node req_mem for jobs from before 17.11.
>  -- Fix regression validating that --gpus-per-socket requires --sockets-per-node
>     for steps. Should only validate allocation requests.
>  -- error() instead of fatal() when parsing an invalid hostlist.
>  -- nss_slurm - fix potential deadlock in slurmstepd on overloaded systems.
>  -- cons_tres - fix --gres-flags=enforce-binding and related --cpus-per-gres.
>  -- cons_tres - Allocate lowest numbered cores when filtering cores with gres.
>  -- Fix getting system counts for named GRES/TRES.
>  -- MySQL - Fix for handling typed GRES for association rollups.
>  -- Fix step allocations when tasks_per_core > 1.
>  -- Fix allocating more GRES than requested when asking for multiple GRES types.

> * Changes in Slurm 19.05.7
> ==========================
>  -- Fix handling of -m/--distribution options for across socket/2nd level by
>     task/affinity plugin.
>  -- Fix grp_node_bitmap error when slurmctld started before slurmdbd.
>  -- Fix compilation issues in GCC10.
>  -- Fix distributing job steps across idle nodes within a job.
>  -- Break infinite loop in cons_tres dealing with incorrect tasks per tres
>     request resulting in slurmctld hang.
>  -- priority/multifactor - gracefully handle NULL list of associations or array
>     of siblings when calculating FairTree fairshare.
>  -- Fix cons_tres --exclusive=user to allocate only requested number of CPUs.
>  -- Add MySQL deadlock detection and automatic retry mechanism.
>  -- Fix _verify_node_state memory requested as --mem-per-gpu DefMemPerGPU.
>  -- Factor in ntasks-per-core=1 with cons_tres.
>  -- Fix formatting in error message in cons_tres.
>  -- Fix gpu bind issue when CPUs=Cores and ThreadsPerCore > 1 on a node.
>  -- Fix --mem-per-gpu for heterogeneous --gres requests.
>  -- Fix slurmctld load order in load_all_part_state().
>  -- Fix getting system counts for named GRES/TRES.
>  -- MySQL - Fix for handling typed GRES for association rollups.
>  -- Fix step allocations when tasks_per_core > 1.


