[slurm-users] Slurm version 17.11.4 available

Tim Wickberg tim at schedmd.com
Wed Feb 28 15:01:18 MST 2018

We are pleased to announce the availability of Slurm version 17.11.4.

This includes roughly 38 fixes made since 17.11.3 was released early 
this month.

Slurm can be downloaded from https://www.schedmd.com/downloads.php

- Tim

> * Changes in Slurm 17.11.4
> ==========================
>  -- Add fatal_abort() function to be able to get core dumps if we hit an
>     "impossible" edge case.
>  -- Link slurmd against all libraries that slurmstepd links to.
>  -- Fix the enforcement order of limits when they're set at partition and
>     other levels.
>  -- Add slurm_load_single_node() function to the Perl API.
>  -- slurm.spec - change dependency for --with lua to use pkgconfig.
>  -- Fix small memory leaks in node_features plugins on reconfigure.
>  -- slurmdbd - only permit requests to update resources from operators or
>     administrators.
>  -- Fix handling of partial writes in io_init_msg_write_to_fd() which can
>     lead to job step launch failure under higher cluster loads.
>  -- MYSQL - Fix to handle quotes in a given work_dir of a job.
>  -- sbcast - fix a race condition that leads to "Unspecified error".
>  -- Log that support for the ChosLoc configuration parameter will end in Slurm
>     version 18.08.
>  -- Fix backfill performance issue where bf_min_prio_reserve was not respected.
>  -- Fix MaxQueryTimeRange checks.
>  -- Print MaxQueryTimeRange in "sacctmgr show config".
>  -- Correctly check return codes when creating a step to determine whether
>     to wait and retry.
>  -- Fix issue where a job could be denied by Reason=MaxMemPerLimit when not
>     requesting any tasks.
>  -- In perl tools, fix a regexp that caused extra, incorrect results to be
>     shown.
>  -- Add some extra locks in fed_mgr to be extra safe.
>  -- Minor memory leak fixes in the fed_mgr on slurmctld shutdown.
>  -- Make sreport job reports also report duplicate jobs correctly.
>  -- Fix issues restoring certain Partition configuration elements, especially
>     when ReconfigFlags=KeepPartInfo is enabled.
>  -- Don't add TRES whose value is NO_VAL64 when building string line.
>  -- Fix removing array jobs from hash in slurmctld.
>  -- Print out missing user messages from jobsubmit plugin when srun/salloc are
>     waiting for an allocation.
>  -- Handle --clusters=all as case insensitive.
>  -- Only check requested clusters in federation when using --test-only
>     submission option.
>  -- In the federation, make it so you can cancel stranded sibling jobs.
>  -- Silence an error from PSS memory stat collection process.
>  -- Requeue jobs allocated to nodes requested to DRAIN or FAIL if nodes are
>     POWER_SAVE or POWER_UP, preventing jobs from starting on NHC-failed nodes.
>  -- Make MAINT and OVERLAP reservation flags order agnostic on overlap test.
>  -- Preserve node features, including active and available KNL features, when
>     slurmctld daemons are reconfigured.
>  -- Prevent creation of multiple io_timeout threads within srun, which can
>     lead to fatal() messages when those unexpected and additional mutexes are
>     destroyed when srun shuts down.
>  -- burst_buffer/cray - Prevent use of "#DW create_persistent" and
>     "#DW destroy_persistent" directives available in Cray CLE6.0UP06. This
>     will be supported in Slurm version 18.08. Use "#BB" directives until then.
>  -- Fix task/cgroup affinity to behave correctly.
>  -- FreeBSD - fix build on systems built with WITHOUT_KERBEROS.
>  -- Fix to restore pn_min_memory calculated result to correctly enforce
>     MaxMemPerCPU setting on a partition when the job uses --mem.
>  -- slurmdbd - prevent infinite loop if a QOS is set to preempt itself.
>  -- Fix issue with log rotation for slurmstepd processes.
