[slurm-announce] Slurm version 17.11.4 available
tim at schedmd.com
Wed Feb 28 15:01:18 MST 2018
We are pleased to announce the availability of Slurm version 17.11.4.
This includes roughly 38 fixes made since 17.11.3 was released early
Slurm can be downloaded from https://www.schedmd.com/downloads.php
> * Changes in Slurm 17.11.4
> -- Add fatal_abort() function to be able to get core dumps if we hit an
> "impossible" edge case.
> -- Link slurmd against all libraries that slurmstepd links to.
> -- Fix limit enforcement order when limits are set at partition and other levels.
> -- Add slurm_load_single_node() function to the Perl API.
> -- slurm.spec - change dependency for --with lua to use pkgconfig.
> -- Fix small memory leaks in node_features plugins on reconfigure.
> -- slurmdbd - only permit requests to update resources from operators or
> -- Fix handling of partial writes in io_init_msg_write_to_fd() which can
> lead to job step launch failure under higher cluster loads.
> -- MYSQL - Fix to handle quotes in a given work_dir of a job.
> -- sbcast - fix a race condition that leads to "Unspecified error".
> -- Log that support for the ChosLoc configuration parameter will end in Slurm
> version 18.08.
> -- Fix backfill performance issue where bf_min_prio_reserve was not respected.
> -- Fix MaxQueryTimeRange checks.
> -- Print MaxQueryTimeRange in "sacctmgr show config".
> -- Correctly check return codes when creating a step to check if needing to
> wait to retry or not.
> -- Fix issue where a job could be denied by Reason=MaxMemPerLimit when not
> requesting any tasks.
> -- In perl tools, fix a regexp that caused extra, incorrect results to be shown.
> -- Add some extra locks in fed_mgr to be extra safe.
> -- Minor memory leak fixes in the fed_mgr on slurmctld shutdown.
> -- Make sreport job reports also report duplicate jobs correctly.
> -- Fix issues restoring certain Partition configuration elements, especially
> when ReconfigFlags=KeepPartInfo is enabled.
> -- Don't add TRES whose value is NO_VAL64 when building string line.
> -- Fix removing array jobs from hash in slurmctld.
> -- Print out missing user messages from jobsubmit plugin when srun/salloc are
> waiting for an allocation.
> -- Handle --clusters=all as case insensitive.
> -- Only check requested clusters in federation when using --test-only
> submission option.
> -- In the federation, make it so you can cancel stranded sibling jobs.
> -- Silence an error from PSS memory stat collection process.
> -- Requeue jobs allocated to nodes requested to DRAIN or FAIL if nodes are
> POWER_SAVE or POWER_UP, preventing jobs from starting on NHC-failed nodes.
> -- Make MAINT and OVERLAP reservation flags order agnostic on overlap test.
> -- Preserve node features when slurmctld daemons reconfigured including active
> and available KNL features.
> -- Prevent creation of multiple io_timeout threads within srun, which can
> lead to fatal() messages when those unexpected and additional mutexes are
> destroyed when srun shuts down.
> -- burst_buffer/cray - Prevent use of "#DW create_persistent" and
> "#DW destroy_persistent" directives available in Cray CLE6.0UP06. This
> will be supported in Slurm version 18.08. Use "#BB" directives until then.
> -- Fix task/cgroup affinity to behave correctly.
> -- FreeBSD - fix build on systems built with WITHOUT_KERBEROS.
> -- Fix to restore pn_min_memory calculated result to correctly enforce
> MaxMemPerCPU setting on a partition when the job uses --mem.
> -- slurmdbd - prevent infinite loop if a QOS is set to preempt itself.
> -- Fix issue with log rotation for slurmstepd processes.
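
The changelog entry above about partial writes in io_init_msg_write_to_fd() refers to a general pitfall: a single write() call may accept fewer bytes than requested, especially under load, so the caller has to loop until the whole buffer is sent. The sketch below illustrates that general technique only. It is not Slurm's code (Slurm is written in C); the function name write_all and its write parameter are hypothetical, and it assumes a blocking file descriptor.

```python
import os


def write_all(fd, data, write=os.write):
    """Write every byte of `data` to `fd`, retrying after short writes.

    `write` is a hypothetical hook (defaulting to os.write) so that a
    short-writing stand-in can be substituted when experimenting.
    Assumes `fd` is a blocking descriptor.
    """
    view = memoryview(data)
    total = 0
    while total < len(view):
        # A write may accept only part of the remaining buffer.
        written = write(fd, view[total:])
        if written == 0:
            # No progress on a non-empty buffer: report it instead of spinning.
            raise OSError("write returned 0 bytes; giving up")
        total += written
    return total


if __name__ == "__main__":
    # Small demonstration over a pipe (payload kept well below pipe capacity).
    r, w = os.pipe()
    sent = write_all(w, b"launch message")
    os.close(w)
    print(sent, os.read(r, 64))
    os.close(r)
```

Treating the return value of a single write as "all or nothing" is the usual source of this class of bug; tracking the running byte count and resuming from that offset avoids it.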