[slurm-users] Slurm versions 20.02.1 and 19.05.6 are now available
tim at schedmd.com
Thu Mar 26 21:54:33 UTC 2020
We are pleased to announce the availability of Slurm versions 20.02.1
and 19.05.6.
This includes a series of minor fixes since the last maintenance
releases for both branches.
Please note that the 19.05.6 release is expected to be the last
maintenance release of that branch (barring any critical security
issues) as our support team has shifted their attention to the 20.02
release. Also note that support for the 18.08 release ended in
February; SchedMD customers are encouraged to upgrade to a supported
major release (20.02 or 19.05) at their earliest convenience.
Slurm can be downloaded from https://www.schedmd.com/downloads.php .
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support
> * Changes in Slurm 20.02.1
> -- Improve job state reason for jobs hitting partition_job_depth.
> -- Speed up testing of singleton dependencies.
> -- Fix negative loop bound in cons_tres.
> -- srun - capture the MPI plugin return code from mpi_hook_client_fini() and
> use as final return code for step failure.
> -- Fix segfault in cli_filter/lua.
> -- Fix --gpu-bind=map_gpu reusability if tasks > elements.
> -- Make sure config_flags on a gres are sent to the slurmctld on node
> -- Prolog/Epilog - Fix missing GPU information.
> -- Fix segfault when using config parser for expanded lines.
> -- Fix bit overlap test function.
> -- Don't accrue time if job begin time is in the future.
> -- Remove accrue time when updating a job start/eligible time to the future.
> -- Fix regression in 20.02.0 that broke --depend=expand.
> -- Reset begin time on job release if it's not in the future.
> -- Fix for recovering burst buffers when using high-availability.
> -- Fix invalid read due to freeing an incorrectly allocated env array.
> -- Update slurmctld -i message to warn about losing data.
> -- Fix scontrol cancel_reboot so it clears the DRAIN flag and node reason for a
> pending ASAP reboot.
> * Changes in Slurm 19.05.6
> -- Fix OverMemoryKill.
> -- Fix memory leak in scontrol show config.
> -- Remove PART_NODES reservation flag after ignoring it at creation.
> -- Fix deprecation of MemLimitEnforce parameter.
> -- X11 forwarding - alter Xauthority regex to work when "FamilyWild"
> are present in the "xauth list" output.
> -- Fix memory leak when utilizing core reservations.
> -- Fix issue where adding WCKeys and then using them right away didn't always
> work.
> -- Add cosmetic batch step to correct component in a hetjob.
> -- Fix to make scontrol write config create a usable config without editing.
> -- Fix memory leak when pinging backup controller.
> -- Fix issue with 'scontrol update' not enforcing all QoS / Association limits.
> -- Fix to properly schedule certain jobs with cons_tres plugin.
> -- Fix FIRST_CORES for reservations when using cons_tres.
> -- Fix sbcast -C argument parsing.
> -- Replace/deprecate max_job_bf with bf_max_job_test and print error message.
> -- sched/backfill - fix options parsing when bf_hetjob_prio enabled.
> -- Fix for --gpu-bind when no gpus requested.
> -- Fix sshare -l crash with large values.
> -- Fix printing NULL job and step pointers.
> -- Break infinite loop in cons_tres dealing with incorrect tasks per tres
> request resulting in slurmctld hang.
> -- Improve handling of --gpus-per-task to make sure appropriate number of GPUs
> is assigned to job.