[slurm-announce] Slurm version 21.08.1 is now available
Tim Wickberg
tim at schedmd.com
Thu Sep 16 21:45:09 UTC 2021
We are pleased to announce the availability of Slurm version 21.08.1.
For sites using scrontab, there is a critical fix included to ensure
that the cron jobs continue to repeat indefinitely into the future.
Slurm can be downloaded from https://www.schedmd.com/downloads.php .
- Tim
--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support
> * Changes in Slurm 21.08.1
> ==========================
> -- Fix potential memory leak if a problem happens while allocating GRES for
> a job.
> -- If an overallocation of GRES happens, terminate the creation of a job.
> -- AutoDetect=nvml: Fatal if no devices found in MIG mode.
> -- slurm.spec - fix querying for PMIx and UCX version.
> -- Print federation and cluster sacctmgr error messages to stderr.
> -- Fix off-by-one error in --gpu-bind=mask_gpu.
> -- Fix statement condition in http_parser autoconf macro.
> -- Fix statement condition in netloc autoconf macro.
> -- Add --gpu-bind=none to disable gpu binding when using --gpus-per-task.
> -- Handle the burst buffer state "alloc-revoke" which previously would not
> display in the job correctly.
> -- Fix issue in the slurmstepd SPANK prolog/epilog handler where configuration
> values were used before being initialized.
> -- Restore a step's ability to utilize all of an allocation's memory if --mem=0.
> -- Fix --cpu-bind=verbose garbage taskid.
> -- Fix cgroup task affinity issues from garbage taskid info.
> -- Make gres_job_state_validate() client logging behavior as before 44466a4641.
> -- Fix steps with --hint overriding an allocation with --threads-per-core.
> -- Require requesting a GPU if --mem-per-gpu is requested.
> -- Return error early if a job is requesting --ntasks-per-gpu and no gpus or
> task count.
> -- Properly clear out pending step if unavailable to run with available
> resources.
> -- Kill all processes spawned by burst_buffer.lua including descendants.
> -- openapi/v0.0.{35,36,37} - Avoid setting default values of min_cpus,
> job name, cwd, mail_type, and contiguous on job update.
> -- openapi/v0.0.{35,36,37} - Clear user hold on job update if hold=false.
> -- Prevent CRON_JOB flag from being cleared when loading job state.
> -- sacctmgr - Fix deleting WCKeys when not specifying a cluster.
> -- Fix getting memory for a step when the first node in the step isn't the
> first node in the allocation.
> -- Make SelectTypeParameters=CR_Core_Memory default for cons_tres and cons_res.
> -- Correctly handle mutex unlocks in the gres code if failures happen.
> -- Give better error message if -m plane is given with no size.
> -- Fix --distribution=arbitrary for salloc.
> -- Fix jobcomp/script regression introduced in 21.08.0rc1 0c75b9ac9d.
> -- Only send the batch node in the step_hostlist in the job credential.
> -- When setting affinity for the batch step don't assume the batch host is node
> 0.
> -- In task/affinity better checking for node existence when laying out
> affinity.
> -- slurmrestd - fix job submission with auth/jwt.
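
For readers unfamiliar with the --gpu-bind=mask_gpu option touched by the off-by-one fix above, here is a minimal sketch of how such a mask list can be read. It is an illustration only, not Slurm's implementation. It assumes the option takes a comma-separated list of hexadecimal masks, one per task, with an optional "0x" prefix, and that bit N of a mask selects GPU index N. The function names gpus_from_mask and parse_mask_gpu are hypothetical.

```python
from typing import List


def gpus_from_mask(mask: str) -> List[int]:
    """Return the GPU indices whose bits are set in a hexadecimal mask.

    Assumption: bit 0 corresponds to GPU index 0, bit 1 to GPU index 1, etc.
    """
    value = int(mask, 16)  # base 16 accepts an optional "0x" prefix
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


def parse_mask_gpu(option: str) -> List[List[int]]:
    """Split a 'mask_gpu:<m0>,<m1>,...' string into per-task GPU index lists."""
    prefix = "mask_gpu:"
    if not option.startswith(prefix):
        raise ValueError("expected an option of the form mask_gpu:<list>")
    return [gpus_from_mask(m) for m in option[len(prefix):].split(",")]


if __name__ == "__main__":
    # Task 0 gets GPU 0, task 1 gets GPU 1, task 2 gets GPUs 2 and 3.
    for task, gpus in enumerate(parse_mask_gpu("mask_gpu:0x1,0x2,0xc")):
        print(f"task {task}: GPUs {gpus}")
```

An off-by-one error in this kind of code typically comes from treating the lowest bit as index 1 rather than index 0, which shifts every task onto the neighboring GPU.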