[slurm-users] Slurm version 21.08.1 is now available

Tim Wickberg tim at schedmd.com
Thu Sep 16 21:45:09 UTC 2021


We are pleased to announce the availability of Slurm version 21.08.1.

For sites using scrontab, there is a critical fix included to ensure 
that the cron jobs continue to repeat indefinitely into the future.

Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

-- 
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support

> * Changes in Slurm 21.08.1
> ==========================
>  -- Fix potential memory leak if a problem happens while allocating GRES for
>     a job.
>  -- If an overallocation of GRES happens, terminate the creation of a job.
>  -- AutoDetect=nvml: Fatal if no devices found in MIG mode.
>  -- slurm.spec - fix querying for PMIx and UCX version.
>  -- Print federation and cluster sacctmgr error messages to stderr.
>  -- Fix off-by-one error in --gpu-bind=mask_gpu.
>  -- Fix statement condition in http_parser autoconf macro.
>  -- Fix statement condition in netloc autoconf macro.
>  -- Add --gpu-bind=none to disable gpu binding when using --gpus-per-task.
>  -- Handle the burst buffer state "alloc-revoke" which previously would not
>     display in the job correctly.
>  -- Fix issue in the slurmstepd SPANK prolog/epilog handler where configuration
>     values were used before being initialized.
>  -- Restore a step's ability to utilize all of an allocation's memory if --mem=0.
>  -- Fix --cpu-bind=verbose garbage taskid.
>  -- Fix cgroup task affinity issues from garbage taskid info.
>  -- Make gres_job_state_validate() client logging behavior as before 44466a4641.
>  -- Fix steps with --hint overriding an allocation with --threads-per-core.
>  -- Require requesting a GPU if --mem-per-gpu is requested.
>  -- Return error early if a job is requesting --ntasks-per-gpu and no gpus or
>     task count.
>  -- Properly clear out pending step if unavailable to run with available
>     resources.
>  -- Kill all processes spawned by burst_buffer.lua including descendants.
>  -- openapi/v0.0.{35,36,37} - Avoid setting default values of min_cpus,
>     job name, cwd, mail_type, and contiguous on job update.
>  -- openapi/v0.0.{35,36,37} - Clear user hold on job update if hold=false.
>  -- Prevent CRON_JOB flag from being cleared when loading job state.
>  -- sacctmgr - Fix deleting WCKeys when not specifying a cluster.
>  -- Fix getting memory for a step when the first node in the step isn't the
>     first node in the allocation.
>  -- Make SelectTypeParameters=CR_Core_Memory default for cons_tres and cons_res.
>  -- Correctly handle mutex unlocks in the gres code if failures happen.
>  -- Give better error message if -m plane is given with no size.
>  -- Fix --distribution=arbitrary for salloc.
>  -- Fix jobcomp/script regression introduced in 21.08.0rc1 0c75b9ac9d.
>  -- Only send the batch node in the step_hostlist in the job credential.
>  -- When setting affinity for the batch step don't assume the batch host is node
>     0.
>  -- In task/affinity better checking for node existence when laying out
>     affinity.
>  -- slurmrestd - fix job submission with auth/jwt.
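
Two of the GPU-related entries above (a GPU must be requested when --mem-per-gpu is given, and --ntasks-per-gpu is rejected early without GPUs or a task count) can be pictured as a client-side pre-check. The following is only a rough sketch, not Slurm's own validation code: the function names are hypothetical, the set of options treated as a "GPU request" is an assumption, and the second rule is read as "neither a GPU request nor --ntasks was given".

```python
# Hypothetical pre-submission check; not part of Slurm and not its actual logic.
# Assumption: these options (plus --gres=gpu...) count as "requesting a GPU".
GPU_REQUEST_FLAGS = ("--gpus", "--gpus-per-node", "--gpus-per-socket", "--gpus-per-task")


def parse_options(argv):
    """Turn ['--mem-per-gpu=4G', '--gpus=2'] into {'--mem-per-gpu': '4G', '--gpus': '2'}."""
    opts = {}
    for arg in argv:
        name, _, value = arg.partition("=")
        opts[name] = value
    return opts


def requests_gpu(opts):
    """Return True if any option in opts asks for at least one GPU."""
    if any(flag in opts for flag in GPU_REQUEST_FLAGS):
        return True
    # Entries such as --gres=gpu:2 or --gres=gpu:a100:1 are treated as GPU requests.
    return any(item.split(":")[0] == "gpu" for item in opts.get("--gres", "").split(","))


def check(argv):
    """Return a list of problems found in a list of long-form job options."""
    opts = parse_options(argv)
    problems = []
    if "--mem-per-gpu" in opts and not requests_gpu(opts):
        problems.append("--mem-per-gpu requires a GPU request")
    if "--ntasks-per-gpu" in opts and not requests_gpu(opts) and "--ntasks" not in opts:
        problems.append("--ntasks-per-gpu requires a GPU request or a task count")
    return problems
```

For example, check(["--mem-per-gpu=4G"]) reports a problem, while check(["--mem-per-gpu=4G", "--gpus=2"]) returns an empty list. A request combining --gpus-per-task=1 with the new --gpu-bind=none also passes this sketch, since it only looks at the two rules named above.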


