[slurm-announce] Slurm version 20.11.8 is now available

Tim Wickberg tim at schedmd.com
Thu Jul 1 23:00:17 UTC 2021


We are pleased to announce the availability of Slurm version 20.11.8.

This includes a number of minor-to-moderate severity bug fixes.

Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

-- 
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support

> * Changes in Slurm 20.11.8
> ==========================
>  -- slurmctld - fix erroneous "StepId=CORRUPT" messages in error logs.
>  -- Correct the error given when auth plugin fails to pack a credential.
>  -- Fix unused-variable compiler warning on FreeBSD in fd_resolve_path().
>  -- acct_gather_filesystem/lustre - only emit collection error once per step.
>  -- srun - leave SLURM_DIST_UNKNOWN as default for --interactive.
>  -- Add GRES environment variables (e.g., CUDA_VISIBLE_DEVICES) into the
>     interactive step, the same as is done for the batch step.
>  -- Fix various potential deadlocks when altering objects in the database
>     dealing with every cluster in the database.
>  -- slurmrestd - handle slurmdbd connection failures without segfaulting.
>  -- slurmrestd - fix segfault for searches in slurmdb/v0.0.36/jobs.
>  -- slurmrestd - remove (non-functioning) users query parameter for
>     slurmdb/v0.0.36/jobs from openapi.json.
>  -- slurmrestd - fix segfault in slurmrestd db/jobs with numeric queries.
>  -- slurmrestd - add argv handling for job/submit endpoint.
>  -- srun - fix broken node step allocation in a heterogeneous allocation.
>  -- Fail step creation if -n is not a multiple of --ntasks-per-gpu.
>  -- job_container/tmpfs - Fix slowdown on teardown.
>  -- Fix problem with SlurmctldProlog where requeued jobs would never launch.
>  -- job_container/tmpfs - Fix issue when restarting slurmd where the namespace
>     mount points could disappear.
>  -- sacct - avoid truncating JobId at 34 characters.
>  -- scancel - fix segfault when --wckey filtering option is used.
>  -- select/cons_tres - Fix memory leak.
>  -- Prevent file descriptor leak in job_container/tmpfs on slurmd restart.
>  -- slurmrestd/dbv0.0.36 - Fix values dumped in job state/current and
>     job step state.
>  -- slurmrestd/dbv0.0.36 - Correct description for previous state property.
>  -- perlapi/libslurmdb - expose tres_req_str to job hash.
>  -- scrontab - close and reopen temporary crontab file to deal with editors
>     that do not change the original file, but instead write out then rename
>     a new file.
>  -- sstat - fix linking so that it will work when --without-shared-libslurm
>     was used to build Slurm.
>  -- Clear allocated cpus for running steps in a job before handling requested
>     nodes on new step.
>  -- Don't reject a step if not enough nodes are available. Instead, defer the
>     step until enough nodes are available to satisfy the request.
>  -- Don't reject a step if it requests at least one specific node that is
>     already allocated to another step. Instead, defer the step until the
>     requested node(s) become available.
>  -- slurmrestd - add description for slurmdb/job endpoint.
>  -- Better handling of --mem=0.
>  -- Ignore DefCpuPerGpu when --cpus-per-task given.
>  -- sacct - fix segfault when printing StepId (or when using --long).
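
For readers wondering why the scrontab entry above needs a close and reopen: some editors save by writing a new file and renaming it over the original, so a handle opened before the edit still refers to the old file. The Python sketch below illustrates that general POSIX behaviour only. It is not Slurm's code, and the names edit_by_rename, demo and crontab.tmp are made up for the illustration.

```python
import os
import tempfile


def edit_by_rename(path, new_text):
    """Mimic an editor that saves by writing a new file, then renaming it
    over the original (hypothetical helper, for illustration only)."""
    tmp_path = path + ".new"
    with open(tmp_path, "w") as tmp:
        tmp.write(new_text)
    os.replace(tmp_path, path)  # the path now points at a different file


def demo():
    """Return (text seen through the old handle, text seen after reopening)."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "crontab.tmp")
        with open(path, "w") as f:
            f.write("# original\n")

        stale = open(path)                 # opened before the "editor" runs
        edit_by_rename(path, "# edited\n")

        stale.seek(0)
        seen_by_old_handle = stale.read()  # still the replaced file's content
        stale.close()

        with open(path) as fresh:          # reopening by path finds the new file
            seen_after_reopen = fresh.read()

    return seen_by_old_handle, seen_after_reopen


if __name__ == "__main__":
    print(demo())
```

On a POSIX system the old handle keeps returning the pre-edit text, and only the reopened file shows the edit.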

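The -n / --ntasks-per-gpu entry above comes down to a divisibility check. A minimal Python sketch of that arithmetic follows. The function name and error text are assumptions made for illustration, not taken from the Slurm sources, and reading the quotient as a GPU count is likewise an assumption.

```python
def gpus_implied_by_tasks(ntasks, ntasks_per_gpu):
    """Return ntasks / ntasks_per_gpu when it divides evenly.

    Raises ValueError otherwise, mirroring the idea that such a step
    request is rejected. Hypothetical helper, not a Slurm API.
    """
    if ntasks_per_gpu <= 0:
        raise ValueError("ntasks_per_gpu must be positive")
    if ntasks % ntasks_per_gpu != 0:
        raise ValueError(
            f"-n {ntasks} is not a multiple of "
            f"--ntasks-per-gpu {ntasks_per_gpu}"
        )
    return ntasks // ntasks_per_gpu


if __name__ == "__main__":
    print(gpus_implied_by_tasks(8, 2))   # evenly divisible request
    try:
        gpus_implied_by_tasks(7, 2)      # not a multiple, so it is rejected
    except ValueError as err:
        print("rejected:", err)
```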


