[slurm-users] Slurm version 20.11.8 is now available
Tim Wickberg
tim at schedmd.com
Thu Jul 1 23:00:17 UTC 2021
We are pleased to announce the availability of Slurm version 20.11.8.
This includes a number of minor-to-moderate severity bug fixes.
Slurm can be downloaded from https://www.schedmd.com/downloads.php .
- Tim
--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support
> * Changes in Slurm 20.11.8
> ==========================
> -- slurmctld - fix erroneous "StepId=CORRUPT" messages in error logs.
> -- Correct the error given when auth plugin fails to pack a credential.
> -- Fix unused-variable compiler warning on FreeBSD in fd_resolve_path().
> -- acct_gather_filesystem/lustre - only emit collection error once per step.
> -- srun - leave SLURM_DIST_UNKNOWN as default for --interactive.
> -- Add GRES environment variables (e.g., CUDA_VISIBLE_DEVICES) into the
> interactive step, the same as is done for the batch step.
> -- Fix various potential deadlocks when altering objects in the database
> dealing with every cluster in the database.
> -- slurmrestd - handle slurmdbd connection failures without segfaulting.
> -- slurmrestd - fix segfault for searches in slurmdb/v0.0.36/jobs.
> -- slurmrestd - remove (non-functioning) users query parameter for
> slurmdb/v0.0.36/jobs from openapi.json.
> -- slurmrestd - fix segfault in slurmrestd db/jobs with numeric queries.
> -- slurmrestd - add argv handling for job/submit endpoint.
> -- srun - fix broken node step allocation in a heterogeneous allocation.
> -- Fail step creation if -n is not multiple of --ntasks-per-gpu.
> -- job_container/tmpfs - Fix slowdown on teardown.
> -- Fix problem with SlurmctldProlog where requeued jobs would never launch.
> -- job_container/tmpfs - Fix issue when restarting slurmd where the namespace
> mount points could disappear.
> -- sacct - avoid truncating JobId at 34 characters.
> -- scancel - fix segfault when --wckey filtering option is used.
> -- select/cons_tres - Fix memory leak.
> -- Prevent file descriptor leak in job_container/tmpfs on slurmd restart.
> -- slurmrestd/dbv0.0.36 - Fix values dumped in job state/current and
> job step state.
> -- slurmrestd/dbv0.0.36 - Correct description for previous state property.
> -- perlapi/libslurmdb - expose tres_req_str to job hash.
> -- scrontab - close and reopen temporary crontab file to deal with editors
> that do not change the original file, but instead write out then rename
> a new file.
> -- sstat - fix linking so that it will work when --without-shared-libslurm
> was used to build Slurm.
> -- Clear allocated cpus for running steps in a job before handling requested
> nodes on new step.
> -- Don't reject a step if not enough nodes are available. Instead, defer the
> step until enough nodes are available to satisfy the request.
> -- Don't reject a step if it requests at least one specific node that is
> already allocated to another step. Instead, defer the step until the
> requested node(s) become available.
> -- slurmrestd - add description for slurmdb/job endpoint.
> -- Better handling of --mem=0.
> -- Ignore DefCpuPerGpu when --cpus-per-task given.
> -- sacct - fix segfault when printing StepId (or when using --long).
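
The scrontab entry above concerns editors that save by writing a new file and renaming it over the original, rather than modifying the original in place. A minimal sketch of why reopening by path matters in that case follows. It is only an illustration in Python on a POSIX filesystem, not Slurm's implementation; the helper names (simulate_rename_editor, edit_and_read) and the file name crontab.tmp are hypothetical.

```python
import os
import tempfile


def simulate_rename_editor(path, new_text):
    """Hypothetical stand-in for an editor that writes a new file,
    then renames it over the original instead of editing in place."""
    directory = os.path.dirname(path)
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as tmp:
        tmp.write(new_text)
    # After this call, `path` names a different inode than before.
    os.replace(tmp_path, path)


def edit_and_read(original_text, new_text):
    """Return (text seen through the old handle, text seen after reopening)."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "crontab.tmp")
        with open(path, "w+") as handle:
            handle.write(original_text)
            handle.flush()
            simulate_rename_editor(path, new_text)
            # The handle opened before the edit still refers to the old inode.
            handle.seek(0)
            stale = handle.read()
        # Closing and reopening by path picks up the renamed file.
        with open(path) as reopened:
            fresh = reopened.read()
    return stale, fresh


stale, fresh = edit_and_read("0 * * * * old.sh\n", "30 * * * * new.sh\n")
print("old handle:", stale.strip())
print("reopened:  ", fresh.strip())
```

On a POSIX system the handle opened before the edit still returns the original text, while the reopened file returns the edited text, which is why a tool that keeps its temporary file open across the editor session would miss the user's changes.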