[slurm-users] Slurm version 20.02.2 is now available

Tim Wickberg tim at schedmd.com
Thu Apr 30 19:49:55 UTC 2020


We are pleased to announce the availability of Slurm version 20.02.2.

This includes a series of moderate and minor fixes since the last 
maintenance release.

Slurm can be downloaded from https://www.schedmd.com/downloads.php .

- Tim

-- 
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support

> * Changes in Slurm 20.02.2
> ==========================
>  -- Fix slurmctld segfault when checking no_consume GRES node allocation counts.
>  -- Fix resetting of cloud_dns on a reconfigure.
>  -- squeue - change output for dependency column to use "(null)" instead of ""
>     for no dependencies as documented in the man page, and used by other columns.
>  -- Clear node_cnt_wag after job update.
>  -- Fix regression where AccountingStoreJobComment was not defaulting to 'yes'.
>  -- Send registration message immediately after a node is resumed.
>  -- Cray - Fix hetjobs when using only a single component in the step launch.
>  -- Cray - Fix hetjobs launched without component 0.
>  -- Cray - Quiet cookies missing message which is expected for hetjobs.
>  -- Fix handling of -m/--distribution options for across socket/2nd level by
>     task/affinity plugin.
>  -- Fix grp_node_bitmap error when slurmctld started before slurmdbd.
>  -- Fix scheduling issue when there are not enough nodes available to run a job
>     resulting in possible job starvation.
>  -- Make it so mpi/cray_shasta appears in srun --mpi=list
>  -- Don't requeue jobs that have been explicitly canceled.
>  -- Fix error message for a regular user trying to update licenses on a running
>     job.
>  -- Fix backup slurmctld handling for logrotation via SIGUSR2.
>  -- Fix reservation feature specification when looking for inactive features
>     after active features fails.
>  -- Prevent misleading error messages for reservation creation.
>  -- Print message in scontrol when a request fails for not having enough nodes.
>  -- Fix duplicate output in sacct with multiple resv events.
>  -- auth/jwt - return correct gid for a given user. This was incorrectly
>     assuming the user's primary group name matched their username.
>  -- slurmrestd - permit non-SlurmUser/root job submission.
>  -- Use host IP if hostname unknown for job submission for allocating node.
>  -- Fix issue with primary_slurmdbd_resumed_operation trigger not happening
>     on slurmctld restart.
>  -- Fix race in acct_gather_interconnect/ofed on step termination.
>  -- Fix typo of SlurmctldProlog -> PrologSlurmctld in error message.
>  -- slurm.spec - add SuSE-specific dependencies for optional slurmrestd package.
>  -- Fix FreeBSD build issues.
>  -- Fixed sbatch not processing --ignore-pbs in batch script.
>  -- Don't clear the qos_id of an invalid QOS.
>  -- Allow a job that was once FAIL_[QOS|ACCOUNT] to be eligible again if
>     the qos|account limitation is remedied.
>  -- Fix core reservations using the FLEX flag to allow use of resources
>     outside of the reservation allocation.
>  -- Fix MPS without File with 1 GPU, and without GPUs.
>  -- Add FreeBSD support to proctrack/pgid plugin.
>  -- Fix remote dependency testing for meta job in job array.
>  -- Fix preemption when dealing with a job array.
>  -- Don't send remote non-pending singleton dependencies on federation update.
>  -- slurmrestd - fix crash on empty query.
>  -- Fix race condition which could lead to invalid references in backfill.
>  -- Fix edge case in _remove_job_hash().
>  -- Fix exit code when using --cluster/-M client options.
>  -- Fix compilation issues in GCC10.
>  -- Fix invalid references when federated job is revoked while in backfill loop.
>  -- Fix distributing job steps across idle nodes within a job.
>  -- Fix detected floating reservation overlapping.
>  -- Break infinite loop in cons_tres dealing with incorrect tasks per tres
>     request resulting in slurmctld hang.
>  -- Send the current (not the previous) reason for a pending job to client
>     commands like squeue/scontrol.
>  -- Fix incorrect lock levels for select_g_reconfigure().
>  -- Handle hidden nodes correctly in slurmrestd.
>  -- Allow sacctmgr to use MaxSubmitP[U|A] as format options.
>  -- Fix segfault when trying to delete a corrupted association.
>  -- Fix setting ntasks-per-core when using --multithread.
>  -- Only override job wait reason to priority if Reason=None or
>     Reason=Resources.
>  -- Perl API / seff - fix missing symbol issue with accounting_storage/slurmdbd.
>  -- slurm.spec - add --with cray_shasta option.
>  -- Downgrade "Node config differ.." error message if config_overrides enabled.
>  -- Add client error when using --gpus-per-socket without --sockets-per-node.
>  -- Fix nvml/rsmi debug statements making it to stderr.
>  -- NodeSets - fix slurmctld segfault in newer glibc if any nodes have no
>     defined features.
>  -- ConfigLess - write out plugstack config to correct config file name in
>     the config cache.
>  -- priority/multifactor - gracefully handle NULL list of associations or array
>     of siblings when calculating FairTree fairshare.
>  -- Fix cons_tres --exclusive=user to allocate only requested number of CPUs.
>  -- Add MySQL deadlock detection and automatic retry mechanism.
>  -- Reject repeating floating reservations as they aren't supported.
>  -- Fix testing of reservation flags that may be NO_VAL64.
>  -- Fix _verify_node_state memory requested as --mem-per-gpu DefMemPerGPU.
>  -- Fix DependencyNeverSatisfied not set as the job's state reason if
>     kill_invalid_depend or --kill-on-invalid-dep are used.
>  -- pam_slurm_adopt - explicitly call slurm_conf_init().
>  -- configless - fix plugstack.conf handling for client commands.
>  -- Set SLURM_JOB_USER and SLURM_JOB_UID in task_epilog correctly.
>  -- slurmrestd - authenticate job submissions by SlurmUser properly.


