[slurm-users] Slurm version 20.02.2 is now available
Tim Wickberg
tim at schedmd.com
Thu Apr 30 19:49:55 UTC 2020
We are pleased to announce the availability of Slurm version 20.02.2.
This includes a series of moderate and minor fixes since the last
maintenance release.
Slurm can be downloaded from https://www.schedmd.com/downloads.php .
- Tim
--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support
> * Changes in Slurm 20.02.2
> ==========================
> -- Fix slurmctld segfault when checking no_consume GRES node allocation counts.
> -- Fix resetting of cloud_dns on a reconfigure.
> -- squeue - change output for dependency column to use "(null)" instead of ""
> for no dependencies as documented in the man page, and used by other columns.
> -- Clear node_cnt_wag after job update.
> -- Fix regression where AccountingStoreJobComment was not defaulting to 'yes'.
> -- Send registration message immediately after a node is resumed.
> -- Cray - Fix hetjobs when using only a single component in the step launch.
> -- Cray - Fix hetjobs launched without component 0.
> -- Cray - Quiet cookies missing message which is expected for hetjobs.
> -- Fix handling of -m/--distribution options for across socket/2nd level by
> task/affinity plugin.
> -- Fix grp_node_bitmap error when slurmctld started before slurmdbd.
> -- Fix scheduling issue when there are not enough nodes available to run a job
> resulting in possible job starvation.
> -- Make it so mpi/cray_shasta appears in srun --mpi=list
> -- Don't requeue jobs that have been explicitly canceled.
> -- Fix error message for a regular user trying to update licenses on a running
> job.
> -- Fix backup slurmctld handling for logrotation via SIGUSR2.
> -- Fix reservation feature specification when looking for inactive features
> after active features fails.
> -- Prevent misleading error messages for reservation creation.
> -- Print message in scontrol when a request fails for not having enough nodes.
> -- Fix duplicate output in sacct with multiple resv events.
> -- auth/jwt - return correct gid for a given user. This was incorrectly
> assuming the user's primary group name matched their username.
> -- slurmrestd - permit non-SlurmUser/root job submission.
> -- Use host IP if hostname unknown for job submission for allocating node.
> -- Fix issue with primary_slurmdbd_resumed_operation trigger not happening
> on slurmctld restart.
> -- Fix race in acct_gather_interconnect/ofed on step termination.
> -- Fix typo of SlurmctldProlog -> PrologSlurmctld in error message.
> -- slurm.spec - add SuSE-specific dependencies for optional slurmrestd package.
> -- Fix FreeBSD build issues.
> -- Fixed sbatch not processing --ignore-pbs in batch script.
> -- Don't clear the qos_id of an invalid QOS.
> -- Allow a job that was once FAIL_[QOS|ACCOUNT] to be eligible again if
> the qos|account limitation is remedied.
> -- Fix core reservations using the FLEX flag to allow use of resources
> outside of the reservation allocation.
> -- Fix MPS without File with 1 GPU, and without GPUs.
> -- Add FreeBSD support to proctrack/pgid plugin.
> -- Fix remote dependency testing for meta job in job array.
> -- Fix preemption when dealing with a job array.
> -- Don't send remote non-pending singleton dependencies on federation update.
> -- slurmrestd - fix crash on empty query.
> -- Fix race condition which could lead to invalid references in backfill.
> -- Fix edge case in _remove_job_hash().
> -- Fix exit code when using --cluster/-M client options.
> -- Fix compilation issues in GCC10.
> -- Fix invalid references when federated job is revoked while in backfill loop.
> -- Fix distributing job steps across idle nodes within a job.
> -- Fix detected floating reservation overlapping.
> -- Break infinite loop in cons_tres dealing with incorrect tasks per tres
> request resulting in slurmctld hang.
> -- Send the current (not the previous) reason for a pending job to client
> commands like squeue/scontrol.
> -- Fix incorrect lock levels for select_g_reconfigure().
> -- Handle hidden nodes correctly in slurmrestd.
> -- Allow sacctmgr to use MaxSubmitP[U|A] as format options.
> -- Fix segfault when trying to delete a corrupted association.
> -- Fix setting ntasks-per-core when using --multithread.
> -- Only override job wait reason to priority if Reason=None or
> Reason=Resources.
> -- Perl API / seff - fix missing symbol issue with accounting_storage/slurmdbd.
> -- slurm.spec - add --with cray_shasta option.
> -- Downgrade "Node config differ.." error message if config_overrides enabled.
> -- Add client error when using --gpus-per-socket without --sockets-per-node.
> -- Fix nvml/rsmi debug statements making it to stderr.
> -- NodeSets - fix slurmctld segfault in newer glibc if any nodes have no
> defined features.
> -- ConfigLess - write out plugstack config to correct config file name in
> the config cache.
> -- priority/multifactor - gracefully handle NULL list of associations or array
> of siblings when calculating FairTree fairshare.
> -- Fix cons_tres --exclusive=user to allocate only requested number of CPUs.
> -- Add MySQL deadlock detection and automatic retry mechanism.
> -- Reject repeating floating reservations as they aren't supported.
> -- Fix testing of reservation flags that may be NO_VAL64.
> -- Fix _verify_node_state memory requested as --mem-per-gpu DefMemPerGPU.
> -- Fix DependencyNeverSatisfied not set as the job's state reason if
> kill_invalid_depend or --kill-on-invalid-dep are used.
> -- pam_slurm_adopt - explicitly call slurm_conf_init().
> -- configless - fix plugstack.conf handling for client commands.
> -- Set SLURM_JOB_USER and SLURM_JOB_UID in task_epilog correctly.
> -- slurmrestd - authenticate job submissions by SlurmUser properly.
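
The squeue change listed above (the dependency column printing "(null)" instead of "" when a job has no dependencies) may affect scripts that parse squeue output. A minimal sketch of a parser that accepts both forms follows. The pipe-delimited "jobid|dependency" layout, the sample lines, and the function names are assumptions made for illustration; they are not taken from the release notes or from any Slurm tool.

```python
def parse_dependency(field):
    """Return None when a job has no dependencies, else the raw string."""
    value = field.strip()
    # Older output left the column empty; 20.02.2 prints "(null)".
    # Treat both as "no dependencies" so the script works on either side
    # of the upgrade.
    if value in ("", "(null)"):
        return None
    return value


def parse_lines(lines):
    """Map job id -> dependency (or None) from 'jobid|dependency' lines.

    The line layout is hypothetical; adjust the delimiter and column
    order to match whatever format string your site passes to squeue.
    """
    result = {}
    for line in lines:
        job_id, _, dep = line.partition("|")
        result[job_id.strip()] = parse_dependency(dep)
    return result


# Made-up sample input covering the new form, a real dependency,
# and the pre-20.02.2 empty form.
sample = ["101|(null)", "102|afterok:101", "103|"]
jobs = parse_lines(sample)
```

Normalizing both spellings to a single sentinel keeps downstream checks (for example, `if jobs[job_id] is None`) independent of which Slurm version produced the output.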