<div dir="ltr"><div dir="ltr">Is there a direct upgrade path from
20.11.0 to 22.05.6, or does it have to be done in multiple steps?<br clear="all"><div dir="ltr" class="gmail_signature"><div><br></div><div>Sid Young</div></div><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Nov 11, 2022 at 7:53 AM Marshall Garey <<a href="mailto:marshall@schedmd.com">marshall@schedmd.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">We are pleased to announce the availability of Slurm version 22.05.6.<br>
<br>
This release includes a fix to core selection for job steps that could <br>
cause random task launch failures, along with a number of other <br>
moderate-severity fixes.<br>
<br>
- Marshall<br>
<br>
--<br>
Marshall Garey<br>
Release Management, Support, and Development<br>
SchedMD LLC - Commercial Slurm Development and Support<br>
<br>
> * Changes in Slurm 22.05.6<br>
> ==========================<br>
> -- Fix a partition's DisableRootJobs=no being ignored and preventing root jobs from working.<br>
> -- Fix the number of allocated cpus for an auto-adjustment case in which the<br>
> job requests --ntasks-per-node and --mem (per-node) but the limit is<br>
> MaxMemPerCPU.<br>
> -- Fix POWER_DOWN_FORCE request leaving node in completing state.<br>
> -- Do not count magnetic reservation queue records towards backfill limits.<br>
> -- Clarify error message when --send-libs=yes or BcastParameters=send_libs<br>
> fails to identify shared library files, and avoid creating an empty<br>
> "<filename>_libs" directory on the target filesystem.<br>
> -- Fix missing CoreSpec on dynamic nodes upon slurmctld restart.<br>
> -- Fix node state reporting when using specialized cores.<br>
> -- Fix number of CPUs allocated if --cpus-per-gpu used.<br>
> -- Add flag ignore_prefer_validation to not validate --prefer on a job.<br>
> -- Fix salloc/sbatch SLURM_TASKS_PER_NODE output environment variable when the<br>
> number of tasks is not requested.<br>
> -- Permit using wildcard magic cookies with X11 forwarding.<br>
> -- cgroup/v2 - Add check for swap when running OOM check after task<br>
> termination.<br>
> -- Fix deadlock caused by race condition when disabling power save with a<br>
> reconfigure.<br>
> -- Fix memory leak in the dbd when container is sent to the database.<br>
> -- openapi/dbv0.0.38 - correct dbv0.0.38_tres_info.<br>
> -- Fix node SuspendTime, SuspendTimeout, ResumeTimeout being updated after<br>
> altering partition node lists with scontrol.<br>
> -- jobcomp/elasticsearch - fix data_t memory leak after serialization.<br>
> -- Fix issue where '*' wasn't accepted in gpu/cpu bind.<br>
> -- Fix SLURM_GPUS_ON_NODE for shared GPU gres (MPS, shards).<br>
> -- Add SLURM_SHARDS_ON_NODE environment variable for shards.<br>
> -- Fix srun error with overcommit.<br>
> -- Fix bug in core selection for the default cyclic distribution of tasks<br>
> across sockets, which resulted in random task launch failures.<br>
> -- Fix core selection for steps requesting multiple tasks per core when<br>
> allocation contains more cores than required for step.<br>
> -- gpu/nvml - Fix MIG minor number generation when GPU minor number<br>
> (/dev/nvidia[minor_number]) and index (as seen in nvidia-smi) do not match.<br>
> -- Fix accrue time underflow errors after slurmctld reconfig or restart.<br>
> -- Suppress errant errors from prolog_complete about being unable to locate<br>
> "node:(null)".<br>
> -- Fix issue where shards were selected from multiple gpus and failed to<br>
> allocate.<br>
> -- Fix step cpu count calculation when using --ntasks-per-gpu=.<br>
> -- Fix overflow problems when validating array index parameters in slurmctld<br>
> and prevent a potential condition causing slurmctld to crash.<br>
> -- Remove dependency on json-c in slurmctld when running with power saving.<br>
> Only the new "SLURM_RESUME_FILE" support relies on this, and that support<br>
> will instead be disabled if json-c support is unavailable.<br>
<br>
</blockquote></div></div>