[slurm-users] Slurm version 20.11.1 is now available

Michael Di Domenico mdidomenico4 at gmail.com
Fri Dec 11 17:06:45 UTC 2020


that's helpful, but it's still a little vague (for non-devs) what those
two commits actually mean.  i suspect the bug/fix is fairly benign,
but it sounds scary.  perhaps it's just the wording.
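
for anyone else trying to picture the failure mode Jeffrey describes in the
quoted message below, here is a rough sketch. it is a toy simulation only, not
Slurm or MySQL code: ToyServer, ToyConnection, ConnectionLost and run() are all
made-up names, and the "server" is just a dict of per-session state. the
assumption is that session state (open transaction, temp tables) lives on the
server side and vanishes when the connection drops.

```python
class ConnectionLost(Exception):
    """Raised when the link to the (simulated) server has dropped."""


class ToyServer:
    """Hypothetical stand-in for a database server; state is kept per session."""

    def __init__(self):
        self._next_id = 0
        self.sessions = {}  # session id -> per-session state

    def open_session(self):
        self._next_id += 1
        self.sessions[self._next_id] = {"in_transaction": False, "temp_tables": set()}
        return self._next_id

    def drop_session(self, session_id):
        # A dropped connection discards everything tied to that session.
        self.sessions.pop(session_id, None)


class ToyConnection:
    """Hypothetical client handle with an optional silent-reconnect behaviour."""

    def __init__(self, server, auto_reconnect):
        self.server = server
        self.auto_reconnect = auto_reconnect
        self.session_id = server.open_session()

    def _session(self):
        state = self.server.sessions.get(self.session_id)
        if state is None:
            if not self.auto_reconnect:
                raise ConnectionLost("session gone; caller must reconnect and rebuild state")
            # Silent reconnect: a brand-new, empty session replaces the old one.
            self.session_id = self.server.open_session()
            state = self.server.sessions[self.session_id]
        return state

    def begin(self):
        self._session()["in_transaction"] = True

    def create_temp_table(self, name):
        self._session()["temp_tables"].add(name)

    def describe(self):
        state = self._session()
        return state["in_transaction"], sorted(state["temp_tables"])


def run(auto_reconnect):
    server = ToyServer()
    conn = ToyConnection(server, auto_reconnect)
    conn.begin()
    conn.create_temp_table("scratch")
    server.drop_session(conn.session_id)  # simulate a timeout or network blip
    try:
        return conn.describe()
    except ConnectionLost:
        return "error surfaced to client"
```

with auto_reconnect on, run(True) comes back looking healthy but with no
transaction and no temp table, so the client carries on against state that no
longer exists. with it off, run(False) hits the error path, where a real client
could reconnect and rebuild what it needs.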

On Fri, Dec 11, 2020 at 11:13 AM Jeffrey T Frey <frey at udel.edu> wrote:
>
> It's in the github commits:
>
>
> https://github.com/SchedMD/slurm/commit/8e84db0f01ecd4c977c12581615d74d59b3ff995
>
>
> The primary issue is that any state the client program established on the connection after first making it (e.g. opening a transaction, creating temp tables) won't be present if the MySQL client library automatically reconnects to the server.  So the state of the reconnected session won't match the state the client expects.  Better for the client to know the connection failed and reconnect on its own to reestablish state.
>
>
>
> > On Dec 11, 2020, at 10:31 , Malte Thoma <Malte.Thoma at awi.de> wrote:
> >
> >
> >
> > On 11.12.20 at 14:11, Michael Di Domenico wrote:
> >>>  -- Disable MySQL automatic reconnection.
> >> can you expand on this?  seems an 'odd' thing to disable.
> >
> > same thoughts here :-)
> >
> >
> >> On Thu, Dec 10, 2020 at 4:44 PM Tim Wickberg <tim at schedmd.com> wrote:
> >>>
> >>> We are pleased to announce the availability of Slurm version 20.11.1.
> >>>
> >>> This includes a number of fixes made in the month since 20.11 was
> >>> initially released, including critical fixes to nss_slurm and the Perl
> >>> API when used with the newer configless mode of operation.
> >>>
> >>> Slurm can be downloaded from https://www.schedmd.com/downloads.php .
> >>>
> >>> - Tim
> >>>
> >>> --
> >>> Tim Wickberg
> >>> Chief Technology Officer, SchedMD LLC
> >>> Commercial Slurm Development and Support
> >>>
> >>>> * Changes in Slurm 20.11.1
> >>>> ==========================
> >>>>  -- Fix spelling of "overcomited" to "overcomitted" in sreport's cluster
> >>>>     utilization report.
> >>>>  -- Silence debug message about shutting down backup controllers if none are
> >>>>     configured.
> >>>>  -- Don't create interactive srun until PrologSlurmctld is done.
> >>>>  -- Fix fd symlink path resolution.
> >>>>  -- Fix slurmctld segfault on subnode reservation restore after node
> >>>>     configuration change.
> >>>>  -- Fix resource allocation response message environment allocation size.
> >>>>  -- Ensure that details->env_sup is NULL terminated.
> >>>>  -- select/cray_aries - Correctly remove jobs/steps from blades using NPC.
> >>>>  -- cons_tres - Avoid max_node_gres when entire node is allocated with
> >>>>     --ntasks-per-gpu.
> >>>>  -- Allow NULL arg to data_get_type().
> >>>>  -- In sreport have usage for a reservation contain all jobs that ran in the
> >>>>     reservation instead of just the ones that ran in the time specified. This
> >>>>     matches the reservation report, which is not truncated to a time period.
> >>>>  -- Fix issue with sending wrong batch step id to a < 20.11 slurmd.
> >>>>  -- Add a job's alloc_node to lua for job modification and completion.
> >>>>  -- Fix regression getting a slurmdbd connection through the perl API.
> >>>>  -- Stop the extern step terminate monitor right after proctrack_g_wait().
> >>>>  -- Fix removing the normalized priority of assocs.
> >>>>  -- slurmrestd/v0.0.36 - Use correct name for partition field:
> >>>>     "min nodes per job" -> "min_nodes_per_job".
> >>>>  -- slurmrestd/v0.0.36 - Add node comment field.
> >>>>  -- Fix regression marking cloud nodes as "unexpectedly rebooted" after
> >>>>     multiple boots.
> >>>>  -- Fix slurmctld segfault in _slurm_rpc_job_step_create().
> >>>>  -- slurmrestd/v0.0.36 - Filter node states against NODE_STATE_BASE to avoid
> >>>>     the extended states all being reported as "invalid".
> >>>>  -- Fix race that can prevent the prolog for a requeued job from running.
> >>>>  -- cli_filter - add "type" to readily distinguish between the CLI command in
> >>>>     use.
> >>>>  -- smail - reduce sleep before seff to 5 seconds.
> >>>>  -- Ensure SPANK prolog and epilog run without an explicit PlugStackConfig.
> >>>>  -- Disable MySQL automatic reconnection.
> >>>>  -- Fix allowing "b" after memory unit suffixes.
> >>>>  -- Fix slurmctld segfault with reservations without licenses.
> >>>>  -- Due to internal restructuring ahead of the 20.11 release, applications
> >>>>     calling libslurm MUST call slurm_init(NULL) before any API calls.
> >>>>     Otherwise the API call is likely to fail due to libslurm's internal
> >>>>     configuration not being available.
> >>>>  -- slurm.spec - allow custom paths for PMIx and UCX install locations.
> >>>>  -- Use rpath if enabled when testing for Mellanox's UCX libraries.
> >>>>  -- slurmrestd/dbv0.0.36 - Change user query for associations to optional.
> >>>>  -- slurmrestd/dbv0.0.36 - Change account query for associations to optional.
> >>>>  -- mpi/pmix - change the error handler error message to be more useful.
> >>>>  -- Add missing connection in acct_storage_p_{clear_stats, reconfig, shutdown}.
> >>>>  -- Perl API - fix issue when running in configless mode.
> >>>>  -- nss_slurm - avoid deadlock when stray sockets are found.
> >>>>  -- Display correct value for ScronParameters in 'scontrol show config'.
> >>>
> >
> > --
> > Malte Thoma        Tel. +49-471-4831-1828
> > HSM Documentation: https://spaces.awi.de/x/YF3-Eg (User)
> >                   https://spaces.awi.de/x/oYD8B  (Admin)
> > HPC Documentation: https://spaces.awi.de/x/Z13-Eg (User)
> >                   https://spaces.awi.de/x/EgCZB (Admin)
> > AWI, Geb.E (3125)
> > Am Handelshafen 12
> > 27570 Bremerhaven
> > Tel. +49-471-4831-1828
> >
>
>


