We are pleased to announce the availability of Slurm version 23.11.7.
The 23.11.7 release fixes a few potential slurmctld crashes triggered by
less common job submission options, a slurmrestd compatibility issue with
auth/slurm, and a number of other minor and moderate severity bugs.
Slurm can be downloaded from https://www.schedmd.com/downloads.php .
-Marshall
> * Changes in Slurm 23.11.7
> ==========================
> -- slurmrestd - Correct OpenAPI specification for
> 'GET /slurm/v0.0.40/jobs/state' having response as null.
> -- Allow running jobs on overlapping partitions if jobs don't specify -s.
> -- Fix segfault when requesting a shared gres along with an exclusive
> allocation.
> -- Fix regression in 23.02 where afternotok and afterok dependencies were
> rejected for federated jobs not running on the origin cluster of the
> submitting job.
> -- slurmctld - Disable job table locking while job state cache is active when
> replying to `squeue --only-job-state` or `GET /slurm/v0.0.40/jobs/state`.
> -- Fix sanity check when setting tres-per-task on the job allocation as well as
> the step.
> -- slurmrestd - Fix compatibility with auth/slurm.
> -- Fix issue where TRESRunMins drifts from the correct value if using
> QOS UsageFactor != 1.
> -- slurmrestd - Require `user` and `association_condition` fields to be
> populated for requests to 'POST /slurmdb/v0.0.40/users_association'.
> -- Avoid a slurmctld crash with extra_constraints enabled when a job requests
> certain invalid --extra values.
> -- `scancel --ctld` and `DELETE /slurm/v0.0.40/jobs` - Fix support for job
> array expressions (e.g. 1_[3-5]). Also fix signaling a single pending array
> task (e.g. 1_10), which previously signaled the whole array job instead.
> -- Fix a possible slurmctld segfault after a failure to create an
> external launcher step.
> -- Allow the slurmctld to open a connection to the slurmdbd if the first
> attempt fails due to a protocol error.
> -- mpi/cray_shasta - Fix launch for non-het-steps within a hetjob.
> -- sacct - Fix "gpuutil" TRES usage output being incorrect when using --units.
> -- Fix a rare deadlock on slurmctld shutdown or reconfigure.
> -- Fix issue that only left one thread on each core available when "CPUs=" is
> configured to total thread count on multi-threaded hardware and no other
> topology info ("Sockets=", "CoresPerSocket", etc.) is configured.
> -- Fix the external launcher step not being allocated a VNI when requested.
> -- jobcomp/kafka - Fix payload length when producing and sending a message.
> -- scrun - Avoid a crash if RunTimeDelete is called before the container
> finishes.
> -- Save the slurmd's cred_state while reconfiguring to prevent the loss of
> job credentials.
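For reference, the scancel behavior covered by the array expression fix
above can be exercised with invocations along these lines (the job ID and
task indices are hypothetical):

    scancel --ctld "1_[3-5]"   # job array expression, now handled correctly
    scancel --ctld 1_10        # signal a single pending array task without
                               # signaling the whole array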
Slurm User Group (SLUG) 2024 is set for September 12-13 at the
University of Oslo in Oslo, Norway.
Registration information and a high-level schedule can be found
here: https://slug24.splashthat.com/
The deadline to submit a presentation abstract is Friday, May 31st. We
do not intend to extend this deadline.
If you are interested in presenting your own usage, developments, site
report, tutorial, etc about Slurm, please fill out the following
form: https://forms.gle/N7bFo5EzwuTuKkBN7
Notifications of final presentations accepted will go out by Friday, June 14th.
--
Victoria Hobson
SchedMD LLC
Vice President of Marketing
We are pleased to announce the availability of Slurm release candidate
24.05.0rc1.
To highlight some new features coming in 24.05:
- (Optional) isolated Job Step management. Enabled on a job-by-job basis
with the --stepmgr option, or globally through
SlurmctldParameters=enable_stepmgr.
- Federation - Allow for client command operation while SlurmDBD is
unavailable.
- New MaxTRESRunMinsPerAccount and MaxTRESRunMinsPerUser QOS limits.
- New USER_DELETE reservation flag.
- New Flags=rebootless option on Features for node_features/helpers
which indicates the given feature can be enabled without rebooting the node.
- Cloud power management options: New "max_powered_nodes=<limit>" option
in SlurmctldParameters, and new SuspendExcNodes=<nodes>:<count> syntax
allowing for <count> nodes out of a given node list to be excluded (see
the configuration sketch after this list).
- StdIn/StdOut/StdErr now stored in SlurmDBD accounting records for
batch jobs.
- New switch/nvidia_imex plugin for IMEX channel management on NVIDIA
systems.
- New RestrictedCoresPerGPU option at the Node level, designed to ensure
GPU workloads always have access to a certain number of CPUs even when
nodes are running non-GPU workloads concurrently.
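As a rough sketch of how a few of these options could be combined (the node
names, counts, and script name below are hypothetical, illustrative values
only, not a tested configuration):

    # slurm.conf
    SlurmctldParameters=enable_stepmgr,max_powered_nodes=100
    SuspendExcNodes=node[001-010]:2   # exclude 2 nodes of this list from power saving
    NodeName=gpu[01-04] Gres=gpu:4 RestrictedCoresPerGPU=2

    # per-job step management instead of the global setting
    sbatch --stepmgr batch_script.sh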
This is the first release candidate of the upcoming 24.05 release
series, and represents the end of development for this release, and a
finalization of the RPC and state file formats.
If any issues are identified with this release candidate, please report
them through https://bugs.schedmd.com against the 24.05.x version and we
will address them before the first production 24.05.0 release is made.
Please note that the release candidates are not intended for production use.
A preview of the updated documentation can be found at
https://slurm.schedmd.com/archive/slurm-master/ .
Slurm can be downloaded from https://www.schedmd.com/downloads.php .
--
Marshall Garey
Release Management, Support, and Development
SchedMD LLC - Commercial Slurm Development and Support
We are pleased to announce the availability of Slurm version 23.11.6.
The 23.11.6 release includes fixes for two different problems with the
priority/multifactor plugin: a crash and a miscalculation of
AssocGrpCPURunMinutes after a slurmctld reconfiguration/restart.
The wsrep_on errors seen by sites running MySQL or older MariaDB should
now happen much less frequently, and a clarifying statement is logged when
the error is innocuous.
Slurm can be downloaded from https://www.schedmd.com/downloads.php .
-Marshall
> * Changes in Slurm 23.11.6
> ==========================
> -- Avoid limiting sockets per node to one when using gres enforce-binding.
> -- slurmrestd - Avoid permission denied errors when attempting to listen on
> the same port multiple times.
> -- Fix GRES reservations where the GRES has no topology
> (no cores= in gres.conf).
> -- Ensure that thread_id_rpc is gone before priority_g_fini().
> -- Fix scontrol reboot timeout removing drain state from nodes.
> -- squeue - Print header on empty response to `--only-job-state`.
> -- Fix slurmrestd not ending a job properly when xauth is not present and an
> x11 job is sent.
> -- Add experimental job state caching with
> SchedulerParameters=enable_job_state_cache to speed up querying job states
> with squeue --only-job-state.
> -- slurmrestd - Correct dumping of invalid ArrayJobIds returned from
> 'GET /slurm/v0.0.40/jobs/state'.
> -- squeue - Correct dumping of invalid ArrayJobIds returned from
> `squeue --only-job-state --{json|yaml}`.
> -- If scancel --ctld is not used with --interactive, --sibling, or specific
> step ids, then this option issues a single request to the slurmctld to
> signal all jobs matching the specified filters. This greatly improves
> the performance of slurmctld and scancel. The updated --ctld option also
> fixes issues with the --partition or --reservation scancel options for jobs
> that requested multiple partitions or reservations.
> -- slurmrestd - Give EINVAL error when failing to parse signal name to numeric
> signal.
> -- slurmrestd - Allow ContentBody for all methods per RFC7230 even if ignored.
> -- slurmrestd - Add 'DELETE /slurm/v0.0.40/jobs' endpoint to allow bulk job
> signaling via slurmctld.
> -- Fix combination of --nodelist and --exclude not always respecting the
> excluded node list.
> -- Fix jobs incorrectly allocating nodes exclusively when started on a
> partition that doesn't enforce it. This could happen if a multi-partition
> job doesn't specify --exclusive and is evaluated first on a partition
> configured with OverSubscribe=EXCLUSIVE but ends up starting in a partition
> configured with OverSubscribe!=EXCLUSIVE evaluated afterwards.
> -- Setting the GLOB_SILENCE flag no longer exposes old buggy behavior.
> -- Fix associations AssocGrpCPURunMinutes being incorrectly computed for
> running jobs after a controller reconfiguration/restart.
> -- Fix scheduling of jobs that request --gpus when nodes have different node
> weights and different numbers of GPUs.
> -- slurmrestd - Add "NO_CRON_JOBS" as possible flag value to the following:
> 'DELETE /slurm/v0.0.40/jobs' flags field.
> 'DELETE /slurm/v0.0.40/job/{job_id}?flags=' flags query parameter.
> -- Fix scontrol segfault/assert failure if the TRESPerNode parameter is used
> when creating reservations.
> -- Avoid checking for wsrep_on when restoring streaming replication settings.
> -- Clarify in the logs that error "1193 Unknown system variable 'wsrep_on'" is
> innocuous.
> -- accounting_storage/mysql - Fix problem when loading reservations from an
> archive dump.
> -- slurmdbd - Fix minor race condition when sending updates to a shutdown
> slurmctld.
> -- slurmctld - Fix invalid refusal of a reservation update.
> -- openapi - Fix memory leak of /meta/slurm/cluster response field.
> -- Fix memory leak when using auth/slurm and AuthInfo=use_client_ids.
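For sites interested in the experimental job state cache mentioned above,
the relevant pieces are roughly as follows (illustrative only; the feature
is experimental and its behavior may change):

    # slurm.conf
    SchedulerParameters=enable_job_state_cache

    # fast job state queries that can use the cache
    squeue --only-job-state --json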
--
Marshall Garey
Release Management, Support, and Development
SchedMD LLC - Commercial Slurm Development and Support
Slurm User Group (SLUG) 2024 is set for September 12-13 at the
University of Oslo in Oslo, Norway.
Registration information and a high-level schedule can be found here:
https://slug24.splashthat.com/
We invite all interested attendees to submit a presentation abstract
to be given at SLUG. Presentation content can be in the form of a
tutorial, technical presentation or site report.
SLUG 2024 is sponsored and organized by the University of Oslo and
SchedMD. This international event is open to those who want to:
- Learn more about Slurm, a highly scalable resource manager and job scheduler
- Share their knowledge and experience with other users and administrators
- Get detailed information about the latest features and developments
- Share requirements and discuss future developments
Everyone who wants to present their own usage, developments, site
report, or tutorial about Slurm is invited to submit abstract details
here: https://forms.gle/N7bFo5EzwuTuKkBN7
Abstracts are due Friday, May 31st and notifications of acceptance
will go out by Friday, June 14th.
--
Victoria Hobson
SchedMD LLC
Vice President of Marketing
Slurm major releases are moving to a six month release cycle. This
change starts with the upcoming Slurm 24.05 release this May. Slurm
24.11 will follow in November 2024. Major releases then continue every
May and November in 2025 and beyond.
There are two main goals of this change:
- Faster delivery of newer features and functionality for customers.
- "Predictable" release timing, especially for those sites that would
prefer to upgrade during an annual system maintenance window.
SchedMD will be adjusting our handling of backwards-compatibility within
Slurm itself, and how SchedMD's support services will handle older releases.
For the 24.05 release, Slurm will still only support upgrading from (and
mixed-version operations with) the prior two releases (23.11, 23.02).
Starting with 24.11, Slurm will start supporting upgrades from the prior
three releases (24.05, 23.11, 23.02).
SchedMD's Slurm Support has been built around an 18-month cycle. This
18-month cycle has traditionally covered the current stable release plus
one prior major release. With the increase in release frequency, this
support window will now cover the current stable release plus two prior
major releases.
The blog post version of this announcement includes a table that
outlines the updated support lifecycle:
https://www.schedmd.com/slurm-releases-move-to-a-six-month-cycle/
- Tim
--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support
We are pleased to announce the availability of Slurm version 23.11.5.
The 23.11.5 release includes some important fixes related to newer
features as well as some database fixes. The most noteworthy fixes
include fixing the sattach command (which only worked for root and
SlurmUser after 23.11.0) and fixing an issue while constructing the new
lineage database entries. This last change will also perform a query
during the upgrade from any prior 23.11 version to fix existing databases.
Slurm can be downloaded from https://www.schedmd.com/downloads.php .
-Tim
> * Changes in Slurm 23.11.5
> ==========================
> -- Fix Debian package build on systems that are not able to query the systemd
> package.
> -- data_parser/v0.0.40 - Emit a warning instead of an error if a disabled
> parser is invoked.
> -- slurmrestd - Improve handling when content plugins rely on parsers
> that haven't been loaded.
> -- Fix old pending jobs dying (Slurm version 21.08.x and older) when upgrading
> Slurm due to "Invalid message version" errors.
> -- Have client commands sleep for progressively longer periods when backed off
> by the RPC rate limiting system.
> -- slurmctld - Ensure agent queue is flushed correctly at shutdown time.
> -- slurmdbd - Correct lineage construction during assoc table conversion for
> partition-based associations.
> -- Add new RPCs and API call for faster querying of job states from slurmctld.
> -- slurmrestd - Add endpoint '/slurm/{data_parser}/jobs/state'.
> -- squeue - Add `--only-job-state` argument to use faster query of job states.
> -- Make a job requesting --no-requeue, or JobRequeue=0 in the slurm.conf,
> supersede RequeueExit[Hold].
> -- Add sackd man page to the Debian package.
> -- Fix issues with tasks when a job was shrunk more than once.
> -- Fix reservation update validation that resulted in rejection of valid
> reservation updates while the reservation had running jobs.
> -- Fix possible segfault when the backup slurmctld is asserting control.
> -- Fix regression introduced in 23.02.4 where slurmctld was not properly
> tracking the total GRES selected for exclusive multi-node jobs, potentially
> and incorrectly bypassing limits.
> -- Fix tracking of a job's typeless GRES count when multiple typed GRES with
> the same name are also present in the job allocation. Otherwise, the job
> could bypass limits configured for the typeless GRES.
> -- Fix tracking of a job's typeless GRES count when the request specification
> has a typeless GRES name first and then typed GRES of different names (e.g.
> --gres=gpu:1,tmpfs:foo:2,tmpfs:bar:7). Otherwise, the job could bypass
> limits configured for the generic form of the typed one (tmpfs in the
> example).
> -- Fix batch step not having SLURM_CLUSTER_NAME filled in.
> -- slurmstepd - Avoid error during `--container` job cleanup about
> RunTimeQuery never being configured. The error occurred during cleanup of
> job steps that had not fully started.
> -- Fix nodes not being rebooted when using salloc/sbatch/srun "--reboot" flag.
> -- Send scrun.lua in configless mode.
> -- Fix rejecting an interactive job whose extra constraint request cannot
> immediately be satisfied.
> -- Fix regression in 23.11.0 when parsing LogTimeFormat=iso8601_ms that
> prevented milliseconds from being printed.
> -- Fix issue where you could have a gpu allocated as well as a shard on that
> gpu allocated at the same time.
> -- Fix slurmctld crashes when using extra constraints with job arrays.
> -- sackd/slurmrestd/scrun - Avoid memory leak on new unix socket connection.
> -- The failed node field is filled when a node fails but does not time out.
> -- slurmrestd - Remove requiring job script field and job component script
> fields to both be populated in the `POST /slurm/v0.0.40/job/submit`
> endpoint as there can only be one batch step script for a job.
> -- slurmrestd - When job script is provided in '.jobs[].script' and '.script'
> fields, the '.script' field's value will be used in the
> `POST /slurm/v0.0.40/job/submit` endpoint.
> -- slurmrestd - Reject HetJob submission missing or empty batch script for
> first Het component in the `POST /slurm/v0.0.40/job/submit` endpoint.
> -- slurmrestd - Reject job when an empty batch script is submitted to the
> `POST /slurm/v0.0.40/job/submit` endpoint.
> -- Fix pam_slurm and pam_slurm_adopt when using auth/slurm.
> -- slurmrestd - Add 'cores_per_socket' field to
> `POST /slurm/v0.0.40/job/submit` endpoint.
> -- Fix srun and other Slurm commands running within a "configless" salloc when
> salloc itself fetched the config.
> -- Enforce binding with shared gres selection if requested.
> -- Fix job allocation failures when the requested tres type or name ends in
> "gres" or "license".
> -- accounting_storage/mysql - Fix lineage string construction when adding a
> user association with a partition.
> -- Fix sattach command.
> -- Fix ReconfigFlags. Due to how reconfig was changed in 23.11, these flags
> now also influence slurmctld startup.
> -- Fix starting slurmd in configless mode if MUNGE support was disabled.
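As a minimal sketch of the requeue interaction mentioned above (the exit
codes and script name are hypothetical):

    # slurm.conf
    JobRequeue=1
    RequeueExit=100        # requeue jobs that exit with code 100
    RequeueExitHold=101    # requeue and hold jobs that exit with code 101

    # a job submitted with --no-requeue (or with JobRequeue=0 configured) now
    # supersedes RequeueExit/RequeueExitHold
    sbatch --no-requeue batch_script.sh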
--
Tim McMullan
Release Management, Support, and Development
SchedMD LLC - Commercial Slurm Development and Support
We are pleased to announce the availability of Slurm version 23.11.4.
The 23.11.4 release includes a number of stability improvements and
various bug fixes. Notable changes include: VSZ is no longer reported
when using cgroup/v2 (the kernel does not provide this metric), a warning
has been added when using select/linear and topology/tree together as this
combination will not be supported in the next major release, and a fix for
a backwards compatibility issue that caused jobs using --gpus to be
rejected when submitted from 23.02 or 22.05.
Slurm can be downloaded from https://www.schedmd.com/downloads.php .
In addition, welcome to the updated slurm-announce list! We've made some
mailing list adjustments in order to ensure compliance with newer
anti-spam measures, and upgraded to Mailman3 as part of this process.
- Tim
--
Tim Wickberg
Chief Technology Officer, SchedMD LLC
Commercial Slurm Development and Support
> * Changes in Slurm 23.11.4
> ==========================
> -- Fix a memory leak when updating partition nodes.
> -- Don't leave a partition around if it fails to create with scontrol.
> -- Fix segfault when creating partition with bad node list from scontrol.
> -- Fix preserving partition nodes on bad node list update from scontrol.
> -- Fix assertion in developer mode on a failed message unpack.
> -- Fix repeat POWER_DOWN requests making the nodes available for ping.
> -- Fix rebuilding job alias_list on restart when nodes are still powering up.
> -- Fix INVALID nodes running health check.
> -- Fix cloud/future nodes not setting addresses on invalid registration.
> -- scrun - Remove the requirement to set the SCRUN_WORKING_DIR environment
> variable. This was a regression in 23.11.
> -- Add warning for using select/linear with topology/tree.
> This combination will not be supported in the next major version.
> -- Fix health check program not being run after first pass of all nodes when
> using MaxNodeCount.
> -- sacct - Set process exit code to one for all errors.
> -- Add SlurmctldParameters=disable_triggers option.
> -- Fix issue running steps when the allocation requested an exclusive
> allocation along with shards.
> -- Fix cleaning up the sleep process and the cgroup of the extern step if
> slurm_spank_task_post_fork returns an error.
> -- slurm_completion - Add missing --gres-flags= options
> multiple-tasks-per-sharing and one-task-per-sharing.
> -- scrun - Avoid race condition that could cause outbound network
> communications to be incorrectly rejected with an incomplete packet error.
> -- scrun - Gracefully handle the kernel giving an invalid expected number of
> incoming bytes for a connection, which caused incoming packet corruption
> and resulted in the connection getting closed.
> -- srun - Return 1 when a step launch fails.
> -- scrun - Avoid race condition that could cause deadlock during shutdown.
> -- Fix scontrol listpids to work under dynamic node scenarios.
> -- Add --tres-bind to --help and --usage output.
> -- Add --gres-flags=allow-task-sharing to allow GPUs to still be accessible
> among all tasks when binding GPUs to specific tasks.
> -- Fix issue with CUDA_VISIBLE_DEVICES showing the same MIG device for all
> tasks when using MIGs with --tres-per-task or --gpus-per-task.
> -- slurmctld - Prevent a potential hang during shutdown/reconfigure if the
> association cache thread was previously shut down.
> -- scrun - Avoid race condition that could cause scrun to hang during
> shutdown when connections have pending events.
> -- scrun - Avoid excessive polling of connections during shutdown that could
> needlessly cause 100% CPU usage on a thread.
> -- sbcast - Use user identity from broadcast credential instead of looking it
> up locally on the node.
> -- scontrol - Remove "abort" option handling.
> -- Fix an error message referring to the wrong RPC.
> -- Fix memory leak on error when creating dynamic nodes.
> -- Fix a slurmctld segfault when a cloud/dynamic node changes hostname on
> registration.
> -- Prevent a slurmctld deadlock if the gpu plugin fails to load when
> creating a node.
> -- Change a slurmctld fatal() to an error() when attempting to create a
> dynamic node with a global autodetect set in gres.conf.
> -- Fix leaving node records on error when creating nodes with scontrol.
> -- scrun/sackd - Avoid race condition where shutdown could deadlock.
> -- Fix a regression in 23.02.5 that caused pam_slurm_adopt to fail when
> the user has multiple jobs on a node.
> -- Add GLOB_SILENCE flag that silences the error message which will display if
> an include directive attempts to use the "*" wildcard.
> -- Fix jobs getting rejected when submitting with --gpus option from older
> versions of job submission commands (23.02 and older).
> -- cgroup/v2 - Return 0 for VSZ. Kernel cgroups do not provide this metric.
> -- scrun - Avoid race condition where outbound RPCs could be corrupted.
> -- scrun - Avoid race condition that could cause a crash while compiled in
> debug mode.
> -- gpu/rsmi - Disable gpu usage statistics when not using ROCM 6.0.0+.
> -- Fix stuck processes and incorrect environment when using --get-user-env.
> -- Avoid segfault in the slurmdbd when TrackWCKey=no but you are still
> using WCKeys.
> -- Fix ctld segfault with TopologyParam=RoutePart and no partition defined.
> -- slurmctld - Fix missing --deadline handling for jobs not evaluated by the
> schedulers (i.e. non-runnable, skipped for other reasons, etc.).
> -- Demote some eio-related logs from error to verbose in user commands. These
> are not generally actionable by the user and are easily generated by port
> scanning a machine running srun.
> -- Make sprio correctly print array tasks that have not yet been split out.
> -- topology/block - Restrict the number of last-level blocks in any allocation.
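As a small illustration of the GPU binding options mentioned above (task and
GPU counts, and the application name, are hypothetical):

    # bind one GPU per task, while keeping all allocated GPUs visible to
    # every task in the step
    srun --ntasks=4 --gpus-per-task=1 --gres-flags=allow-task-sharing ./app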