[slurm-users] "Incompatible plugin version" after upgrade
Luke Yeager
lyeager at nvidia.com
Tue Sep 6 18:33:48 UTC 2022
As of Slurm 21.08, you have to recompile your SPANK plugins whenever you upgrade between major releases, e.g. from 20.11 to 21.08, or from 21.08 to 22.05.
https://github.com/SchedMD/slurm/blob/slurm-21.08/RELEASE_NOTES#L18-L19
https://bugs.schedmd.com/show_bug.cgi?id=12318
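As a rough illustration of that rule, here is a minimal sketch that checks whether an upgrade crosses a major release, assuming the major release is the first two components of the version string (20.11, 21.08, 22.05). The function names are made up for this example and are not part of Slurm.

```python
from typing import Tuple


def major_release(version: str) -> Tuple[int, int]:
    """Return the first two components of a version string like '22.05.3'.

    Assumption: the major release is identified by these two components.
    """
    parts = version.strip().split(".")
    return int(parts[0]), int(parts[1])


def crosses_major_release(old: str, new: str) -> bool:
    """True if upgrading from `old` to `new` changes the major release,
    i.e. the case where SPANK plugins would need to be recompiled."""
    return major_release(old) != major_release(new)


if __name__ == "__main__":
    for old, new in [("20.11.9", "21.08.8"), ("21.08.8", "22.05.3"), ("22.05.2", "22.05.3")]:
        print(old, "->", new, "recompile SPANK plugins:", crosses_major_release(old, new))
```

Under this reading, a 22.05.2 to 22.05.3 upgrade stays within one major release, so the recompile rule by itself would not apply to it.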
From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of Alan Orth
Sent: Tuesday, August 16, 2022 10:36 AM
To: Slurm User Community List <slurm-users at lists.schedmd.com>
Subject: [slurm-users] "Incompatible plugin version" after upgrade
Dear list,
Twice this month I've had jobs stuck in completing state (CG). When I go to the compute node and check slurmd.log I see a message about "incompatible plugin version", for example:
[2022-08-16T03:36:25.823] [748139.batch] done with job
[2022-08-16T12:54:21.404] [748139.extern] plugin_load_from_file: Incompatible Slurm plugin /usr/lib64/slurm/hash_k12.so version (22.05.3)
[2022-08-16T12:54:21.404] [748139.extern] error: Couldn't load specified plugin name for hash/k12: Incompatible plugin version
[2022-08-16T12:54:21.404] [748139.extern] error: cannot create hash context for K12
[2022-08-16T12:54:21.404] [748139.extern] error: slurm_send_node_msg: hash_g_compute: REQUEST_STEP_COMPLETE has error
[2022-08-16T12:54:21.404] [748139.extern] error: Rank 0 failed sending step completion message directly to slurmctld, retrying
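To see which plugins and job steps are affected, the log can be scanned for these messages. The following is a minimal sketch, assuming the line layout shown in the excerpt above (timestamp, step ID, then the plugin_load_from_file message); the function name and the sample list are illustrative only.

```python
import re

# Matches lines of the form seen in the slurmd.log excerpt:
# [timestamp] [jobid.step] plugin_load_from_file: Incompatible Slurm plugin <path> version (<version>)
PATTERN = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] \[(?P<step>[^\]]+)\] plugin_load_from_file: "
    r"Incompatible Slurm plugin (?P<path>\S+) version \((?P<version>[^)]*)\)"
)


def find_incompatible_plugins(lines):
    """Return (timestamp, step, plugin path, reported version) for each matching line."""
    hits = []
    for line in lines:
        m = PATTERN.match(line)
        if m:
            hits.append((m["timestamp"], m["step"], m["path"], m["version"].strip()))
    return hits


# Sample lines copied from the log excerpt in this message.
SAMPLE = [
    "[2022-08-16T03:36:25.823] [748139.batch] done with job",
    "[2022-08-16T12:54:21.404] [748139.extern] plugin_load_from_file: "
    "Incompatible Slurm plugin /usr/lib64/slurm/hash_k12.so version (22.05.3)",
    "[2022-08-16T12:54:21.404] [748139.extern] error: Couldn't load specified "
    "plugin name for hash/k12: Incompatible plugin version",
]

if __name__ == "__main__":
    for hit in find_incompatible_plugins(SAMPLE):
        print(hit)
```

To run it against a real log, pass an open file object instead of SAMPLE; the path to slurmd.log depends on the site configuration.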
For context, I did a minor upgrade of SLURM yesterday (22.05.2 to 22.05.3), so it's possible there is an incompatible version somewhere, but if I look up earlier in the log file I see that the running version of slurmd is correct and still prints that error right after startup:
[2022-08-15T22:27:59.865] slurmd version 22.05.3 started
[2022-08-15T22:27:59.867] slurmd started on Mon, 15 Aug 2022 22:27:59 +0300
[2022-08-15T22:27:59.869] CPUs=48 Boards=1 Sockets=2 Cores=24 Threads=1 Memory=386525 TmpDisk=71645 Uptime=2679297 CPUSpecList=(null) FeaturesAvail=(null) FeaturesActive=(null)
[2022-08-16T02:36:10.020] [748139.batch] plugin_load_from_file: Incompatible Slurm plugin /usr/lib64/slurm/hash_k12.so version (22.05.3)
[2022-08-16T02:36:10.020] [748139.batch] error: Couldn't load specified plugin name for hash/k12: Incompatible plugin version
[2022-08-16T02:36:10.022] [748139.batch] error: cannot create hash context for K12
I'm running SLURM 22.05.3. The slurmctld is running on CentOS 7, and compute nodes are on CentOS Stream 8 (not sure if this matters?).
Thanks for any advice,
--
Alan Orth
alan.orth at gmail.com
https://picturingjordan.com
https://englishbulgaria.net
https://mjanja.ch