[slurm-users] /usr/lib64/slurm/prep_script.so: undefined symbol: run_script

Braulio Solano Rojas braulio at solsoft.biz
Tue Jul 13 02:45:47 UTC 2021


Greetings,

I would like to install SLURM on Clear Linux because of its good 
benchmarks.  I have followed the tutorial at 
https://docs.01.org/clearlinux/latest/tutorials/hpc.html. When I got 
to the section "Create slurm.conf configuration file", I noticed that 
the slurmctld service didn't start. The error was related to the 
slurm.conf file. This was in the log:

jul 11 19:20:00 slurm-controller slurmctld[615]: error: Ignoring 
obsolete FastSchedule=1 option. Please remove from your configuration.
jul 11 19:20:00 slurm-controller slurmctld[615]: fatal: 
SallocDefaultCommand has been removed. Please consider setting 
LaunchParameters=use_interactive_step instead.

I deleted FastSchedule and SallocDefaultCommand. After that I added 
these lines:

LaunchParameters=use_interactive_step
InteractiveStepOptions="srun -n1 -N1 --pty --preserve-env --mpi=pmix_v3 $SHELL"
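
For anyone hitting the same startup messages, here is a minimal Python 
sketch of how a slurm.conf could be scanned for the two options named in 
the slurmctld messages above. The function name, the FLAGGED table and 
the sample text are hypothetical and only for illustration. The table 
covers only those two options, not every option that SLURM has retired.

```python
import re

# Only the two options named in the slurmctld messages above.
# This is an illustrative table, not a complete list of retired options.
FLAGGED = {
    "FastSchedule": "obsolete, ignored",
    "SallocDefaultCommand": "removed, fatal at startup",
}


def find_flagged_options(conf_text):
    """Return (line_number, option, note) for each flagged option that is set."""
    hits = []
    for number, line in enumerate(conf_text.splitlines(), start=1):
        # Drop comments so that commented-out options are not reported.
        stripped = line.split("#", 1)[0].strip()
        match = re.match(r"([A-Za-z_]+)\s*=", stripped)
        if not match:
            continue
        for option, note in FLAGGED.items():
            # Compare case-insensitively, as slurm.conf parameter names are.
            if match.group(1).lower() == option.lower():
                hits.append((number, option, note))
    return hits


# Hypothetical sample input, made up for this sketch.
sample = """\
FastSchedule=1
#SallocDefaultCommand="srun --pty $SHELL"
SallocDefaultCommand="srun -n1 -N1 --pty $SHELL"
SelectType=select/cons_res
"""

for hit in find_flagged_options(sample):
    print(hit)
```

In practice the text would be read from /etc/slurm/slurm.conf instead of 
the sample string.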

After correcting that, I still could not continue, because slurmctld 
now fails on an undefined symbol in a shared object.

This is the log:

[2021-07-11T19:35:14.260] slurmctld version 20.11.8 started on cluster linux
[2021-07-11T19:35:14.261] cred/munge: init: Munge credential signature 
plugin loaded
[2021-07-11T19:35:14.262] debug: auth/munge: init: Munge authentication 
plugin loaded
[2021-07-11T19:35:14.262] select/cons_res: common_init: select/cons_res 
loaded
[2021-07-11T19:35:14.263] select/linear: init: Linear node selection 
plugin loaded with argument 1
[2021-07-11T19:35:14.263] select/cons_tres: common_init: 
select/cons_tres loaded
[2021-07-11T19:35:14.263] preempt/none: init: preempt/none loaded
[2021-07-11T19:35:14.264] debug: acct_gather_energy/none: init: 
AcctGatherEnergy NONE plugin loaded
[2021-07-11T19:35:14.264] debug: acct_gather_Profile/none: init: 
AcctGatherProfile NONE plugin loaded
[2021-07-11T19:35:14.264] debug: acct_gather_interconnect/none: init: 
AcctGatherInterconnect NONE plugin loaded
[2021-07-11T19:35:14.264] debug: acct_gather_filesystem/none: init: 
AcctGatherFilesystem NONE plugin loaded
[2021-07-11T19:35:14.265] debug2: No acct_gather.conf file 
(/etc/slurm/acct_gather.conf)
[2021-07-11T19:35:14.265] debug: jobacct_gather/none: init: Job 
accounting gather NOT_INVOKED plugin loaded
[2021-07-11T19:35:14.265] error: plugin_load_from_file: 
dlopen(/usr/lib64/slurm/prep_script.so): 
/usr/lib64/slurm/prep_script.so: undefined symbol: run_script
[2021-07-11T19:35:14.265] error: Couldn't load specified plugin name for 
prep/script: Dlopen of plugin file failed
[2021-07-11T19:35:14.266] error: prep_plugin_init: cannot create prep 
context for prep/script
[2021-07-11T19:35:14.266] fatal: failed to initialize prep plugin
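
To pick out the failing plugin and the missing symbol from a log like 
the one above, a short Python sketch such as the following could be 
used. The function name and the regular expression are my own 
assumptions, based only on the message format shown in this log, and the 
sample text is copied from it.

```python
import re

# Assumed message format, taken from the dlopen error in the log above.
PATTERN = re.compile(
    r"dlopen\((?P<plugin>[^)]+)\):\s*\S+:\s*undefined symbol:\s*(?P<symbol>\S+)"
)


def find_undefined_symbols(log_text):
    """Return (plugin_path, symbol) pairs for each dlopen failure found."""
    # The archived log is hard-wrapped, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return [(m.group("plugin"), m.group("symbol")) for m in PATTERN.finditer(flat)]


log = """\
[2021-07-11T19:35:14.265] error: plugin_load_from_file:
dlopen(/usr/lib64/slurm/prep_script.so):
/usr/lib64/slurm/prep_script.so: undefined symbol: run_script
[2021-07-11T19:35:14.266] fatal: failed to initialize prep plugin
"""

print(find_undefined_symbols(log))
```

This only reports what the log already says. It does not tell which 
binary or library is supposed to provide the symbol.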

Since the slurm.conf file shipped in the Clear Linux bundle (package) is 
outdated, I thought that maybe with a better configuration file the 
error would disappear.  My hypothesis was that I needed to load 
another plugin that provides the run_script symbol. So I tried creating 
a new configuration file using 
https://slurm.schedmd.com/configurator.easy.html, but I got the same 
error.

Do you think it is a bug in SLURM, something missing in the 
configuration, or an error in the compilation of the bundle (package) I 
installed?  I have noticed similar issues with precompiled packages in 
other Linux distributions, although they involve other shared objects 
and other symbols.

If the problem is Clear Linux, what is the best Linux distribution for SLURM?

I am attaching my latest test configuration file.

I would appreciate any help you may give me.  Thank you very much in advance.

Best regards,

Braulio J. Solano-Rojas

-------------- next part --------------
# slurm.conf file generated by configurator easy.html.
# Put this file on all nodes of your cluster.
# See the slurm.conf man page for more information.
#
SlurmctldHost=slurm-controller
#
#MailProg=/bin/mail
MpiDefault=pmix_v3
#MpiParams=ports=#-#
ProctrackType=proctrack/pgid
ReturnToService=1
SlurmctldPidFile=/run/slurm/slurmctld.pid
#SlurmctldPort=6817
SlurmdPidFile=/run/slurm/slurmd.pid
#SlurmdPort=6818
SlurmdSpoolDir=/var/spool/slurm/slurmd
SlurmUser=slurm
#SlurmdUser=root
StateSaveLocation=/var/spool/slurm/slurmctld
SwitchType=switch/none
TaskPlugin=task/affinity
#
#
# TIMERS
#KillWait=30
#MinJobAge=300
#SlurmctldTimeout=120
#SlurmdTimeout=300
#
#
# SCHEDULING
SchedulerType=sched/backfill
SelectType=select/cons_res
SelectTypeParameters=CR_CPU
#
#
# LOGGING AND ACCOUNTING
AccountingStorageType=accounting_storage/none
ClusterName=citic-cluster
#JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/none
#SlurmctldDebug=info
SlurmctldLogFile=/var/log/slurm/slurmctld.log
#SlurmdDebug=info
SlurmdLogFile=/var/log/slurm/slurmd.log
#
#
# COMPUTE NODES
NodeName=slurm-worker CPUs=2 Boards=1 SocketsPerBoard=2 CoresPerSocket=1 ThreadsPerCore=1 RealMemory=1968

PartitionName=workers Nodes=slurm-worker Default=YES MaxTime=INFINITE State=UP
PartitionName=debug Nodes=slurm-worker MaxTime=INFINITE State=UP

