[slurm-users] seff in slurm-23.02

David Gauchard gauchard at laas.fr
Thu May 25 17:03:50 UTC 2023


Well, sorry, I did indeed run the raw script for that mail.
Running the one installed by `make install`, which sets the path on
line 11 correctly:
	use lib qw(/usr/local/slurm-23.02.2/lib/x86_64-linux-gnu/perl/5.30.0);

I get:

perl: error: slurm_persist_conn_open: Something happened with the 
receiving/processing of the persistent connection init message to 
localhost:6819: Failed to unpack SLURM_PERSIST_INIT message
perl: error: Sending PersistInit msg: Message receive failure
Use of uninitialized value in subroutine entry at 
/usr/local/slurm/bin/seff line 57, <DATA> line 564.
perl: error: g_slurm_auth_pack: protocol_version 6500 not supported
perl: error: slurm_send_node_msg: g_slurm_auth_pack: 
REQUEST_PERSIST_INIT has  authentication error: No error
perl: error: slurm_persist_conn_open: failed to send persistent 
connection init message to localhost:6819
perl: error: Sending PersistInit msg: Protocol authentication error
perl: error: DBD_GET_JOBS_COND failure: Unspecified error
Job not found.

Slurm is otherwise running well after an update from 20.11 -> 21.08 -> 
23.02.

# sinfo -V
slurm 23.02.2
# sinfo -O nodehost,Version
HOSTNAMES           VERSION
x                   23.02.2
x                   23.02.2
x                   23.02.2
x                   23.02.2
x                   23.02.2
x                   23.02.2
x                   23.02.2
x                   23.02.2
x                   23.02.2
x                   23.02.2
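
To tell a raw copy of seff from an installed one, as discussed for line 11 in this thread, a check along these lines may help. This is only a minimal sketch: `check_use_lib` is a hypothetical helper (not part of Slurm), and it assumes the only sign of a raw script is a leftover `FindBin` placeholder in the `use lib` line.

```python
import re

# Matches a Perl `use lib <expression>;` statement on a single line.
USE_LIB = re.compile(r"^\s*use\s+lib\s+(.*?);\s*$")


def check_use_lib(script_text):
    """Return (line_number, expression, looks_substituted) per `use lib` line.

    looks_substituted is False when the expression still refers to FindBin,
    i.e. the install step has apparently not rewritten the path.
    """
    results = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        match = USE_LIB.match(line)
        if match:
            expression = match.group(1)
            results.append((number, expression, "FindBin" not in expression))
    return results


# The two variants of line 11 quoted in this thread.
raw_line = 'use lib "${FindBin::Bin}/../lib/perl";'
installed_line = "use lib qw(/usr/lib64/perl5);"

for text in (raw_line, installed_line):
    print(check_use_lib(text))
```

In practice one would pass the full contents of the script (for example the file at /usr/local/slurm/bin/seff) instead of a single line.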


On 5/25/23 18:33, Mike Robbert wrote:
> How did you install seff? I don’t know exactly where this happens, but 
> it looks like line 11 in the source file for seff is supposed to get 
> transformed to include an actual path. I am running on CentOS and 
> install Slurm by building the RPMs using the included spec files and 
> here is a diff of the file in the source tree and the file that got 
> installed to /usr/bin/seff
> 
> $ diff contribs/seff/seff /usr/bin/seff
> 11c11
> < use lib "${FindBin::Bin}/../lib/perl";
> ---
> > use lib qw(/usr/lib64/perl5);
> 
> *Mike Robbert*
> 
> *Cyberinfrastructure Specialist, Cyberinfrastructure and Advanced 
> Research Computing*
> 
> Information and Technology Solutions (ITS)
> 
> 303-273-3786 | mrobbert at mines.edu <mailto:mrobbert at mines.edu>
> 
> *Our values:*Trust | Integrity | Respect | Responsibility
> 
> *From: *slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of 
> David Gauchard <gauchard at laas.fr>
> *Date: *Thursday, May 25, 2023 at 10:02
> *To: *slurm-users at schedmd.com <slurm-users at schedmd.com>
> *Subject: *[EXTERNAL] [slurm-users] seff in slurm-23.02
> 
> 
> Hello,
> 
> slurm-23.02 on ubuntu-20.04,
> 
> seff is not working anymore:
> 
> ```
> # ./seff 4911385
> Use of uninitialized value $FindBin::Bin in concatenation (.) or string 
> at ./seff line 11.
> Name "FindBin::Bin" used only once: possible typo at ./seff line 11, 
> <DATA> line 602.
> perl: error: slurm_persist_conn_open: Something happened with the 
> receiving/processing of the persistent connection init message to 
> localhost:6819: Failed to unpack
> SLURM_PERSIST_INIT message
> perl: error: Sending PersistInit msg: Message receive failure
> Use of uninitialized value in subroutine entry at ./seff line 58, <DATA> 
> line 602.
> perl: error: [...]
> ```
> 
> 
> while using
> https://github.com/SchedMD/slurm/blob/ce7d569807c495516ebfa6fcef25ad36ccc76827/contribs/seff/seff#LL19C3-L19C124 :
> 
> ```
> # sacct -P -n -a --format JobID,User,Group,State,Cluster,AllocCPUS,REQMEM,TotalCPU,Elapsed,MaxRSS,ExitCode,NNodes,NTasks -j 4911385
> 4911385|user|part|FAILED|hpc|1|2000M|00:23.041|00:00:31||0:9|1|
> 4911385.batch|||CANCELLED by 0|hpc|1||00:23.041|00:00:31|5936692K|0:9|1|1
> ```
> 
> I wonder whether this is an installation error, and whether
> contribs/seff is working for other 23.02 users.
> 
> Thanks
> 
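
While seff itself is failing, a rough CPU-efficiency figure can still be derived from the `sacct -P` line quoted above. The sketch below is an assumption-laden illustration, not seff's actual code: it assumes the usual definition (TotalCPU divided by Elapsed times AllocCPUS), the field order of the `--format` list used above, and Slurm's `[DD-][HH:]MM:SS[.mmm]` time notation. The helper names are hypothetical.

```python
# Field order copied from the --format list of the sacct command above.
FIELDS = ("JobID,User,Group,State,Cluster,AllocCPUS,REQMEM,"
          "TotalCPU,Elapsed,MaxRSS,ExitCode,NNodes,NTasks").split(",")


def slurm_time_to_seconds(text):
    """Convert a [DD-][HH:]MM:SS[.mmm] string to seconds."""
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    parts = [float(p) for p in text.split(":")]
    # Pad missing leading fields (hours, or hours and minutes) with zero.
    while len(parts) < 3:
        parts.insert(0, 0.0)
    hours, minutes, seconds = parts
    return days * 86400 + hours * 3600 + minutes * 60 + seconds


def cpu_efficiency(sacct_line):
    """Return TotalCPU / (Elapsed * AllocCPUS) as a fraction."""
    row = dict(zip(FIELDS, sacct_line.rstrip("\n").split("|")))
    core_walltime = slurm_time_to_seconds(row["Elapsed"]) * int(row["AllocCPUS"])
    if core_walltime == 0:
        return 0.0
    return slurm_time_to_seconds(row["TotalCPU"]) / core_walltime


# First output line from the sacct call quoted above.
line = "4911385|user|part|FAILED|hpc|1|2000M|00:23.041|00:00:31||0:9|1|"
print(f"CPU efficiency: {cpu_efficiency(line):.1%}")
```

Only the job-level line is used here; the `.batch` step line has empty User and Group fields but parses the same way.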

