[slurm-users] Problem with srun on ARM Ubuntu servers
Daniel L'Hommedieu
dlhommedieu at gmail.com
Fri Jul 21 21:27:03 UTC 2023
Hi, everyone.
My team runs a SLURM cluster of about 800 servers, currently on SLURM 17, though we are working to upgrade to 22. We currently have only x64 front-end servers, but we are looking to add some ARM servers. I have deployed some new ARM front-end servers in exactly the same way the x64 ones are deployed, but srun does not work on the ARM systems. To be clear: a job is created, but srun does not connect the user to that job. My command is “srun --pty bash”, the same as on the x64 system.
On the x64 system, “srun --pty bash” results in a job and a shell on a server. On the ARM system, “srun --pty bash” results in the creation of a job, but srun never connects me to the shell on the server.
The controller log shows the job, and “squeue -u $USER” shows the job, but srun just doesn’t connect to the job.
I have done web searches and have not gotten any ideas on what might be causing this. Anyone seen this? Any ideas on how to fix it?
Thanks for any guidance or ideas.
Daniel