Hello. I am new to this list and to Slurm in general. I have a lot of experience in computer operations, including Kubernetes, and I am currently exploring Slurm in some depth.
I have set up a small cluster and have mostly gotten things working. However, when I try to run a container job, the command runs and prints its output, but srun then appears to hang as if the job's container were still running.
For example, the following command works, but it never returns to the prompt unless I press Ctrl-C.
$ srun --container /shared_fs/shared/oci_images/alpine uptime
19:21:47 up 20:43, 0 users, load average: 0.03, 0.25, 0.15
I'm unsure whether something is misconfigured or whether I'm misunderstanding how this should work, so any help or pointers would be greatly appreciated.
Thanks!
Sean