Hi list,
At our institution, we have always instructed users who want an interactive job (for us, a bash shell) to run "srun ..." from the login node, and this has always worked well for us. But at a recent Slurm training, the SchedMD folks advised us to use "salloc" and then "srun" for interactive jobs. I tried this today: "salloc" gave me a shell on a server, the same as srun does, but when I then ran "srun [programname]" it hung with no output. Of course, when I tried "srun [programname] &" it started the job in the background and gave me back a prompt. In both cases I had to Ctrl-C the running srun job, and got no output other than the srun/slurmstepd termination output.
I think I read somewhere that invoking srun directly creates an allocation; why, then, would I want to do an initial salloc and then srun? (in the case that I want a foreground program, such as a bash shell)
I have surveyed some other institutions' user documentation for Slurm interactive jobs, and I see both kinds of advice: run srun directly, or run salloc and then srun.
Please help me to understand how this is intended to work, and if we are "doing it wrong" :)
Thanks,
Will