Thanks Davide,
It's true that srun will create an allocation if you aren't inside a job, but if you are inside a job and request more resources than that job has, srun just fails instead of creating a new allocation. That is the key issue I want to avoid.
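To illustrate (a minimal sketch; the memory numbers and script name are made up):

    # from a login node, outside any job, srun creates the allocation itself:
    srun --mem=8G python script.py

    # from inside a job that was allocated only --mem=2G, the same command
    # does not create a second allocation; the step request just fails:
    srun --mem=8G python script.py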
On Sat, Apr 5, 2025 at 11:48 AM Davide DelVento <davide.quantum@gmail.com> wrote:
The plain srun is probably the best bet, and if you really need the thing to be started from another Slurm job (rather than from the login node) you will need to exploit the fact that, as the srun man page puts it:

    If necessary, srun will first create a resource allocation in which to
    run the parallel job.
AFAIK, there is no option to force the "create a resource allocation" step even when it isn't necessary. But you could try requesting something that is "above and beyond" what the current allocation provides, and that might solve your problem. Looking at the srun man page, I could speculate that --clusters or --cluster-constraint might help in that regard (but I am not sure).
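To make that speculation concrete, I imagine something along these lines (entirely untested; the cluster name is made up):

    # -M/--clusters routes the request through another cluster's controller,
    # which might (speculatively) force a fresh allocation instead of
    # reusing the current one:
    srun --clusters=othercluster --mem=8G python script.py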
Have a nice weekend
On Fri, Apr 4, 2025 at 6:27 AM Michael Milton via slurm-users <slurm-users@lists.schedmd.com> wrote:
I'm helping with a workflow manager that needs to submit Slurm jobs. For logging and management reasons, the job (e.g. srun python) needs to be run as though it were a regular subprocess (python):
- stdin, stdout and stderr for the command should be connected to the process inside the job
- signals sent to the command should be forwarded to the job process
- We don't want to use the existing job allocation, if this is run from inside a Slurm job
- The command should only terminate when the job is finished, to avoid us needing to poll Slurm (a rough sketch of the desired behaviour follows this list)
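In shell terms, we essentially want a wrapper like the following (only a sketch of the desired behaviour, not a working solution; the resource request and script name are placeholders):

    #!/bin/bash
    # launch the job as if it were a local subprocess, sharing our
    # stdout/stderr (stdin handling is glossed over here: a backgrounded
    # command does not keep our stdin)
    srun --mem=8G python script.py &
    pid=$!
    # forward termination signals to srun, which propagates them to the
    # job step
    trap 'kill -TERM "$pid"' TERM INT
    # block until the job step itself finishes (wait can return early if
    # interrupted by the trap, in which case it would need repeating)
    wait "$pid"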
We've tried (roughly as sketched after this list):
- sbatch --wait, but then SIGTERM'ing the process doesn't kill the job
- salloc, but that requires a TTY process to control it (?)
- salloc srun, which seems to mess with the terminal when it's killed, likely because salloc is "designed to be executed in the foreground"
- Plain srun, which re-uses the existing Slurm allocation; specifying resources like --mem will just request them from the current job rather than submitting a new one
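For reference, the attempts looked roughly like this (script names and resource requests are placeholders):

    # 1) blocks until the job ends, but SIGTERM'ing sbatch leaves the job running
    sbatch --wait job.sh

    # 2) seems to need a controlling TTY
    salloc --mem=8G

    # 3) runs, but messes with the terminal settings when killed
    salloc --mem=8G srun python script.py

    # 4) inside an existing job this re-uses the current allocation, so --mem
    #    is requested from the running job rather than triggering a new submission
    srun --mem=8G python script.py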
What is the best solution here?