[slurm-users] Question/Clarification: Batch array multiple tasks on nodes
Dana, Jason T.
Jason.Dana at jhuapl.edu
Tue Sep 1 16:36:42 UTC 2020
I am new to Slurm and am setting up a cluster. While testing batch execution with a job array, I am seeing only one array task run per node at a time. Even if I specify in the sbatch command that only one node should be used, a single task runs on each of the available nodes in the partition. I was under the impression that Slurm would keep starting tasks until the resources on the node, or the limits for the user, were exhausted. Am I missing something, or have I misunderstood how sbatch and/or the job scheduling should work?
Here is one of the commands I have run:
sbatch --array=0-15 --partition=htc-amd --wrap 'python3 -c "import time; print(\"working\"); time.sleep(5)"'
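For reference, the same submission can be written as a batch script, which makes the per-element resource request explicit. This is only a sketch: the --ntasks=1 and --cpus-per-task=1 lines are assumptions added for illustration and are not part of the command above. Each array element is scheduled as its own job, so these requests apply to every element separately.

```shell
#!/bin/bash
#SBATCH --array=0-15
#SBATCH --partition=htc-amd
#SBATCH --ntasks=1          # assumption: one task per array element
#SBATCH --cpus-per-task=1   # assumption: one CPU per array element

# Slurm sets SLURM_ARRAY_TASK_ID for each array element; the default of 0
# only lets the script be dry-run outside Slurm.
task_id="${SLURM_ARRAY_TASK_ID:-0}"
echo "array task ${task_id} on $(hostname)"

# Same payload as the --wrap command.
python3 -c "import time; print('working'); time.sleep(5)"
```

Saved under a hypothetical name such as array_test.sh, it would be submitted with `sbatch array_test.sh`.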
The htc-amd partition has 8 nodes, and the result of this command is a single task running on each node while the remaining tasks are queued waiting for them to finish. As I mentioned before, if I specify --nodes=1 it still executes a single task on every node in the partition. The only way I have gotten it to use a single node was with --nodelist, which worked, but it executed only one task at a time and queued the rest. I have also tried specifying --ntasks and --ntasks-per-node. These appear to reserve resources, as I can cause the job to hit the QOS core/CPU limit, but they do not affect the number of tasks executed on each node.
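The behavior described above, one array element per node with the rest pending, is what one would expect if each job is being handed a whole node. Whether that is the case here is an assumption, since the cluster's slurm.conf is not shown, but the relevant settings can be inspected with scontrol and sinfo. A sketch, guarded so that it only prints a message on a machine without Slurm:

```shell
#!/bin/bash
# Partition name taken from the sbatch command above.
partition="htc-amd"

if command -v scontrol >/dev/null 2>&1; then
    # SelectType=select/linear allocates whole nodes to each job;
    # select/cons_res or select/cons_tres can allocate individual cores.
    scontrol show config | grep -i -E '^SelectType'

    # OverSubscribe=EXCLUSIVE on the partition also gives each job a whole node.
    scontrol show partition "$partition" | grep -i -o 'OverSubscribe=[^ ]*'

    # Per-node CPU counts in the form allocated/idle/other/total.
    sinfo --partition="$partition" --Node --format='%N %C'
else
    echo "scontrol not found; run this on a Slurm login node"
fi
```

If the first command reports select/linear, or the partition reports OverSubscribe=EXCLUSIVE, then one job per node is the configured behavior rather than a problem with the sbatch options.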
Thank you for any help you can offer!