It's Friday and I'm either doing something silly or have a misconfig somewhere; I can't figure out which.
When I run
sbatch --nodes=1 --cpus-per-task=1 --array=1-100 --output test_%A_%a.txt --wrap 'uname -n'
sbatch doesn't seem to be adhering to the --nodes parameter. When I look at my output files, the tasks are being spread across more nodes. In the simple case above it's split roughly 50/50 across two nodes, but if I throw a random sleep in, it'll use more, and if I expand the array it'll use even more nodes. I'm using cons_tres and have cr_core_memory,cr_one_core_per_task set.
Hi Michael,
If you submit a job array, all resource-related options (number of nodes, tasks, CPUs per task, memory, time, ...) are meant *per array task*. So in your case you start 100 array tasks (you could also call them "sub-jobs"), *each* of which (not your whole job) is limited to one node, one CPU, and the default amount of time, memory, and so forth. Many of them may run in parallel, potentially on many different nodes.
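You can see this for yourself while the array is running: squeue's '-r' flag lists one line per array element, so each array task shows up as its own job with its own node assignment. Something like

squeue -r -u $USER -o '%i %T %N'

will print the job ID, state, and node of every array task.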
So what you see is the expected behaviour. If you really want to limit all your array tasks to one node, you would have to specify the node explicitly with '-w' (--nodelist).
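For example (here 'node001' is just a placeholder for a real node name in your cluster):

sbatch --nodes=1 --cpus-per-task=1 --array=1-100 -w node001 --output test_%A_%a.txt --wrap 'uname -n'

All 100 array tasks will then be scheduled onto that single node and run as its cores become free.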
Regards, Hermann