[slurm-users] strange resource allocation issue - thoughts?
pbisbal at pppl.gov
Wed Mar 27 15:02:30 UTC 2019
On 3/23/19 2:16 PM, Sharma, M D wrote:
> Hi folks,
> By default slurm allocates the whole node for a job (even if it
> specifically requested a single core). This is usually taken care of
> by adding SelectType=select/cons_res along with an appropriate
> parameter such as SelectTypeParameters=CR_Core_Memory.
> When testing the job submission and resource allocation, we can see
> things work as intended when using srun:
> srun -N1 -n1 -p fxq --mem=1000 sleep 60 &
> # A command as above, submitted 20 times would launch 20 jobs on a
> single 40 core node as intended.
> However, if the same request is submitted via sbatch, the entire node
> gets into an "allocated" state and does not accept any other jobs
> until completion of the single core job.
> Has anyone else seen this behaviour / have thoughts on a fix?
I believe by requesting -N1, you are requesting that the entire node be
allocated to that job. Remove -N1 from that command, like this:
srun -n1 -p fxq --mem=1000 sleep 60 &
That should allow you to run multiple jobs on the same node.
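For completeness, the sbatch equivalent of that corrected command can be sketched as a small batch script. This is only a sketch: the partition name fxq and the memory request are taken from the commands above, and the script filename is hypothetical.

```shell
#!/bin/bash
#SBATCH -n 1          # one task; note there is no -N1, so Slurm can pack jobs onto shared nodes
#SBATCH -p fxq        # partition name from the original report
#SBATCH --mem=1000    # 1000 MB per job, matching the srun example

sleep 60
```

Submitted repeatedly (e.g. `for i in $(seq 1 20); do sbatch job.sh; done`, where job.sh is the hypothetical script above), these single-core jobs should share one node, provided the cluster is configured with SelectType=select/cons_res and a parameter such as SelectTypeParameters=CR_Core_Memory as mentioned in the original question.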