[slurm-users] MaxJobs-limits

Renfro, Michael Renfro at tntech.edu
Wed Jan 29 16:36:35 UTC 2020


If you don't specify cores or threads for a job, you should end up with 1 task per node and 1 CPU per task (see the ntasks, ntasks-per-node, and cpus-per-task sections at [1] and [2]).
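
For example, a job script submitted with no resource flags at all is treated as if it had asked for a single task with a single CPU; written out explicitly, that's roughly:

    #!/bin/bash
    #SBATCH --ntasks=1           # the default: one task
    #SBATCH --cpus-per-task=1    # the default: one CPU for that task
    #SBATCH --ntasks-per-node=1  # keep that task on a single node
    srun hostname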

And yes, if you have one or more 10-core systems and submit eight 1-core jobs, they'll all default to running on the same node, assuming everything was idle to start with.

I suppose you could tell Slurm that you have 8-core nodes instead of 10-core nodes, but why would you want to limit the number of jobs or tasks that run on a compute node?
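
(For what it's worth, under-reporting the cores would only be a one-line change to the node definition in slurm.conf, something like the following, with a made-up node name and memory size:

    NodeName=node01 CPUs=8 RealMemory=64000 State=UNKNOWN

but that permanently wastes two cores on every node, which is why I'd rather understand the underlying goal first.)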

Most (if not all) of us let jobs run on every available core in the system (each job reserves its resources for a period of time and doesn't try to run outside those limits), while holding back a small amount of RAM for the OS and other processes.
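
For reference, that kind of setup looks something like the following in slurm.conf (node names, counts, and memory figures here are made up; check the slurm.conf man page for your version):

    SelectType=select/cons_res
    SelectTypeParameters=CR_Core_Memory
    NodeName=node[01-04] CPUs=10 RealMemory=94000 MemSpecLimit=2048 State=UNKNOWN

Here MemSpecLimit holds back about 2 GB per node for the OS, and CR_Core_Memory makes both cores and memory consumable, so jobs pack onto nodes until either resource runs out.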

I’m not saying you can’t have a good reason for what you’re asking, but I suspect there’s a better solution than restricting how many jobs can run on a given node.

[1] https://slurm.schedmd.com/sbatch.html
[2] https://slurm.schedmd.com/srun.html
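
If the underlying worry is jobs using more CPU than they asked for, the cgroup setup mentioned further down the thread is the usual fix. A minimal sketch (standard option names, but defaults vary by Slurm version, so treat this as a starting point):

    # slurm.conf
    ProctrackType=proctrack/cgroup
    TaskPlugin=task/cgroup

    # cgroup.conf
    ConstrainCores=yes
    ConstrainRAMSpace=yes

With that in place, a job that asks for 1 CPU but starts N threads just ends up time-slicing those threads on its one CPU instead of spilling onto cores other jobs have reserved.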

> On Jan 28, 2020, at 10:19 PM, zz <anand633r at gmail.com> wrote:
> 
> Hi Michael,
> 
> Thanks for the quick response. What if we submit multiple jobs without specifying cores or threads? All jobs will run in parallel depending on the CPUs available on the node, and when no resources are available the jobs will go into the queue as pending. If I have 10 CPUs in the system and submit 8 jobs simultaneously (each requiring at most 1 CPU), all 8 will run on the single node. Is there any way to limit the number of such jobs per node? That is, even when resources are available, can we say the node will only accept N jobs and keep all later jobs pending? (Like MaxJobs, but per node rather than at the cluster level.) I suppose cgroups is the only solution.
> 
> On Tue, Jan 28, 2020 at 7:42 PM Renfro, Michael <Renfro at tntech.edu> wrote:
> For the first question: you should be able to define each node’s core count, hyperthreading, or other details in slurm.conf. That would allow Slurm to schedule (well-behaved) tasks to each node without anything getting overloaded.
> 
> For the second question about jobs that aren’t well-behaved (a job requesting 1 CPU, but starting multiple parallel threads, or multiple MPI processes), you’ll also want to set up cgroups to constrain each job’s processes to its share of the node (so a 1-core job starting N threads will end up with each thread getting a 1/N share of a CPU).
> 
>> On Jan 28, 2020, at 6:12 AM, zz <anand633r at gmail.com> wrote:
>> 
>> Hi,
>> 
>> I am testing Slurm for a small cluster. I just want to know whether there is any way I could set a max job limit per node; I have nodes with different specs running under the same QOS. Please ignore this if it is a stupid question.
>> 
>> Also, I would like to know what will happen when a process running on a dual-core system requires, say, 4 cores at some step.
>> 
>> Thanks


