[slurm-users] Pending with resource problems

Prentice Bisbal pbisbal at pppl.gov
Wed Apr 17 18:06:48 UTC 2019


No, Slurm goes strictly by the amount of memory the job specifies at 
submit time. Slurm has no way of knowing how much memory a job might 
need in the future. The only way to safely share a node is for Slurm to 
reserve the requested memory for the duration of the job. To do 
otherwise would be a disaster.
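
You can actually see the difference between what Slurm has reserved and 
what jobs are using right now in the scontrol output you posted below: 
AllocMem is the sum of what the running jobs requested, while FreeMem is 
what the OS currently reports as unused. For example, something like:

    $ scontrol show node compute-0-0 | grep -Eo 'AllocMem=[0-9]+|FreeMem=[0-9]+'
    AllocMem=56320
    FreeMem=37715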

Think about it: your node has 64 GB of RAM. Job1 starts and requests 40 
GB of memory, but it doesn't need that much until the last hour of an 
8-hour run; for the first 7 hours it only needs 8 GB. Two hours later, 
job2 is submitted. It will run for 12 hours and needs 32 GB of memory 
for almost all of its run time. The node has enough cores for both jobs 
to run simultaneously.

If Slurm behaved the way you expected, job2 would start immediately. 
When job1 finally needs its 40 GB of memory, the allocation fails 
because job2 is already using that memory. That's not fair to job1, and 
this behavior would lead to jobs failing all the time.
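
To make the arithmetic concrete, here is a rough sketch of that scenario 
(job1.sh and job2.sh are hypothetical job scripts on the 64 GB node):

    $ sbatch --mem=40G --time=08:00:00 job1.sh   # Slurm reserves 40 GB for job1
    $ sbatch --mem=32G --time=12:00:00 job2.sh   # only 64 - 40 = 24 GB remain unreserved,
                                                 # so job2 pends with reason (Resources)

Job2 waiting in the queue is inconvenient, but it is far better than 
job2 starting and job1 dying 7 hours into its run.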

Prentice

On 4/17/19 1:10 PM, Mahmood Naderan wrote:
> Yes. It seems that Slurm reserves whatever the user specifies. The 
> other jobs' real-time memory usage is less than what the users 
> specified. I thought that Slurm would handle this dynamically in order 
> to put more jobs into the running state.
>
> Regards,
> Mahmood
>
>
>
>
> On Wed, Apr 17, 2019 at 7:54 PM Prentice Bisbal <pbisbal at pppl.gov> wrote:
>
>     Mahmood,
>
>     What do you see as the problem here? To me, there is no problem,
>     and the scheduler is working exactly as it should. The reason
>     "Resources" means that there are not enough computing resources
>     available for your job to run right now, so the job is sitting in
>     the queue in the pending state, waiting for the necessary resources
>     to become available. This is exactly what schedulers are supposed
>     to do.
>
>     As Andreas pointed out, the output of 'scontrol show node
>     compute-0-0' that you provided shows that compute-0-0 has 32 cores
>     and 63 GB of RAM. Of those, 9 cores and 55 GB of RAM have already
>     been allocated, leaving 23 cores but only about 8 GB of RAM
>     available for other jobs. The job you submitted requested 20 cores
>     (tasks, technically) and 40 GB of RAM. Since compute-0-0 doesn't
>     have enough RAM available, Slurm is keeping your job in the queue
>     until enough RAM is free for it to run. This is exactly what Slurm
>     should be doing.
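>
>     If you want to double-check this yourself, something like the
>     following (using the node name and job ID from your mail) will show
>     the configured versus already-allocated resources and the reason
>     your job is pending:
>
>     $ scontrol show node compute-0-0 | grep -E 'CfgTRES|AllocTRES'
>     $ scontrol show job 878 | grep -E 'JobState|Reason'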
>
>     Prentice
>
>     On 4/17/19 11:00 AM, Henkel, Andreas wrote:
>>     I think there isn't enough memory.
>>     AllocTRES shows mem=55G, and your job wants another 40G, although
>>     the node only has 63G in total.
>>     Best,
>>     Andreas
>>
>>     On 17.04.2019 at 16:45, Mahmood Naderan
>>     <mahmood.nt at gmail.com> wrote:
>>
>>>     Hi,
>>>     Although it was fine for previous job runs, the following script
>>>     now stuck as PD with the reason about resources.
>>>
>>>     $ cat slurm_script.sh
>>>     #!/bin/bash
>>>     #SBATCH --output=test.out
>>>     #SBATCH --job-name=g09-test
>>>     #SBATCH --ntasks=20
>>>     #SBATCH --nodelist=compute-0-0
>>>     #SBATCH --mem=40GB
>>>     #SBATCH --account=z7
>>>     #SBATCH --partition=EMERALD
>>>     g09 test.gjf
>>>     $ sbatch slurm_script.sh
>>>     Submitted batch job 878
>>>     $ squeue
>>>                  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
>>>                    878   EMERALD g09-test shakerza PD       0:00      1 (Resources)
>>>
>>>
>>>
>>>     However, all things look good.
>>>
>>>     $ sacctmgr list association format=user,account,partition,grptres%20 | grep shaker
>>>     shakerzad+      local
>>>     shakerzad+         z7    emerald       cpu=20,mem=40G
>>>     $ scontrol show node compute-0-0
>>>     NodeName=compute-0-0 Arch=x86_64 CoresPerSocket=1
>>>        CPUAlloc=9 CPUTot=32 CPULoad=8.89
>>>        AvailableFeatures=rack-0,32CPUs
>>>        ActiveFeatures=rack-0,32CPUs
>>>        Gres=(null)
>>>        NodeAddr=10.1.1.254 NodeHostName=compute-0-0 Version=18.08
>>>        OS=Linux 3.10.0-693.5.2.el7.x86_64 #1 SMP Fri Oct 20 20:32:50 UTC 2017
>>>        RealMemory=64261 AllocMem=56320 FreeMem=37715 Sockets=32 Boards=1
>>>        State=MIXED ThreadsPerCore=1 TmpDisk=444124 Weight=20511900 Owner=N/A MCS_label=N/A
>>>        Partitions=CLUSTER,WHEEL,EMERALD,QUARTZ
>>>        BootTime=2019-04-06T10:03:47 SlurmdStartTime=2019-04-06T10:05:54
>>>        CfgTRES=cpu=32,mem=64261M,billing=47
>>>        AllocTRES=cpu=9,mem=55G
>>>        CapWatts=n/a
>>>        CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
>>>        ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
>>>
>>>
>>>     Any idea?
>>>
>>>     Regards,
>>>     Mahmood
>>>
>>>

