[slurm-users] [External] sbatch: error: memory allocation failure
Prentice Bisbal
pbisbal at pppl.gov
Thu Jun 17 19:45:08 UTC 2021
Mike,
You didn't include your entire sbatch script, so it's hard to say
what's going wrong when we only have a single line to work with. Based
on what you have told us, my guess is that you are specifying a
per-node memory requirement greater than 128000 MB. When you specify a
nodelist, Slurm will assign your job to all of those nodes, not to a
subset that satisfies your other job specifications (--mem,
--mem-per-cpu, --ntasks, etc.). From the sbatch man page:
> *-w*, *--nodelist*=<node name list>
>     Request a specific list of hosts. The job will contain /all/ of
>     these hosts and possibly additional hosts as needed to satisfy
>     resource requirements.
>
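For illustration only, here is a minimal, hypothetical job script (the
node names, task counts, and memory figures are made up to match your
config below, not taken from your actual script) that could create
exactly this kind of conflict:

    #!/bin/bash
    #SBATCH --job-name=memtest
    #SBATCH --nodelist=compute[001-014]   # Slurm must place the job on ALL 14 hosts
    #SBATCH --ntasks-per-node=16
    #SBATCH --mem=200000                  # per-node limit; compute[001-006] only have RealMemory=128000
    srun ./my_app                         # placeholder application

Because --mem is a per-node limit, no allocation that includes
compute[001-006] can ever provide 200000 MB, so a request like the one
above can never be satisfied. Either drop --nodelist and let Slurm pick
nodes that fit, or cap the request at the smallest node's memory
(e.g. --mem=120000).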
Prentice
On 6/7/21 7:46 PM, Yap, Mike wrote:
>
> Hi All
>
> Can anyone advise on possible causes of the error message below when
> submitting a job?
>
> *sbatch: error: memory allocation failure*
>
> The same script works perfectly fine until I include *#SBATCH
> --nodelist=(compute[015-046])* (once removed, it works as it should)
>
> The issues:
>
> 1. For the current setup, each compute node category has specific
>    resources:
>    1a. NodeName=compute[007-014] Procs=36 CoresPerSocket=18
>        RealMemory=384000 ThreadsPerCore=1 Boards=1 SocketsPerBoard=2
>        (newer model)
>    1b. NodeName=compute[001-006] Procs=16 CoresPerSocket=18
>        RealMemory=128000 ThreadsPerCore=1 Boards=1 SocketsPerBoard=2
> 2. The same resources are shared between multiple queues (working fine)
> 3. When running a parallel job, the exact same job runs when assigned
>    to a single node category (i.e. exclusively on 1a or 1b)
> 4. When the exact same job is assigned across both 1a and 1b, it runs
>    on the 1b nodes but shows no activity on the 1a nodes
>
> Any suggestions?
>
> Thanks
>
> Mike
>