[slurm-users] Virtual memory size requested by slurm

Mahmood Naderan mahmood.nt at gmail.com
Mon Jan 27 06:47:41 UTC 2020


>alternatively, you can ask slurm not to limit VSZ: in cgroup.conf, have
>ConstrainSwapSpace=no
>this does not actually permit arbitrary VSZ, since there are mechanisms
>outside the cgroup limit that affect max VSZ (overcommit sysctls, swap space)


Hi Mark,
ConstrainSwapSpace=no or ConstrainSwapSpace=yes has no effect...
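
Regarding the overcommit sysctls and swap space you mentioned, these are the
host-level knobs I can inspect on the node (just the commands, assuming a
RHEL/CentOS style system; I have not changed any of these myself):

  # overcommit policy: 0 = heuristic, 1 = always allow, 2 = strict (uses vm.overcommit_ratio)
  sysctl vm.overcommit_memory vm.overcommit_ratio
  # active swap devices and their sizes
  swapon -s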

[root@hpc ~]$ cat /etc/slurm/cgroup.conf
CgroupAutomount=yes
CgroupReleaseAgentDir="/etc/slurm/cgroup"
ConstrainCores=no
ConstrainRAMSpace=no
ConstrainSwapSpace=no
[root@hpc ~]# systemctl restart slurmd
[root@hpc ~]# systemctl restart slurmdbd
[root@hpc ~]# systemctl restart slurmctld
[root@hpc ~]# su - shams
[shams@hpc ~]$ cat slurm_blast.sh
#!/bin/bash
#SBATCH --job-name=blast1
#SBATCH --output=my_blast.log
#SBATCH --partition=SEA
#SBATCH --account=fish
#SBATCH --mem=38GB
#SBATCH --nodelist=hpc
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=2
export PATH=~/ncbi-blast-2.9.0+/bin:$PATH
blastx -db ~/ncbi-blast-2.9.0+/bin/nr -query ~/khTrinityfilterless1.fasta -max_target_seqs 5 -outfmt 6 -evalue 1e-5 -num_threads 2
[shams@hpc ~]$ sbatch slurm_blast.sh
Submitted batch job 299
[shams@hpc ~]$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
[shams@hpc ~]$ cat my_blast.log
Error memory mapping:/home/shams/ncbi-blast-2.9.0+/bin/nr.52.psq openedFilesCount=151 threadID=0
Error: NCBI C++ Exception:
    T0 "/home/coremake/release_build/build/PrepareRelease_Linux64-Centos_JSID_01_560232_130.14.18.6_9008__PrepareRelease_Linux64-Centos_1552331742/c++/compilers/unix/../../src/corelib/ncbiobj.cpp", line 981: Critical: ncbi::CObject::ThrowNullPointerException() - Attempt to access NULL pointer.
     Stack trace:
      blastx ???:0 ncbi::CStackTraceImpl::CStackTraceImpl() offset=0x77 addr=0x1d95da7
      blastx ???:0 ncbi::CStackTrace::CStackTrace(std::string const&) offset=0x25 addr=0x1d98465
      blastx ???:0 ncbi::CException::x_GetStackTrace() offset=0xA0 addr=0x1ec7330
      blastx ???:0 ncbi::CException::SetSeverity(ncbi::EDiagSev) offset=0x49 addr=0x1ec2169
      blastx ???:0 ncbi::CObject::ThrowNullPointerException() offset=0x2D2 addr=0x1f42582
      blastx ???:0 ncbi::blast::CBlastTracebackSearch::Run() offset=0x61C addr=0xf2929c
      blastx ???:0 ncbi::blast::CLocalBlast::Run() offset=0x404 addr=0xed4684
      blastx ???:0 CBlastxApp::Run() offset=0xC9C addr=0x9cbf7c
      blastx ???:0 ncbi::CNcbiApplication::x_TryMain(ncbi::EAppDiagStream, char const*, int*, bool*) offset=0x8E3 addr=0x1da0e13
      blastx ???:0 ncbi::CNcbiApplication::AppMain(int, char const* const*, char const* const*, ncbi::EAppDiagStream, char const*, std::string const&) offset=0x782 addr=0x1d9f6b2
      blastx ???:0 main offset=0x5E5 addr=0x9caa05
      /lib64/libc.so.6 ???:0 __libc_start_main offset=0xF5 addr=0x7f6119e0e505
      blastx ???:0 blastx() [0x9ca345] offset=0x0 addr=0x9ca345
[shams@hpc ~]$ free -mh
              total        used        free      shared  buff/cache   available
Mem:            62G        9.2G        289M        277M         53G         52G
Swap:            9G        1.0M          9G
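
Since the job disappears from squeue almost immediately and the log shows the
memory-mapping error above, these are the checks I plan to run next to see how
job 299 ended and which limits were actually applied (a sketch only; the cgroup
path assumes cgroup v1 with the default slurm layout, and <jobid> is a
placeholder for a running job):

  # job state, exit code and peak memory as recorded by slurm accounting
  sacct -j 299 --format=JobID,State,ExitCode,MaxRSS,MaxVMSize

  # confirm the cgroup plugins are in use at all, otherwise cgroup.conf is ignored
  scontrol show config | grep -Ei 'taskplugin|proctracktype|jobacctgathertype'

  # while a job is running: the memory limits slurm placed on its cgroup
  cat /sys/fs/cgroup/memory/slurm/uid_$(id -u)/job_<jobid>/memory.limit_in_bytes
  cat /sys/fs/cgroup/memory/slurm/uid_$(id -u)/job_<jobid>/memory.memsw.limit_in_bytes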



Please note that when I run the command directly on the node (outside of sbatch)

[shams@hpc ~]$ blastx -db ~/ncbi-blast-2.9.0+/bin/nr -query ~/khTrinityfilterless1.fasta -max_target_seqs 5 -outfmt 6 -evalue 1e-5 -num_threads 2

The top command shows the following memory values

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
24449 shams     20   0   80.1g   1.3g   1.3g S 199.7  2.1   7:20.67 blastx

The RES value increases slightly over time, but VIRT is about 80GB right from
the beginning of the run.
Even though I have specified a small --mem (currently 38GB), I would expect the
error to appear somewhere in the middle of the run, not right away.
However, when I submit the same command with sbatch, the program terminates
almost immediately.
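
My guess (please correct me if I am wrong) is that the 80GB of VIRT mostly
comes from blastx memory-mapping the nr database files, which would also
explain the "Error memory mapping" line in the log: if slurm caps the job's
virtual address space (for example via VSizeFactor in slurm.conf, which as far
as I understand derives a virtual-memory ulimit from the requested --mem), the
mmap would presumably fail right at startup even though resident usage stays
small. To test this I would submit a tiny batch job (hypothetical test script,
not the real workload) and compare the limits inside and outside slurm:

  scontrol show config | grep -i vsizefactor

  # contents of check_limits.sh, submitted with: sbatch check_limits.sh
  #!/bin/bash
  #SBATCH --job-name=limits
  #SBATCH --output=limits.log
  #SBATCH --partition=SEA
  #SBATCH --account=fish
  #SBATCH --nodelist=hpc
  #SBATCH --mem=38GB
  ulimit -v   # virtual memory cap in KB; "unlimited" means no RLIMIT_AS limit
  ulimit -a   # full set of limits seen by the job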


Regards,
Mahmood