<div dir="ltr"><div dir="ltr"><div style="font-family:tahoma,sans-serif" class="gmail_default">
<p class="MsoNormal"><span style="font-size:11pt;font-family:'Calibri',sans-serif;color:rgb(31,73,125)">>This line is probably what is limiting you to around 40gb.</span></p>
<p class="MsoNormal"><span style="font-family:'Tahoma',sans-serif">>#SBATCH --mem=38GB</span></p>
</div><div class="gmail_default" style="font-family:tahoma,sans-serif"><br></div><div class="gmail_default" style="font-family:tahoma,sans-serif">Yes. If I change that value, the "ulimit -v" limit changes as well. See below:</div><div class="gmail_default" style="font-family:tahoma,sans-serif"><br></div><div class="gmail_default" style="font-family:tahoma,sans-serif">[shams@hpc ~]$ cat slurm_blast.sh | grep mem<br>#SBATCH --mem=50GB<br>[shams@hpc ~]$ cat my_blast.log<br>virtual memory      (kbytes, -v) 57671680<br>/var/spool/slurmd/job00306/slurm_script: line 13: ulimit: virtual memory: cannot modify limit: Operation not permitted<br>virtual memory      (kbytes, -v) 57671680<br>Error memory mapping:/home/shams/ncbi-blast-2.9.0+/bin/nr.69.psq openedFilesCount=168 threadID=0<br>Error: NCBI C++ Exception:</div><div class="gmail_default" style="font-family:tahoma,sans-serif"><br></div><div class="gmail_default" style="font-family:tahoma,sans-serif"><br></div><div class="gmail_default" style="font-family:tahoma,sans-serif">However, changing that parameter is not the solution. There are two issues with it:</div><div class="gmail_default" style="font-family:tahoma,sans-serif"><br></div><div class="gmail_default" style="font-family:tahoma,sans-serif">1) --mem specifies the physical memory that the job requests, which Slurm then reserves for the job.</div><div class="gmail_default" style="font-family:tahoma,sans-serif">So, on a 64GB node, if a user requests --mem=50GB, effectively no one else can run a job that needs 10GB of memory.</div><div class="gmail_default" style="font-family:tahoma,sans-serif"><br></div><div class="gmail_default" style="font-family:tahoma,sans-serif">2) The virtual size of the program, according to top, is about 140GB. 
So, if I set --mem=140GB, the job gets stuck in the queue because the request is invalid (the node has only 64GB of memory).</div><div class="gmail_default" style="font-family:tahoma,sans-serif"><br></div><div class="gmail_default" style="font-family:tahoma,sans-serif"><br></div><div class="gmail_default" style="font-family:tahoma,sans-serif">I really think there is a problem with Slurm, but I cannot find its root cause. The Slurm config parameters are:</div><div class="gmail_default" style="font-family:tahoma,sans-serif"><br></div><div class="gmail_default" style="font-family:tahoma,sans-serif">Configuration data as of 2020-01-28T08:04:55<br>AccountingStorageBackupHost = (null)<br>AccountingStorageEnforce = associations,limits,qos,safe,wckeys<br>AccountingStorageHost  = hpc<br>AccountingStorageLoc   = N/A<br>AccountingStoragePort  = 6819<br>AccountingStorageTRES  = cpu,mem,energy,node,billing,fs/disk,vmem,pages,gres/gpu<br>AccountingStorageType  = accounting_storage/slurmdbd<br>AccountingStorageUser  = N/A<br>AccountingStoreJobComment = Yes<br>AcctGatherEnergyType   = acct_gather_energy/none<br>AcctGatherFilesystemType = acct_gather_filesystem/none<br>AcctGatherInterconnectType = acct_gather_interconnect/none<br>AcctGatherNodeFreq    = 0 sec<br>AcctGatherProfileType  = acct_gather_profile/none<br>AllowSpecResourcesUsage = 0<br>AuthAltTypes       = (null)<br>AuthInfo         = (null)<br>AuthType         = auth/munge<br>BatchStartTimeout    = 10 sec<br>BOOT_TIME        = 2020-01-27T09:53:58<br>BurstBufferType     = (null)<br>CheckpointType      = checkpoint/none<br>CliFilterPlugins     = (null)<br>ClusterName       = jupiter<br>CommunicationParameters = (null)<br>CompleteWait       = 0 sec<br>CoreSpecPlugin      = core_spec/none<br>CpuFreqDef        = Unknown<br>CpuFreqGovernors     = Performance,OnDemand,UserSpace<br>CredType         = cred/munge<br>DebugFlags        = Backfill,BackfillMap,NO_CONF_HASH,Priority<br>DefMemPerNode      = UNLIMITED<br>DisableRootJobs 
    = No<br>EioTimeout        = 60<br>EnforcePartLimits    = NO<br>Epilog          = (null)<br>EpilogMsgTime      = 2000 usec<br>EpilogSlurmctld     = (null)<br>ExtSensorsType      = ext_sensors/none<br>ExtSensorsFreq      = 0 sec<br>FairShareDampeningFactor = 5<br>FastSchedule       = 0<br>FederationParameters   = (null)<br>FirstJobId        = 1<br>GetEnvTimeout      = 2 sec<br>GresTypes        = gpu<br>GpuFreqDef        = high,memory=high<br>GroupUpdateForce     = 1<br>GroupUpdateTime     = 600 sec<br>HASH_VAL         = Match<br>HealthCheckInterval   = 0 sec<br>HealthCheckNodeState   = ANY<br>HealthCheckProgram    = (null)<br>InactiveLimit      = 30 sec<br>JobAcctGatherFrequency  = 30<br>JobAcctGatherType    = jobacct_gather/linux<br>JobAcctGatherParams   = (null)<br>JobCheckpointDir     = /var/spool/slurm.checkpoint<br>JobCompHost       = hpc<br>JobCompLoc        = /var/log/slurm_jobcomp.log<br>JobCompPort       = 0<br>JobCompType       = jobcomp/none<br>JobCompUser       = root<br>JobContainerType     = job_container/none<br>JobCredentialPrivateKey = (null)<br>JobCredentialPublicCertificate = (null)<br>JobDefaults       = (null)<br>JobFileAppend      = 0<br>JobRequeue        = 1<br>JobSubmitPlugins     = (null)<br>KeepAliveTime      = SYSTEM_DEFAULT<br>KillOnBadExit      = 0<br>KillWait         = 60 sec<br>LaunchParameters     = (null)<br>LaunchType        = launch/slurm<br>Layouts         =<br>Licenses         = (null)<br>LicensesUsed       = (null)<br>LogTimeFormat      = iso8601_ms<br>MailDomain        = (null)<br>MailProg         = /bin/mail<br>MaxArraySize       = 1001<br>MaxJobCount       = 10000<br>MaxJobId         = 67043328<br>MaxMemPerNode      = UNLIMITED<br>MaxStepCount       = 40000<br>MaxTasksPerNode     = 512<br>MCSPlugin        = mcs/none<br>MCSParameters      = (null)<br>MessageTimeout      = 10 sec<br>MinJobAge        = 300 sec<br>MpiDefault        = none<br>MpiParams        = (null)<br>MsgAggregationParams   = (null)<br>NEXT_JOB_ID       = 
305<br>NodeFeaturesPlugins   = (null)<br>OverTimeLimit      = 0 min<br>PluginDir        = /usr/lib64/slurm<br>PlugStackConfig     = /etc/slurm/plugstack.conf<br>PowerParameters     = (null)<br>PowerPlugin       =<br>PreemptMode       = OFF<br>PreemptType       = preempt/none<br>PreemptExemptTime    = 00:00:00<br>PriorityParameters    = (null)<br>PrioritySiteFactorParameters = (null)<br>PrioritySiteFactorPlugin = (null)<br>PriorityDecayHalfLife  = 14-00:00:00<br>PriorityCalcPeriod    = 00:05:00<br>PriorityFavorSmall    = No<br>PriorityFlags      =<br>PriorityMaxAge      = 1-00:00:00<br>PriorityUsageResetPeriod = NONE<br>PriorityType       = priority/multifactor<br>PriorityWeightAge    = 10<br>PriorityWeightAssoc   = 0<br>PriorityWeightFairShare = 10000<br>PriorityWeightJobSize  = 100<br>PriorityWeightPartition = 10000<br>PriorityWeightQOS    = 0<br>PriorityWeightTRES    = cpu=2000,mem=1,gres/gpu=400<br>PrivateData       = none<br>ProctrackType      = proctrack/linuxproc<br>Prolog          = (null)<br>PrologEpilogTimeout   = 65534<br>PrologSlurmctld     = (null)<br>PrologFlags       = (null)<br>PropagatePrioProcess   = 0<br>PropagateResourceLimits = ALL<br>PropagateResourceLimitsExcept = (null)<br>RebootProgram      = (null)<br>ReconfigFlags      = (null)<br>RequeueExit       = (null)<br>RequeueExitHold     = (null)<br>ResumeFailProgram    = (null)<br>ResumeProgram      = /etc/slurm/resumehost.sh<br>ResumeRate        = 4 nodes/min<br>ResumeTimeout      = 450 sec<br>ResvEpilog        = (null)<br>ResvOverRun       = 0 min<br>ResvProlog        = (null)<br>ReturnToService     = 2<br>RoutePlugin       = route/default<br>SallocDefaultCommand   = (null)<br>SbcastParameters     = (null)<br>SchedulerParameters   = (null)<br>SchedulerTimeSlice    = 30 sec<br>SchedulerType      = sched/backfill<br>SelectType        = select/cons_res<br>SelectTypeParameters   = CR_CORE_MEMORY<br>SlurmUser        = root(0)<br>SlurmctldAddr      = (null)<br>SlurmctldDebug      = 
info<br>SlurmctldHost[0]     = hpc(10.1.1.1)<br>SlurmctldLogFile     = /var/log/slurm/slurmctld.log<br>SlurmctldPort      = 6817<br>SlurmctldSyslogDebug   = unknown<br>SlurmctldPrimaryOffProg = (null)<br>SlurmctldPrimaryOnProg  = (null)<br>SlurmctldTimeout     = 300 sec<br>SlurmctldParameters   = (null)<br>SlurmdDebug       = info<br>SlurmdLogFile      = /var/log/slurm/slurmd.log<br>SlurmdParameters     = (null)<br>SlurmdPidFile      = /var/run/slurmd.pid<br>SlurmdPort        = 6818<br>SlurmdSpoolDir      = /var/spool/slurmd<br>SlurmdSyslogDebug    = unknown<br>SlurmdTimeout      = 300 sec<br>SlurmdUser        = root(0)<br>SlurmSchedLogFile    = (null)<br>SlurmSchedLogLevel    = 0<br>SlurmctldPidFile     = /var/run/slurmctld.pid<br>SlurmctldPlugstack    = (null)<br>SLURM_CONF        = /etc/slurm/slurm.conf<br>SLURM_VERSION      = 19.05.2<br>SrunEpilog        = (null)<br>SrunPortRange      = 0-0<br>SrunProlog        = (null)<br>StateSaveLocation    = /var/spool/slurm.state<br>SuspendExcNodes     = (null)<br>SuspendExcParts     = (null)<br>SuspendProgram      = /etc/slurm/suspendhost.sh<br>SuspendRate       = 4 nodes/min<br>SuspendTime       = NONE<br>SuspendTimeout      = 45 sec<br>SwitchType        = switch/none<br>TaskEpilog        = (null)<br>TaskPlugin        = task/affinity<br>TaskPluginParam     = (null type)<br>TaskProlog        = (null)<br>TCPTimeout        = 2 sec<br>TmpFS          = /state/partition1<br>TopologyParam      = (null)<br>TopologyPlugin      = topology/none<br>TrackWCKey        = Yes<br>TreeWidth        = 50<br>UsePam          = 0<br>UnkillableStepProgram  = (null)<br>UnkillableStepTimeout  = 60 sec<br>VSizeFactor       = 110 percent<br>WaitTime         = 60 sec<br>X11Parameters      = (null)<br></div><div class="gmail_default" style="font-family:tahoma,sans-serif"><br></div><div class="gmail_default" style="font-family:tahoma,sans-serif"><br></div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><font 
face="tahoma,sans-serif">Regards,<br>Mahmood</font><br><br><br></div></div></div><br></div></div>
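<div class="gmail_default" style="font-family:tahoma,sans-serif">A possible lead on the limit reported in my_blast.log: the "ulimit -v" value of 57671680 kB appears to equal --mem=50GB scaled by the "VSizeFactor = 110 percent" entry in the configuration above. The sketch below only checks that arithmetic. It assumes that Slurm derives the virtual memory limit as --mem multiplied by VSizeFactor and that 1GB is counted as 1024*1024 kB; the function name is hypothetical, and this is not a confirmed diagnosis.</div>

```python
def vmem_limit_kb(mem_gb: int, vsize_factor_percent: int) -> int:
    """Hypothetical helper: virtual memory limit in kB, assuming
    limit = --mem * VSizeFactor / 100 and 1 GB = 1024 * 1024 kB."""
    mem_kb = mem_gb * 1024 * 1024
    # Integer arithmetic avoids floating-point rounding surprises.
    return mem_kb * vsize_factor_percent // 100


# --mem=50GB with VSizeFactor = 110 percent, as in the job and config above.
limit = vmem_limit_kb(50, 110)
print(limit)  # 57671680, the value "ulimit -v" printed in my_blast.log
```

<div class="gmail_default" style="font-family:tahoma,sans-serif">If that assumption holds, the ~140GB virtual size that top reports would exceed this limit for any --mem value the 64GB node can satisfy, so VSizeFactor may be the setting worth examining rather than --mem.</div>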