[slurm-users] Checking memory requirements in job_submit.lua
Hendryk Bockelmann
bockelmann at dkrz.de
Fri Jun 15 00:07:25 MDT 2018
Hi,
Based on the information given in job_submit_lua.c we decided not to use
pn_min_memory any more. The comment in the source says:
/*
* FIXME: Remove this in the future, lua can't handle 64bit
* numbers!!!. Use min_mem_per_node|cpu instead.
*/
Instead we check in job_submit.lua for something like

if (job_desc.min_mem_per_node ~= nil) and
   (job_desc.min_mem_per_node ~= 0) then
   slurm.log_user("minimum real mem per node specified as %u",
                  job_desc.min_mem_per_node)
end
For mem-per-cpu things are more confusing. Somehow min_mem_per_cpu =
2^63 = 0x8000000000000000 if sbatch/salloc does not set --mem-per-cpu,
instead of being nil as expected! But one can still check for

if (job_desc.min_mem_per_cpu ~= nil) and
   (job_desc.min_mem_per_cpu < 2^63) then
   slurm.log_user("minimum real mem per CPU specified as %u",
                  job_desc.min_mem_per_cpu)
end
Maybe this helps a bit.
CU,
Hendryk
On 14.06.2018 19:38, Prentice Bisbal wrote:
>
> On 06/13/2018 01:59 PM, Prentice Bisbal wrote:
>> In my environment, we have several partitions that are 'general
>> access', with each partition providing different hardware resources
>> (IB, large mem, etc). Then there are other partitions that are for
>> specific departments/projects. Most of this configuration is
>> historical, and I can't just rearrange the partition layout, etc,
>> which would allow Slurm to apply its own logic to redirect jobs to
>> the appropriate nodes.
>>
>> For the general access partitions, I've decided to apply some of this
>> logic in my job_submit.lua script. This logic would look at some of
>> the job specifications and change the QOS/Partition for the job as
>> appropriate. One thing I'm trying to do is have large memory jobs be
>> assigned to my large memory partition, which is named mque for
>> historical reasons.
>>
>> To do this, I have added the following logic to my job_submit.lua script:
>>
>> if job_desc.pn_min_mem > 65536 then
>>    slurm.user_msg("NOTICE: Partition switched to mque due to memory requirements.")
>>    job_desc.partition = 'mque'
>>    job_desc.qos = 'mque'
>>    return slurm.SUCCESS
>> end
>>
>> This works when --mem is specified, but it doesn't seem to work when
>> --mem-per-cpu is specified. What is the best way to check this when
>> --mem-per-cpu is specified instead? Logically, one would have to
>> calculate
>>
>> mem per node = ntasks_per_node * ( ntasks_per_core / min_mem_per_cpu )
>>
>> Is that correct? If so, are there any flaws in the logic/variable names
>> above? Also, is this quantity automatically calculated in Slurm by a
>> variable that is accessible by job_submit.lua at this point, or do I
>> need to calculate this myself?
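>>
>> Or maybe something more along these lines? Completely untested; it
>> assumes job_desc.cpus_per_task and job_desc.ntasks_per_node are
>> available at this point, and the NO_VAL16 guard is only my guess at
>> how unset 16-bit fields come back from the plugin:
>>
>> local NO_VAL16 = 0xfffe
>> local function val_or(x, default)
>>    -- treat nil or a "not set" sentinel as the given default
>>    if x == nil or x >= NO_VAL16 then return default end
>>    return x
>> end
>>
>> if (job_desc.min_mem_per_cpu ~= nil) and
>>    (job_desc.min_mem_per_cpu > 0) and
>>    (job_desc.min_mem_per_cpu < 2^63) then   -- skip if it looks like an unset sentinel
>>    local mem_per_node = job_desc.min_mem_per_cpu *
>>                         val_or(job_desc.cpus_per_task, 1) *
>>                         val_or(job_desc.ntasks_per_node, 1)
>>    if mem_per_node > 65536 then
>>       slurm.user_msg("NOTICE: Partition switched to mque due to memory requirements.")
>>       job_desc.partition = 'mque'
>>       job_desc.qos = 'mque'
>>    end
>> end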
>>
>>
>
> I've given up on calculating mem per node when --mem-per-cpu is
> specified. I was hoping to do this to protect my users from themselves,
> but the more I think about this, the more this looks like a fool's errand.
>
> Prentice
>