[slurm-users] License management and invoking scontrol in the prolog
Ole Holm Nielsen
Ole.H.Nielsen at fysik.dtu.dk
Wed Sep 7 08:09:30 UTC 2022
Hi Davide,
I suggest that you check your job_submit.lua script with the Lua compiler:
luac -p /etc/slurm/job_submit.lua
I have written some more details in my Wiki page
https://wiki.fysik.dtu.dk/niflheim/Slurm_configuration#job-submit-plugins
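
The check above, together with the plugin-file test Brian describes further
down in this thread, could be wrapped in a small shell helper. This is only a
sketch: the function name check_job_submit is made up for illustration, and
both paths are passed as arguments because they differ between installations.
Note that luac -p only parses the script, so it catches syntax errors but not
runtime errors such as a bad argument to string.format.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): sanity-check a job_submit setup.
#   $1 = path to job_submit.lua
#   $2 = path to the job_submit_lua.so plugin file
check_job_submit() {
    script=$1
    plugin=$2

    # Without the plugin file, slurmctld cannot load job_submit/lua at all.
    if [ ! -f "$plugin" ]; then
        echo "plugin missing: $plugin"
        return 1
    fi

    if [ ! -r "$script" ]; then
        echo "script not readable: $script"
        return 1
    fi

    # luac -p parses the file without executing it or writing any output.
    if command -v luac >/dev/null 2>&1; then
        if ! luac -p "$script"; then
            echo "syntax error in $script"
            return 1
        fi
    else
        echo "luac not found, skipping syntax check"
    fi

    echo "ok: $script"
}
```

With the paths mentioned in this thread it would be called as:
check_job_submit /etc/slurm/job_submit.lua /usr/lib64/slurm/job_submit_lua.so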
Best regards,
Ole
On 9/7/22 01:51, Davide DelVento wrote:
> Thanks again to both of you.
>
> I actually did not build Slurm myself, otherwise I'd keep extensive
> logs of what I did. Other people did, so I don't know. However, I get
> the same grep'ing results as yours.
>
> Looking at the logs reveals some info, but it's cryptic.
>
> [2022-09-06T17:33:56.513] debug3: job_submit/lua:
> slurm_lua_loadscript: skipping loading Lua script:
> /opt/slurm/job_submit.lua
> [2022-09-06T17:33:56.513] error: job_submit/lua:
> /opt/slurm/job_submit.lua: [string "slurm.user_msg
> (string.format(table.unpack({...})))"]:1: bad argument #2 to 'format'
> (no value)
>
> As you can see, there is no line number and there is nothing like
> user_msg in this code. There is indeed an "unpack" which is used in
> the SchedMD-defined logging helper function which has a comment
> "Implicit definition of arg was removed in Lua 5.2" and that's where I
> speculate the error occurs.
>
> I should stress, this is with their own example, not my code. I guess
> I could forgo the logging and move forward, but that probably won't
> get me very far.
>
> I am contemplating submitting a GitHub issue about it. I did check
> that the version of the job_submit.lua I have is the same as the one
> currently in the repo at https://github.com/SchedMD/slurm/blob/master/etc/job_submit.lua.example
>
> On Thu, Sep 1, 2022 at 11:55 PM Ole Holm Nielsen
> <Ole.H.Nielsen at fysik.dtu.dk> wrote:
>>
>> Did you install all prerequisite packages (including lua) on the server
>> where you built the Slurm packages?
>>
>> On my system I get:
>>
>> $ strings `which slurmctld ` | grep HAVE_LUA
>> HAVE_LUA 1
>>
>> /Ole
>>
>> https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#install-prerequisites
>>
>> On 9/2/22 05:15, Davide DelVento wrote:
>>> Thanks.
>>>
>>> I did try a lua script as soon as I got your first email, but that
>>> never worked (yes, I enabled it in slurm.conf and ran "scontrol
>>> reconfigure" after). Slurm simply acted as if there was no job_submit script.
>>>
>>> After various tests, all unsuccessful, today I found that link which I
>>> mentioned saying that lua might not be compiled in, hence all my most
>>> recent messages of this thread.
>>>
>>> That file is indeed there, so that's good news that I don't need to recompile.
>>> However I'm puzzled about what might be missing...
>>>
>>>
>>> On Thu, Sep 1, 2022 at 6:33 PM Brian Andrus <toomuchit at gmail.com> wrote:
>>>>
>>>> lua is the language you can use with the job_submit plugin.
>>>>
>>>> I was showing a quick way to see that job_submit capability is indeed in
>>>> there.
>>>>
>>>> You can see if lua support is there by checking whether the
>>>> job_submit_lua.so file exists.
>>>> It would be part of the slurm rpm (not the slurm-slurmctld rpm)
>>>>
>>>> Usually it would be found at /usr/lib64/slurm/job_submit_lua.so
>>>>
>>>> If that is there, you should be good with trying out a job_submit lua
>>>> script.
>>>>
>>>> Brian Andrus
>>>>
>>>> On 9/1/2022 1:24 PM, Davide DelVento wrote:
>>>>> Thanks again, Brian, indeed that grep returns many hits, but none of
>>>>> them includes lua, i.e.
>>>>>
>>>>> strings `which slurmctld ` | grep -i job_submit | grep -i lua
>>>>>
>>>>> returns nothing. So should I use the C interface rather than the more
>>>>> convenient lua one, unless I recompile? Or am I missing something?
>>>>>
>>>>> On Thu, Sep 1, 2022 at 12:30 PM Brian Andrus <toomuchit at gmail.com> wrote:
>>>>>> I would be surprised if it were compiled without the support. However,
>>>>>> you could check and run something like:
>>>>>>
>>>>>> strings /sbin/slurmctld | grep job_submit
>>>>>>
>>>>>> (or wherever your slurmctld binary is). There should be quite a few
>>>>>> lines with that in it.
>>>>>>
>>>>>> Brian Andrus
>>>>>>
>>>>>> On 9/1/2022 10:54 AM, Davide DelVento wrote:
>>>>>>> Thanks Brian for the suggestion, which I am now exploring.
>>>>>>>
>>>>>>> The documentation is a bit cryptic for me, but exploring a few things
>>>>>>> and checking https://funinit.wordpress.com/2018/06/07/how-to-use-job_submit_lua-with-slurm/
>>>>>>> I suspect my slurm install (provided by cluster vendor) was not
>>>>>>> compiled with the lua plugin installed. Do you know how to verify if
>>>>>>> that is the case or if it's something else? I don't see a way to show
>>>>>>> if the plugin is actually being "seen" by slurm, and I suspect it's
>>>>>>> not.
>>>>>>>
>>>>>>> Does anyone else have other suggestions or comment on either the
>>>>>>> plugin or the prolog workaround?
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Aug 30, 2022 at 3:01 PM Brian Andrus <toomuchit at gmail.com> wrote:
>>>>>>>> Not sure if you can do all the things you intend, but the job_submit
>>>>>>>> script is precisely where you want to check submission options.
>>>>>>>>
>>>>>>>> https://slurm.schedmd.com/job_submit_plugins.html
>>>>>>>>
>>>>>>>> Brian Andrus
>>>>>>>>
>>>>>>>> On 8/30/2022 12:58 PM, Davide DelVento wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I would like to soft-enforce license utilization only when -L is
>>>>>>>>> set. My idea: check in the prolog if the license was requested and,
>>>>>>>>> only if it was, set the environment variables needed for the
>>>>>>>>> license.
>>>>>>>>>
>>>>>>>>> I looked at all environment variables set by slurm and did not find
>>>>>>>>> any related to the license as I was hoping.
>>>>>>>>>
>>>>>>>>> As a workaround, I could check
>>>>>>>>>
>>>>>>>>> scontrol show job $SLURM_JOB_ID | grep License
>>>>>>>>>
>>>>>>>>> and that would work, but (as discussed in other messages in this list)
>>>>>>>>> the documentation at https://slurm.schedmd.com/prolog_epilog.html says
>>>>>>>>>
>>>>>>>>>> Prolog and Epilog scripts should be designed to be as short as possible
>>>>>>>>>> and should not call Slurm commands (e.g. squeue, scontrol, sacctmgr,
>>>>>>>>>> etc). [...] Slurm commands in these scripts can potentially lead to performance
>>>>>>>>>> issues and should not be used.
>>>>>>>>> This is a bit of a concern, since the prolog would be invoked for
>>>>>>>>> every job on the cluster, and it's a prolog (rather than the epilog,
>>>>>>>>> as discussed in earlier messages).
>>>>>>>>>
>>>>>>>>> So two questions:
>>>>>>>>>
>>>>>>>>> 1) is there a better workaround to check in the prolog if the current
>>>>>>>>> job requested a license? and/or
>>>>>>>>> 2) would this kind of use of scontrol be okay, or is it indeed a concern?
>>>>>>>>>
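
For reference, the "scontrol show job ... | grep License" workaround from the
original question could be narrowed to print only the requested licenses. This
is a rough sketch, not a recommendation to call scontrol from the prolog (the
documentation quoted above still applies). The function name job_licenses is
made up, and it assumes the value appears as a Licenses=<value> token that
reads "(null)" when no license was requested; please verify against the output
of your own Slurm version.

```shell
#!/bin/sh
# Hypothetical helper: read "scontrol show job" text on stdin and print the
# requested licenses, or print nothing if none were requested.
# Assumption: the field looks like Licenses=<value>, with "(null)" when unset.
job_licenses() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^Licenses=/) {
                value = substr($i, 10)   # skip the 9 characters of "Licenses="
                if (value != "" && value != "(null)")
                    print value
            }
    }'
}

# Illustration with a made-up line (the license name "matlab" is hypothetical):
#   printf '   Licenses=matlab:1 Network=(null)\n' | job_licenses
# In a prolog the input would instead come from:
#   scontrol show job "$SLURM_JOB_ID" | job_licenses
```

Keeping the parsing in a function that reads stdin means it can be tested
against saved output without touching slurmctld.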