[slurm-users] License management and invoking scontrol in the prolog
Brian Andrus
toomuchit at gmail.com
Wed Sep 7 17:16:49 UTC 2022
Possibly way off base, but did you happen to do any of the editing in
Windows? Maybe you are running into the CR/LF issue with how Windows saves
text files?
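
A quick way to test that guess, as a minimal sketch only: the snippet below writes a throwaway file with Windows-style line endings (the file and its two Lua lines are made up for illustration), counts the lines that contain a carriage return, and then strips them with tr. I have not confirmed that this is what is wrong on your system.

```sh
#!/bin/sh
# Hypothetical scratch file standing in for the real script
# (e.g. /opt/slurm/job_submit.lua); the contents are illustrative only.
f=$(mktemp)
printf 'slurm.log_info("hello")\r\nreturn slurm.SUCCESS\r\n' > "$f"

# A literal carriage return to search for
cr=$(printf '\r')

# Count the lines that contain a CR; any nonzero count means CR/LF endings
crlf_lines=$(grep -c "$cr" "$f" || true)
echo "lines with CR before cleanup: $crlf_lines"

# Write a copy with every CR removed, then count again
tr -d '\r' < "$f" > "$f.unix"
clean_lines=$(grep -c "$cr" "$f.unix" || true)
echo "lines with CR after cleanup: $clean_lines"

rm -f "$f" "$f.unix"
```

On the real file, running "file /opt/slurm/job_submit.lua" should also mention CRLF line terminators if that is the problem.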
Brian Andrus
On 9/7/2022 5:21 AM, Davide DelVento wrote:
> Thanks Ole, your wiki page sheds some light on this mystery.
> Very frustrating that even the simple example provided in the release
> fails, and it fails at the most basic logging functionality.
>
> Note that "my" job_submit.lua is now the unmodified, slurm-provided
> one.... and that the luac command returns nothing in my case (this is
> Lua 5.3.4) so syntax seems correct?
>
> Yet the logs report the problem I mentioned rather than the actual
> content that the plugin is attempting to log.
>
> On Wed, Sep 7, 2022 at 2:13 AM Ole Holm Nielsen
> <Ole.H.Nielsen at fysik.dtu.dk> wrote:
>> Hi Davide,
>>
>> I suggest that you check your job_submit.lua script with the Lua compiler:
>>
>> luac -p /etc/slurm/job_submit.lua
>>
>> I have written some more details in my Wiki page
>> https://wiki.fysik.dtu.dk/niflheim/Slurm_configuration#job-submit-plugins
>>
>> Best regards,
>> Ole
>>
>> On 9/7/22 01:51, Davide DelVento wrote:
>>> Thanks again to both of you.
>>>
>>> I actually did not build Slurm myself, otherwise I'd keep extensive
>>> logs of what I did. Other people did, so I don't know. However, I get
>>> the same grep'ing results as yours.
>>>
>>> Looking at the logs reveals some info, but it's cryptic.
>>>
>>> [2022-09-06T17:33:56.513] debug3: job_submit/lua:
>>> slurm_lua_loadscript: skipping loading Lua script:
>>> /opt/slurm/job_submit.lua
>>> [2022-09-06T17:33:56.513] error: job_submit/lua:
>>> /opt/slurm/job_submit.lua: [string "slurm.user_msg
>>> (string.format(table.unpack({...})))"]:1: bad argument #2 to 'format'
>>> (no value)
>>>
>>> As you can see, there is no line number and there is nothing like
>>> user_msg in this code. There is indeed an "unpack" which is used in
>>> the SchedMD-defined logging helper function which has a comment
>>> "Implicit definition of arg was removed in Lua 5.2" and that's where I
>>> speculate the error occurs.
>>>
>>> I should stress, this is with their own example, not my code. I guess
>>> I could forgo the logging and move forward, but that probably won't
>>> get me very far.
>>>
>>> I am contemplating submitting a github issue about it? I did check
>>> that the version of the job_submit.lua I have is the same currently in
>>> the repo at https://github.com/SchedMD/slurm/blob/master/etc/job_submit.lua.example
>>>
>>> On Thu, Sep 1, 2022 at 11:55 PM Ole Holm Nielsen
>>> <Ole.H.Nielsen at fysik.dtu.dk> wrote:
>>>> Did you install all prerequisite packages (including lua) on the server
>>>> where you built the Slurm packages?
>>>>
>>>> On my system I get:
>>>>
>>>> $ strings `which slurmctld ` | grep HAVE_LUA
>>>> HAVE_LUA 1
>>>>
>>>> /Ole
>>>>
>>>> https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#install-prerequisites
>>>>
>>>> On 9/2/22 05:15, Davide DelVento wrote:
>>>>> Thanks.
>>>>>
>>>>> I did try a lua script as soon as I got your first email, but that
>>>>> never worked (yes, I enabled it in slurm.conf and ran "scontrol
>>>>> reconfigure" after). Slurm simply acted as if there was no job_submit script.
>>>>>
>>>>> After various tests, all unsuccessful, today I found that link which I
>>>>> mentioned saying that lua might not be compiled in, hence all my most
>>>>> recent messages of this thread.
>>>>>
>>>>> That file is indeed there, so that's good news that I don't need to recompile.
>>>>> However I'm puzzled on what might be missing...
>>>>>
>>>>>
>>>>> On Thu, Sep 1, 2022 at 6:33 PM Brian Andrus <toomuchit at gmail.com> wrote:
>>>>>> lua is the language you can use with the job_submit plugin.
>>>>>>
>>>>>> I was showing a quick way to see that job_submit capability is indeed in
>>>>>> there.
>>>>>>
>>>>>> You can see if lua support is there by checking whether the
>>>>>> job_submit_lua.so file is there.
>>>>>> It would be part of the slurm rpm (not the slurm-slurmctld rpm)
>>>>>>
>>>>>> Usually it would be found at /usr/lib64/slurm/job_submit_lua.so
>>>>>>
>>>>>> If that is there, you should be good with trying out a job_submit lua
>>>>>> script.
>>>>>>
>>>>>> Brian Andrus
>>>>>>
>>>>>> On 9/1/2022 1:24 PM, Davide DelVento wrote:
>>>>>>> Thanks again, Brian, indeed that grep returns many hits, but none of
>>>>>>> them includes lua, i.e.
>>>>>>>
>>>>>>> strings `which slurmctld ` | grep -i job_submit | grep -i lua
>>>>>>>
>>>>>>> returns nothing. So I should use the C interface rather than the more
>>>>>>> convenient lua one unless I recompile, or am I missing something?
>>>>>>>
>>>>>>> On Thu, Sep 1, 2022 at 12:30 PM Brian Andrus <toomuchit at gmail.com> wrote:
>>>>>>>> I would be surprised if it were compiled without the support. However,
>>>>>>>> you could check and run something like:
>>>>>>>>
>>>>>>>> strings /sbin/slurmctld | grep job_submit
>>>>>>>>
>>>>>>>> (or wherever your slurmctld binary is). There should be quite a few
>>>>>>>> lines with that in it.
>>>>>>>>
>>>>>>>> Brian Andrus
>>>>>>>>
>>>>>>>> On 9/1/2022 10:54 AM, Davide DelVento wrote:
>>>>>>>>> Thanks Brian for the suggestion, which I am now exploring.
>>>>>>>>>
>>>>>>>>> The documentation is a bit cryptic for me, but exploring a few things
>>>>>>>>> and checking https://funinit.wordpress.com/2018/06/07/how-to-use-job_submit_lua-with-slurm/
>>>>>>>>> I suspect my slurm install (provided by cluster vendor) was not
>>>>>>>>> compiled with the lua plugin installed. Do you know how to verify if
>>>>>>>>> that is the case or if it's something else? I don't see a way to show
>>>>>>>>> if the plugin is actually being "seen" by slurm, and I suspect it's
>>>>>>>>> not.
>>>>>>>>>
>>>>>>>>> Does anyone else have other suggestions or comment on either the
>>>>>>>>> plugin or the prolog workaround?
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Aug 30, 2022 at 3:01 PM Brian Andrus <toomuchit at gmail.com> wrote:
>>>>>>>>>> Not sure if you can do all the things you intend, but the job_submit
>>>>>>>>>> script is precisely where you want to check submission options.
>>>>>>>>>>
>>>>>>>>>> https://slurm.schedmd.com/job_submit_plugins.html
>>>>>>>>>>
>>>>>>>>>> Brian Andrus
>>>>>>>>>>
>>>>>>>>>> On 8/30/2022 12:58 PM, Davide DelVento wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I would like to soft-enforce license utilization only when the -L is
>>>>>>>>>>> set. My idea: check in the prolog if the license was requested and
>>>>>>>>>>> only if it was, set the environment variables needed for the
>>>>>>>>>>> license.
>>>>>>>>>>>
>>>>>>>>>>> I looked at all environment variables set by slurm and did not find
>>>>>>>>>>> any related to the license as I was hoping.
>>>>>>>>>>>
>>>>>>>>>>> As a workaround, I could check
>>>>>>>>>>>
>>>>>>>>>>> scontrol show job $SLURM_JOB_ID | grep License
>>>>>>>>>>>
>>>>>>>>>>> and that would work, but (as discussed in other messages in this list)
>>>>>>>>>>> the documentation at https://slurm.schedmd.com/prolog_epilog.html says
>>>>>>>>>>>
>>>>>>>>>>>> Prolog and Epilog scripts should be designed to be as short as possible
>>>>>>>>>>>> and should not call Slurm commands (e.g. squeue, scontrol, sacctmgr,
>>>>>>>>>>>> etc). [...] Slurm commands in these scripts can potentially lead to performance
>>>>>>>>>>>> issues and should not be used.
>>>>>>>>>>> This is a bit of a concern, since the prolog would be invoked for
>>>>>>>>>>> every job on the cluster, and it's a prolog (rather than the epilog
>>>>>>>>>>> discussed in earlier messages).
>>>>>>>>>>>
>>>>>>>>>>> So two questions:
>>>>>>>>>>>
>>>>>>>>>>> 1) is there a better workaround to check in the prolog if the current
>>>>>>>>>>> job requested a license? and/or
>>>>>>>>>>> 2) would this kind of use of scontrol be okay, or is it indeed a concern?
>>>>>>>>>>>