[slurm-users] GRES and GPUs
Xaver Stiensmeier
xaverstiensmeier at gmx.de
Thu Jul 20 14:02:21 UTC 2023
Hey everyone,
I am answering my own question:
It wasn't working because I needed to restart *slurmd* on the node, too.
So the full "test GPU management without a GPU" workflow is:
1. Start your slurm cluster.
2. Add a GPU to an instance of your choice in the *slurm.conf*.
For example:

DebugFlags=GRES   # consider this for the initial setup
SelectType=select/cons_tres
GresTypes=gpu
NodeName=master SocketsPerBoard=8 CoresPerSocket=1 RealMemory=8000 Gres=gpu:1 State=UNKNOWN
3. Register it in *gres.conf* and give it *some file*:

NodeName=master Name=gpu File=/dev/tty0 Count=1   # Count seems to be optional
4. Restart slurmctld (on the master) and slurmd (on the GPU node):

sudo systemctl restart slurmctld
sudo systemctl restart slurmd
I haven't tested this solution thoroughly yet, but at least commands like:

sudo systemctl restart slurmd   # master

run without any issues afterwards.
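To double-check that the GRES is actually registered and usable, something like the following should work (just a sketch based on the example above, not verified beyond the basics):

scontrol show node master | grep -i gres   # the Gres= line should now show gpu:1
srun --gpus 1 hostname                     # should no longer fail with "Invalid generic resource (gres) specification"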
Thank you for all your help!
Best regards,
Xaver
On 19.07.23 17:05, Xaver Stiensmeier wrote:
>
> Hi Hermann,
>
> count doesn't make a difference, but I noticed that when I reconfigure
> slurm and do reloads afterwards, the error "gpu count lower than
> configured" no longer appears - so maybe it is just because a
> reconfigure is needed after reloading slurmctld - or maybe it doesn't
> show the error anymore, because the node is still invalid? However, I
> still get the error:
>
> error: _slurm_rpc_node_registration node=NName: Invalid argument
>
> If I understand correctly, this is telling me that there's something
> wrong with my slurm.conf. I know that all pre-existing parameters are
> correct, so I assume it must be the gpus entry, but I don't see where
> it's wrong:
>
> NodeName=NName SocketsPerBoard=8 CoresPerSocket=1 RealMemory=8000
> Gres=gpu:1 State=CLOUD # bibiserv
>
> Thanks for all the help,
> Xaver
>
> On 19.07.23 15:04, Hermann Schwärzler wrote:
>> Hi Xaver,
>>
>> I think you are missing the "Count=..." part in gres.conf
>>
>> It should read
>>
>> NodeName=NName Name=gpu File=/dev/tty0 Count=1
>>
>> in your case.
>>
>> Regards,
>> Hermann
>>
>> On 7/19/23 14:19, Xaver Stiensmeier wrote:
>>> Okay,
>>>
>>> thanks to S. Zhang I was able to figure out why nothing changed.
>>> While I did restart slurmctld at the beginning of my tests, I
>>> didn't do so later, because I felt like it was unnecessary, but it
>>> is right there in the fourth line of the log that this is needed.
>>> Somehow I misread it and thought it automatically restarted slurmctld.
>>>
>>> Given the setup:
>>>
>>> slurm.conf
>>> ...
>>> GresTypes=gpu
>>> NodeName=NName SocketsPerBoard=8 CoresPerSocket=1 RealMemory=8000
>>> GRES=gpu:1 State=UNKNOWN
>>> ...
>>>
>>> gres.conf
>>> NodeName=NName Name=gpu File=/dev/tty0
>>>
>>> When restarting, I get the following error:
>>>
>>> error: Setting node NName state to INVAL with reason:gres/gpu count
>>> reported lower than configured (0 < 1)
>>>
>>> So it is still not working, but at least I get a more helpful log
>>> message. Because I know that this /dev/tty trick works, I am still
>>> unsure where the current error lies, but I will try to investigate
>>> it further. I am thankful for any ideas in that regard.
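>>>
>>> One thing I might still try (assuming the slurmd on that node supports the -G flag) is to print the GRES that slurmd itself detects and compare it to what slurm.conf expects:
>>>
>>> sudo slurmd -G   # prints the GRES configuration slurmd would report to slurmctld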
>>>
>>> Best regards,
>>> Xaver
>>>
>>> On 19.07.23 10:23, Xaver Stiensmeier wrote:
>>>>
>>>> Alright,
>>>>
>>>> I tried a few more things, but I still wasn't able to get past:
>>>> srun: error: Unable to allocate resources: Invalid generic resource
>>>> (gres) specification.
>>>>
>>>> I should mention that the node I am trying to test GPUs on doesn't
>>>> really have a GPU, but Rob was kind enough to find out that you do
>>>> not need one as long as you just point to a file in /dev/ in the
>>>> gres.conf. As mentioned: this is just for testing purposes - in the
>>>> end we will run this on a node with a GPU, but it is not available
>>>> at the moment.
>>>>
>>>> *The error isn't changing*
>>>>
>>>> If I omit "GresTypes=gpu" and "Gres=gpu:1", I still get the same
>>>> error.
>>>>
>>>> *Debug Info*
>>>>
>>>> I added the gpu debug flag and logged the following:
>>>>
>>>> [2023-07-18T14:59:45.026] restoring original state of nodes
>>>> [2023-07-18T14:59:45.026] select/cons_tres: part_data_create_array:
>>>> select/cons_tres: preparing for 2 partitions
>>>> [2023-07-18T14:59:45.026] error: GresPlugins changed from (null) to
>>>> gpu ignored
>>>> [2023-07-18T14:59:45.026] error: Restart the slurmctld daemon to
>>>> change GresPlugins
>>>> [2023-07-18T14:59:45.026] read_slurm_conf: backup_controller not
>>>> specified
>>>> [2023-07-18T14:59:45.026] error: GresPlugins changed from (null) to
>>>> gpu ignored
>>>> [2023-07-18T14:59:45.026] error: Restart the slurmctld daemon to
>>>> change GresPlugins
>>>> [2023-07-18T14:59:45.026] select/cons_tres: select_p_reconfigure:
>>>> select/cons_tres: reconfigure
>>>> [2023-07-18T14:59:45.027] select/cons_tres: part_data_create_array:
>>>> select/cons_tres: preparing for 2 partitions
>>>> [2023-07-18T14:59:45.027] No parameter for mcs plugin, default
>>>> values set
>>>> [2023-07-18T14:59:45.027] mcs: MCSParameters = (null). ondemand set.
>>>> [2023-07-18T14:59:45.028] _slurm_rpc_reconfigure_controller:
>>>> completed usec=5898
>>>> [2023-07-18T14:59:45.952]
>>>> SchedulerParameters=default_queue_depth=100,max_rpc_cnt=0,max_sched_time=2,partition_job_depth=0,sched_max_job_start=0,sched_min_interval=2
>>>>
>>>> I am a bit unsure what to do next to further investigate this issue.
>>>>
>>>> Best regards,
>>>> Xaver
>>>>
>>>> On 17.07.23 15:57, Groner, Rob wrote:
>>>>> That would certainly do it. If you look at the slurmctld log when
>>>>> it comes up, it will say that it's marking that node as invalid
>>>>> because it has fewer (0) gres resources than you say it should
>>>>> have. That's because slurmd on that node will come up and say
>>>>> "What gres resources??"
>>>>>
>>>>> For testing purposes, you can just create a dummy file on the
>>>>> node, then in gres.conf, point to that file as the "graphics file"
>>>>> interface. As long as you don't try to actually use it as a
>>>>> graphics file, that should be enough for that node to think it has
>>>>> gres/gpu resources. That's what I do in my vagrant slurm cluster.
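>>>>>
>>>>> A minimal sketch of that approach (using /dev/tty0 as the dummy
>>>>> file, like elsewhere in this thread):
>>>>>
>>>>> # gres.conf on the GPU-less node
>>>>> NodeName=test Name=gpu File=/dev/tty0
>>>>>
>>>>> # matching entries in slurm.conf
>>>>> GresTypes=gpu
>>>>> NodeName=test Gres=gpu:1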
>>>>>
>>>>> Rob
>>>>>
>>>>> ------------------------------------------------------------------------
>>>>>
>>>>> *From:* slurm-users <slurm-users-bounces at lists.schedmd.com> on
>>>>> behalf of Xaver Stiensmeier <xaverstiensmeier at gmx.de>
>>>>> *Sent:* Monday, July 17, 2023 9:43 AM
>>>>> *To:* slurm-users at lists.schedmd.com <slurm-users at lists.schedmd.com>
>>>>> *Subject:* Re: [slurm-users] GRES and GPUs
>>>>> Hi Hermann,
>>>>>
>>>>> Good idea, but we are already using `SelectType=select/cons_tres`.
>>>>> After setting everything up again (in case I made an unnoticed
>>>>> mistake), I saw that the node got marked STATE=inval.
>>>>>
>>>>> To be honest, I thought I could just claim that a node has a GPU
>>>>> even if it doesn't have one - just for testing purposes. Could this
>>>>> be the issue?
>>>>>
>>>>> Best regards,
>>>>> Xaver Stiensmeier
>>>>>
>>>>> On 17.07.23 14:11, Hermann Schwärzler wrote:
>>>>> > Hi Xaver,
>>>>> >
>>>>> > what kind of SelectType are you using in your slurm.conf?
>>>>> >
>>>>> > Per https://slurm.schedmd.com/gres.html you have to consider:
>>>>> > "As for the --gpu* option, these options are only supported by
>>>>> Slurm's
>>>>> > select/cons_tres plugin."
>>>>> >
>>>>> > So you can use "--gpus ..." only when you state
>>>>> > SelectType = select/cons_tres
>>>>> > in your slurm.conf.
>>>>> >
>>>>> > But "--gres=gpu:1" should work always.
>>>>> >
>>>>> > Regards
>>>>> > Hermann
>>>>> >
>>>>> >
>>>>> > On 7/17/23 13:43, Xaver Stiensmeier wrote:
>>>>> >> Hey,
>>>>> >>
>>>>> >> I am currently trying to understand how I can schedule a job that
>>>>> >> needs a GPU.
>>>>> >>
>>>>> >> I read about GRES <https://slurm.schedmd.com/gres.html> and tried to use:
>>>>> >>
>>>>> >> GresTypes=gpu
>>>>> >> NodeName=test Gres=gpu:1
>>>>> >>
>>>>> >> But calling - after a 'sudo scontrol reconfigure':
>>>>> >>
>>>>> >> srun --gpus 1 hostname
>>>>> >>
>>>>> >> didn't work:
>>>>> >>
>>>>> >> srun: error: Unable to allocate resources: Invalid generic
>>>>> >> resource (gres) specification
>>>>> >>
>>>>> >> so I read more <https://slurm.schedmd.com/gres.conf.html> but that
>>>>> >> didn't really help me.
>>>>> >>
>>>>> >>
>>>>> >> I am rather confused. GRES claims to be generic resources, but
>>>>> >> then it comes with three defined resources (GPU, MPS, MIG) and
>>>>> >> using one of those didn't work in my case.
>>>>> >>
>>>>> >> Obviously, I am misunderstanding something, but I am unsure
>>>>> >> where to look.
>>>>> >>
>>>>> >>
>>>>> >> Best regards,
>>>>> >> Xaver Stiensmeier
>>>>> >>
>>>>> >
>>>>>
>>