[slurm-users] Query about Compute + GPUs
Merlin Hartley
merlin-slurm at mrc-mbu.cam.ac.uk
Tue Nov 21 03:26:53 MST 2017
Could you give us your submission command?
It may be that the job is going to the wrong partition, i.e. you are relying on the default partition selection.
Try with “--partition cpu”.
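
For example, a minimal sketch of naming the partition explicitly. The script name job.sh is a placeholder, not something from your setup, and the command is only printed here so you can check it before running it:

```shell
# Hypothetical values: adjust to your site.
partition="cpu"
job_script="job.sh"

# Build the submission command with the partition named explicitly,
# rather than relying on the default partition.
submit_cmd="sbatch --partition=${partition} ${job_script}"
echo "${submit_cmd}"

# If the Slurm client tools are installed, list the partitions;
# sinfo marks the default partition with a trailing "*".
if command -v sinfo >/dev/null 2>&1; then
    sinfo --summarize || true
fi
```

If the default partition shown by sinfo is not "cpu", a job submitted without --partition would not go to gpu1-cpu unless the node is requested some other way.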
M
--
Merlin Hartley
Computer Officer
MRC Mitochondrial Biology Unit
Cambridge, CB2 0XY
United Kingdom
> On 21 Nov 2017, at 09:52, Markus Köberl <markus.koeberl at tugraz.at> wrote:
>
> On Friday, 3 November 2017 10:12:32 CET Merlin Hartley wrote:
>> They would need to have different NodeNames - but the same NodeAddr for
>> example:
>>
>> NodeName=fisesta-21-3 NodeAddr=10.1.21.3 CPUs=6 Weight=20485797 Feature=rack-21,6CPUs
>> NodeName=fisesta-21-3-gpu NodeAddr=10.1.21.3 CPUs=2 Weight=20485797 Feature=rack-21,2CPUs Gres=gpu:1
>>
>> Hope this is useful!
>
> For me this is not working.
>
> I have the following lines in slurm.conf:
>
> NodeName=gpu1 NodeAddr=10.1.2.3 RealMemory=229376 Weight=998002 Sockets=2 CoresPerSocket=3 ThreadsPerCore=2 Gres=gpu:TeslaK40c:6
>
> NodeName=gpu1-cpu NodeAddr=10.1.2.3 RealMemory=229376 Weight=998002 Sockets=2 CoresPerSocket=11 ThreadsPerCore=2
>
> PartitionName=gpu Nodes=gpu1
> PartitionName=cpu Nodes=gpu1-cpu
>
> But if I submit to node gpu1-cpu I get the following error:
>
> [2017-11-21T09:06:55.840] launch task 999708.0 request from 1044.1000@10.1.2.3 (port 45252)
> [2017-11-21T09:06:55.840] error: Invalid job 999708.0 credential for user 1044: host gpu1 not in hostset gpu1-cpu
> [2017-11-21T09:06:55.840] error: Invalid job credential from 1044@10.1.2.3: Invalid job credential
>
> It seems I am missing something. Any ideas what that could be?
> I am using Slurm 16.05.9 on Debian stretch.
>
>
> regards
> Markus Köberl
> --
> Markus Koeberl
> Graz University of Technology
> Signal Processing and Speech Communication Laboratory
> E-mail: markus.koeberl at tugraz.at
>