[slurm-users] Query about Compute + GPUs

Merlin Hartley merlin-slurm at mrc-mbu.cam.ac.uk
Tue Nov 21 03:26:53 MST 2017


Could you give us your submission command?
It may be that you are requesting the wrong partition, i.e. relying on the default partition selection.
Try with “--partition cpu”.
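
For example, something along these lines. This is only a sketch: the partition and node names are taken from the slurm.conf you quoted, `hostname` is just a placeholder job, and the `run_cpu_test` wrapper is a made-up name for illustration:

```shell
# Sketch: run a trivial job step on the "cpu" partition explicitly,
# instead of relying on the default partition selection.
# "cpu" and "gpu1-cpu" are taken from the quoted slurm.conf;
# "hostname" is a placeholder command, run_cpu_test is a hypothetical helper.
run_cpu_test() {
    srun --partition=cpu --nodelist=gpu1-cpu hostname
}
```

Calling `run_cpu_test` from a submit host should show whether the job is accepted once the partition is named explicitly; the same `--partition` option can be passed to `sbatch`.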


M



--
Merlin Hartley
Computer Officer
MRC Mitochondrial Biology Unit
Cambridge, CB2 0XY
United Kingdom

> On 21 Nov 2017, at 09:52, Markus Köberl <markus.koeberl at tugraz.at> wrote:
> 
> On Friday, 3 November 2017 10:12:32 CET Merlin Hartley wrote:
>> They would need to have different NodeNames - but the same NodeAddr for
>> example:
>> 
>> NodeName=fisesta-21-3 NodeAddr=10.1.21.3 CPUs=6 Weight=20485797 Feature=rack-21,6CPUs
>> NodeName=fisesta-21-3-gpu NodeAddr=10.1.21.3 CPUs=2 Weight=20485797 Feature=rack-21,2CPUs Gres=gpu:1
>> 
>> Hope this is useful!
> 
> For me this is not working.
> 
> I have the following lines in slurm.conf:
> 
> NodeName=gpu1 NodeAddr=10.1.2.3 RealMemory=229376 Weight=998002 Sockets=2 CoresPerSocket=3 ThreadsPerCore=2 Gres=gpu:TeslaK40c:6
> 
> NodeName=gpu1-cpu NodeAddr=10.1.2.3 RealMemory=229376 Weight=998002 Sockets=2 CoresPerSocket=11 ThreadsPerCore=2
> 
> PartitionName=gpu Nodes=gpu1
> PartitionName=cpu Nodes=gpu1-cpu
> 
> But if I submit to node gpu1-cpu I get the following error:
> 
> [2017-11-21T09:06:55.840] launch task 999708.0 request from 1044.1000@10.1.2.3 (port 45252)
> [2017-11-21T09:06:55.840] error: Invalid job 999708.0 credential for user 1044: host gpu1 not in hostset gpu1-cpu
> [2017-11-21T09:06:55.840] error: Invalid job credential from 1044@10.1.2.3: Invalid job credential
> 
> It seems I am missing something. Any ideas what that could be?
> I am using Slurm 16.05.9 on Debian stretch.
> 
> 
> regards
> Markus Köberl
> -- 
> Markus Koeberl
> Graz University of Technology
> Signal Processing and Speech Communication Laboratory
> E-mail: markus.koeberl at tugraz.at
> 



