Hi Brian,
Sorry for breaking the email thread; my subscription settings were not set correctly, so I didn't receive your response by email.
Thanks for the tips! I'll give that a try.
Best,
Nick
Nick,
Presuming you have followed the SchedMD instructions, you should be able to get a session in your login container:
kubectl --namespace=slurm exec -it statefulsets/slurm-controller -- bash --login
From there, you can do any standard testing you like. A simple 'srun hostname' should work and will tell you that Slurm itself is doing its part.
You can also run commands such as 'scontrol show nodes' to see which resources you have configured and how many of each.
Beyond that, you need to ensure your Slurm containers are configured to request the resources you plan on using (e.g., GPUs) and that you have enough of them for the script(s) you wish to run.
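If it helps, here is a rough sketch of how the node check could be condensed into one line per node, so that missing GPUs stand out quickly. The helper name summarize_nodes is made up for illustration. It assumes the one-line output format of 'scontrol --oneliner show nodes' with NodeName=, State= and Gres= fields, and that scontrol is on the PATH of a login shell in the controller container.

```shell
# summarize_nodes (hypothetical helper): reads the output of
# `scontrol --oneliner show nodes` on stdin and prints "name state gres"
# for every node line it finds.
summarize_nodes() {
  awk '
    {
      name = ""; state = ""; gres = ""
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^NodeName=/)   name  = substr($i, 10)  # text after "NodeName="
        else if ($i ~ /^State=/) state = substr($i, 7)   # text after "State="
        else if ($i ~ /^Gres=/)  gres  = substr($i, 6)   # text after "Gres="
      }
      if (name != "") printf "%s %s %s\n", name, state, gres
    }'
}

# Possible usage from outside the cluster (assumes the namespace and
# statefulset names from the command above):
#   kubectl --namespace=slurm exec statefulsets/slurm-controller -- \
#     bash --login -c 'scontrol --oneliner show nodes' | summarize_nodes
```

A node whose Gres column shows (null) has no generic resources configured, which would be the first thing to check if a GPU job stays pending.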
Brian Andrus