[slurm-users] Two jobs ends up on one GPU?
Chris Samuel
chris at csamuel.org
Wed Jan 16 16:21:04 UTC 2019
Hi Magnus,
On 15/1/19 4:15 am, Magnus Jonsson wrote:
> We have a user who has somehow managed to get both jobs to end up
> on the same GPU (verified via nvidia-smi).
So this was with nvidia-smi run by root from outside the jobs, showing
both jobs' processes on the same GPU and the other GPU with none? That
would be really strange, and I've not seen that before.
All I can think of would be to check the /proc/$pid/cgroup file for
each process to see which cgroups they're in, and then go poking around
in the cgroup filesystem to see what device restrictions are actually
set for them.
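As a rough sketch of that, assuming a cgroup v1 hierarchy with Slurm's
devices controller (the exact paths under /sys/fs/cgroup depend on your
setup, so treat the job paths below as illustrative):

```shell
# Inspect the cgroup membership of a process. Here we use the current
# shell's PID ($$) as a stand-in; substitute the suspect job's PID.
PID=$$
cat /proc/$PID/cgroup

# Under cgroup v1, Slurm's task/cgroup + devices setup writes an
# allow-list per job; a typical layout (assumed, check your site) is:
#   /sys/fs/cgroup/devices/slurm/uid_<uid>/job_<jobid>/devices.list
# Entries like "c 195:0 rw" grant access to /dev/nvidia0 (NVIDIA char
# devices use major 195), so a job constrained to one GPU should only
# list its assigned minor number.
for f in /sys/fs/cgroup/devices/slurm/uid_*/job_*/devices.list; do
  [ -e "$f" ] && echo "== $f" && cat "$f"
done
```

If both jobs' devices.list files permit the same GPU minor, the
constraint itself is wrong rather than the application misbehaving.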
You don't have Docker installed by any chance? That could allow users
to escape their cgroup restrictions, as it sets up cgroups of its own.
All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA