[slurm-users] Dynamic MIG Question
Aaron Kollmann
aaron.kollmann at student.hpi.de
Wed Nov 22 18:22:14 UTC 2023
Hello All,
I am currently working on a research project, and we are trying to find
out whether we can use NVIDIA's Multi-Instance GPU (MIG) feature
dynamically in SLURM.
For instance:
- a user submits a job that requests a GPU, but none is available
- SLURM then reconfigures a MIG-capable GPU to create a MIG instance (e.g.
1g.5gb), which becomes available and is allocated immediately
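To make the second step concrete, here is a minimal sketch of how a requested slice name could be translated into an nvidia-smi call. The function names `mig_profile_id` and `mig_create_cmd` are hypothetical, and the profile IDs are assumed to be those of an A100 40GB (`nvidia-smi mig -lgip` lists the IDs a given GPU actually supports). The sketch only prints the command and does not touch the GPU.

```bash
#!/usr/bin/env bash
# Sketch only: map a requested MIG slice name to a GPU instance profile ID.
# ASSUMPTION: IDs as reported for an A100 40GB; verify with `nvidia-smi mig -lgip`.
mig_profile_id() {
  case "$1" in
    1g.5gb)  echo 19 ;;
    2g.10gb) echo 14 ;;
    4g.20gb) echo 5 ;;
    7g.40gb) echo 0 ;;
    *) echo "unknown MIG profile: $1" >&2; return 1 ;;
  esac
}

# Print (do not run) the command that would create the slice on one GPU.
# $1 = GPU index, $2 = slice name such as 1g.5gb
mig_create_cmd() {
  local gpu="$1" id
  id=$(mig_profile_id "$2") || return 1
  echo "nvidia-smi mig -cgi ${id} -i ${gpu} -C"
}

mig_create_cmd 1 1g.5gb   # prints: nvidia-smi mig -cgi 19 -i 1 -C
```

Keeping the lookup separate from the command builder means the table can be swapped for another GPU model without touching the rest.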
I can already reconfigure MIG + SLURM within a few seconds to start jobs
on newly partitioned resources, but running jobs get killed when I restart
slurmd on nodes whose MIG config has changed (see the example script below).
Do you think it is possible to develop a plugin or change SLURM to the
extent that dynamic MIG will be supported one day?
(The website says it is not supported.)

Best
- Aaron

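#!/usr/bin/bash
# Generate Start Config
# Stop the Slurm daemons and destroy all existing MIG compute and GPU instances
killall slurmd
killall slurmctld
nvidia-smi mig -dci
nvidia-smi mig -dgi
# GPU 0 gets profiles 19,14,5; GPU 1 gets profile 0 (-C also creates compute instances)
nvidia-smi mig -cgi 19,14,5 -i 0 -C
nvidia-smi mig -cgi 0 -i 1 -C
cp -f ./slurm-19145-0.conf /etc/slurm/slurm.conf
# -c: slurmd clears system locks, slurmctld discards its previously saved state
slurmd -c
slurmctld -c
sleep 5
# Start a running and a pending job (the first job gets killed by slurm)
srun -w gx06 -c 2 --mem 1G --gres=gpu:a100_1g.5gb:1 sleep 300 &
srun -w gx06 -c 2 --mem 1G --gres=gpu:a100_1g.5gb:1 sleep 300 &
sleep 5
# Simulate MIG Config Change
# Repartition GPU 1 only; GPU 0 is left untouched
nvidia-smi mig -i 1 -dci
nvidia-smi mig -i 1 -dgi
nvidia-smi mig -cgi 19,14,5 -i 1 -C
cp -f ./slurm-2x19145.conf /etc/slurm/slurm.conf
# Restart the daemons so they pick up the new config (the running job is killed here)
killall slurmd
killall slurmctld
slurmd
slurmctld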
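
Before running steps like the ones above as root, it can help to print them first. The following dry-run sketch assumes the same per-GPU sequence as the script (destroy compute instances, destroy GPU instances, create new instances, swap slurm.conf). `plan_mig_reconfig` is a hypothetical helper name, and nothing is executed.

```bash
#!/usr/bin/env bash
# Dry run: print the steps that would repartition a single GPU.
# $1 = GPU index, $2 = comma-separated profile IDs, $3 = slurm.conf to install
plan_mig_reconfig() {
  local gpu="$1" profiles="$2" conf="$3"
  local re_int='^[0-9]+$' re_list='^[0-9]+(,[0-9]+)*$'

  # Reject malformed input before anything is printed.
  [[ "$gpu" =~ $re_int ]] || { echo "bad GPU index: $gpu" >&2; return 1; }
  [[ "$profiles" =~ $re_list ]] || { echo "bad profile list: $profiles" >&2; return 1; }

  echo "nvidia-smi mig -i ${gpu} -dci"
  echo "nvidia-smi mig -i ${gpu} -dgi"
  echo "nvidia-smi mig -cgi ${profiles} -i ${gpu} -C"
  echo "cp -f ${conf} /etc/slurm/slurm.conf"
}

plan_mig_reconfig 1 19,14,5 ./slurm-2x19145.conf
```

The printed lines match the "Simulate MIG Config Change" part of the script; restarting slurmd and slurmctld is deliberately left out, since that is the step that kills the running job.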