Dear all
We have just installed a small SLURM cluster composed of 12 nodes:
- 6 CPU-only nodes: Sockets=2, CoresPerSocket=96, ThreadsPerCore=2, 1.5 TB of RAM
- 6 nodes that also have GPUs: same configuration as the CPU-only nodes, plus 4 H100 GPUs per node
We started with a setup with 2 partitions:
- a 'onlycpus' partition which sees all the cpu-only nodes
- a 'gpus' partition which sees the nodes with gpus
and asked users to use the 'gpus' partition only for jobs that need GPUs (for the time being we are not technically enforcing that).
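For reference, the current setup in slurm.conf looks roughly like this (hostnames and the RealMemory value are placeholders, not the real ones):

  # CPU-only nodes
  NodeName=cpu[01-06] Sockets=2 CoresPerSocket=96 ThreadsPerCore=2 RealMemory=1500000
  # GPU nodes: same hardware plus 4 H100 each
  NodeName=gpu[01-06] Sockets=2 CoresPerSocket=96 ThreadsPerCore=2 RealMemory=1500000 Gres=gpu:h100:4

  PartitionName=onlycpus Nodes=cpu[01-06] Default=YES State=UP
  PartitionName=gpus     Nodes=gpu[01-06] Default=NO  State=UP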
The problem is that a job requiring a GPU usually needs only a few cores and a few GB of RAM, which means most of the CPU cores on the GPU nodes sit idle.
On the other hand, putting all nodes in the same partition would mean risking that a job requiring a GPU can't start because all the CPU cores and/or all the memory are already taken by CPU-only jobs.
I went through the mailing list archive and I think that "splitting" a GPU node into two logical nodes (one used in the 'gpus' partition and one used in the 'onlycpus' partition), as discussed in [*], would help.
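If I have understood [*] correctly, for each GPU node that would look something like the following (node names, ports and the core/memory split are only examples on my part, and I believe it requires running two slurmd daemons per physical host, i.e. Slurm built with --enable-multiple-slurmd):

  # physical host gpu01 presented to Slurm as two logical nodes
  NodeName=gpu01-gpu NodeHostname=gpu01 Port=6819 CPUs=32  RealMemory=250000  Gres=gpu:h100:4
  NodeName=gpu01-cpu NodeHostname=gpu01 Port=6820 CPUs=160 RealMemory=1250000
  # the *-gpu logical nodes would go into the 'gpus' partition,
  # the *-cpu ones into 'onlycpus' together with the real CPU-only nodes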
Since that proposed solution is considered a "bit of a kludge" by its author, and since I read that splitting a node into multiple logical nodes is in general a bad idea, I'd like to know whether you can suggest other/better options.
I also found this [**] thread, but I don't like that approach (i.e. relying on MaxCPUsPerNode) too much, because it would mean having three partitions (if I have understood it correctly): two partitions for CPU-only jobs and one partition for GPU jobs.
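Just to check that I have understood [**] correctly, the layout would be something like this (the extra partition name and the per-node core limits are only placeholders):

  PartitionName=onlycpus Nodes=cpu[01-06] Default=YES
  # CPU-only jobs may also run on the GPU nodes, but only up to 32 cores per node
  PartitionName=cpuspill Nodes=gpu[01-06] MaxCPUsPerNode=32
  # GPU jobs keep the remaining cores (and the GPUs) for themselves
  PartitionName=gpus     Nodes=gpu[01-06] MaxCPUsPerNode=160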
Many thanks, Massimo
[*] https://groups.google.com/g/slurm-users/c/IUd7jLKME3M
[**] https://groups.google.com/g/slurm-users/c/o7AiYAQ1YJ0