[slurm-users] Custom Gres for SSD

Shunran Zhang szhang at ngs.gen-info.osaka-u.ac.jp
Mon Jul 24 03:48:06 UTC 2023


Hi all,

I am trying to set up a gres to manage jobs that need scratch
space, since only a few of our compute nodes are equipped with
SSDs for that purpose. Originally I set up a new partition for
those IO-bound jobs, but several of them could end up allocated to
the same node and fight each other for IO.

Looking over the other settings, gres looks promising. However, I
am having difficulty figuring out how to limit access to that
space to jobs that requested --gres=ssd:1.
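
For illustration, a job that needs the scratch space would be
submitted with a batch script roughly like the sketch below. The job
name and the SCRATCH_DIR variable are hypothetical placeholders (the
real mount point of the SSD is not shown here), and the script falls
back to /tmp so it can be tried outside of Slurm.

```shell
#!/bin/bash
#SBATCH --job-name=io-bound-example   # hypothetical job name
#SBATCH --gres=ssd:1                  # request the "ssd" gres on the node
#SBATCH --ntasks=1

# SCRATCH_DIR is a hypothetical variable pointing at the SSD scratch
# mount; it defaults to /tmp when unset.
SCRATCH_DIR="${SCRATCH_DIR:-/tmp}"

# Create a per-job working directory; SLURM_JOB_ID is set by Slurm
# inside a job and replaced by "local" when run by hand.
WORK_DIR="$(mktemp -d "${SCRATCH_DIR}/job-${SLURM_JOB_ID:-local}-XXXXXX")"

# Remove the working directory when the script exits.
trap 'rm -rf "$WORK_DIR"' EXIT

echo "working in $WORK_DIR"
```

Nothing in this script stops a job that omits the --gres line from
writing to the same path, which is exactly the gap I am asking about.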

For now I am using Flags=CountOnly and trusting users who use the
SSD to request it, but any job submitted to a node with an SSD can
still use the space. Our scratch space is implemented as two disks
(sda and sdb) formatted with btrfs in RAID 0. What should I do to
enforce which jobs can use the space?

Related configurations for ref:

gres.conf:
NodeName=scratch-1 Name=ssd Flags=CountOnly

cgroup.conf:
ConstrainDevices=yes

slurm.conf:
GresTypes=gpu,ssd
NodeName=scratch-1 CPUs=88 Sockets=2 CoresPerSocket=22 ThreadsPerCore=2 RealMemory=180000 Gres=ssd:1 State=UNKNOWN

Sincerely,
S. Zhang