[slurm-users] Ideal NFS exported StateSaveLocation size.
Ole Holm Nielsen
Ole.H.Nielsen at fysik.dtu.dk
Mon Oct 24 08:14:10 UTC 2022
On 10/24/22 09:57, Diego Zuccato wrote:
> On 24/10/2022 09:32, Ole Holm Nielsen wrote:
>
> > It is definitely a BAD idea to store Slurm StateSaveLocation on a slow
> > NFS directory! SchedMD recommends using local NVMe or SSD disks
> > because there will be many IOPS to this file system!
>
> IIUC it does have to be shared between controllers, right?
>
> Possibly use an NVMe-backed (or even better, NVDIMM-backed) NFS share. Or a
> replica-3 Gluster volume with NVDIMMs for the bricks, for the paranoid :)

IOPS is the key parameter! Local NVMe or SSD storage should beat any
networked storage. AFAICT, the original question refers to having
StateSaveLocation on a standard (slow) NFS share.
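
To compare candidate locations, a rough probe like the one below may help.
It is only a minimal sketch, not a model of what slurmctld actually does:
it times small fsync'd writes in a directory as a crude proxy for
synchronous write IOPS. The function name, file prefix, write count and
block size are all hypothetical choices made for illustration.

```python
import os
import sys
import tempfile
import time


def measure_sync_write_rate(directory, n_writes=200, block_size=4096):
    """Return the number of fsync'd writes per second achieved in `directory`.

    Assumption: small synchronous writes are a reasonable stand-in for
    state-save traffic. This is a crude proxy, not a Slurm benchmark.
    """
    payload = os.urandom(block_size)
    # Create the scratch file inside the directory under test so the writes
    # hit the same file system that would hold StateSaveLocation.
    fd, path = tempfile.mkstemp(prefix="iops_probe_", dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(n_writes):
            os.write(fd, payload)
            os.fsync(fd)  # force each write out to stable storage
        elapsed = time.perf_counter() - start
    finally:
        # Always clean up the scratch file, even if a write fails.
        os.close(fd)
        os.unlink(path)
    # Guard against a zero interval on extremely fast back ends.
    return n_writes / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Directory to test: first command-line argument, else the system temp dir.
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.gettempdir()
    rate = measure_sync_write_rate(target)
    print(f"{target}: {rate:.0f} fsync'd {4096 // 1024} KiB writes/s")
```

Running it once against a local SSD directory and once against the NFS
mount should give a first impression of the gap. The numbers are only
indicative, since caching and NFS mount options can skew them.
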

I don't know how many people prefer running two slurmctld hosts (primary
and backup). I certainly don't. Slurm does have a configurable
SlurmctldTimeout parameter, so you can reboot the server quickly when
needed.

It would be nice if people with experience in HA storage for slurmctld
could comment.

/Ole