[slurm-users] Kinda Off-Topic: data management for Slurm clusters
wdennis at nec-labs.com
Fri Feb 22 16:50:25 UTC 2019
Not directly Slurm-related, but... We have a couple of research groups that process large data sets via Slurm jobs (deep-learning applications) and presently consume the data via NFS mounts (both groups have 10G Ethernet interconnects between the Slurm nodes and the NFS servers). Both are now complaining of "too-long loading times" for the data and are casting about for a way to bring the needed data onto the processing node, onto fast single SSDs (or even SSD arrays). These local drives would be considered "scratch space": not for long-term data storage, but for use over the lifetime of a job, or perhaps a few sequential jobs (given the nature of the work). "Permanent" storage would remain on the existing NFS servers. We don't really have the funding for 25-100G networks and/or all-flash commercial data storage appliances (NetApp, Pure, etc.).
Any good patterns that I might be able to learn about implementing here? We have a few ideas floating about, but I figured this may already be a solved problem in this community...
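
To make the question concrete, here is a rough sketch of the kind of stage-in step described above: copy a data set from the NFS mount to node-local scratch once, then reuse the local copy for later jobs on the same node. This is only an illustration under assumptions, not something we run today. The function name stage_to_scratch, the ".staged" marker file, and any paths are hypothetical, and it does not handle two jobs staging the same data set on one node at the same time, nor cleanup or staleness of the local copy.

```python
import shutil
import tempfile
from pathlib import Path


def stage_to_scratch(src: Path, scratch_root: Path) -> Path:
    """Copy the directory `src` (e.g. on NFS) under `scratch_root` (local SSD).

    Hypothetical helper: a marker file records a completed copy so that a
    later job on the same node can reuse the data instead of re-reading NFS.
    Returns the path of the local copy.
    """
    dest = scratch_root / src.name
    marker = scratch_root / (src.name + ".staged")

    # A finished copy from an earlier job is reused as-is (no freshness check).
    if marker.exists() and dest.is_dir():
        return dest

    # No marker means any existing directory is a partial copy; start over.
    if dest.exists():
        shutil.rmtree(dest)
    scratch_root.mkdir(parents=True, exist_ok=True)

    shutil.copytree(src, dest)
    marker.touch()  # written only after the copy has completed
    return dest


if __name__ == "__main__":
    # Self-contained demo using temporary directories in place of the real
    # NFS mount and local SSD.
    with tempfile.TemporaryDirectory() as tmp:
        nfs_dataset = Path(tmp) / "nfs" / "dataset"
        nfs_dataset.mkdir(parents=True)
        (nfs_dataset / "sample.bin").write_bytes(b"example")

        local = stage_to_scratch(nfs_dataset, Path(tmp) / "scratch")
        print("job would read data from:", local)
```

In a real job, a call like this would run at the start of the job script, before the training process opens the data, with the scratch root pointing at the local SSD.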