[slurm-users] Salloc expand feature

abel pinto abelpc_uff at yahoo.com.br
Tue Apr 11 18:01:00 UTC 2023

Hi all,

I have a question about a silent feature removal. It concerns the --dependency=expand option, which was present in Slurm for about 10 years until its removal in version 21.08.3.

Up to Slurm 21.08.2, the expand option was documented extensively as part of the dynamic job elasticity features, which covered both shrinking and expanding a job's resources. You can see this in the archived FAQ: https://slurm.schedmd.com/archive/slurm-21.08.2/faq.html#job_size

Note that these features were added in Slurm 2.3 back in 2011, so they were supported for nearly 10 years. For instance, see slide 7 in https://slurm.schedmd.com/slurm_ug_2011/SLURM.v23.status.pdf

However, the expand option was silently removed from the Slurm documentation on October 22, 2021, a few weeks before the release of Slurm 21.08.3: https://github.com/SchedMD/slurm/commit/11ce912f31519799494fde3140f530cfc8cfff6a

There was no announcement explaining why the feature was removed. The release announcement for Slurm 21.08.3, published in November 2021, does not mention it: https://lists.schedmd.com/pipermail/slurm-announce/2021/000066.html

Today, one can still dynamically “shrink” a job though: https://slurm.schedmd.com/faq.html#job_size

My question is: why was the feature removed? What were the conceptual and technical issues that led to dropping support for it?

I can understand why properly expanding a job may be tricky while shrinking it is not, especially when other queued jobs are waiting. However, jobs having to wait longer, or less, is a well-known expectation on HPC clusters. I think a clearer explanation of why the feature was removed would be worth learning about.

Thank you,
