I think the issue is more severe than you describe.


Slurm juggles the needs of many jobs. Just because some resources are free at the exact second a job starts doesn't mean they aren't already earmarked for a future job that is waiting to accumulate even more resources. And what if the opportunistic job is a backfill job? Grabbing extra cores at the last minute could prevent a higher-priority job from starting, or push its start time back.


The request, while understandable from a user's point of view, is a non-starter for a shared cluster.
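

That said, the closest thing I'm aware of in stock Slurm is to stop asking for "whatever happens to be free" and instead request a whole node exclusively, letting the job discover its CPU count at run time. A rough sketch only, assuming your site permits --exclusive and that the application (here a made-up ./my_solver with a --threads option) can scale to whatever it is given:

    #!/bin/bash
    #SBATCH --nodes=1
    #SBATCH --exclusive        # take all CPUs on the node, site policy permitting
    #SBATCH --mincpus=5        # only consider nodes with at least 5 CPUs
    #SBATCH --time=04:00:00    # must still cover the slowest (smallest-node) case

    # Slurm exports SLURM_CPUS_ON_NODE with the CPU count actually allocated,
    # so the application can size its thread pool to whatever it received.
    ./my_solver --threads "${SLURM_CPUS_ON_NODE}"

That sidesteps the last-minute resizing problem, but it ties up the whole node, which administrators may like even less.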


Just my 2 cents.


On 02/08/2024 17:34, Laura Hild via slurm-users wrote:
My read is that Henrique wants to specify a job to require a variable number of CPUs on one node, so that when the job is at the front of the queue, it will run opportunistically on however many happen to be available on a single node as long as there are at least five.

I don't personally know of a way to specify such a job, and wouldn't be surprised if there isn't one, since, as other posters have suggested, there is usually a core-count sweet spot that achieves the performance goal while making efficient use of resources. A cluster administrator may in fact not want you using extra cores, even if there's a bit more speed-up to be had, when those cores could be used more efficiently by another job. I'm also not sure how one would set a judicious TimeLimit on a job with such a variable wall-time.

So there are really two questions: whether it is possible, and whether it is advisable.