Hi Daniel,

Appreciate your response.

I think you may be assuming that, since we take the placement part of scheduling upon ourselves, Slurm has no other role to play.

That's not quite true. Below, in brief, are other important roles Slurm must still perform that presently come to mind
(this may not be exhaustive):

1> Slurm should inform us of the job scheduling priority order.
     The admin can impose policies (fairshare, user-group preference, etc.) that prioritize more recent jobs over older ones, &
     we would like to see these priorities as computed by Slurm in real time.

2> Slurm should keep us updated on any configured resource limits.
      Limits may exist on resources such as CPU cores, number of running jobs, host-group CPU limits, etc.
      Our backend app needs to be updated about these from time to time so that unnecessary allocations are avoided right away.

3> Preemptable job candidates.
     The admin can mark certain jobs from certain users as preemptable.
     Our app needs to be informed of these should the need arise to preempt running jobs.

4> Specified host resources for job start.
     A user may want their job to start on specific hosts, & Slurm should communicate this back to us.
     The same applies if a user wants their job to run within a certain set of hosts.

5> Preferential hosts for scheduling. If there is a preferential ordering of hosts, or if backfill scheduling is enabled,
     this needs to be communicated to us.

6> Regular notification of job events (dispatch, suspend, finish, re-submission, etc.) so that we can take appropriate action.
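Incidentally, some of these items can already be polled from Slurm's CLI without a plugin; for example, item 1's real-time priority order is exposed by `sprio`. Below is a minimal sketch of how a backend could consume it. The sample output and job IDs are illustrative, and the `-o "%i %Y"` format letters should be verified against the sprio(1) man page for your Slurm version:

```python
# Hypothetical poller: turn the output of
#   sprio --noheader -o "%i %Y"   (job id, job priority)
# into a list of job IDs ordered highest-priority first.
def priority_order(sprio_output: str) -> list[int]:
    jobs = []
    for line in sprio_output.strip().splitlines():
        job_id, priority = line.split()
        jobs.append((int(priority), int(job_id)))
    # Sort descending by priority; ties broken by job id (arbitrary)
    return [jid for _, jid in sorted(jobs, reverse=True)]

# Illustrative sample output (not captured from a real cluster)
sample = "101 5000\n102 7500\n103 6200\n"
print(priority_order(sample))  # [102, 103, 101]
```

A real backend would run `sprio` periodically (or on job events) and feed the ordering into its own placement logic.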

Hope this clarifies our requirements & expectations.

Regards,
Bhaskar.
On Thursday, 18 July, 2024 at 04:47:51 am IST, Daniel Letai via slurm-users <slurm-users@lists.schedmd.com> wrote:


In the scenario you provide, you don't need anything special.


You just have to configure a partition that is available only to you, and to no other account on the cluster. This partition will include only your hosts; all other partitions will exclude them.
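Such a setup could be expressed in slurm.conf roughly as follows. This is only a sketch; the node, partition, and account names are placeholders, not taken from the thread:

```
# slurm.conf fragment -- all names are illustrative
NodeName=h[1-3] CPUs=2 State=UNKNOWN
# Partition visible only to one account, containing only its hosts
PartitionName=ourpart Nodes=h[1-3] AllowAccounts=ourteam Default=NO
# Everyone else's partition excludes those hosts entirely
PartitionName=general Nodes=g[1-10] Default=YES
```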

Then use your own implementation to do whatever you want with the hosts. As long as you are the exclusive owner of the hosts, Slurm is not really part of the equation.


You don't even have to allocate the hosts using Slurm, as there is no contention.


If you want to use Slurm to apply your placement, instead of starting the app on the nodes directly, just use the -w (--nodelist) option with the requested hosts. Make sure to request only your partition.
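Concretely, a submission honoring a placement chosen by an external app might look like the following batch script. The partition name, host names, and application path are assumptions for illustration:

```shell
#!/bin/bash
#SBATCH --partition=ourpart    # the dedicated partition (name assumed)
#SBATCH --nodelist=h1,h2       # exact hosts picked by the external placement logic
#SBATCH --ntasks=4
srun ./our_app
```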


You really don't need anything special from Slurm; in fact, you don't really need Slurm for this at all.



On 15/07/2024 19:26:10, jubhaskar--- via slurm-users wrote:
Hi Daniel,
Thanks for picking up this query. Let me try to briefly describe my problem.

As you rightly guessed, we have some hardware on the backend on which our jobs would run.
The app that manages the h/w has its own set of resource placement/remapping rules for
placing a job.
So, for example, if only 3 hosts h1, h2, h3 (2 cores available each) are available at some point for a
4-core job, then only a few combinations of cores from these hosts can be allowed for
the job. There is also a preference order among the placements, decided by our app.

It's in this respect that we want our backend app to produce the placement for the job.
Slurm would then dispatch the job accordingly, honoring the exact resource distribution
asked for. Should preemption be needed, our backend would likewise decide the placement,
which in turn determines which preemptable job candidates to preempt.

So, how should we proceed?
We may not have the whole site/cluster to ourselves. There may be other jobs which we don't
care about, & those should go through the usual route via whichever select plugin is configured (linear, cons_tres, etc.).

Is there scope for a separate partition that encompasses only our resources & triggers our
plugin only for our jobs?
How do options a>, b>, c> (as described in my 1st message) stand now that I've stated our requirement?

A 4th option which comes to mind is an API interface from Slurm that could inform a separate
process P (say) about resource availability in real time.
P would talk to our backend app, obtain a placement, & then ask Slurm to place our job.
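As a sketch of what such a process P could poll even without a dedicated API, `sinfo` already reports per-node CPU availability; recent Slurm releases also ship `slurmrestd`, a REST API daemon, which may fit this role better. Below, the sample output and host names are illustrative; the `%n`/`%C` format letters are per sinfo(1) and worth double-checking on your version:

```python
# Parse `sinfo --Node --noheader -o "%n %C"` output; %C prints CPU
# counts per node as Allocated/Idle/Other/Total (per sinfo(1)).
def idle_cpus(sinfo_output: str) -> dict[str, int]:
    avail = {}
    for line in sinfo_output.strip().splitlines():
        node, counts = line.split()
        alloc, idle, other, total = (int(x) for x in counts.split("/"))
        avail[node] = idle
    return avail

# Illustrative sample: h1 fully free, h2 half used, h3 fully busy
sample = "h1 0/2/0/2\nh2 1/1/0/2\nh3 2/0/0/2\n"
print(idle_cpus(sample))  # {'h1': 2, 'h2': 1, 'h3': 0}
```

Process P could run such a poll on a timer, hand the availability map to the backend app, and then submit with --nodelist as suggested above.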

Your concern about ever-changing resources (being allocated before our backend comes up) does not apply,
as the hosts are segregated as far as our system is concerned. Our hosts will run only our jobs & other Slurm
jobs will run on different hosts.

Hope that makes things a little clearer! Any help would be appreciated.

(Note: We already have a working solution with LSF! LSF provides an option for custom scheduler plugins
that lets one hook into the decision-making loop during scheduling. This led us to believe Slurm might
offer similar possibilities.)

Regards,
Bhaskar.

-- 
Regards,

Daniel Letai
+972 (0)505 870 456

--
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-leave@lists.schedmd.com