[slurm-users] Job Step Resource Requests are Ignored

David Braun dlbraun at umich.edu
Wed May 6 18:35:03 UTC 2020


I'm not sure I understand the problem. If you want to make sure the
preamble and postamble run even if the main job doesn't run, you can submit
them as separate jobs and chain them with sbatch's '-d' option.

From the man page:

-d, --dependency=<dependency_list>
       Defer the start of this job until the specified dependencies have
       been satisfied completed. <dependency_list> is of the form
       <type:job_id[:job_id][,type:job_id[:job_id]]> or
       <type:job_id[:job_id][?type:job_id[:job_id]]>. All dependencies must
       be satisfied if the "," separator is used. Any dependency may be
       satisfied if the "?" separator is used. Many jobs can share the same
       dependency and these jobs may even belong to different users. The
       value may be changed after job submission using the scontrol command.
       Once a job dependency fails due to the termination state of a
       preceding job, the dependent job will never be run, even if the
       preceding job is requeued and has a different termination state in a
       subsequent execution.


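For instance, create a script that contains this:

# --parsable makes sbatch print only the job ID instead of "Submitted batch job <id>"
preamble_id=$(sbatch --parsable preamble.job)
main_id=$(sbatch --parsable -d afterok:$preamble_id main.job)
sbatch -d afterany:$main_id postamble.job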
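
If the main job might be rejected outright at submission (for example because
it asks for more CPUs than any node has), the chain can be wrapped so the
postamble still gets queued. This is only a rough sketch, not something I have
run on your cluster: the function name submit_chain is made up, and whether an
oversized request is rejected or just left pending depends on how the cluster
is configured.

```shell
#!/bin/bash
# Sketch: submit preamble, main and postamble as three separate jobs.
# submit_chain is a hypothetical helper; the *.job file names are the
# ones used in the example above.
submit_chain() {
    local cpus=$1 mem_mb=$2
    local preamble_id main_id dep

    # --parsable prints "jobid" (or "jobid;cluster"), so strip any ";cluster".
    preamble_id=$(sbatch --parsable preamble.job) || return 1
    preamble_id=${preamble_id%%;*}

    # The user-sized request only applies to the main job.
    if main_id=$(sbatch --parsable -c "$cpus" --mem="${mem_mb}M" \
                        -d "afterok:${preamble_id}" main.job); then
        dep="afterany:${main_id%%;*}"
    else
        # Assumption: a failed submission means the request was rejected,
        # so let the postamble follow the preamble instead.
        echo "main job was not accepted; postamble will follow the preamble" >&2
        dep="afterany:${preamble_id}"
    fi

    # Prints the postamble's job ID.
    sbatch --parsable -d "$dep" postamble.job
}
```

Calling it as "submit_chain 10 4096" would ask for 10 CPUs and 4096 MB for the
main job only, while the preamble and postamble keep whatever their own
#SBATCH lines request.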

Best,

D

On Wed, May 6, 2020 at 2:19 PM Maria Semple <maria at rstudio.com> wrote:

> Hi Chris,
>
> I think my question isn't quite clear, but I'm also pretty confident the
> answer is no at this point. The idea is that the script is sort of like a
> template for running a job, and an end user can submit a custom job with
> their own desired resource requests which will end up filling in the
> template. I'm not in control of the Slurm cluster that will ultimately run
> the job, nor the details of the job itself. For example, template-job.sh
> might look like this:
>
> #!/bin/bash
> srun -c 1 --mem=1k echo "Preamble"
> srun -c <CPUs> --mem=<Memory>m /bin/sh -c <user's shell script>
> srun -c 1 --mem=1k echo "Postamble"
>
> My goal is that even if the user requests 10 CPUs when the cluster only
> has 4 available, the Preamble and Postamble steps will always run. But as I
> said, it seems like that's not possible since the maximum number of CPUs
> needs to be set on the sbatch allocation and the whole job would be
> rejected on the basis that too many CPUs were requested. Is that correct?
>
> On Tue, May 5, 2020, 11:13 PM Chris Samuel <chris at csamuel.org> wrote:
>
>> On Tuesday, 5 May 2020 11:00:27 PM PDT Maria Semple wrote:
>>
>> > Is there no way to achieve what I want then? I'd like the first and last
>> > job steps to always be able to run, even if the second step needs too
>> > many resources (based on the cluster).
>>
>> That should just work.
>>
>> #!/bin/bash
>> #SBATCH -c 2
>> #SBATCH -n 1
>>
>> srun -c 1 echo hello
>> srun -c 4 echo big wide
>> srun -c 1 echo world
>>
>> gives:
>>
>> hello
>> srun: Job step's --cpus-per-task value exceeds that of job (4 > 2). Job step may never run.
>> srun: error: Unable to create step for job 604659: More processors requested than permitted
>> world
>>
>> > As a side note, do you know why it's not even possible to restrict the
>> > number of resources a single step uses (i.e. set fewer CPUs than are
>> > available to the full job)?
>>
>> My suspicion is that you've not set up Slurm to use cgroups to restrict the
>> resources a job can use to just those requested.
>>
>> https://slurm.schedmd.com/cgroups.html
>>
>> All the best,
>> Chris
>> --
>>   Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
>>
>>
>>
>>
>>