[slurm-users] RES: RES: multiple srun commands in the same SLURM script

Kevin Broch kbroch at rivosinc.com
Wed Nov 1 15:23:05 UTC 2023


Could this apply in your case:
https://slurm.schedmd.com/faq.html#opencl_pmix ?

On Wed, Nov 1, 2023 at 5:24 AM Paulo Jose Braga Estrela <
paulo.estrela at petrobras.com.br> wrote:

> Yeah, you are right. I don’t know why, but it seems that my email client
> mangled the message formatting and put all the srun commands on one line.
>
>
>
> -----Original message-----
> From: slurm-users <slurm-users-bounces at lists.schedmd.com> On behalf of
> Bjørn-Helge Mevik
> Sent: Wednesday, November 1, 2023 04:55
> To: slurm-users at schedmd.com
> Subject: Re: [slurm-users] RES: multiple srun commands in the same SLURM
> script
>
> Paulo Jose Braga Estrela <paulo.estrela at petrobras.com.br> writes:
>
> > Hi,
> >
> > I think that you have a syntax error in your bash script. The "&"
> > means that you want to send a process to the background, not that you
> > want to run many commands in parallel. To run commands serially, use
> > cmd && cmd2; cmd2 will then be executed only if cmd returns 0 as its
> > exit code.
> >
> > To run commands in parallel with srun you should set the number of
> > tasks to 4, so srun will spawn 4 tasks of the same command. Take a
> > look at the examples section in srun docs.
> > (https://slurm.schedmd.com/srun.html)
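
A note on the two shell operators discussed in the quoted message: "&" puts a command in the background, so several backgrounded commands followed by "wait" do run concurrently, while "&&" runs the second command only if the first one exits with status 0. A minimal plain-bash sketch of the difference (no Slurm needed; the task function and the two log file names are made up for illustration):

```shell
#!/bin/bash
# task NAME SECONDS LOGFILE: hypothetical stand-in for a real program.
# It sleeps for SECONDS, then appends NAME to LOGFILE.
task() { sleep "$2"; echo "$1" >> "$3"; }

# Backgrounded with "&": both commands start right away,
# and "wait" blocks until every background job has exited.
: > background.log
task slow 1 background.log &
task fast 0 background.log &
wait

# Chained with "&&": the second command starts only after the first
# one has exited with status 0.
: > chained.log
task first 0 chained.log && task second 0 chained.log
false && task skipped 0 chained.log   # never runs: "false" exits non-zero

echo "background: $(paste -sd' ' background.log)"
echo "chained:    $(paste -sd' ' chained.log)"
```

In the backgrounded case "fast" is logged before "slow" even though it was started second, which shows the two were running at the same time. In the chained case the entries appear strictly in the order written, and "skipped" never appears.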
>
> Well, if you look at Example 7 in that section:
>
> Example 7:
>     This example shows a script in which Slurm is used to provide resource
> management for a job by executing the various job steps as processors
> become available for their dedicated use.
>
>     $ cat my.script
>     #!/bin/bash
>     srun -n4 prog1 &
>     srun -n3 prog2 &
>     srun -n1 prog3 &
>     srun -n1 prog4 &
>     wait
>
> which is what OP tries to do.  It is mainly for running *different*
> programs in parallel inside a job.  If one wants to run *the same* program
> in parallel, then a single srun is indeed the recommended way.
>
> I think the main problem is that the original job script only asks for a
> single CPU, so the sruns will only run one at a time.  Try adding
> --ntasks-per-node=4 or similar.
>
> Note that exactly how to run different programs in parallel with srun has
> changed quite a bit in the recent versions, and the example above is for
> the latest version, so check the srun man page for your version.
> (And unfortunately, the documentation in the srun man page has not always
> been correct, so you might need to experiment.  For instance, I believe
> Example 7 above is missing `--exact` or `SLURM_EXACT`. :) )
>
> --
> Regards,
> Bjørn-Helge Mevik, dr. scient,
> Department for Research Computing, University of Oslo
>
> The sender of this message is responsible for its content and address and
> must comply with Petrobras' internal rules. It is up to the recipient to
> ensure that the information and personal data contained in this email are
> only used with the appropriate degree of confidentiality and in compliance
> with applicable data protection and privacy legislation. The use of the
> information and personal data contained in this e-mail in violation of the
> applicable rules will result in the application of the applicable sanctions.
>
>
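
For reference, the suggestions quoted above (ask for enough tasks, background each srun, add --exact, end with wait) might be combined roughly as follows. This is only a sketch under assumptions: the original job script is not shown in this message, prog1 to prog4 are placeholder names, one task per step is assumed, and whether --exact is needed depends on the Slurm version, so check the srun man page for yours. The snippet writes the job script to job.sh and checks its syntax; submission is left as a manual step.

```shell
#!/bin/bash
# Write the sketched job script to job.sh (a file name chosen for illustration).
cat > job.sh <<'EOF'
#!/bin/bash
# Ask for four tasks so that four one-task steps can run at the same time.
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=1

# prog1..prog4 are placeholders for the real programs.
# --exact (or SLURM_EXACT in the environment) is the option the quoted
# message believes is missing from Example 7; it may not be needed or
# available in every Slurm version.
srun --ntasks=1 --exact ./prog1 &
srun --ntasks=1 --exact ./prog2 &
srun --ntasks=1 --exact ./prog3 &
srun --ntasks=1 --exact ./prog4 &

# Keep the batch script alive until every job step has finished.
wait
EOF

# Parse the script without executing it, to catch shell syntax errors.
bash -n job.sh && echo "job.sh: syntax OK"
```

The script would then be submitted with "sbatch job.sh". Without the final wait, the batch script would exit as soon as the last srun had been put in the background, and the job would end with it.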