[slurm-users] Job Step Output Delay

Maria Semple maria at rstudio.com
Wed Feb 10 21:11:07 UTC 2021


Hi Sean,

Thanks for your suggestion!

Adding the -u flag does not seem to have an impact on whether data is
buffered. I also tried adding stdbuf -o0 before the call to srun, to no
avail.
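
For reference, the stdbuf variant I tested looked roughly like this (stdbuf
placed in front of the srun call inside the batch script):

#!/bin/bash
# stdbuf -o0 turns off stdout buffering for the srun process itself
stdbuf -o0 srun ./loop.sh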

Best,
Maria

On Wed, Feb 10, 2021 at 4:30 AM Sean Maxwell <stm at case.edu> wrote:

> Hi Maria,
>
> Have you tried adding the -u flag (specifies unbuffered) to your srun
> command?
>
> https://slurm.schedmd.com/srun.html#OPT_unbuffered
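>
> Something like this in your batch script might be worth a try (untested on
> my end, adapting the withsteps.sh example from your message below):
>
> #!/bin/bash
> # -u (--unbuffered) requests unbuffered output handling, per the doc link above
> srun -u ./loop.sh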
>
> Your description sounds like buffering, so this might help.
>
> Thanks,
>
> -Sean
>
> On Tue, Feb 9, 2021 at 6:49 PM Maria Semple <maria at rstudio.com> wrote:
>
>> Hello all,
>>
>> I've noticed an odd behaviour with job steps in some Slurm environments.
>> When a script is launched directly as a job, the output is written to file
>> immediately. When the script is launched as a step in a job, output is
>> written in ~30 second chunks. This doesn't happen in all Slurm
>> environments, but where it does happen, it seems to happen consistently.
>> For example, on my local development cluster, a single node running Ubuntu
>> 18, I don't experience this. On a large CentOS 7 based cluster, I do.
>>
>> Below is a simple reproducible example:
>>
>> loop.sh:
>> #!/bin/bash
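>> # print one line per second so any output buffering is easy to spot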
>> for i in {1..100}
>> do
>>    echo $i
>>    sleep 1
>> done
>>
>> withsteps.sh:
>> #!/bin/bash
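>> # run loop.sh as a job step via srun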
>> srun ./loop.sh
>>
>> Then, from the command line, running sbatch loop.sh followed by tail -f
>> slurm-<job #>.out prints the job output in small chunks, which appears to
>> be related to file system buffering or to the time it takes the tail
>> process to notice that the file has been updated. Running cat on the file
>> every second shows that the output is in the file immediately after the
>> script emits it.
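>>
>> Concretely, the steps look roughly like this (the job number is whatever
>> sbatch reports):
>>
>> sbatch loop.sh
>> tail -f slurm-<job #>.out                            # appears in small chunks
>> while true; do cat slurm-<job #>.out; sleep 1; done  # already there each second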
>>
>> If you run sbatch withsteps.sh instead, tail-ing or repeatedly cat-ing
>> the output file shows that the job output is written in chunks of 30-35
>> lines.
>>
>> I'm hoping this is something that can be worked around, perhaps related to
>> an OS setting, a Slurm setting, or the way Slurm was compiled.
>>
>> --
>> Thanks,
>> Maria
>>
>

-- 
Thanks,
Maria