[slurm-users] Slurm overhead

Fri Apr 20 00:09:49 MDT 2018

Hi Mahmood,

Mahmood Naderan <mahmood.nt at gmail.com> writes:

> Hi,
> I have installed a program on all nodes since it is an rpm. Therefore,
> when the program is running, it won't use the shared file system and
> it just uses its own /usr/local/program files.
>
> I also set a scratch path in the bashrc which is actually the path on
> the running node. For example, I set TMPFOLDER=/tmp/mahmood/program in
> the bashrc (home is shared), then I ssh to the node and create that
> path. Therefore, when the program wants to read/write some data during
> the execution it won't go through the network.
>
> The thing is that when I ssh directly to the node and run the program
> with the time command, I see
>
> real    7m34.738s
>
> However, when I submit the job via slurm on the head node, I see
>
> [mahmood at rocks7 g]$ sacct -X -j 66 --format=elapsed
>    Elapsed
> ----------
>   00:11:28
>
>
> So, I think the slurm overhead is large (about 50%). Is that correct?

Rather than the overhead being 50% of the run time, it may just be a
fixed cost of about 4 minutes.  If another job runs for a week, that
would hardly be a problem.  In addition, you have only one data point,
so it is difficult to draw any conclusion.
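
To make the arithmetic explicit, here is a small Python sketch using only
the two figures quoted above.  The one-week job is a hypothetical example
and not something that was measured.

```python
# Figures quoted in this thread: `time` reported 7m34.738s for the direct
# run, and sacct reported an Elapsed value of 00:11:28 for job 66.
direct_s = 7 * 60 + 34.738
slurm_s = 11 * 60 + 28

extra_s = slurm_s - direct_s      # absolute difference in seconds
relative = extra_s / direct_s     # difference relative to the direct run

print(f"extra time: {extra_s:.1f} s ({extra_s / 60:.1f} min)")
print(f"relative to the direct run: {relative:.0%}")

# Assumption for illustration: if the extra time were a fixed cost rather
# than a proportional one, its share of a one-week job would be tiny.
week_s = 7 * 24 * 3600
print(f"share of a hypothetical one-week job: {extra_s / week_s:.3%}")
```

Whether the extra time really is a fixed cost cannot be decided from a
single run; timing several jobs of different lengths would show it.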

However, I think that it is unlikely that Slurm is responsible for
this difference.  What can happen is that, if a node is powered down
before the job starts, then the clock starts ticking as soon as the job
is assigned to the node.  This means that the elapsed time also includes
the time for the node to be provisioned.  If this is not relevant in
your case, then you are probably just not comparing like with like,
e.g. is the hardware underlying /tmp identical in both cases?
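
One way to compare like with like is to take the same measurement in
both places: time the program itself inside the batch job and compare
that figure, rather than the Elapsed value from sacct, with the time
seen over ssh.  The following Python wrapper is only a sketch; the name
time_command is made up for this example and the command it runs is a
placeholder for the real program.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True)  # raises if the command fails
    return time.perf_counter() - start


# Placeholder workload: replace this with the real program's command line.
command = [sys.executable, "-c", "pass"]
print(f"wall time: {time_command(command):.3f} s")
```

If the figure measured inside the job is close to the ssh figure, the
remaining difference from Elapsed was spent before the program started,
for example while the node was being provisioned.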

Cheers,

Loris

-- 
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin         Email loris.bennett at fu-berlin.de