RE:  Placing the full pathname of the job stdout in an environment variable

Would others find it useful if new variables were added that contained the full pathnames of the standard output, error, and input files of batch jobs?

## SYNOPSIS

Proposed new environment variables SLURM_STDOUT, SLURM_STDERR, and
SLURM_STDIN pointing to the full pathnames of the output, error,
and input files of a task would be a useful feature, for batch jobs
in particular.
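
For illustration only (these are hypothetical values, not output from any
current Slurm release), a batch job submitted with "--output=stdout.%j" and
no explicit --error or --input might then see something like:

    SLURM_STDOUT=/home/urbanjs/venus/V600/stdout.96
    SLURM_STDERR=/home/urbanjs/venus/V600/stdout.96   # stderr defaults to the stdout file
    SLURM_STDIN=/dev/null                             # batch stdin defaults to /dev/null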

## PROBLEM

There are cases where it is desirable to have a batch job process its own
standard files (stdin, stdout, stderr) from within the job itself instead
of later via an epilog or post-processing.

Just a few examples where the job might want to reference the full
pathname of stdout (a brief sketch follows the list):

   * copy the file to a remote machine (via scp(1), for example)
   * move the file from local or scratch space to permanent globally
     accessible storage
   * remove the file depending on completion status
   * mail the output
   * archive it
   * post-process it
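
As a concrete sketch of several of the items above, assuming the proposed
SLURM_STDOUT variable existed (the remote host and destination directory are
placeholders), the end of a batch script might read:

    # RC is assumed to hold the exit status of the main work of the job
    RC=$?
    if [ "$RC" -eq 0 ]
    then
        # on success, copy the output written so far to permanent storage
        scp "$SLURM_STDOUT" archive.example.org:archive/
    else
        # on failure, mail the output to the submitting user instead
        mailx -s "job $SLURM_JOB_ID failed" "$USER" < "$SLURM_STDOUT"
    fi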

Slurm commands like scontrol(1) and squeue(1) can help locate the file
names. But the names returned do not always expand all the macros allowed
when the names were specified, and the approach requires calling a Slurm
command (which might overload Slurm if millions of jobs are submitted, or
fail if the Slurm daemons are not responding for any reason). For
example:

      #!/bin/bash
      # Minimalist submit script for sbatch(1)
      #SBATCH --nodes 1-1 --ntasks=1 --time 0-0:1:00 --chdir=/tmp
      #SBATCH --output=stdout.%j
      #SBATCH --error=stderr.%A:%a:%J:%j:%N:%n:%s:%t:%u.out
      # query filenames via squeue(1)
      export SLURM_STDOUT=$(squeue --noheader --Format=STDOUT: --job=$SLURM_JOBID)
      export SLURM_STDERR=$(squeue --noheader --Format=STDERR: --job=$SLURM_JOBID)
      # PS: Current documentation for squeue(1) does not explicitly tell the
      #     user a null size works.
      # query filenames via scontrol(1)
      declare -x $(scontrol show job=$SLURM_JOBID|grep StdOut)
      declare -x $(scontrol show job=$SLURM_JOBID|grep StdErr)
      cat <<EOF
      SLURM_STDOUT=$SLURM_STDOUT
      SLURM_STDERR=$SLURM_STDERR
      StdOut=$StdOut
      StdErr=$StdErr
      EOF
      ls stdout.* stderr.*

The resulting output shows that the commands expand some macros but not
others (and I am not sure every command will always return a full pathname):

      SLURM_STDOUT=/home/urbanjs/venus/V600/stdout.%j
      SLURM_STDERR=/home/urbanjs/venus/V600/stderr.%A:%a:%J:%j:%N:%n:%s:%t:%u.out
      StdOut=/home/urbanjs/venus/V600/stdout.96
      StdErr=/home/urbanjs/venus/V600/stderr.96:4294967294:%J:96:%N:%n:%s:%t:urbanjs.out
      stderr.96:4294967294:96:96:mercury:0:4294967294:0:urbanjs.out
      stdout.96

One currently available work-around is for the user to avoid the filename
macros and always specify the filenames using a standard convention (which
would obviously have to be specific to a particular platform). This is
error-prone, as it requires the user to strictly follow a protocol for where
the pathnames are specified, and those pathnames can still be overridden by
command-line options to sbatch(1) and so on.
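
One such convention (a sketch only; the directory layout, variable names,
and script name are placeholders) is to construct explicit names at submit
time, avoid the macros entirely, and pass the names into the job's
environment:

    # choose explicit output names at submit time (no filename macros) and
    # export the same names into the job so the script can find its own files
    TAG=$(date +%Y%m%d%H%M%S).$$
    OUT=$HOME/joboutput/run.$TAG.out
    ERR=$HOME/joboutput/run.$TAG.err
    mkdir -p "$HOME/joboutput"
    sbatch --output="$OUT" --error="$ERR" \
           --export=ALL,MY_STDOUT="$OUT",MY_STDERR="$ERR" myjob.sh

This works only as long as nothing overrides --output or --error at another
layer, which is exactly the fragility noted above.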

Alternatively, instead of the new environment variables, Slurm could provide
a command option that always returns the fully expanded names with all
macros resolved.

But any approach that queries Slurm raises scalability concerns. Is it
OK for 100,000 jobs to query Slurm simultaneously via squeue(1)
or scontrol(1)? What happens if, for external reasons, the Slurm daemons
are down or not responding?

As a current work-around, commands like realpath(1), stat(1), ls(1), and
find(1), combined with getting PIDs via fuser(1), pidof(1), or ps(1), become
attractive. I find that calling realpath(1) at the top of the job with
$SLURM_TASK_PID works best.

realpath(1) may not be available on all platforms, and neither may the
/proc/$PID/fd/1 file, in which case the other commands might be used to
the same effect; but it would be nice to have a simple, standard way that
does not require calling a command at all. The proposed environment
variables seem the most obvious solution.

    # do this early in the job in case the user changes or closes
    # file descriptor 1 ...
    export SLURM_STDOUT=$(realpath /proc/$SLURM_TASK_PID/fd/1)
    # create softlink OUTPUT to stdout of job
    ln -s -f $SLURM_STDOUT OUTPUT
    # now user can do things like mailx(1) or scp(1) the OUTPUT file.
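
Where realpath(1) is missing but /proc is available, readlink(1) (where
present) can resolve the same symbolic link; a sketch:

    # fallback when realpath(1) is not installed: resolve the /proc symlink
    # for file descriptor 1 (stdout) with readlink(1) instead
    export SLURM_STDOUT=$(readlink -f /proc/$SLURM_TASK_PID/fd/1)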

Other reasons I might want to know the stdout pathname are to minimize
network traffic and file space on central servers.  I might direct the
output of a job to scratch or local storage or a memory-resident file
system and then move it to a location on another system or to a long-term
archive, instead of writing to a central area that might be filled by other
jobs or on which I have a quota.
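
A sketch of that pattern, using the realpath(1) trick shown above (the
archive directory is a placeholder):

      #!/bin/bash
      #SBATCH --output=/tmp/run.%j.out     # write stdout to node-local storage
      # resolve the real pathname of stdout early, as above
      STDOUT_PATH=$(realpath /proc/$SLURM_TASK_PID/fd/1)
      # DEST is a placeholder for a project area or long-term archive
      DEST=$HOME/archive
      mkdir -p "$DEST"
      # copy whatever has been written when the job script exits; the local
      # copy in /tmp is then left to the node's normal tmp-file cleanup
      trap 'cp "$STDOUT_PATH" "$DEST/"' EXIT
      # ... actual job work writing to stdout ...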

Since the stdout might then not be easily accessible from my other sessions,
having something like "scontrol write batch_stdout=$SLURM_JOBID" that shows
the stdout of the job would also be useful. Note that the LSF job
scheduler's bpeek(1) command provides a tail(1)-like interface to a job's
stdout precisely to support the case where the stdout is not accessible
from all system access points, as described here.
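
Until something like that exists, one rough approximation for a single-node
job (assuming the node-local output path is known and the site's Slurm
supports --overlap) is to start a small step inside the job's own allocation:

    # peek at a running batch job's node-local output from a login node,
    # roughly like LSF's bpeek(1); JOBID and the /tmp path are placeholders
    srun --jobid=$JOBID --overlap --ntasks=1 tail -n 40 /tmp/run.$JOBID.out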

In typical usage the output file is often in a globally mounted area and
can simply be accessed from other nodes with grep(1), tail(1), and so on;
that covers many users' needs in simple cluster configurations. But in
several of the scenarios described above, primarily encountered by people
running many thousands of single-node jobs, assigning output to a local
scratch device and then selectively processing it at job termination is
preferable.  As mentioned, those users can specify such a local name with
"#SBATCH --output" and then, knowing it, process the file as desired; but
having something like $SLURM_STDOUT hold the full pathname of the file is
much more generic.

## SUMMARY

Adding the variables would give users a reliable, low-load, scalable way to
access the standard files of a job; and, being ordinary SLURM_* environment
variables, they could also be accessed much more reliably from prolog and
epilog scripts and module(1) scripts.

## PS:

Related to this, in addition to --chdir, #SBATCH options --mkdir and
--scratch would be useful. They would expand the filename macros like
--chdir does, but --mkdir would also make sure the directory existed
(using only user privileges), and --scratch would do the same as --mkdir
but remove the directory at job termination.
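
Today that has to be done by hand inside the job script, for example (a
sketch; the naming convention is a placeholder):

    # what --mkdir/--scratch might automate, done manually today
    SCRATCH=/tmp/$USER.$SLURM_JOB_ID
    mkdir -p "$SCRATCH"
    cd "$SCRATCH" || exit 1
    # remove the scratch directory when the job script exits
    trap 'cd /; rm -rf "$SCRATCH"' EXIT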




