[slurm-users] scontrol for a heterogeneous job appears incorrect

Jeffrey R. Lang JRLang at uwyo.edu
Tue Apr 23 22:02:15 UTC 2019


I'm testing heterogeneous jobs for a user on our cluster, but I think the output of "scontrol show job XXX" for the job is incorrect. The cluster is currently running Slurm 18.08.

So my job script looks like this:

#!/bin/sh

### This is a general SLURM script. You'll need to make modifications for this to
### work with the appropriate packages that you want. Remember that the .bashrc
### file will get executed on each node upon login and any settings in this script
### will be in addition to, or will override, the system bashrc file settings. Users will
### find it advantageous to use only the specific modules they want or
### specify a certain PATH environment variable, etc. If you have questions,
### please contact the ARCC at arcc-info at uwyo.edu for help.

### Informational text is usually indicated by "###". Don't uncomment these lines.

### Lines beginning with "#SBATCH" are SLURM directives. They tell SLURM what to do.
### For example, #SBATCH --job-name my_job tells SLURM that the name of the job is "my_job".
### Don't remove the "#SBATCH".

### Job Name
#SBATCH --job-name=CHECK_NODE

### Declare an account for the job to run under
#SBATCH --account=arcc

### Standard output stream files have a default name of:
### "slurm-<jobid>.out". However, this can be changed using the options
### below. If you would like stdout and stderr to be combined,
### omit the "SBATCH -e" option below.
###SBATCH -o stdout_file
###SBATCH -e stderr_file

### mailing options
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH --mail-user=xxx

### Set max walltime (days-hours:minutes:seconds)
#SBATCH --time=0-01:00:00

### Specify Resources
### 1 node, 1 task on the hugemem partition, plus 9 nodes with 32 tasks each on teton
#SBATCH --nodes=1 --ntasks=1 --cpus-per-task=1  --partition=teton-hugemem
#SBATCH packjob
#SBATCH --nodes=9 --ntasks-per-node=32 --partition=teton

### Load needed modules
#module load gcc/7.3.0
#module load swset/2018.05
#module load openmpi/3.1.0

### Start the job via launcher.
### Command normally given on command line
srun check_nodes

sleep 600
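
(For completeness, I'm submitting this with plain sbatch, nothing extra on the command line; the script name is the one shown in the Command= path of the scontrol output below:)

    sbatch check_nodes.sbatch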


When I submit the job and check it with "scontrol show job XXX", I see:

JobId=2607083 PackJobId=2607082 PackJobOffset=1 JobName=CHECK_NODE
   PackJobIdSet=2607082-2607083
   UserId=jrlang(10024903) GroupId=jrlang(10024903) MCS_label=N/A
   Priority=1086 Nice=0 Account=arcc QOS=normal
   JobState=RUNNING Reason=None Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:03:33 TimeLimit=01:00:00 TimeMin=N/A
   SubmitTime=2019-04-23T15:42:45 EligibleTime=2019-04-23T15:42:45
   AccrueTime=2019-04-23T15:42:45
   StartTime=2019-04-23T15:42:49 EndTime=2019-04-23T16:42:49 Deadline=N/A
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   LastSchedEval=2019-04-23T15:42:49
   Partition=teton AllocNode:Sid=tmgt1:34097
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=t[456-464]
   BatchHost=t456
   NumNodes=9 NumCPUs=288 NumTasks=288 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=288,mem=288000M,node=9,billing=288
   Socks/Node=* NtasksPerN:B:S:C=32:0:*:* CoreSpec=*
   MinCPUsNode=32 MinMemoryCPU=1000M MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/pfs/tsfs1/home/jrlang/TEST_CODE/check_nodes.sbatch
   WorkDir=/pfs/tsfs1/home/jrlang/TEST_CODE
   StdErr=/pfs/tsfs1/home/jrlang/TEST_CODE/slurm-2607083.out
   StdIn=/dev/null
   StdOut=/pfs/tsfs1/home/jrlang/TEST_CODE/slurm-2607083.out
   Power=

Looking at the NodeList and NumNodes, both appear incorrect: they should show the first node and then the additional nodes assigned.
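
For reference, the record above is the offset-1 component (PackJobOffset=1). If I'm reading PackJobIdSet correctly, 2607082 is the pack leader, so I would have expected to be able to see everything with:

    # I'd expect this to print one record per pack component
    scontrol show job 2607082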

Using pestat, I see the 10 nodes allocated to the job:

    t456           teton    alloc  32  32    0.01*   128000   119539  2607083 jrlang
    t457           teton    alloc  32  32    0.01*   128000   119459  2607083 jrlang
    t458           teton    alloc  32  32    0.03*   128000   119854  2607083 jrlang
    t459           teton    alloc  32  32    0.01*   128000   119567  2607083 jrlang
    t460           teton    alloc  32  32    0.01*   128000   119567  2607083 jrlang
    t461           teton    alloc  32  32    0.01*   128000   119308  2607083 jrlang
    t462           teton    alloc  32  32    0.01*   128000   119570  2607083 jrlang
    t463           teton    alloc  32  32    0.01*   128000   119241  2607083 jrlang
    t464           teton    alloc  32  32    0.01*   128000   119329  2607083 jrlang
   thm03   teton-hugemem      mix   1  32    0.01*  1024000  1017834  2607082 jrlang

So why isn't scontrol showing the thm03 node in the NodeList and including it in NumNodes?
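
If it helps, I'm also checking the allocation from the scheduler side; my understanding from the docs (not something I've verified) is that squeue lists heterogeneous components as <leader>+<offset>:

    # components should show up as 2607082+0 and 2607082+1 in the JOBID column
    squeue -j 2607082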

One other question is how Slurm handles the output for this job. The job is a "hello world" type which just prints the node and rank that each part runs on. When the job completes, I only see one line in the output, from the rank 0 task.

So where is all the rank output ending up?
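
In case it's relevant: the srun line in the script doesn't name a pack group, and my reading of the 18.08 heterogeneous-jobs documentation is that a plain srun inside the allocation only launches the step on the first component. The variations I'd try are below (the --pack-group flag and the colon form are taken from the docs, so treat them as assumptions on my part):

    # launch check_nodes across both pack components
    srun --pack-group=0,1 check_nodes

    # or give each component its own segment
    srun check_nodes : check_nodes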

Jeff




