[slurm-users] Unable to get output file once job is completed

Luke Yeager lyeager at nvidia.com
Mon Feb 8 15:51:46 UTC 2021


The output file is written to the filesystem mounted on the compute node[s], not the control node. Do you have a shared filesystem? Is the output file for your job being written to that shared filesystem?
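One quick way to confirm this (a rough sketch; the node name "snode" is taken from your sinfo output below, and "where.txt" is just an example filename, assuming you submit from your home directory) is to have a job report where it runs and then look for its output on the compute node directly:

    # Submit a job that records the node and working directory it runs on;
    # the output file is created on the filesystem seen by the compute node.
    sbatch --wrap="hostname; pwd" --output=where.txt

    # If where.txt never appears on the login node, check the compute
    # node's own filesystem (this assumes you can SSH to the node):
    ssh snode 'ls -l ~/where.txt'

If the file shows up only on the compute node, your /home is local to each node rather than shared, and you will need a shared filesystem (NFS or similar) mounted on both the login node and the compute nodes, or you will need to point --output/--error at a path that is shared.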

From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of Zainul Abiddin
Sent: Sunday, February 7, 2021 10:59 PM
To: slurm-users at lists.schedmd.com
Subject: [slurm-users] Unable to get output file once job is completed

Hi All,

I have created accounts and users on a cluster, logged in as one of the users, and submitted a job. The job completes, but no output file is created. Can anyone help me with this?

[zain at smaster ~]$ sacctmgr list associations cluster=scluster format=Account,Cluster,User,Fairshare tree withd
             Account    Cluster       User     Share
-------------------- ---------- ---------- ---------
root                   scluster                    1
 root                  scluster       root         1
 software              scluster                   50
  ai                   scluster                   20
   ai                  scluster    jeyaraj         1
  hpc                  scluster                   30
   hpc                 scluster   srikanth        10
   hpc                 scluster       zain        10
[zain at smaster ~]$
[zain at smaster ~]$ sbatch --wrap="uptime"
Submitted batch job 47
[zain at smaster ~]$ ls
Desktop  Documents  Downloads  Music  Pictures  Public  Templates  test.sh  Videos
[zain at smaster ~]$ sacct
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
31             hostname        hpc        hpc          1  COMPLETED      0:0
37                   n2        hpc        hpc          1     FAILED      2:0
38                   N2        hpc        hpc          1     FAILED      2:0
39             hostname        hpc        hpc          2  COMPLETED      0:0
40                 wrap        hpc        hpc          1  COMPLETED      0:0
40.batch          batch                   hpc          1  COMPLETED      0:0
41              testjob        hpc        hpc          1  COMPLETED      0:0
41.batch          batch                   hpc          1  COMPLETED      0:0
47                 wrap        hpc        hpc          1  COMPLETED      0:0
47.batch          batch                   hpc          1  COMPLETED      0:0
[zain at smaster ~]$ sbatch test.sh
Submitted batch job 48
[zain at smaster ~]$ cat test.sh
#!/bin/bash

#SBATCH -N 1
#SBATCH -n 1
#SBATCH -p hpc
#SBATCH -t 01:00:00
#SBATCH -J testjob
#SBATCH -o testjob.o%j
#SBATCH -e testjob.e%j
hostname
date
[zain at smaster ~]$
[zain at smaster ~]$ ls
Desktop  Documents  Downloads  Music  Pictures  Public  Templates  test.sh  Videos
[zain at smaster ~]$ sacct
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
31             hostname        hpc        hpc          1  COMPLETED      0:0
37                   n2        hpc        hpc          1     FAILED      2:0
38                   N2        hpc        hpc          1     FAILED      2:0
39             hostname        hpc        hpc          2  COMPLETED      0:0
40                 wrap        hpc        hpc          1  COMPLETED      0:0
40.batch          batch                   hpc          1  COMPLETED      0:0
41              testjob        hpc        hpc          1  COMPLETED      0:0
41.batch          batch                   hpc          1  COMPLETED      0:0
47                 wrap        hpc        hpc          1  COMPLETED      0:0
47.batch          batch                   hpc          1  COMPLETED      0:0
48              testjob        hpc        hpc          1  COMPLETED      0:0
48.batch          batch                   hpc          1  COMPLETED      0:0
[zain at smaster ~]$

[zain at smaster ~]$ sinfo -Nl
Mon Feb 08 12:24:46 2021
NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON
smaster        1      hpc*        idle 8       8:1:1   1024        0      1   (null) none
snode          1     debug        idle 4       4:1:1   1024        0      1   (null) none
snode          1      hpc*        idle 4       4:1:1   1024        0      1   (null) none
[zain at smaster ~]$

Regards,
Zain
