[slurm-users] Unable to get output file once job is completed
Luke Yeager
lyeager at nvidia.com
Mon Feb 8 15:51:46 UTC 2021
The output file is written to the filesystem mounted on the compute node[s], not the control node. Do you have a shared filesystem? Is the output file for your job being written to that shared filesystem?
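One way to check this is sketched below. It is only a sketch: job ID 48 and the node name snode are taken from the transcript in the quoted message, fs_info is a hypothetical helper, and the df flags assume GNU coreutils. It reports which node ran the job and which filesystem backs the submit directory on each side.

```shell
#!/bin/bash
# Sketch only: job ID 48 and node name "snode" come from the quoted
# transcript; adjust both for your own cluster.

# Hypothetical helper: print the filesystem type and mount point that back
# a directory. Assumes a df that accepts -P and -T together (GNU df does).
fs_info() {
    df -PT "$1" | awk 'NR == 2 { print $2, $NF }'
}

jobid=48
node=snode
dir=$PWD   # relative -o/-e paths resolve against the job's working directory

# Filesystem behind the submit directory on the host you submitted from.
echo "local: $(fs_info "$dir")"

# The Slurm steps only run where the client commands are installed.
if command -v sacct >/dev/null 2>&1; then
    # Which node ran the job, and what was its working directory?
    sacct -j "$jobid" --format=JobID,NodeList,WorkDir

    # Same df check on the compute node for comparison.
    srun -w "$node" df -PT "$dir"
fi
```

If the compute node reports a local filesystem type (for example xfs or ext4) rather than a network one such as nfs, the output file is most likely sitting under the same path on that node's local disk.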
From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of Zainul Abiddin
Sent: Sunday, February 7, 2021 10:59 PM
To: slurm-users at lists.schedmd.com
Subject: [slurm-users] Unable to get output file once job is completed
Hi All,
I have created accounts and users on a cluster, logged in as one of those users, and submitted a job. The job completes, but no output file is created. Can anyone help me with this?
[zain at smaster ~]$ sacctmgr list associations cluster=scluster format=Account,Cluster,User,Fairshare tree withd
Account              Cluster    User       Share
-------------------- ---------- ---------- ---------
root                 scluster              1
root                 scluster   root       1
software             scluster              50
ai                   scluster              20
ai                   scluster   jeyaraj    1
hpc                  scluster              30
hpc                  scluster   srikanth   10
hpc                  scluster   zain       10
[zain at smaster ~]$
[zain at smaster ~]$ sbatch --wrap="uptime"
Submitted batch job 47
[zain at smaster ~]$ ls
Desktop Documents Downloads Music Pictures Public Templates test.sh Videos
[zain at smaster ~]$ sacct
JobID        JobName    Partition  Account    AllocCPUS  State      ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
31           hostname   hpc        hpc        1          COMPLETED  0:0
37           n2         hpc        hpc        1          FAILED     2:0
38           N2         hpc        hpc        1          FAILED     2:0
39           hostname   hpc        hpc        2          COMPLETED  0:0
40           wrap       hpc        hpc        1          COMPLETED  0:0
40.batch     batch                 hpc        1          COMPLETED  0:0
41           testjob    hpc        hpc        1          COMPLETED  0:0
41.batch     batch                 hpc        1          COMPLETED  0:0
47           wrap       hpc        hpc        1          COMPLETED  0:0
47.batch     batch                 hpc        1          COMPLETED  0:0
[zain at smaster ~]$ sbatch test.sh
Submitted batch job 48
[zain at smaster ~]$ cat test.sh
#!/bin/bash
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -p hpc
#SBATCH -t 01:00:00
#SBATCH -J testjob
#SBATCH -o testjob.o%j
#SBATCH -e testjob.e%j
hostname
date
[zain at smaster ~]$
[zain at smaster ~]$ ls
Desktop Documents Downloads Music Pictures Public Templates test.sh Videos
[zain at smaster ~]$ sacct
JobID        JobName    Partition  Account    AllocCPUS  State      ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
31           hostname   hpc        hpc        1          COMPLETED  0:0
37           n2         hpc        hpc        1          FAILED     2:0
38           N2         hpc        hpc        1          FAILED     2:0
39           hostname   hpc        hpc        2          COMPLETED  0:0
40           wrap       hpc        hpc        1          COMPLETED  0:0
40.batch     batch                 hpc        1          COMPLETED  0:0
41           testjob    hpc        hpc        1          COMPLETED  0:0
41.batch     batch                 hpc        1          COMPLETED  0:0
47           wrap       hpc        hpc        1          COMPLETED  0:0
47.batch     batch                 hpc        1          COMPLETED  0:0
48           testjob    hpc        hpc        1          COMPLETED  0:0
48.batch     batch                 hpc        1          COMPLETED  0:0
[zain at smaster ~]$
[zain at smaster ~]$ sinfo -Nl
Mon Feb 08 12:24:46 2021
NODELIST NODES PARTITION STATE CPUS S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON
smaster  1     hpc*      idle  8    8:1:1 1024   0        1      (null)   none
snode    1     debug     idle  4    4:1:1 1024   0        1      (null)   none
snode    1     hpc*      idle  4    4:1:1 1024   0        1      (null)   none
[zain at smaster ~]$
Regards,
Zain