[slurm-users] Job cannot start on slurm v18.08.0pre2

zhangtao102019 at 126.com
Fri Aug 17 21:39:42 MDT 2018


Hi,
I have installed SLURM 18.08.0-0pre2 on my cluster, which runs RHEL 7.4 (x86_64).
My configure parameters were as follows:
./configure --prefix=/opt/slurm17 --with-munge=/opt/munge --with-pmix=/opt/pmix --with-ucx=/opt/openucx --with-hwloc=/usr 
(openucx version is 1.5.0, pmix version is 3.0.0, hwloc version is 1.11.8)

After completing the installation and configuration, Slurm appeared to be working normally. But when I submitted a simple test job with sbatch sleep.sh (the script just calls srun sleep 30 on a single compute node), the job (ID=1032) showed state R, yet nothing actually started on the compute node (no process was found).
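
For reference, a test script of the kind described above can be sketched as follows. This is a minimal reconstruction, not the actual sleep.sh: the #SBATCH options (job name, node count, task count) are assumptions chosen to match "a single compute node", and the heredoc is only a convenient way to create the file.

```shell
# Create a minimal batch script (hypothetical reconstruction of sleep.sh).
cat > sleep.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=sleep-test   # assumed name, for illustration only
#SBATCH --nodes=1               # single compute node
#SBATCH --ntasks=1              # one task is enough for this test
srun sleep 30
EOF

# Submit only if the Slurm client tools are on PATH.
if command -v sbatch >/dev/null 2>&1; then
    sbatch sleep.sh
fi
```

On a working installation, a sleep process should be visible on the allocated node for about 30 seconds while the job is in state R.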

Attached are the output logs from the compute node and the management node.
I can't tell whether this problem is related to the configure options I specified (such as PMIx and UCX); I have never seen anything similar in earlier versions.
Has anyone encountered a similar problem? How can I solve it?

Best regards



zhangtao102019 at 126.com
-------------- next part --------------
A non-text attachment was scrubbed...
Name: logs.tgz
Type: application/octet-stream
Size: 14748 bytes
Desc: not available
URL: <http://lists.schedmd.com/pipermail/slurm-users/attachments/20180818/17a1fe94/attachment-0001.obj>

