[slurm-users] Error running jobs with srun

Lachlan Musicman datakid at gmail.com
Wed Nov 8 16:48:56 MST 2017


On 9 November 2017 at 10:35, Elisabetta Falivene <e.falivene at ilabroma.com>
wrote:

> Wow, thank you. Is there a way to check which directories the master and
> the nodes share?
>

There's no explicit way.
1. Check the cluster documentation written by the cluster admins
2. Ask the cluster admins
3. Run "mount" or "cat /etc/mtab" or "df -H" on the master node and check
against the same commands on a worker node (by getting an interactive
terminal: "srun --pty bash" )
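
For option 3, comparing the two listings by eye gets tedious on a node with many mounts. Below is a minimal sketch, not a definitive tool, that compares two saved copies of /etc/mtab. The function names and the list of network filesystem types are assumptions made for illustration; adjust the list for your site. One caveat: if the master exports a directory from its own local disk, that directory appears as a network mount only on the worker, so also look at the worker's network mounts for devices that name the master.

```python
# Sketch: compare mount tables captured from the master and from a worker.
# Assumes each input is the text of /etc/mtab, one mount per line:
#   device mountpoint fstype options dump pass

# Assumed list of network filesystem types; extend it for your site.
NETWORK_FS_TYPES = {"nfs", "nfs4", "cifs", "lustre", "gpfs", "beegfs",
                    "glusterfs", "ceph"}


def network_mounts(mtab_text):
    """Return {mountpoint: device} for network filesystems in an mtab dump."""
    mounts = {}
    for line in mtab_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype = fields[0], fields[1], fields[2]
        if fstype in NETWORK_FS_TYPES:
            mounts[mountpoint] = device
    return mounts


def same_source_mounts(master_text, worker_text):
    """Mount points where both hosts mount the same network device."""
    master = network_mounts(master_text)
    worker = network_mounts(worker_text)
    return sorted(mp for mp, dev in worker.items() if master.get(mp) == dev)


if __name__ == "__main__":
    import sys

    # Usage (file names are placeholders):
    #   cat /etc/mtab > master.mtab
    #   srun cat /etc/mtab > worker.mtab
    #   python compare_mounts.py master.mtab worker.mtab
    with open(sys.argv[1]) as f:
        master_text = f.read()
    with open(sys.argv[2]) as f:
        worker_text = f.read()
    print("Network mounts on worker:", network_mounts(worker_text))
    print("Same source on both:", same_source_mounts(master_text, worker_text))
```

Files that a job needs should live under one of the directories this reports, and the job should be submitted from there.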

Cheers
L.

------
"The antidote to apocalypticism is *apocalyptic civics*. Apocalyptic civics
is the insistence that we cannot ignore the truth, nor should we panic
about it. It is a shared consciousness that our institutions have failed
and our ecosystem is collapsing, yet we are still here — and we are
creative agents who can shape our destinies. Apocalyptic civics is the
conviction that the only way out is through, and the only way through is
together. "

*Greg Bloom* @greggish
https://twitter.com/greggish/status/873177525903609857



> On Wednesday, 8 November 2017, Lachlan Musicman <datakid at gmail.com>
> wrote:
>
>> On 9 November 2017 at 09:19, Elisabetta Falivene
>> <e.falivene at ilabroma.com> wrote:
>>
>>> I'm getting this message any time I try to run a job on my cluster.
>>> (node01 is the first of my eight nodes and is up and running.)
>>>
>>> Trying a simple Python script:
>>> root@mycluster:/tmp# srun python test.py
>>> slurmd[node01]: error: task/cgroup: unable to build job physical cores
>>> /usr/bin/python: can't open file 'test.py': [Errno 2] No such file or directory
>>> srun: error: node01: task 0: Exited with exit code 2
>>>
>>>
>> This error - which I've seen too many times to mention - is because the
>> file isn't visible to the node.
>>
>> E.g., if all the machines in the cluster share /opt and /home but not
>> /root, and you run "srun python test.py" from /root, then node01 can't
>> find it (because on node01, /root/test.py doesn't exist).
>>
>> Cheers
>> L.
>>
>