[slurm-users] salloc unable to find the file path

Mahmood Naderan mahmood.nt at gmail.com
Wed Dec 26 13:17:26 MST 2018


Please see below: salloc says it is unable to exec the command, while srun
is able to run it.

[mahmood at rocks7 ~]$ salloc --nodelist=compute-0-4 -p RUBY -A y4 --mem=4G -c
4 --spankx11 /state/partition1/v190/Framework/bin/Linux64/runwb2
salloc: Granted job allocation 104
salloc: error: Unable to exec command
"/state/partition1/v190/Framework/bin/Linux64/runwb2"
salloc: Relinquishing job allocation 104
[mahmood at rocks7 ~]$ srun --nodelist=compute-0-4 -p RUBY -A y4 --mem=4G -c 4
--spankx11 /state/partition1/v190/Framework/bin/Linux64/runwb2
^Csrun: interrupt (one more within 1 sec to abort)
srun: step:105.0 task 0: running
^Csrun: sending Ctrl-C to job 105.0
srun: Job step aborted: Waiting up to 62 seconds for job step to finish.
slurmstepd: error: *** STEP 105.0 ON compute-0-4 CANCELLED AT
2018-12-26T15:15:52 ***
[mahmood at rocks7 ~]$


I intentionally pressed ^C to make the srun exit the job.
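For what it's worth, one possible explanation (my assumption, not something confirmed in this thread): salloc executes the given command on the submission host (rocks7 here), while srun launches it on the allocated node. If the binary lives only on the compute node's local disk, salloc's exec will fail even though srun works. A minimal sketch for checking where the file is actually visible, using the path from the thread:

```shell
#!/bin/sh
# Sketch: report whether the binary is visible on the host this runs on.
# The path is taken from the thread; adjust for your own setup.
BIN=/state/partition1/v190/Framework/bin/Linux64/runwb2

if [ -x "$BIN" ]; then
    echo "runwb2 present on $(hostname)"
else
    # This is the situation where exec'ing the path would fail with
    # "No such file or directory" on this host.
    echo "runwb2 missing on $(hostname)"
fi
```

Running this on the login node versus inside `srun` on compute-0-4 should show whether the path is mounted in both places. If it is missing on the login node, a workaround would be to keep the allocation but launch the program with srun inside it, so it runs on the node where the file exists.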

Regards,
Mahmood




On Wed, Dec 26, 2018 at 11:44 PM Mahmood Naderan <mahmood.nt at gmail.com>
wrote:

> But the problem is that it doesn't find the file path. That is not related
> to slurm parameters AFAIK.
>
> Regards,
> Mahmood
>
>
>
>
> On Wed, Dec 26, 2018 at 11:37 PM Ing. Gonzalo E. Arroyo <
> garroyo at ifimar-conicet.gob.ar> wrote:
>
>> You should try starting with a small number of parameters. For example, if
>> I have my IFIMAR partition and I need 2 cores on 1 node: salloc -N1 -n2
>> --partition IFIMAR
>>
>> *Ing. Gonzalo E. Arroyo - CPA Profesional*
>> IFIMAR - CONICET
>> *www.ifimar-conicet.gob.ar <http://www.ifimar-conicet.gob.ar>*
>>
>> This message is confidential. It may also contain information that is
>> privileged or not authorized to be disclosed. If you have received it by
>> mistake, delete it from your system. You should not copy the message nor
>> disclose its contents to anyone. Thanks.
>>
>>
>> El mié., 26 dic. 2018 a las 16:37, Mahmood Naderan (<mahmood.nt at gmail.com>)
>> escribió:
>>
>>> Hi,
>>> Although the command exists on the node, salloc says there is no such
>>> file. Please see below
>>>
>>> [mahmood at rocks7 ~]$ cat workbench.sh
>>> #!/bin/bash
>>> #SBATCH --nodes=1
>>> #SBATCH --cores=4
>>> #SBATCH --mem=4G
>>> #SBATCH --partition=RUBY
>>> #SBATCH --account=y4
>>> unset SLURM_GTIDS
>>> /state/partition1/v190/Framework/bin/Linux64/runwb2
>>>
>>> [mahmood at rocks7 ~]$ rocks run host compute-0-4 "ls -l
>>> /state/partition1/v190/Framework/bin/Linux64/runwb2"
>>> Warning: untrusted X11 forwarding setup failed: xauth key data not
>>> generated
>>> -rwxr-xr-x 1 root root 8232 Nov 30  2017
>>> /state/partition1/v190/Framework/bin/Linux64/runwb2
>>> [mahmood at rocks7 ~]$ salloc --nodelist=compute-0-4 --spankx11
>>> ./workbench.sh
>>> salloc: Granted job allocation 95
>>> ./workbench.sh: line 8:
>>> /state/partition1/v190/Framework/bin/Linux64/runwb2: No such file or
>>> directory
>>> salloc: Relinquishing job allocation 95
>>>
>>>
>>> Any idea about that?
>>>
>>> Regards,
>>> Mahmood
>>>
>>>
>>>

