[slurm-users] Job step aborted

Matthieu Hautreux matthieu.hautreux at gmail.com
Thu May 17 13:10:57 MDT 2018


On Thu, 17 May 2018 at 11:28, Mahmood Naderan <mahmood.nt at gmail.com> wrote:

> Hi,
> For an interactive job via srun, I see that after opening the gui, the
> session is terminated automatically which is weird.
>
> [mahmood at rocks7 ansys_test]$ srun --x11 -A y8 -p RUBY --ntasks=10
> --mem=8GB --pty bash
> [mahmood at compute-0-6 ansys_test]$ /state/partition1/scfd/sc -t10
> srun: First task exited 60s ago
> srun: step:292.0 task 0: running
> srun: step:292.0 tasks 1-9: exited
> srun: Terminating job step 292.0
> srun: Job step aborted: Waiting up to 62 seconds for job step to finish.
> srun: error: compute-0-6: task 0: Killed
>
> What does that mean?
>

It means what it says: your job step was terminated because 9 of the 10
tasks had exited more than 60 seconds earlier.

The logic behind the 60-second delay (configurable, see the --wait option)
is described in the srun man page. You should read it closely.
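
As a rough sketch of what changing that delay could look like: the srun
man page documents -W/--wait=<seconds>, where 0 means an unlimited wait
(srun still prints a warning after 60 seconds). The command below reuses
the options from the original post. Whether an unlimited wait suits this
job is an assumption, and SRUN_WAIT_CMD is only an illustrative variable
name.

```shell
# Sketch only: same options as the original command, plus --wait=0 so that
# srun does not terminate the step after the first task exits.
SRUN_WAIT_CMD="srun --wait=0 --x11 -A y8 -p RUBY --ntasks=10 --mem=8GB --pty bash"

# Print the command for inspection; nothing is submitted here.
echo "$SRUN_WAIT_CMD"

# On the login node, uncomment the next line to actually run it:
# $SRUN_WAIT_CMD
```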

You should also look at the FAQ here https://slurm.schedmd.com/faq.html.

You should set --ntasks=1, if I have guessed your goal correctly.
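
For illustration, a single-task version of the original command might look
like the sketch below. The --cpus-per-task=10 part is an assumption: it
supposes that -t10 asks the application for 10 threads, so the one task
should still get 10 cores. Drop or adjust it if that guess is wrong.
SRUN_CMD is only an illustrative variable name.

```shell
# Sketch only: one task instead of ten, so no sibling tasks can exit early
# and trigger the 60-second termination.
# ASSUMPTION: -t10 means 10 threads, hence --cpus-per-task=10.
SRUN_CMD="srun --x11 -A y8 -p RUBY --ntasks=1 --cpus-per-task=10 --mem=8GB --pty bash"

# Print the command for inspection; nothing is submitted here.
echo "$SRUN_CMD"

# On the login node, uncomment the next line to actually run it:
# $SRUN_CMD
```

Once the interactive shell opens on the compute node, the GUI can be
started as before with /state/partition1/scfd/sc -t10.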

HTH



> Regards,
> Mahmood
>
>