[slurm-users] Odd prolog Error?

Jason Simms jsimms1 at swarthmore.edu
Tue Apr 11 17:28:25 UTC 2023


Thanks, Brian, helpful as always. Yes, /opt/slurm/prolog.sh is mounted
across IB on all nodes, so it's reachable from everywhere. And the slurmd
user can execute it.

I'll keep mucking around with it...

Warmest regards,
Jason

On Tue, Apr 11, 2023 at 12:57 PM Brian Andrus <toomuchit at gmail.com> wrote:

> From the documentation:
>
> Parameter:      Prolog (from slurm.conf)
> Location:       Compute or front end node
> Invoked by:     slurmd daemon
> User:           SlurmdUser (normally user root)
> When executed:  First job or job step initiation on that node (by
>                 default); PrologFlags=Alloc will force the script to be
>                 executed at job allocation
>
> So ensure:
> 1) /opt/slurm/prolog.sh exists on the node(s)
> 2) the slurmd user is able to execute it
>
> I would connect to the node and try to run the command as the slurmd user.
> Also, ensure the user exists on the node, however you are propagating the
> uids.
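
A minimal sketch of the two checks above, meant to be run on the compute node
as the slurmd user. The function name check_prolog is made up for
illustration; the path is the one from this thread.

```bash
#!/bin/bash
# check_prolog is a hypothetical helper, not part of Slurm.
# It reports whether the *current* user can see and execute the script.
check_prolog() {
    path=$1
    if [ ! -e "$path" ]; then
        echo "missing: $path"
        return 1
    fi
    if [ ! -x "$path" ]; then
        echo "not executable by $(id -un): $path"
        return 2
    fi
    echo "ok: $path"
}

# Example, run on the node as the slurmd user:
#   check_prolog /opt/slurm/prolog.sh
```

This only confirms that the file is present and executable. It does not run
the script, so a failure inside the script itself would still go unnoticed.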
>
> Brian Andrus
>
> On 4/11/2023 9:48 AM, Jason Simms wrote:
>
> Hello all,
>
> Regularly I'm seeing array jobs fail, and the only log info from the
> compute node is this:
>
> [2023-04-11T11:41:12.336] error: /opt/slurm/prolog.sh: exited with status 0x0100
> [2023-04-11T11:41:12.336] error: [job 26090] prolog failed status=1:0
> [2023-04-11T11:41:12.336] Job 26090 already killed, do not launch batch job
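
For reading the log lines above: as far as I can tell, 0x0100 is the raw wait
status of the script in hex, and "status=1:0" is the same value split into
exit code and signal. A small sketch of that decoding, assuming the usual
POSIX wait-status layout (exit code in the high byte, signal in the low seven
bits); decode_wait_status is a made-up name:

```bash
#!/bin/bash
# decode_wait_status is a hypothetical helper, not a Slurm tool.
# Assumes the conventional wait-status layout: (exit_code << 8) | signal.
decode_wait_status() {
    status=$(( $1 ))
    exit_code=$(( (status >> 8) & 0xff ))
    signal=$(( status & 0x7f ))
    echo "exit_code=${exit_code} signal=${signal}"
}

decode_wait_status 0x0100   # prints: exit_code=1 signal=0
```

Read that way, the prolog script exited with code 1 and was not killed by a
signal, which would mean the failure comes from the command inside the script.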
>
> The contents of prolog.sh are incredibly simple:
>
> #!/bin/bash
> loginctl enable-linger $SLURM_JOB_USER
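
Since the script's exit status is simply whatever loginctl returns, one way to
find out why it fails is to capture loginctl's output. A rough sketch, not a
tested fix: the log path, the PROLOG_LOG variable, and the run_logged helper
are all assumptions made for illustration.

```bash
#!/bin/bash
# Hypothetical log location; pick something writable by the slurmd user.
LOG="${PROLOG_LOG:-/tmp/prolog-debug.log}"

# run_logged (made-up helper): run a command, append its stdout/stderr to
# $LOG, and record a line with the job ID and exit code if it fails.
run_logged() {
    "$@" >>"$LOG" 2>&1
    rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "$(date '+%Y-%m-%dT%H:%M:%S') job=${SLURM_JOB_ID:-unknown} '$*' failed rc=$rc" >>"$LOG"
    fi
    return "$rc"
}

# Only call loginctl when the job user is actually set in the environment.
if [ -n "${SLURM_JOB_USER:-}" ]; then
    run_logged loginctl enable-linger "$SLURM_JOB_USER"
fi
```

The script still exits non-zero when loginctl fails, so the node would still
drain; the only difference is that the reason should end up in the log file.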
>
> I can't sort out what may be going on here. An example script from a job
> that can result in this error is here:
>
> #!/bin/bash
> #SBATCH -t 2:00:00
> #SBATCH -n 1
> #SBATCH -N 1
> #SBATCH -p compute
> #SBATCH --array=1-100
> #SBATCH -o tempOut/MSO-%j-%a.log
>
> module load python3/python3
> python3 runVoltage.py $SLURM_ARRAY_TASK_ID
>
> Any insight would be welcome! This is really frustrating because it's
> constantly causing nodes to drain.
>
> Warmest regards,
> Jason
>
> --
> *Jason L. Simms, Ph.D., M.P.H.*
> Manager of Research Computing
> Swarthmore College
> Information Technology Services
> (610) 328-8102
> Schedule a meeting: https://calendly.com/jlsimms
>
>

-- 
*Jason L. Simms, Ph.D., M.P.H.*
Manager of Research Computing
Swarthmore College
Information Technology Services
(610) 328-8102
Schedule a meeting: https://calendly.com/jlsimms