[slurm-users] What is the 2^32-1 values in "stepd_connect to <jobid>.4294967295 failed" telling you

Christopher Samuel chris at csamuel.org
Fri Mar 8 18:16:13 UTC 2019


On 3/8/19 12:25 AM, Kevin Buckley wrote:

> error: stepd_connect to <jobid>.1 failed: No such file or directory
> error: stepd_connect to <jobid>.4294967295 failed: No such file or directory
> 
> We can imagine why a job that got killed in step 0 might still be looking
> for the <jobid>.1 step but the <jobid>.2^32-1 is beyond our imagination.

From memory, that's the internal representation of the extern step.
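
To illustrate: 4294967295 is 0xFFFFFFFF, the largest value an unsigned 32-bit step ID can hold, which makes it a convenient sentinel for a step that is not a normal numbered one. The sketch below decodes the "<jobid>.<stepid>" field from such a log line. The sentinel table is an assumption based on my recollection of slurm.h from around this time (later releases may use different values), and describe_step is a hypothetical helper, not part of Slurm, so check the header for your own version.

```python
# Sentinel step IDs: ASSUMED from memory of slurm.h of this era, not verified
# against any particular release. Check your own slurm.h before relying on them.
ASSUMED_SPECIAL_STEPS = {
    0xFFFFFFFE: "batch",   # assumed: the batch script step
    0xFFFFFFFF: "extern",  # assumed: the extern step (2^32 - 1)
}


def describe_step(job_step: str) -> str:
    """Turn '<jobid>.<stepid>' into a readable form (hypothetical helper)."""
    job_id, _, step = job_step.partition(".")
    step_id = int(step)
    # Fall back to the plain number for ordinary steps such as 0, 1, 2, ...
    name = ASSUMED_SPECIAL_STEPS.get(step_id, str(step_id))
    return f"{job_id}.{name}"


if __name__ == "__main__":
    for field in ("1234.1", "1234.4294967295"):
        print(field, "->", describe_step(field))
```

Under those assumptions, an ordinary step such as "1234.1" comes back unchanged, while "1234.4294967295" is reported as the extern step.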

All the best,
Chris
-- 
   Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA


