[slurm-users] %x in job names

Bill Barth bbarth at tacc.utexas.edu
Fri May 28 20:03:32 UTC 2021


We noticed today that a %x anywhere in a job name like 

#SBATCH -J abcdefghijklmnopqrstuvw%xyz

will send scontrol (and maybe other %x-respecting programs) into an infinite loop. We had a user's cron job launching 'scontrol show job ######' regularly on a system, and it was going off the rails and eating resources until we killed it. The Slurm version 18.08.4 release email says that

-- Expand %x in job name in 'scontrol show job'.

...so I wonder whether that is armored against self-referential expansion. I haven't looked at the code myself, but I thought I'd give a heads up. I don't think our user was being malicious, and their actual -J was

#SBATCH -J sd-PBEpvw9040%x

Probably a hash and probably machine-generated/unlucky. 
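
To illustrate the kind of self-referential expansion I suspect, here is a small Python sketch. It is not Slurm's code (which I haven't read), and the function names expand_naive and expand_single_pass are made up for this example. It assumes %x stands for the job name, and that the looping behavior comes from re-scanning text that has already been substituted. The max_passes guard exists only so the sketch terminates.

```python
def expand_naive(pattern, job_name, max_passes=1000):
    """Hypothetical expander that re-scans its own output.

    If job_name itself contains "%x", every substitution re-introduces
    a "%x", so the loop never finishes without the max_passes guard.
    """
    out = pattern
    passes = 0
    while "%x" in out:
        if passes >= max_passes:
            raise RuntimeError("expansion did not terminate")
        # Replace only the first "%x", then scan the whole string again.
        out = out.replace("%x", job_name, 1)
        passes += 1
    return out


def expand_single_pass(pattern, job_name):
    """Hypothetical expander that scans the pattern once, left to right.

    Substituted text is appended to the output and never re-scanned,
    so a "%x" inside job_name is copied through literally.
    """
    pieces = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("%x", i):
            pieces.append(job_name)
            i += 2
        else:
            pieces.append(pattern[i])
            i += 1
    return "".join(pieces)


if __name__ == "__main__":
    name = "sd-PBEpvw9040%x"
    print(expand_single_pass(name, name))
    try:
        expand_naive(name, name, max_passes=50)
    except RuntimeError as err:
        print("naive expander:", err)
```

With the job name above, the single-pass version returns after one substitution and leaves the inner %x as literal text, while the naive version grows the string on every pass until the guard trips.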

I hope this helps and is actually a problem report. We're on 18.08.5, so I hope we don't have to go backwards to stop this error.

Best regards,
Bill.

-- 
Bill Barth, Ph.D., Director, Future Technologies
bbarth at tacc.utexas.edu        |   Phone: (512) 232-7069
Office: ROC 1.435            |   Fax:   (512) 475-9445
 
 


