[slurm-users] backfill scheduler does not work for heterogeneous jobs (version 17.11)
Kenneth Roberts
kroberts at materialsdesign.com
Fri Nov 30 09:43:17 MST 2018
The heterogeneous job support page lists some limitations that mention backfill scheduling:
https://slurm.schedmd.com/heterogeneous_jobs.html#limitations
Maybe there is some information there that helps?
Ken
From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of Ana Jokanovic
Sent: Thursday, November 29, 2018 4:28 AM
To: slurm-users at lists.schedmd.com
Subject: [slurm-users] backfill scheduler does not work for heterogeneous jobs (version 17.11)
Hello,
I ran a simple test, submitting a workload of three jobs (see below) on a cluster of 5 nodes:
sbatch --cpus-per-task=2 --ntasks=6 --time=15 : --cpus-per-task=2 --ntasks=6 --time=15 : --cpus-per-task=2 --ntasks=6 --time=15
sbatch --cpus-per-task=2 --ntasks=6 --time=15 : --cpus-per-task=2 --ntasks=6 --time=15 : --cpus-per-task=2 --ntasks=6 --time=15
sleep 5;
sbatch --ntasks=1 --time=2 : --ntasks=1 --time=1
I would expect the third submitted job to be backfilled, but that does not happen.
Here is the job completion log:
JobId=2 UserId=3113 GroupId=8950 Name=sleep JobState=COMPLETED Partition=debug TimeLimit=00:15:00 SubmitTime=1543317694 StartTime=1543317714 EndTime=1543317774 NodeList=s19r2b09 NodeCnt=1 ProcCnt=48
JobId=3 UserId=3113 GroupId=8950 Name=sleep JobState=COMPLETED Partition=debug TimeLimit=00:15:00 SubmitTime=1543317694 StartTime=1543317714 EndTime=1543317774 NodeList=s19r2b10 NodeCnt=1 ProcCnt=48
JobId=4 UserId=3113 GroupId=8950 Name=sleep JobState=COMPLETED Partition=debug TimeLimit=00:15:00 SubmitTime=1543317694 StartTime=1543317714 EndTime=1543317774 NodeList=s19r2b12 NodeCnt=1 ProcCnt=48
JobId=8 UserId=3113 GroupId=8950 Name=sleep JobState=COMPLETED Partition=debug TimeLimit=00:02:00 SubmitTime=1543317699 StartTime=1543317804 EndTime=1543317824 NodeList=s19r2b14 NodeCnt=1 ProcCnt=48
JobId=9 UserId=3113 GroupId=8950 Name=sleep JobState=COMPLETED Partition=debug TimeLimit=00:01:00 SubmitTime=1543317699 StartTime=1543317804 EndTime=1543317824 NodeList=s19r2b16 NodeCnt=1 ProcCnt=48
JobId=5 UserId=3113 GroupId=8950 Name=sleep JobState=COMPLETED Partition=debug TimeLimit=00:15:00 SubmitTime=1543317694 StartTime=1543317804 EndTime=1543317864 NodeList=s19r2b09 NodeCnt=1 ProcCnt=48
JobId=6 UserId=3113 GroupId=8950 Name=sleep JobState=COMPLETED Partition=debug TimeLimit=00:15:00 SubmitTime=1543317694 StartTime=1543317804 EndTime=1543317864 NodeList=s19r2b10 NodeCnt=1 ProcCnt=48
JobId=7 UserId=3113 GroupId=8950 Name=sleep JobState=COMPLETED Partition=debug TimeLimit=00:15:00 SubmitTime=1543317694 StartTime=1543317804 EndTime=1543317864 NodeList=s19r2b12 NodeCnt=1 ProcCnt=48
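To make the timing in the log easier to read, here is a minimal Python sketch that computes the wait time and run time of each record. The helper names (parse_record, summarize) are hypothetical, only a subset of the fields from the log above is reproduced, and it assumes that JobIds 2-4, 5-7 and 8-9 are the components of the first, second and third submitted job, which is inferred from the submit times rather than stated in the log.

```python
# Fields copied from the job completion log above; the remaining
# fields (UserId, Partition, TimeLimit, ...) are omitted for brevity.
LOG = """\
JobId=2 SubmitTime=1543317694 StartTime=1543317714 EndTime=1543317774 NodeList=s19r2b09
JobId=3 SubmitTime=1543317694 StartTime=1543317714 EndTime=1543317774 NodeList=s19r2b10
JobId=4 SubmitTime=1543317694 StartTime=1543317714 EndTime=1543317774 NodeList=s19r2b12
JobId=8 SubmitTime=1543317699 StartTime=1543317804 EndTime=1543317824 NodeList=s19r2b14
JobId=9 SubmitTime=1543317699 StartTime=1543317804 EndTime=1543317824 NodeList=s19r2b16
JobId=5 SubmitTime=1543317694 StartTime=1543317804 EndTime=1543317864 NodeList=s19r2b09
JobId=6 SubmitTime=1543317694 StartTime=1543317804 EndTime=1543317864 NodeList=s19r2b10
JobId=7 SubmitTime=1543317694 StartTime=1543317804 EndTime=1543317864 NodeList=s19r2b12
"""


def parse_record(line):
    """Split one 'Key=Value Key=Value ...' log line into a dict."""
    return dict(field.split("=", 1) for field in line.split())


def summarize(log_text):
    """Return (job_id, wait_seconds, run_seconds, node) for each record."""
    rows = []
    for line in log_text.strip().splitlines():
        rec = parse_record(line)
        submit = int(rec["SubmitTime"])
        start = int(rec["StartTime"])
        end = int(rec["EndTime"])
        rows.append((int(rec["JobId"]), start - submit, end - start, rec["NodeList"]))
    return rows


if __name__ == "__main__":
    for job_id, wait, runtime, node in summarize(LOG):
        print(f"JobId={job_id:<2} wait={wait:>3}s run={runtime:>3}s node={node}")
```

Under these assumptions, records 8 and 9 waited 105 seconds and started at 1543317804, the same time as the second job (5-7), even though their nodes (s19r2b14 and s19r2b16) do not appear in any record of the first job.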
Would you expect this behavior?
Thanks.
Best regards,
Ana
--
Ana Jokanovic, PhD
Barcelona Supercomputing Center
c/ Jordi Girona 1-3, K2M Building, 1st floor
08034 Barcelona - SPAIN
e-mail: anaj82 at gmail.com or ana.jokanovic at bsc.es
tel: +34 93 4137246