[slurm-users] backfill scheduler does not work for heterogeneous jobs (version 17.11)

Kenneth Roberts kroberts at materialsdesign.com
Fri Nov 30 09:43:17 MST 2018


The Limitations section of the heterogeneous job support page mentions backfill.

 

https://slurm.schedmd.com/heterogeneous_jobs.html#limitations

 

Maybe there’s some information there to help?

 

Ken

 

From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of Ana Jokanovic
Sent: Thursday, November 29, 2018 4:28 AM
To: slurm-users at lists.schedmd.com
Subject: [slurm-users] backfill scheduler does not work for heterogeneous jobs (version 17.11)

 

 

 

Hello,

 

I ran a simple test, submitting a workload of three jobs (see below) on a cluster of 5 nodes:

 

sbatch --cpus-per-task=2 --ntasks=6 --time=15 : --cpus-per-task=2 --ntasks=6 --time=15 : --cpus-per-task=2 --ntasks=6 --time=15

sbatch --cpus-per-task=2 --ntasks=6 --time=15 : --cpus-per-task=2 --ntasks=6 --time=15 : --cpus-per-task=2 --ntasks=6 --time=15

sleep 5;

sbatch --ntasks=1 --time=2 : --ntasks=1 --time=1

 

I would expect the third submitted job to be backfilled, but that does not happen.

Here is the job completion log:

 

JobId=2 UserId=3113 GroupId=8950 Name=sleep JobState=COMPLETED Partition=debug TimeLimit=00:15:00 SubmitTime=1543317694 StartTime=1543317714 EndTime=1543317774 NodeList=s19r2b09 NodeCnt=1 ProcCnt=48 

JobId=3 UserId=3113 GroupId=8950 Name=sleep JobState=COMPLETED Partition=debug TimeLimit=00:15:00 SubmitTime=1543317694 StartTime=1543317714 EndTime=1543317774 NodeList=s19r2b10 NodeCnt=1 ProcCnt=48 

JobId=4 UserId=3113 GroupId=8950 Name=sleep JobState=COMPLETED Partition=debug TimeLimit=00:15:00 SubmitTime=1543317694 StartTime=1543317714 EndTime=1543317774 NodeList=s19r2b12 NodeCnt=1 ProcCnt=48 

JobId=8 UserId=3113 GroupId=8950 Name=sleep JobState=COMPLETED Partition=debug TimeLimit=00:02:00 SubmitTime=1543317699 StartTime=1543317804 EndTime=1543317824 NodeList=s19r2b14 NodeCnt=1 ProcCnt=48 

JobId=9 UserId=3113 GroupId=8950 Name=sleep JobState=COMPLETED Partition=debug TimeLimit=00:01:00 SubmitTime=1543317699 StartTime=1543317804 EndTime=1543317824 NodeList=s19r2b16 NodeCnt=1 ProcCnt=48 

JobId=5 UserId=3113 GroupId=8950 Name=sleep JobState=COMPLETED Partition=debug TimeLimit=00:15:00 SubmitTime=1543317694 StartTime=1543317804 EndTime=1543317864 NodeList=s19r2b09 NodeCnt=1 ProcCnt=48 

JobId=6 UserId=3113 GroupId=8950 Name=sleep JobState=COMPLETED Partition=debug TimeLimit=00:15:00 SubmitTime=1543317694 StartTime=1543317804 EndTime=1543317864 NodeList=s19r2b10 NodeCnt=1 ProcCnt=48 

JobId=7 UserId=3113 GroupId=8950 Name=sleep JobState=COMPLETED Partition=debug TimeLimit=00:15:00 SubmitTime=1543317694 StartTime=1543317804 EndTime=1543317864 NodeList=s19r2b12 NodeCnt=1 ProcCnt=48 
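
The epoch timestamps in this log are easier to compare as offsets from the first submission. The following is a minimal Python sketch, not part of the original test: the records are copied from the log above and trimmed to the fields used, and the helper names (parse_record, summarize) are hypothetical, chosen only for illustration.

```python
# Records copied from the job completion log above, trimmed to the fields used here.
LOG = """\
JobId=2 SubmitTime=1543317694 StartTime=1543317714 EndTime=1543317774 NodeList=s19r2b09
JobId=3 SubmitTime=1543317694 StartTime=1543317714 EndTime=1543317774 NodeList=s19r2b10
JobId=4 SubmitTime=1543317694 StartTime=1543317714 EndTime=1543317774 NodeList=s19r2b12
JobId=8 SubmitTime=1543317699 StartTime=1543317804 EndTime=1543317824 NodeList=s19r2b14
JobId=9 SubmitTime=1543317699 StartTime=1543317804 EndTime=1543317824 NodeList=s19r2b16
JobId=5 SubmitTime=1543317694 StartTime=1543317804 EndTime=1543317864 NodeList=s19r2b09
JobId=6 SubmitTime=1543317694 StartTime=1543317804 EndTime=1543317864 NodeList=s19r2b10
JobId=7 SubmitTime=1543317694 StartTime=1543317804 EndTime=1543317864 NodeList=s19r2b12
"""


def parse_record(line):
    """Split one 'Key=Value Key=Value ...' record into a dict of strings."""
    return dict(field.split("=", 1) for field in line.split())


def summarize(log_text):
    """Return one dict per job, with times in seconds since the earliest submission."""
    records = [parse_record(line) for line in log_text.splitlines() if line.strip()]
    t0 = min(int(r["SubmitTime"]) for r in records)
    return [
        {
            "job": int(r["JobId"]),
            "node": r["NodeList"],
            "submit": int(r["SubmitTime"]) - t0,
            "start": int(r["StartTime"]) - t0,
            "end": int(r["EndTime"]) - t0,
        }
        for r in records
    ]


rows = summarize(LOG)
for row in rows:
    wait = row["start"] - row["submit"]
    print(f"job {row['job']:>2} on {row['node']}: "
          f"submit +{row['submit']}s, start +{row['start']}s, "
          f"end +{row['end']}s, waited {wait}s")
```

Read this way, jobs 8 and 9 wait 105 seconds and start in the same second as jobs 5 to 7, on nodes (s19r2b14 and s19r2b16) that no earlier job in this log used.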

 

Would you expect this behavior?

 

Thanks.

 

Best regards,

Ana

-- 

Ana Jokanovic, PhD
Barcelona Supercomputing Center
c/ Jordi Girona 1-3, K2M Building, 1st floor
08034 Barcelona - SPAIN
e-mail: anaj82 at gmail.com <mailto:anaj82 at gmail.com>  or ana.jokanovic at bsc.es <mailto:ana.jokanovic at bsc.es> 
tel: +34 93 4137246




 

