[slurm-users] Re: Why my job can't start (backfill reservation issue)

13 Apr 2026

Hi Diego,

I believe that a node may run jobs from multiple partitions at the same 
time.  Example of a node in our cluster:

$ sinfo -n sd652
PARTITION      AVAIL  TIMELIMIT  NODES  STATE NODELIST
(lines deleted)
a100_week         up 7-00:00:00      1  alloc sd652
a100              up 2-02:00:00      1  alloc sd652

I believe this was always the case (we're running Slurm 25.11.4).
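As a rough illustration, the two sinfo rows above can be grouped by node with a few lines of Python. This is only a sketch: the helper name partitions_by_node is made up for this example, and it assumes one plain node name per row (no bracketed ranges). It shows partition membership only and says nothing about which jobs are actually running.

```python
from collections import defaultdict

# Rows copied from the sinfo output above (header and deleted lines omitted).
SINFO_ROWS = """\
a100_week         up 7-00:00:00      1  alloc sd652
a100              up 2-02:00:00      1  alloc sd652
"""


def partitions_by_node(text):
    """Map each NODELIST entry to the partitions that list it.

    Hypothetical helper: assumes six whitespace-separated columns and a
    single plain node name in the last one.
    """
    result = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed rows
        partition, _avail, _limit, _count, _state, node = fields
        result[node].append(partition)
    return dict(result)


print(partitions_by_node(SINFO_ROWS))
```

Here sd652 appears under both a100_week and a100, which matches the point that one node can belong to several partitions at once.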

Best regards,
Ole

On 4/13/26 13:54, Diego Zuccato via slurm-users wrote:
...
IIRC, you cannot have jobs from two partitions running concurrently on 
the same node; the requested resources are irrelevant. It seems a node can 
only be in a single partition at a time.
Diego
On 13/04/26 13:02, Massimo Sgaravatto via slurm-users wrote:
...
Dear all
I (try to) manage a Slurm cluster composed of some CPU-only nodes and 
some worker nodes that also have GPUs:
NodeName=cld-ter-[01-06] Sockets=2 CoresPerSocket=96 ThreadsPerCore=2 RealMemory=1536000 State=UNKNOWN
NodeName=cld-ter-gpu-[01-05] Sockets=2 CoresPerSocket=96 ThreadsPerCore=2 Gres=gpu:nvidia-h100:4 RealMemory=1536000 State=UNKNOWN
The GPU nodes are exposed through multiple partitions:
PartitionName=gpus Nodes=cld-ter-gpu-[01-02] State=UP PriorityTier=20
PartitionName=sparch Nodes=cld-ter-gpu-03 AllowAccounts=sparch,operators QoS=sparch State=UP PriorityTier=20
PartitionName=geant4 Nodes=cld-ter-gpu-03 AllowAccounts=geant4,operators QoS=geant4 State=UP PriorityTier=20
PartitionName=enipred Nodes=cld-ter-gpu-04 AllowAccounts=enipred,operators QoS=enipred State=UP PriorityTier=20
PartitionName=enipiml Nodes=cld-ter-gpu-05 AllowAccounts=enipiml,operators QoS=enipiml State=UP PriorityTier=20
We also set up a partition to allow CPU-only jobs on the GPU nodes, but 
these jobs should be preempted (killed and requeued) if jobs submitted 
to higher-priority partitions need those resources:
PreemptType=preempt/partition_prio
PreemptMode=REQUEUE
PartitionName=onlycpus-opp Nodes=cld-ter-gpu-[01-05],cld-dfa-gpu-06,btc-dfa-gpu-02 State=UP PriorityTier=10
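To make the intended preemption relationship explicit, here is a minimal Python sketch of the rule that preempt/partition_prio applies: jobs in a partition with a higher PriorityTier may preempt jobs in a partition with a lower one. The table is transcribed by hand from the partition lines above, and can_preempt is a hypothetical helper for this example, not a Slurm API. The shared-node check is a simplification of mine, and the sketch ignores everything else the scheduler takes into account.

```python
# (PriorityTier, nodes) per partition, transcribed from the config above.
PARTITIONS = {
    "gpus": (20, {"cld-ter-gpu-01", "cld-ter-gpu-02"}),
    "sparch": (20, {"cld-ter-gpu-03"}),
    "geant4": (20, {"cld-ter-gpu-03"}),
    "enipred": (20, {"cld-ter-gpu-04"}),
    "enipiml": (20, {"cld-ter-gpu-05"}),
    "onlycpus-opp": (10, {
        "cld-ter-gpu-01", "cld-ter-gpu-02", "cld-ter-gpu-03",
        "cld-ter-gpu-04", "cld-ter-gpu-05",
        "cld-dfa-gpu-06", "btc-dfa-gpu-02",
    }),
}


def can_preempt(high, low, partitions=PARTITIONS):
    """True if `high` has a strictly higher tier than `low` and shares a node.

    Simplified model for illustration only.
    """
    tier_high, nodes_high = partitions[high]
    tier_low, nodes_low = partitions[low]
    return tier_high > tier_low and bool(nodes_high & nodes_low)


for name in PARTITIONS:
    if can_preempt(name, "onlycpus-opp"):
        print(f"{name} may preempt onlycpus-opp")
```

With this table, every PriorityTier=20 partition may preempt onlycpus-opp, while sparch and geant4 (same tier, same node) cannot preempt each other.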
Now, I don't understand why this job [*], submitted to the onlycpus-opp 
partition, can't start running on, e.g., cld-ter-gpu-01, which has 
plenty of free resources:
[sgaravat@cld-ter-ui-01 ~]$ scontrol show node cld-ter-gpu-01
NodeName=cld-ter-gpu-01 Arch=x86_64 CoresPerSocket=96
    CPUAlloc=8 CPUEfctv=384 CPUTot=384 CPULoad=5.93
    AvailableFeatures=(null)
    ActiveFeatures=(null)
    Gres=gpu:nvidia-h100:4
    NodeAddr=cld-ter-gpu-01 NodeHostName=cld-ter-gpu-01 Version=25.11.3
    OS=Linux 5.14.0-611.45.1.el9_7.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Apr 1 05:56:53 EDT 2026
    RealMemory=1536000 AllocMem=560000 FreeMem=1192357 Sockets=2 Boards=1
    State=MIXED+PLANNED ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
    Partitions=gpus,onlycpus-opp
    BootTime=2026-04-09T10:39:35 SlurmdStartTime=2026-04-09T10:40:01
    LastBusyTime=2026-04-09T11:54:46 ResumeAfterTime=None
    CfgTRES=cpu=384,mem=1500G,billing=839,gres/gpu=4,gres/gpu:nvidia-h100=4
    AllocTRES=cpu=8,mem=560000M,gres/gpu=4,gres/gpu:nvidia-h100=4
    CurrentWatts=0 AveWatts=0
I guess the "MIXED+PLANNED" state is the answer, but as far as I can see only 
one job (283469) is planned for this worker node:
[sgaravat@cld-ter-ui-01 ~]$ squeue --start | grep ter-gpu-01
 JOBID PARTITION     NAME     USER ST          START_TIME  NODES SCHEDNODES     NODELIST(REASON)
283469      gpus vllm-pod ciangott PD 2026-04-13T14:31:40      1 cld-ter-gpu-01 (Resources)
But job 283469 doesn't require many resources [**], so the two jobs 
could run together. Why can't job 283534 start?
Any hints?
Thanks, Massimo
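The resource arithmetic behind this question can be written out as a small Python sketch. The numbers are copied from the scontrol show node output above and from the two scontrol show job outputs below ([*] and [**]); the helper names free and fits are made up for this example. It only subtracts allocated from configured resources and does not model backfill reservations, time limits, or any other scheduler logic.

```python
# Totals and current allocations on cld-ter-gpu-01 (CPUTot/CPUAlloc,
# RealMemory/AllocMem in MB, and gres/gpu from CfgTRES/AllocTRES).
node_total = {"cpu": 384, "mem_mb": 1536000, "gpu": 4}
node_alloc = {"cpu": 8, "mem_mb": 560000, "gpu": 4}

# Requests from ReqTRES of the two jobs; 100G is taken as 100 * 1024 MB.
jobs = {
    283534: {"cpu": 1, "mem_mb": 100 * 1024, "gpu": 0},
    283469: {"cpu": 32, "mem_mb": 190734, "gpu": 2},
}


def free(total, alloc):
    """Resources still unallocated on the node."""
    return {key: total[key] - alloc[key] for key in total}


def fits(request, available):
    """True if every requested resource is within what is available."""
    return all(request[key] <= available[key] for key in request)


available = free(node_total, node_alloc)
print(available)
for job_id, request in jobs.items():
    print(job_id, "fits now:", fits(request, available))
```

By these numbers, CPU and memory are not the constraint for either job. All four GPUs are currently allocated, which is consistent with job 283469 pending on Resources, while the CPU-only job 283534 would fit as far as raw resources go.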
[*]
[sgaravat@cld-ter-ui-01 ~]$ scontrol show job=283534
JobId=283534 JobName=myscript.sh
    UserId=sgaravat(5008) GroupId=tbadmin(5001) MCS_label=N/A
    Priority=542954 Nice=0 Account=operators QOS=normal
    JobState=RUNNING Reason=None Dependency=(null)
    Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
    RunTime=00:00:41 TimeLimit=1-00:00:00 TimeMin=N/A
    SubmitTime=2026-04-13T11:10:13 EligibleTime=2026-04-13T11:10:13
    AccrueTime=2026-04-13T11:10:13
    StartTime=2026-04-13T11:58:39 EndTime=2026-04-14T11:58:39 Deadline=N/A
    PreemptEligibleTime=2026-04-13T11:58:39 PreemptTime=None
    SuspendTime=None SecsPreSuspend=0 LastSchedEval=2026-04-13T11:58:39 Scheduler=Backfill
    Partition=onlycpus-opp AllocNode:Sid=cld-ter-ui-01:3035857
    ReqNodeList=(null) ExcNodeList=(null)
    NodeList=btc-dfa-gpu-02
    BatchHost=btc-dfa-gpu-02
    NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
    ReqTRES=cpu=1,mem=100G,node=1,billing=26
    AllocTRES=cpu=1,mem=100G,node=1,billing=26
    Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
    MinCPUsNode=1 MinMemoryNode=100G MinTmpDiskNode=0
    Features=(null) DelayBoot=00:00:00
    OverSubscribe=OK Contiguous=0 Licenses=(null) LicensesAlloc=(null) Network=(null)
    Command=/shared/home/sgaravat/myscript.sh
    SubmitLine=sbatch myscript.sh
    WorkDir=/shared/home/sgaravat
    StdErr=/shared/home/sgaravat/JOB-myscript.sh.283534.4294967294.err
    StdIn=/dev/null
    StdOut=/shared/home/sgaravat/JOB-myscript.sh.283534.4294967294.out
    MailUser=massimo.sgaravatto@pd.infn.it MailType=INVALID_DEPEND,BEGIN,END,FAIL,REQUEUE,STAGE_OUT
[**]
[sgaravat@cld-ter-ui-01 ~]$ scontrol show job=283469
JobId=283469 JobName=vllm-pod
    UserId=ciangott(6054) GroupId=tbuser(6000) MCS_label=N/A
    Priority=499703 Nice=0 Account=cms QOS=normal
    JobState=PENDING Reason=Resources Dependency=(null)
    Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
    RunTime=00:00:00 TimeLimit=1-00:00:00 TimeMin=N/A
    SubmitTime=2026-04-13T06:48:37 EligibleTime=2026-04-13T06:48:37
    AccrueTime=2026-04-13T06:48:37
    StartTime=2026-04-13T14:31:40 EndTime=2026-04-14T14:31:40 Deadline=N/A
    SuspendTime=None SecsPreSuspend=0 LastSchedEval=2026-04-13T11:59:48 Scheduler=Main
    Partition=gpus AllocNode:Sid=cld-ter-ui-01:3015801
    ReqNodeList=(null) ExcNodeList=(null)
    NodeList= SchedNodeList=cld-ter-gpu-01
    NumNodes=1-1 NumCPUs=32 NumTasks=1 CPUs/Task=32 ReqB:S:C:T=0:0:*:*
    ReqTRES=cpu=32,mem=190734M,node=1,billing=118,gres/gpu=2,gres/gpu:nvidia-h100=2
    AllocTRES=(null)
    Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
    MinCPUsNode=32 MinMemoryNode=190734M MinTmpDiskNode=0
    Features=(null) DelayBoot=00:00:00
    OverSubscribe=OK Contiguous=0 Licenses=(null) LicensesAlloc=(null) Network=(null)
    Command=.interlink/jobs/default-0c0257f8-d1ea-4135-a602-96c229ce8516/job.slurm
    SubmitLine=sbatch .interlink/jobs/default-0c0257f8-d1ea-4135-a602-96c229ce8516/job.slurm
    WorkDir=/shared/home/ciangott
    StdErr=
    StdIn=/dev/null
    StdOut=/shared/home/ciangott/.interlink/jobs/default-0c0257f8-d1ea-4135-a602-96c229ce8516/job.out
    TresPerNode=gres/gpu:nvidia-h100:2
    TresPerTask=cpu=32

Ole Holm Nielsen