Hi

What do you mean by "you cannot have jobs from two partitions running concurrently
on the same node"?
For example, right now the node btc-dfa-gpu-02 is running jobs from both the qst and
the onlycpus-opp partitions:

[sgaravat@cld-ter-ui-01 ~]$ squeue | grep btc-dfa
            283558 onlycpus- myscript sgaravat  R       0:10      1 btc-dfa-gpu-02
            283559 onlycpus- myscript sgaravat  R       0:10      1 btc-dfa-gpu-02
            283560 onlycpus- myscript sgaravat  R       0:10      1 btc-dfa-gpu-02
            283561 onlycpus- myscript sgaravat  R       0:10      1 btc-dfa-gpu-02
            283562 onlycpus- myscript sgaravat  R       0:10      1 btc-dfa-gpu-02
            283563 onlycpus- myscript sgaravat  R       0:10      1 btc-dfa-gpu-02
            283382       qst morun_ci   barone  R 1-23:37:36      1 btc-dfa-gpu-02
            283383       qst morun_ci   barone  R 1-23:37:36      1 btc-dfa-gpu-02
            283388       qst morun_mv   barone  R 1-23:37:36      1 btc-dfa-gpu-02
            283381       qst morun_ci   barone  R 1-23:37:37      1 btc-dfa-gpu-02
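
For what it's worth, the node's partition membership can also be read straight from
the node record; something like

[sgaravat@cld-ter-ui-01 ~]$ scontrol show node btc-dfa-gpu-02 | grep Partitions

should list both qst and onlycpus-opp on the Partitions= line (just as cld-ter-gpu-01
below shows Partitions=gpus,onlycpus-opp).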


Cheers, Massimo

On Mon, Apr 13, 2026 at 2:18 PM Diego Zuccato via slurm-users <slurm-users@lists.schedmd.com> wrote:
IIRC, you cannot have jobs from two partitions running concurrently on
the same node; the requested resources are irrelevant. It seems a node can
only be in a single partition at a time.

Diego

On 13/04/26 13:02, Massimo Sgaravatto via slurm-users wrote:
> Dear all
>
> I (try to) manage a Slurm cluster composed of some CPU-only nodes and
> some worker nodes which also have GPUs:
>
> NodeName=cld-ter-[01-06] Sockets=2 CoresPerSocket=96 ThreadsPerCore=2 RealMemory=1536000 State=UNKNOWN
> NodeName=cld-ter-gpu-[01-05] Sockets=2 CoresPerSocket=96 ThreadsPerCore=2 Gres=gpu:nvidia-h100:4 RealMemory=1536000 State=UNKNOWN
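>
> (As a sanity check on these definitions, the hardware that slurmd actually
> detects on a node can be printed with "slurmd -C" and compared against the
> Sockets / CoresPerSocket / ThreadsPerCore / RealMemory values above.)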
>
> The GPU nodes are exposed through multiple partitions:
>
>
> PartitionName=gpus Nodes=cld-ter-gpu-[01-02] State=UP PriorityTier=20
> PartitionName=sparch Nodes=cld-ter-gpu-03 AllowAccounts=sparch,operators QoS=sparch State=UP PriorityTier=20
> PartitionName=geant4 Nodes=cld-ter-gpu-03 AllowAccounts=geant4,operators QoS=geant4 State=UP PriorityTier=20
> PartitionName=enipred Nodes=cld-ter-gpu-04 AllowAccounts=enipred,operators QoS=enipred State=UP PriorityTier=20
> PartitionName=enipiml Nodes=cld-ter-gpu-05 AllowAccounts=enipiml,operators QoS=enipiml State=UP PriorityTier=20
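>
> (For reference, the complete node-to-partition mapping can be dumped with
> something like
>
> sinfo -N -o "%N %P"
>
> which prints one line per node/partition pair, so a node that sits in
> several partitions appears once per partition.)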
>
>
>
> We also set up a partition to allow CPU-only jobs on the GPU nodes, but
> these jobs should be preempted (killed and requeued) if jobs submitted
> to higher-priority partitions require those resources:
>
>
>
> PreemptType=preempt/partition_prio
> PreemptMode=REQUEUE
> PartitionName=onlycpus-opp Nodes=cld-ter-gpu-[01-05],cld-dfa-gpu-06,btc-dfa-gpu-02 State=UP PriorityTier=10
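>
> (The effective settings can be double-checked per partition, e.g.
>
> scontrol show partition onlycpus-opp | grep -E 'PriorityTier|PreemptMode'
>
> should report PriorityTier=10 and the PreemptMode inherited from the
> cluster-wide setting, if I remember the output fields correctly.)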
>
> Now, I don't understand why this job [*], submitted to the onlycpus-opp
> partition, can't start running on e.g. cld-ter-gpu-01, since that node has
> a lot of free resources:
>
> [sgaravat@cld-ter-ui-01 ~]$ scontrol show node cld-ter-gpu-01
> NodeName=cld-ter-gpu-01 Arch=x86_64 CoresPerSocket=96
>     CPUAlloc=8 CPUEfctv=384 CPUTot=384 CPULoad=5.93
>     AvailableFeatures=(null)
>     ActiveFeatures=(null)
>     Gres=gpu:nvidia-h100:4
>     NodeAddr=cld-ter-gpu-01 NodeHostName=cld-ter-gpu-01 Version=25.11.3
>     OS=Linux 5.14.0-611.45.1.el9_7.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Apr 1 05:56:53 EDT 2026
>     RealMemory=1536000 AllocMem=560000 FreeMem=1192357 Sockets=2 Boards=1
>     State=MIXED+PLANNED ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A
> MCS_label=N/A
>     Partitions=gpus,onlycpus-opp
>     BootTime=2026-04-09T10:39:35 SlurmdStartTime=2026-04-09T10:40:01
>     LastBusyTime=2026-04-09T11:54:46 ResumeAfterTime=None
>     CfgTRES=cpu=384,mem=1500G,billing=839,gres/gpu=4,gres/gpu:nvidia-h100=4
>     AllocTRES=cpu=8,mem=560000M,gres/gpu=4,gres/gpu:nvidia-h100=4
>     CurrentWatts=0 AveWatts=0
>
>
> I guess the "MIXED+PLANNED" state is the answer, but as far as I can see
> only one job (283469) is planned for this worker node:
>
> [sgaravat@cld-ter-ui-01 ~]$ squeue --start | grep ter-gpu-01
>              JOBID PARTITION     NAME     USER ST          START_TIME  NODES SCHEDNODES      NODELIST(REASON)
>             283469      gpus vllm-pod ciangott PD 2026-04-13T14:31:40      1 cld-ter-gpu-01  (Resources)
>
> But job 283469 doesn't require that many resources [**], so the two jobs
> could run together (see the rough numbers just below). Why can't job 283534
> start there? Any hints?
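>
> (Counting from the two job records: 283534 asks for 1 CPU and 100G, and
> 283469 for 32 CPUs, ~190G and 2 GPUs, so together that would be 33 of the
> node's 384 CPUs and roughly 290G of its 1500G, assuming I'm reading the
> Req/AllocTRES fields correctly.)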
>
> Thanks, Massimo
>
>
>
> [*]
>
> [sgaravat@cld-ter-ui-01 ~]$ scontrol show job=283534
> JobId=283534 JobName=myscript.sh
>     UserId=sgaravat(5008) GroupId=tbadmin(5001) MCS_label=N/A
>     Priority=542954 Nice=0 Account=operators QOS=normal
>     JobState=RUNNING Reason=None Dependency=(null)
>     Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
>     RunTime=00:00:41 TimeLimit=1-00:00:00 TimeMin=N/A
>     SubmitTime=2026-04-13T11:10:13 EligibleTime=2026-04-13T11:10:13
>     AccrueTime=2026-04-13T11:10:13
>     StartTime=2026-04-13T11:58:39 EndTime=2026-04-14T11:58:39 Deadline=N/A
>     PreemptEligibleTime=2026-04-13T11:58:39 PreemptTime=None
>     SuspendTime=None SecsPreSuspend=0 LastSchedEval=2026-04-13T11:58:39
> Scheduler=Backfill
>     Partition=onlycpus-opp AllocNode:Sid=cld-ter-ui-01:3035857
>     ReqNodeList=(null) ExcNodeList=(null)
>     NodeList=btc-dfa-gpu-02
>     BatchHost=btc-dfa-gpu-02
>     NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
>     ReqTRES=cpu=1,mem=100G,node=1,billing=26
>     AllocTRES=cpu=1,mem=100G,node=1,billing=26
>     Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
>     MinCPUsNode=1 MinMemoryNode=100G MinTmpDiskNode=0
>     Features=(null) DelayBoot=00:00:00
>     OverSubscribe=OK Contiguous=0 Licenses=(null) LicensesAlloc=(null)
> Network=(null)
>     Command=/shared/home/sgaravat/myscript.sh
>     SubmitLine=sbatch myscript.sh
>     WorkDir=/shared/home/sgaravat
>     StdErr=/shared/home/sgaravat/JOB-myscript.sh.283534.4294967294.err
>     StdIn=/dev/null
>     StdOut=/shared/home/sgaravat/JOB-myscript.sh.283534.4294967294.out
>     MailUser=massimo.sgaravatto@pd.infn.it
>     MailType=INVALID_DEPEND,BEGIN,END,FAIL,REQUEUE,STAGE_OUT
>
> [**]
> [sgaravat@cld-ter-ui-01 ~]$ scontrol show job=283469
> JobId=283469 JobName=vllm-pod
>     UserId=ciangott(6054) GroupId=tbuser(6000) MCS_label=N/A
>     Priority=499703 Nice=0 Account=cms QOS=normal
>     JobState=PENDING Reason=Resources Dependency=(null)
>     Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
>     RunTime=00:00:00 TimeLimit=1-00:00:00 TimeMin=N/A
>     SubmitTime=2026-04-13T06:48:37 EligibleTime=2026-04-13T06:48:37
>     AccrueTime=2026-04-13T06:48:37
>     StartTime=2026-04-13T14:31:40 EndTime=2026-04-14T14:31:40 Deadline=N/A
>     SuspendTime=None SecsPreSuspend=0 LastSchedEval=2026-04-13T11:59:48
> Scheduler=Main
>     Partition=gpus AllocNode:Sid=cld-ter-ui-01:3015801
>     ReqNodeList=(null) ExcNodeList=(null)
>     NodeList= SchedNodeList=cld-ter-gpu-01
>     NumNodes=1-1 NumCPUs=32 NumTasks=1 CPUs/Task=32 ReqB:S:C:T=0:0:*:*
>     ReqTRES=cpu=32,mem=190734M,node=1,billing=118,gres/gpu=2,gres/gpu:nvidia-h100=2
>     AllocTRES=(null)
>     Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
>     MinCPUsNode=32 MinMemoryNode=190734M MinTmpDiskNode=0
>     Features=(null) DelayBoot=00:00:00
>     OverSubscribe=OK Contiguous=0 Licenses=(null) LicensesAlloc=(null)
> Network=(null)
>     Command=.interlink/jobs/default-0c0257f8-d1ea-4135-a602-96c229ce8516/job.slurm
>     SubmitLine=sbatch .interlink/jobs/default-0c0257f8-d1ea-4135-a602-96c229ce8516/job.slurm
>     WorkDir=/shared/home/ciangott
>     StdErr=
>     StdIn=/dev/null
>     StdOut=/shared/home/ciangott/.interlink/jobs/default-0c0257f8-d1ea-4135-a602-96c229ce8516/job.out
>     TresPerNode=gres/gpu:nvidia-h100:2
>     TresPerTask=cpu=32
>
>
>

--
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786
