<div dir="ltr"><div>Hi,</div><div>I'm also very interested in how this could be done properly. At the moment what we are doing is setting up partitions with MaxCPUsPerNode set to CPUs-GPUs. Maybe this can help you in the meanwhile, but this is a suboptimal solution (in fact we have nodes with different number of CPUs, so we had to make a partition per "node type"). Someone else can have a better idea.</div><div>Cheers,<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">El lun., 31 ago. 2020 a las 16:45, Manuel BERTRAND (<<a href="mailto:Manuel.Bertrand@lis-lab.fr">Manuel.Bertrand@lis-lab.fr</a>>) escribió:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi list,<br>

I am totally new to Slurm and have just deployed a heterogeneous GPU/CPU cluster by following the latest OpenHPC recipe on CentOS 8.2 (thanks, OpenHPC team, for making those!).

Everything works great so far, but now I would like to bind a specific core to each GPU on each node. By "bind" I mean making a particular core not assignable to a CPU-only job, so that the GPU stays usable whatever the CPU workload on the node. I'm asking because, as things stand, a CPU-only user can monopolize a whole node and prevent a GPU user from getting in: there is no CPU left even though the GPU is free. I'm not sure what the best way to enforce this is. Hope this is clear :)

Any help greatly appreciated!

Here is my gres.conf, cgroup.conf, and partition configuration, followed by the output of 'scontrol show config':

########### gres.conf ############
NodeName=gpunode1 Name=gpu  File=/dev/nvidia0
NodeName=gpunode1 Name=gpu  File=/dev/nvidia1
NodeName=gpunode1 Name=gpu  File=/dev/nvidia2
NodeName=gpunode1 Name=gpu  File=/dev/nvidia3
NodeName=gpunode2 Name=gpu  File=/dev/nvidia0
NodeName=gpunode2 Name=gpu  File=/dev/nvidia1
NodeName=gpunode2 Name=gpu  File=/dev/nvidia2
NodeName=gpunode3 Name=gpu  File=/dev/nvidia0
NodeName=gpunode3 Name=gpu  File=/dev/nvidia1
NodeName=gpunode3 Name=gpu  File=/dev/nvidia2
NodeName=gpunode3 Name=gpu  File=/dev/nvidia3
NodeName=gpunode3 Name=gpu  File=/dev/nvidia4
NodeName=gpunode3 Name=gpu  File=/dev/nvidia5
NodeName=gpunode3 Name=gpu  File=/dev/nvidia6
NodeName=gpunode3 Name=gpu  File=/dev/nvidia7
NodeName=gpunode4 Name=gpu  File=/dev/nvidia0
NodeName=gpunode4 Name=gpu  File=/dev/nvidia1
NodeName=gpunode5 Name=gpu  File=/dev/nvidia0
NodeName=gpunode5 Name=gpu  File=/dev/nvidia1
NodeName=gpunode5 Name=gpu  File=/dev/nvidia2
NodeName=gpunode5 Name=gpu  File=/dev/nvidia3
NodeName=gpunode5 Name=gpu  File=/dev/nvidia4
NodeName=gpunode5 Name=gpu  File=/dev/nvidia5
NodeName=gpunode6 Name=gpu  File=/dev/nvidia0
NodeName=gpunode6 Name=gpu  File=/dev/nvidia1
NodeName=gpunode6 Name=gpu  File=/dev/nvidia2
NodeName=gpunode6 Name=gpu  File=/dev/nvidia3
NodeName=gpunode7 Name=gpu  File=/dev/nvidia0
NodeName=gpunode7 Name=gpu  File=/dev/nvidia1
NodeName=gpunode7 Name=gpu  File=/dev/nvidia2
NodeName=gpunode7 Name=gpu  File=/dev/nvidia3
NodeName=gpunode8 Name=gpu  File=/dev/nvidia0
NodeName=gpunode8 Name=gpu  File=/dev/nvidia1
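
(Side note on the gres.conf above: each line can also take a Cores= field listing the CPU cores local to that GPU, which Slurm uses as a locality hint when it picks CPUs for a job that requested the GPU. It is only a hint, not a reservation, so by itself it will not keep CPU-only jobs off those cores. The core ranges below are invented; check the real topology with 'nvidia-smi topo -m' or 'lstopo' before writing anything like this:
NodeName=gpunode1 Name=gpu  File=/dev/nvidia0 Cores=0-9
NodeName=gpunode1 Name=gpu  File=/dev/nvidia1 Cores=10-19
...and so on for the remaining GPUs.)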

########### cgroup.conf ############
CgroupAutomount=yes
TaskAffinity=no
ConstrainCores=yes
ConstrainRAMSpace=yes
ConstrainSwapSpace=yes
ConstrainKmemSpace=no
ConstrainDevices=yes

########### partitions configuration ###########
PartitionName=cpu Nodes=cpunode1,cpunode2,cpunode3,cpunode4,cpunode5 Default=NO DefaultTime=60 MaxTime=168:00:00 State=UP
PartitionName=gpu Nodes=gpunode1,gpunode2,gpunode3,gpunode4,gpunode5,gpunode6,gpunode7,gpunode8 Default=NO DefaultTime=60 MaxTime=168:00:00 State=UP
PartitionName=all Nodes=ALL Default=YES DefaultTime=60 MaxTime=168:00:00 State=UP

########### Slurm configuration ###########
Configuration data as of 2020-08-31T16:23:54
AccountingStorageBackupHost = (null)
AccountingStorageEnforce = none
AccountingStorageHost   = sms.mycluster
AccountingStorageLoc    = N/A
AccountingStoragePort   = 6819
AccountingStorageTRES   = cpu,mem,energy,node,billing,fs/disk,vmem,pages
AccountingStorageType   = accounting_storage/slurmdbd
AccountingStorageUser   = N/A
AccountingStoreJobComment = No
AcctGatherEnergyType    = acct_gather_energy/none
AcctGatherFilesystemType = acct_gather_filesystem/none
AcctGatherInterconnectType = acct_gather_interconnect/none
AcctGatherNodeFreq      = 0 sec
AcctGatherProfileType   = acct_gather_profile/none
AllowSpecResourcesUsage = No
AuthAltTypes            = (null)
AuthInfo                = (null)
AuthType                = auth/munge
BatchStartTimeout       = 10 sec

EpilogMsgTime           = 2000 usec
EpilogSlurmctld         = (null)
ExtSensorsType          = ext_sensors/none
ExtSensorsFreq          = 0 sec
FederationParameters    = (null)
FirstJobId              = 1
GetEnvTimeout           = 2 sec
GresTypes               = gpu
GpuFreqDef              = high,memory=high
GroupUpdateForce        = 1
GroupUpdateTime         = 600 sec
HASH_VAL                = Match
HealthCheckInterval     = 300 sec
HealthCheckNodeState    = ANY
HealthCheckProgram      = /usr/sbin/nhc
InactiveLimit           = 0 sec
JobAcctGatherFrequency  = 30
JobAcctGatherType       = jobacct_gather/none
JobAcctGatherParams     = (null)
JobCompHost             = localhost
JobCompLoc              = /var/log/slurm_jobcomp.log
JobCompPort             = 0
JobCompType             = jobcomp/none
JobCompUser             = root
JobContainerType        = job_container/none
JobCredentialPrivateKey = (null)
JobCredentialPublicCertificate = (null)
JobDefaults             = (null)
JobFileAppend           = 0
JobRequeue              = 1
JobSubmitPlugins        = (null)
KeepAliveTime           = SYSTEM_DEFAULT
KillOnBadExit           = 0
KillWait                = 30 sec
LaunchParameters        = (null)
LaunchType              = launch/slurm
Layouts                 =
Licenses                = (null)
LogTimeFormat           = iso8601_ms
MailDomain              = (null)
MailProg                = /usr/bin/mail
MaxArraySize            = 1001
MaxDBDMsgs              = 20052
MaxJobCount             = 10000
MaxJobId                = 67043328
MaxMemPerNode           = UNLIMITED
MaxStepCount            = 40000

PropagateResourceLimits = (null)
PropagateResourceLimitsExcept = MEMLOCK
RebootProgram           = /sbin/reboot
ReconfigFlags           = (null)
RequeueExit             = (null)
RequeueExitHold         = (null)
ResumeFailProgram       = (null)
ResumeProgram           = (null)
ResumeRate              = 300 nodes/min
ResumeTimeout           = 600 sec
ResvEpilog              = (null)
ResvOverRun             = 0 min
ResvProlog              = (null)
ReturnToService         = 2
RoutePlugin             = route/default
SallocDefaultCommand    = (null)
SbcastParameters        = (null)
SchedulerParameters     = (null)
SchedulerTimeSlice      = 30 sec
SchedulerType           = sched/backfill
SelectType              = select/cons_tres
SelectTypeParameters    = CR_CORE
SlurmUser               = slurm(202)
SlurmctldAddr           = (null)
SlurmctldDebug          = debug2
SlurmctldHost[0]        = sms.mycluster
SlurmctldLogFile        = /var/log/slurmctld.log
SlurmctldPort           = 6817
SlurmctldSyslogDebug    = unknown
SlurmctldPrimaryOffProg = (null)
SlurmctldPrimaryOnProg  = (null)
SlurmctldTimeout        = 300 sec
SlurmctldParameters     = enable_configless
SlurmdDebug             = debug2
SlurmdLogFile           = /var/log/slurmd.log
SlurmdParameters        = (null)
SlurmdPidFile           = /var/run/slurmd.pid
SlurmdPort              = 6818
SlurmdSpoolDir          = /var/spool/slurm/d
SlurmdSyslogDebug       = unknown
SlurmdTimeout           = 300 sec
SlurmdUser              = root(0)
SlurmSchedLogFile       = (null)
SlurmSchedLogLevel      = 0
SlurmctldPidFile        = /var/run/slurmctld.pid
SlurmctldPlugstack      = (null)
SLURM_CONF              = /etc/slurm/slurm.conf

SrunPortRange           = 0-0
SrunProlog              = (null)
StateSaveLocation       = /var/spool/slurm/ctld
SuspendExcNodes         = (null)
SuspendExcParts         = (null)
SuspendProgram          = (null)
SuspendRate             = 60 nodes/min
SuspendTime             = NONE
SuspendTimeout          = 30 sec
SwitchType              = switch/none
TaskEpilog              = (null)
TaskPlugin              = task/affinity,task/cgroup
TaskPluginParam         = (null type)
TaskProlog              = (null)
TCPTimeout              = 2 sec
TmpFS                   = /scratch
TopologyParam           = (null)
TopologyPlugin          = topology/none
TrackWCKey              = No
TreeWidth               = 50
UsePam                  = No
UnkillableStepProgram   = (null)
UnkillableStepTimeout   = 60 sec
VSizeFactor             = 0 percent
WaitTime                = 0 sec
X11Parameters           = (null)

Cgroup Support Configuration:
AllowedDevicesFile      = /etc/slurm/cgroup_allowed_devices_file.conf
AllowedKmemSpace        = (null)
AllowedRAMSpace         = 100.0%
AllowedSwapSpace        = 0.0%
CgroupAutomount         = yes
CgroupMountpoint        = /sys/fs/cgroup
ConstrainCores          = yes
ConstrainDevices        = yes
ConstrainKmemSpace      = no
ConstrainRAMSpace       = yes
ConstrainSwapSpace      = yes
MaxKmemPercent          = 100.0%
MaxRAMPercent           = 100.0%
MaxSwapPercent          = 100.0%
MemorySwappiness        = (null)
MinKmemSpace            = 30 MB
MinRAMSpace             = 30 MB
TaskAffinity            = no

Slurmctld(primary) at sms.mycluster is UP

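########### sketch: partitions with MaxCPUsPerNode ###########
This is only a rough illustration of the workaround described at the top of this mail, not a tested config: the node groupings and core counts below are made up, so adjust them to your hardware. The idea is that CPU-only jobs have to go through a partition whose MaxCPUsPerNode equals cores minus GPUs, so one core per GPU always stays free for jobs submitted to the uncapped 'gpu' partition. Note this only works if CPU-only jobs cannot also reach the GPU nodes through a partition without the cap (like your current 'all' partition).

# CPU-only access to (hypothetically) 40-core nodes with 4 GPUs: 40 - 4 = 36
PartitionName=cpu_gpunodes_a Nodes=gpunode1,gpunode6,gpunode7 MaxCPUsPerNode=36 Default=NO DefaultTime=60 MaxTime=168:00:00 State=UP
# CPU-only access to (hypothetically) 32-core nodes with 2 GPUs: 32 - 2 = 30
PartitionName=cpu_gpunodes_b Nodes=gpunode4,gpunode8 MaxCPUsPerNode=30 Default=NO DefaultTime=60 MaxTime=168:00:00 State=UP
# GPU jobs keep using the existing 'gpu' partition, which has no MaxCPUsPerNode limit.

With a different core count per node type you end up with one such partition per node type, which is exactly the clunky part mentioned above.
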
-- 
Stephan Schott Verdugo
Biochemist

Heinrich-Heine-Universitaet Duesseldorf
Institut fuer Pharm. und Med. Chemie
Universitaetsstr. 1
40225 Duesseldorf
Germany