Hi,
I'm also very interested in how this could be done properly. At the moment what we are doing is setting up partitions with MaxCPUsPerNode set to CPUs minus GPUs (see the sketch after the partition configuration quoted below). Maybe this can help you in the meanwhile, but it is a suboptimal solution: in fact we have nodes with different numbers of CPUs, so we had to make a partition per "node type". Someone else may have a better idea.
Cheers,

On Mon, Aug 31, 2020 at 16:45, Manuel BERTRAND (<Manuel.Bertrand@lis-lab.fr>) wrote:

Hi list,

I am totally new to Slurm and have just deployed a heterogeneous GPU/CPU
cluster by following the latest OpenHPC recipe on CentOS 8.2 (thanks,
OpenHPC team, for making those!)
Everything works great so far, but now I would like to bind a specific
core to each GPU on each node. By "bind" I mean to make a particular
core not assignable to a CPU-only job, so that the GPU stays usable
whatever the CPU workload on the node. I'm asking this because in the
current state a CPU-only user can monopolize a whole node, preventing a
GPU user from getting in, as there is no CPU available even if the GPU
is free. I'm not sure what the best way to enforce this is. Hope this
is clear :)

Any help greatly appreciated!

Here is my gres.conf, cgroup.conf, and partitions configuration, followed
by the output of 'scontrol show config':

########### gres.conf ############
NodeName=gpunode1 Name=gpu File=/dev/nvidia0
NodeName=gpunode1 Name=gpu File=/dev/nvidia1
NodeName=gpunode1 Name=gpu File=/dev/nvidia2
NodeName=gpunode1 Name=gpu File=/dev/nvidia3
NodeName=gpunode2 Name=gpu File=/dev/nvidia0
NodeName=gpunode2 Name=gpu File=/dev/nvidia1
NodeName=gpunode2 Name=gpu File=/dev/nvidia2
NodeName=gpunode3 Name=gpu File=/dev/nvidia0
NodeName=gpunode3 Name=gpu File=/dev/nvidia1
NodeName=gpunode3 Name=gpu File=/dev/nvidia2
NodeName=gpunode3 Name=gpu File=/dev/nvidia3
NodeName=gpunode3 Name=gpu File=/dev/nvidia4
NodeName=gpunode3 Name=gpu File=/dev/nvidia5
NodeName=gpunode3 Name=gpu File=/dev/nvidia6
NodeName=gpunode3 Name=gpu File=/dev/nvidia7
NodeName=gpunode4 Name=gpu File=/dev/nvidia0
NodeName=gpunode4 Name=gpu File=/dev/nvidia1
NodeName=gpunode5 Name=gpu File=/dev/nvidia0
NodeName=gpunode5 Name=gpu File=/dev/nvidia1
NodeName=gpunode5 Name=gpu File=/dev/nvidia2
NodeName=gpunode5 Name=gpu File=/dev/nvidia3
NodeName=gpunode5 Name=gpu File=/dev/nvidia4
NodeName=gpunode5 Name=gpu File=/dev/nvidia5
NodeName=gpunode6 Name=gpu File=/dev/nvidia0
NodeName=gpunode6 Name=gpu File=/dev/nvidia1
NodeName=gpunode6 Name=gpu File=/dev/nvidia2
NodeName=gpunode6 Name=gpu File=/dev/nvidia3
NodeName=gpunode7 Name=gpu File=/dev/nvidia0
NodeName=gpunode7 Name=gpu File=/dev/nvidia1
NodeName=gpunode7 Name=gpu File=/dev/nvidia2
NodeName=gpunode7 Name=gpu File=/dev/nvidia3
NodeName=gpunode8 Name=gpu File=/dev/nvidia0
NodeName=gpunode8 Name=gpu File=/dev/nvidia1
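
For reference, gres.conf also accepts a Cores= field that tells Slurm which
cores are local to each GPU. This only expresses locality so that GPU jobs get
well-placed cores; it does not reserve those cores or keep CPU-only jobs off
them. A minimal sketch, with core ranges that are purely illustrative and
would have to match the real topology (compare 'slurmd -C' and
'nvidia-smi topo -m' output on the node):

########### sketch: gres.conf with Cores= (illustrative only) ###########
# Core IDs below are invented; Cores= declares GPU/core locality,
# it does NOT reserve the cores for GPU jobs.
NodeName=gpunode1 Name=gpu File=/dev/nvidia0 Cores=0-4
NodeName=gpunode1 Name=gpu File=/dev/nvidia1 Cores=5-9
NodeName=gpunode1 Name=gpu File=/dev/nvidia2 Cores=10-14
NodeName=gpunode1 Name=gpu File=/dev/nvidia3 Cores=15-19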

########### cgroup.conf ############
CgroupAutomount=yes
TaskAffinity=no
ConstrainCores=yes
ConstrainRAMSpace=yes
ConstrainSwapSpace=yes
ConstrainKmemSpace=no
ConstrainDevices=yes


########### partitions configuration ###########
PartitionName=cpu Nodes=cpunode1,cpunode2,cpunode3,cpunode4,cpunode5 Default=NO DefaultTime=60 MaxTime=168:00:00 State=UP
PartitionName=gpu Nodes=gpunode1,gpunode2,gpunode3,gpunode4,gpunode5,gpunode6,gpunode7,gpunode8 Default=NO DefaultTime=60 MaxTime=168:00:00 State=UP
PartitionName=all Nodes=ALL Default=YES DefaultTime=60 MaxTime=168:00:00 State=UP
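
To make the MaxCPUsPerNode workaround from the reply at the top concrete, here
is a minimal sketch. The partition names and core counts below are invented for
illustration (the GPU counts match the gres.conf above). Since MaxCPUsPerNode
is a per-partition setting, each "node type" gets its own CPU-facing partition
capped at cores minus GPUs, leaving one core per GPU free for jobs submitted to
the gpu partition:

########### sketch: MaxCPUsPerNode workaround (illustrative only) ###########
# Assuming 32-core / 4-GPU nodes: CPU-only jobs may use at most 28 cores per node
PartitionName=cpu_gpu32 Nodes=gpunode1,gpunode6 MaxCPUsPerNode=28 Default=NO DefaultTime=60 MaxTime=168:00:00 State=UP
# Assuming 24-core / 2-GPU nodes: CPU-only jobs may use at most 22 cores per node
PartitionName=cpu_gpu24 Nodes=gpunode4,gpunode8 MaxCPUsPerNode=22 Default=NO DefaultTime=60 MaxTime=168:00:00 State=UP
# GPU jobs keep using the uncapped "gpu" partition and pick up the spare cores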


########### Slurm configuration ###########
Configuration data as of 2020-08-31T16:23:54
AccountingStorageBackupHost = (null)
AccountingStorageEnforce = none
AccountingStorageHost = sms.mycluster
AccountingStorageLoc = N/A
AccountingStoragePort = 6819
AccountingStorageTRES = cpu,mem,energy,node,billing,fs/disk,vmem,pages
AccountingStorageType = accounting_storage/slurmdbd
AccountingStorageUser = N/A
AccountingStoreJobComment = No
AcctGatherEnergyType = acct_gather_energy/none
AcctGatherFilesystemType = acct_gather_filesystem/none
AcctGatherInterconnectType = acct_gather_interconnect/none
AcctGatherNodeFreq = 0 sec
AcctGatherProfileType = acct_gather_profile/none
AllowSpecResourcesUsage = No
AuthAltTypes = (null)
AuthInfo = (null)
AuthType = auth/munge
BatchStartTimeout = 10 sec

EpilogMsgTime = 2000 usec
EpilogSlurmctld = (null)
ExtSensorsType = ext_sensors/none
ExtSensorsFreq = 0 sec
FederationParameters = (null)
FirstJobId = 1
GetEnvTimeout = 2 sec
GresTypes = gpu
GpuFreqDef = high,memory=high
GroupUpdateForce = 1
GroupUpdateTime = 600 sec
HASH_VAL = Match
HealthCheckInterval = 300 sec
HealthCheckNodeState = ANY
HealthCheckProgram = /usr/sbin/nhc
InactiveLimit = 0 sec
JobAcctGatherFrequency = 30
JobAcctGatherType = jobacct_gather/none
JobAcctGatherParams = (null)
JobCompHost = localhost
JobCompLoc = /var/log/slurm_jobcomp.log
JobCompPort = 0
JobCompType = jobcomp/none
JobCompUser = root
JobContainerType = job_container/none
JobCredentialPrivateKey = (null)
JobCredentialPublicCertificate = (null)
JobDefaults = (null)
JobFileAppend = 0
JobRequeue = 1
JobSubmitPlugins = (null)
KeepAliveTime = SYSTEM_DEFAULT
KillOnBadExit = 0
KillWait = 30 sec
LaunchParameters = (null)
LaunchType = launch/slurm
Layouts =
Licenses = (null)
LogTimeFormat = iso8601_ms
MailDomain = (null)
MailProg = /usr/bin/mail
MaxArraySize = 1001
MaxDBDMsgs = 20052
MaxJobCount = 10000
MaxJobId = 67043328
MaxMemPerNode = UNLIMITED
MaxStepCount = 40000

PropagateResourceLimits = (null)
PropagateResourceLimitsExcept = MEMLOCK
RebootProgram = /sbin/reboot
ReconfigFlags = (null)
RequeueExit = (null)
RequeueExitHold = (null)
ResumeFailProgram = (null)
ResumeProgram = (null)
ResumeRate = 300 nodes/min
ResumeTimeout = 600 sec
ResvEpilog = (null)
ResvOverRun = 0 min
ResvProlog = (null)
ReturnToService = 2
RoutePlugin = route/default
SallocDefaultCommand = (null)
SbcastParameters = (null)
SchedulerParameters = (null)
SchedulerTimeSlice = 30 sec
SchedulerType = sched/backfill
SelectType = select/cons_tres
SelectTypeParameters = CR_CORE
SlurmUser = slurm(202)
SlurmctldAddr = (null)
SlurmctldDebug = debug2
SlurmctldHost[0] = sms.mycluster
SlurmctldLogFile = /var/log/slurmctld.log
SlurmctldPort = 6817
SlurmctldSyslogDebug = unknown
SlurmctldPrimaryOffProg = (null)
SlurmctldPrimaryOnProg = (null)
SlurmctldTimeout = 300 sec
SlurmctldParameters = enable_configless
SlurmdDebug = debug2
SlurmdLogFile = /var/log/slurmd.log
SlurmdParameters = (null)
SlurmdPidFile = /var/run/slurmd.pid
SlurmdPort = 6818
SlurmdSpoolDir = /var/spool/slurm/d
SlurmdSyslogDebug = unknown
SlurmdTimeout = 300 sec
SlurmdUser = root(0)
SlurmSchedLogFile = (null)
SlurmSchedLogLevel = 0
SlurmctldPidFile = /var/run/slurmctld.pid
SlurmctldPlugstack = (null)
SLURM_CONF = /etc/slurm/slurm.conf

SrunPortRange = 0-0
SrunProlog = (null)
StateSaveLocation = /var/spool/slurm/ctld
SuspendExcNodes = (null)
SuspendExcParts = (null)
SuspendProgram = (null)
SuspendRate = 60 nodes/min
SuspendTime = NONE
SuspendTimeout = 30 sec
SwitchType = switch/none
TaskEpilog = (null)
TaskPlugin = task/affinity,task/cgroup
TaskPluginParam = (null type)
TaskProlog = (null)
TCPTimeout = 2 sec
TmpFS = /scratch
TopologyParam = (null)
TopologyPlugin = topology/none
TrackWCKey = No
TreeWidth = 50
UsePam = No
UnkillableStepProgram = (null)
UnkillableStepTimeout = 60 sec
VSizeFactor = 0 percent
WaitTime = 0 sec
X11Parameters = (null)

Cgroup Support Configuration:
AllowedDevicesFile = /etc/slurm/cgroup_allowed_devices_file.conf
AllowedKmemSpace = (null)
AllowedRAMSpace = 100.0%
AllowedSwapSpace = 0.0%
CgroupAutomount = yes
CgroupMountpoint = /sys/fs/cgroup
ConstrainCores = yes
ConstrainDevices = yes
ConstrainKmemSpace = no
ConstrainRAMSpace = yes
ConstrainSwapSpace = yes
MaxKmemPercent = 100.0%
MaxRAMPercent = 100.0%
MaxSwapPercent = 100.0%
MemorySwappiness = (null)
MinKmemSpace = 30 MB
MinRAMSpace = 30 MB
TaskAffinity = no

Slurmctld(primary) at sms.mycluster is UP

-- 
Stephan Schott Verdugo
Biochemist

Heinrich-Heine-Universitaet Duesseldorf
Institut fuer Pharm. und Med. Chemie
Universitaetsstr. 1
40225 Duesseldorf
Germany