[slurm-users] Regression from slurm-22.05.2 to slurm-22.05.7 when using "--gpus=N" option.
Rigoberto Corujo
rcorujo at yahoo.com
Thu Jan 12 20:52:31 UTC 2023
Hello,
I have a small 2-compute-node GPU cluster, where each node has 2 GPUs.
$ sinfo -o "%20N %10c %10m %25f %30G "
NODELIST CPUS MEMORY AVAIL_FEATURES GRES
o186i[126-127] 128 64000 (null) gpu:nvidia_a40:2(S:0-1)
In my batch script, I request 4 GPUs and let Slurm decide how many nodes to allocate automatically. I also tell it I want 1 task per node.
$ cat rig_batch.sh
#!/usr/bin/env bash
#SBATCH --ntasks-per-node=1
#SBATCH --nodes=1-9
#SBATCH --gpus=4
#SBATCH --error=/home/corujor/slurm-error.log
#SBATCH --output=/home/corujor/slurm-output.log
srun bash -c 'echo $(hostname):SLURM_JOBID=${SLURM_JOBID}:SLURM_PROCID=${SLURM_PROCID}:CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES}'
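I would expect the equivalent srun one-liner to behave the same way (I have only tested via sbatch, so this is included just for completeness):

$ srun --nodes=1-9 --ntasks-per-node=1 --gpus=4 \
    bash -c 'echo $(hostname):CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES}'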
I submit my batch script on slurm-22.05.2.
$ sbatch rig_batch.sh
Submitted batch job 7
I get the expected results. That is, since each compute node has 2 GPUs and I requested 4 GPUs, Slurm allocated 2 nodes, with 1 task per node.
$ cat slurm-output.log
o186i126:SLURM_JOBID=7:SLURM_PROCID=0:CUDA_VISIBLE_DEVICES=0,1
o186i127:SLURM_JOBID=7:SLURM_PROCID=1:CUDA_VISIBLE_DEVICES=0,1
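For reference, the allocation can also be confirmed from the job record; on 22.05.2 I would expect to see NumNodes=2 and the GPU request in the TRES fields (the exact field names and TRES string may vary with version and AccountingStorageTRES settings):

$ scontrol show job 7 | grep -iE 'numnodes|tres'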
However, when I try to submit the same batch script on slurm-22.05.7, it fails.
$ sbatch rig_batch.sh
sbatch: error: Batch job submission failed: Requested node configuration is not available
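I have not verified this on slurm-22.05.7, but I would guess that expressing the request per node rather than as a job total sidesteps whatever changed, at the cost of hard-coding the node count:

#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1
#SBATCH --gpus-per-node=2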
Here is my configuration.
$ scontrol show config
Configuration data as of 2023-01-12T21:38:55
AccountingStorageBackupHost = (null)
AccountingStorageEnforce = none
AccountingStorageHost = localhost
AccountingStorageExternalHost = (null)
AccountingStorageParameters = (null)
AccountingStoragePort = 6819
AccountingStorageTRES = cpu,mem,energy,node,billing,fs/disk,vmem,pages
AccountingStorageType = accounting_storage/slurmdbd
AccountingStorageUser = N/A
AccountingStoreFlags = (null)
AcctGatherEnergyType = acct_gather_energy/none
AcctGatherFilesystemType = acct_gather_filesystem/none
AcctGatherInterconnectType = acct_gather_interconnect/none
AcctGatherNodeFreq = 0 sec
AcctGatherProfileType = acct_gather_profile/none
AllowSpecResourcesUsage = No
AuthAltTypes = (null)
AuthAltParameters = (null)
AuthInfo = (null)
AuthType = auth/munge
BatchStartTimeout = 10 sec
BcastExclude = /lib,/usr/lib,/lib64,/usr/lib64
BcastParameters = (null)
BOOT_TIME = 2023-01-12T17:17:11
BurstBufferType = (null)
CliFilterPlugins = (null)
ClusterName = grenoble_test
CommunicationParameters = (null)
CompleteWait = 0 sec
CoreSpecPlugin = core_spec/none
CpuFreqDef = Unknown
CpuFreqGovernors = OnDemand,Performance,UserSpace
CredType = cred/munge
DebugFlags = Gres
DefMemPerNode = UNLIMITED
DependencyParameters = (null)
DisableRootJobs = Yes
EioTimeout = 60
EnforcePartLimits = ANY
Epilog = (null)
EpilogMsgTime = 2000 usec
EpilogSlurmctld = (null)
ExtSensorsType = ext_sensors/none
ExtSensorsFreq = 0 sec
FederationParameters = (null)
FirstJobId = 1
GetEnvTimeout = 2 sec
GresTypes = gpu
GpuFreqDef = high,memory=high
GroupUpdateForce = 1
GroupUpdateTime = 600 sec
HASH_VAL = Match
HealthCheckInterval = 0 sec
HealthCheckNodeState = ANY
HealthCheckProgram = (null)
InactiveLimit = 0 sec
InteractiveStepOptions = --interactive --preserve-env --pty $SHELL
JobAcctGatherFrequency = 30
JobAcctGatherType = jobacct_gather/none
JobAcctGatherParams = (null)
JobCompHost = localhost
JobCompLoc = /var/log/slurm_jobcomp.log
JobCompPort = 0
JobCompType = jobcomp/none
JobCompUser = root
JobContainerType = job_container/none
JobCredentialPrivateKey = /apps/slurm/etc/.slurm.key
JobCredentialPublicCertificate = /apps/slurm/etc/slurm.cert
JobDefaults = (null)
JobFileAppend = 0
JobRequeue = 1
JobSubmitPlugins = (null)
KillOnBadExit = 0
KillWait = 30 sec
LaunchParameters = use_interactive_step
LaunchType = launch/slurm
Licenses = (null)
LogTimeFormat = iso8601_ms
MailDomain = (null)
MailProg = /bin/mail
MaxArraySize = 1001
MaxDBDMsgs = 20008
MaxJobCount = 10000
MaxJobId = 67043328
MaxMemPerNode = UNLIMITED
MaxNodeCount = 2
MaxStepCount = 40000
MaxTasksPerNode = 512
MCSPlugin = mcs/none
MCSParameters = (null)
MessageTimeout = 10 sec
MinJobAge = 300 sec
MpiDefault = pmix
MpiParams = (null)
NEXT_JOB_ID = 274
NodeFeaturesPlugins = (null)
OverTimeLimit = 0 min
PluginDir = /apps/slurm-22-05-7-1/lib/slurm
PlugStackConfig = (null)
PowerParameters = (null)
PowerPlugin =
PreemptMode = OFF
PreemptType = preempt/none
PreemptExemptTime = 00:00:00
PrEpParameters = (null)
PrEpPlugins = prep/script
PriorityParameters = (null)
PrioritySiteFactorParameters = (null)
PrioritySiteFactorPlugin = (null)
PriorityType = priority/basic
PrivateData = none
ProctrackType = proctrack/linuxproc
Prolog = (null)
PrologEpilogTimeout = 65534
PrologSlurmctld = (null)
PrologFlags = (null)
PropagatePrioProcess = 0
PropagateResourceLimits = ALL
PropagateResourceLimitsExcept = (null)
RebootProgram = (null)
ReconfigFlags = (null)
RequeueExit = (null)
RequeueExitHold = (null)
ResumeFailProgram = (null)
ResumeProgram = (null)
ResumeRate = 300 nodes/min
ResumeTimeout = 60 sec
ResvEpilog = (null)
ResvOverRun = 0 min
ResvProlog = (null)
ReturnToService = 1
RoutePlugin = route/default
SchedulerParameters = (null)
SchedulerTimeSlice = 30 sec
SchedulerType = sched/backfill
ScronParameters = (null)
SelectType = select/cons_tres
SelectTypeParameters = CR_CPU
SlurmUser = slurm(1182)
SlurmctldAddr = (null)
SlurmctldDebug = debug
SlurmctldHost[0] = o186i208
SlurmctldLogFile = /var/log/slurmctld.log
SlurmctldPort = 6817
SlurmctldSyslogDebug = (null)
SlurmctldPrimaryOffProg = (null)
SlurmctldPrimaryOnProg = (null)
SlurmctldTimeout = 120 sec
SlurmctldParameters = (null)
SlurmdDebug = info
SlurmdLogFile = /var/log/slurmd.log
SlurmdParameters = (null)
SlurmdPidFile = /var/run/slurmd.pid
SlurmdPort = 6818
SlurmdSpoolDir = /var/spool/slurmd
SlurmdSyslogDebug = (null)
SlurmdTimeout = 300 sec
SlurmdUser = root(0)
SlurmSchedLogFile = (null)
SlurmSchedLogLevel = 0
SlurmctldPidFile = /var/slurm/run/slurmctld.pid
SlurmctldPlugstack = (null)
SLURM_CONF = /apps/slurm-22-05-7-1/etc/slurm.conf
SLURM_VERSION = 22.05.7
SrunEpilog = (null)
SrunPortRange = 0-0
SrunProlog = (null)
StateSaveLocation = /var/spool/slurmctld
SuspendExcNodes = (null)
SuspendExcParts = (null)
SuspendProgram = (null)
SuspendRate = 60 nodes/min
SuspendTime = INFINITE
SuspendTimeout = 30 sec
SwitchParameters = (null)
SwitchType = switch/none
TaskEpilog = (null)
TaskPlugin = task/affinity
TaskPluginParam = (null type)
TaskProlog = (null)
TCPTimeout = 2 sec
TmpFS = /tmp
TopologyParam = (null)
TopologyPlugin = topology/none
TrackWCKey = No
TreeWidth = 50
UsePam = No
UnkillableStepProgram = (null)
UnkillableStepTimeout = 60 sec
VSizeFactor = 0 percent
WaitTime = 0 sec
X11Parameters = (null)
MPI Plugins Configuration:
PMIxCliTmpDirBase = (null)
PMIxCollFence = (null)
PMIxDebug = 0
PMIxDirectConn = yes
PMIxDirectConnEarly = no
PMIxDirectConnUCX = no
PMIxDirectSameArch = no
PMIxEnv = (null)
PMIxFenceBarrier = no
PMIxNetDevicesUCX = (null)
PMIxTimeout = 300
PMIxTlsUCX = (null)
Slurmctld(primary) at o186i208 is UP
The only difference when I run this with slurm-22.05.2 is that I have to make the following change in slurm.conf, or Slurm will complain. Other than that, the same configuration is used for both slurm-22.05.2 and slurm-22.05.7. In both cases, I am running on the same cluster with the same compute nodes, just pointing at different versions of Slurm.
#MpiDefault=pmix
MpiDefault=none
This seems like a regression.
Thoughts?
Thank you,
Rigoberto