Hi everyone, I'm running some tests. I've just set up SLURM on the head node and haven't added any compute nodes yet. When I submit a test job to check that it's working, the job stays pending with this reason: 'Nodes required for job are DOWN, DRAINED or reserved for jobs in higher priority partitions'.
*[stsadmin@head ~]$ squeue*
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
6 lab test_slu stsadmin PD 0:00 1 (Nodes required for job are DOWN, DRAINED or reserved for jobs in higher priority partitions)
*[stsadmin@head ~]$ scontrol show job 6*
JobId=6 JobName=test_slurm
UserId=stsadmin(1000) GroupId=stsadmin(1000) MCS_label=N/A
Priority=1 Nice=0 Account=(null) QOS=normal
JobState=PENDING Reason=Nodes_required_for_job_are_DOWN,_DRAINED_or_reserved_for_jobs_in_higher_priority_partitions Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
RunTime=00:00:00 TimeLimit=01:00:00 TimeMin=N/A
SubmitTime=2024-04-09T10:43:14 EligibleTime=2024-04-09T10:43:14
AccrueTime=2024-04-09T10:43:14
StartTime=Unknown EndTime=Unknown Deadline=N/A
SuspendTime=None SecsPreSuspend=0 LastSchedEval=2024-04-09T10:43:23 Scheduler=Backfill:*
Partition=lab AllocNode:Sid=head:5147
ReqNodeList=(null) ExcNodeList=(null)
NodeList=
NumNodes=1-1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
ReqTRES=cpu=1,mem=1G,node=1,billing=1
AllocTRES=(null)
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
MinCPUsNode=1 MinMemoryCPU=1G MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
OverSubscribe=YES Contiguous=0 Licenses=(null) Network=(null)
Command=/home/stsadmin/Downloads/test.sh
WorkDir=/home/stsadmin
StdErr=/home/stsadmin/test_slurm_output.txt
StdIn=/dev/null
StdOut=/home/stsadmin/test_slurm_output.txt
Power=
*[stsadmin@head ~]$ scontrol show node head*
NodeName=head CoresPerSocket=6
CPUAlloc=0 CPUEfctv=24 CPUTot=24 CPULoad=0.00
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=(null)
NodeAddr=head NodeHostName=head
RealMemory=184000 AllocMem=0 FreeMem=N/A Sockets=2 Boards=1
State=DOWN+NOT_RESPONDING ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
Partitions=lab
BootTime=None SlurmdStartTime=None
LastBusyTime=2024-04-09T10:42:53 ResumeAfterTime=None
CfgTRES=cpu=24,mem=184000M,billing=24
AllocTRES=
CapWatts=n/a
CurrentWatts=0 AveWatts=0
ExtSensorsJoules=n/a ExtSensorsWatts=0 ExtSensorsTemp=n/a
Reason=Not responding [slurm@2024-04-09T10:14:10]
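Since the node output is long, here is a minimal sketch for pulling out just the two fields that seem to matter for the pending reason, `State` and `Reason`. The function name `node_state_summary` is my own (hypothetical, not part of SLURM); it only filters text read from standard input and assumes the `key=value` layout shown in the output.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of SLURM): print the State= and Reason=
# fields from `scontrol show node` output supplied on standard input.
node_state_summary() {
    local text
    text=$(cat)   # read all of stdin once so it can be scanned twice

    # State= is a single whitespace-delimited token: put each token on
    # its own line and keep the first one that starts with "State=".
    printf '%s\n' "$text" | tr -s '[:space:]' '\n' | grep -m1 '^State='

    # Reason= may contain spaces, so keep everything from "Reason=" to
    # the end of the line it appears on.
    printf '%s\n' "$text" | sed -n 's/.*Reason=/Reason=/p' | head -n 1
}
```

Usage would be `scontrol show node head | node_state_summary`, which for the output above should print the `State=DOWN+NOT_RESPONDING` token followed by the `Reason=Not responding [...]` line.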
I'd appreciate any advice that points me in the right direction. Thank you!