Alison

 

  The sinfo output shows that your head node is down due to some configuration error.

 

  Are you running slurmd on the head node?  If slurmd is running, find its log file and pass along the entries from it.
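
A minimal sketch of one way to check on slurmd and locate its log. It assumes a systemd-based install with a unit named slurmd; the helper name slurmd_log_path is made up for illustration. The helper only parses the output of scontrol show config, which lists SlurmdLogFile:

```shell
#!/bin/sh
# Hypothetical helper: read `scontrol show config` output on stdin and
# print the value of SlurmdLogFile (lines look like "Key   = value").
slurmd_log_path() {
    awk '$1 == "SlurmdLogFile" && $2 == "=" { print $3 }'
}

# Only touch the cluster when the Slurm tools are actually installed.
if command -v scontrol >/dev/null 2>&1; then
    # Is the daemon running? (assumes a systemd unit named slurmd)
    systemctl status slurmd --no-pager || true
    log=$(scontrol show config | slurmd_log_path)
    # Show the most recent entries if the log exists and is readable.
    if [ -r "$log" ]; then
        tail -n 50 "$log"
    fi
fi
```

If the log path comes back empty, slurmd may be logging to syslog or the journal instead, so that is worth checking as well.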

 

Can you redo the scontrol command? “node name” should be “nodename”, one word.
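
A minimal sketch for pulling just the state and reason out of the node record. It assumes the positional form scontrol show node head, and the helper name node_state_reason is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: read `scontrol show node <name>` output on stdin and
# print only the State=... and Reason=... fields.
node_state_reason() {
    # State is a single token; Reason may contain spaces, so take the
    # rest of the line from "Reason=" onward.
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^State=/) print $i
        p = index($0, "Reason=")
        if (p > 0) print substr($0, p)
    }'
}

# Only query the controller when scontrol is actually installed.
if command -v scontrol >/dev/null 2>&1; then
    scontrol show node head | node_state_reason
fi
```

The Reason field, when present, is usually the quickest hint as to why the node was marked down.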

 

I need to see what’s in the test.sh file to get an idea of how your job is set up.

 

Jeff

 

From: Alison Peterson <apeterson5@sdsu.edu>
Sent: Tuesday, April 9, 2024 3:15 PM
To: Jeffrey R. Lang <JRLang@uwyo.edu>
Cc: slurm-users@lists.schedmd.com
Subject: Re: [EXT] RE: [EXT] RE: [slurm-users] Nodes required for job are down, drained or reserved

 

Yes! here is the information:

 

[stsadmin@head ~]$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
lab*         up   infinite      1  down* head


[stsadmin@head ~]$ scontrol show node name=head
Node name=head not found


[stsadmin@head ~]$ sbatch ~/Downloads/test.sh
Submitted batch job 7


[stsadmin@head ~]$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
                 7       lab test_slu stsadmin PD       0:00      1 (ReqNodeNotAvail, UnavailableNodes:head)

 

On Tue, Apr 9, 2024 at 1:07PM Jeffrey R. Lang <JRLang@uwyo.edu> wrote:

Alison

 

Can you provide the output of the following commands:

 

·         sinfo

·         scontrol show node name=head

 

and the job command that you’re trying to run?

 

 

 

From: Alison Peterson <apeterson5@sdsu.edu>
Sent: Tuesday, April 9, 2024 3:03 PM
To: Jeffrey R. Lang <JRLang@uwyo.edu>
Cc: slurm-users@lists.schedmd.com
Subject: Re: [EXT] RE: [slurm-users] Nodes required for job are down, drained or reserved

 

Hi Jeffrey,

 I'm sorry, I did add the head node in the compute nodes configuration. This is the slurm.conf:

 

# COMPUTE NODES
NodeName=head CPUs=24 RealMemory=184000 Sockets=2  CoresPerSocket=6 ThreadsPerCore=2 State=UNKNOWN
PartitionName=lab  Nodes=ALL Default=YES MaxTime=INFINITE State=UP OverSubscribe=Force
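
A small sanity check on the NodeName line above, offered as a sketch only. It verifies that CPUs equals Sockets x CoresPerSocket x ThreadsPerCore, since a mismatch between the configured and detected hardware is one possible reason for a node to be marked down. The helper name node_field is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print the value of KEY from a slurm.conf-style
# "Key=value Key=value ..." line, e.g. node_field CPUs "$line".
node_field() {
    printf '%s\n' "$2" | tr -s ' ' '\n' | awk -F= -v k="$1" '$1 == k { print $2 }'
}

# NodeName line copied from the slurm.conf above.
line='NodeName=head CPUs=24 RealMemory=184000 Sockets=2  CoresPerSocket=6 ThreadsPerCore=2 State=UNKNOWN'

cpus=$(node_field CPUs "$line")
expected=$(( $(node_field Sockets "$line") * $(node_field CoresPerSocket "$line") * $(node_field ThreadsPerCore "$line") ))

if [ "$cpus" -eq "$expected" ]; then
    echo "CPUs=$cpus matches Sockets*CoresPerSocket*ThreadsPerCore"
else
    echo "CPUs=$cpus but topology implies $expected"
fi

# On the node itself, `slurmd -C` prints the hardware slurmd detects in the
# same Key=value form, which can be compared against the line above.
if command -v slurmd >/dev/null 2>&1; then
    slurmd -C || true
fi
```

This only checks internal consistency of the config line; comparing it with what slurmd -C reports on the head node is the more telling test.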

 

 

On Tue, Apr 9, 2024 at 12:57PM Jeffrey R. Lang <JRLang@uwyo.edu> wrote:

Alison

 

The error message indicates that there are no resources to execute jobs.   Since you haven’t defined any compute nodes, you will get this error.

 

I would suggest that you create at least one compute node.  Once you do that, this error should go away.

 

Jeff

 

From: Alison Peterson via slurm-users <slurm-users@lists.schedmd.com>
Sent: Tuesday, April 9, 2024 2:52 PM
To: slurm-users@lists.schedmd.com
Subject: [slurm-users] Nodes required for job are down, drained or reserved

 

Hi everyone, I'm conducting some tests. I've just set up SLURM on the head node and haven't added any compute nodes yet. I'm trying to test it to ensure it's working, but I'm encountering an error: 'Nodes required for the job are DOWN, DRAINED, or reserved for jobs in higher priority partitions.'

 

Any guidance will be appreciated, thank you!

 

--

Alison Peterson

IT Research Support Analyst
Information Technology

O: 619-594-3364

San Diego State University | SDSU.edu

5500 Campanile Drive | San Diego, CA 92182-8080

 


 
