[slurm-users] slurm conf with single machine with multi cores.

Le Biot, Pierre-Marie pierre-marie.lebiot at hpe.com
Wed Nov 29 10:18:59 MST 2017

Hello David,

So linuxcluster is the Head node and also a Compute node?

Is slurmd running?

What does /var/log/slurm/slurmd.log say?
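If slurmd turns out not to be running, starting it in the foreground with verbose logging usually shows why it fails. A quick check could look something like this (assuming a systemd-based install; paths and unit names may differ on your system):

```shell
# Check whether the compute daemon is running
systemctl status slurmd

# If it is down, run it in the foreground with debug output to see why
sudo slurmd -D -vvv

# Or inspect the log directly
tail -n 50 /var/log/slurm/slurmd.log
```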

Pierre-Marie Le Biot

From: slurm-users [mailto:slurm-users-bounces at lists.schedmd.com] On Behalf Of david vilanova
Sent: Wednesday, November 29, 2017 4:33 PM
To: Slurm User Community List <slurm-users at lists.schedmd.com>
Subject: Re: [slurm-users] slurm conf with single machine with multi cores.

I have updated the slurm.conf as follows:

NodeName=linuxcluster CPUs=2
PartitionName=testq Nodes=linuxcluster Default=YES MaxTime=INFINITE State=UP

I still get the testq node in down status. Any idea why?
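For what it's worth, after a configuration change a node can stay in the DOWN state until the controller is told to return it to service. Something along these lines (a sketch, not verified against your setup) may help diagnose and clear it:

```shell
# Show the reason Slurm marked the node down
sinfo -R
scontrol show node linuxcluster

# After fixing slurm.conf and restarting slurmctld/slurmd,
# tell the controller to put the node back in service
scontrol update NodeName=linuxcluster State=RESUME
```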

Below log from db and controller:
==> /var/log/slurm/slurmctrl.log <==
[2017-11-29T16:28:30.446] slurmctld version 17.11.0 started on cluster linuxcluster
[2017-11-29T16:28:30.850] error: SelectType specified more than once, latest value used
[2017-11-29T16:28:30.851] layouts: no layout to initialize
[2017-11-29T16:28:30.855] layouts: loading entities/relations information
[2017-11-29T16:28:30.855] Recovered state of 1 nodes
[2017-11-29T16:28:30.855] Down nodes: linuxcluster
[2017-11-29T16:28:30.855] Recovered information about 0 jobs
[2017-11-29T16:28:30.855] cons_res: select_p_node_init
[2017-11-29T16:28:30.855] cons_res: preparing for 1 partitions
[2017-11-29T16:28:30.856] Recovered state of 0 reservations
[2017-11-29T16:28:30.856] _preserve_plugins: backup_controller not specified
[2017-11-29T16:28:30.856] cons_res: select_p_reconfigure
[2017-11-29T16:28:30.856] cons_res: select_p_node_init
[2017-11-29T16:28:30.856] cons_res: preparing for 1 partitions
[2017-11-29T16:28:30.856] Running as primary controller
[2017-11-29T16:28:30.856] Registering slurmctld at port 6817 with slurmdbd.
[2017-11-29T16:28:31.098] No parameter for mcs plugin, default values set
[2017-11-29T16:28:31.098] mcs: MCSParameters = (null). ondemand set.
[2017-11-29T16:29:31.169] SchedulerParameters=default_queue_depth=100,max_rpc_cnt=0,max_sched_time=2,partition_job_depth=0,sched_max_job_start=0,sched_min_interval=2


On Wed, 29 Nov 2017 at 15:59, Steffen Grunewald <steffen.grunewald at aei.mpg.de<mailto:steffen.grunewald at aei.mpg.de>> wrote:
Hi David,

On Wed, 2017-11-29 at 14:45:06 +0000, david vilanova wrote:
> Hello,
> I have installed the latest 17.11 release and my node is shown as down.
> I have a single physical server with 12 cores, so I'm not sure the conf below is
> correct. Can you help?
> In slurm.conf the node is configured as follows:
> NodeName=linuxcluster CPUs=1 RealMemory=991 Sockets=12 CoresPerSocket=1
> ThreadsPerCore=1 Feature=local

12 Sockets? Certainly not... 12 cores per socket, yes.
(IIRC CPUs shouldn't be specified if the detailed topology is given.
You may try CPUs=12 and drop the details.)

> PartitionName=testq Nodes=inuxcluster Default=YES MaxTime=INFINITE State=UP
                           ^^ typo?
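As a sketch, a consistent definition for a single 12-core machine could be either of the following (RealMemory taken from your original line; it's worth checking against `slurmd -C`, which prints the hardware Slurm actually detects on the node):

```
# Option 1: give the CPU count and let Slurm infer the topology
NodeName=linuxcluster CPUs=12 RealMemory=991 Feature=local

# Option 2: spell out the topology explicitly and omit CPUs
NodeName=linuxcluster Sockets=1 CoresPerSocket=12 ThreadsPerCore=1 RealMemory=991 Feature=local

PartitionName=testq Nodes=linuxcluster Default=YES MaxTime=INFINITE State=UP
```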

