<div dir="ltr"><div>Mmm... What I did was install all the rpms on the calculation nodes (just as on the controller node), but run only slurmd on them.<br></div><div><br></div><div>I think you're aware that munge should be running on the calculation nodes, that the munge.key should be the same on all nodes, and that the slurm configuration files should be the same on all nodes. Directories and files should have the same permissions as on the controller node, and the slurm user should have the same UID and GID as on the controller node. Pretty much install slurm on the calculation nodes as if they were the controller node. The difference is that slurm.conf tells slurmctld which node is the controller and which are the calculation nodes.</div><div><br></div><div>Example:</div><div>ControlMachine=hostname<br>ControlAddr=XXX.XXX.XXX.0</div><div><br></div><div># the calc nodes<br></div><div>NodeName=n[001-016] NodeAddr=XXX.XXX.XXX.[1-16] CPUs=28 RealMemory=64315 Sockets=2 CoresPerSocket=14 ThreadsPerCore=1 State=IDLE</div><div>PartitionName=slim Nodes=n[001-016] Default=YES MaxTime=INFINITE State=UP</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, 2 Jun 2020 at 12:57, Ferran Planas Padros (<<a href="mailto:ferran.padros@su.se">ferran.padros@su.se</a>>) wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<div id="gmail-m_-328890577847735989divtagdefaultwrapper" style="font-size:12pt;color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif" dir="ltr">
<p>Hi,</p>
<p><br>
</p>
<p>Thanks for your answer,</p>
<p><br>
</p>
<p>However, I am setting up a calculating node, not the master node, and thus I have not installed slurmctld on it.</p>
<p></p>
<p style="font-family:Calibri,Helvetica,sans-serif,Helvetica,EmojiFont,"Apple Color Emoji","Segoe UI Emoji",NotoColorEmoji,"Segoe UI Symbol","Android Emoji",EmojiSymbols;font-size:16px">
</p>
<p style="font-family:Calibri,Helvetica,sans-serif,Helvetica,EmojiFont,"Apple Color Emoji","Segoe UI Emoji",NotoColorEmoji,"Segoe UI Symbol","Android Emoji",EmojiSymbols;font-size:16px">
<br>
</p>
<p style="font-family:Calibri,Helvetica,sans-serif,Helvetica,EmojiFont,"Apple Color Emoji","Segoe UI Emoji",NotoColorEmoji,"Segoe UI Symbol","Android Emoji",EmojiSymbols;font-size:16px">
After some digging, I have found that all these files:</p>
<p style="font-family:Calibri,Helvetica,sans-serif,Helvetica,EmojiFont,"Apple Color Emoji","Segoe UI Emoji",NotoColorEmoji,"Segoe UI Symbol","Android Emoji",EmojiSymbols;font-size:16px">
</p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">/run/systemd/generator.late/slurm.service</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">/run/systemd/generator.late/runlevel5.target.wants/slurm.service</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">/run/systemd/generator.late/runlevel4.target.wants/slurm.service</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">/run/systemd/generator.late/runlevel3.target.wants/slurm.service</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">/run/systemd/generator.late/runlevel2.target.wants/slurm.service</span></p>
<br>
<p></p>
<p style="font-family:Calibri,Helvetica,sans-serif,Helvetica,EmojiFont,"Apple Color Emoji","Segoe UI Emoji",NotoColorEmoji,"Segoe UI Symbol","Android Emoji",EmojiSymbols;font-size:16px">
which are copies of each other and are generated by <span style="font-variant-ligatures:no-common-ligatures;font-family:Menlo;font-size:11px"><span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">systemd-sysv-generator, point to
 slurmctld.pid, not to slurm.pid:</span></span></p>
<p style="font-family:Calibri,Helvetica,sans-serif,Helvetica,EmojiFont,"Apple Color Emoji","Segoe UI Emoji",NotoColorEmoji,"Segoe UI Symbol","Android Emoji",EmojiSymbols;font-size:16px">
<span style="font-variant-ligatures:no-common-ligatures;font-family:Menlo;font-size:11px"><br>
</span></p>
<p style="font-family:Calibri,Helvetica,sans-serif,Helvetica,EmojiFont,"Apple Color Emoji","Segoe UI Emoji",NotoColorEmoji,"Segoe UI Symbol","Android Emoji",EmojiSymbols;font-size:16px">
<span style="font-variant-ligatures:no-common-ligatures;font-family:Menlo;font-size:11px"></span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">[Unit]</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">Documentation=man:systemd-sysv-generator(8)</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">SourcePath=/etc/rc.d/init.d/slurm</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">Description=LSB: slurm daemon management</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">Before=runlevel2.target</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">Before=runlevel3.target</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">Before=runlevel4.target</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">Before=runlevel5.target</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">Before=shutdown.target</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">After=remote-fs.target</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">After=network-online.target</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">After=munge.service</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">After=nss-lookup.target</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">After=network-online.target</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">Wants=network-online.target</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">Conflicts=shutdown.target</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;min-height:13px">
<span style="font-variant-ligatures:no-common-ligatures"></span><br>
</p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">[Service]</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">Type=forking</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">Restart=no</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">TimeoutSec=5min</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">IgnoreSIGPIPE=no</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">KillMode=process</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">GuessMainPID=no</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">RemainAfterExit=no</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures"><b>PIDFile=/var/run/slurmctld.pid</b></span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">ExecStart=/etc/rc.d/init.d/slurm start</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">ExecStop=/etc/rc.d/init.d/slurm stop</span></p>
<br>
<p></p>
<p style="font-family:Calibri,Helvetica,sans-serif,Helvetica,EmojiFont,"Apple Color Emoji","Segoe UI Emoji",NotoColorEmoji,"Segoe UI Symbol","Android Emoji",EmojiSymbols;font-size:16px">
<span style="font-variant-ligatures:no-common-ligatures;font-family:Menlo;font-size:11px"><br>
</span></p>
<p style="font-family:Calibri,Helvetica,sans-serif,Helvetica,EmojiFont,"Apple Color Emoji","Segoe UI Emoji",NotoColorEmoji,"Segoe UI Symbol","Android Emoji",EmojiSymbols;font-size:16px">
<span style="font-variant-ligatures:no-common-ligatures;font-family:Calibri,Helvetica,sans-serif;font-size:12pt">How can I avoid this, other than by editing the files manually, which are reset to the original after a reboot?</span></p>
<p style="font-family:Calibri,Helvetica,sans-serif,Helvetica,EmojiFont,"Apple Color Emoji","Segoe UI Emoji",NotoColorEmoji,"Segoe UI Symbol","Android Emoji",EmojiSymbols;font-size:16px">
<span style="font-variant-ligatures:no-common-ligatures;font-family:Calibri,Helvetica,sans-serif;font-size:12pt"><br>
</span></p>
<p style="font-family:Calibri,Helvetica,sans-serif,Helvetica,EmojiFont,"Apple Color Emoji","Segoe UI Emoji",NotoColorEmoji,"Segoe UI Symbol","Android Emoji",EmojiSymbols;font-size:16px">
<span style="font-variant-ligatures:no-common-ligatures;font-family:Calibri,Helvetica,sans-serif;font-size:12pt">Thanks,</span></p>
<p style="font-family:Calibri,Helvetica,sans-serif,Helvetica,EmojiFont,"Apple Color Emoji","Segoe UI Emoji",NotoColorEmoji,"Segoe UI Symbol","Android Emoji",EmojiSymbols;font-size:16px">
<span style="font-variant-ligatures:no-common-ligatures;font-family:Calibri,Helvetica,sans-serif;font-size:12pt">Ferran</span></p>
<br>
<p></p>
</div>
<hr style="display:inline-block;width:98%">
<div id="gmail-m_-328890577847735989divRplyFwdMsg" dir="ltr"><font style="font-size:11pt" face="Calibri, sans-serif" color="#000000"><b>From:</b> slurm-users <<a href="mailto:slurm-users-bounces@lists.schedmd.com" target="_blank">slurm-users-bounces@lists.schedmd.com</a>> on behalf of Rodrigo Santibáñez <<a href="mailto:rsantibanez.uchile@gmail.com" target="_blank">rsantibanez.uchile@gmail.com</a>><br>
<b>Sent:</b> Tuesday, June 2, 2020 6:40:48 PM<br>
<b>To:</b> Slurm User Community List<br>
<b>Subject:</b> Re: [slurm-users] Problem with permisions. CentOS 7.8</font>
<div> </div>
</div>
<div>
<div dir="ltr">
<div>Yes, you have both daemons, installed with the slurm rpm. The slurmd (all nodes) communicates with slurmctld (which runs on the main master node and, optionally, on a backup node).</div>
<div><br>
</div>
<div>You do not need to run slurmd as the slurm user. Use `systemctl enable slurmctld` (and slurmd) followed by `systemctl start slurmctld`. If you change the configuration, use restart instead of start only if `sudo scontrol reconfigure` asks for it.<br>
</div>
<div><br>
</div>
<div>If you run `slurmctld -Dvvvv` and `slurmd -Dvvvv` as root, you'll see debug output that can reveal further configuration problems. The slurmd needs slurmctld running, or it will output "error: Unable to register: Unable to contact slurm controller (connect failure)"</div>
<div><br>
</div>
<div>You should find the services here:</div>
<div>-rw-r--r-- 1 root root 339 may 30 20:18 /usr/lib/systemd/system/slurmctld.service<br>
-rw-r--r-- 1 root root 342 may 30 20:18 /usr/lib/systemd/system/slurmdbd.service<br>
-rw-r--r-- 1 root root 398 may 30 20:18 /usr/lib/systemd/system/slurmd.service</div>
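<div><br>
</div>
<div>As a rough sketch (not something taken from the Slurm documentation), the PID file each of these units expects can be listed with a small shell helper. The function name show_pidfiles is made up for this example, and it assumes the unit files declare a PIDFile= line, as the generated slurm.service does:</div>
<pre>
```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): print the PIDFile= line of every
# slurm*.service unit found in the directory given as the first argument.
show_pidfiles() {
    dir=$1
    for unit in "$dir"/slurm*.service; do
        # Skip the unexpanded glob when no unit file matches.
        if [ -f "$unit" ]; then
            # -H prefixes each match with the file name; "|| true" keeps the
            # helper from failing when a unit declares no PIDFile at all.
            grep -H '^PIDFile=' "$unit" || true
        fi
    done
}

# Native unit files shipped by the rpms.
show_pidfiles /usr/lib/systemd/system
```
</pre>
<div>Pointing the same helper at /run/systemd/generator.late should show which PID file the generated slurm.service waits for, which can then be compared with the PID file that slurmd actually writes.</div>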
<div><br>
</div>
<div>Feel free to ask for more information,</div>
<div>Best regards<br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Tue, 2 Jun 2020 at 11:12, Ferran Planas Padros (<<a href="mailto:ferran.padros@su.se" target="_blank">ferran.padros@su.se</a>>) wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div id="gmail-m_-328890577847735989gmail-m_-269440492691839835divtagdefaultwrapper" dir="ltr" style="font-size:12pt;color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif,Helvetica,EmojiFont,"Apple Color Emoji","Segoe UI Emoji",NotoColorEmoji,"Segoe UI Symbol","Android Emoji",EmojiSymbols">
<p><br>
</p>
<div dir="ltr">
<div id="gmail-m_-328890577847735989gmail-m_-269440492691839835x_divtagdefaultwrapper" dir="ltr" style="font-size:12pt;color:rgb(0,0,0);font-family:Calibri,Helvetica,sans-serif,Helvetica,EmojiFont,"Apple Color Emoji","Segoe UI Emoji",NotoColorEmoji,"Segoe UI Symbol","Android Emoji",EmojiSymbols">
<p>Hi Ole,</p>
<p><br>
</p>
<p>Thanks for your answer and your time. I'd appreciate it if you, or someone else, could take a final look at my case.</p>
<p>After your suggestions and comments, I have redone the whole installation of Munge and Slurm. I uninstalled and removed all previous rpms and restarted from scratch. Munge works with no problem; however, the same is not true for Slurm (for which
 I have used the instructions given in the link you attached).</p>
<p><br>
</p>
<p>- If I run /usr/bin/slurmd -D vvvvv as root user, the verbose output continues until the line '<span style="font-variant-ligatures:no-common-ligatures;font-family:Menlo;font-size:11px">slurmd: debug2: No acct_gather.conf file (/etc/slurm/acct_gather.conf)'
<span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">and then stops. After I press Ctrl+C, I get:</span></span></p>
<p><span style="font-variant-ligatures:no-common-ligatures;font-family:Menlo;font-size:11px"></span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures"><br>
</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">slurmd: all threads complete</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">slurmd: Consumable Resources (CR) Node Selection plugin shutting down ...</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">slurmd: Munge cryptographic signature plugin unloaded</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">slurmd: Slurmd shutdown completing</span></p>
<br>
<p></p>
<p><span style="font-variant-ligatures:no-common-ligatures;font-family:Calibri,Helvetica,sans-serif;font-size:12pt">- After that, if I run 'systemctl start slurmd' and 'systemctl status slurmd', also as root user, I get:</span></p>
<p><span style="font-variant-ligatures:no-common-ligatures;font-family:Menlo;font-size:11px"></span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures;color:rgb(52,188,38)"><b>●</b></span><span style="font-variant-ligatures:no-common-ligatures"> slurmd.service - Slurm node daemon</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures"><span>   </span>Loaded: loaded (/etc/systemd/system/slurmd.service; enabled; vendor preset: disabled)</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures"><span>   </span>Active: </span>
<span style="font-variant-ligatures:no-common-ligatures;color:rgb(52,188,38)"><b>active (running)</b></span><span style="font-variant-ligatures:no-common-ligatures"> since Tue 2020-06-02 16:53:51 CEST; 33s ago</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures"><span>  </span>Process: 2750 ExecStart=/usr/sbin/slurmd -d /usr/sbin/slurmstepd $SLURMD_OPTIONS (code=exited, status=0/SUCCESS)</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures"><span> </span>Main PID: 2752 (slurmd)</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures"><span>   </span>CGroup: /system.slice/slurmd.service</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures"><span>           </span>
└─2752 /usr/sbin/slurmd -d /usr/sbin/slurmstepd</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo;min-height:13px">
<span style="font-variant-ligatures:no-common-ligatures"></span><br>
</p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">Jun 02 16:53:51 <a href="http://roos21.organ.su.se" target="_blank">
roos21.organ.su.se</a> systemd[1]: Starting Slurm node daemon...</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">Jun 02 16:53:51 <a href="http://roos21.organ.su.se" target="_blank">
roos21.organ.su.se</a> systemd[1]: Can't open PID file /var/run/slurm/slurmd.pid (yet?) after start: No such file or directory</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">Jun 02 16:53:51 <a href="http://roos21.organ.su.se" target="_blank">
roos21.organ.su.se</a> systemd[1]: Started Slurm node daemon.</span></p>
<br>
<p></p>
<p><span style="font-variant-ligatures:no-common-ligatures;font-family:Menlo;font-size:11px">- <span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt">Next, I kill the slurmd process and run, as the slurm user, 'systemctl start slurm', which does
 not work and returns the following in journalctl -xe:</span></span></p>
<p><span style="font-variant-ligatures:no-common-ligatures;font-family:Menlo;font-size:11px"><span style="font-family:Calibri,Helvetica,sans-serif;font-size:12pt"><br>
</span></span></p>
<p><span style="font-variant-ligatures:no-common-ligatures;font-family:Menlo;font-size:11px"></span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">Jun 02 16:56:01 <a href="http://roos21.organ.su.se" target="_blank">
roos21.organ.su.se</a> systemd[1]: Starting LSB: slurm daemon management...</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">-- Subject: Unit slurm.service has begun start-up</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">-- Defined-By: systemd</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">-- Support: <a href="http://lists.freedesktop.org/mailman/listinfo/systemd-devel" target="_blank">
http://lists.freedesktop.org/mailman/listinfo/systemd-devel</a></span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">--<span> </span></span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">-- Unit slurm.service has begun starting up.</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">Jun 02 16:56:01 <a href="http://roos21.organ.su.se" target="_blank">
roos21.organ.su.se</a> slurm[2805]: starting slurmd: [<span>  </span>OK<span>  </span>
]</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">Jun 02 16:56:01 <a href="http://roos21.organ.su.se" target="_blank">
roos21.organ.su.se</a> systemd[1]: Can't open PID file /var/run/slurmctld.pid (yet?) after start: No such file or directory</span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">Jun 02 16:56:37 <a href="http://roos21.organ.su.se" target="_blank">
roos21.organ.su.se</a> polkitd[1316]: <b>Unregistered Authentication Agent for unix-process:2792:334647 (system bus name :1.46, object path /org/freedesktop</b></span></p>
<p style="margin-right:0px;margin-left:0px;font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;font-size:11px;line-height:normal;font-family:Menlo">
<span style="font-variant-ligatures:no-common-ligatures">Jun 02 16:56:38 <a href="http://roos21.organ.su.se" target="_blank">
roos21.organ.su.se</a> sudo[2790]: pam_unix(sudo:session): session closed for user slurm</span></p>
<br>
<p></p>
<p><span style="font-variant-ligatures:no-common-ligatures;font-family:Calibri,Helvetica,sans-serif;font-size:12pt">I don't really understand this, because I have not installed slurmctld. The slurmctld.service file does not even exist.</span></p>
<p><span style="font-variant-ligatures:no-common-ligatures;font-family:Calibri,Helvetica,sans-serif;font-size:12pt"><br>
</span></p>
<p><span style="font-variant-ligatures:no-common-ligatures;font-family:Calibri,Helvetica,sans-serif;font-size:12pt">Any idea?</span></p>
<p><span style="font-variant-ligatures:no-common-ligatures;font-family:Calibri,Helvetica,sans-serif;font-size:12pt"><br>
</span></p>
<p><span style="font-variant-ligatures:no-common-ligatures;font-family:Calibri,Helvetica,sans-serif;font-size:12pt">Many thanks,</span></p>
<p><span style="font-variant-ligatures:no-common-ligatures;font-family:Calibri,Helvetica,sans-serif;font-size:12pt">Ferran</span></p>
<p><span style="font-variant-ligatures:no-common-ligatures;font-family:Menlo;font-size:11px"><span style="font-size:12pt;font-family:Calibri,Helvetica,sans-serif"></span><br>
</span></p>
<p><span style="font-variant-ligatures:no-common-ligatures;font-family:Menlo;font-size:11px"><br>
</span></p>
<p></p>
</div>
<hr style="display:inline-block;width:98%">
<div id="gmail-m_-328890577847735989gmail-m_-269440492691839835x_divRplyFwdMsg" dir="ltr"><font style="font-size:11pt" face="Calibri, sans-serif" color="#000000"><b>From:</b> slurm-users <<a href="mailto:slurm-users-bounces@lists.schedmd.com" target="_blank">slurm-users-bounces@lists.schedmd.com</a>>
 on behalf of Ole Holm Nielsen <<a href="mailto:Ole.H.Nielsen@fysik.dtu.dk" target="_blank">Ole.H.Nielsen@fysik.dtu.dk</a>><br>
<b>Sent:</b> Tuesday, June 2, 2020 12:03:27 PM<br>
<b>To:</b> Slurm User Community List<br>
<b>Subject:</b> Re: [slurm-users] Problem with permisions. CentOS 7.8</font>
<div> </div>
</div>
</div>
<font size="2"><span style="font-size:10pt">
<div>Hi Ferran,<br>
<br>
Please install Slurm software in the standard way, see<br>
<a href="https://wiki.fysik.dtu.dk/niflheim/Slurm_installation" id="gmail-m_-328890577847735989gmail-m_-269440492691839835LPlnk10833" target="_blank">https://wiki.fysik.dtu.dk/niflheim/Slurm_installation</a><br>
<br>
It seems that you have some unusual way to manage your Linux systems.  In <br>
Stockholm and Sweden there are many Slurm experts at the HPC centers who <br>
might be able to help you more directly.<br>
<br>
Best regards,<br>
Ole<br>
<br>
On 6/2/20 11:58 AM, Ferran Planas Padros wrote:<br>
> I did a fresh installation with the EPEL repo, installing munge from <br>
> it, and it worked. Using the slurm user for munge was definitely a <br>
> problem, but that is the setup we have on CentOS 6. Now I've learnt <br>
> my lesson for future installations, thanks to everyone!<br>
> <br>
> <br>
> Now, I have a follow up question, if you don't mind. I am now trying to <br>
> run slurm, and it crashes:<br>
> <br>
> <br>
> [root@roos21 ~]# systemctl status slurm.service<br>
> <br>
> *●*slurm.service - LSB: slurm daemon management<br>
> <br>
> Loaded: loaded (/etc/rc.d/init.d/slurm; bad; vendor preset: disabled)<br>
> <br>
> Active: *failed*(Result: protocol) since Tue 2020-06-02 11:45:33 CEST; <br>
> 3min 33s ago<br>
> <br>
> Docs: man:systemd-sysv-generator(8)<br>
> <br>
> <br>
> Jun 02 11:45:33 <a href="http://roos21.organ.su.se" target="_blank">roos21.organ.su.se</a> systemd[1]: Starting LSB: slurm daemon management...<br>
> <br>
> Jun 02 11:45:33 <a href="http://roos21.organ.su.se" target="_blank">roos21.organ.su.se</a> slurm[18223]: starting slurmd: [OK]<br>
> <br>
> Jun 02 11:45:33 <a href="http://roos21.organ.su.se" target="_blank">roos21.organ.su.se</a> systemd[1]: Can't open PID file /var/run/slurmctld.pid (yet?) after start: No such file or directory<br>
> <br>
> Jun 02 11:45:33 <a href="http://roos21.organ.su.se" target="_blank">roos21.organ.su.se</a> systemd[1]: *Failed to start LSB: slurm daemon management.*<br>
> <br>
> Jun 02 11:45:33 <a href="http://roos21.organ.su.se" target="_blank">roos21.organ.su.se</a> systemd[1]: *Unit slurm.service entered failed state.*<br>
> <br>
> Jun 02 11:45:33 <a href="http://roos21.organ.su.se" target="_blank">roos21.organ.su.se</a> systemd[1]: *slurm.service failed.*<br>
> <br>
> <br>
> <br>
> The thing is that this is a computing node, not the master node, so <br>
> slurmctld is not installed. Why do I get this error?<br>
> <br>
> <br>
> Many thanks, and my apologies for these rather simple questions. I am a <br>
> newbie at this.<br>
> <br>
> <br>
> Best,<br>
> <br>
> Ferran<br>
> <br>
> --------------------------------------------------------------------------<br>
> *From:* slurm-users <<a href="mailto:slurm-users-bounces@lists.schedmd.com" target="_blank">slurm-users-bounces@lists.schedmd.com</a>> on behalf of Renata Maria Dart <<a href="mailto:renata@slac.stanford.edu" target="_blank">renata@slac.stanford.edu</a>><br>
> *Sent:* Friday, May 29, 2020 6:33:58 PM<br>
> *To:* <a href="mailto:Ole.H.Nielsen@fysik.dtu.dk" target="_blank">Ole.H.Nielsen@fysik.dtu.dk</a>; Slurm User Community List<br>
> *Subject:* Re: [slurm-users] Problem with permisions. CentOS 7.8<br>
> Hi, I don't know if this might be your problem, but I ran into an issue<br>
> on CentOS 7.8 where /var/run/munge was not being created at boot time<br>
> because I didn't have the munge user in the local password file.  I<br>
> have the munge user in AD and once the system is up I can start munge<br>
> successfully, but AD wasn't available early enough during boot for the<br>
> munge startup to see it.  I added these lines to the munge systemd unit<br>
> file:<br>
> <br>
> PermissionsStartOnly=true<br>
> ExecStartPre=-/usr/bin/mkdir -m 0755 -p /var/run/munge<br>
> ExecStartPre=-/usr/bin/chown -R munge:munge /var/run/munge<br>
> <br>
> and my system now starts munge up fine during a reboot.<br>
> <br>
> Renata<br>
> <br>
> On Fri, 29 May 2020, Ole Holm Nielsen wrote:<br>
> <br>
>> Hi Ferran,<br>
>><br>
>> When you have a CentOS 7 system with the EPEL repo enabled, and you have<br>
>> installed the munge RPM from EPEL, then things should be working correctly.<br>
>><br>
>> Since systemctl tells you that the Munge service didn't start correctly, it<br>
>> seems to me that you have a problem in the general configuration of your CentOS<br>
>> 7 system.  You should check /var/log/messages and "journalctl -xe" for munge<br>
>> errors.  It is really hard for other people to guess what may be wrong in your<br>
>> system.<br>
>><br>
>> My 2 cents worth: Maybe you could make a fresh CentOS 7.8 installation on a<br>
>> test system and install the Munge service (and nothing else) according to<br>
>> instructions in <a href="https://wiki.fysik.dtu.dk/niflheim/Slurm_installation" id="gmail-m_-328890577847735989gmail-m_-269440492691839835LPlnk332420" target="_blank">
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation</a>.  This<br>
>> *really* has got to work!<br>
>><br>
>> /Ole<br>
>><br>
>><br>
>> On 29-05-2020 10:23, Ferran Planas Padros wrote:<br>
>>> Hello everyone,<br>
>>><br>
>>><br>
>>> Here is everything I've done.<br>
>>><br>
>>><br>
>>> - About Ole's answer:<br>
>>><br>
>>> Yes, we have slurm as the user to control munge. Following your comment, I<br>
>>> have changed the ownership of the munge files and tried to start munge as<br>
>>> munge user. However, it also failed.<br>
>>><br>
>>> Also, I first installed munge from a repository. I've seen your suggestion of<br>
>>> installing from EPEL, so I uninstalled it and installed it again. Same result.<br>
>>><br>
>>> - About SELinux: It is disabled.<br>
>>><br>
>>> - The output of ps -ef | grep munge is:<br>
>>><br>
>>><br>
>>> root 5340 5153 0 10:18 pts/0 00:00:00 grep --color=auto *munge*<br>
>>><br>
>>><br>
>>> - The output of munge -n is:<br>
>>><br>
>>><br>
>>> Failed to access "/var/run/munge/munge.socket.2": No such file or directory<br>
>>><br>
>>><br>
>>> - Same for unmunge<br>
>>><br>
>>><br>
>>> - Output for sudo systemctl status --full munge<br>
>>><br>
>>><br>
>>> *●*munge.service - MUNGE authentication service<br>
>>><br>
>>> Loaded: loaded (/usr/lib/systemd/system/munge.service; enabled; vendor preset:<br>
>>> disabled)<br>
>>><br>
>>> Active: *failed*(Result: exit-code) since Fri 2020-05-29 10:15:52 CEST; 4min<br>
>>> 18s ago<br>
>>><br>
>>> Docs: man:munged(8)<br>
>>><br>
>>> Process: 5333 ExecStart=/usr/sbin/munged *(code=exited, status=1/FAILURE)*<br>
>>><br>
>>><br>
>>> May 29 10:15:52 <a href="http://roos21.organ.su.se" target="_blank">roos21.organ.su.se</a> systemd[1]: Starting MUNGE authentication<br>
>>> service...<br>
>>><br>
>>> May 29 10:15:52 <a href="http://roos21.organ.su.se" target="_blank">roos21.organ.su.se</a> systemd[1]: *munge.service: control process<br>
>>> exited, code=exited status=1*<br>
>>><br>
>>> May 29 10:15:52 <a href="http://roos21.organ.su.se" target="_blank">roos21.organ.su.se</a> systemd[1]: *Failed to start MUNGE<br>
>>> authentication service.*<br>
>>><br>
>>> May 29 10:15:52 <a href="http://roos21.organ.su.se" target="_blank">roos21.organ.su.se</a> systemd[1]: *Unit munge.service entered<br>
>>> failed state.*<br>
>>><br>
>>> May 29 10:15:52 <a href="http://roos21.organ.su.se" target="_blank">roos21.organ.su.se</a> systemd[1]: *munge.service failed.*<br>
>>><br>
>>><br>
>>> - Regarding NTP, I get this message:<br>
>>><br>
>>><br>
>>> Unable to talk to NTP daemon. Is it running?<br>
>>><br>
>>><br>
>>> It is the same message I get on the nodes that DO work. All nodes are synchronized in<br>
>>> time and date with the central node.<br>
>>><br>
>>><br>
>>> ------------------------------------------------------------------------<br>
>>> *From:* slurm-users <<a href="mailto:slurm-users-bounces@lists.schedmd.com" target="_blank">slurm-users-bounces@lists.schedmd.com</a>> on behalf of Ole<br>
>>> Holm Nielsen <<a href="mailto:Ole.H.Nielsen@fysik.dtu.dk" target="_blank">Ole.H.Nielsen@fysik.dtu.dk</a>><br>
>>> *Sent:* Friday, May 29, 2020 9:56:10 AM<br>
>>> *To:* <a href="mailto:slurm-users@lists.schedmd.com" target="_blank">slurm-users@lists.schedmd.com</a><br>
>>> *Subject:* Re: [slurm-users] Problem with permisions. CentOS 7.8<br>
>>> On 29-05-2020 08:46, Sudeep Narayan Banerjee wrote:<br>
>>>> also check:<br>
>>>> a) whether NTP has been set up and is communicating with the master node<br>
>>>> b) the iptables rules, which may need to be flushed (iptables -L)<br>
>>>> c) SELinux set to disabled; to check:<br>
>>>> getenforce<br>
>>>> vim /etc/sysconfig/selinux<br>
>>>> (change SELINUX=enforcing to SELINUX=disabled and save the file and reboot)<br>
>>><br>
>>> There is no reason to disable SELinux for running the Munge service.<br>
>>> It's a pretty bad idea to lower the security just for the sake of<br>
>>> convenience!<br>
>>><br>
>>> /Ole<br>
>>><br>
>>><br>
>>>> On Fri, May 29, 2020 at 12:08 PM Sudeep Narayan Banerjee<br>
>>>> <<a href="mailto:snbanerjee@iitgn.ac.in" target="_blank">snbanerjee@iitgn.ac.in</a> <<a href="mailto:snbanerjee@iitgn.ac.in" target="_blank">mailto:snbanerjee@iitgn.ac.in</a>>> wrote:<br>
>>>><br>
>>>>      I have not checked this on CentOS 7.8<br>
>>>>      a) if the /var/run/munge folder does not exist, then please double check<br>
>>>>      whether munge has been installed<br>
>>>>      b) as root or a sudo user, run<br>
>>>>      ps -ef | grep munge<br>
>>>>      kill -9 <PID> //where PID is the Process ID for munge (if the<br>
>>>>      process is running at all); else<br>
>>>><br>
>>>>      which munged<br>
>>>>      /etc/init.d/munge start<br>
>>>><br>
>>>>      please let me know the output of:<br>
>>>><br>
>>>>      $ munge -n<br>
>>>><br>
>>>>      $ munge -n | unmunge<br>
>>>><br>
>>>>      $ sudo systemctl status --full munge<br>
>>>><br>
>>>>      Thanks & Regards,<br>
>>>>      Sudeep Narayan Banerjee<br>
>>>>      System Analyst | Scientist B<br>
>>>>      Indian Institute of Technology Gandhinagar<br>
>>>>      Gujarat, INDIA<br>
>>>><br>
>>>><br>
>>>>      On Fri, May 29, 2020 at 11:55 AM Bjørn-Helge Mevik<br>
>>>>      <<a href="mailto:b.h.mevik@usit.uio.no" target="_blank">b.h.mevik@usit.uio.no</a> <<a href="mailto:b.h.mevik@usit.uio.no" target="_blank">mailto:b.h.mevik@usit.uio.no</a>>> wrote:<br>
>>>><br>
>>>>          Ferran Planas Padros <<a href="mailto:ferran.padros@su.se" target="_blank">ferran.padros@su.se</a><br>
>>>>          <<a href="mailto:ferran.padros@su.se" target="_blank">mailto:ferran.padros@su.se</a>>> writes:<br>
>>>><br>
>>>>           > I run the command as slurm user, and the /var/log/munge<br>
>>>>          folder does belong to slurm.<br>
>>>><br>
>>>>          For security reasons, I strongly advise that you run munged as a<br>
>>>>          separate user, which is unprivileged and not used for anything else.<br>
>>>><br>
>>>>          --<br>
>>>>          Regards,<br>
>>>>          Bjørn-Helge Mevik, dr. scient,<br>
>>>>          Department for Research Computing, University of Oslo<br>
<br>
</div>
</span></font></div>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote></div>