<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Consolas;
panose-1:2 11 6 9 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p.msonormal0, li.msonormal0, div.msonormal0
{mso-style-name:msonormal;
mso-margin-top-alt:auto;
margin-right:0in;
mso-margin-bottom-alt:auto;
margin-left:0in;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
span.hoenzb
{mso-style-name:hoenzb;}
span.EmailStyle19
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri",sans-serif;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style>
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal">Thanks; I just set <span style="font-family:Consolas">StateSaveLocation=/var/spool/slurm.state</span>, and that went away. Of course, another error popped up:<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-family:Consolas">Apr 11 11:19:24 psy-slurm slurmctld[1772]: fatal: Invalid node names in partition slurm<o:p></o:p></span></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Here’s the relevant section from slurm.conf; IP address changed to protect the innocent. This is a single-node cluster that I’m using just to make a working proof-of-concept.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-family:Consolas"># COMPUTE NODES<o:p></o:p></span></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-family:Consolas">NodeName=psy-slurm NodeAddr=192.0.2.157<o:p></o:p></span></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-family:Consolas">PartitionName=slurm Nodes= Default=YES MaxTime=INFINITE State=UP<o:p></o:p></span></p>
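<p class="MsoNormal">One way to cross-check a section like this is to compare each partition's Nodes= value with the NodeName= definitions. The following is only a minimal Python sketch: the function name is hypothetical, it assumes plain comma-separated node names (no bracket ranges) and the exact key spelling used above, and it does not claim to reproduce the checks slurmctld itself performs.</p>

```python
def find_invalid_partitions(conf_text):
    """Return {partition: problem} for partitions whose Nodes= value looks wrong.

    Assumption: node lists are plain comma-separated names; Slurm bracket
    ranges such as node[01-04] are NOT expanded by this sketch.
    """
    nodes = set()
    partitions = {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        # Split "Key=Value" tokens; a bare "Nodes=" yields an empty value.
        fields = dict(tok.split("=", 1) for tok in line.split() if "=" in tok)
        if "NodeName" in fields:
            nodes.update(fields["NodeName"].split(","))
        elif "PartitionName" in fields:
            partitions[fields["PartitionName"]] = fields.get("Nodes", "")
    problems = {}
    for name, value in partitions.items():
        if not value:
            problems[name] = "Nodes= is empty"
        elif value.upper() != "ALL":
            unknown = sorted(set(value.split(",")) - nodes)
            if unknown:
                problems[name] = "unknown nodes: " + ", ".join(unknown)
    return problems
```

<p class="MsoNormal">Run against the three lines above, this sketch reports the empty Nodes= value on the partition named slurm; whether that is what triggers the fatal error is not confirmed here.</p>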
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Matt Hohmeister<o:p></o:p></p>
<p class="MsoNormal">Systems and Network Administrator<o:p></o:p></p>
<p class="MsoNormal">Department of Psychology<o:p></o:p></p>
<p class="MsoNormal">Florida State University<o:p></o:p></p>
<p class="MsoNormal">PO Box 3064301<o:p></o:p></p>
<p class="MsoNormal">Tallahassee, FL 32306-4301<o:p></o:p></p>
<p class="MsoNormal">Phone: +1 850 645 1902<o:p></o:p></p>
<p class="MsoNormal">Fax: +1 850 644 7739<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><b>From:</b> slurm-users &lt;slurm-users-bounces@lists.schedmd.com&gt;
<b>On Behalf Of </b>Douglas Jacobsen<br>
<b>Sent:</b> Wednesday, April 11, 2018 10:40 AM<br>
<b>To:</b> Slurm User Community List &lt;slurm-users@lists.schedmd.com&gt;<br>
<b>Subject:</b> Re: [slurm-users] Slurm setup question<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal">It looks like your slurm.conf specifies /var/spool as the state save directory (StateSaveLocation), and `<span style="font-size:9.5pt;font-family:Consolas;color:#222222;background:white">fatal: Incorrect permissions on state save loc: /var/spool` indicates
that SlurmUser (another setting in slurm.conf) does not have permission to write to it. It might be good to create a directory dedicated to this purpose, e.g. /var/spool/slurm/&lt;clustername&gt;_state, and then make sure that SlurmUser (usually either "slurm"
or root, depending on your needs) can access that directory.</span><o:p></o:p></p>
</div>
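<p class="MsoNormal">As a rough illustration of the ownership and permission check described above, here is a minimal Python sketch. The function name is hypothetical, and the two conditions it tests (directory owned by the given uid, owner write bit set) are an assumption about what "can access" means here, not a description of slurmctld's actual logic.</p>

```python
import os
import stat


def state_dir_problems(path, uid):
    """List reasons why `uid` might be unable to use `path` as a state directory.

    Illustrative only: checks directory existence, owner uid, and the
    owner write bit. It does not inspect group/other bits or ACLs.
    """
    if not os.path.isdir(path):
        return ["not a directory: " + path]
    problems = []
    info = os.stat(path)
    if info.st_uid != uid:
        problems.append("owner uid is %d, expected %d" % (info.st_uid, uid))
    if not info.st_mode & stat.S_IWUSR:
        problems.append("owner lacks write permission")
    return problems
```

<p class="MsoNormal">On a real system the uid could be looked up with pwd.getpwnam("slurm").pw_uid, assuming SlurmUser is set to "slurm"; an empty list from the function means neither of the two checked conditions failed.</p>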
<div>
<p class="MsoNormal"><br clear="all">
<o:p></o:p></p>
<div>
<div>
<div>
<div>
<div>
<div>
<p class="MsoNormal"><span style="font-size:7.5pt;font-family:"Courier New"">----</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Arial",sans-serif">Doug Jacobsen, Ph.D.</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:7.5pt;font-family:"Arial",sans-serif">NERSC Computer Systems Engineer</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:7.5pt;font-family:"Arial",sans-serif"><a href="http://www.nersc.gov" target="_blank">National
Energy Research Scientific Computing Center</a></span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:7.5pt"><a href="mailto:dmjacobsen@lbl.gov" target="_blank"><span style="font-family:"Arial",sans-serif">dmjacobsen@lbl.gov</span></a></span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:7.5pt;font-family:"Courier New";color:#888888">------------- __o<br>
---------- _ '\<,_<br>
----------(_)/ (_)__________________________</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:12.0pt;font-family:"Arial",sans-serif"><o:p> </o:p></span></p>
</div>
</div>
</div>
</div>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal">On Wed, Apr 11, 2018 at 5:44 AM, Ole Holm Nielsen &lt;<a href="mailto:Ole.H.Nielsen@fysik.dtu.dk" target="_blank">Ole.H.Nielsen@fysik.dtu.dk</a>&gt; wrote:<o:p></o:p></p>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in">
<p class="MsoNormal">Hi Matt,<br>
<br>
You might want to take a look at my Slurm Wiki, which focuses on CentOS/RHEL 7: <a href="https://wiki.fysik.dtu.dk/niflheim/SLURM" target="_blank">
https://wiki.fysik.dtu.dk/niflheim/SLURM</a>. Complete instructions for Slurm installation, configuration, etc. are in the Wiki.<span style="color:#888888"><br>
<br>
<span class="hoenzb">/Ole</span></span><o:p></o:p></p>
<div>
<div>
<p class="MsoNormal"><br>
<br>
On 04/11/2018 02:26 PM, Matt Hohmeister wrote:<o:p></o:p></p>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in">
<p class="MsoNormal">I’m brand-new to Slurm, and setting it up on a single RHEL 7.4 VM as a proof of concept before I deploy it. After following the instructions on
<a href="https://www.slothparadise.com/how-to-install-slurm-on-centos-7-cluster/" target="_blank">
https://www.slothparadise.com/how-to-install-slurm-on-centos-7-cluster/</a> (sorry, site not working now), I can get slurmd to start perfectly, but slurmctld fails to start with the following journalctl -xe output. I was wondering if anyone has run into this or could
shed some light on it. Thanks in advance!<br>
<br>
Apr 11 08:18:30 psy-slurm polkitd[680]: Registered Authentication Agent for unix-process:1779:31362 (system bus name :1.26 [/usr/bin/pkttyagent --notify-fd 5 --fallbac<br>
<br>
Apr 11 08:18:30 psy-slurm systemd[1]: Starting Slurm controller daemon...<br>
<br>
-- Subject: Unit slurmctld.service has begun start-up<br>
<br>
-- Defined-By: systemd<br>
<br>
-- Support: <a href="http://lists.freedesktop.org/mailman/listinfo/systemd-devel" target="_blank">
http://lists.freedesktop.org/mailman/listinfo/systemd-devel</a><br>
<br>
--<br>
<br>
-- Unit slurmctld.service has begun starting up.<br>
<br>
Apr 11 08:18:30 psy-slurm systemd[1]: PID file /var/run/slurmctld.pid not readable (yet?) after start.<br>
<br>
Apr 11 08:18:30 psy-slurm systemd[1]: Started Slurm controller daemon.<br>
<br>
-- Subject: Unit slurmctld.service has finished start-up<br>
<br>
-- Defined-By: systemd<br>
<br>
-- Support: <a href="http://lists.freedesktop.org/mailman/listinfo/systemd-devel" target="_blank">
http://lists.freedesktop.org/mailman/listinfo/systemd-devel</a><br>
<br>
--<br>
<br>
-- Unit slurmctld.service has finished starting up.<br>
<br>
--<br>
<br>
-- The start-up result is done.<br>
<br>
Apr 11 08:18:30 psy-slurm polkitd[680]: Unregistered Authentication Agent for unix-process:1779:31362 (system bus name :1.26, object path /org/freedesktop/PolicyKit1/A<br>
<br>
Apr 11 08:18:30 psy-slurm slurmctld[1787]: fatal: Incorrect permissions on state save loc: /var/spool<br>
<br>
Apr 11 08:18:30 psy-slurm systemd[1]: slurmctld.service: main process exited, code=exited, status=1/FAILURE<br>
<br>
Apr 11 08:18:30 psy-slurm systemd[1]: Unit slurmctld.service entered failed state.<br>
<br>
Apr 11 08:18:30 psy-slurm systemd[1]: slurmctld.service failed.<o:p></o:p></p>
</blockquote>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</div>
</blockquote>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</div>
</body>
</html>