[slurm-users] FSU & Slurm
Sean Caron
scaron at umich.edu
Wed Apr 11 13:35:35 MDT 2018
Hi Matt,
As a protest to asking questions on this list and getting solicitations for
pay-for support, let me give you some advice for free :)
If you look at your slurm.conf you'll see there are two directories that
your slurm user and group need to have write access to.
One is whatever you configure as SlurmdSpoolDir. This needs to be available
on all worker nodes that are running slurmd. Set ownership to slurm user
and slurm group and mode 755.
The other is StateSaveLocation. This needs to be present just on your
controller (where slurmctld runs). Again, this should have ownership of
slurm user and slurm group and mode 755.
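You probably want to use something more specific than just /var/spool for
your StateSaveLocation (the fatal error in your log shows it is currently
pointing at /var/spool itself). A dedicated directory that the slurm user
can own is the safer choice.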
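
If it helps, here is a minimal Python sketch for sanity-checking a
directory against the ownership and mode described above. The function
name check_dir is made up for illustration, and the expected mode of 755
is simply the value suggested in this message, not something Slurm
itself enforces in exactly this form.

```python
import os
import stat


def check_dir(path, uid, required_mode=0o755):
    """Return a list of problems found with `path`; an empty list means OK.

    `uid` is the numeric user id that should own the directory and
    `required_mode` is the permission mode suggested above (755).
    """
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return [f"{path} does not exist"]

    problems = []
    if not stat.S_ISDIR(st.st_mode):
        problems.append(f"{path} is not a directory")
    if st.st_uid != uid:
        problems.append(f"{path} is owned by uid {st.st_uid}, expected {uid}")

    # S_IMODE strips the file-type bits and leaves only the permission bits.
    mode = stat.S_IMODE(st.st_mode)
    if mode != required_mode:
        problems.append(f"{path} has mode {mode:o}, expected {required_mode:o}")
    return problems
```

To use it, look up the slurm account with pwd.getpwnam("slurm").pw_uid
(assuming that is what your SlurmUser is called) and pass in whatever
paths your slurm.conf has for SlurmdSpoolDir and StateSaveLocation.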
Best,
Sean
On Wed, Apr 11, 2018 at 1:48 PM, Jess Arrington <jess at schedmd.com> wrote:
> Hi Matt,
>
> I hope your day is treating you well.
>
>
> Thank you for your posts on the Slurm user list.
>
>
> By chance, do you work with Paul Van Der Mark?
>
>
> Would there be interest on your side to see a Slurm support contract for
> your systems at FSU?
>
> Sites running Slurm with support give us feedback that support is
> invaluable and a great return back to the organization with much better
> system utilization with optimized configs by our experts (which pays for
> the support contract in and of itself) and their sites not having to rely
> on in-house best effort support hacks that get very expensive and turn into
> complicated chaos and potential down systems.
>
>
> Additionally, support keeps the Slurm project alive and going strong
>
>
> Please let me know your thoughts or if you would like me to reach out to
> another contact at FSU to chat about this further.
>
>
>
> Take care,
>
>
>
> *Jess Arrington*
> Director of Sales | 801-616-7823
> 204 N 1200 E #203 Lehi, UT 84043
>
>
> On Wed, Apr 11, 2018 at 6:26 AM, Matt Hohmeister <hohmeister at psy.fsu.edu>
> wrote:
>
>> I’m brand-new to Slurm, and setting it up on a single RHEL 7.4 VM as a
>> proof of concept before I deploy it. After following the instructions on
>> https://www.slothparadise.com/how-to-install-slurm-on-centos-7-cluster/
>> (sorry, site not working now), I can get slurmd to start perfectly, but
>> slurmctld fails to start with the following journalctl -xe; I was
>> wondering if anyone has run into this or could shed some light on
>> this…thanks in advance!
>>
>>
>>
>> Apr 11 08:18:30 psy-slurm polkitd[680]: Registered Authentication Agent
>> for unix-process:1779:31362 (system bus name :1.26 [/usr/bin/pkttyagent
>> --notify-fd 5 --fallbac
>>
>> Apr 11 08:18:30 psy-slurm systemd[1]: Starting Slurm controller daemon...
>>
>> -- Subject: Unit slurmctld.service has begun start-up
>>
>> -- Defined-By: systemd
>>
>> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
>>
>> --
>>
>> -- Unit slurmctld.service has begun starting up.
>>
>> Apr 11 08:18:30 psy-slurm systemd[1]: PID file /var/run/slurmctld.pid not
>> readable (yet?) after start.
>>
>> Apr 11 08:18:30 psy-slurm systemd[1]: Started Slurm controller daemon.
>>
>> -- Subject: Unit slurmctld.service has finished start-up
>>
>> -- Defined-By: systemd
>>
>> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
>>
>> --
>>
>> -- Unit slurmctld.service has finished starting up.
>>
>> --
>>
>> -- The start-up result is done.
>>
>> Apr 11 08:18:30 psy-slurm polkitd[680]: Unregistered Authentication Agent
>> for unix-process:1779:31362 (system bus name :1.26, object path
>> /org/freedesktop/PolicyKit1/A
>>
>> Apr 11 08:18:30 psy-slurm slurmctld[1787]: fatal: Incorrect permissions
>> on state save loc: /var/spool
>>
>> Apr 11 08:18:30 psy-slurm systemd[1]: slurmctld.service: main process
>> exited, code=exited, status=1/FAILURE
>>
>> Apr 11 08:18:30 psy-slurm systemd[1]: Unit slurmctld.service entered
>> failed state.
>>
>> Apr 11 08:18:30 psy-slurm systemd[1]: slurmctld.service failed.
>>
>>
>>
>> Matt Hohmeister
>>
>> Systems and Network Administrator
>>
>> Department of Psychology
>>
>> Florida State University
>>
>> PO Box 3064301
>>
>> Tallahassee, FL 32306-4301
>>
>> Phone: +1 850 645 1902
>>
>> Fax: +1 850 644 7739
>>
>>
>>
>
>