[slurm-users] Disable --no-allocate support for a node/SlurmD
René Sitt
sittr at hrz.uni-marburg.de
Wed Jun 14 15:32:13 UTC 2023
Hi,
> Thanks for the suggestion.
>
> However, as I understand it, this additionally requires trusting the
> node those scripts are running on,
> which is, I guess, the one running SlurmCtlD.
>
> The reason we are using Prolog scripts is that they are running on the
> very node the job will be running on.
> So we make that one "secure" (or at least harden it by e.g. disabling
> SSH access and restricting any other connections).
> Then anything running on this node has a high trust level, e.g. the
> SlurmD and the Prolog script.
> If required the node could be rebooted with a fixed image after each
> job removing any potential compromise.
> That isn't feasible for the SlurmCtlD as that would affect the whole
> cluster and unrelated jobs.
>
> Hence the checks (for example filtering out interactive jobs, but also
> some additional authentication) should be done on the hardened node(s).
>
> It would work if there wasn't a way to circumvent the Prolog. So
> ideally I'd like to have a configuration option for the SlurmD such
> that it doesn't accept such jobs.
> As the SlurmD config is on the node it can also be considered secure.
>
> So while I fully agree that those plugins are better suited and likely
> easier to use
> I fear that it is much easier to prevent them from running and hence
> bypass those restrictions
> than having something (local) at the level of the SlurmD.
>
> Please correct me if I misunderstood anything.
Ah okay, so your requirements include completely isolating (some) jobs
from outside access, including root? I've seen this kind of requirement
for e.g. working with non-defaced medical data - generally a tough
problem imo, because this level of data security seems more or less
incompatible with the idea of a multi-user HPC system.
I remember that this year's ZKI-AK Supercomputing spring meeting had
Sebastian Krey from GWDG presenting the KISSKI ("KI-Servicezentrum für
Sensible und Kritische Infrastrukturen", https://kisski.gwdg.de/ )
project, which works in this problem domain. Are you involved in that?
The setup with containerization and 'node hardening' sounds very similar
to me.
Re "preventing the scripts from running": I'd say it's about as easy as
otherwise manipulating any job submission that goes through slurmctld
(e.g. by editing slurm.conf), so without knowing your exact use case and
requirements, I can't think of a simple solution.
Kind regards,
René Sitt
--
Dipl.-Chem. René Sitt
Hessisches Kompetenzzentrum für Hochleistungsrechnen
Philipps-Universität Marburg
Hans-Meerwein-Straße
35032 Marburg
Tel. +49 6421 28 23523
sittr at hrz.uni-marburg.de
www.hkhlr.de