[slurm-users] Heterogeneous HPC

Stijn De Weirdt stijn.deweirdt at ugent.be
Fri Sep 20 08:34:33 UTC 2019


hi michael,

very interesting feedback!
have you ever tried/looked at https://github.com/eth-cscs/sarus?

stijn

On 9/20/19 9:11 AM, Mahmood Naderan wrote:
> I appreciate the replies.
> I will try to test Charliecloud to see what is what...
> 
> 
> On Fri, Sep 20, 2019, 10:37 Fulcomer, Samuel <samuel_fulcomer at brown.edu>
> wrote:
> 
>>
>>
>> Thanks! and I'll watch the video...
>>
>> Privileged containers!.... never!....
>>
>> On Thu, Sep 19, 2019 at 9:06 PM Michael Jennings <mej at lanl.gov> wrote:
>>
>>> On Thursday, 19 September 2019, at 19:27:38 (-0400),
>>> Fulcomer, Samuel wrote:
>>>
>>>> I obviously haven't been keeping up with any security concerns over the
>>>> use of Singularity. In a 2-3 sentence nutshell, what are they?
>>>
>>> So before I do that, if you have a few minutes, I do think you'll find
>>> it worth your time to go to https://youtu.be/H6VrjowOOF4?t=2361 (it'll
>>> start about 39 minutes in) and watch at least those next 8 or so minutes.
>>> I go into some detail about the security track records of multiple
>>> container runtimes and provide factual data so that folks can make their
>>> own risk assessments rather than just giving my personal opinion.  (The
>>> video does cut off the right side of the slides, but the slide deck is
>>> available at
>>> https://permalink.lanl.gov/object/tr?what=info:lanl-repo/lareport/LA-UR-19-22663
>>> for anyone interested.)
>>>
>>> If you really don't want to watch the video, though, I can provide a few
>>> of the data points.
>>>
>>> First off, if you have not read it before, you really should read
>>> Matthias Gerstner's assessment after doing a code review and security
>>> audit on Singularity 2.6.0 to see if it could be packaged for SuSE:
>>> https://www.openwall.com/lists/oss-security/2018/12/12/2
>>> The quotes I used on the slide for my talk came from comments he made in
>>> the linked SuSE Bugzilla bug -- which, for unknown reasons, was
>>> re-locked by SuSE after previously being unlocked once the bug report
>>> was public! -- regarding whether or not, and under what constraints, to
>>> include and support Singularity on SuSE.  Matthias is a widely respected
>>> security expert in the OSS community, so I trust his assessment and
>>> insight.  And his audit alone found 5 or 6 CVE-worthy vulnerabilities at
>>> once.
>>>
>>> Additionally, as I mentioned in the video, during the 3-year period
>>> 2016-2018, there were at least 17 different vulnerabilities found in
>>> Singularity.  Also, of the 9 releases they did during 2018, 7 of those
>>> were security releases to fix vulnerabilities (and frequently more than
>>> 1 at a time).  That's...not great.  Especially in an environment like
>>> ours where saying "security is important" is an understatement of
>>> nuclear proportions! ;-)
>>>
>>> And finally, while we were hopeful that the rewrite in Go (version 3.0
>>> and above) would correct the security failings in the code, there've
>>> already been multiple serious vulnerabilities (all grouped together
>>> under a single CVE identifier, CVE-2019-11328), at least one of which
>>> was essentially a replica of one of the flaws fixed in 2.6.0 under
>>> CVE-2018-12021!  And you don't need to take my word for it, either:
>>> https://www.openwall.com/lists/oss-security/2019/05/16/1
>>>
>>> It's hard to say if the above trend will continue...but not all sites
>>> can afford to take those kinds of risks.
>>>
>>> And while Shifter's security track record is spotless to date, I would
>>> still summarize the overall lesson to be learned as, "Don't use
>>> privileged container runtimes.  Use user namespaces.  That's what
>>> they're there for."  And before anyone yells at me, yes I know
>>> Singularity advertises user namespace support and non-setuid operation.
>>> But it doesn't seem to be very widely used or adequately exercised, and
>>> AFAICT the default mode of operation in both RPMs and build-from-src is
>>> via setuid binaries.  So using a natively unprivileged runtime still
>>> seems the less risky choice, in my personal assessment.
>>>
>>> Yes, I know that was more than a "2-3 sentence nutshell," but hopefully
>>> it was helpful anyway! :-)
>>>
>>> Michael
>>>
>>> --
>>> Michael E. Jennings <mej at lanl.gov>
>>> HPC Systems Team, Los Alamos National Laboratory
>>> Bldg. 03-2327, Rm. 2341     W: +1 (505) 606-0605
>>>
>>
> 


