[slurm-users] Heterogeneous HPC

Michael Jennings mej at lanl.gov
Thu Sep 19 15:30:07 UTC 2019


On Thursday, 19 September 2019, at 12:38:43 (+0430),
Mahmood Naderan wrote:

> The question is not directly related to Slurm, but is actually related to
> the people in this community.
> 
> For heterogeneous environments, where different operating systems,
> application and library versions are needed for HPC users, I would like to
> know if using docker/containers is better than yielding virtual machines?
> 
> Actually, it is lighter than VM, however, I haven't seen a docker image for
> Matlab for example. If that is possible, can Slurm be used to schedule
> containers?
> If someone has any experience using docker in HPC clusters, please let me
> know.

Docker is the wrong choice for HPC, at least today.  But Podman, from
Red Hat's CRI-O project, is a drop-in replacement for Docker which
doesn't use the client-server model of Docker and therefore addresses
many of the challenges with trying to run Docker for HPC user jobs.

There's also LANL's Charliecloud, which is a highly optimized
container runtime that (unlike the other options in this space, save
Podman) DOES NOT require any root privileges whatsoever, not even at
install time.  For (hopefully obvious) security reasons, you are far
safer using one of the unprivileged options.
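
To the "can Slurm schedule containers?" part: with an unprivileged
runtime the container is just another command in the job step.  Below
is a rough, untested Python sketch that only assembles such a command
line.  It assumes "srun" and Charliecloud's "ch-run" are on the PATH
and that the image has already been unpacked into a directory.  The
helper name, the image path, and the script name are all made up for
illustration.

```python
import shlex


def charliecloud_srun_command(image_dir, app_argv, ntasks=1):
    """Build the argv for running app_argv inside a Charliecloud image via srun.

    image_dir is assumed to be an already-unpacked image directory;
    nothing here checks that it exists.
    """
    if not app_argv:
        raise ValueError("app_argv must contain at least the program name")
    # ch-run takes the image first, then "--", then the command to run inside it.
    return ["srun", f"--ntasks={ntasks}", "ch-run", image_dir, "--", *app_argv]


if __name__ == "__main__":
    # Hypothetical image path and script name, for illustration only.
    argv = charliecloud_srun_command(
        "/var/tmp/myimage", ["python3", "hello.py"], ntasks=4
    )
    # Print the command instead of executing it, so it can be inspected first.
    print(shlex.join(argv))
```

The printed line could be pasted into a batch script or handed to
subprocess.run() from inside an allocation; site-specific options
(partition, account, bind mounts) are deliberately left out.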

Here at Los Alamos, we use both Charliecloud and Podman/Buildah along
with the Skopeo and umoci tools.  While we do not permit Singularity
on our systems for security reasons and don't run Shifter because it
requires privilege, we have had Charliecloud deployed and actively
used on both our Classified and Open Science systems for well over a
year now, and we are in the process of getting Podman/Buildah and
friends into the Secure systems as we speak.

(Note that all of the above require RHEL7 or higher; if you need RHEL6
support, you'll want to check out Shifter.)

Here are some videos of talks that might help you get up-to-speed on
this subject:

"LISA18 - Containers and Security on Planet X"
(https://youtu.be/F3qCvZMzUtE) - Why containers matter for HPC, what
makes HPC so different from the typical Docker/AppC use cases, and how
to choose the right solution for your site.

"Charliecloud - Unprivileged Containers for HPC"
(https://youtu.be/ESsZgcaP-ZQ) - What containers actually are under
the hood, how they work, what they are good for, and how to get up and
running with Charliecloud in under 5 minutes.

"Container Mythbusters" (https://youtu.be/FFyXdgWXD3A) - Dispelling
common misconceptions and debunking propaganda around containers,
container runtime security, and when/how you should (and should NOT)
use containers.

Hope those help!
Michael

-- 
Michael E. Jennings <mej at lanl.gov>
HPC Systems Team, Los Alamos National Laboratory
Bldg. 03-2327, Rm. 2341     W: +1 (505) 606-0605


