[slurm-users] slurm and singularity
Carl Ponder
cponder at nvidia.com
Wed Feb 8 04:00:52 UTC 2023
Take a look at this extension to SLURM:
https://github.com/NVIDIA/pyxis
You put the container path on the srun command line and each rank runs
inside its own copy of the image.
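As a rough illustration of the srun usage described above, here is a minimal sketch. It assumes the pyxis plugin (which relies on enroot rather than Singularity) is installed on the cluster. The image name, the task count, and the `run` helper with its `DRY_RUN` switch are placeholders introduced for this example, not something taken from this thread.

```shell
# Placeholder image reference; substitute whatever image your site uses.
IMAGE="ubuntu:22.04"

# DRY_RUN=1 (the default here) only prints the command.
# Set DRY_RUN=0 on a cluster that has pyxis installed to actually run it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Each of the two tasks runs grep inside the container image.
run srun --ntasks=2 --container-image="$IMAGE" grep PRETTY_NAME /etc/os-release
```

On a cluster with pyxis, running this with DRY_RUN=0 should print the OS name from inside the image once per task.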
------------------------------------------------------------------------
Subject: [slurm-users] slurm and singularity
Date: Tue, 7 Feb 2023 17:31:45 +0000
From: Groner, Rob <rug262 at psu.edu>
Reply-To: Slurm User Community List <slurm-users at lists.schedmd.com>
To: slurm-users at lists.schedmd.com <slurm-users at lists.schedmd.com>
I'm trying to set up the capability where a user can execute:
$: sbatch <parameters> script_to_run.sh
and the end result is that a job is created on a node, and that job will
execute "singularity exec <path to container> script_to_run.sh"
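One way this could be approximated without touching the Slurm configuration is a small client-side helper around sbatch's `--wrap` option. This is only a sketch: the `submit_in_container` function, the `SIF` variable with its placeholder path, and the `DRY_RUN` switch are all hypothetical names introduced here.

```shell
# Placeholder container path; this is an assumption, not a path from this thread.
SIF="${SIF:-/path/to/container.sif}"

# DRY_RUN=1 (the default here) prints the sbatch command instead of submitting.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: the first argument is the user script, the remaining
# arguments are passed through to sbatch unchanged.
submit_in_container() {
    script="$1"
    shift
    wrap="singularity exec $SIF $script"
    if [ "$DRY_RUN" = "1" ]; then
        echo "sbatch $* --wrap=$wrap"
    else
        sbatch "$@" --wrap="$wrap"
    fi
}

submit_in_container ./script_to_run.sh --ntasks=1
```

One caveat with this approach: because sbatch generates its own batch script for `--wrap`, any #SBATCH directives inside script_to_run.sh are not read, so the parameters have to be given on the command line.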
I'd also like them to be able to execute:
$: salloc <parameters>
and would end up on a node per their parameters, and instead of a bash
prompt, they have the singularity prompt because they're inside a
running container.
Oddly, I ran: salloc <parameters> /usr/bin/singularity shell <path to
sif> and that allocated and said the node was ready and gave me an
apptainer prompt...cool! But when I asked it what hostname I was on, I
was NOT on the node that it had said was ready, I was still on the
submit node. When I exit out of the apptainer shell, it ends my
allocation. Sooo...it gave me the allocation and started the apptainer
shell, but somehow I was still on the submit node.
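The behavior described above matches how salloc works by default: the command given to salloc runs on the host where salloc was invoked, and only job steps started with srun land on the allocated node. A minimal sketch of launching the container shell through srun follows. The `interactive_container` helper, the `SIF` placeholder path, and the `DRY_RUN` switch are hypothetical names introduced for this example.

```shell
# Placeholder container path; this is an assumption, not a path from this thread.
SIF="${SIF:-/path/to/container.sif}"

# DRY_RUN=1 (the default here) prints the command instead of requesting an allocation.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: all arguments are passed through to salloc.
# "srun --pty" starts the shell as a job step on the allocated node,
# with a pseudo-terminal attached so the prompt is interactive.
interactive_container() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "salloc $* srun --pty singularity shell $SIF"
    else
        salloc "$@" srun --pty singularity shell "$SIF"
    fi
}

interactive_container --nodes=1
```

With DRY_RUN=0, running hostname at the container prompt should then report the allocated node rather than the submit node.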
As far as the job, I've done some experiments with using job_submit.lua
to replace the script with one that has a singularity call in it
instead, and that might hold some promise. But I'd have to write the
passed-in script to a temp file or something, and then have singularity
exec that. That MIGHT work.
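The temp-file idea can also be sketched on the client side, outside job_submit.lua. This is a different technique from the Lua plugin approach, shown only to illustrate the wrapping step. The `make_wrapper` function and the `SIF` placeholder path are hypothetical, and the sketch assumes the user script is executable and sits on a filesystem that the compute nodes can see.

```shell
# Placeholder container path; this is an assumption, not a path from this thread.
SIF="${SIF:-/path/to/container.sif}"

# Hypothetical helper: writes a wrapper batch script to a temp file and prints
# the wrapper's path. The wrapper copies the #SBATCH lines from the user script
# so sbatch still honors them, then runs the user script inside the container.
make_wrapper() {
    user_script="$1"
    script_dir=$(cd "$(dirname "$user_script")" && pwd) || return 1
    abs_script="$script_dir/$(basename "$user_script")"
    wrapper=$(mktemp "${TMPDIR:-/tmp}/wrapped_job.XXXXXX") || return 1
    {
        echo '#!/bin/bash'
        # Carry over any #SBATCH directives (there may be none).
        grep '^#SBATCH' "$user_script"
        printf 'exec singularity exec "%s" "%s"\n' "$SIF" "$abs_script"
    } > "$wrapper"
    echo "$wrapper"
}

# Example use on a cluster (left commented out so nothing is submitted here):
# sbatch "$(make_wrapper ./script_to_run.sh)"
```

The wrapper refers to the user script by absolute path instead of embedding its contents, which keeps the sketch short but is why the shared-filesystem assumption matters.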
The search results for "slurm and singularity" do not describe what I'm trying
to do. The closest thing I can find is what slurm touts on their
website, a leftover from Slurm 2017 talking about a spank plugin that,
as near as I can figure, doesn't exist. I read through the OCI docs on
the slurm website, but it shows that using singularity with that
requires all commands to have sudo. That's not going to work.
I'm running out of ideas here.
Thanks,
Rob