[slurm-users] slurm and singularity

Carl Ponder cponder at nvidia.com
Wed Feb 8 04:00:52 UTC 2023


Take a look at this extension to SLURM:

    https://github.com/NVIDIA/pyxis

You put the container path on the srun command line, and each rank runs 
inside its own copy of the image.
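
As a rough illustration of what that looks like, here is a minimal sketch that only prints an srun command line using pyxis's --container-image option. The image name and the helper function pyxis_cmd are placeholders chosen for the example, not something from this thread. Note also that pyxis runs containers through enroot rather than Singularity, so whether it fits depends on the site.

```shell
#!/bin/sh
# Sketch only: print (do not run) an srun command that uses pyxis.
# pyxis_cmd is a hypothetical helper; "ubuntu:22.04" is a placeholder image.
pyxis_cmd() {
    image="$1"
    shift
    # Everything after the image is passed through as the command to run.
    echo "srun --container-image=$image $*"
}

# Show the command for inspection; run it by hand on a cluster with pyxis installed.
pyxis_cmd ubuntu:22.04 hostname
```

On a cluster with pyxis installed, running the printed command should start the task inside the container on the allocated node.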

------------------------------------------------------------------------
Subject: 	[slurm-users] slurm and singularity
Date: 	Tue, 7 Feb 2023 17:31:45 +0000
From: 	Groner, Rob <rug262 at psu.edu>
Reply-To: 	Slurm User Community List <slurm-users at lists.schedmd.com>
To: 	slurm-users at lists.schedmd.com <slurm-users at lists.schedmd.com>



I'm trying to set up the capability where a user can execute:

$: sbatch <parameters> script_to_run.sh

and the end result is that a job is created on a node, and that job will 
execute "singularity exec <path to container> script_to_run.sh"

Also, that they could execute:

$: salloc <parameters>

and would end up on a node per their parameters, and instead of a bash 
prompt, they have the singularity prompt because they're inside a 
running container.

Oddly, I ran:  salloc <parameters> /usr/bin/singularity shell <path to 
sif> and that allocated and said the node was ready and gave me an 
apptainer prompt...cool!  But when I asked it what hostname I was on, I 
was NOT on the node that it had said was ready, I was still on the 
submit node.  When I exit out of the apptainer shell, it ends my 
allocation. Sooo...it gave me the allocation and started the apptainer 
shell, but somehow I was still on the submit node.
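
A likely explanation is that salloc runs the command it is given on the host where salloc itself was invoked, so the shell started on the submit node while the allocation sat idle. Launching the shell through srun inside the allocation is one way to land on the compute node. The sketch below only prints such a command; the helper name interactive_cmd, the image path, and the salloc options are placeholders, not values from this thread.

```shell
#!/bin/sh
# Sketch only: print (do not run) an salloc command whose payload is
# launched with srun, so that it starts on the allocated node.
# interactive_cmd is a hypothetical helper; the image path is a placeholder.
interactive_cmd() {
    image="$1"
    shift
    # Remaining arguments are treated as salloc options.
    echo "salloc $* srun --pty singularity shell $image"
}

interactive_cmd /path/to/image.sif --nodes=1 --time=01:00:00
```

Running hostname at the resulting prompt is a quick way to check which node the shell is actually on.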

As far as the job, I've done some experiments with using job_submit.lua 
to replace the script with one that has a singularity call in it 
instead, and that might hold some promise.  But I'd have to write the 
passed-in script to a temp file or something, and then have singularity 
exec that. That MIGHT work.
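
The temp-file idea could be sketched roughly as follows. This is an illustration of the wrapper only, not a tested job_submit.lua integration: the helper name make_wrapper and the image path are placeholders, and it assumes the saved copy lives on a filesystem that is visible both on the compute node and inside the container.

```shell
#!/bin/sh
# Sketch only: save the user's script (read from stdin) to a temp file and
# print a wrapper script that runs it with "singularity exec".
# make_wrapper is a hypothetical helper; the image path is a placeholder.
make_wrapper() {
    image="$1"
    # Assumption: this directory is shared with the compute node and
    # bind-mounted into the container.
    saved="$(mktemp "${TMPDIR:-/tmp}/job_script.XXXXXX")"
    cat > "$saved"
    chmod +x "$saved"
    printf '#!/bin/sh\nexec singularity exec %s %s\n' "$image" "$saved"
}

# Example: wrap a two-line script and show the generated wrapper.
printf '#!/bin/sh\nhostname\n' | make_wrapper /path/to/image.sif
```

The printed wrapper is what would be submitted in place of the original script; the original keeps its own interpreter line because it is executed directly inside the container.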

The search results for "slurm and singularity" do not describe what I'm trying 
to do.  The closest thing I can find is what slurm touts on their 
website, a leftover from Slurm 2017 talking about a spank plugin that, 
as near as I can figure, doesn't exist.  I read through the OCI docs on 
the slurm website, but it shows that using singularity with that 
requires all commands to have sudo. That's not going to work.

I'm running out of ideas here.

Thanks,

Rob