[slurm-users] Reserving cores without immediately launching tasks on all of them

Mccall, Kurt E. (MSFC-EV41) kurt.e.mccall at nasa.gov
Fri Nov 26 19:18:48 UTC 2021


Mike,

I’m working through your suggestions.   I tried

$ salloc --ntasks=20 --cpus-per-task=24 --verbose myscript.bash

but while salloc grants the allocation, the job step fails with “Requested node configuration is not available”:

salloc: defined options
salloc: -------------------- --------------------
salloc: cpus-per-task       : 24
salloc: ntasks              : 20
salloc: verbose             : 1
salloc: -------------------- --------------------
salloc: end of defined options
salloc: Linear node selection plugin loaded with argument 4
salloc: select/cons_res loaded with argument 4
salloc: Cray/Aries node selection plugin loaded
salloc: select/cons_tres loaded with argument 4
salloc: Granted job allocation 34299
srun: error: Unable to create step for job 34299: Requested node configuration is not available
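
Since the allocation itself is granted, I’m going to check whether a single step of that shape even fits inside it. This is just a sketch I haven’t run yet; 34299 is the job id from the log above.

$ scontrol show job 34299 | grep -E 'NumNodes|NumCPUs|TRES'    # what the allocation actually contains
$ srun --jobid=34299 --ntasks=1 --cpus-per-task=24 hostname    # does one 24-cpu step fit on a single node?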

$ scontrol show nodes  /* oddly, this reports one core per socket and RealMemory=1.  Could our nodes be misconfigured? */

NodeName=n020 Arch=x86_64 CoresPerSocket=1
   CPUAlloc=0 CPUTot=24 CPULoad=0.00
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=n020 NodeHostName=n020 Version=20.02.3
   OS=Linux 4.18.0-305.7.1.el8_4.x86_64 #1 SMP Mon Jun 14 17:25:42 EDT 2021
   RealMemory=1 AllocMem=0 FreeMem=126431 Sockets=24 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=normal,low,high
   BootTime=2021-11-18T08:43:44 SlurmdStartTime=2021-11-18T08:44:31
   CfgTRES=cpu=24,mem=1M,billing=24
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
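
One sanity check I plan to run (the socket/core split and memory below are guesses for a 24-core, ~128 GB node, not our real values) is to compare what slurmd detects on the node against the NodeName line in slurm.conf:

$ slurmd -C    # prints the NodeName line slurmd would generate from the detected hardware

# hypothetical slurm.conf entry if these are really dual-socket, 12-core, 128 GB machines:
NodeName=n[001-020] Sockets=2 CoresPerSocket=12 ThreadsPerCore=1 RealMemory=128000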



From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of Renfro, Michael
Sent: Friday, November 26, 2021 8:15 AM
To: Slurm User Community List <slurm-users at lists.schedmd.com>
Subject: [EXTERNAL] Re: [slurm-users] Reserving cores without immediately launching tasks on all of them

The end of the MPICH section at [1] shows an example using salloc [2].

Worst case, you should be able to take the output of “scontrol show hostnames” [3] and build mpiexec host arguments from it to run one rank per node, similar to what’s shown at the end of the synopsis section of [4].
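
Something like the following inside the batch script might do it (a sketch only -- I’m assuming MPICH’s Hydra mpiexec, and ./manager is a placeholder for your manager program):

# expand the compact nodelist (e.g. n[001-020]) into hostnames and join them with commas
HOSTS=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | paste -sd, -)
# one rank per node; the rest of each node's allocated cores stay free for spawned workers
mpiexec -hosts "$HOSTS" -ppn 1 ./manager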

[1] https://slurm.schedmd.com/mpi_guide.html#mpich2
[2] https://slurm.schedmd.com/salloc.html
[3] https://slurm.schedmd.com/scontrol.html
[4] https://www.mpich.org/static/docs/v3.1/www1/mpiexec.html
--
Mike Renfro, PhD  / HPC Systems Administrator, Information Technology Services
931 372-3601      / Tennessee Tech University


On Nov 25, 2021, at 12:45 PM, Mccall, Kurt E. (MSFC-EV41) <kurt.e.mccall at nasa.gov> wrote:


I want to launch an MPICH job with sbatch with one task per node (each a manager), while also reserving a certain number of cores on each node for the managers to fill up with spawned workers (via MPI_Comm_spawn).  I’d like to avoid using --exclusive.

I tried the arguments --ntasks=20 --cpus-per-task=24, but it appears that 20 * 24 tasks will be launched.  Is there a way to reserve cores without immediately launching tasks on them?  Thanks for any help.

sbatch: defined options
sbatch: -------------------- --------------------
sbatch: cpus-per-task       : 24
sbatch: ignore-pbs          : set
sbatch: ntasks              : 20
sbatch: test-only           : set
sbatch: verbose             : 1
sbatch: -------------------- --------------------
sbatch: end of defined options
sbatch: Linear node selection plugin loaded with argument 4
sbatch: select/cons_res loaded with argument 4
sbatch: Cray/Aries node selection plugin loaded
sbatch: select/cons_tres loaded with argument 4
sbatch: Job 34274 to start at 2021-11-25T12:15:05 using 480 processors on nodes n[001-020] in partition normal
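
What I’m aiming for is roughly the following (a sketch; ./manager stands in for our manager executable):

#!/bin/bash
#SBATCH --nodes=20            # one manager per node
#SBATCH --ntasks-per-node=1   # launch only a single task on each node
#SBATCH --cpus-per-task=24    # but keep all 24 cores reserved for that task and its spawned workers
mpiexec -ppn 1 ./manager      # 20 managers start; workers appear only when MPI_Comm_spawn is called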

