[slurm-users] [EXT] Distribute the node resources in multiple partitions and regarding job submission script

Ozeryan, Vladimir Vladimir.Ozeryan at jhuapl.edu
Tue Apr 12 10:05:54 UTC 2022


1.       I don't see where you are specifying a default partition (Default=YES on the PartitionName line).

2.       In your "NodeName=..." line you have Gres=gpu:2, so every node defined on that line has 2 GPUs. Create another "NodeName" line below it and list your non-GPU nodes there without the Gres flag (see the sketch below).
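A minimal slurm.conf sketch of both points, assuming purely for illustration that comp1-comp4 carry the GPUs and comp5-comp8 do not (adjust the node lists to your actual hardware; the GPU nodes also need matching gres.conf entries):

########################
NodeName=comp[1-4] Sockets=1 CPUs=64 CoresPerSocket=64 ThreadsPerCore=1 Gres=gpu:2
NodeName=comp[5-8] Sockets=1 CPUs=64 CoresPerSocket=64 ThreadsPerCore=1
PartitionName=par1 Default=YES State=UP Nodes=comp[1-4]
PartitionName=par2 State=UP Nodes=comp[5-8]
########################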

From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of Purvesh Parmar
Sent: Tuesday, April 12, 2022 5:49 AM
To: slurm-users at lists.schedmd.com
Subject: [EXT] [slurm-users] Distribute the node resources in multiple partitions and regarding job submission script




Hello,

I am using Slurm 21.08 and I am stuck with the following.

Q1: I have 8 nodes, each with 2 GPUs, 128 cores and 512 GB RAM. I want to split each node's resources across 2 partitions, so that partition "par1" gets 2 GPUs, 64 cores and 256 GB RAM of the node, while partition "par2" gets the remaining 64 cores and 256 GB RAM of the same node, with no GPUs.

par1 should be the default partition.

I have used MaxCPUsPerNode and listed each node in both par1 and par2. However, when I submit a job to par2 and request gres:gpu, the job is still accepted and runs (in spite of par2 having no GPUs).
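For illustration only, a submission along these lines is accepted and runs in par2 (the wrapped command is just a placeholder):

sbatch --partition=par2 --gres=gpu:1 --wrap="nvidia-smi"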

My slurm.conf looks something like this:

########################
NodeName=comp1,comp2......comp8 Sockets=1 CPUs=64 CoresPerSocket=64 ThreadsPerCore=1 Gres=gpu:2
PartitionName=par1 State=UP Nodes=comp1,comp2......comp8 MaxCPUsPerNode=64
PartitionName=par2 State=UP Nodes=comp1,comp2......comp8 MaxCPUsPerNode=64
########################

Where are things going wrong?

Q2: How do I save the job scripts permanently? I have set:
SlurmdSpoolDir=/usr/local/slurm/var/spool/slurmd
AccountingStorageEnforce=safe
AccountingStoreFlags=job_script,job_env
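For reference, my understanding is that with AccountingStoreFlags=job_script,job_env in place (and slurmdbd accounting configured), the stored script and environment can later be pulled back per job with sacct, e.g. (the job ID is illustrative):

sacct -j 12345 --batch-script
sacct -j 12345 --env-vars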

Regards,
Purvesh

