I cannot send jobs to nodes with one GPU, and I can't find the bug in my configuration. Can someone help me?
In slurm.conf, GresTypes=gpu is set.
These are some of the nodes in slurm.conf:
NodeName=gpu-[001-003] CPUs=8 SocketsPerBoard=1 CoresPerSocket=4 RealMemory=31000 Gres=gpu:1080:1
NodeName=gpu-[010-019] CPUs=16 SocketsPerBoard=1 CoresPerSocket=8 RealMemory=64000 Gres=gpu:1080:2
The partition for these GPU nodes is:
# General GPU partitions
PartitionName=GPU Nodes=gpu-[001-003,010-019] AllowAccounts=staff PreemptMode=REQUEUE PriorityTier=0 DefMemPerGPU=32000 DefCpuPerGPU=8 CpuBind=none TRESBillingWeights="GRES/gpu=1000" GraceTime=300
These are the entries for some nodes in gres.conf:
NodeName=gpu-[001-003] Name=gpu Type=1080 File=/dev/nvidia0
NodeName=gpu-[010-019] Name=gpu Type=1080 File=/dev/nvidia[0-1]
When I submit a job with sbatch to gpu-001:
#SBATCH --job-name=hello
#SBATCH --ntasks-per-node=1
#SBATCH --output=hello_%A.out
#SBATCH --time=00:10:00
#SBATCH --mail-type=ALL
#SBATCH --mail-user=striewski@ismll.de
#SBATCH --partition=GPU
#SBATCH --nodelist=gpu-001
#SBATCH --gres=gpu:1
[...]
I get the error:
sbatch: error: Batch job submission failed: Requested node configuration is not available
When I send the job to a node with two GPUs, it runs without error, just by setting --nodelist=gpu-12.
Does anyone have a hint as to what I did wrong?
Kind regards