[slurm-users] Array jobs vs Fairshare

Riebs, Andy andy.riebs at hpe.com
Wed Oct 21 14:07:11 UTC 2020


Thanks for the additional information, Stephan!

At this point, I’ll have to ask anyone with more job array experience than I have (because I have none!) to speak up.

Remember that we’re all in this together(*), so any help that anyone can offer will be welcome!

Andy

(*) Well, actually, I’m retiring at the end of the week, so I’m not sure that I’ll have a lot of Slurm in my life, going forward ☺

From: slurm-users [mailto:slurm-users-bounces at lists.schedmd.com] On Behalf Of Stephan Schott
Sent: Wednesday, October 21, 2020 9:40 AM
To: Slurm User Community List <slurm-users at lists.schedmd.com>
Subject: Re: [slurm-users] Array jobs vs Fairshare

And I forgot to mention: things are running on a Qlustar cluster based on Ubuntu 18.04.4 LTS (Bionic). 😬

On Wed, Oct 21, 2020 at 15:38, Stephan Schott (<schottve at hhu.de>) wrote:
Oh, sure, sorry.
We are using Slurm 18.08.8 with the backfill scheduler. The jobs are all assigned to the same partition, which limits GPUs and CPUs to 1 via QOS. Here are some of the main settings:

SallocDefaultCommand="srun -n1 -N1 --mem-per-cpu=0 --gres=gpu:0 --pty --preserve-env --mpi=none $SHELL"
TaskPlugin=task/affinity,task/cgroup
TaskPluginParam=Sched
MinJobAge=300
FastSchedule=1
SchedulerType=sched/backfill
SelectType=select/cons_res
SelectTypeParameters=CR_CPU_Memory
PreemptType=preempt/qos
PreemptMode=requeue
PriorityType=priority/multifactor
PriorityFlags=FAIR_TREE
PriorityFavorSmall=YES
FairShareDampeningFactor=5
PriorityWeightAge=1000
PriorityWeightFairshare=5000
PriorityWeightJobSize=1000
PriorityWeightPartition=1000
PriorityWeightQOS=5000
PriorityWeightTRES=gres/gpu=1000
AccountingStorageEnforce=limits,qos,nosteps
AccountingStorageTRES=gres/gpu
AccountingStorageHost=localhost
AccountingStorageType=accounting_storage/slurmdbd
JobCompType=jobcomp/none
JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/cgroup
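
In case it is useful for diagnosing this, the per-job priority breakdown and the fairshare tree can be inspected with the standard Slurm tools. A sketch only; exact output columns vary by Slurm version, and the squeue format string below is just one possible choice:

# Configured priority weights (should match the PriorityWeight* values above)
sprio -w

# Per-pending-job breakdown of the priority factors: Age, Fairshare, JobSize, Partition, QOS, TRES
sprio -l

# Fairshare tree with raw and effective usage for all accounts/users
sshare -l -a

# Pending jobs sorted by priority, with the reason each one is still waiting
squeue --state=PENDING --sort=-p -o "%.18i %.9Q %.9u %.10a %.20r"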

Any ideas?

Cheers,

On Wed, Oct 21, 2020 at 15:17, Riebs, Andy (<andy.riebs at hpe.com>) wrote:
Also, of course, any information you can provide about how the system is configured (scheduler choices, QOS options, and the like) would help in answering your question.

From: slurm-users [mailto:slurm-users-bounces at lists.schedmd.com] On Behalf Of Riebs, Andy
Sent: Wednesday, October 21, 2020 9:02 AM
To: Slurm User Community List <slurm-users at lists.schedmd.com>
Subject: Re: [slurm-users] Array jobs vs Fairshare

Stephan (et al.),

There are probably 6 versions of Slurm in common use today, across multiple versions each of Debian/Ubuntu, SuSE/SLES, and RedHat/CentOS/Fedora. You are more likely to get a good answer if you offer some hints about what you are running!

Regards,
Andy

From: slurm-users [mailto:slurm-users-bounces at lists.schedmd.com] On Behalf Of Stephan Schott
Sent: Wednesday, October 21, 2020 8:37 AM
To: Slurm User Community List <slurm-users at lists.schedmd.com>
Subject: [slurm-users] Array jobs vs Fairshare

Hi everyone,
I have a question about array jobs. It seems to me that the JobArrayTaskLimit takes precedence over fairshare, as users with a much lower priority seem to get constant allocations for their array jobs compared to users running "normal" jobs. Can someone confirm this?
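
As a possible mitigation while this gets sorted out, the number of concurrently running tasks of a single array can be capped at submission time with the "%" suffix of --array, which is standard sbatch syntax and is often used to keep one array from monopolizing a partition regardless of priority. A minimal sketch, where my_array_job.sh and <array_job_id> are placeholders, and adjusting the cap after submission assumes scontrol in this Slurm version accepts the ArrayTaskThrottle job field:

# Submit a 200-task array but allow at most 10 of its tasks to run at once
sbatch --array=0-199%10 my_array_job.sh

# Change the cap on an already submitted array (placeholder job ID)
scontrol update JobId=<array_job_id> ArrayTaskThrottle=10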
Cheers,

--
Stephan Schott Verdugo
Biochemist

Heinrich-Heine-Universitaet Duesseldorf
Institut fuer Pharm. und Med. Chemie
Universitaetsstr. 1
40225 Duesseldorf
Germany



