[slurm-users] trying to add gres

Wed Jan 6 00:58:31 UTC 2021

Important notes...

If you request more than one core without "-N 1", an equal number of GPUs
is allocated on each node where the cores are allocated. For example, if a
2-core job requests 1 GPU and gets one core on each of two nodes, one GPU is
allocated on each node, two in total.

If the job runs node-exclusive, all GPUs on the node are allocated to the
job, regardless of how many it actually uses.
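
To make the first note concrete, here is a minimal batch script sketch that
pins the job to a single node so the GPU request is not repeated on a second
node. It assumes the request is made per node with --gres (the notes above do
not name the option), reuses the "foolsgold" GPU type from the quoted message,
and uses an illustrative job name and task count. Treat it as a starting point
and adjust it to your site.

```shell
#!/bin/bash
# Sketch only: job name and task count are illustrative assumptions.
#SBATCH --job-name=gres-test
#SBATCH --nodes=1                 # same as "-N 1": keep all cores on one node
#SBATCH --ntasks=2
#SBATCH --gres=gpu:foolsgold:1    # per-node request: 1 GPU of type "foolsgold"

# Report what the job was given. Both variables fall back to a placeholder
# when unset, e.g. when the script runs outside Slurm, or when the gres is a
# test resource without device files and CUDA_VISIBLE_DEVICES is not exported.
nodes="${SLURM_JOB_NODELIST:-unknown}"
gpus="${CUDA_VISIBLE_DEVICES:-unset}"
echo "nodes=${nodes} gpus=${gpus}"
```

After submitting with sbatch, "scontrol show job <jobid>" should show what was
actually allocated, which can be compared against the request.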

On Tue, Jan 5, 2021 at 7:30 PM Erik Bryer <ebryer at isi.edu> wrote:

> I made the gres.conf the same on both nodes and Slurm started without
> error. I'm now seeing another error.
>
> There are 4 GPUs defined per node. If I start 2 jobs with
> #SBATCH --gpus=foolsgold:4
> it runs one job in each of the 2 nodes. If I scancel those and run 4 jobs
> with the script reading
> #SBATCH --gpus=foolsgold:1
> I get 2 queued and 2 running jobs. It seems allocating 1 gpu allocates all
> 4, not just 1. But why would this be so?
>
> Thanks,
> Erik
> ------------------------------
> *From:* slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of
> Chris Samuel <chris at csamuel.org>
> *Sent:* Thursday, December 24, 2020 5:44 PM
> *To:* slurm-users at lists.schedmd.com <slurm-users at lists.schedmd.com>
> *Subject:* Re: [slurm-users] trying to add gres
>
> On 24/12/20 4:42 pm, Erik Bryer wrote:
>
> > I made sure my slurm.conf is synchronized across machines. My intention
> > is to add some arbitrary gres for testing purposes.
>
> Did you update your gres.conf on all the nodes to match?
>
> All the best,
> Chris
> --
> Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
>
>