[slurm-users] changes in slurm.
navin srivastava
navin.altair at gmail.com
Fri Jul 10 15:10:49 UTC 2020
Thanks. Either I can use the value that slurmd -C gives, since I see the
same set of nodes reporting different values, or I can go with the
available memory, i.e. 251*1024.
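
For illustration, a sketch of what the node line could end up looking like
(the RealMemory figure assumes the 251 GB that free -g reported, i.e.
251*1024 = 257024 MB; the authoritative number is whatever slurmd -C
prints on each node):

    NodeName=Node[1-12] NodeHostname=deda1x[1450-1461] NodeAddr=Node[1-12] Sockets=2 CoresPerSocket=10 RealMemory=257024 State=UNKNOWN
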
Regards
Navin
On Fri, Jul 10, 2020, 20:34 Stephan Roth <stephan.roth at ee.ethz.ch> wrote:
> It's recommended to round RealMemory down to the next lower gigabyte
> value to prevent nodes from entering a drain state after rebooting
> with a BIOS or kernel update.
>
> Source: https://slurm.schedmd.com/SLUG17/FieldNotes.pdf, "Node
> configuration"
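>
> For example (figures are illustrative only): if slurmd -C reports
> RealMemory=257842, the next lower gigabyte is 251 GB, so the slurm.conf
> entry would be RealMemory=257024 (251*1024).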
>
> Stephan
>
> On 10.07.20 13:46, Sarlo, Jeffrey S wrote:
> > If you run slurmd -C on the compute node, it should tell you what
> > slurm thinks the RealMemory number is.
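> >
> > For instance, its output looks roughly like this (values illustrative,
> > not taken from the nodes in question):
> >
> >     NodeName=deda1x1450 CPUs=20 Boards=1 SocketsPerBoard=2 CoresPerSocket=10 ThreadsPerCore=1 RealMemory=257842
> >     UpTime=12-04:33:12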
> >
> > Jeff
> >
> > ------------------------------------------------------------------------
> > *From:* slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of
> > navin srivastava <navin.altair at gmail.com>
> > *Sent:* Friday, July 10, 2020 6:24 AM
> > *To:* Slurm User Community List <slurm-users at lists.schedmd.com>
> > *Subject:* Re: [slurm-users] changes in slurm.
> > Thank you for the answers.
> >
> > Will RealMemory be decided by the total memory value or the total
> > usable memory value?
> >
> > I mean, a node has 256 GB of RAM, but free -g reports only 251 GB:
> >
> > deda1x1591:~ # free -g
> >              total       used       free     shared    buffers     cached
> > Mem:           251         67        184          6          0         47
> >
> > So should we add the value as 251*1024 MB or 256*1024 MB? Or is there
> > any Slurm command which will provide the value to add?
> >
> > Regards
> > Navin.
> >
> >
> >
> > On Thu, Jul 9, 2020 at 8:01 PM Brian Andrus <toomuchit at gmail.com> wrote:
> >
> > Navin,
> >
> > 1. You will need to restart slurmctld when you make changes to the
> > physical definition of a node. This can be done without affecting
> > running jobs (see the command sketch below, after point 2).
> >
> > 2. You can have a node in more than one partition; that will not hurt
> > anything. Jobs are allocated to nodes, not partitions; the partition is
> > used to determine which node(s) to use and to filter/order jobs. You
> > should add the node to the new partition, but also leave it in the
> > 'test' partition. If you are looking to remove the 'test' partition,
> > set it to down and, once all the running jobs in it finish, remove it.
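> >
> > A rough sketch of the commands involved (illustrative only; the
> > partition and node names are the ones from this thread, and a
> > systemd-based install is assumed):
> >
> >     # on the controller, after editing slurm.conf (e.g. adding RealMemory):
> >     systemctl restart slurmctld
> >     # have the daemons re-read the configuration:
> >     scontrol reconfigure
> >
> >     # keep the node in both partitions (slurm.conf; a node may appear
> >     # in more than one Nodes= list):
> >     PartitionName=test   Nodes=Node[1-12] ...
> >     PartitionName=normal Nodes=Node[1-12] ...
> >
> >     # when retiring the 'test' partition, stop new jobs first:
> >     scontrol update PartitionName=test State=DOWN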
> >
> > Brian Andrus
> >
> > On 7/8/2020 10:57 PM, navin srivastava wrote:
> > > Hi Team,
> > >
> > > I have two small queries. Because of the lack of a testing environment
> > > I am unable to test these scenarios; I am working on setting up a test
> > > environment.
> > >
> > > 1. In my environment I am unable to pass the #SBATCH --mem=2GB option
> > > (see the small example script below). I found the reason is that there
> > > is no RealMemory entry in the node definition in slurm.conf.
> > >
> > > NodeName=Node[1-12] NodeHostname=deda1x[1450-1461] NodeAddr=Node[1-12]
> > > Sockets=2 CoresPerSocket=10 State=UNKNOWN
> > >
> > > If I add RealMemory it should be able to pick it up. So my query here
> > > is: is it possible to add RealMemory to the definition at any time
> > > while jobs are in progress, then execute scontrol reconfigure and
> > > reload the daemon on the client nodes? Or do we need to take a
> > > downtime (which I don't think is necessary)?
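> > >
> > > For illustration only (script and program names are hypothetical), the
> > > kind of job script this is about:
> > >
> > >     #!/bin/bash
> > >     #SBATCH --job-name=memtest
> > >     # a per-node memory request; relies on RealMemory being configured
> > >     #SBATCH --mem=2G
> > >     srun ./my_program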
> > >
> > > 2. Also, I would like to know what will happen if some jobs are running
> > > in a partition (say test) and I move the associated node to some other
> > > partition (say normal) without draining the node, or if I suspend the
> > > jobs, change the node's partition, and then resume them. I am not
> > > deleting the partition here.
> > >
> > > Regards
> > > Navin.
> > >
> >
>
>
> -------------------------------------------------------------------
> Stephan Roth | ISG.EE D-ITET ETH Zurich | http://www.isg.ee.ethz.ch
> +4144 632 30 59 | ETF D 104 | Sternwartstrasse 7 | 8092 Zurich
> -------------------------------------------------------------------
>
>