[slurm-users] Ubuntu Cluster with Slurm
Renfro, Michael
Renfro at tntech.edu
Wed May 13 14:05:21 UTC 2020
I’d compare the RealMemory value reported by 'scontrol show node abhi-HP-EliteBook-840-G2' to the RealMemory you configured in your slurm.conf:
> Nodes which register to the system with less than the configured resources (e.g. too little memory), will be placed in the "DOWN" state to avoid scheduling jobs on them.
— https://slurm.schedmd.com/slurm.conf.html
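A quick way to check: run 'slurmd -C' on each compute node to see the resources slurmd actually detects, and compare its RealMemory against what slurm.conf says. A minimal sketch of that comparison (the reported number below is made up for illustration, not taken from your post):

```shell
# What 'slurmd -C' might print on the HP (illustrative value, not real data):
sample="NodeName=abhi-HP-EliteBook-840-G2 CPUs=2 RealMemory=11876"

# Extract the reported memory and compare it to the slurm.conf setting.
reported=$(echo "$sample" | grep -o 'RealMemory=[0-9]*' | cut -d= -f2)
configured=14000   # from the NodeName line in slurm.conf

if [ "$reported" -lt "$configured" ]; then
  echo "under-reporting: node sees ${reported} MB, slurm.conf expects ${configured} MB"
fi
```

If the detected value really is lower, reduce RealMemory in slurm.conf on every machine, run 'scontrol reconfigure', and bring the node back with 'scontrol update NodeName=abhi-HP-EliteBook-840-G2 State=RESUME'.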
As far as GPUs go, it looks like you have Intel graphics on the Lenovo and a Radeon R7 on the HP? If so, then nothing is CUDA-compatible, but you might be able to make something work with OpenCL. No idea if that would give performance improvements over the CPUs, though.
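For completeness on the GPU question: Slurm only schedules GPUs that you explicitly declare as generic resources (GRES); it does not use them otherwise. If you ever add a supported GPU, the two config fragments involved would look roughly like this (the device file path is a guess for illustration, not something from your machines):

```
# slurm.conf additions
GresTypes=gpu
NodeName=abhi-HP-EliteBook-840-G2 RealMemory=14000 CPUs=2 Gres=gpu:1

# gres.conf on that node (hypothetical device file)
NodeName=abhi-HP-EliteBook-840-G2 Name=gpu File=/dev/dri/card0
```

Jobs would then request the resource with something like 'sbatch --gres=gpu:1'.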
--
Mike Renfro, PhD / HPC Systems Administrator, Information Technology Services
931 372-3601 / Tennessee Tech University
> On May 13, 2020, at 8:42 AM, Abhinandan Patil <abhinandan_patil_1414 at yahoo.com> wrote:
>
> Dear All,
>
> Preamble
> ----------
> I want to form a simple cluster with three laptops:
> abhi-Latitude-E6430 //This serves as the controller
> abhi-Lenovo-ideapad-330-15IKB //Compute Node
> abhi-HP-EliteBook-840-G2 //Compute Node
>
>
> Aim
> -------------
> I want to make use of the CPU+GPU+RAM on all the machines when I execute Java or Python programs.
>
>
> Implementation
> ------------------------
> Now let us look at the slurm.conf
>
> On Machine abhi-Latitude-E6430
>
> ClusterName=linux
> ControlMachine=abhi-Latitude-E6430
> SlurmUser=abhi
> SlurmctldPort=6817
> SlurmdPort=6818
> AuthType=auth/munge
> SwitchType=switch/none
> StateSaveLocation=/tmp
> MpiDefault=none
> ProctrackType=proctrack/pgid
> NodeName=abhi-Lenovo-ideapad-330-15IKB RealMemory=12000 CPUs=2
> NodeName=abhi-HP-EliteBook-840-G2 RealMemory=14000 CPUs=2
> PartitionName=debug Nodes=ALL Default=YES MaxTime=INFINITE State=UP
>
> The same slurm.conf is copied to all the machines.
>
>
> Observations
> --------------------------------------
> Now when I do
> abhi at abhi-HP-EliteBook-840-G2:~$ service slurmd status
> ● slurmd.service - Slurm node daemon
> Loaded: loaded (/lib/systemd/system/slurmd.service; enabled; vendor preset: enabled)
> Active: active (running) since Wed 2020-05-13 18:50:01 IST; 1min 49s ago
> Docs: man:slurmd(8)
> Process: 98235 ExecStart=/usr/sbin/slurmd $SLURMD_OPTIONS (code=exited, status=0/SUCCESS)
> Main PID: 98253 (slurmd)
> Tasks: 2
> Memory: 2.2M
> CGroup: /system.slice/slurmd.service
> └─98253 /usr/sbin/slurmd
>
> abhi at abhi-Lenovo-ideapad-330-15IKB:~$ service slurmd status
> ● slurmd.service - Slurm node daemon
> Loaded: loaded (/lib/systemd/system/slurmd.service; enabled; vendor preset: enabled)
> Active: active (running) since Wed 2020-05-13 18:50:20 IST; 8s ago
> Docs: man:slurmd(8)
> Process: 71709 ExecStart=/usr/sbin/slurmd $SLURMD_OPTIONS (code=exited, status=0/SUCCESS)
> Main PID: 71734 (slurmd)
> Tasks: 2
> Memory: 2.0M
> CGroup: /system.slice/slurmd.service
> └─71734 /usr/sbin/slurmd
>
> abhi at abhi-Latitude-E6430:~$ service slurmctld status
> ● slurmctld.service - Slurm controller daemon
> Loaded: loaded (/lib/systemd/system/slurmctld.service; enabled; vendor preset: enabled)
> Active: active (running) since Wed 2020-05-13 18:48:58 IST; 4min 56s ago
> Docs: man:slurmctld(8)
> Process: 97114 ExecStart=/usr/sbin/slurmctld $SLURMCTLD_OPTIONS (code=exited, status=0/SUCCESS)
> Main PID: 97116 (slurmctld)
> Tasks: 7
> Memory: 2.6M
> CGroup: /system.slice/slurmctld.service
> └─97116 /usr/sbin/slurmctld
>
>
> However, sinfo on the controller shows only one node:
> abhi at abhi-Latitude-E6430:~$ sinfo
> PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
> debug* up infinite 1 down* abhi-Lenovo-ideapad-330-15IKB
>
>
> Advice needed
> ------------------------
> Please let me know why I am seeing only one node.
> Also, how is the total memory calculated? Can Slurm make use of GPU processing power as well?
> Please let me know if I have missed anything in the configuration or my explanation.
>
> Thank you all
>
> Best Regards,
> Abhinandan H. Patil, +919886406214
> https://www.AbhinandanHPatil.info
>
>