[slurm-users] Ubuntu Cluster with Slurm
Renfro, Michael
Renfro at tntech.edu
Wed May 13 14:05:21 UTC 2020
I’d compare the RealMemory value reported by 'scontrol show node abhi-HP-EliteBook-840-G2' to the RealMemory you configured in your slurm.conf:
> Nodes which register to the system with less than the configured resources (e.g. too little memory), will be placed in the "DOWN" state to avoid scheduling jobs on them.
— https://slurm.schedmd.com/slurm.conf.html
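A quick way to check: run 'slurmd -C' on each compute node to see the resources slurmd actually detects, and compare its RealMemory against what slurm.conf says. A minimal sketch of that comparison (the reported number below is made up for illustration, not taken from your post):

```shell
# What 'slurmd -C' might print on the HP (illustrative value, not real data):
sample="NodeName=abhi-HP-EliteBook-840-G2 CPUs=2 RealMemory=11876"

# Extract the reported memory and compare it to the slurm.conf setting.
reported=$(echo "$sample" | grep -o 'RealMemory=[0-9]*' | cut -d= -f2)
configured=14000   # from the NodeName line in slurm.conf

if [ "$reported" -lt "$configured" ]; then
  echo "under-reporting: node sees ${reported} MB, slurm.conf expects ${configured} MB"
fi
```

If the detected value really is lower, reduce RealMemory in slurm.conf on every machine, run 'scontrol reconfigure', and bring the node back with 'scontrol update NodeName=abhi-HP-EliteBook-840-G2 State=RESUME'.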
As far as GPUs go, it looks like you have Intel graphics on the Lenovo and a Radeon R7 on the HP? If so, then nothing is CUDA-compatible, but you might be able to make something work with OpenCL. No idea if that would give performance improvements over the CPUs, though.
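For completeness on the GPU question: Slurm only schedules GPUs that you explicitly declare as generic resources (GRES); it does not use them otherwise. If you ever add a supported GPU, the two config fragments involved would look roughly like this (the device file path is a guess for illustration, not something from your machines):

```
# slurm.conf additions
GresTypes=gpu
NodeName=abhi-HP-EliteBook-840-G2 RealMemory=14000 CPUs=2 Gres=gpu:1

# gres.conf on that node (hypothetical device file)
NodeName=abhi-HP-EliteBook-840-G2 Name=gpu File=/dev/dri/card0
```

Jobs would then request the resource with something like 'sbatch --gres=gpu:1'.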
--
Mike Renfro, PhD / HPC Systems Administrator, Information Technology Services
931 372-3601 / Tennessee Tech University
> On May 13, 2020, at 8:42 AM, Abhinandan Patil <abhinandan_patil_1414 at yahoo.com> wrote:
>
> Dear All,
>
> Preamble
> ----------
> I want to form a simple cluster with three laptops:
> abhi-Latitude-E6430 //This serves as the controller
> abhi-Lenovo-ideapad-330-15IKB //Compute Node
> abhi-HP-EliteBook-840-G2 //Compute Node
>
>
> Aim
> -------------
> I want to make use of the CPU+GPU+RAM on all the machines when I execute Java or Python programs.
>
>
> Implementation
> ------------------------
> Now let us look at the slurm.conf
>
> On Machine abhi-Latitude-E6430
>
> ClusterName=linux
> ControlMachine=abhi-Latitude-E6430
> SlurmUser=abhi
> SlurmctldPort=6817
> SlurmdPort=6818
> AuthType=auth/munge
> SwitchType=switch/none
> StateSaveLocation=/tmp
> MpiDefault=none
> ProctrackType=proctrack/pgid
> NodeName=abhi-Lenovo-ideapad-330-15IKB RealMemory=12000 CPUs=2
> NodeName=abhi-HP-EliteBook-840-G2 RealMemory=14000 CPUs=2
> PartitionName=debug Nodes=ALL Default=YES MaxTime=INFINITE State=UP
>
> The same slurm.conf is copied to all the machines.
>
>
> Observations
> --------------------------------------
> Now when I do
> abhi at abhi-HP-EliteBook-840-G2:~$ service slurmd status
> ● slurmd.service - Slurm node daemon
> Loaded: loaded (/lib/systemd/system/slurmd.service; enabled; vendor preset: enabled)
> Active: active (running) since Wed 2020-05-13 18:50:01 IST; 1min 49s ago
> Docs: man:slurmd(8)
> Process: 98235 ExecStart=/usr/sbin/slurmd $SLURMD_OPTIONS (code=exited, status=0/SUCCESS)
> Main PID: 98253 (slurmd)
> Tasks: 2
> Memory: 2.2M
> CGroup: /system.slice/slurmd.service
> └─98253 /usr/sbin/slurmd
>
> abhi at abhi-Lenovo-ideapad-330-15IKB:~$ service slurmd status
> ● slurmd.service - Slurm node daemon
> Loaded: loaded (/lib/systemd/system/slurmd.service; enabled; vendor preset: enabled)
> Active: active (running) since Wed 2020-05-13 18:50:20 IST; 8s ago
> Docs: man:slurmd(8)
> Process: 71709 ExecStart=/usr/sbin/slurmd $SLURMD_OPTIONS (code=exited, status=0/SUCCESS)
> Main PID: 71734 (slurmd)
> Tasks: 2
> Memory: 2.0M
> CGroup: /system.slice/slurmd.service
> └─71734 /usr/sbin/slurmd
>
> abhi at abhi-Latitude-E6430:~$ service slurmctld status
> ● slurmctld.service - Slurm controller daemon
> Loaded: loaded (/lib/systemd/system/slurmctld.service; enabled; vendor preset: enabled)
> Active: active (running) since Wed 2020-05-13 18:48:58 IST; 4min 56s ago
> Docs: man:slurmctld(8)
> Process: 97114 ExecStart=/usr/sbin/slurmctld $SLURMCTLD_OPTIONS (code=exited, status=0/SUCCESS)
> Main PID: 97116 (slurmctld)
> Tasks: 7
> Memory: 2.6M
> CGroup: /system.slice/slurmctld.service
> └─97116 /usr/sbin/slurmctld
>
>
> However, sinfo on the controller shows only one node:
> abhi at abhi-Latitude-E6430:~$ sinfo
> PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
> debug* up infinite 1 down* abhi-Lenovo-ideapad-330-15IKB
>
>
> Advice needed
> ------------------------
> Please let me know why I am seeing only one node.
> Also, how is the total memory calculated? Can Slurm make use of GPU processing power as well?
> Please let me know if I have missed anything in the configuration or my explanation.
>
> Thank you all
>
> Best Regards,
> Abhinandan H. Patil, +919886406214
> https://www.AbhinandanHPatil.info
>
>