[slurm-users] Defining constraints for job dispatching

Renfro, Michael Renfro at tntech.edu
Thu Sep 20 07:48:11 MDT 2018


Partitions have an ExclusiveUser setting. It's not exclusive per job, as I'd misremembered, but exclusive per user.
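
For reference, that gets set on the partition definition in slurm.conf; the lines below are only a sketch with hypothetical node and partition names, not our actual configuration:

=====

# Hypothetical node and partition names; ExclusiveUser=YES limits each
# allocated node to jobs from a single user at a time.
NodeName=node[001-004] CPUs=28 State=UNKNOWN
PartitionName=fluent Nodes=node[001-004] ExclusiveUser=YES MaxTime=1-00:00:00 State=UP

=====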

In any case, none of my few Fluent users run graphically on the HPC. They do their pre- and post-processing on local workstations, copy their .cas.gz and .dat.gz files to the HPC, and run Fluent non-graphically in batch mode:
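
For illustration, the round trip looks roughly like this (the hostnames, paths, and script name are made up for the example):

=====

# On the local workstation:
scp FFF-1-1.cas.gz FFF-1-1-00000.dat.gz hpc.example.edu:fluent-run/

# On the cluster login node:
cd fluent-run
sbatch fluent-job.sh        # the batch script shown further down

# Back on the workstation, once the job finishes:
scp hpc.example.edu:fluent-run/FFF-1-1-03000.dat.gz .

=====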

Bash functions that everyone sources for a Fluent run:

=====

# Build a nodelist file for Fluent's -cnf option, one fully qualified
# InfiniBand hostname per line.
function fluent_make_nodelist() {
    > nodelist.${SLURM_JOBID}
    for n in $(scontrol show hostnames ${SLURM_NODELIST}); do
        echo "${n}.hpcib.tntech.edu" >> nodelist.${SLURM_JOBID}
    done
}

function fluent_setup() {
    module load fluent
    # Calculate the final iteration value, zero-padded to 5 digits.
    # (expr, unlike $(( )), treats zero-padded values such as 00090 as decimal.)
    END=$(printf "%05d" $(expr ${START} + ${STEPS}))
    if [ ${SLURM_NNODES} -gt 1 ]; then
        # Multi-node run: use InfiniBand and hand Fluent a node file.
        INTERCONNECT=infiniband
        fluent_make_nodelist
        EXTRA_ARGS="-cnf=nodelist.${SLURM_JOBID}"
    else
        # Single-node run: shared memory, no node file needed.
        INTERCONNECT=shmem
        EXTRA_ARGS=""
    fi
}

function fluent_run() {
    # Remove the output data file if it already exists
    if [ -f ${JOBNAME}-${END}.dat.gz ]; then
        rm -f ${JOBNAME}-${END}.dat.gz
    fi
    # Drive Fluent's text interface: read case, read data, iterate
    # ${STEPS} times, write the new data file, and exit.
    fluent -g ${SOLVER} -t${SLURM_NTASKS} -p${INTERCONNECT} ${EXTRA_ARGS} <<EOD
rc ${JOBNAME}.cas.gz
rd ${JOBNAME}-${START}.dat.gz
solve/it/${STEPS}
wd ${JOBNAME}-${END}.dat.gz
exit
EOD
    rm -f nodelist.${SLURM_JOBID}
}

=====

Typical Slurm script:

=====

#!/bin/bash
#SBATCH --nodes=1 --ntasks-per-node=28
#SBATCH --time=1-00:00:00

# Given a case and data file with a common prefix, a hyphen, and a 5-digit
# value for the starting iteration count:
JOBNAME=FFF-1-1
START=00000

# How many additional iterations should be run?
STEPS=3000

# Which solver style to use?
# 2d (2d single precision), 2ddp (2d double precision),
# 3d (3d single precision), 3ddp (3d double precision)
SOLVER=3ddp

# Shouldn't have to edit anything below here. A new data file will be written
# under the name ${JOBNAME}-${START+STEPS}.dat.gz
source /cm/shared/apps/ansys_inc/fluent_functions
fluent_setup
fluent_run

=====
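
Submission is then an ordinary sbatch call. With the values above, the run reads FFF-1-1.cas.gz and FFF-1-1-00000.dat.gz and writes FFF-1-1-03000.dat.gz when it finishes (the script filename below is made up for the example):

=====

# Submit the job script above (the filename is hypothetical):
sbatch fluent-job.sh

# Check on it while it runs:
squeue -u $USER

=====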

> On Sep 20, 2018, at 2:50 AM, Mahmood Naderan <mahmood.nt at gmail.com> wrote:
> 
> Hi Michael,
> Sorry for the late response. Do you mean supplying --exclusive to the
> srun command? Or do I have to do something else for the partitions?
> Currently they use
> 
> srun -n 1 -c 6 --x11 -A monthly -p CAT --mem=32GB ./fluent.sh
> 
> where fluent.sh is
> 
> #!/bin/bash
> unset SLURM_GTIDS
> /state/partition1/ansys_inc/v140/fluent/bin/fluent
> 
> 
> Regards,
> Mahmood
> 
> 
> 
> 
> On Sat, Sep 1, 2018 at 7:45 PM Renfro, Michael <Renfro at tntech.edu> wrote:
>> 
>> Depending on the scale (what percent are Fluent users, how many nodes you have), you could use exclusive mode on either a per-partition or per-job basis.
>> 
>> Here, my (currently few) Fluent users do all their GUI work off the cluster, and just submit batch jobs using the generated case and data files.
>> 
>> --
>> Mike Renfro  / HPC Systems Administrator, Information Technology Services
>> 931 372-3601 / Tennessee Tech University
>> 
>>> On Sep 1, 2018, at 9:53 AM, Mahmood Naderan <mahmood.nt at gmail.com> wrote:
>>> 
>>> Hi,
>>> I have found that when user A is running a Fluent job (some processes at 100% in top) and user B starts a Fluent job of his own, user B's Fluent console shows messages that another Fluent process is already running and it cannot set affinity. This is not an error, but I see that the speed is somewhat lower.
>>> 
>>> Consider that when a user runs "srun --x11 .... script", where the script launches some Fluent processes and Slurm puts that job on compute-0-0, there should be a way for another user's "script" to go to compute-0-1 even if compute-0-0 still has free cores.
>>> 
>>> Is there any way in the Slurm configuration to set such a constraint? That is, before Slurm dispatches a job to a node, it should first check whether process X is already running there.
>>> 
>>> 
>>> Regards,
>>> Mahmood
>>> 
>>> 
>> 
>> 
> 


