Hello,
My name is Mihai and I have an issue with a small GPU cluster managed
with Slurm 22.05.11. I get two different outputs when I try to find
out the names of the nodes (one correct and one wrong). The script is:
#!/bin/bash
#SBATCH --job-name=test
#SBATCH --output=/data/mihai/res.txt
#SBATCH --partition=eli
#SBATCH --nodes=2
srun echo Running on host: $(hostname)
srun hostname
srun sleep 15
And the output looks like this:
cat res.txt
Running on host: mihai-x8640
Running on host: mihai-x8640
mihaigpu2
mihai-x8640
As you can see, the output of the command 'srun echo Running on host:
$(hostname)' is the same, as if the job was running twice on the same
node, while the command 'srun hostname' gives me the correct output.
Do you have any idea why the outputs of the two commands are different?
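A minimal sketch of the expansion-timing difference that may be at play
(the single-quoted variant is illustrative and not part of the original script):
#!/bin/bash
#SBATCH --nodes=2
# $(hostname) is expanded by the batch shell on the first allocated node
# before srun ever runs, so every task prints that one hostname.
srun echo "Running on host: $(hostname)"
# Single quotes defer the command substitution to each task's own shell,
# so each task reports the node it actually runs on.
srun bash -c 'echo "Running on host: $(hostname)"'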
Thank you,
Mihai
Slurm User Group (SLUG) 2024 is set for September 12-13 at the
University of Oslo in Oslo, Norway.
Registration information and a high-level schedule can be found
here: https://slug24.splashthat.com/ The last day to register at
early bird pricing is this Friday, May 31st.
Friday is also the deadline to submit a presentation abstract. We do
not intend to extend this deadline.
If you are interested in presenting your own usage, developments, site
report, tutorial, etc. about Slurm, please fill out the following
form: https://forms.gle/N7bFo5EzwuTuKkBN7
Notifications of final presentations accepted will go out by Friday, June 14th.
--
Victoria Hobson
SchedMD LLC
Vice President of Marketing
My organization needs to access historic job information records for metric reporting and resource forecasting. slurmdbd is archiving only the job information, which should be sufficient for our numbers, but is using the default archive script. In retrospect, this data should have been migrated to a secondary MariaDB instance, but that train has passed.
The format of the archive files is not well documented. Does anyone have a program (python/C/whatever) that will read a job_table_archive file and decode it into a parsable structure?
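One possible workaround, offered only as a sketch and assuming a scratch slurmdbd pointed at a throwaway MariaDB instance is acceptable, is to re-import the archive with the supported tooling instead of decoding the packed binary format by hand (the file path and date range below are illustrative):
# Re-load an archive file into the scratch accounting database.
sacctmgr archive load file=/path/to/job_table_archive_file
# Then pull the historic records with sacct as usual, e.g.:
sacct --allusers --starttime=2023-01-01 --endtime=2023-12-31 \
      --format=JobID,User,Account,Partition,Elapsed,MaxRSS,State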
Douglas O'Neal, Ph.D. (contractor)
Manager, HPC Systems Administration Group, ITOG
Frederick National Laboratory for Cancer Research
Leidos Biomedical Research, Inc.
Phone: 301-228-4656
Email: Douglas.O'Neal(a)nih.gov
---------- Forwarded message ---------
From: Hermann Schwärzler <hermann.schwaerzler(a)uibk.ac.at>
Date: Tue, May 28, 2024 at 4:10 PM
Subject: Re: [slurm-users] Re: Performance Discrepancy between Slurm
and Direct mpirun for VASP Jobs.
To: Hongyi Zhao <hongyi.zhao(a)gmail.com>
Hi Zhao,
On 5/28/24 03:08, Hongyi Zhao wrote:
[...]
>
> What's the complete content of cli_filter.lua and where should I put this file?
[...]
Below you will find the complete content of our cli_filter.lua.
It has to be put into the same directory as "slurm.conf".
--------------------------------- 8< ---------------------------------
-- see
https://github.com/SchedMD/slurm/blob/master/etc/cli_filter.lua.example
function slurm_cli_pre_submit(options, pack_offset)
    return slurm.SUCCESS
end

function slurm_cli_setup_defaults(options, early_pass)
    -- Make --hint=nomultithread the default behavior;
    -- if users specify another --hint=XX option then
    -- it will override the setting done here
    options['hint'] = 'nomultithread'
    return slurm.SUCCESS
end

function slurm_cli_post_submit(offset, job_id, step_id)
    return slurm.SUCCESS
end
--------------------------------- >8 ---------------------------------
Hopefully this helps...
Regards,
Hermann
--
Assoc. Prof. Hongsheng Zhao <hongyi.zhao(a)gmail.com>
Theory and Simulation of Materials
Hebei Vocational University of Technology and Engineering
No. 473, Quannan West Street, Xindu District, Xingtai, Hebei province
Dear Slurm Users,
I am experiencing a significant performance discrepancy when running
the same VASP job through the Slurm scheduler compared to running it
directly with mpirun. I am hoping for some insights or advice on how
to resolve this issue.
System Information:
Slurm Version: 21.08.5
OS: Ubuntu 22.04.4 LTS (Jammy)
Job Submission Script:
#!/usr/bin/env bash
#SBATCH -N 1
#SBATCH -D .
#SBATCH --output=%j.out
#SBATCH --error=%j.err
##SBATCH --time=2-00:00:00
#SBATCH --ntasks=36
#SBATCH --mem=64G
echo '#######################################################'
echo "date = $(date)"
echo "hostname = $(hostname -s)"
echo "pwd = $(pwd)"
echo "sbatch = $(which sbatch | xargs realpath -e)"
echo ""
echo "WORK_DIR = $WORK_DIR"
echo "SLURM_SUBMIT_DIR = $SLURM_SUBMIT_DIR"
echo "SLURM_JOB_NUM_NODES = $SLURM_JOB_NUM_NODES"
echo "SLURM_NTASKS = $SLURM_NTASKS"
echo "SLURM_NTASKS_PER_NODE = $SLURM_NTASKS_PER_NODE"
echo "SLURM_CPUS_PER_TASK = $SLURM_CPUS_PER_TASK"
echo "SLURM_JOBID = $SLURM_JOBID"
echo "SLURM_JOB_NODELIST = $SLURM_JOB_NODELIST"
echo "SLURM_NNODES = $SLURM_NNODES"
echo "SLURMTMPDIR = $SLURMTMPDIR"
echo '#######################################################'
echo ""
module purge > /dev/null 2>&1
module load vasp
ulimit -s unlimited
mpirun vasp_std
Performance Observation:
When running the job through Slurm:
werner@x13dai-t:~/Public/hpc/servers/benchmark/Cr72_3x3x3K_350eV_10DAV$
grep LOOP OUTCAR
LOOP: cpu time 14.4893: real time 14.5049
LOOP: cpu time 14.3538: real time 14.3621
LOOP: cpu time 14.3870: real time 14.3568
LOOP: cpu time 15.9722: real time 15.9018
LOOP: cpu time 16.4527: real time 16.4370
LOOP: cpu time 16.7918: real time 16.7781
LOOP: cpu time 16.9797: real time 16.9961
LOOP: cpu time 15.9762: real time 16.0124
LOOP: cpu time 16.8835: real time 16.9008
LOOP: cpu time 15.2828: real time 15.2921
LOOP+: cpu time 176.0917: real time 176.0755
When running the job directly with mpirun:
werner@x13dai-t:~/Public/hpc/servers/benchmark/Cr72_3x3x3K_350eV_10DAV$
mpirun -n 36 vasp_std
werner@x13dai-t:~/Public/hpc/servers/benchmark/Cr72_3x3x3K_350eV_10DAV$
grep LOOP OUTCAR
LOOP: cpu time 9.0072: real time 9.0074
LOOP: cpu time 9.0515: real time 9.0524
LOOP: cpu time 9.1896: real time 9.1907
LOOP: cpu time 10.1467: real time 10.1479
LOOP: cpu time 10.2691: real time 10.2705
LOOP: cpu time 10.4330: real time 10.4340
LOOP: cpu time 10.9049: real time 10.9055
LOOP: cpu time 9.9718: real time 9.9714
LOOP: cpu time 10.4511: real time 10.4470
LOOP: cpu time 9.4621: real time 9.4584
LOOP+: cpu time 110.0790: real time 110.0739
Could you provide any insights or suggestions on what might be causing
this performance issue? Are there any specific configurations or
settings in Slurm that I should check or adjust to align the
performance more closely with the direct mpirun execution?
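A minimal sketch of the kind of binding check that might help narrow this
down, assuming hyperthreading and task/CPU binding are involved (job.sh
stands in for the submission script above):
# 1) Show exactly which CPUs Slurm binds the 36 tasks to.
srun -N 1 -n 36 --cpu-bind=verbose true
# 2) Re-submit asking Slurm to place tasks on physical cores only,
#    which is often closer to what a bare "mpirun -n 36" run gets.
sbatch --hint=nomultithread job.sh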
Thank you for your time and assistance.
Best regards,
Zhao
--
Assoc. Prof. Hongsheng Zhao <hongyi.zhao(a)gmail.com>
Theory and Simulation of Materials
Hebei Vocational University of Technology and Engineering
No. 473, Quannan West Street, Xindu District, Xingtai, Hebei province
We have several nodes, most of which have different Linux distributions
(distro for short). The controller has a different distro as well. The only
thing the controller and all the nodes have in common is that all of them
are x86_64.
I can install Slurm using the package manager on all the machines, but this
will not work because the controller would have a different version of Slurm
than the nodes (21.08 vs. 23.11).
If I build from source then I see two solutions:
- build a deb package
- build a custom package (./configure, make, make install)
Building a Debian package on the controller and then distributing the
binaries to the nodes won't work either, because those binaries will look
for the shared libraries they were built against, and those don't exist on
the nodes.
So the only solution I see is to build a static binary as a custom
package. Am I correct, or is there another solution here?
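For concreteness, a minimal sketch of the "custom package" route as I
understand it, repeated once per distro (the version, prefix and sysconfdir
below are illustrative):
# Build and install Slurm from the release tarball on each distro.
tar xjf slurm-23.11.7.tar.bz2 && cd slurm-23.11.7
./configure --prefix=/opt/slurm-23.11 --sysconfdir=/etc/slurm
make -j"$(nproc)"
sudo make install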
Hi,
We are trying out Slurm, having been running Grid Engine for a long while.
In Grid Engine, the cgroup peak memory and max_rss are recorded at the end of a job: it logs the information from the cgroup hierarchy and also does a getrusage call right at the end on the parent PID of the whole job "container" before cleaning up.
With Slurm it seems that the only way memory is recorded is by the acct gather polling. I am trying to add something in an epilog script to get memory.peak, but it looks like the cgroup hierarchy has been destroyed by the time the epilog is run.
Where in the code is the cgroup hierarchy cleaned up? Is there no way to add a hook so that the accounting is updated during the job cleanup process and peak memory usage can be accurately logged?
I can reduce the polling interval from 30s to 5s, but I don't know whether that causes a lot of overhead, and in any case it does not seem a sensible way to obtain values that should be determined by an event right at the end rather than by polling.
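For reference, a minimal sketch of that polling-interval change as a
slurm.conf excerpt (the 5-second value comes from the question; the gather
plugin line is an assumption about the setup):
# Sample accounting data every 5s instead of the 30s default; smaller
# intervals catch short-lived peaks better at the cost of more overhead.
JobAcctGatherType=jobacct_gather/cgroup
JobAcctGatherFrequency=task=5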
Many thanks,
Emyr
We are pleased to announce the availability of Slurm version 23.11.7.
The 23.11.7 release fixes a few potential crashes in slurmctld when
using less common options on job submission, slurmrestd compatibility
with auth/slurm, and some additional minor and moderate severity bugs.
Slurm can be downloaded from https://www.schedmd.com/downloads.php .
-Marshall
> -- slurmrestd - Correct OpenAPI specification for
> 'GET /slurm/v0.0.40/jobs/state' having response as null.
> -- Allow running jobs on overlapping partitions if jobs don't specify -s.
> -- Fix segfault when requesting a shared gres along with an exclusive
> allocation.
> -- Fix regression in 23.02 where afternotok and afterok dependencies were
> rejected for federated jobs not running on the origin cluster of the
> submitting job.
> -- slurmctld - Disable job table locking while job state cache is active when
> replying to `squeue --only-job-state` or `GET /slurm/v0.0.40/jobs/state`.
> -- Fix sanity check when setting tres-per-task on the job allocation as well as
> the step.
> -- slurmrestd - Fix compatibility with auth/slurm.
> -- Fix issue where TRESRunMins gets off correct value if using
> QOS UsageFactor != 1.
> -- slurmrestd - Require `user` and `association_condition` fields to be
> populated for requests to 'POST /slurmdb/v0.0.40/users_association'.
> -- Avoid a slurmctld crash with extra_constraints enabled when a job requests
> certain invalid --extra values.
> -- `scancel --ctld` and `DELETE /slurm/v0.0.40/jobs` - Fix support for job
> array expressions (e.g. 1_[3-5]). Also fix signaling a single pending array
> task (e.g. 1_10), which previously signaled the whole array job instead.
> -- Fix a possible slurmctld segfault when at some point we failed to create an
> external launcher step.
> -- Allow the slurmctld to open a connection to the slurmdbd if the first
> attempt fails due to a protocol error.
> -- mpi/cray_shasta - Fix launch for non-het-steps within a hetjob.
> -- sacct - Fix "gpuutil" TRES usage output being incorrect when using --units.
> -- Fix a rare deadlock on slurmctld shutdown or reconfigure.
> -- Fix issue that only left one thread on each core available when "CPUs=" is
> configured to total thread count on multi-threaded hardware and no other
> topology info ("Sockets=", "CoresPerSocket", etc.) is configured.
> -- Fix the external launcher step not being allocated a VNI when requested.
> -- jobcomp/kafka - Fix payload length when producing and sending a message.
> -- scrun - Avoid a crash if RunTimeDelete is called before the container
> finishes.
> -- Save the slurmd's cred_state while reconfiguring to prevent the loss of job
> credentials.