December 2024 - slurm-users - lists.schedmd.com

job_container/tmpfs and srun.
by Phill Harvey-Smith 17 Jan '25

17 Jan '25

Hi all, On our setup we are using job_container/tmpfs to give each job it's own temp space. Since our compute nodes have reasonably sized disks for tasks that do a lot of disk I/O on user's data we have asked users to copy their data to the local disk at the beginning of the task and (if needed) copy it back at the end. This saves lots of NFS thrashing slowing down both the task and the NFS servers. However some of our users are having problems with this, their initial sbatch script will create a temp directory in their private /tmp copy their data to it and then try to srun a program. The srun will fall over as it doesn't seem to have have access to the copied data. I suspect this is because the srun task is getting it's own private /tmp. So my question is, is there a way to have the srun task inherit the /tmp of the initial sbatch? I'll include a sample of the script our user is using below. If any further information is required please feel free to ask. Cheers. Phill. #!/usr/bin/bash #SBATCH --nodes 1 #SBATCH --ntasks-per-node=1 #SBATCH --cpus-per-task=1 #SBATCH --time=00:00:10 #SBATCH --mem-per-cpu=3999 #SBATCH --output=script_out.log #SBATCH --error=script_error.log # The above options puts the STDOUT and STDERR of sbatch in # log files prefixed with 'script_'. # Create a randomly-named directory under /tmp jobtmpdir=$(mktemp -d) # Register a function to try and cleanup in case of job failure cleanup_handler() { echo "Cleaning up ${jobtmpdir}" rm -rf ${jobtmpdir} } trap 'cleanup_handler' SIGTERM EXIT # Change working directory to this directory cd ${jobtmpdir} # Copy the executable and input files from # where the job was submitted to the temporary directory. cp ${SLURM_SUBMIT_DIR}/a.out . cp ${SLURM_SUBMIT_DIR}/input.txt . # Run the executable, handling the collection of stdout # and stderr ourselves by redirecting to file srun ./a.out 2> task_error.log > task_out.log # Copy output data back to the submit directory. cp output.txt ${SLURM_SUBMIT_DIR} cp task_out.log ${SLURM_SUBMIT_DIR} cp task_error.log ${SLURM_SUBMIT_DIR} # Cleanup cd ${SLURM_SUBMIT_DIR} cleanup_handler

2 1

Permission denied for slurmdbd.conf
by sportlecon＠gmail.com 07 Jan '25

07 Jan '25

ls -ls /usr/local/slurm/etc/slurmdbd.conf 4 -rw------- 1 slurm slurm 497 Dec 28 16:34 /usr/local/slurm/etc/slurmdbd.conf sudo -u slurm /usr/local/slurm/sbin/slurmdbd -Dvvv slurmdbd: error: s_p_parse_file: unable to read "/usr/local/slurm/etc/slurmdbd.conf": Permission denied slurmdbd: fatal: Could not open/read/parse slurmdbd.conf file /usr/local/slurm/etc/slurmdbd.conf

3 2

Resource guarantees
by christopher.furbee＠jhuapl.edu 23 Dec '24

23 Dec '24

Hello, Long time SGE admin, new SLURM admin here. I recently started the transition of all my clusters from SGE to SLURM and everything was great until I hit the "Taco Bell" cluster (fake name). Taco Bell supports 4 projects and under SGE we had a priority system setup using projects to balance the queue. For the life of me I have been unable to replicate this in SLURM. We are looking to configure guaranteed resources based on the project. I had thought we could accomplish this with QOS and accounts but so far we have failed. What we would like to end up with is; When project Gordita is running uncontested 100% of the cluster is available. While Gordita is running, if Crunchwrap submits their jobs we want the scheduler to prioritize those jobs until a 75% Gordita, 25% Crunchwrap balance of jobs is reached. No preempting or priority overriding, just as a Gordita job finishes, if Crunchwrap is less than 25%, start a Crunchwrap job. And then maintain that balance until one of the projects jobs are 100% completed. Any assistance or guidance is greatly appreciated.

1 0

Slurm plugin for custom hardware allocation
by Laura Zharmukhametova 23 Dec '24

23 Dec '24

Hello, Is there an existing Slurm plugin for FPGA allocation? If not, can someone please point me in the right direction for how to approach it. Many thanks

2 1

sending mails wit smail on rocky9
by Marcus Wagner 19 Dec '24

19 Dec '24

Hi all, I have a problem with sending mails on rocky 9 via Slurm. One needs to install s-nail to have "/bin/mail" being available. There are some caveats in smail. In the second part (for the message, when the job began) one need to pipe ( eg echo "") into $MAIL, even in a script with no input, s-nail wants to be interactive. but it suffices to echo an empty text to snail. Nonetheless, I don't get any mail through. it seems the, mailprog for some reason gets killed or errors out for some other reason. While it is perfectionally working if run from the console :/ I all the time get in the slurmctld.log the following: 27212:[2024-12-19T15:54:54.935] slurmscriptd: error: run_command: killing MailProg operation on shutdown 27213:[2024-12-19T15:54:54.945] slurmscriptd: _run_script: JobId=0 MailProg killed by signal 9 27214:[2024-12-19T15:54:54.945] error: MailProg returned error, it's output was '' 27395:[2024-12-19T15:55:55.540] slurmscriptd: error: run_command: killing MailProg operation on shutdown 27396:[2024-12-19T15:55:55.551] slurmscriptd: _run_script: JobId=0 MailProg killed by signal 9 27397:[2024-12-19T15:55:55.551] error: MailProg returned error, it's output was '' 27438:[2024-12-19T15:56:55.981] slurmscriptd: error: run_command: killing MailProg operation on shutdown 27439:[2024-12-19T15:56:55.981] slurmscriptd: error: run_command: killing MailProg operation on shutdown 27440:[2024-12-19T15:56:55.992] slurmscriptd: _run_script: JobId=0 MailProg killed by signal 9 27441:[2024-12-19T15:56:55.992] slurmscriptd: _run_script: JobId=0 MailProg killed by signal 9 27442:[2024-12-19T15:56:55.992] error: MailProg returned error, it's output was '' 27443:[2024-12-19T15:56:55.992] error: MailProg returned error, it's output was '' 27450:[2024-12-19T15:56:58.849] slurmscriptd: error: run_command: killing MailProg operation on shutdown 27451:[2024-12-19T15:56:58.859] slurmscriptd: _run_script: JobId=0 MailProg killed by signal 0 any hints? Best Marcus -- Dipl.-Inf. Marcus Wagner stellv. Gruppenleitung IT Center Gruppe: Server, Storage, HPC Abteilung: Systeme und Betrieb RWTH Aachen University Seffenter Weg 23 52074 Aachen Tel: +49 241 80 24383 wagner(a)itc.rwth-aachen.de www.itc.rwth-aachen.de Social-Media-Kanäle des IT Centers: https://blog.rwth-aachen.de/itc/ https://www.facebook.com/itcenterrwth https://www.linkedin.com/company/itcenterrwth https://twitter.com/ITCenterRWTH https://www.youtube.com/c/ITCenterRWTHAachen

1 0

lua, glib, gtk, kafka not found for slurm 24.11 and Alma 8
by Bernd Melchers 17 Dec '24

17 Dec '24

Dear all, i tried to rpmbuild slurm-24.11.0 for Alma Linux 8. Build failed because some installed Packages are not found by slurms configure script: rdkafka, glib, gtp and lua But all these packages are installed and they are found by slurm-24.05.x: librdkafka-1.6.1-1.el8.x86_64 librdkafka-devel-1.6.1-1.el8.x86_64 lua-5.3.4-12.el8.x86_64 lua-devel-5.3.4-12.el8.x86_64 glib2-2.56.4-165.el8_10.x86_64 glib2-devel-2.56.4-165.el8_10.x86_64 gtk2-2.24.32-5.el8.x86_64 gtk2-devel-2.24.32-5.el8.x86_64 gtk3-3.22.30-12.el8_10.x86_64 gtk3-devel-3.22.30-12.el8_10.x86_64 Mit freundlichen Grüßen Bernd Melchers -- Archiv- und Backup-Service | fab-service(a)zedat.fu-berlin.de Freie Universität Berlin | Tel. +49-30-838-55905

2 3

AllocNode:Sid in scontrol but not sacct?
by Chris Taylor 17 Dec '24

17 Dec '24

Does the accounting database keep this? Maybe I'm missing something but I don't see a way to query for it in sacct. Chris

1 1

job_container/tmpfs, TmpFS and TmpDisk usage
by Paul Musset 16 Dec '24

16 Dec '24

Hello, I have multiple questions about the usage of job_container/tmpfs, and the TmpFS and TmpDisk variables 1) If my job_container.conf files contains: ``` BasePath=/mnt/slurm_tmp Shared=true ``` is it important what I set TmpFS to in slurm.conf ? Should I set it to '/mnt/slurm_tmp' or '/tmp' ? 2) What size should I put in TmpDisk ? the size advertised by df ? 3) Finally, is there any recommended file system for the partition used as the job_container/tmpfs BasePath ? Best regards, Paul Musset Max Planck Institute for Brain Research

1 0

Difference between sreport's cpu time usage and sacct -j usage
by KK 14 Dec '24

14 Dec '24

Hi all, I have observed a significant discrepancy in CPU usage time calculations between sreport and sacct, and I would like to understand the underlying cause. Let me share the specific case I encountered when calculating CPU usage time for user zt23132881r from November 1, 2024, to November 30, 2024. 1. sreport Results (995,171 minutes): -------------------------------------------------------------------------------- *[root@master ~]# sreport Cluster UserUtilizationByAccount user=zt23132881r start=2024-11-01 end=2024-11-30--------------------------------------------------------------------------------Cluster/User/Account Utilization 2024-11-01T00:00:00 - 2024-11-29T23:59:59 (2505600 secs)Usage reported in CPU Minutes-------------------------------------------------------------------------------- Cluster Login Proper Name Account Used Energy--------- --------- --------------- --------------- -------- ---------djhpc-po+ zt231328+ zt23132881r zt+ zt23132881r_ba+ 995171 6294875* 2. sacct Results: # Without truncate (1,019,927 minutes / 61,195,668 seconds) *[root@master ~]# sacct -u zt23132881r -S 2024-11-01 -E 2024-11-30 -o "jobid,partition,account,user,alloccpus,cputimeraw,state" -X |awk 'BEGIN{total=0}{total+=$6}END{print total}'61195668* # With truncate (967,165 minutes / 58,029,908 seconds) *[root@master ~]# sacct -u zt23132881r -S 2024-11-01 -E 2024-11-30 -o "jobid,partition,account,user,alloccpus,cputimeraw,state" -X --truncate |awk 'BEGIN{total=0}{total+=$6}END{print total}'58029908* # No -X *[root@master ~]# sacct -u zt23132881r -S 2024-11-01 -E 2024-11-30 -o "jobid,partition,account,user,alloccpus,cputimeraw,state" |awk 'BEGIN{total=0}{total+=$6}END{print total}'61195668* The results show three different values: - *sreport: 995,171 minutes* - *sacct (without truncate): 1,019,927 minutes* - *sacct (with truncate): 967,165 minutes* I would appreciate if someone could explain: - Which of these results is more accurate? - How does sreport calculate CPU usage time? - Why does the --truncate option in sacct lead to different results? Thank you for your assistance in clarifying these discrepancies. Best regards

2 1

Node configuration unavailable when using --mem-per-gpu , for specific GPU type
by Matthew R. Baney 13 Dec '24

13 Dec '24

Hi all, I'm seeing some odd behavior when using the --mem-per-gpu flag instead of the --mem flag to request memory when also requesting all available CPUs on a node (in this example, all available nodes have 32 CPUs): $ srun --ntasks-per-node=8 --cpus-per-task=4 --gpus-per-node=gtx1080ti:1 --mem-per-gpu=1g --pty bash srun: error: Unable to allocate resources: Requested node configuration is not available $ srun --ntasks-per-node=8 --cpus-per-task=4 --gpus-per-node=gtx1080ti:1 --mem=1g --pty bash srun: job 3479971 queued and waiting for resources srun: job 3479971 has been allocated resources $ The nodes in this partition have a mix of gtx1080ti and rtx2080ti GPUs, but only one type of GPU is in any one node. The same behavior does not occur when requesting a (node with a) rtx2080ti instead. Is there something I'm missing that would cause the --mem-per-gpu flag to not be working in this example? Thanks, Matthew

1 0

2025

2024

slurm-users December 2024