Glen,
I don't think I see it in your message, but are you pointing to the plugin in slurm.conf with JobSubmitPlugins=? I assume you are but it's worth checking.
Ryan
On 4/9/24 10:19, Glen MacLachlan via slurm-users wrote:
Hi,
We have a plugin in Lua that mostly does what we want, but some features available to C plugins are not available to Lua. For that reason, we are attempting to convert to C using the guidance found here: https://slurm.schedmd.com/job_submit_plugins.html#building. We arrived here because the Lua plugins don't seem to stretch far enough to cover our use case, i.e., branching on the value of alloc_id or, for that matter, get_sid().
The goal is to disallow interactive allocations (i.e., salloc) on specific partitions while allowing them on others. However, we've run into an issue with our C plugin right out of the gate. I've included a minimal reproducer, which is basically a "Hello World" test (job_submit_disallow_salloc.c, see attached).
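As a rough sketch of the partition side of that check only: the partition names and the helper partition_list_is_blocked below are hypothetical, and how to recognize an salloc request is deliberately left out. It assumes the requested partitions arrive as a comma-separated string that may be NULL when no partition was requested.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of partitions where interactive allocations are refused. */
static const char *const blocked_partitions[] = { "batch", "gpu" };

/*
 * Return true if any name in the comma-separated list `partitions`
 * exactly matches an entry in blocked_partitions. A NULL or empty list
 * (no partition requested) is treated as not blocked.
 */
static bool partition_list_is_blocked(const char *partitions)
{
    const size_t n = sizeof(blocked_partitions) / sizeof(blocked_partitions[0]);

    if (partitions == NULL)
        return false;

    const char *p = partitions;
    while (*p != '\0') {
        size_t len = strcspn(p, ",");   /* length of the current name */
        for (size_t i = 0; i < n; i++) {
            if (strlen(blocked_partitions[i]) == len &&
                strncmp(p, blocked_partitions[i], len) == 0)
                return true;
        }
        p += len;
        if (*p == ',')
            p++;                        /* skip the separator */
    }
    return false;
}
```

Names are compared whole, so a partition called "gpus" would not match the entry "gpu".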
*Expectation* We expect a hello-world result: a message written to /tmp/min_repo.log. That does not happen. The plugin does not seem to run at all when jobs are submitted; jobs still run as expected, but the plugin appears to be ignored.
*Steps* We compile:

gcc -fPIC -DHAVE_CONFIG_H -I /modules/source/slurm-23.02.4 -g -O2 -pthread -fno-gcse -Werror -Wall -g -O0 -fno-strict-aliasing -MT job_submit_disallow_salloc.lo -MD -MP -MF .deps/job_submit_disallow_salloc.Tpo -c job_submit_disallow_salloc.c -o .libs/job_submit_disallow_salloc.o
mv .deps/job_submit_disallow_salloc.Tpo .deps/job_submit_disallow_salloc.Plo

and link:

gcc -shared -fPIC -DPIC .libs/job_submit_disallow_salloc.o -O2 -pthread -O0 -pthread -Wl,-soname -Wl,job_submit_disallow_salloc.so -o job_submit_disallow_salloc.so

Check links after copying to /usr/lib64/slurm:

ldd /usr/lib64/slurm/job_submit_disallow_salloc.so
	linux-vdso.so.1 (0x00007ffe467aa000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f1c02095000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f1c01cd0000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f1c024b7000)
Can someone point out what we are doing incorrectly or how we might troubleshoot this issue?
Kindest regards, Glen
*Reproducer* The minimal reproducer is basically a "hello world" for C extensions which I've pasted below (I've also attached it for convenience):
#include <slurm/slurm.h>
#include <slurm/slurm_errno.h>
#include <stdio.h>
#include "src/slurmctld/slurmctld.h"

const char plugin_name[] = "Min Reproducer";
const char plugin_type[] = "job_submit/disallow_salloc";
const uint32_t plugin_version = SLURM_VERSION_NUMBER;

extern int job_submit(job_desc_msg_t *job_desc, uint32_t submit_uid, char **err_msg)
{
    FILE *fp;
    fp = fopen("/tmp/min_repo.log", "w");
    fprintf(fp,"Hello!");
    fclose(fp);
    return SLURM_SUCCESS;
}

int job_modify(job_desc_msg_t *job_desc, job_record_t *job_ptr, uint32_t submit_uid, char **err_msg)
{
    return SLURM_SUCCESS;
}
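
For troubleshooting, a slightly more defensive version of the logging step may help. This is only a sketch, and append_log_line is a hypothetical helper name, not part of Slurm. It checks the result of fopen (the reproducer would pass a NULL pointer to fprintf if the open failed) and opens the file in append mode so that each call adds a line instead of truncating the file.

```c
#include <stdio.h>

/*
 * Append one line to a log file. Returns 0 on success, -1 if the file
 * could not be opened, written, or closed.
 */
static int append_log_line(const char *path, const char *msg)
{
    FILE *fp = fopen(path, "a");   /* "a" keeps earlier entries; "w" truncates */
    if (fp == NULL)
        return -1;                 /* e.g. directory missing or not writable */

    int rc = (fprintf(fp, "%s\n", msg) < 0) ? -1 : 0;
    if (fclose(fp) != 0)
        rc = -1;
    return rc;
}
```

Inside job_submit() this could be called as append_log_line("/tmp/min_repo.log", "Hello!"). Since job_submit plugins are loaded by slurmctld, the file would be written on the controller host rather than on the host where the job was submitted.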