We switched /tmp over from the tmp.mount unit (managed with systemctl) to zram, e.g.,
modprobe zram
echo 20GB > /sys/block/zram0/disksize
mkfs.xfs /dev/zram0
mount -o discard /dev/zram0 /tmp
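
One thing that may be worth ruling out after a mount like the one above (an assumption, not a confirmed diagnosis): the root directory of a freshly created filesystem is usually owned by root with mode 0755, whereas /tmp normally has mode 1777. A minimal sketch of a read-only check, assuming GNU coreutils stat as shipped on RHEL 9; tmp_mode_ok is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: report whether a directory has the mode normally used for /tmp.
# Assumption: GNU coreutils stat (as on RHEL 9). tmp_mode_ok is a
# hypothetical helper name, not part of Slurm or zram.
tmp_mode_ok() {
    dir=$1
    # %a prints the octal mode, e.g. 1777 or 755
    mode=$(stat -c '%a' "$dir") || return 2
    echo "$dir mode=$mode owner=$(stat -c '%U:%G' "$dir")"
    [ "$mode" = "1777" ]
}

# Check the live mount point; this only prints and changes nothing.
if tmp_mode_ok /tmp; then
    echo "/tmp is world-writable with the sticky bit"
else
    echo "/tmp is not 1777; 'chmod 1777 /tmp' after mounting might be needed"
fi
```

If SELinux is enforcing, the label on the new filesystem might matter as well; `ls -dZ /tmp` shows it.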


srun with --x11 worked before we made this change. We're on RHEL 9.

The slurmd log on the node shows this whenever --x11 is used with srun:
[2024-02-23T20:22:43.442] [529.extern] error: setup_x11_forward: failed to create temporary XAUTHORITY file: Permission denied
[2024-02-23T20:22:43.442] [529.extern] error: x11 port forwarding setup failed
[2024-02-23T20:22:43.442] error: _forkexec_slurmstepd: slurmstepd failed to send return code got 0: Resource temporarily unavailable
[2024-02-23T20:22:43.443] Could not launch job 529 and not able to requeue it, cancelling job
[2024-02-23T20:26:15.881] [530.extern] error: setup_x11_forward: failed to create temporary XAUTHORITY file: Permission denied
[2024-02-23T20:26:15.881] [530.extern] error: x11 port forwarding setup failed
[2024-02-23T20:26:15.882] error: _forkexec_slurmstepd: slurmstepd failed to send return code got 0: Resource temporarily unavailable
[2024-02-23T20:26:15.883] Could not launch job 530 and not able to requeue it, cancelling job


slurmctld log entries for the same job:
[2024-02-23T20:26:15.859] sched: _slurm_rpc_allocate_resources JobId=530 NodeList=2402-node005 usec=1800
[2024-02-23T20:26:15.882] _slurm_rpc_requeue: Requeue of JobId=530 returned an error: Only batch jobs are accepted or processed
[2024-02-23T20:26:15.883] _slurm_rpc_kill_job: REQUEST_KILL_JOB JobId=530 uid 0
[2024-02-23T20:26:15.962] _slurm_rpc_complete_job_allocation: JobId=530 error Job/step already completing or completed

Verbose srun output:
srun -v --pty -t 0-4:00 --x11 --mem=10g
srun: defined options
srun: -------------------- --------------------
srun: account             : me
srun: mem                 : 10G
srun: nodelist            : our-node
srun: pty                 :
srun: time                : 04:00:00
srun: verbose             : 1
srun: x11                 : all
srun: -------------------- --------------------
srun: end of defined options
srun: Waiting for resource configuration
srun: error: Nodes our-node are still not ready
srun: error: Something is wrong with the boot of the nodes.


slurm.conf has PrologFlags=x11 set. /usr/bin/xauth is installed on each compute node.
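
Since the log reports "Permission denied" while creating the temporary XAUTHORITY file, a quick way to narrow this down might be to test whether an ordinary user can create a file under /tmp on the compute node. A minimal sketch, assuming the file is created under /tmp (that location, and the file name template, are assumptions here); can_create_in is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: try to create and remove a scratch file in a directory as the
# current user. can_create_in is a hypothetical helper name; the
# ".Xauthority-XXXXXX" template is only illustrative.
can_create_in() {
    dir=$1
    f=$(mktemp "$dir/.Xauthority-XXXXXX" 2>/dev/null) || return 1
    rm -f "$f"
}

# Run as an ordinary (non-root) user on the compute node.
if can_create_in /tmp; then
    echo "$(id -un) can create files in /tmp"
else
    echo "$(id -un) cannot create files in /tmp"
fi
```

Running this as root would not be conclusive, because root bypasses ordinary directory permissions.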

Is this a known issue with zram, or is zram a red herring and something else is wrong?