On 4/16/26 14:05, Pharthiphan Asokan via slurm-users wrote:
Hi team, We’re observing job aborts on Intel-based nodes immediately after a slurmctld reload. AMD nodes remain stable and jobs continue unaffected. No system or Slurm configuration changes were made before the issue started. Error observed:
error: Aborting JobID=1288 due to change in socket/core configuration of allocated nodes
What's your Slurm version?

Please run "slurmd -C" on each type of node, and verify that your slurm.conf NodeName=... lines agree with this output. Any deviation could cause the problem that you're experiencing.

Example output:

$ slurmd -C
NodeName=a045 CPUs=40 Boards=1 SocketsPerBoard=2 CoresPerSocket=20 ThreadsPerCore=1 RealMemory=385045

IHTH,
Ole

--
Ole Holm Nielsen
PhD, Senior HPC Officer
Department of Physics, Technical University of Denmark
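
The comparison between the "slurmd -C" output and the slurm.conf NodeName line can also be done mechanically. What follows is only a minimal sketch in Python, not a Slurm tool: the helper names parse_node_line and compare are hypothetical, the "configured" line is made up for illustration, and keys are matched case-sensitively even though slurm.conf itself is not case-sensitive. RealMemory is left out on the assumption that a site may deliberately configure it lower than the detected value.

```python
import re

# Hardware layout fields reported by "slurmd -C" that slurm.conf may also declare.
FIELDS = ("CPUs", "Boards", "SocketsPerBoard", "Sockets",
          "CoresPerSocket", "ThreadsPerCore")

def parse_node_line(line):
    """Return the Key=Value pairs of a NodeName=... line as a dict of strings."""
    return dict(re.findall(r"(\w+)=(\S+)", line))

def compare(detected_line, configured_line):
    """Return (field, detected, configured) for each field present in both lines that differs."""
    detected = parse_node_line(detected_line)
    configured = parse_node_line(configured_line)
    return [
        (field, detected[field], configured[field])
        for field in FIELDS
        if field in detected and field in configured
        and detected[field] != configured[field]
    ]

# The example "slurmd -C" output from this message.
detected = ("NodeName=a045 CPUs=40 Boards=1 SocketsPerBoard=2 "
            "CoresPerSocket=20 ThreadsPerCore=1 RealMemory=385045")
# A made-up slurm.conf entry with a deliberate mismatch (illustration only).
configured = ("NodeName=a045 CPUs=40 Boards=1 SocketsPerBoard=2 "
              "CoresPerSocket=10 ThreadsPerCore=2 RealMemory=385045")

for field, seen, conf in compare(detected, configured):
    print(f"{field}: slurmd -C reports {seen}, slurm.conf has {conf}")
```

Only fields that appear in both lines are compared, so a value that slurm.conf leaves unset is not flagged; pass in the NodeName line of the "slurmd -C" output, not any additional lines it may print.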