[slurm-users] Wrong hwloc detected?

Diego Zuccato diego.zuccato at unibo.it
Fri Nov 5 14:38:29 UTC 2021


They aren't using modules, so it must be something system-wide :(
But not all jobs are affected, and it seems a bit random (it doesn't 
always happen).
I'm out of ideas at the moment :(
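
One check that might still narrow this down: hwloc-info reports the version 
of the installed command-line tools, which is not necessarily the library 
that slurmstepd itself is linked against. Below is a minimal sketch, assuming 
ldd is available on the node and that slurmstepd is either in PATH or at 
/usr/sbin/slurmstepd (that path is a guess, adjust it to the local install). 
The helper name hwloc_paths is made up for this example.

```shell
#!/bin/sh
# Print the resolved path of every libhwloc entry found in `ldd` output
# read from stdin. hwloc_paths is a hypothetical helper, not a Slurm tool.
hwloc_paths() {
    awk '$1 ~ /^libhwloc\.so/ && $2 == "=>" { print $3 }'
}

# Locate slurmstepd; the fallback path is an assumption about the install.
stepd=$(command -v slurmstepd || echo /usr/sbin/slurmstepd)

# Only run ldd if the binary actually exists on this node.
if [ -x "$stepd" ]; then
    ldd "$stepd" | hwloc_paths
fi
```

If this prints a libhwloc path that differs from the one belonging to the 
hwloc 2.4.1 package, that would at least explain a mismatch. If it prints 
nothing, slurmstepd may load hwloc only through a plugin, and the same 
filter could be applied to the ldd output of the plugin .so files instead.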

On 05/11/2021 13:10, Ole Holm Nielsen wrote:
> On 11/5/21 12:47, Diego Zuccato wrote:
>> Some users are reporting this error:
>>
>> slurmstepd-str957-mtx-01: error: hwloc_get_obj_below_by_type() 
>> failing, task/affinity plugin may be required to address bug fixed in 
>> HWLOC version 1.11.5
>> slurmstepd-str957-mtx-01: error: task[0] unable to set taskset '0x0'
>>
>> I checked on that node and hwloc is newer:
>> diego.zuccato@str957-mtx-01:~$ hwloc-info --version
>> hwloc-info 2.4.1
>>
>> How can Slurm detect such an old HWLOC version?
> 
> Maybe the user loads a software module which also loads an old hwloc 
> module? The user should run "module list" in the job to verify this.
> 
> My 2 cents,
> Ole
> 

-- 
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786


