[slurm-users] Generating OPA topology.conf
Jeffrey Frey
frey at udel.edu
Wed Jun 13 14:35:09 MDT 2018
Intel's OPA doesn't include the old IB net discovery library/API; instead, they have their own library to enumerate nodes, links, etc. I've started a rewrite of ye olde "ib2slurm" utility to make use of Intel's new enumeration library.
https://gitlab.com/jtfrey/opa2slurm
E.g.
$ opa2slurm --help
usage:
  opa2slurm {options}
  [HFI selection]
    -N, --hfi-name <hfi_name>     use the named HFI (e.g. hfi1_0)
    -n, --hfi-num <#>             use the HFI by integer index (0 = first active)
    -P, --hfi-port <#>            use the given port number on the HFI
    -G, --port-guid <guid>        use the port with the given GUID
                                  (e.g. 0x00117500d9000140)
    -o, --output <path>           write output topology configuration
                                  to the file at the given path
    -C, --no-comments             do not emit comments in the generated
                                  topology configuration
    -R, --no-ranged-lists         do not produce ranged name lists a'la SLURM
    -L, --linkspeed               include LinkSpeed values for switches
    -r, --no-redundancy-removal   do not remove references to non-leaf switches from
                                  leaf switches
    -v, --verbose                 display additional information to stderr
  [version 0.1]
$ opa2slurm --no-comments --linkspeed
SwitchName=r02-opa-s1 Switches=r00-opa-l[0-1],r01-opa-l[0-1],r02-opa-l0 LinkSpeed=16
SwitchName=r00-opa-l1 Nodes=r00n[25-56],r00oss0 LinkSpeed=16
SwitchName=r02-opa-s4 Switches=r00-opa-l[0-1],r01-opa-l[0-1],r02-opa-l0 LinkSpeed=16
SwitchName=r02-opa-s3 Switches=r00-opa-l[0-1],r01-opa-l[0-1],r02-opa-l0 LinkSpeed=16
SwitchName=r00-opa-l0 Nodes=r00n[00-24],r00oss1 LinkSpeed=16
SwitchName=r02-opa-s2 Switches=r00-opa-l[0-1],r01-opa-l[0-1],r02-opa-l0 LinkSpeed=16
SwitchName=r02-opa-s5 Switches=r00-opa-l[0-1],r01-opa-l[0-1],r02-opa-l0 LinkSpeed=16
SwitchName=r02-opa-l0 Nodes=r02login[00-01],r02mds[0-1],r02mgmt[00-02],r02s[00-01] LinkSpeed=16
SwitchName=r01-opa-l0 Nodes=r01n[00-24],r01oss1 LinkSpeed=16
SwitchName=r02-opa-s6 Switches=r00-opa-l[0-1],r01-opa-l[0-1],r02-opa-l0 LinkSpeed=16
SwitchName=r01-opa-l1 Nodes=r01n[25-56],r01oss0 LinkSpeed=16
When querying for links between nodes using Intel's API, both directions of each link are returned (e.g. LID 1 -> 2 and 2 -> 1). The program currently looks for any non-leaf switches and removes references to them from leaf switches -- very simple. The LinkSpeed values are the (semi-arbitrary) product of the API's link width and base link speed enumerations as reported for a switch, taking the maximum value across all ports on the switch.
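The processing described above can be sketched in Python roughly as follows. This is an illustration only, not the actual opa2slurm source and not Intel's API: the input shapes (a list of name pairs, a dict of node kinds, a list of (width, speed) integer pairs per switch) and all function names are assumptions, "leaf" is taken to mean a switch with at least one host attached, and the port values in the example are invented.

```python
from collections import defaultdict


def build_topology(links, kinds):
    """links: (a, b) name pairs, possibly reported in both directions.
    kinds: name -> "switch" or "node".
    Returns {switch: {"nodes": [...], "switches": [...]}}."""
    # Treat A->B and B->A as one link by storing each pair as a frozenset.
    undirected = {frozenset(pair) for pair in links if pair[0] != pair[1]}

    topo = defaultdict(lambda: {"nodes": set(), "switches": set()})
    for pair in undirected:
        a, b = tuple(pair)
        for here, there in ((a, b), (b, a)):
            if kinds[here] == "switch":
                key = "switches" if kinds[there] == "switch" else "nodes"
                topo[here][key].add(there)

    # Assumption: a non-leaf switch is one with no hosts attached.
    non_leaf = {name for name, entry in topo.items() if not entry["nodes"]}
    for name, entry in topo.items():
        if name not in non_leaf:
            entry["switches"] -= non_leaf   # the redundancy removal step

    return {name: {k: sorted(v) for k, v in entry.items()}
            for name, entry in sorted(topo.items())}


def link_speed(ports):
    """Maximum over ports of (width value * base speed value)."""
    return max(width * speed for width, speed in ports)


def format_line(name, entry, speed=None):
    """Render one topology.conf-style line for a switch."""
    parts = [f"SwitchName={name}"]
    if entry["switches"]:
        parts.append("Switches=" + ",".join(entry["switches"]))
    if entry["nodes"]:
        parts.append("Nodes=" + ",".join(entry["nodes"]))
    if speed is not None:
        parts.append(f"LinkSpeed={speed}")
    return " ".join(parts)


# Tiny made-up fabric: one spine (s1), two leaves (l0, l1), one host each.
kinds = {"s1": "switch", "l0": "switch", "l1": "switch",
         "n00": "node", "n01": "node"}
links = [("s1", "l0"), ("l0", "s1"), ("s1", "l1"), ("l1", "s1"),
         ("l0", "n00"), ("n00", "l0"), ("l1", "n01"), ("n01", "l1")]
topo = build_topology(links, kinds)
for name, entry in topo.items():
    # (8, 2) and (4, 2) are invented stand-ins for per-port enum values.
    print(format_line(name, entry, link_speed([(8, 2), (4, 2)])))
```

With the leaf-to-spine references dropped, each link between the two tiers appears only once in the result, on the spine switch's Switches= list, which matches the shape of the sample output earlier in this message.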
::::::::::::::::::::::::::::::::::::::::::::::::::::::
Jeffrey T. Frey, Ph.D.
Systems Programmer V / HPC Management
Network & Systems Services / College of Engineering
University of Delaware, Newark DE 19716
Office: (302) 831-6034 Mobile: (302) 419-4976
::::::::::::::::::::::::::::::::::::::::::::::::::::::