Date:      Sat, 01 Mar 2025 14:19:44 -0800
From:      Ravi Pokala <rpokala@freebsd.org>
To:        Warner Losh <imp@bsdimp.com>
Cc:        "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject:   Re: PCI topology-based hints
Message-ID:  <94A47C59-46D7-40E3-B680-8364558BF623@panasas.com>
In-Reply-To: <CANCZdfp2NWGiitbGvgYZnHWZziOuyM8RKAmym=AABAiiraxPTw@mail.gmail.com>
References:  <60B4BAAA-E333-4219-99BE-D6C1B198E0BD@freebsd.org> <CANCZdfp2NWGiitbGvgYZnHWZziOuyM8RKAmym=AABAiiraxPTw@mail.gmail.com>


--B_3823683588_540832194
Content-type: text/plain;
	charset="UTF-8"
Content-transfer-encoding: quoted-printable

> Yes. You can use what's already there, though it may be undocumented, or at the very least underdocumented. You can wire devices to the UEFI path, which is guaranteed to be unique and avoids all these problems.
>
> hint.nvme.77.at="UEFI:PcieRoot(2)/Pci(0x1,0x1)/Pci(0x0,0x0)"
>
> Which is on PCIe root complex 2, then follow device 1 function 1 on that bus to device 0 function 0 on the bus behind it. `devctl getpath UEFI nvme0` will do all the heavy lifting for you. TaDa! No bus numbers.
>
> I added this several years ago to solve exactly this problem, or what happens when you lose a riser card, etc.
>
> Warner

Sweet! Thanks Warner, that’s exactly what I’m looking for. :-)

You’re right that it’s under-documented. It should be relatively easy to find the list of buses which support wiring; I think this search should find them:

| grep -Erl 'DEVMETHOD.*hint' /usr/src/sys
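
A narrower variant of that search, as a minimal sketch. It assumes that the newbus method behind unit wiring is `bus_hint_device_unit`, so it matches only drivers which register that method, rather than any DEVMETHOD line containing "hint". The function name `list_wiring_buses` and its default path are made up for illustration.

```sh
# List source files that register a bus_hint_device_unit method.
# Assumption: this is the newbus method which implements "at" wiring hints.
list_wiring_buses() {
    srcdir=${1:-/usr/src/sys}
    grep -Erl 'DEVMETHOD\([[:space:]]*bus_hint_device_unit' "$srcdir" | sort
}
```

Running `list_wiring_buses` with no argument searches /usr/src/sys; pass another directory to search a different source tree.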

The next step would be to make sure that each bus’s manpage describes the hinting mechanism, and to add cross-references between that manpage and device.hints(5).

If that sounds right, I’ll see if I can find some time to do that in the near future.

Thanks again!

-Ravi (rpokala@)


From: Warner Losh <imp@bsdimp.com>
Date: Saturday, March 1, 2025 at 13:23
To: Ravi Pokala <rpokala@freebsd.org>
Cc: "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject: Re: PCI topology-based hints

On Fri, Feb 28, 2025 at 12:43 AM Ravi Pokala <rpokala@freebsd.org> wrote:

Hi folks,

Setting up device attachment hints based on PCI address is easy; it's right there in the manual, pci(4):

| DEVICE WIRING
|      You can wire the device unit at a given location with device.hints.
|      Entries of the form hints.<name>.<unit>.at="pci<B>:<S>:<F>" or
|      hints.<name>.<unit>.at="pci<D>:<B>:<S>:<F>" will force the driver name to
|      probe and attach at unit unit for any PCI device found to match the
|      specification, where:
| ...
|    Examples
|      Given the following lines in /boot/device.hints:
|      hint.nvme.3.at="pci6:0:0" hint.igb.8.at="pci14:0:0" If there is a device
|      that supports igb(4) at PCI bus 14 slot 0 function 0, then it will be
|      assigned igb8 for probe and attach.  Likewise, if there is an nvme(4)

That's all well and good in a world without pluggable and hot-swappable devices, but things get trickier when devices can appear and disappear.

We have systems which have multiple U.2 bays, which take NVMe PCIe devices. Across multiple reboots, the <D, B, S, F> address assigned to the device in each of those bays was consistent. Great! We set up wiring hints for those devices, and confirmed that the wiring worked when devices were swapped ...

... until we added a NIC to the hot-swap OCP slot and rebooted.

While things continued to work before the reboot, many addresses changed upon reboot. It looks like the slot into which the NIC was installed is on the same segment of the bus as the U.2 bays. When that segment was enumerated, the addresses got shuffled to include the NIC.

So, we can't necessarily rely on the PCI <D, B, S, F> address. But the PCIe topology is consistent, even when devices are added and removed -- it's the physical wiring between the root complex, bridges, devices, and expansion slots.

The `lspci' utility -- ubiquitous on Linux, and available via the "sysutils/pciutils" port on FreeBSD -- can show the topology. For example, consider three NVMe devices, reported by `pciconf', and by `lspci's tree view (device details redacted):

| % pciconf -l | tr '@' ' ' | sort -V -k2 | grep nvme
| nvme2 pci0:65:0:0: ...
| nvme0 pci0:133:0:0: ...
| nvme1 pci0:137:0:0: ...
| %
| % lspci -vt | grep -C2 -E '^..-|NVMe'
| -+-[0000:00]-+-00.0  Root Complex
|  |           +-00.2  ...
|  |           +-00.3  ...
| --
|  |           +-18.6  ...
|  |           \-18.7  ...
|  +-[0000:40]-+-00.0  Root Complex
|  |           +-00.2  ...
|  |           +-00.3  ...
|  |           +-01.0  ...
|  |           +-01.1-[41]----00.0  ${VENDOR} NVMe
|  |           +-01.3-[42-43]--
|  |           +-01.4-[44-45]--
| --
|  |           |            \-00.1  ...
|  |           \-07.2  ...
|  +-[0000:80]-+-00.0  Root Complex
|  |           +-00.2  ...
|  |           +-00.3  ...
| --
|  |           +-03.0  ...
|  |           +-03.1-[83-84]--
|  |           +-03.2-[85-86]----00.0  ${VENDOR} NVMe
|  |           +-03.3-[87-88]--
|  |           +-03.4-[89-8a]----00.0  ${VENDOR} NVMe
|  |           +-04.0  ...
|  |           +-05.0  ...
| --
|  |           |            \-00.1  ...
|  |           \-07.2  ...
|  \-[0000:c0]-+-00.0  Root Complex
|              +-00.2  ...
|              +-00.3  ...

The first set of xdigits, "[0000:n0]", is a "domain" and "bus", which are only shown for the Root Complex devices. The second set of xdigits, "xy.z", is the "slot" and "function" of a device on that bus, which is either an endpoint or a bridge. If it is a bridge, it is followed by a set of xdigits in brackets, the range of bus numbers behind that bridge; the first of these becomes the "bus" of the attached endpoint, and the "xy.z" after it is the endpoint's "slot" and "function".

Thus, we can see from the tree that the NVMe devices are "0000:41:00.0", "0000:85:00.0", and "0000:89:00.0". (Which, if you convert to decimal, is the same as reported by `pciconf': "pci0:65:0:0", "pci0:133:0:0", "pci0:137:0:0".) It is also apparent that the latter two devices are connected to the same bridge, which in turn is connected to a different root complex than the first device.
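
The hex-to-decimal conversion can be sketched as a small shell function. `lspci_to_pciconf` is a made-up name, and the sketch assumes the full "domain:bus:slot.function" form used above.

```sh
# Convert an lspci-style address (hex "DDDD:BB:SS.F") into the decimal
# "pciD:B:S:F" form which pciconf reports.
lspci_to_pciconf() {
    addr=$1
    domain=${addr%%:*}      # e.g. "0000"
    rest=${addr#*:}         # e.g. "85:00.0"
    bus=${rest%%:*}         # e.g. "85"
    devfn=${rest#*:}        # e.g. "00.0"
    slot=${devfn%%.*}       # e.g. "00"
    func=${devfn#*.}        # e.g. "0"
    # printf reads "0x.." arguments as hex and prints them as decimal.
    printf 'pci%d:%d:%d:%d\n' "0x$domain" "0x$bus" "0x$slot" "0x$func"
}
```

For example, `lspci_to_pciconf 0000:85:00.0` prints `pci0:133:0:0`.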

The problem is, depending on what devices are connected to a given root complex, the "bus" component which is associated with a bridge slot can change. In the example above, with the current population of devices in the "0000:80" portion of the tree, the "bus" components associated with bridge "03" are "83", "85", "87", and "89". But add another device to "0000:80" and reboot, and the addresses associated with bridge "03" become "84", "86", "88", and "8a".
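
To make the stable and unstable parts explicit, here is a rough sketch which splits one bridge line of the tree view into its fields. `parse_bridge_line` is a hypothetical helper, and the pattern only covers lines shaped like the ones in the listing above.

```sh
# Print "<bridge slot.function> <first bus behind the bridge> <endpoint
# slot.function>" for a tree line such as "+-03.2-[85-86]----00.0 ...".
# Prints nothing for lines which have no endpoint behind the bridge.
parse_bridge_line() {
    printf '%s\n' "$1" | sed -n -E \
        's/.*-([0-9a-f]{2}\.[0-7])-\[([0-9a-f]{2})(-[0-9a-f]{2})?\]-+([0-9a-f]{2}\.[0-7]).*/\1 \2 \4/p'
}
```

Of the three fields, only the middle one (the bus number) depends on what else was enumerated; the first and last describe the physical position of the bridge and the endpoint.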

The question is this: How do I indicate that I would like a certain device unit to be wired to a specific bridge device address and slot -- which cannot change -- rather than to a specific <D, B, S, F>, where the "B" component can change?

Any thoughts?

Yes. You can use what's already there, though it may be undocumented, or at the very least underdocumented. You can wire devices to the UEFI path, which is guaranteed to be unique and avoids all these problems.

hint.nvme.77.at="UEFI:PcieRoot(2)/Pci(0x1,0x1)/Pci(0x0,0x0)"

Which is on PCIe root complex 2, then follow device 1 function 1 on that bus to device 0 function 0 on the bus behind it. `devctl getpath UEFI nvme0` will do all the heavy lifting for you. TaDa! No bus numbers.
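
As a minimal sketch, a hint line in the form shown above can be assembled from a driver name, a unit number, and a path string. `uefi_hint` is a made-up helper; whether `devctl getpath` prints the path with or without a leading "UEFI:" is an assumption here, so the function accepts both.

```sh
# Format a device.hints wiring line from a driver name, a unit number,
# and a UEFI device path.
uefi_hint() {
    driver=$1 unit=$2 path=$3
    path=${path#UEFI:}      # tolerate a path which already has the prefix
    printf 'hint.%s.%s.at="UEFI:%s"\n' "$driver" "$unit" "$path"
}
```

On a FreeBSD system this could be used as `uefi_hint nvme 77 "$(devctl getpath UEFI nvme0)"`, with the result appended to /boot/device.hints after checking it by eye.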

I added this several years ago to solve exactly this problem, or what happens when you lose a riser card, etc.

Warner



--B_3823683588_540832194--





Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?94A47C59-46D7-40E3-B680-8364558BF623>