Date:      Thu, 16 Mar 2023 22:26:40 +0100
From:      Attila Nagy <nagy.attila@gmail.com>
To:        Yves Guérin <yvesguerin@yahoo.ca>
Cc:        "freebsd-stable@freebsd.org" <freebsd-stable@freebsd.org>
Subject:   Re: Kernel DHCP unpredictable/fails (PXE boot), userspace DHCP works just fine
Message-ID:  <CAM2hQG9XucjqM763CcCivtZufc4BYQi5BDdjmzAAMccBVEy2hA@mail.gmail.com>
In-Reply-To: <132303943.191443.1679001265318@mail.yahoo.com>
References:  <CAM2hQG-p=bfSh_nxuah9zcTBbz7HQ9pYyvOR2f6rC=CUGePKsg@mail.gmail.com> <CAM2hQG-oDRsoccg3S1LykyUF=joWbdJz=GSPOnUroDRxjZ2_iQ@mail.gmail.com> <132303943.191443.1679001265318@mail.yahoo.com>

Hey,

Sure. We're talking about 30 machines, all behave the same (either bad or
good). I'm pretty sure it's not a cabling issue. :)

Yves Guérin <yvesguerin@yahoo.ca> wrote (on Thu, 16 Mar 2023, 22:14):

> Dear Attila,
>
> Maybe I will add some noise to your thread, sorry in advance. I am just a
> sysadmin, and I faced the same problem with one of my old HP G7s: the
> network card was malfunctioning. Sometimes it worked and sometimes not
> when I used PXE and dhcpd (it took too long to answer DHCP, so the
> motherboard decided to reboot, and so on in an infinite loop). The card
> works perfectly when it's set up by an OS.
>
> Maybe it's a stupid question or two: did you check the network cable? (I
> have faced defective cables that ruined my day...) In the same way, what
> about the hub/router attached to this server (configuration, etc.)? Did
> you swap a good one for a bad one (same network cable, hub/router, etc.)?
>
> I spend too many nights in the lab...
>
> Regards,
>
> Yves Guerin
>
>
> On Thursday, 16 March 2023 at 16:44:49 UTC−4, Attila Nagy
> <nagy.attila@gmail.com> wrote:
>
>
> Hi,
>
> As this is super annoying, I'm willing to pay a $500 bounty for solving
> this issue (to whoever is first, though I don't anticipate much
> competition :). An invoice would be best, but I'm willing to accept
> individuals as well.
> I can't give remote access, but I can run debug builds with a serial
> console. This is on the stable/13 branch.
>
> I have a bunch of netbooted machines, one set in a cluster is older (HP
> DL80 G9, 2x8C, Intel I350 -igb- NICs), the other set is newer (HP XL225n
> G10, AMD EPYC 2x16C, BCM57412 -bnxt- NICs).
> All of these boot from the network, which is basically:
> - get IP and options with DHCP with the help of the NIC's PXE stack
> - get the loader and kernel, start it
> - do another round of DHCP from the kernel (bootp_subr.c)
> - mount the root via NFS and let everything work as usual
>
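
(For readers less familiar with the DHCP round in the third step above: the
sketch below, in Python rather than the kernel's C, builds a minimal
DHCPDISCOVER using the standard BOOTP layout from RFC 2131. The function
name, the broadcast flag and the single option are assumptions made for
illustration; this is not the exact packet bootp_subr.c sends.)

```python
import struct

# RFC 2131 magic cookie that separates the BOOTP header from the options.
DHCP_MAGIC_COOKIE = bytes([99, 130, 83, 99])


def build_dhcp_discover(mac: bytes, xid: int) -> bytes:
    """Build a minimal broadcast DHCPDISCOVER (hypothetical helper)."""
    if len(mac) != 6:
        raise ValueError("expected a 6-byte Ethernet MAC address")
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,          # op: BOOTREQUEST
        1,          # htype: Ethernet
        6,          # hlen: length of the MAC address
        0,          # hops
        xid,        # transaction id, echoed back by the server
        0,          # secs
        0x8000,     # flags: broadcast bit (an assumption for this sketch)
        b"\0" * 4,  # ciaddr
        b"\0" * 4,  # yiaddr
        b"\0" * 4,  # siaddr
        b"\0" * 4,  # giaddr
        mac,        # chaddr, zero-padded to 16 bytes by struct
        b"",        # sname, zero-padded to 64 bytes
        b"",        # file, zero-padded to 128 bytes
    )
    # Option 53 (DHCP message type) = 1 (DISCOVER), followed by END (255).
    options = bytes([53, 1, 1, 255])
    return header + DHCP_MAGIC_COOKIE + options


packet = build_dhcp_discover(bytes.fromhex("001122334455"), 0x12345678)
```

A client would send these bytes from UDP port 68 to the broadcast address
on port 67 and then wait for a reply carrying the same transaction id.
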
> The problem is that the newer machines take an unpredictable amount of
> time to boot.
> The older ones (with igb NICs) work reliably; they always boot fast.
> The process of getting an IP address via DHCP (bootpc_call from
> bootp_subr.c) either succeeds normally (in a few seconds) or takes a very
> long time.
> Measured boot times commonly range from tens of minutes to several hours
> (1-6).
> Sometimes it just gets stuck and can't get past bootpc_call (getting
> the DHCP lease).
>
> What I've already tried:
> - we have a redundant set of DHCP servers which offer static leases (so
> there are two DHCPOFFERs), so I tried turning one of them off; nothing
> changed
> - tried disabling SMP; the effect is the same
> - tried to see whether it's a network issue. The NIC's PXE stack always
> gets the lease quickly, and booting FreeBSD from an ISO and running
> dhclient on the same interface is also fast. After the machines have
> booted, there are no network issues; they work reliably (for more than a
> year on 20+ machines, so not just a few hours)
>
> This issue wasn't so bad previously (only a few mins to tens of minutes
> delay), but recently it got pretty unbearable, even making some machines
> unbootable for days...
>
> At first I thought it might be packet loss (or, more exactly, a failure
> to deliver packets from the DHCP server to the receiving socket), either
> in the network or in the NIC/kernel itself, so I placed a few random
> printfs into bootp_subr.c and udp_usrreq.c.
>
> After spending some time trying to understand the problem, it feels like
> a race condition in bootpc_call, but I don't know the code well enough to
> verify that effectively.
>
> Here are the modified bootp_subr.c and udp_usrreq.c:
>
> https://gist.githubusercontent.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a8ade8af252f618c84a46da2452d557ebc5078ac/bootp_subr.c
>
> https://gist.github.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a8ade8af252f618c84a46da2452d557ebc5078ac/udp_usrreq.c
> (modified from stable/13 branch from a few weeks earlier)
>
> This is the output with the always working DL80 (igb) machine:
>
> https://gist.github.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a8ade8af252f618c84a46da2452d557ebc5078ac/DL80%2520igb%2520good.txt
>
> This is the console output from a working boot for the XL225n (bnxt)
> machine:
>
> https://gist.github.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a8ade8af252f618c84a46da2452d557ebc5078ac/XL225n%2520bnxt%2520good.txt
> as you can see, it's much slower than the DL80 (which also isn't that
> fast...)
>
> And this one is a longer output, without success up to that point (2
> minutes without completing the DHCP flow):
>
> https://gist.github.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a8ade8af252f618c84a46da2452d557ebc5078ac/XL225n%2520bnxt%2520long.txt
>
> For the latter, here's an excerpt from the DHCP log:
>
> https://gist.githubusercontent.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a8ade8af252f618c84a46da2452d557ebc5078ac/dhcp_log.txt
>
> It seems the DHCP state always gets reset to IF_DHCP_UNRESOLVED even when
> there are answers from the DHCP server.
>
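
(To make the suspected failure mode concrete, here is a toy model in
Python. It is only a sketch of the hypothesis, not the logic in
bootp_subr.c: the only name taken from the kernel is IF_DHCP_UNRESOLVED,
while the other states, the event names and the reset_on_timeout switch are
hypothetical. It shows how a reset on every retransmit timeout could throw
away an offer that has already arrived.)

```python
from enum import Enum


class State(Enum):
    UNRESOLVED = "IF_DHCP_UNRESOLVED"  # name quoted in the message above
    OFFERED = "offered"                # hypothetical intermediate state
    RESOLVED = "resolved"              # hypothetical final state


def run(events, reset_on_timeout):
    """Replay a sequence of events and return the final client state."""
    state = State.UNRESOLVED
    for event in events:
        if event == "offer" and state is State.UNRESOLVED:
            state = State.OFFERED
        elif event == "ack" and state is State.OFFERED:
            state = State.RESOLVED
            break  # lease obtained, stop processing
        elif event == "timeout" and reset_on_timeout:
            # Suspected behaviour: a retransmit timeout forgets the offer.
            state = State.UNRESOLVED
    return state


# An ACK that arrives just after a retransmit timeout.
events = ["offer", "timeout", "ack"]
with_reset = run(events, reset_on_timeout=True)
without_reset = run(events, reset_on_timeout=False)
```

With the reset enabled the client ends in UNRESOLVED and has to start over,
whereas without it the late ACK still completes the exchange. Whether the
kernel actually behaves like this is exactly what needs verifying.
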
> Here's another, longer console log, which succeeded after spending 236
> seconds in the loop:
>
> https://gist.github.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a77f52f5e83c699b38a7c2d3acdc52d26ceeba71/XL225n%2520bnxt%2520long%2520good.txt
>
> Any ideas about this?
>
>



