Date: Thu, 16 Mar 2023 22:26:40 +0100
From: Attila Nagy <nagy.attila@gmail.com>
To: Yves Guérin <yvesguerin@yahoo.ca>
Cc: "freebsd-stable@freebsd.org" <freebsd-stable@freebsd.org>
Subject: Re: Kernel DHCP unpredictable/fails (PXE boot), userspace DHCP works just fine
Message-ID: <CAM2hQG9XucjqM763CcCivtZufc4BYQi5BDdjmzAAMccBVEy2hA@mail.gmail.com>
In-Reply-To: <132303943.191443.1679001265318@mail.yahoo.com>
References: <CAM2hQG-p=bfSh_nxuah9zcTBbz7HQ9pYyvOR2f6rC=CUGePKsg@mail.gmail.com> <CAM2hQG-oDRsoccg3S1LykyUF=joWbdJz=GSPOnUroDRxjZ2_iQ@mail.gmail.com> <132303943.191443.1679001265318@mail.yahoo.com>
[-- Attachment #1 --]
Hey,

Sure. We're talking about 30 machines, all behave the same (either bad or good). I'm pretty sure it's not a cabling issue. :)

Yves Guérin <yvesguerin@yahoo.ca> wrote (on Thu, 16 Mar 2023, 22:14):

> Dear Attila,
>
> Maybe I will add some noise to your thread, sorry in advance. I am just a sysadmin, and I faced the same problem with one of my old HP G7s: the network card was broken (malfunctioning). Sometimes it worked and sometimes not when I used PXE and dhcpd (it took too much time to answer the DHCP, so the motherboard decided to reboot, etc., in an infinite loop). The card works perfectly when it's set up by an OS.
>
> Maybe it's a stupid question or two: did you check the network cable? (I have faced some defective cables and it ruined my day...) In the same way, what about the hub/router attached to this server (configuration, etc.)? Did you swap a good one for a bad one (same network cable, hub/router, etc.)?
>
> I spend too many nights in the lab...
>
> Regards,
>
> Yves Guerin
>
>
> On Thursday, 16 March 2023 at 16:44:49 UTC−4, Attila Nagy <nagy.attila@gmail.com> wrote:
>
>
> Hi,
>
> As this is super annoying, I'm willing to pay a $500 bounty for solving this issue (whoever is first, though I don't anticipate a big competition :) Having an invoice would be best, but I'm willing to accept individuals as well).
> I can't give remote access, but I can run debug builds with a serial console. stable/13 branch.
>
> I have a bunch of netbooted machines. One set in a cluster is older (HP DL80 G9, 2x8C, Intel I350 -igb- NICs), the other set is newer (HP XL225n G10, AMD EPYC 2x16C, BCM57412 -bnxt- NICs).
> All of these boot from the network, which is basically:
> - get IP and options with DHCP with the help of the NIC's PXE stack
> - get the loader and kernel, start it
> - do another round of DHCP from the kernel (bootp_subr.c)
> - mount the root via NFS and let everything work as usual
>
> The problem is that the newer machines take an unpredictable amount of time to boot. The older ones (with the igb NIC) work reliably; they always boot fast.
> The process of getting an IP address via DHCP (bootpc_call from bootp_subr.c) either succeeds normally (in a few seconds) or takes a very long time.
> Commonly measured boot times range from tens of minutes to a few hours (1-6).
> Sometimes it just gets stuck and can't get past bootpc_call (getting the DHCP lease).
>
> What I've already tried:
> - we have a redundant set of DHCP servers which offer static leases (so there are two DHCPOFFERs), so I tried turning one of them off; nothing changed
> - tried disabling SMP; the effect is the same
> - tried to see whether it's a network issue. The NIC's PXE stack always gets the lease quickly, and booting FreeBSD from an ISO and issuing dhclient on the same interface is also fast. After the machines have booted, there are no network issues; they work reliably (for more than a year on 20+ machines, so not just a few hours)
>
> This issue wasn't so bad previously (only a few minutes to tens of minutes of delay), but recently it has become pretty unbearable, even making some machines unbootable for days...
>
> At first I thought it might be packet loss (or, more exactly, a failure to deliver packets from the DHCP server to the receiving socket), either in the network or in the NIC/kernel itself, so I placed a few random printfs into bootp_subr.c and udp_usrreq.c.
>
> After spending some time trying to understand the problem, it feels like a race condition in bootpc_call, but I don't know the code well enough to effectively verify that.
>
> Here are the modified bootp_subr.c and udp_usrreq.c:
> https://gist.githubusercontent.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a8ade8af252f618c84a46da2452d557ebc5078ac/bootp_subr.c
> https://gist.github.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a8ade8af252f618c84a46da2452d557ebc5078ac/udp_usrreq.c
> (modified from the stable/13 branch as of a few weeks earlier)
>
> This is the output from the always-working DL80 (igb) machine:
> https://gist.github.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a8ade8af252f618c84a46da2452d557ebc5078ac/DL80%2520igb%2520good.txt
>
> This is the console output from a working boot of the XL225n (bnxt) machine:
> https://gist.github.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a8ade8af252f618c84a46da2452d557ebc5078ac/XL225n%2520bnxt%2520good.txt
> As you can see, it's much slower than the DL80 (which also isn't that fast...)
>
> And this one is a longer output, without success up to that point (2 minutes without completing the DHCP flow):
> https://gist.github.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a8ade8af252f618c84a46da2452d557ebc5078ac/XL225n%2520bnxt%2520long.txt
>
> For the latter, here's an excerpt from the DHCP log:
> https://gist.githubusercontent.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a8ade8af252f618c84a46da2452d557ebc5078ac/dhcp_log.txt
>
> It seems the DHCP state always gets reset to IF_DHCP_UNRESOLVED even if there are answers from the DHCP server.
>
> Here's another, longer console log, which succeeded after spending 236 seconds in the loop:
> https://gist.github.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a77f52f5e83c699b38a7c2d3acdc52d26ceeba71/XL225n%2520bnxt%2520long%2520good.txt
>
> Any ideas about this?
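
The symptom described in the message, a client that keeps falling back to an unresolved state although the server answers, can be pictured with a toy model. What follows is a minimal userspace sketch in C, not the code from bootp_subr.c: the state and event names (ST_*, EV_*) and both functions are hypothetical, and the "timeout fires before the ACK is handled" ordering is only an assumption about how such a loop might stall, not a diagnosis of the reported problem.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state names for this toy model; not the kernel's definitions. */
enum dhcp_state { ST_UNRESOLVED, ST_OFFERED, ST_RESOLVED };

/* Events the client loop can observe in this model. */
enum dhcp_event { EV_OFFER, EV_ACK, EV_TIMEOUT };

/* Advance the model by one event and return the next state. */
static enum dhcp_state
dhcp_step(enum dhcp_state s, enum dhcp_event ev)
{
	switch (s) {
	case ST_UNRESOLVED:
		/* Only an offer moves us forward; anything else is ignored. */
		return (ev == EV_OFFER ? ST_OFFERED : ST_UNRESOLVED);
	case ST_OFFERED:
		if (ev == EV_ACK)
			return (ST_RESOLVED);
		if (ev == EV_TIMEOUT)
			return (ST_UNRESOLVED);	/* drop the offer, start over */
		return (ST_OFFERED);	/* duplicate offer, e.g. a second server */
	case ST_RESOLVED:
		return (ST_RESOLVED);	/* terminal state in this model */
	}
	return (s);
}

/* Run a whole sequence of events starting from the unresolved state. */
static enum dhcp_state
dhcp_run(const enum dhcp_event *evs, size_t n)
{
	enum dhcp_state s = ST_UNRESOLVED;

	assert(evs != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		s = dhcp_step(s, evs[i]);
	return (s);
}
```

Feeding the model OFFER, OFFER, ACK (two servers answering, as in the redundant setup) ends in ST_RESOLVED, while OFFER, TIMEOUT, ACK ends in ST_UNRESOLVED, because the late ACK no longer matches a pending request. Logging the real state transitions in the same shape on the serial console might help confirm or rule out this kind of ordering.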
Want to link to this message? Use this
URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAM2hQG9XucjqM763CcCivtZufc4BYQi5BDdjmzAAMccBVEy2hA>
