Date: Thu, 16 Mar 2023 15:01:32 -0700
From: Rick Macklem <rick.macklem@gmail.com>
To: Attila Nagy <nagy.attila@gmail.com>
Cc: freebsd-stable@freebsd.org
Subject: Re: Kernel DHCP unpredictable/fails (PXE boot), userspace DHCP works just fine
Message-ID: <CAM5tNy5riHWthqzdCYncRkqOcTDdoTVy4J0WjNPbsx3zxPksKw@mail.gmail.com>
In-Reply-To: <CAM2hQG-oDRsoccg3S1LykyUF=joWbdJz=GSPOnUroDRxjZ2_iQ@mail.gmail.com>
References: <CAM2hQG-p=bfSh_nxuah9zcTBbz7HQ9pYyvOR2f6rC=CUGePKsg@mail.gmail.com> <CAM2hQG-oDRsoccg3S1LykyUF=joWbdJz=GSPOnUroDRxjZ2_iQ@mail.gmail.com>
On Thu, Mar 16, 2023 at 1:44 PM Attila Nagy <nagy.attila@gmail.com> wrote:
>
> Hi,
>
> As this is super annoying, I'm willing to pay a $500 bounty for solving this issue (whomever is first, however I don't anticipate a big competition :) Having an invoice would be best, but I'm willing to accept individuals as well).
> I can't give remote access, but can run debug builds with serial console. stable/13 branch.
>
> I have a bunch of netbooted machines, one set in a cluster is older (HP DL80 G9, 2x8C, Intel I350 -igb- NICs), the other set is newer (HP XL225n G10, AMD EPYC2x16C, BCM57412 -bnxt- NICs).
> All of these boot from the network, which is basically:
> - get IP and options with DHCP with the help of the NIC's PXE stack
> - get the loader and kernel, start it
> - do another round of DHCP from the kernel (bootp_subr.c)
> - mount the root via NFS and let everything work as usual
>
> The problem is that the newer machines take an indefinite time to boot. The older ones (with igb NIC) work reliably, they always boot fast.

Haven't you at least partially answered the question yourself here?
In other words, it sounds like there is an issue with the NIC driver for the newer chip. (If you can replace the NIC with one that uses a different chip, I'd try that.)

A possible workaround would be to switch to using "options NFS_ROOT" instead of "BOOTP_NFSROOT".
This way of doing diskless NFS depends on pxeboot loading the FreeBSD boot loader, which then sets enough environment variables that a kernel built with "options NFS_ROOT" and no "options BOOTP_NFSROOT" will boot.

Yes, both approaches should work, but if one doesn't, ...

rick

> The process of getting an IP address via DHCP (bootpc_call from bootp_subr.c) either succeeds normally (in a few seconds), or takes a lot of time.
> Common (measured) times to boot range from 10s of minutes to anywhere between a few hours (1-6).
> Sometimes it just gets stuck and couldn't get past bootpc_call (getting the DHCP lease).
>
> What I've already tried:
> - we have a redundant set of DHCP servers which offer static leases (so there are two DHCPOFFERs), so I tried to turn off one of them, nothing has changed
> - tried to disable SMP, the effect is the same
> - tried to see whether it's a network issue. The NIC's PXE stack always gets the lease quickly and booting FreeBSD from an ISO and issuing dhclient on the same interface is also fast. After the machines have booted, there are no network issues, they work reliably (since more than a year for 20+ machines, so not just a few hours)
>
> This issue wasn't so bad previously (only a few mins to tens of minutes delay), but recently it got pretty unbearable, even making some machines unbootable for days...
>
> First I thought it might be a packet loss (or more exactly packet delivery from the DHCP server to the receiving socket), either in the network or in the NIC/kernel itself, so I placed a few random printfs into bootp_subr.c and udp_usrreq.c.
>
> After spending some time trying to understand the problem it feels like a race condition in bootpc_call, but I don't know the code well enough to effectively verify that.
>
> Here are the modified bootp_subr.c and udp_usrreq.c:
> https://gist.githubusercontent.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a8ade8af252f618c84a46da2452d557ebc5078ac/bootp_subr.c
> https://gist.github.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a8ade8af252f618c84a46da2452d557ebc5078ac/udp_usrreq.c
> (modified from stable/13 branch from a few weeks earlier)
>
> This is the output with the always working DL80 (igb) machine:
> https://gist.github.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a8ade8af252f618c84a46da2452d557ebc5078ac/DL80%2520igb%2520good.txt
>
> This is the console output from a working boot for the XL225n (bnxt) machine:
> https://gist.github.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a8ade8af252f618c84a46da2452d557ebc5078ac/XL225n%2520bnxt%2520good.txt
> as you can see, it's much slower than the DL80 (which also isn't that fast...)
>
> And this one is a longer output, without success to that point (2 minutes without completing the DHCP flow):
> https://gist.github.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a8ade8af252f618c84a46da2452d557ebc5078ac/XL225n%2520bnxt%2520long.txt
>
> For the latter, here's an excerpt from the DHCP log:
> https://gist.githubusercontent.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a8ade8af252f618c84a46da2452d557ebc5078ac/dhcp_log.txt
>
> It seems the DHCP state always gets reset to IF_DHCP_UNRESOLVED even if there's answers from the DHCP server.
>
> Here's another, longer console log, which succeeded after spending 236 seconds in the loop:
> https://gist.github.com/bra-fsn/128ae9a3bbc0dbdbb2f6f4b3e2c5157a/raw/a77f52f5e83c699b38a7c2d3acdc52d26ceeba71/XL225n%2520bnxt%2520long%2520good.txt
>
> Any ideas about this?
>
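For reference, a minimal sketch of the kernel configuration difference behind the "options NFS_ROOT" workaround mentioned in the reply. It is an illustration under assumptions, not a tested configuration: it assumes a custom kernel config file is in use, and the comments describe the expected interaction of the options rather than anything verified on the affected bnxt hardware.

```
# Kernel config fragment (sketch only).
options 	NFSCL		# NFS client; NFS_ROOT requires it
options 	NFS_ROOT	# allow NFS to be used as the root file system

# Left out, so the kernel does not run its own DHCP/BOOTP exchange
# (bootp_subr.c) and relies on what the loader passed in instead:
#options 	BOOTP
#options 	BOOTP_NFSROOT
```

With a kernel built this way, the second DHCP round from the kernel is skipped entirely; the root mount depends on the environment variables that the loader sets after the NIC's PXE stack has obtained the lease.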