Date:      Fri, 17 Mar 2023 21:05:55 +0100
From:      Attila Nagy <nagy.attila@gmail.com>
To:        Rick Macklem <rick.macklem@gmail.com>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: Kernel DHCP unpredictable/fails (PXE boot), userspace DHCP works just fine
Message-ID:  <CAM2hQG9PM-eC7HGU6Q6oY4d-rstkad2ra_1K6DwMOUHO-0cZGQ@mail.gmail.com>
In-Reply-To: <CAM5tNy5riHWthqzdCYncRkqOcTDdoTVy4J0WjNPbsx3zxPksKw@mail.gmail.com>
References:  <CAM2hQG-p=bfSh_nxuah9zcTBbz7HQ9pYyvOR2f6rC=CUGePKsg@mail.gmail.com> <CAM2hQG-oDRsoccg3S1LykyUF=joWbdJz=GSPOnUroDRxjZ2_iQ@mail.gmail.com> <CAM5tNy5riHWthqzdCYncRkqOcTDdoTVy4J0WjNPbsx3zxPksKw@mail.gmail.com>

--00000000000092443d05f71e1bce
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Rick Macklem <rick.macklem@gmail.com> wrote (on Thu, Mar 16, 2023, 23:01):

> On Thu, Mar 16, 2023 at 1:44 PM Attila Nagy <nagy.attila@gmail.com> wrote:
> >
> > The problem is that the newer machines take an indefinite time to boot.
> > The older ones (with igb NIC) work reliably, they always boot fast.
> Haven't you at least partially answered the question yourself here?
> In other words, it sounds like there is an issue with the NIC driver
> for the newer chip. (If you can replace the NIC with one with
> a different chip, I'd try that.)
>
Yes, this driver is quite bad and has a lot of flaws, but after the OS
boots, it works fine otherwise.
I can't change the NIC. :(


>
> A possible workaround would be to switch to using "options NFS_ROOT"
> instead of "BOOTP_NFSROOT".  This way of doing diskless NFS depends on
> pxeboot loading the FreeBSD boot loader and then it sets enough
> environment variables so that a kernel built with "options NFS_ROOT"
> and no "options BOOTP_NFSROOT" will boot.
>
Oh, I had long forgotten this option, thanks for bringing it up!
Yes, it skips that code and the DHCP query along with it and works
wonderfully; the machine boots fast.
For me this confirms that the problem lies in the bootp_subr.c DHCP code
(or at least that something in it works badly with this NIC; I guess it
might be a timing issue).

I had to dig through my memory for why we don't use that, but the first boot
brought it back:
bootp_subr.c gets option 134 from the DHCP response and loads it into
kern.bootp_cookie, which is then used by /etc/rc.initdiskless to set up the
class, which we depend on.
Well, it's possible to work around that (by "encoding" the class in the NFS
root path), but that's not the same (we have different initdiskless
"classes" with the same NFS root paths).
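
To illustrate what "gets option 134 from the DHCP response" involves, here is
a minimal sketch of walking the options area of a DHCP reply (tag, length,
value triples, as in RFC 2132) and picking out one tag. This is not the code
from bootp_subr.c: the helper name find_dhcp_option and the DHCP_OPT_* macros
are made up for the example, the caller is assumed to have already skipped the
4-byte magic cookie, and option overloading (option 52) is ignored.

```c
#include <stddef.h>
#include <stdint.h>

#define DHCP_OPT_PAD    0    /* single-byte padding, no length field */
#define DHCP_OPT_END    255  /* end-of-options marker */
#define DHCP_OPT_COOKIE 134  /* the option number discussed above */

/*
 * Hypothetical helper: scan opts[0..opts_len) for the first option with
 * the given tag.  Returns a pointer to its value and stores the value
 * length in *len_out, or returns NULL if the tag is absent or the
 * options area is truncated.
 */
static const uint8_t *
find_dhcp_option(const uint8_t *opts, size_t opts_len, uint8_t tag,
    size_t *len_out)
{
	size_t i = 0;

	while (i < opts_len) {
		uint8_t t = opts[i];

		if (t == DHCP_OPT_PAD) {	/* skip padding bytes */
			i++;
			continue;
		}
		if (t == DHCP_OPT_END)		/* nothing follows */
			break;
		if (i + 1 >= opts_len)		/* no room for the length byte */
			break;

		size_t len = opts[i + 1];

		if (i + 2 + len > opts_len)	/* value runs past the buffer */
			break;
		if (t == tag) {
			*len_out = len;
			return (&opts[i + 2]);
		}
		i += 2 + len;			/* advance to the next option */
	}
	return (NULL);
}
```

Called with DHCP_OPT_COOKIE on a reply's options, this returns the raw bytes
of the cookie value (not NUL-terminated), i.e. the kind of string that ends up
in kern.bootp_cookie; a reply without the option yields NULL.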

I'm not sure if pxeboot could get that information from the PXE stack, but
I guess even if it has access to the DHCP reply, nobody is interested in
modifying it to actually pass option 134 through to kern.bootp_cookie, given
that it hasn't been implemented in all these years. :)

--00000000000092443d05f71e1bce--


