Date: Thu, 21 Dec 2023 09:53:05 -0700 From: Warner Losh <imp@bsdimp.com> To: Mark Johnston <markj@freebsd.org> Cc: Warner Losh <imp@freebsd.org>, src-committers <src-committers@freebsd.org>, "<dev-commits-src-all@freebsd.org>" <dev-commits-src-all@freebsd.org>, "<dev-commits-src-main@freebsd.org>" <dev-commits-src-main@freebsd.org> Subject: Re: git: 9e6d11ce9a51 - main - vtnet: Adjust rx buffer so IP header 32-bit aligned Message-ID: <CANCZdfrF6OSDqo37xM3hiEpZrRzzfh3kJPA00kBdVj3RExuG9Q@mail.gmail.com> In-Reply-To: <ZYRflMqw11nYhG7l@nuc> References: <202312210421.3BL4Lb9T036317@gitrepo.freebsd.org> <ZYRWH4YEb6eO5hGD@nuc> <CANCZdfqtGww8-6ObNo_rwTvcRJWH8UT%2B-f21tXodM54sqaWLpw@mail.gmail.com> <ZYRflMqw11nYhG7l@nuc>
next in thread | previous in thread | raw e-mail | index | archive | help
[-- Attachment #1 --] On Thu, Dec 21, 2023, 8:54 AM Mark Johnston <markj@freebsd.org> wrote: > On Thu, Dec 21, 2023 at 08:31:36AM -0700, Warner Losh wrote: > > On Thu, Dec 21, 2023 at 8:13 AM Mark Johnston <markj@freebsd.org> wrote: > > > > > On Thu, Dec 21, 2023 at 04:21:37AM +0000, Warner Losh wrote: > > > > The branch main has been updated by imp: > > > > > > > > URL: > > > > https://cgit.FreeBSD.org/src/commit/?id=9e6d11ce9a51d75ed6a94e180f2fb4e9188a2ba4 > > > > > > > > commit 9e6d11ce9a51d75ed6a94e180f2fb4e9188a2ba4 > > > > Author: Warner Losh <imp@FreeBSD.org> > > > > AuthorDate: 2023-12-20 19:09:09 +0000 > > > > Commit: Warner Losh <imp@FreeBSD.org> > > > > CommitDate: 2023-12-21 04:16:45 +0000 > > > > > > > > vtnet: Adjust rx buffer so IP header 32-bit aligned > > > > > > > > Call madj(m, ETHER_ALIGN) to offset rx buffers when allocating > them. > > > > This improves performance everywhere, and allows armv7 to work at > > > all. > > > > > > > > PR: 271288 (PR had a different fix than I > wound > > > up with) > > > > MFC After: 3 days > > > > Sponsored by: Netflix > > > > Differential Revision: https://reviews.freebsd.org/D43136 > > > > --- > > > > sys/dev/virtio/network/if_vtnet.c | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > diff --git a/sys/dev/virtio/network/if_vtnet.c > > > b/sys/dev/virtio/network/if_vtnet.c > > > > index 287af7751066..360176e4f845 100644 > > > > --- a/sys/dev/virtio/network/if_vtnet.c > > > > +++ b/sys/dev/virtio/network/if_vtnet.c > > > > @@ -1532,8 +1532,8 @@ vtnet_rx_alloc_buf(struct vtnet_softc *sc, int > > > nbufs, struct mbuf **m_tailp) > > > > m_freem(m_head); > > > > return (NULL); > > > > } > > > > - > > > > m->m_len = size; > > > > + m_adj(m, ETHER_ALIGN); > > > > > > The driver is expecting to get a cluster of size > sc->vtnet_rx_clustersz, > > > but now it's getting one of size sc->vtnet_rx_clustersz - 2. I don't > > > see how this change can be sufficient on its own: what prevents virtio > > > from writing sc->vtnet_rx_clustersz bytes and thereby overwriting the > > > two bytes following the cluster? > > > > > > > The only trouble that I saw was here: > > > > /* > > * Every mbuf should have the expected cluster size since > > that > > * is also used to allocate the replacements. > > */ > > KASSERT(m->m_len == clustersz, > > ("%s: mbuf size %d not expected cluster size %d", > > __func__, > > m->m_len, clustersz)); > > > > and even that's fine: We can adjust that assert. The next lines I think > > make it fine: > > m->m_len = MIN(m->m_len, len); > > > > so we only use what we say is there to land bytes into the mbuf. I'm now > > not sure though what to do about a few lines later: > > > > m_new = vtnet_rx_alloc_buf(sc, nreplace, &m_tail); > > if (m_new == NULL) { > > m_prev->m_len = clustersz; > > return (ENOBUFS); > > } > > > > which likely needs to be adjusted (or the old saved size). > > I'm not certain about what it should be doing instead, but yes this code > looks wrong now. > Yea. We need to do this at time of use not time of alloc. > I likely didn't see asserts in my testing because I didn't have LRO slow > > path packets. > > > > Do you see other places that would need adjusting? > > Presumably vtnet_rx_cluster_size() needs to be adjusted too? > > I also don't understand why we do this adjustment unconditionally > instead of only when __NO_STRICT_ALIGNMENT is not defined, as almost all > other NIC drivers do. > None of the ones I looked at had this... maybe I got unlucky. I'd think we'd have better performance if things are aligned all the way... is there a reason not to? Warner > > > All the other places > > seem to use m_len correctly on the rx path, but I'm not a nic driver guy, > > and I might be missing something. > [-- Attachment #2 --] <div dir="auto"><div><br><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Dec 21, 2023, 8:54 AM Mark Johnston <<a href="mailto:markj@freebsd.org">markj@freebsd.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Thu, Dec 21, 2023 at 08:31:36AM -0700, Warner Losh wrote:<br> > On Thu, Dec 21, 2023 at 8:13 AM Mark Johnston <<a href="mailto:markj@freebsd.org" target="_blank" rel="noreferrer">markj@freebsd.org</a>> wrote:<br> > <br> > > On Thu, Dec 21, 2023 at 04:21:37AM +0000, Warner Losh wrote:<br> > > > The branch main has been updated by imp:<br> > > ><br> > > > URL:<br> > > <a href="https://cgit.FreeBSD.org/src/commit/?id=9e6d11ce9a51d75ed6a94e180f2fb4e9188a2ba4" rel="noreferrer noreferrer" target="_blank">https://cgit.FreeBSD.org/src/commit/?id=9e6d11ce9a51d75ed6a94e180f2fb4e9188a2ba4</a><br> > > ><br> > > > commit 9e6d11ce9a51d75ed6a94e180f2fb4e9188a2ba4<br> > > > Author: Warner Losh <imp@FreeBSD.org><br> > > > AuthorDate: 2023-12-20 19:09:09 +0000<br> > > > Commit: Warner Losh <imp@FreeBSD.org><br> > > > CommitDate: 2023-12-21 04:16:45 +0000<br> > > ><br> > > > vtnet: Adjust rx buffer so IP header 32-bit aligned<br> > > ><br> > > > Call madj(m, ETHER_ALIGN) to offset rx buffers when allocating them.<br> > > > This improves performance everywhere, and allows armv7 to work at<br> > > all.<br> > > ><br> > > > PR: 271288 (PR had a different fix than I wound<br> > > up with)<br> > > > MFC After: 3 days<br> > > > Sponsored by: Netflix<br> > > > Differential Revision: <a href="https://reviews.freebsd.org/D43136" rel="noreferrer noreferrer" target="_blank">https://reviews.freebsd.org/D43136</a><br> > > > ---<br> > > > sys/dev/virtio/network/if_vtnet.c | 2 +-<br> > > > 1 file changed, 1 insertion(+), 1 deletion(-)<br> > > ><br> > > > diff --git a/sys/dev/virtio/network/if_vtnet.c<br> > > b/sys/dev/virtio/network/if_vtnet.c<br> > > > index 287af7751066..360176e4f845 100644<br> > > > --- a/sys/dev/virtio/network/if_vtnet.c<br> > > > +++ b/sys/dev/virtio/network/if_vtnet.c<br> > > > @@ -1532,8 +1532,8 @@ vtnet_rx_alloc_buf(struct vtnet_softc *sc, int<br> > > nbufs, struct mbuf **m_tailp)<br> > > > m_freem(m_head);<br> > > > return (NULL);<br> > > > }<br> > > > -<br> > > > m->m_len = size;<br> > > > + m_adj(m, ETHER_ALIGN);<br> > ><br> > > The driver is expecting to get a cluster of size sc->vtnet_rx_clustersz,<br> > > but now it's getting one of size sc->vtnet_rx_clustersz - 2. I don't<br> > > see how this change can be sufficient on its own: what prevents virtio<br> > > from writing sc->vtnet_rx_clustersz bytes and thereby overwriting the<br> > > two bytes following the cluster?<br> > ><br> > <br> > The only trouble that I saw was here:<br> > <br> > /*<br> > * Every mbuf should have the expected cluster size since<br> > that<br> > * is also used to allocate the replacements.<br> > */<br> > KASSERT(m->m_len == clustersz,<br> > ("%s: mbuf size %d not expected cluster size %d",<br> > __func__,<br> > m->m_len, clustersz));<br> > <br> > and even that's fine: We can adjust that assert. The next lines I think<br> > make it fine:<br> > m->m_len = MIN(m->m_len, len);<br> > <br> > so we only use what we say is there to land bytes into the mbuf. I'm now<br> > not sure though what to do about a few lines later:<br> > <br> > m_new = vtnet_rx_alloc_buf(sc, nreplace, &m_tail);<br> > if (m_new == NULL) {<br> > m_prev->m_len = clustersz;<br> > return (ENOBUFS);<br> > }<br> > <br> > which likely needs to be adjusted (or the old saved size).<br> <br> I'm not certain about what it should be doing instead, but yes this code<br> looks wrong now.<br></blockquote></div></div><div dir="auto"><br></div><div dir="auto">Yea. We need to do this at time of use not time of alloc.</div><div dir="auto"><br></div><div dir="auto"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> > I likely didn't see asserts in my testing because I didn't have LRO slow<br> > path packets.<br> > <br> > Do you see other places that would need adjusting?<br> <br> Presumably vtnet_rx_cluster_size() needs to be adjusted too?<br> <br> I also don't understand why we do this adjustment unconditionally<br> instead of only when __NO_STRICT_ALIGNMENT is not defined, as almost all<br> other NIC drivers do.<br></blockquote></div></div><div dir="auto"><br></div><div dir="auto">None of the ones I looked at had this... maybe I got unlucky. I'd think we'd have better performance if things are aligned all the way... is there a reason not to?</div><div dir="auto"><br></div><div dir="auto">Warner</div><div dir="auto"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> <br> > All the other places<br> > seem to use m_len correctly on the rx path, but I'm not a nic driver guy,<br> > and I might be missing something.<br> </blockquote></div></div></div>
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CANCZdfrF6OSDqo37xM3hiEpZrRzzfh3kJPA00kBdVj3RExuG9Q>
