Date:      Mon, 29 Jan 2024 14:13:25 -0700
From:      Warner Losh <imp@bsdimp.com>
To:        "Bjoern A. Zeeb" <bzeeb-lists@lists.zabbadoz.net>
Cc:        Warner Losh <imp@freebsd.org>, src-committers@freebsd.org,  dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org
Subject:   Re: git: 3be59adbb5a2 - main - vtnet: Adjust for ethernet alignment.
Message-ID:  <CANCZdfrMbeCBeCA=xuZBapR5rR-ASRtn3e3BCw%2BE8NU2AerzpA@mail.gmail.com>
In-Reply-To: <CANCZdfr_PQKH_9hvqDi=wZgR13WC6NwQ07OKdX_1pq7XrGVgfw@mail.gmail.com>
References:  <202401290514.40T5Eb1i061789@gitrepo.freebsd.org> <n4s5849r-4q46-3628-qq82-p50q3698172n@yvfgf.mnoonqbm.arg> <CANCZdfr_PQKH_9hvqDi=wZgR13WC6NwQ07OKdX_1pq7XrGVgfw@mail.gmail.com>


Sorry for (a) the top post and (b) replying to myself...

https://reviews.freebsd.org/D43656

is what I think you're worried about. Am I right, or is there some place
else that has you uneasy?

Warner

P.S. I'm also thinking about following that up with
https://reviews.freebsd.org/D43654 since that style seems more in fashion
these days.

On Mon, Jan 29, 2024 at 10:14 AM Warner Losh <imp@bsdimp.com> wrote:

>
>
> On Mon, Jan 29, 2024 at 8:26 AM Bjoern A. Zeeb <
> bzeeb-lists@lists.zabbadoz.net> wrote:
>
>> On Mon, 29 Jan 2024, Warner Losh wrote:
>>
>> > The branch main has been updated by imp:
>> >
>> > URL: https://cgit.FreeBSD.org/src/commit/?id=3be59adbb5a2ae7600d46432d3bc82286e507e95
>> >
>> > commit 3be59adbb5a2ae7600d46432d3bc82286e507e95
>> > Author:     Warner Losh <imp@FreeBSD.org>
>> > AuthorDate: 2024-01-29 05:08:55 +0000
>> > Commit:     Warner Losh <imp@FreeBSD.org>
>> > CommitDate: 2024-01-29 05:08:55 +0000
>> >
>> >    vtnet: Adjust for ethernet alignment.
>> >
>> >    If the header that we add to the packet's size is 0 % 4 and we're
>> >    strictly aligning, then we need to adjust where we store the header
>> >    so the packet that follows will have its struct ip header properly
>> >    aligned.  We do this on allocation (and when we check the length of
>> >    the mbufs in the lro_nomrg case). We can't just adjust the clustersz
>> >    in the softc, because it's also used to allocate the mbufs and it
>> >    needs to be the proper size for that. Since we otherwise use the size
>> >    of the mbuf (or sometimes the smaller size of the received packet) to
>> >    compute how much we can buffer, this ensures no overflows. The 2 byte
>> >    adjustment also does not affect how many packets we can receive in
>> >    the lro_nomrg case.
>>
>>
>> Doesn't this still include at least these two un-asserted/un-documented
>> assumptions:
>>
>> (a) mbuf space is large enough to hold 2 extra bytes?  Is this always
>>      the case?
>>
>
> I was sure I had puzzled through all the cases correctly.  We adjust the
> length of the available buffer by 2 and offset everything by 2. This works
> because all the vtnet header types only have 1 or 2 byte data fields. It
> keeps us from writing too much into the buffer.
>
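To make the offset arithmetic concrete, here is a minimal sketch in plain C, not the driver code. The helper names and the SK_ constants are made up for illustration; the 12 and 14 byte header sizes are the ones quoted in this thread, and 14 is the length of an untagged Ethernet header.

```c
#include <assert.h>
#include <stddef.h>

#define SK_ETHER_HDR_LEN 14 /* untagged Ethernet header */
#define SK_ETHER_ALIGN    2 /* 14 + 2 is a multiple of 4 */

/*
 * Hypothetical helper: bytes to skip at the front of the receive buffer
 * so that the IP header ends up 4-byte aligned.  Only needed when the
 * virtio header size is 0 % 4 and the platform needs strict alignment.
 */
static size_t
sk_rx_pad(size_t hdr_size, int strict_align)
{
	/* Assumption (b) from the thread: header sizes are even. */
	assert(hdr_size % 2 == 0);
	if (strict_align && hdr_size % 4 == 0)
		return (SK_ETHER_ALIGN);
	return (0);
}

/* Offset of the IP header from the start of the receive buffer. */
static size_t
sk_ip_hdr_offset(size_t hdr_size, int strict_align)
{
	return (sk_rx_pad(hdr_size, strict_align) + hdr_size +
	    SK_ETHER_HDR_LEN);
}
```

With a 12 byte header the IP header would sit at offset 26 without the pad and at 28 with it. With a 14 byte header it already sits at 28, so no pad is needed. The virtio header itself moves to a 2 byte boundary, which is still fine if its widest field is 2 bytes.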
> However, in vtnet_rx_cluster_size, we don't adjust the frame size before
> allocating. So if mtu + 12 + 18 is 2047 or 2048 (that is, mtu = 2017 or
> 2018), we'll get it wrong (we don't adjust in the case where we use
> vtnet_rx_header, which is 14 bytes). But I think in that case we'll get it
> wrong in the "drop the last two bytes off the end" sense (since we adjust
> the total length of the mbuf space) rather than in the "overflow two
> bytes" sense. For that case, we'd need to add two, as I indicated in the
> comments below.
>
> static int
> vtnet_rx_cluster_size(struct vtnet_softc *sc, int mtu)
> {
>         int framesz;
>
>         if (sc->vtnet_flags & VTNET_FLAG_MRG_RXBUFS)
>                 return (MJUMPAGESIZE);
>         else if (sc->vtnet_flags & VTNET_FLAG_LRO_NOMRG)
>                 return (MCLBYTES);
>
>         /*
>          * Try to scale the receive mbuf cluster size from the MTU. We
>          * could also use the VQ size to influence the selected size,
>          * but that would only matter for very small queues.
>          */
>         if (vtnet_modern(sc)) {
>                 MPASS(sc->vtnet_hdr_size ==
>                     sizeof(struct virtio_net_hdr_v1));
>                 framesz = sizeof(struct virtio_net_hdr_v1);
>         } else
>                 framesz = sizeof(struct vtnet_rx_header);
>         framesz += sizeof(struct ether_vlan_header) + mtu;
> // XXX if framesz % 4 == 2 and we're strictly aligning, we need to add 2
> // XXX or equivalently, if vtnet_hdr_size % 4 == 0 and ...
>         if (framesz <= MCLBYTES)
>                 return (MCLBYTES);
>         else if (framesz <= MJUMPAGESIZE)
>                 return (MJUMPAGESIZE);
>         else if (framesz <= MJUM9BYTES)
>                 return (MJUM9BYTES);
>
>         /* Sane default; avoid 16KB clusters. */
>         return (MCLBYTES);
> }
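
A userland sketch of the adjustment hinted at by the XXX comments, for checking the boundary cases only. It is an assumption about what a fix could look like, not necessarily what D43656 does. The SK_ constants stand in for the kernel's MCLBYTES, MJUMPAGESIZE (assuming 4 KB pages), MJUM9BYTES and sizeof(struct ether_vlan_header), and the function name and strict_align parameter are hypothetical.

```c
#include <assert.h>

enum {
	SK_MCLBYTES     = 2048,
	SK_MJUMPAGESIZE = 4096, /* assumes 4 KB pages */
	SK_MJUM9BYTES   = 9216,
	SK_VLAN_HDR_LEN = 18,   /* Ethernet header plus 802.1Q tag */
	SK_ETHER_ALIGN  = 2
};

/*
 * Pick a cluster size from the MTU, counting the 2 byte alignment pad
 * when the virtio header size is 0 % 4 and we align strictly.
 */
static int
sk_rx_cluster_size(int hdr_size, int mtu, int strict_align)
{
	int framesz;

	assert(hdr_size > 0 && mtu > 0);
	framesz = hdr_size + SK_VLAN_HDR_LEN + mtu;
	if (strict_align && hdr_size % 4 == 0)
		framesz += SK_ETHER_ALIGN; /* room for the pad */

	if (framesz <= SK_MCLBYTES)
		return (SK_MCLBYTES);
	else if (framesz <= SK_MJUMPAGESIZE)
		return (SK_MJUMPAGESIZE);
	else if (framesz <= SK_MJUM9BYTES)
		return (SK_MJUM9BYTES);

	/* Same fallback as the original: avoid 16KB clusters. */
	return (SK_MCLBYTES);
}
```

With a 12 byte header and strict alignment, an mtu of 2016 still fits a 2048 byte cluster, while 2017 and 2018 move up to the next size. Without the adjustment both would have been given 2048 bytes and come up two bytes short.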
>
> Do you agree? Is this what you are worried about? It's the only hole I
> could find this morning (after going over this a dozen other times trying
> to get it right for the review; bryanv was happy and neither of us
> noticed). It also explains why my tests worked: I didn't try a weird mtu
> of 2018 bytes.
>
>
>> (b) the struct sizes assigned to vtnet_hdr_size are not odd numbers of
>>      bytes?  Could add comments or CTASSERTs?
>>
>
> True, I'll CTASSERT the sizes and say we rely on things being even sized
> in if_vtnetvar.h.
>
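A sketch of what such compile-time checks could look like, written as standalone C11 so it builds outside the kernel. The sk_ structs are stand-ins that mirror the 10, 12 and 14 byte header layouts discussed here; they are not the real kernel definitions. SK_CTASSERT is a hypothetical userland substitute for the kernel's CTASSERT.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the legacy virtio net header (10 bytes). */
struct sk_virtio_net_hdr {
	uint8_t  flags;
	uint8_t  gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
};

/* Stand-in for the v1 header: legacy fields plus num_buffers (12 bytes). */
struct sk_virtio_net_hdr_v1 {
	uint8_t  flags;
	uint8_t  gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
	uint16_t num_buffers;
};

/* Stand-in for the padded receive header (14 bytes). */
struct sk_vtnet_rx_header {
	struct sk_virtio_net_hdr hdr;
	char pad[4];
};

#define SK_CTASSERT(x) static_assert(x, #x)

/* The 2 byte offset logic relies on every header size being even. */
SK_CTASSERT(sizeof(struct sk_virtio_net_hdr) % 2 == 0);
SK_CTASSERT(sizeof(struct sk_virtio_net_hdr_v1) % 2 == 0);
SK_CTASSERT(sizeof(struct sk_vtnet_rx_header) % 2 == 0);
```

If someone later adds an odd-sized header, the build fails at the assertion instead of producing a misaligned receive path at run time.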
> Warner
>
>
>> >    PR:                     271288
>> >    Sponsored by:           Netflix
>> >    Reviewed by:            bryanv
>> >    Differential Revision:  https://reviews.freebsd.org/D43224
>>
>> --
>> Bjoern A. Zeeb                                                     r15:7
>>
>

