Date: Mon, 29 Jan 2024 14:13:25 -0700 From: Warner Losh <imp@bsdimp.com> To: "Bjoern A. Zeeb" <bzeeb-lists@lists.zabbadoz.net> Cc: Warner Losh <imp@freebsd.org>, src-committers@freebsd.org, dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org Subject: Re: git: 3be59adbb5a2 - main - vtnet: Adjust for ethernet alignment. Message-ID: <CANCZdfrMbeCBeCA=xuZBapR5rR-ASRtn3e3BCw%2BE8NU2AerzpA@mail.gmail.com> In-Reply-To: <CANCZdfr_PQKH_9hvqDi=wZgR13WC6NwQ07OKdX_1pq7XrGVgfw@mail.gmail.com> References: <202401290514.40T5Eb1i061789@gitrepo.freebsd.org> <n4s5849r-4q46-3628-qq82-p50q3698172n@yvfgf.mnoonqbm.arg> <CANCZdfr_PQKH_9hvqDi=wZgR13WC6NwQ07OKdX_1pq7XrGVgfw@mail.gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
--00000000000079bbe206101c1e42 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sorry for (a) the top post and (b) replying to myself... https://reviews.freebsd.org/D43656 is what I think you're worried about. Am I right, or is there some place else that has you uneasy? Warner P.S. I'm also thinking about following that up with https://reviews.freebsd.org/D43654 since that style seems more in fashion these days. On Mon, Jan 29, 2024 at 10:14=E2=80=AFAM Warner Losh <imp@bsdimp.com> wrote= : > > > On Mon, Jan 29, 2024 at 8:26=E2=80=AFAM Bjoern A. Zeeb < > bzeeb-lists@lists.zabbadoz.net> wrote: > >> On Mon, 29 Jan 2024, Warner Losh wrote: >> >> > The branch main has been updated by imp: >> > >> > URL: >> https://cgit.FreeBSD.org/src/commit/?id=3D3be59adbb5a2ae7600d46432d3bc82= 286e507e95 >> > >> > commit 3be59adbb5a2ae7600d46432d3bc82286e507e95 >> > Author: Warner Losh <imp@FreeBSD.org> >> > AuthorDate: 2024-01-29 05:08:55 +0000 >> > Commit: Warner Losh <imp@FreeBSD.org> >> > CommitDate: 2024-01-29 05:08:55 +0000 >> > >> > vtnet: Adjust for ethernet alignment. >> > >> > If the header that we add to the packet's size is 0 % 4 and we're >> > strictly aligning, then we need to adjust where we store the header >> so >> > the packet that follows will have it's struct ip header properly >> > aligned. We do this on allocation (and when we check the length of >> the >> > mbufs in the lro_nomrg case). We can't just adjust the clustersz in >> the >> > softc, because it's also used to allocate the mbufs and it needs to >> be >> > the proper size for that. Since we otherwise use the size of the mb= uf >> > (or sometimes the smaller size of the received packet) to compute h= ow >> > much we can buffer, this ensures no overflows. The 2 byte adjustmen= t >> > also does not affect how many packets we can receive in the lro_nom= rg >> > case. >> >> >> Doesn't this still include at least these two un-asserted/un-documented >> asumptions: >> >> (a) mbuf space is large enough to hold 2 extra bytes? Is this always >> the case? >> > > I was sure I puzzled through all the cases correctly. We adjust the leng= th > of the available buffer by 2 and offset everything by 2. this work becaus= e > all the vtnet header types only have 1 or 2 byte data fields. It keeps us > from > writing too much into the buffer. > > However, in vtnet_rx_cluster_size, we don't adjust the frame size before > allocating. So if the mtu + vlan_header. So if the mtu + 12 + 18 is 2047 > or 2048 > or mtu =3D 2017 or 2018 we'll get it wrong (we don't adjust in the case w= here > we use vtnet_rx_header which is 14 bytes). But I think in that case, we'l= l > "drop > the last two bytes off the end" get it wrong (since we adjust the total > length > of the mbuf space) rather than "overflow two bytes" get it wrong. For tha= t > case, we'd need to add two as I indicated in the comments below. > > static int > vtnet_rx_cluster_size(struct vtnet_softc *sc, int mtu) > { > int framesz; > > if (sc->vtnet_flags & VTNET_FLAG_MRG_RXBUFS) > return (MJUMPAGESIZE); > else if (sc->vtnet_flags & VTNET_FLAG_LRO_NOMRG) > return (MCLBYTES); > > /* > * Try to scale the receive mbuf cluster size from the MTU. We > * could also use the VQ size to influence the selected size, > * but that would only matter for very small queues. > */ > if (vtnet_modern(sc)) { > MPASS(sc->vtnet_hdr_size =3D=3D sizeof(struct > virtio_net_hdr_v1)); > framesz =3D sizeof(struct virtio_net_hdr_v1); > } else > framesz =3D sizeof(struct vtnet_rx_header); > framesz +=3D sizeof(struct ether_vlan_header) + mtu; > // XXX if framesz % 4 =3D=3D 2 and we're strict alignment we need to add = 2 > // XXX or equivalently, if vnet_hdr_size % 4 =3D=3D 0 and ... > if (framesz <=3D MCLBYTES) > return (MCLBYTES); > else if (framesz <=3D MJUMPAGESIZE) > return (MJUMPAGESIZE); > else if (framesz <=3D MJUM9BYTES) > return (MJUM9BYTES); > > /* Sane default; avoid 16KB clusters. */ > return (MCLBYTES); > } > > Do you agree? Is this what you are worried about? It's the only hole I > could find > this morning (after going over this a dozen other times trying to get it > right for > the review, and bryanv was happy neither noticed). It also explains why m= y > tests work: I didn't try to have a weird mtu of 2018 bytes. > > >> (b) the struct sizes assigned to vtnet_hdr_size are not odd numbers of >> bytes? Could add comments or CTASSERTs? >> > > True, I'll ctassert the sizes and say we rely on things being even sized > in if_vnetvar.h. > > Warner > > >> > PR: 271288 >> > Sponsored by: Netflix >> > Reviewed by: bryanv >> > Differential Revision: https://reviews.freebsd.org/D43224 >> >> -- >> Bjoern A. Zeeb r15:7 >> > --00000000000079bbe206101c1e42 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable <div dir=3D"ltr">Sorry=C2=A0for (a) the top post and (b) replying to myself= ...<div><br></div><div><a href=3D"https://reviews.freebsd.org/D43656">https= ://reviews.freebsd.org/D43656</a><br></div><div><br></div><div>is what I th= ink you're worried about. Am I right, or is there some place else that = has you uneasy?</div><div><br></div><div>Warner</div><div><br></div><div>P.= S. I'm also thinking about following that up with=C2=A0<a href=3D"https= ://reviews.freebsd.org/D43654">https://reviews.freebsd.org/D43654</a> since= that style seems more in fashion these days.</div></div><br><div class=3D"= gmail_quote"><div dir=3D"ltr" class=3D"gmail_attr">On Mon, Jan 29, 2024 at = 10:14=E2=80=AFAM Warner Losh <<a href=3D"mailto:imp@bsdimp.com">imp@bsdi= mp.com</a>> wrote:<br></div><blockquote class=3D"gmail_quote" style=3D"m= argin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left= :1ex"><div dir=3D"ltr"><div dir=3D"ltr"><br></div><br><div class=3D"gmail_q= uote"><div dir=3D"ltr" class=3D"gmail_attr">On Mon, Jan 29, 2024 at 8:26=E2= =80=AFAM Bjoern A. Zeeb <<a href=3D"mailto:bzeeb-lists@lists.zabbadoz.ne= t" target=3D"_blank">bzeeb-lists@lists.zabbadoz.net</a>> wrote:<br></div= ><blockquote class=3D"gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;border= -left:1px solid rgb(204,204,204);padding-left:1ex">On Mon, 29 Jan 2024, War= ner Losh wrote:<br> <br> > The branch main has been updated by imp:<br> ><br> > URL: <a href=3D"https://cgit.FreeBSD.org/src/commit/?id=3D3be59adbb5a2= ae7600d46432d3bc82286e507e95" rel=3D"noreferrer" target=3D"_blank">https://= cgit.FreeBSD.org/src/commit/?id=3D3be59adbb5a2ae7600d46432d3bc82286e507e95<= /a><br> ><br> > commit 3be59adbb5a2ae7600d46432d3bc82286e507e95<br> > Author:=C2=A0 =C2=A0 =C2=A0Warner Losh <imp@FreeBSD.org><br> > AuthorDate: 2024-01-29 05:08:55 +0000<br> > Commit:=C2=A0 =C2=A0 =C2=A0Warner Losh <imp@FreeBSD.org><br> > CommitDate: 2024-01-29 05:08:55 +0000<br> ><br> >=C2=A0 =C2=A0 vtnet: Adjust for ethernet alignment.<br> ><br> >=C2=A0 =C2=A0 If the header that we add to the packet's size is 0 %= 4 and we're<br> >=C2=A0 =C2=A0 strictly aligning, then we need to adjust where we store = the header so<br> >=C2=A0 =C2=A0 the packet that follows will have it's struct ip head= er properly<br> >=C2=A0 =C2=A0 aligned.=C2=A0 We do this on allocation (and when we chec= k the length of the<br> >=C2=A0 =C2=A0 mbufs in the lro_nomrg case). We can't just adjust th= e clustersz in the<br> >=C2=A0 =C2=A0 softc, because it's also used to allocate the mbufs a= nd it needs to be<br> >=C2=A0 =C2=A0 the proper size for that. Since we otherwise use the size= of the mbuf<br> >=C2=A0 =C2=A0 (or sometimes the smaller size of the received packet) to= compute how<br> >=C2=A0 =C2=A0 much we can buffer, this ensures no overflows. The 2 byte= adjustment<br> >=C2=A0 =C2=A0 also does not affect how many packets we can receive in t= he lro_nomrg<br> >=C2=A0 =C2=A0 case.<br> <br> <br> Doesn't this still include at least these two un-asserted/un-documented= asumptions:<br> <br> (a) mbuf space is large enough to hold 2 extra bytes?=C2=A0 Is this always<= br> =C2=A0 =C2=A0 =C2=A0the case?<br></blockquote><div><br></div><div>I was sur= e I puzzled through all the cases correctly.=C2=A0 We adjust the length</di= v><div>of the available buffer by 2 and offset everything by 2. this work b= ecause</div><div>all the vtnet header types only have 1 or 2 byte data fiel= ds. It keeps us from</div><div>writing too much into the buffer.</div><div>= <br></div><div>However, in vtnet_rx_cluster_size, we don't adjust the f= rame size before</div><div>allocating. So if the mtu=C2=A0+ vlan_header. So= if the mtu + 12 + 18 is 2047 or 2048</div><div>or mtu =3D 2017 or 2018 we&= #39;ll get it wrong (we don't adjust in the case where</div><div>we use= vtnet_rx_header which is 14 bytes). But I think in that case, we'll &q= uot;drop</div><div>the last two bytes off the end" get it wrong (since= we adjust the total length</div><div>of the mbuf space) rather than "= overflow two bytes" get it wrong. For that</div><div>case, we'd ne= ed to add two as I indicated in the comments below.</div><div><br></div><di= v>static int<br></div><div>vtnet_rx_cluster_size(struct vtnet_softc *sc, in= t mtu)<br>{<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 int framesz;<br><br>=C2=A0 =C2= =A0 =C2=A0 =C2=A0 if (sc->vtnet_flags & VTNET_FLAG_MRG_RXBUFS)<br>= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 return (MJUMPAGESIZ= E);<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 else if (sc->vtnet_flags & VTNET_= FLAG_LRO_NOMRG)<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = return (MCLBYTES);<br><br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 /*<br>=C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0* Try to scale the receive mbuf cluster size from the M= TU. We<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0* could also use the VQ size to= influence the selected size,<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0* but th= at would only matter for very small queues.<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0*/<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 if (vtnet_modern(sc)) {<br>=C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 MPASS(sc->vtnet_hdr_siz= e =3D=3D sizeof(struct virtio_net_hdr_v1));<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 framesz =3D sizeof(struct virtio_net_hdr_v1);<b= r>=C2=A0 =C2=A0 =C2=A0 =C2=A0 } else<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 framesz =3D sizeof(struct vtnet_rx_header);=C2=A0<br>= =C2=A0 =C2=A0 =C2=A0 =C2=A0 framesz +=3D sizeof(struct ether_vlan_header) += mtu;<br>// XXX if framesz % 4 =3D=3D 2 and we're strict alignment we n= eed to add 2</div><div>// XXX or equivalently, if vnet_hdr_size % 4 =3D=3D = 0 and ...<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 if (framesz <=3D MCLBYTES)<br>= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 return (MCLBYTES);<= br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 else if (framesz <=3D MJUMPAGESIZE)<br>= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 return (MJUMPAGESIZ= E);<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 else if (framesz <=3D MJUM9BYTES)<br>= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 return (MJUM9BYTES)= ;<br><br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 /* Sane default; avoid 16KB clusters. = */<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 return (MCLBYTES);<br>}<br></div><div><br= ></div><div>Do you agree? Is this what you are worried about? It's the = only hole I could find</div><div>this morning (after going over this a doze= n other times trying to get it right for</div><div>the review, and bryanv w= as happy neither noticed). It also explains why my</div><div>tests work: I = didn't try to have a weird mtu of 2018 bytes.</div><div>=C2=A0<br></div= ><blockquote class=3D"gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;border= -left:1px solid rgb(204,204,204);padding-left:1ex"> (b) the struct sizes assigned to vtnet_hdr_size are not odd numbers of<br> =C2=A0 =C2=A0 =C2=A0bytes?=C2=A0 Could add comments or CTASSERTs?<br></bloc= kquote><div><br></div><div>True, I'll ctassert the sizes and say we rel= y on things being even sized</div><div>in if_vnetvar.h.</div><div><br></div= ><div>Warner</div><div>=C2=A0</div><blockquote class=3D"gmail_quote" style= =3D"margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding= -left:1ex"> >=C2=A0 =C2=A0 PR:=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0271288<br> >=C2=A0 =C2=A0 Sponsored by:=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Net= flix<br> >=C2=A0 =C2=A0 Reviewed by:=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 bry= anv<br> >=C2=A0 =C2=A0 Differential Revision:=C2=A0 <a href=3D"https://reviews.f= reebsd.org/D43224" rel=3D"noreferrer" target=3D"_blank">https://reviews.fre= ebsd.org/D43224</a><br> <br> -- <br> Bjoern A. Zeeb=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0r15:7<br> </blockquote></div></div> </blockquote></div> --00000000000079bbe206101c1e42--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CANCZdfrMbeCBeCA=xuZBapR5rR-ASRtn3e3BCw%2BE8NU2AerzpA>