Date:      Mon, 21 Aug 2017 23:18:23 -0500
From:      "Mike Karels" <mike@karels.net>
To:        "Gopakumar Pillai" <gpillai@vmware.com>
Cc:        "Julian Elischer" <julian@freebsd.org>, "Bjoern A. Zeeb" <bzeeb-lists@lists.zabbadoz.net>, "freebsd-net@FreeBSD.org" <freebsd-net@FreeBSD.org>
Subject:   Re: Only last IP frag sent if ARP entry absent
Message-ID:  <B2C02E8A-7F4A-4715-8294-F61609E9E53D@karels.net>
In-Reply-To: <C9314EED-0092-49B4-BA3E-9FA58D81D064@vmware.com>
References:  <F9ABB88D-108D-4EF0-8962-091662F488FD@vmware.com> <43CC3432-DB42-4170-B3E7-E305561973F3@lists.zabbadoz.net> <AFD0C317-D4E2-4A9E-B6F2-CCA2B0B7464F@vmware.com> <9B1B1A12-CD9F-4A9F-B596-A2F6E5BAED1E@karels.net> <f9fcad5d-1fc3-0d53-8eb1-0577df673c38@freebsd.org> <EF9870E7-2622-4250-8897-2711C0B51692@karels.net> <C9314EED-0092-49B4-BA3E-9FA58D81D064@vmware.com>



On 21 Aug 2017, at 1:11, Gopakumar Pillai wrote:

> Looks like later FreeBSD already has some amount of queueing from what
> Oleg has pointed out:
>
> $ sysctl net.link.ether.inet.maxhold
> net.link.ether.inet.maxhold: 1
>
> As Mike mentioned, my fix looks into a logical IP packet. And it keeps
> only one logical IP packet – i.e. 64K bytes – 43 packets. I did
> test it in my code, didn’t see any issues yet.
>
> Latest FreeBSD code would keep the specified number of physical IP
> packets, possible to have more than one logical IP packet, but could
> possibly break a logical IP packet too.
>
> I do now understand it’s not a big deal, especially since there’s a
> way to configure that in latest FreeBSD code. I shall fix my code one
> of the above 2 ways.

Why not just set maxhold to your favorite value (e.g., 43)?
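
To get a feel for what a given maxhold buys, here is a minimal Python
sketch of the hold queue. It is a simplified model, not the kernel code:
it assumes IPv4 with a 20-byte header and no options, a 1500-byte MTU by
default, and that a full hold queue drops its oldest packet to make room
for the newest, which matches the "last fragment is retained" observation
quoted further down. The names fragment_count and held_after_burst are
made up for illustration.

```python
import math
from collections import deque

IP_HEADER_LEN = 20  # assumption: IPv4 header without options


def fragment_count(ip_payload_len, mtu=1500):
    """Number of IPv4 fragments needed to carry ip_payload_len bytes."""
    # Every fragment except the last carries a multiple of 8 payload bytes.
    per_fragment = (mtu - IP_HEADER_LEN) // 8 * 8
    return math.ceil(ip_payload_len / per_fragment)


def held_after_burst(num_fragments, maxhold):
    """Indices of the fragments still held when the ARP reply arrives."""
    # Assumed policy: appending to a full queue discards the oldest entry.
    hold = deque(maxlen=maxhold)
    for index in range(num_fragments):
        hold.append(index)
    return list(hold)


# Example: 4000 bytes of ping data plus the 8-byte ICMP header.
n = fragment_count(4000 + 8)
print(n)                                # 3
print(held_after_burst(n, maxhold=1))   # [2]
print(held_after_burst(n, maxhold=n))   # [0, 1, 2]
```

With maxhold at 1 only the final fragment survives, so the receiver can
never reassemble the datagram. Once maxhold is at least the fragment
count, the whole burst is kept. On a system that exposes the sysctl shown
above, the value would be raised with something like
"sysctl net.link.ether.inet.maxhold=43".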

> Thank You all for your support and help.
>
> --Gopu
>
>
> On 8/19/17, 9:56 AM, "Mike Karels" <mike@karels.net> wrote:
>
>
>
>     On 19 Aug 2017, at 4:00, Julian Elischer wrote:
>
>     > On 18/8/17 11:33 am, Mike Karels wrote:
>     >> Another $.02 (inline):
>     >>
>     >> On 17 Aug 2017, at 18:39, Gopakumar Pillai wrote:
>     >>
>     >>> Thank You Bjoern. Could you please point me to the RFC?
>     >>
>     >> I don’t know if there is anything more recent than RFC1122 on
>     >> this.  IIRC, it requires queuing at least one packet.  Queuing
>     >> one packet is what BSD has done essentially since ARP was
>     >> implemented.
>     >
>     > This asks the question:  One physical packet or one logical
>     > packet?
>     > Gopakumar's change effectively changes the queuing from one
>     > physical packet to the logical one.
>     > The next question becomes "how much extra work do we do to
>     > achieve this and does it affect anything else"?
>
>     That isn’t the whole question.  It’s one physical packet, one
>     logical packet, or multiple frames?
>     It makes more sense to me to support multiple frames rather than
>     just one logical packet.  However,
>     I don’t see a good reason to change from the current code.
>
>     >>> If this is not a MUST behavior in RFC, would my fix be good? I
>     >>> agree that this would affect only ICMP/UDP traffic.
>     >>
>     >> People have been asking for queuing of multiple packets for
>     >> years.  That is a more general change.  Consider another dumb
>     >> application that starts out by sending multiple UDP packets
>     >> back-to-back.
>     >> However, well-designed application protocols don’t experience
>     >> problems like this.  I’ll quickly note that ping isn’t an
>     >> application, but a network measuring tool.  If you ask the
>     >> question “what happens if I start off a session with a single
>     >> large packet and I don’t support retransmission”, ping answers
>     >> that question correctly.
>     >>
>     >> If badly-designed protocols get bad performance, that doesn’t
>     >> seem like a bug to me, but a feature.
>     >>
>     >>> On 8/17/17, 2:40 PM, "Bjoern A. Zeeb"
>     >>> <bzeeb-lists@lists.zabbadoz.net> wrote:
>     >>>
>     >>>     On 17 Aug 2017, at 21:16, Gopakumar Pillai wrote:
>     >>>
>     >>>     > Hi FreeBSD Networking Gurus,
>     >>>     > I came across an issue with an old version of FreeBSD
>     >>>     > and looking at the latest FreeBSD code, seems it exists
>     >>>     > even now. I am assuming that this issue is not reported.
>     >>>     >
>     >>>     > Observation:
>     >>>     > When a ping was performed with larger payload than MTU,
>     >>>     > the first ping failed when the ARP entry was absent for
>     >>>     > that IP.
>     >>>
>     >>>     That is because ping/ICMP has no retransmit.
>     >>>
>     >>>
>     >>>     > Noticed on the wire that the last IP fragment was sent
>     >>>     > for the first request and then the subsequent requests
>     >>>     > were fine.
>     >>>     >
>     >>>     > Root Cause:
>     >>>     >   * ip_output fragments the packets and loops through
>     >>>     > the fragments to send them to ether_output.
>     >>>     >   * ether_output does an arpresolve and if there is no
>     >>>     > existing ARP entry it'll return EWOULDBLOCK after
>     >>>     > sending ARP Request.
>     >>>     >   * ether_output ignores the error and propagates
>     >>>     > success to ip_output and it continues to send the
>     >>>     > remaining fragments.
>     >>>     >   * llentry keeps only one mbuf and the last fragment is
>     >>>     > retained when the ARP Reply comes and the fragment is
>     >>>     > sent.
>     >>>
>     >>>     Yes, according to the spec (RFC) we are supposed to throw
>     >>>     the packet away entirely and simply report that to the
>     >>>     next upper layer. However over the years people realised
>     >>>     that this sucks for a TCP SYN packet with a retransmit
>     >>>     timer and hence we store one of them.
>     >>>
>     >>>     A large UDP packet would btw see the same behaviour to
>     >>>     your ping.
>     >>>     There’s no guarantee any of these packets will not be
>     >>>     dropped anywhere on the network, so we can as well.
>     >>>
>     >>>     Just my 2ct
>     >>>
>     >>>     /bz
>     >>
>     >>         Mike
>     >> _______________________________________________
>     >> freebsd-net@freebsd.org mailing list
>     >> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.freebsd.org_mailman_listinfo_freebsd-2Dnet&d=DwIFaQ&c=uilaK90D4TOVoH58JNXRgQ&r=SPMIiiJNfXk7ujuip5qobK77LnnVM8kVNC-LzM_0RWk&m=gVqPCwvWs-eO0Y8jGefr8abxlnmG_GklVISDsn3solU&s=_748SiGYexZf7oZMSG2ZVDkzcelyZECM0lFMpbojDWA&e=
>     >> To unsubscribe, send any mail to
>     >> "freebsd-net-unsubscribe@freebsd.org"
>     >>
>     >>
>     >


