From: "Mike Karels"
To: "Gopakumar Pillai"
Cc: "Julian Elischer", "Bjoern A. Zeeb", "freebsd-net@FreeBSD.org"
Subject: Re: Only last IP frag sent if ARP entry absent
Date: Mon, 21 Aug 2017 23:18:23 -0500

On 21 Aug 2017, at 1:11, Gopakumar Pillai wrote:

> Looks like later FreeBSD already has some amount of queueing from what
> Oleg has pointed out:
>
> $ sysctl net.link.ether.inet.maxhold
> net.link.ether.inet.maxhold: 1
>
> As Mike mentioned, my fix looks into a logical IP packet. And it keeps
> only one logical IP packet – i.e 64K bytes – 43 packets. I did
> test it in my code, didn’t see any issues yet.
>
> Latest FreeBSD code would keep the specified number of physical IP
> packets, possible to have more than one logical IP packet, but could
> possibly break a logical IP packet too.
>
> I do now understand it’s not a big deal, especially since there’s a
> way to configure that in latest FreeBSD code. I shall fix my code one
> of the above 2 ways.

Why not just set maxhold to your favorite value (e.g. 43)?

> Thank You all for your support and help.
>
> --Gopu
>
>
> On 8/19/17, 9:56 AM, "Mike Karels" wrote:
>
> On 19 Aug 2017, at 4:00, Julian Elischer wrote:
>
> > On 18/8/17 11:33 am, Mike Karels wrote:
> >> Another $.02 (inline):
> >>
> >> On 17 Aug 2017, at 18:39, Gopakumar Pillai wrote:
> >>
> >>> Thank You Bjoern. Could you please point me to the RFC?
> >>
> >> I don’t know if there is anything more recent than RFC1122 on this.
> >> IIRC, it requires queuing at least one packet. Queuing one packet is
> >> what BSD has done essentially since ARP was implemented.
> >
> > This asks the question: One physical packet or one logical packet?
> > Gopakumar's change effectively changes the queuing from one physical
> > packet to the logical one.
> > The next question becomes "how much extra work do we do to achieve
> > this and does it affect anything else"?
>
> That isn’t the whole question. It’s one physical packet, one
> logical packet, or multiple frames?
> It makes more sense to me to support multiple frames rather than just
> one logical packet. However,
> I don’t see a good reason to change from the current code.
>
> >>> If this is not a MUST behavior in RFC, would my fix be good? I agree
> >>> that this would affect only ICMP/UDP traffic.
> >>
> >> People have been asking for queuing of multiple packets for years.
> >> That is a more general change. Consider another dumb application
> >> that starts out by sending multiple UDP packets back-to-back.
> >> However, well-designed application protocols don’t experience
> >> problems like this. I’ll quickly note that ping isn’t an
> >> application, but a network measuring tool. If you ask the question
> >> “what happens if I start off a session with a single large packet
> >> and I don’t support retransmission”, ping answers that question
> >> correctly.
> >>
> >> If badly-designed protocols get bad performance, that doesn’t seem
> >> like a bug to me, but a feature.
> >>
> >>> On 8/17/17, 2:40 PM, "Bjoern A. Zeeb" wrote:
> >>>
> >>> On 17 Aug 2017, at 21:16, Gopakumar Pillai wrote:
> >>>
> >>> > Hi FreeBSD Networking Gurus,
> >>> > I came across an issue with an old version of FreeBSD and
> >>> > looking at the latest FreeBSD code, seems it exists even now.
> >>> > I am assuming that this issue is not reported.
> >>> >
> >>> > Observation:
> >>> > When a ping was performed with larger payload than MTU, the
> >>> > first ping failed when the ARP entry was absent for that IP.
> >>>
> >>> That is because ping/ICMP has no retransmit.
> >>>
> >>> > Noticed on the wire that the last IP fragment was sent for the
> >>> > first request and then the subsequent requests were fine.
> >>> >
> >>> > Root Cause:
> >>> > * ip_output fragments the packets and loops through the
> >>> >   fragments to send them to ether_output.
> >>> > * ether_output does an arpresolve and if there is no existing
> >>> >   ARP entry it'll return EWOULDBLOCK after sending ARP Request.
> >>> > * ether_output ignores the error and propagates success to
> >>> >   ip_output and it continues to send the remaining fragments.
> >>> > * llentry keeps only one mbuf and the last fragment is retained
> >>> >   when the ARP Reply comes and the fragment is sent.
> >>>
> >>> Yes, according to the spec (RFC) we are supposed to throw the
> >>> packet away entirely and simply report that to the next upper
> >>> layer. However over the years people realised that this sucks
> >>> for a TCP SYN packet with a retransmit timer and hence we store
> >>> one of them.
> >>>
> >>> A large UDP packet would btw see the same behaviour as your ping.
> >>> There’s no guarantee any of these packets will not be dropped
> >>> anywhere on the network, so we can as well.
> >>>
> >>> Just my 2ct
> >>>
> >>> /bz
> >>
> >> Mike
> >> _______________________________________________
> >> freebsd-net@freebsd.org mailing list
> >> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.freebsd.org_mailman_listinfo_freebsd-2Dnet&d=DwIFaQ&c=uilaK90D4TOVoH58JNXRgQ&r=SPMIiiJNfXk7ujuip5qobK77LnnVM8kVNC-LzM_0RWk&m=gVqPCwvWs-eO0Y8jGefr8abxlnmG_GklVISDsn3solU&s=_748SiGYexZf7oZMSG2ZVDkzcelyZECM0lFMpbojDWA&e=
> >> To unsubscribe, send any mail to
> >> "freebsd-net-unsubscribe@freebsd.org"
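
The behaviour discussed in this thread (only the last fragment going out once ARP resolves, and the effect of raising maxhold) can be illustrated with a small model. This is a rough sketch in Python, not the kernel code: the function name hold_fragments is made up for illustration, and the drop-oldest policy is an assumption chosen to match the observation reported above that the last fragment is the one retained.

```python
from collections import deque


def hold_fragments(num_fragments, maxhold):
    """Model a per-entry hold queue while the ARP entry is unresolved.

    Fragments are numbered 1..num_fragments in the order they are sent.
    Returns the fragments still held when the ARP reply arrives.
    Assumption: when the queue is full, the oldest held packet is dropped.
    """
    if maxhold < 1:
        raise ValueError("maxhold must be at least 1")
    held = deque()
    for frag in range(1, num_fragments + 1):
        if len(held) >= maxhold:
            held.popleft()  # assumed policy: discard the oldest held packet
        held.append(frag)
    return list(held)


# Hypothetical 3-fragment ping sent while no ARP entry exists.
print(hold_fragments(3, maxhold=1))  # only the last fragment survives
print(hold_fragments(3, maxhold=3))  # the whole datagram survives
```

With a limit of 1 the model keeps only the final fragment, which matches what was seen on the wire; with a limit at least as large as the fragment count, every fragment of the datagram is still queued when the reply arrives. It says nothing about several datagrams sharing the queue, which is the case where a per-packet limit can still split a logical packet.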