Date: Wed, 14 Aug 2013 17:40:28 +0200
From: Marko Zec <zec@fer.hr>
To: <freebsd-net@freebsd.org>
Cc: Lawrence Stewart <lstewart@freebsd.org>, Lev Serebryakov <lev@freebsd.org>, Luigi Rizzo <rizzo@iet.unipi.it>, "Alexander V. Chernikov" <melifaro@ipfw.ru>, FreeBSD Net <net@freebsd.org>
Subject: Re: route/arp lifetime (Re: it's the output, not ack coalescing (Re: TSO and FreeBSD vs Linux))
Message-ID: <201308141740.28779.zec@fer.hr>
In-Reply-To: <20130814124024.GA64548@onelab2.iet.unipi.it>
References: <520A6D07.5080106@freebsd.org> <520B74DD.1060102@ipfw.ru> <20130814124024.GA64548@onelab2.iet.unipi.it>

On Wednesday 14 August 2013 14:40:24 Luigi Rizzo wrote:
> On Wed, Aug 14, 2013 at 04:15:25PM +0400, Alexander V. Chernikov wrote:
> > On 14.08.2013 16:05, Luigi Rizzo wrote:
> > > On Wed, Aug 14, 2013 at 03:47:13PM +0400, Lev Serebryakov wrote:
> > >> Hello, Luigi.
> > >> You wrote on 14 August 2013, 14:21:09:
> > >>
> > >> LR> Then the problem remains that we should keep a copy of route and
> > >> LR> arp information in the socket instead of redoing the lookups on
> > >> LR> every single transmission, as they consume some 25% of the time
> > >> LR> of a sendto(), and probably even more when it comes to large tcp
> > >> LR> segments, sendfile() and the like.
> > >> And we should invalidate this info on ARP/route changes, or
> > >> connection will be lost in such cases, am I right?.. So, on each
> > >> such event code should look into all sockets and check, if
> > >> routing/ARP information is still valid for them. Or we should store
> > >> lists of sockets in routing and ARP tables... I don't know, what is
> > >> worse.
> > >
> > > I think we should start by acknowledging that routing and ARP
> > > information is inherently stale, and changes infrequently.
> > > So it is not a disaster if we have incorrect information for some
> > > short amount of time (milliseconds) because in the end the remote
> > > party that decides to change it and inform us may take much longer
> > > than that to distribute the update.
> >
> > You can save rte&arp, however doing this
> > gives you perfect chance to crash your kernel if egress interface is
> > destroyed (like vlan or ng or tun).
>
> I hope I learned not to follow a stale ifp pointer :)
> anyways ARP is really just the mac address so there is no
> dangling pointer issue.
>
> For the ifp associated to the route,
> i do not see a huge problem in marking the route/ifp as
> zombie and destroy it when the last reference goes away.

FWIW, apparently we already have that infrastructure in place: if_rele()
calls if_free_internal() only when the last reference to the ifnet is
dropped, so with a little care this should be usable for caching ifp
pointers without fear of the kernel crashes mentioned above.

Marko

> Not that the current way is any better -- you need to lock/unlock
> the rte while you do the lookup, and hold a refcount to the ifp
> until the packet is queued. So how does my suggestion make
> things worse ?
>
> cheers
> luigi
>
> > > Considering that each lookup takes between 100..300ns if you are
> > > lucky (not many misses, relatively empty table etc.), one could
> > > reasonably do the lookup at most once per millisecond or so (just
> > > reading 'ticks', no need for a nanotime() if you have a slow clock),
> > > or whenever we get an error related to the socket, either in the
> > > forward path (e.g. ifp points to an interface that is down) or in
> > > the reverse path (e.g. a dupack because we sent a packet to the
> > > wrong place).
> >
> > This sounds like "Hey, the kernel lookup is slow (which is true), let's
> > make a hack and don't bother lookups".
> > This approach gives us mtx-locked rte refcounts which are used
> > (misused) in many places making things worse and decreasing the ability
> > to fix the things up..
> >
> > > cheers
> > > luigi
> > > _______________________________________________
> > > freebsd-net@freebsd.org mailing list
> > > http://lists.freebsd.org/mailman/listinfo/freebsd-net
> > > To unsubscribe, send any mail to
> > > "freebsd-net-unsubscribe@freebsd.org"
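
The two ideas in this exchange, freeing the interface only when the last
reference is dropped and revalidating a per-socket cache at most once per
tick, can be sketched together in userspace C. This is only an illustration
under stated assumptions: every sk_* name is hypothetical, the structures
are stand-ins rather than the real struct ifnet or rtentry, the "routing
table" is a single global pointer, and locking and atomics are left out
entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ifnet; not the kernel structure. */
struct sk_ifnet {
    int  refcount;  /* references held by the table and by caches */
    bool zombie;    /* set once the interface has been detached */
};

/* Hypothetical per-socket cache of the last lookup result. */
struct sk_route_cache {
    struct sk_ifnet *ifp;        /* cached egress interface, or NULL */
    unsigned int     last_check; /* tick of the last full lookup */
};

#define SK_REVALIDATE_TICKS 1u   /* assumed: redo the lookup once per tick */

static struct sk_ifnet *sk_table_ifp; /* toy one-entry "routing table" */
static int sk_lookups;                /* number of slow lookups performed */
static int sk_freed;                  /* number of interfaces freed */

static struct sk_ifnet *sk_if_alloc(void)
{
    struct sk_ifnet *ifp = calloc(1, sizeof(*ifp));
    if (ifp != NULL)
        ifp->refcount = 1;       /* reference owned by the table */
    return ifp;
}

static void sk_if_ref(struct sk_ifnet *ifp)
{
    ifp->refcount++;
}

/* Drop one reference; the memory is freed only with the last one. */
static void sk_if_rele(struct sk_ifnet *ifp)
{
    assert(ifp->refcount > 0);
    if (--ifp->refcount == 0) {
        free(ifp);
        sk_freed++;
    }
}

/* Detach: unlink from the table, mark as zombie, drop the table's ref. */
static void sk_if_detach(struct sk_ifnet *ifp)
{
    if (sk_table_ifp == ifp)
        sk_table_ifp = NULL;
    ifp->zombie = true;
    sk_if_rele(ifp);
}

/* Stand-in for the expensive route/ARP lookup. */
static struct sk_ifnet *sk_slow_lookup(void)
{
    sk_lookups++;
    return sk_table_ifp;
}

/* Return the cached ifp, redoing the lookup when it is stale or zombie. */
static struct sk_ifnet *sk_cache_get(struct sk_route_cache *rc,
                                     unsigned int now)
{
    if (rc->ifp != NULL && !rc->ifp->zombie &&
        now - rc->last_check < SK_REVALIDATE_TICKS)
        return rc->ifp;          /* fast path: no lookup at all */

    if (rc->ifp != NULL) {       /* give up the old reference first */
        sk_if_rele(rc->ifp);
        rc->ifp = NULL;
    }
    struct sk_ifnet *ifp = sk_slow_lookup();
    if (ifp != NULL && !ifp->zombie) {
        sk_if_ref(ifp);          /* the cache now holds its own reference */
        rc->ifp = ifp;
    }
    rc->last_check = now;
    return rc->ifp;
}
```

In this sketch a detached interface stays allocated for as long as some
cache still references it; the next sk_cache_get() call notices the zombie
flag, releases the reference, and only then is the memory freed. A real
implementation would need atomic reference counts and a decision about
which errors force an early revalidation, which the sketch does not attempt.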