From: Nadav Har'El
To: freebsd-net@freebsd.org
Cc: osv-dev@googlegroups.com
Date: Thu, 29 May 2014 15:33:06 +0300
Subject: Route caching
Message-ID: <20140529123306.GA16644@fermat.math.technion.ac.il>

Hi,

I'm working on the OSv project (http://osv.io/), a new BSD-licensed
operating system for virtual machines. OSv's networking code is based on
that of FreeBSD.

I recently noticed an inefficiency that I believe also exists in FreeBSD's
networking code. I was wondering why it was done this way, and whether
FreeBSD could be improved by fixing the same problem.

My issue is that, for example, when running a UDP server answering hundreds
of thousands of requests per second, I get the same number of calls to the
routing table lookup function (rtalloc_ign_fib(), etc.). These calls are
relatively slow: each involves several lock and unlock operations (a rwlock
for the radix tree, and a mutex for the individual route). These are
relatively slow even in the uncontended case, and get worse when several
CPUs start to access the network heavily: we then start to see context
switches hurting the performance of the server even further.

Looking at FreeBSD's udp_output(), I see it does the following:

    error = ip_output(m, inp->inp_options, NULL, ipflags,
        inp->inp_moptions, inp)

Note how NULL is passed as the third parameter. This tells ip_output() that
it can't cache the previously found route, and needs to look it up again on
every packet output, even in the common case where a socket will only ever
send packets on one interface.

It seems that this change was made around FreeBSD 5.4. In the original UCB
code (4.4Lite), I see this:

    error = ip_output(m, inp->inp_options, &inp->inp_route,
        inp->inp_socket->so_options & (SO_DONTROUTE | SO_BROADCAST),
        inp->inp_moptions);

So the last-found route was cached in inp->inp_route, and possibly reused
for the next packet to be sent.

Does anyone have any idea why inp->inp_route was removed in FreeBSD?
Doesn't this also hurt FreeBSD's network performance?

Thanks,
Nadav.

-- 
Nadav Har'El                        | Thursday, May 29 2014, 29 Iyyar 5774
nyh@math.technion.ac.il             |-----------------------------------------
Phone +972-523-790466, ICQ 13349191 |If glory comes after death, I'm not in a
http://nadav.harel.org.il           |hurry. (Latin proverb)
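
The difference between the two ip_output() calls quoted above can be
sketched in userspace C. This is a toy model under stated assumptions, not
FreeBSD or OSv code: toy_route, toy_lookup(), toy_pcb, toy_output(),
send_burst() and the generation counter table_gen are all hypothetical
names made up for illustration. Passing NULL as the cache forces a full
lookup per packet; passing a per-socket cache reuses the last route as
long as the destination and the table generation are unchanged. The
generation check is one possible way to invalidate a stale cached route
and is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy route entry; NOT FreeBSD's struct rtentry. */
struct toy_route {
    uint32_t prefix;   /* network prefix, host byte order */
    uint32_t mask;     /* netmask, host byte order */
    int      ifindex;  /* outgoing interface number */
};

/* Toy table, ordered most-specific first so the first match wins. */
static const struct toy_route table[] = {
    { 0xc0a80000u, 0xffff0000u, 2 },  /* 192.168.0.0/16 -> if 2 */
    { 0x0a000000u, 0xff000000u, 1 },  /* 10.0.0.0/8     -> if 1 */
    { 0x00000000u, 0x00000000u, 3 },  /* default        -> if 3 */
};

static unsigned      table_gen = 1;  /* bump whenever the table changes */
static unsigned long slow_lookups;   /* number of full table lookups */

/* Stand-in for the expensive, lock-protected routing lookup. */
static const struct toy_route *toy_lookup(uint32_t dst)
{
    slow_lookups++;
    for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++)
        if ((dst & table[i].mask) == table[i].prefix)
            return &table[i];
    return NULL;
}

/* Hypothetical per-socket state holding the last route found. */
struct toy_pcb {
    const struct toy_route *cached;  /* last route, or NULL */
    uint32_t cached_dst;             /* destination it was found for */
    unsigned cached_gen;             /* table generation at that time */
};

/*
 * Returns the outgoing interface for dst, or -1 if there is no route.
 * pcb == NULL models passing NULL to ip_output(): look up every time.
 */
static int toy_output(struct toy_pcb *pcb, uint32_t dst)
{
    const struct toy_route *rt;

    if (pcb == NULL) {
        rt = toy_lookup(dst);
        return rt != NULL ? rt->ifindex : -1;
    }
    /* Redo the lookup only if the cache is empty, stale, or for another dst. */
    if (pcb->cached == NULL || pcb->cached_dst != dst ||
        pcb->cached_gen != table_gen) {
        pcb->cached = toy_lookup(dst);
        pcb->cached_dst = dst;
        pcb->cached_gen = table_gen;
    }
    return pcb->cached != NULL ? pcb->cached->ifindex : -1;
}

/* Sends n packets to dst; returns how many full lookups that cost. */
static unsigned long send_burst(struct toy_pcb *pcb, uint32_t dst, int n)
{
    unsigned long before = slow_lookups;

    assert(n >= 0);
    for (int i = 0; i < n; i++)
        (void)toy_output(pcb, dst);
    return slow_lookups - before;
}
```

With a zero-initialized struct toy_pcb, send_burst(&pcb, dst, n) costs one
full lookup for the whole burst, while send_burst(NULL, dst, n) costs n.
Incrementing table_gen forces exactly one fresh lookup on the next send.
The sketch ignores locking and route lifetime entirely, which are the hard
parts of caching a route in a real kernel.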