From owner-freebsd-net@FreeBSD.ORG  Thu May 29 12:39:26 2014
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id A3D04597
 for <freebsd-net@freebsd.org>; Thu, 29 May 2014 12:39:26 +0000 (UTC)
Received: from mailgw12.technion.ac.il (mailgw12.technion.ac.il
 [132.68.225.12]) by mx1.freebsd.org (Postfix) with ESMTP id 1BC4E2522
 for <freebsd-net@freebsd.org>; Thu, 29 May 2014 12:39:25 +0000 (UTC)
Received: from fermat.math.technion.ac.il ([132.68.115.6])
 by mailgw12.technion.ac.il with ESMTP; 29 May 2014 15:38:14 +0300
Received: by fermat.math.technion.ac.il (Postfix, from userid 4298)
 id 8591983E42; Thu, 29 May 2014 15:33:06 +0300 (IDT)
Date: Thu, 29 May 2014 15:33:06 +0300
From: Nadav Har'El <nyh@math.technion.ac.il>
To: freebsd-net@freebsd.org
Subject: Route caching
Message-ID: <20140529123306.GA16644@fermat.math.technion.ac.il>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Hebrew-Date: 29 Iyyar 5774
User-Agent: Mutt/1.5.20 (2009-12-10)
Cc: osv-dev@googlegroups.com
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 29 May 2014 12:39:26 -0000

Hi,

I'm working on the OSv project (http://osv.io/), a new BSD-licensed
operating system for virtual machines. OSv's networking code is based
on that of FreeBSD.

I recently noticed an inefficiency that I believe also exists in
FreeBSD's networking code. I was wondering why it was introduced,
and whether FreeBSD could be improved in the same way by fixing it.

My issue is that, for example, when running a UDP server answering
hundreds of thousands of requests per second, I get the same number of
calls to the routing table lookup function (rtalloc_ign_fib(), etc.).
These calls are relatively slow: each involves several lock and unlock
operations (a rwlock for the radix tree, and a mutex for the individual
route). These are costly enough in the uncontended case, and get even
worse when several CPUs start to access the network heavily: we then see
context switches hurting the performance of the server even further.

Looking at FreeBSD's udp_output(), I see it does the following:

   error = ip_output(m, inp->inp_options, NULL, ipflags,
                     inp->inp_moptions, inp);

Note how NULL is passed as the third parameter. This tells ip_output
that it can't cache the previously found route, and needs to look for
it again and again on every packet output - even in the common case
where a socket will only ever send packets on one interface.

It seems that this change was made around FreeBSD 5.4. In the original
UCB code (4.4Lite), I see this:

   error = ip_output(m, inp->inp_options, &inp->inp_route,
                     inp->inp_socket->so_options &
                         (SO_DONTROUTE | SO_BROADCAST),
                     inp->inp_moptions);

So the last-found route was cached in inp->inp_route, and possibly
reused on the next packet to be sent.
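
To make the difference concrete, here is a minimal, single-threaded
sketch in plain C. It is not FreeBSD or OSv code: every name in it
(toy_route, toy_pcb, toy_rtlookup(), toy_output(), toy_route_change())
is hypothetical, and the generation counter used to invalidate stale
cache entries is just an assumption about how one might do it. A real
implementation would also need locking and reference counting, which
the sketch ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy types; none of these exist in FreeBSD or OSv. */
struct toy_route {
    uint32_t dst;      /* destination address */
    int      ifindex;  /* outgoing interface index */
};

#define TOY_NROUTES 2
static struct toy_route toy_table[TOY_NROUTES] = {
    { 0x0a000001u, 1 },
    { 0x0a000002u, 2 },
};
static unsigned long toy_lookups;        /* number of slow-path lookups */
static unsigned      toy_table_gen = 1;  /* bumped on every table change */

/* Slow path: stands in for rtalloc_ign_fib() and its locking. */
static struct toy_route *toy_rtlookup(uint32_t dst)
{
    toy_lookups++;
    for (size_t i = 0; i < TOY_NROUTES; i++)
        if (toy_table[i].dst == dst)
            return &toy_table[i];
    return NULL;
}

/* Changing a route invalidates every cached entry (assumed scheme). */
static void toy_route_change(size_t i, int ifindex)
{
    assert(i < TOY_NROUTES);
    toy_table[i].ifindex = ifindex;
    toy_table_gen++;
}

/* Per-socket state, playing the role of inp->inp_route. */
struct toy_pcb {
    struct toy_route *cached;      /* last route found, or NULL */
    unsigned          cached_gen;  /* table generation when cached */
};

/*
 * Returns the outgoing interface index, or -1 if there is no route.
 * Passing pcb == NULL mimics passing NULL as the route argument:
 * nothing can be cached, so every call takes the slow path.
 */
static int toy_output(struct toy_pcb *pcb, uint32_t dst)
{
    struct toy_route *rt;

    if (pcb != NULL && pcb->cached != NULL &&
        pcb->cached_gen == toy_table_gen && pcb->cached->dst == dst) {
        rt = pcb->cached;              /* fast path: reuse cached route */
    } else {
        rt = toy_rtlookup(dst);        /* slow path: full lookup */
        if (pcb != NULL) {
            pcb->cached = rt;
            pcb->cached_gen = toy_table_gen;
        }
    }
    return rt != NULL ? rt->ifindex : -1;
}
```

With a zero-initialized toy_pcb, calling toy_output(&pcb, dst) 1000
times for the same destination does a single toy_rtlookup(), while
1000 calls to toy_output(NULL, dst) do 1000 of them. After a
toy_route_change(), the next call with the pcb does one fresh lookup.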

Does anyone have any idea why inp->inp_route was removed in FreeBSD?
Doesn't this also hurt FreeBSD's network performance?

Thanks,
Nadav.


-- 
Nadav Har'El                        |     Thursday, May 29 2014, 29 Iyyar 5774
nyh@math.technion.ac.il             |-----------------------------------------
Phone +972-523-790466, ICQ 13349191 |If glory comes after death, I'm not in a
http://nadav.harel.org.il           |hurry. (Latin proverb)