From: Luigi Rizzo
To: Lev Serebryakov
Cc: freebsd-arch@freebsd.org
Date: Sat, 20 Aug 2011 23:55:03 +0200
Subject: Re: 10gbps scalability (was: Re: FreeBSD problems and preliminary ways to solve)
Message-ID: <20110820215503.GA45984@onelab2.iet.unipi.it>
In-Reply-To: <1361908410.20110821011005@serebryakov.spb.ru>

On Sun, Aug 21, 2011 at 01:10:05AM +0400, Lev Serebryakov wrote:
> Hello, Luigi.
> You wrote on 20 August 2011, 17:45:30:
>
> > - the Click modular router now runs (in userspace) at up to 4Mpps
> >   per core, which is faster than in-kernel linux;
> >   A userspace version of ipfw should be available in a short time,
> >   and i have some work in progress to bring the forwarding tables
> >   in userspace (but of course you can do the same with Click).
> >   I also see people start using it, which is a good thing because
> >   i am getting useful feedback on features and bugs and patches
> >   for more device drivers.
> [SKIPPED]
> > On the general issue of improving performance of the network stack,
> > I feel that to achieve significant speed improvements we should
> > really reconsider the way things are done in the network stack.
> > And that comes before support for special HW features.
> Could you please explain (I don't mean that you are wrong, I really
> don't understand) how netmap and other user-level processing could
> help with ROUTING (with firewalling, different routes, etc.) and
> software switching? I understand very well why this helps user-level

i am working on the following now:

- routing daemons and the like still work as usual, adding and
  modifying routes with the standard mechanisms (routing sockets etc.)
- the kernel updates its own forwarding tables (FIB) as usual

But:

- a netmap client (userspace) listens for FIB updates on a routing
  socket, and builds its own copy of the FIB in userspace (call it uFIB)
- the same process sets interfaces in netmap mode, and uses the uFIB
  to do forwarding, injecting back into the kernel those packets that
  have a local destination.

> applications, which need to process huge PPS rates. Less memcpy, fewer
> allocations, fewer context switches (and TLB/cache flushes) -- all
> these things are very clear to me. But why is user-level software
> switching faster than the in-kernel one, which should work without
> memory context switches AT ALL?!

essentially, the driver in netmap mode is way more efficient and
this offsets the cost of the few syscalls.
As an example, currently with netmap one core can forward packets
between interfaces at a rate between 3 and 10 Mpps depending on the
amount of processing on the packet, and there are significant
optimizations that are still possible, especially at the lower
speeds (if 3 Mpps can be called low).

> Or is netmap used for prototyping code, which will be moved into
> the kernel later?

Nothing prevents kernel subsystems, of course, from using the
interface directly in netmap mode.
But i think that now that we have the option, it makes sense
to spend some time experimenting with newer solutions (FIB data
structures, firewalls, memory alignment, possibly even TCP buffer
management) in userspace and then move stuff back into the kernel
once we have a good solution.

i am using it for prototyping and testing subsystems in userspace;
whether it makes sense to move them depends on the performance
we manage to get.

cheers
luigi
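
The uFIB scheme described in the reply above can be illustrated with a small sketch. This is not code from netmap or from the work in progress mentioned in the message; it only shows the general idea of keeping a userspace copy of the forwarding table and doing a longest-prefix match on it. Everything here is an assumption made for illustration: the names UFib, update, lookup, forward and the LOCAL marker are hypothetical, the "add" and "delete" strings merely stand in for the RTM_ADD and RTM_DELETE messages a real client would read from a routing socket, the sketch handles IPv4 only, and no netmap calls are made.

```python
import ipaddress

LOCAL = "local"  # hypothetical marker for routes that terminate on this host


class UFib:
    """Userspace copy of the kernel forwarding table (IPv4 only)."""

    def __init__(self):
        # One hash table per prefix length (0..32), keyed by network address.
        self.tables = [{} for _ in range(33)]

    def update(self, op, prefix, next_hop=None):
        """Apply one route change, as a routing-socket listener would."""
        net = ipaddress.IPv4Network(prefix)
        key = int(net.network_address)
        if op == "add":        # stands in for an RTM_ADD message
            self.tables[net.prefixlen][key] = next_hop
        elif op == "delete":   # stands in for an RTM_DELETE message
            self.tables[net.prefixlen].pop(key, None)
        else:
            raise ValueError("unknown operation: " + op)

    def lookup(self, dst):
        """Longest-prefix match: try /32 first, then shorter prefixes."""
        addr = int(ipaddress.IPv4Address(dst))
        for plen in range(32, -1, -1):
            mask = (0xFFFFFFFF << (32 - plen)) & 0xFFFFFFFF
            table = self.tables[plen]
            if (addr & mask) in table:
                return table[addr & mask]
        return None

    def forward(self, dst):
        """Decide what to do with a packet addressed to dst."""
        hop = self.lookup(dst)
        if hop is None:
            return ("drop", None)
        if hop == LOCAL:
            return ("host", None)   # would be injected back into the kernel
        return ("tx", hop)          # would be sent out of the named interface


fib = UFib()
for op, prefix, hop in [
    ("add", "0.0.0.0/0", "em0"),
    ("add", "10.0.0.0/8", "em1"),
    ("add", "10.1.0.0/16", "em2"),
    ("add", "192.0.2.1/32", LOCAL),
]:
    fib.update(op, prefix, hop)

print(fib.forward("10.1.2.3"))    # ('tx', 'em2')
print(fib.forward("192.0.2.1"))   # ('host', None)
fib.update("delete", "10.1.0.0/16")
print(fib.forward("10.1.2.3"))    # ('tx', 'em1')
```

The per-prefix-length hash tables are chosen only because they are short to write down. A forwarder aiming at the packet rates quoted in the message would presumably use a more compact lookup structure, which is exactly the kind of FIB data structure experiment the message proposes doing in userspace.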