Message-ID: <54A7D9F9.5080206@yandex-team.ru>
Date: Sat, 03 Jan 2015 15:00:57 +0300
From: "Alexander V. Chernikov"
To: Luigi Rizzo
Cc: "freebsd-net@freebsd.org"
Subject: Re: host pipes and netmap 'emulated mode'
References: <54A6BAE0.9020404@yandex-team.ru> <20150102164559.GA68836@onelab2.iet.unipi.it>
In-Reply-To: <20150102164559.GA68836@onelab2.iet.unipi.it>
List-Id: Networking and TCP/IP with FreeBSD

On 02.01.2015 19:45, Luigi Rizzo wrote:
> On Fri, Jan 02, 2015 at 06:36:00PM +0300, Alexander V. Chernikov wrote:
>> Hello list.
>>
>> It looks like it is impossible to use host pipes and emulated netmap
>> mode in some cases.
>>
>> For example, if you're doing something like what a traditional router does:
>> packet processing, with kernel-visible logical interfaces and a routing
>> daemon running there, you can easily get a panic like this:
>>
>> #0 0xffffffff8094aa76 at kdb_backtrace+0x66
>> #1 0xffffffff809104ee at panic+0x1ce
>> #2 0xffffffff80cf9660 at trap_fatal+0x290
>> #3 0xffffffff80cf99c1 at trap_pfault+0x211
>> #4 0xffffffff80cf9f89 at trap+0x329
>> #5 0xffffffff80ce30d3 at calltrap+0x8
>> #6 0xffffffff809d3b5f at ether_demux+0x6f
>> #7 0xffffffff809d3f34 at ether_nh_input+0x204
>> #8 0xffffffff809dd6d8 at netisr_dispatch_src+0x218
>> #9 0xffffffff8061b2b5 at netmap_send_up+0x35
>> #10 0xffffffff8061b3d7 at netmap_txsync_to_host+0x97
>> #11 0xffffffff8061b400 at netmap_txsync_to_host_compat+0x10
>> #12 0xffffffff8061de8c at netmap_poll+0x2fc
>> #13 0xffffffff807f2313 at devfs_poll_f+0x63
>> #14 0xffffffff8095ea3d at sys_poll+0x35d
>> #15 0xffffffff80cf8e0a at amd64_syscall+0x5ea
>> #16 0xffffffff80ce33b7 at Xfast_syscall+0xf7
>> Uptime: 4m21s
>>
>> The problem here is the
following:
>> netmap changes if_input() for the logical network interface and always
>> assumes that generic_rx_handler() is called with a netmap-enabled ifp
>> (e.g. the original interface).
>> Unfortunately, there are cases where a different ifp is passed to the
>> if_input handler. This particular case is triggered by the
>> (*ifp->if_input)(ifv->ifv_ifp, m);
>> line, where "ifp" represents the netmap-enabled NIC, and ifv->ifv_ifp
>> represents the vlan subinterface.
>>
>> Then, generic_rx_handler() tries to look up the NA/GNA structure but fails
>> since the vlan subinterface is not netmap-enabled.
>> So, it looks like we need a way to call the original if_input() but I
>> can't imagine a (good) one.
> Surely we can put a check in generic_rx_handler() to make sure that
> NA(ifp) is NULL -- this is already a relatively expensive code
> path so the extra checks won't harm.
Yes, but it would be great if we could recover and call the original input
procedure instead of silently dropping the frame.
>
> But I am a bit unclear on how you trigger this error,
> can you give me more details ?
>
> The offending instruction (*ifp->if_input)(ifv->ifv_ifp, m)
> is in vlan_input(), so it looks like you are setting the parent
> interface in netmap mode, and (looking at the trace)
> sending packets to the host port.
Yes. Basically, this is the "netmap router case": we configure vlan
interfaces, bridge interfaces, etc. using kernel interfaces, and propagate
some (or all) of the configuration to the netmap application. It somehow
processes "fast path" traffic and sends control plane traffic to the host
via the host pipe to handle routing daemon updates, icmp, fragments, etc.
>
> So i suppose the error path is when netmap_send_up() calls
> the original input handler, NA(ifp)->if_input [i should rename
> the field to something else] which is vlan_input().
>
> I think a proper fix is to make vlan_input() netmap aware,
> and call NA(ifp)->if_input if the interface is in netmap mode.
Well, then we will have to make the bridge code netmap aware, and netgraph
and tunnels after that. This seems to be a pretty invasive way of doing
things.
>
> Otherwise, if vlan_input() is the only case where if_input() is called
> with a different ifp, _and_ the vlan (child) interface has a reference
> to the parent (ifp->parent, though i don't know where this is),
> we can tweak generic_rx_handler() so that it calls
> NA(ifp->parent)->if_input in case NA(ifp) is null.
We will have to add different code for all types of virtual interfaces,
which seems to be a better approach.
There is also an option like having a per-vnet ifindex-based array with vlan
(and other objects) tracking to get the ifp quickly.
What we really lack here is a set_if_input_func() method which can call some
eventhandler, so that it can be done in a more generic way (so, adding
glebius@ here).
>
> cheers
> luigi
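
For illustration, the parent-fallback idea quoted above (call
NA(ifp->parent)->if_input when NA(ifp) is NULL, and drop only as a last
resort) can be modelled in user space. This is only a sketch under stated
assumptions, not netmap or FreeBSD kernel code: every type and function name
below (ifnet_model, na_model, rx_handler_model, stack_input_model) is
hypothetical, and the "parent" pointer is assumed to exist, which the thread
itself leaves open.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the real struct ifnet / netmap_adapter. */
struct mbuf_model { int len; };
struct ifnet_model;
typedef void (*input_fn)(struct ifnet_model *, struct mbuf_model *);

struct na_model {
    input_fn saved_if_input;    /* original if_input saved on netmap enable */
};

struct ifnet_model {
    const char *name;
    struct ifnet_model *parent; /* assumed child -> parent link, NULL on a NIC */
    struct na_model *na;        /* NULL when the interface is not in netmap mode */
    int delivered;              /* frames handed to the original input path */
};

enum rx_result { RX_TO_NETMAP, RX_TO_ORIGINAL, RX_DROPPED };

/* Models the host stack's original input routine: just counts the frame. */
void
stack_input_model(struct ifnet_model *ifp, struct mbuf_model *m)
{
    (void)m;
    ifp->delivered++;
}

/* Models the replacement rx handler with the NULL check and parent fallback. */
enum rx_result
rx_handler_model(struct ifnet_model *ifp, struct mbuf_model *m)
{
    if (ifp->na != NULL)
        return RX_TO_NETMAP;    /* netmap-enabled ifp: frame goes to the ring */

    /* ifp is not netmap-enabled (e.g. a vlan child): try the parent's saved input. */
    if (ifp->parent != NULL && ifp->parent->na != NULL &&
        ifp->parent->na->saved_if_input != NULL) {
        ifp->parent->na->saved_if_input(ifp, m);
        return RX_TO_ORIGINAL;
    }

    return RX_DROPPED;          /* nothing to recover: drop instead of faulting */
}
```

In this model a frame arriving with the vlan child's ifp is passed to the
input routine saved for the parent, rather than dereferencing a missing
adapter. The objection raised in the reply still applies: each kind of
virtual interface would need its own way to find the parent.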