From owner-freebsd-net@FreeBSD.ORG Tue Jan 6 16:57:25 2015
Date: Tue, 6 Jan 2015 17:57:24 +0100
Subject: Re: netmap over virtio giving packets with extra 12 bytes
From: Vincenzo Maffione
To: Avinash Sridharan
Cc: Luigi Rizzo, "freebsd-net@freebsd.org"
List-Id: Networking and TCP/IP with FreeBSD

2015-01-06 17:31 GMT+01:00 Avinash Sridharan:
> Hi Vincenzo,
> Thanks for the explanation. From your explanation it seems that netmap
> in "native" mode over virtio-net should give some indication of how
> many extra bytes have been added by the virtio-net driver (or, for that
> matter, by any other driver that provides this type of rx descriptor).
> Otherwise, the application will have to store knowledge about the specifics
> of the underlying devices, which doesn't seem that clean. (I think Adrian was
> referring to the same issue.)

I understand the problem (this was left as an open problem), and it's not
clear to me what the best solution would be.

On one hand, one could modify the native virtio-net adapter so that it
discards the extra header. This can be done with a copy, as Luigi suggests,
or with some trick involving virtio scatter-gather support, e.g. trying to
make the virtio-net headers go into separate buffers parallel to the
"official" netmap buffers (i.e. the ones your application reads from).

On the other hand, the virtio-net header carries some information (mainly
TCP/UDP related) that you may not want to discard, even when using netmap,
and this depends on the specific application. So, to avoid making a
one-size-fits-all decision, I thought it was better to leave it to the
application.

>
> That said, how do we handle TX in this case? The underlying driver
> (netmap + virtio-net) expects an extra 12 bytes of header, so the
> application needs to know when to add it. Or is this optional?

Yes, you have to add it, otherwise it won't work! It's not optional.
To test virtio-net tx, I added a "-H" option to pkt-gen, which pushes
an empty virtio-net header before the Ethernet frame. You can use "-H 12"
for your tests.

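To make the TX requirement concrete, here is a minimal sketch (plain Python,
not netmap code) of what pushing an empty virtio-net header in front of an
Ethernet frame amounts to. The field layout is taken from struct
virtio_net_hdr in the virtio spec, assuming the 12-byte variant that ends
with num_buffers. The names VNET_HDR and add_empty_vnet_hdr and the sample
frame are made up for illustration.

```python
import struct

# Assumed layout (virtio spec, struct virtio_net_hdr with num_buffers):
#   flags, gso_type                                      -> u8 each
#   hdr_len, gso_size, csum_start, csum_offset, num_buffers -> u16 each
# Little-endian is assumed here; legacy devices use guest-native byte
# order, which makes no difference for an all-zero header.
VNET_HDR = struct.Struct("<BBHHHHH")  # 12 bytes in total


def add_empty_vnet_hdr(frame: bytes) -> bytes:
    """Prepend an all-zero header: no checksum offload, no GSO."""
    return VNET_HDR.pack(0, 0, 0, 0, 0, 0, 0) + frame


# Hypothetical 60-byte broadcast frame, used only to show the size change.
frame = bytes.fromhex("ffffffffffff" "020000000001" "0800") + bytes(46)
tx_buf = add_empty_vnet_hdr(frame)
print(len(frame), len(tx_buf))  # 60 72
```

The buffer handed to the TX ring is then 12 bytes longer than the frame
itself, which matches the "len 72" reported for a 60-byte frame on the RX
side.
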
Again, it's not clear (at least to me) how we should manage this
virtio-net peculiarity at the netmap API level.

Cheers,
Vincenzo

>
> On Tue, Jan 6, 2015 at 8:17 AM, Vincenzo Maffione wrote:
>>
>> Hello,
>>
>> From what I can guess, you are dealing with a QEMU-KVM guest that
>> uses virtio-net device(s) and runs netmap over them.
>> You then connect the guest to the host (Gentoo) network stack using a
>> standard Linux bridge: a TAP device is used by QEMU to forward guest
>> traffic from/to the host network stack.
>>
>> Is that correct?
>>
>> Following up on Luigi's explanation, the virtio-net header is part of
>> the virtio standard, and its purpose is to carry offloading info
>> (checksum, TSO) between the guest and host kernels. For instance, your
>> guest kernel can offload the TCP checksum to the virtio-net device,
>> which in turn uses the virtio-net header (this requires TAP driver
>> support) to defer the checksum to the host kernel. If packets then
>> arrive at a physical NIC that supports checksum offloading (e.g. an
>> r8169 NIC attached to the same bridge as the TAP),
>> you have effectively offloaded the checksum computation from the guest
>> kernel straight to the physical NIC in the physical host.
>>
>> If you see the virtio-net header with "pkt-gen -f rx", it means that
>> you are using netmap in "native" mode, that is, you use the specific
>> virtio netmap adapter to send/receive packets from the (virtual) NIC.
>> If you used netmap over virtio-net in "emulated" mode you wouldn't see
>> the virtio-net header, because netmap would be using the standard
>> (slow) driver datapath under the hood: in the rx datapath, the driver
>> converts the virtio-net header into skbuff/mbuf metadata, so you
>> don't see it.
>>
>> I don't remember having tried to make QEMU use a TAP without the
>> virtio-net-header extension, but I see that it is possible to disable
>> it when invoking qemu from the command line:
>>
>> $ x86_64-softmmu/qemu-system-x86_64 --help | grep tap
>>
>> -net tap[,vlan=n][,name=str][,fd=h][,fds=x:y:...:z][,ifname=name][,script=file][,downscript=dfile][,helper=helper][,sndbuf=nbytes][,vnet_hdr=on|off][,vhost=on|off][,vhostfd=h][,vhostfds=x:y:...:z][,vhostforce=on|off][,queues=n]
>>     use vnet_hdr=off to avoid enabling the IFF_VNET_HDR tap flag
>> -netdev [user|tap|bridge|vde|netmap|vhost-user|socket|hubport],id=str[,option][,option][,...]
>>
>> where you see that you can specify "vnet_hdr=off" when declaring the
>> qemu "backend" associated with the virtio-net guest device.
>> I have never tried it, but it should work. In the worst case you can
>> recompile the tap driver without the IFF_VNET_HDR extension, so that
>> QEMU does not find it.
>>
>>
>> Cheers,
>> Vincenzo
>>
>> 2015-01-05 13:19 GMT+01:00 Luigi Rizzo:
>> > What you see is a virtio issue.
>> >
>> > virtio prepends a 10- or 12-byte "virtio header"
>> > to all packets, which is used to define what sort
>> > of NIC accelerations (checksum, TSO, etc.) are
>> > expected on the link.
>> >
>> > I do not remember if there is a way in qemu-kvm to
>> > remove the header. Maybe Vincenzo (in Cc) remembers.
>> >
>> > cheers
>> > luigi
>> >
>> > On Mon, Jan 5, 2015 at 6:54 AM, Avinash Sridharan wrote:
>> >>
>> >> I am using netmap with the Click modular router, running the
>> >> router in user space. A while back I was using this combination
>> >> with the e1000 device driver, with a slightly older netmap code base.
>> >>
>> >> Recently I updated my netmap code base and am trying to use the
>> >> Click modular router with netmap over a virtio-net device driver
>> >> (over KVM).
>> >> With this combination, though I was able to receive packets, I was
>> >> unable to interpret any packets coming from the FromDevice element.
>> >>
>> >> To debug this issue (and to rule out any changes I made to the
>> >> Click modular router), I ran the pkt-gen application with the
>> >> "dump payload" option:
>> >>
>> >> sudo ~/pkt-gen -i eth1 -f rx -X
>> >>
>> >> This showed that packets are being received correctly from the
>> >> netmap-enabled interface, but there are an extra 12 bytes prepended
>> >> to the packet.
>> >>
>> >> 381.088570 main_thread [1446] 1 pps (1 pkts in 1001088 usec)
>> >> ring 0x7f133bca6000 cur 1 [buf 516 flags 0x0000 len 72]
>> >>  0: 00 00 00 00 00 00 00 00 00 00 01 00 01 80 c2 00 ................  << extra 12 bytes
>> >> 16: 00 00 40 16 7e 5b 50 f0 00 26 42 42 03 00 00 00 ..@.~[P..&BB....
>> >> 32: 00 00 80 00 40 16 7e 5b 50 f0 00 00 00 00 80 00 ....@.~[P.......
>> >> 48: 40 16 7e 5b 50 f0 80 01 00 00 14 00 02 00 00 00 @.~[P...........
>> >> 64: 00 00 00 00 bc 9b f6 74
>> >>
>> >> As we can see, the above is an STP BPDU, and there are 12 leading
>> >> bytes in the payload.
>> >>
>> >> The extra leading bytes screw up the packet interpretation.
>> >>
>> >> So is this an artifact of the virtio-net driver, or has something
>> >> changed in the netmap device driver?
>> >>
>> >> Thanks,
>> >>
>> >> Avinash
>> >
>> > --
>> > -----------------------------------------+-------------------------------
>> > Prof. Luigi RIZZO, rizzo@iet.unipi.it . Dip. di Ing. dell'Informazione
>> > http://www.iet.unipi.it/~luigi/ . Universita` di Pisa
>> > TEL +39-050-2211611 . via Diotisalvi 2
>> > Mobile +39-338-6809875 . 56122 PISA (Italy)
>> > -----------------------------------------+-------------------------------
>>
>> --
>> Vincenzo Maffione
>
--
Vincenzo Maffione
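
For reference, the dump quoted in the original report can be decoded by hand
to confirm the diagnosis. The sketch below (plain Python, not part of netmap
or pkt-gen) takes the first 32 bytes of that dump, splits off an assumed
12-byte virtio-net header, and reads the Ethernet fields that follow. The
header layout and the little-endian byte order are assumptions based on
struct virtio_net_hdr in the virtio spec; all variable names are
illustrative.

```python
import struct

# First 32 bytes of the pkt-gen dump above (offsets 0 and 16).
dump = bytes.fromhex(
    "00 00 00 00 00 00 00 00 00 00 01 00 01 80 c2 00 "
    "00 00 40 16 7e 5b 50 f0 00 26 42 42 03 00 00 00"
)

VNET_HDR_LEN = 12  # assumed: header variant that ends with num_buffers

hdr, frame = dump[:VNET_HDR_LEN], dump[VNET_HDR_LEN:]

# flags, gso_type: u8; the remaining five fields: little-endian u16.
(flags, gso_type, hdr_len, gso_size,
 csum_start, csum_offset, num_buffers) = struct.unpack("<BBHHHHH", hdr)

dst, src = frame[0:6], frame[6:12]
(length,) = struct.unpack("!H", frame[12:14])  # 802.3 length, big-endian


def mac(addr: bytes) -> str:
    """Format a 6-byte address as colon-separated hex."""
    return ":".join(f"{b:02x}" for b in addr)


print("num_buffers:", num_buffers)  # 1
print("dst:", mac(dst))             # 01:80:c2:00:00:00
print("src:", mac(src))             # 40:16:7e:5b:50:f0
print("802.3 length:", length)      # 38
```

Once the 12 bytes are skipped, the destination is 01:80:c2:00:00:00, the
bridge group address used by STP, and the LLC bytes 42 42 03 follow the
length field, which is consistent with the BPDU mentioned in the report.
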