From owner-freebsd-net@FreeBSD.ORG Tue Aug 16 09:56:16 2011
From: Vlad Galu <dudu@dudu.ro>
To: Vlad Galu
Cc: Takuya ASADA, net@freebsd.org
Date: Tue, 16 Aug 2011 11:56:10 +0200
Subject: Re: Multiqueue support for bpf
Message-Id: <0BB87D28-3094-422D-8262-5FA0E40BFC7C@dudu.ro>
List-Id: Networking and TCP/IP with FreeBSD

On Aug 16, 2011, at 11:50 AM, Vlad Galu wrote:

> On Aug 16, 2011, at 11:13 AM, Takuya ASADA wrote:
>> Hi all,
>>
>> I implemented multiqueue support for bpf; I'd like to present it for review.
>> This is a Google Summer of Code project. The project goal is to
>> support multiqueue network interfaces in BPF and to provide interfaces
>> for multithreaded packet processing using BPF.
>> Modern high-performance NICs have multiple receive/send queues and RSS,
>> which allows packets to be processed concurrently on multiple
>> processors.
>> The main purpose of the project is to support this hardware and to get
>> the benefit of that parallelism.
>>
>> This provides the following new APIs:
>> - queue filters for each bpf descriptor (bpf ioctl)
>>   - BIOCENAQMASK    Enable the multiqueue filter on the descriptor
>>   - BIOCDISQMASK    Disable the multiqueue filter on the descriptor
>>   - BIOCSTRXQMASK   Set the mask bit for the specified RX queue
>>   - BIOCCRRXQMASK   Clear the mask bit for the specified RX queue
>>   - BIOCGTRXQMASK   Get the mask bit for the specified RX queue
>>   - BIOCSTTXQMASK   Set the mask bit for the specified TX queue
>>   - BIOCCRTXQMASK   Clear the mask bit for the specified TX queue
>>   - BIOCGTTXQMASK   Get the mask bit for the specified TX queue
>>   - BIOCSTOTHERMASK Set the mask bit for packets not tied to any queue
>>   - BIOCCROTHERMASK Clear the mask bit for packets not tied to any queue
>>   - BIOCGTOTHERMASK Get the mask bit for packets not tied to any queue
>>
>> - a generic interface for getting hardware queue information from the NIC
>>   driver (socket ioctl)
>>   - SIOCGIFQLEN        Get the interface RX/TX queue length
>>   - SIOCGIFRXQAFFINITY Get the interface RX queue affinity
>>   - SIOCGIFTXQAFFINITY Get the interface TX queue affinity
>>
>> The patch for -CURRENT is here; right now it only supports igb(4),
>> ixgbe(4) and mxge(4):
>> http://www.dokukino.com/mq_bpf_20110813.diff
>>
>> And below is a performance benchmark:
>>
>> ====
>> I implemented benchmark programs based on
>> bpfnull (//depot/projects/zcopybpf/utils/bpfnull/).
>>
>> test_sqbpf measures bpf throughput on one thread, without using the multiqueue APIs.
>> http://p4db.freebsd.org/fileViewer.cgi?FSPC=//depot/projects/soc2011/mq_bpf/src/tools/regression/bpf/mq_bpf/test_sqbpf/test_sqbpf.c
>>
>> test_mqbpf is a multithreaded version of test_sqbpf, using the multiqueue APIs.
>> http://p4db.freebsd.org/fileViewer.cgi?FSPC=//depot/projects/soc2011/mq_bpf/src/tools/regression/bpf/mq_bpf/test_mqbpf/test_mqbpf.c
>>
>> I benchmarked under six conditions:
>> - benchmark1 only reads bpf, doesn't write packets anywhere
>> - benchmark2 writes packets to memory (mfs)
>> - benchmark3 writes packets to hdd (zfs)
>> - benchmark4 only reads bpf, doesn't write packets anywhere, with zerocopy
>> - benchmark5 writes packets to memory (mfs), with zerocopy
>> - benchmark6 writes packets to hdd (zfs), with zerocopy
>>
>> From the benchmark results, I can say that performance increases when using
>> mq_bpf on 10GbE, but not on GbE.
>>
>> * Throughput benchmark
>> - Test environment
>>   - FreeBSD node
>>     CPU: Core i7 X980 (12 threads)
>>     MB: ASUS P6X58D Premium (Intel X58)
>>     NIC1: Intel Gigabit ET Dual Port Server Adapter (82576)
>>     NIC2: Intel Ethernet X520-DA2 Server Adapter (82599)
>>   - Linux node
>>     CPU: Core 2 Quad (4 threads)
>>     MB: GIGABYTE GA-G33-DS3R (Intel G33)
>>     NIC1: Intel Gigabit ET Dual Port Server Adapter (82576)
>>     NIC2: Intel Ethernet X520-DA2 Server Adapter (82599)
>>
>> iperf was used to generate network traffic, with the following options:
>> - Linux node: iperf -c [IP] -i 10 -t 100000 -P12
>> - FreeBSD node: iperf -s
>> # 12 threads, TCP
>>
>> The following sysctl parameter was changed:
>> sysctl -w net.bpf.maxbufsize=1048576
>
>
> Thank you for your work! You may want to increase that (4x/8x) and rerun the test, though.

More, actually. Your current buffer is easily filled.
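For reference, scaling the buffer used above by the suggested 4x/8x factors would amount to the following commands (the exact values are only an illustration of that suggestion, not something tested in this thread):

  sysctl -w net.bpf.maxbufsize=4194304   # 4x the original 1048576
  sysctl -w net.bpf.maxbufsize=8388608   # 8x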
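To make the intended usage a bit more concrete, here is a rough C sketch of a per-RX-queue reader built on the standard bpf(4) interface plus the queue-mask ioctls proposed above. The argument types assumed for BIOCENAQMASK and BIOCSTRXQMASK are guesses; the authoritative definitions are in the patch, so treat this as an illustration only. A multithreaded reader like test_mqbpf would presumably query the queue count via SIOCGIFQLEN on a socket and spawn one such reader thread per queue, pinned to the CPU reported by SIOCGIFRXQAFFINITY.

/*
 * Sketch: read packets from a single hardware RX queue through bpf(4).
 * BIOCENAQMASK/BIOCSTRXQMASK come from the multiqueue patch; their
 * argument types here are assumptions, not the patch's actual ABI.
 */
#include <sys/types.h>
#include <sys/ioctl.h>
#include <net/if.h>
#include <net/bpf.h>
#include <err.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static void
read_rx_queue(const char *ifname, unsigned int qidx)
{
	struct ifreq ifr;
	unsigned int buflen;
	char *buf;
	ssize_t n;
	int fd;

	if ((fd = open("/dev/bpf", O_RDONLY)) == -1)
		err(1, "open(/dev/bpf)");

	memset(&ifr, 0, sizeof(ifr));
	strlcpy(ifr.ifr_name, ifname, sizeof(ifr.ifr_name));
	if (ioctl(fd, BIOCSETIF, &ifr) == -1)		/* attach to the interface */
		err(1, "BIOCSETIF");
	if (ioctl(fd, BIOCGBLEN, &buflen) == -1)	/* kernel buffer size */
		err(1, "BIOCGBLEN");

	/* Enable queue filtering and accept packets from one RX queue only. */
	if (ioctl(fd, BIOCENAQMASK, NULL) == -1)	/* argument, if any, per the patch */
		err(1, "BIOCENAQMASK");
	if (ioctl(fd, BIOCSTRXQMASK, &qidx) == -1)	/* queue index type assumed */
		err(1, "BIOCSTRXQMASK");

	if ((buf = malloc(buflen)) == NULL)
		err(1, "malloc");
	while ((n = read(fd, buf, buflen)) > 0) {
		/* walk the bpf_hdr records in buf[0..n) here */
	}
	free(buf);
	close(fd);
}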