Date: Mon, 8 Aug 2011 21:31:27 -0400
From: George Neville-Neil <gnn@freebsd.org>
To: Takuya ASADA <syuu@dokukino.com>
Cc: "Robert N. M. Watson" <rwatson@freebsd.org>, soc-status@freebsd.org,
    Kazuya Goda <gockzy@gmail.com>
Subject: Re: [mq_bpf] status report #9
Message-ID: <7FB7BCF6-5224-420D-85FA-3B82F1407E93@freebsd.org>
In-Reply-To: <CALG4x-UdHdg6NYgvrD986_kPeyYLR3KmJ8ijOLr+kQ-8_SaByA@mail.gmail.com>
References: <CALG4x-UdHdg6NYgvrD986_kPeyYLR3KmJ8ijOLr+kQ-8_SaByA@mail.gmail.com>
On Jul 27, 2011, at 19:11, Takuya ASADA wrote:

> * Project summary
> The goal of the project is to support multiqueue network interfaces
> in BPF and to provide interfaces for multithreaded packet processing
> using BPF. Modern high-performance NICs have multiple receive/send
> queues and an RSS feature, which allows packets to be processed
> concurrently on multiple processors. The main purpose of the project
> is to support such hardware and benefit from that parallelism.
>
> Here's the status update from last week:
>
> * Throughput benchmark
>
> - Test environment
>   CPU: Core i7 X980
>   MB:  ASUS P6X58D Premium (Intel X58)
>   NIC: Intel Gigabit ET Dual Port Server Adapter (82576)
>
> - Benchmark programs
>   test_sqbpf is a single-threaded BPF benchmark which uses only the
>   existing bpf ioctls. It fetches all packets from a NIC and writes
>   them to a file.
>
>   test_mqbpf is a multithreaded BPF benchmark which uses the new
>   multiqueue bpf ioctls. Each thread fetches packets only from its
>   pinned queue and writes them to a separate per-thread file.
>
> - Test conditions
>   iperf was used to generate network traffic, with the following options:
>     test node:  iperf -s -i1
>     other node: iperf -c [IP] -i1 -t 100000 -P8
>     # 8 threads, TCP
>
>   The following four kernels were tested for comparison:
>     current: GENERIC kernel on current, BPFIF_LOCK: mtx,    BPFQ_LOCK: doesn't exist
>     mq_bpf1: RSS kernel on mq_bpf,      BPFIF_LOCK: mtx,    BPFQ_LOCK: mtx
>     mq_bpf2: RSS kernel on mq_bpf,      BPFIF_LOCK: mtx,    BPFQ_LOCK: rmlock
>     mq_bpf3: RSS kernel on mq_bpf,      BPFIF_LOCK: rmlock, BPFQ_LOCK: rmlock
>
> - Benchmark results (MB/s)
>   Each figure is the average of 20 runs of test_sqbpf / test_mqbpf.
>
>               test_sqbpf    test_mqbpf
>   current     26.65568315   -
>   mq_bpf1     24.96387975   36.608574
>   mq_bpf2     27.13427415   41.76666665
>   mq_bpf3     27.0958332    51.48198915

This looks good, and it looks as if the performance scales linearly.
Were the test programs cpuset to each core? Is the test code in the
p4 tree yet?

Best,
George
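
For context, a minimal sketch of the kind of single-queue read loop the
report describes for test_sqbpf, using only long-standing bpf(4) ioctls.
The interface name "em0", the output path, and the abbreviated error
handling are illustrative, not taken from the actual benchmark source.

	/*
	 * Minimal single-queue bpf(4) reader in the style of test_sqbpf,
	 * using only ioctls that predate the mq_bpf work.
	 */
	#include <sys/types.h>
	#include <sys/ioctl.h>
	#include <sys/socket.h>
	#include <net/if.h>
	#include <net/bpf.h>

	#include <err.h>
	#include <fcntl.h>
	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <unistd.h>

	int
	main(void)
	{
		struct ifreq ifr;
		u_int blen, imm = 1;
		ssize_t n;
		char *buf;
		FILE *out;
		int bpf;

		if ((bpf = open("/dev/bpf", O_RDONLY)) == -1)
			err(1, "open(/dev/bpf)");

		memset(&ifr, 0, sizeof(ifr));
		strlcpy(ifr.ifr_name, "em0", sizeof(ifr.ifr_name));
		if (ioctl(bpf, BIOCSETIF, &ifr) == -1)		/* attach to the NIC */
			err(1, "BIOCSETIF");
		if (ioctl(bpf, BIOCIMMEDIATE, &imm) == -1)	/* don't wait for a full buffer */
			err(1, "BIOCIMMEDIATE");
		if (ioctl(bpf, BIOCGBLEN, &blen) == -1)		/* kernel buffer size */
			err(1, "BIOCGBLEN");

		if ((buf = malloc(blen)) == NULL)
			err(1, "malloc");
		if ((out = fopen("/tmp/sqbpf.out", "w")) == NULL)
			err(1, "fopen");

		/* Each read() returns a batch of BPF_WORDALIGNed packet records. */
		while ((n = read(bpf, buf, blen)) > 0) {
			char *p = buf;

			while (p < buf + n) {
				struct bpf_hdr *bh = (struct bpf_hdr *)p;

				fwrite(p + bh->bh_hdrlen, 1, bh->bh_caplen, out);
				p += BPF_WORDALIGN(bh->bh_hdrlen + bh->bh_caplen);
			}
		}
		return (0);
	}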
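
test_mqbpf adds per-queue fan-out on top of that loop, and George's
question is about pinning those threads. On FreeBSD the standard call
for pinning the calling thread is cpuset_setaffinity(2); the sketch
below shows a plausible per-queue worker skeleton. The BIOCSTRXQMASK
name mentioned in the comment stands in for the project's new
multiqueue ioctls, whose authoritative names live in the mq_bpf branch;
it is only referenced, not exercised, here.

	/*
	 * Per-queue worker skeleton in the style of test_mqbpf:
	 * one thread per core, each matched to one RSS queue.
	 */
	#include <sys/param.h>
	#include <sys/cpuset.h>

	#include <err.h>
	#include <pthread.h>

	static void
	pin_to_cpu(int cpu)
	{
		cpuset_t mask;

		CPU_ZERO(&mask);
		CPU_SET(cpu, &mask);
		/* id -1 means "the calling thread" */
		if (cpuset_setaffinity(CPU_LEVEL_WHICH, CPU_WHICH_TID, -1,
		    sizeof(mask), &mask) == -1)
			err(1, "cpuset_setaffinity");
	}

	static void *
	worker(void *arg)
	{
		int qid = *(int *)arg;

		pin_to_cpu(qid);

		/*
		 * Open a bpf descriptor as in the single-queue sketch, then
		 * restrict it to receive queue 'qid' with one of the new
		 * multiqueue ioctls (e.g. ioctl(bpf, BIOCSTRXQMASK, &qid))
		 * before entering the same read()/parse loop, writing to a
		 * per-thread output file.
		 */
		return (NULL);
	}

	int
	main(void)
	{
		pthread_t tid[8];
		int qids[8];

		for (int i = 0; i < 8; i++) {
			qids[i] = i;
			if (pthread_create(&tid[i], NULL, worker, &qids[i]) != 0)
				err(1, "pthread_create");
		}
		for (int i = 0; i < 8; i++)
			pthread_join(tid[i], NULL);
		return (0);
	}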
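
On the kernel side, the mq_bpf2/mq_bpf3 rows illustrate why replacing a
mutex with an rmlock(9) pays off here: the per-packet path only reads
BPF state, and a read-mostly lock lets every queue's CPU take the read
side concurrently. A generic sketch of the idiom follows; it is not the
actual bpf code, just the pattern the BPFIF_LOCK/BPFQ_LOCK changes use.

	/* Kernel code: compiles only as part of a kernel or module build. */
	#include <sys/param.h>
	#include <sys/lock.h>
	#include <sys/rmlock.h>

	static struct rmlock example_lock;

	static void
	example_init(void)
	{
		rm_init(&example_lock, "example");
	}

	/* Hot per-packet path: readers on different CPUs don't contend. */
	static void
	example_read_path(void)
	{
		struct rm_priotracker tracker;

		rm_rlock(&example_lock, &tracker);
		/* ... walk the descriptor list, deliver the packet ... */
		rm_runlock(&example_lock, &tracker);
	}

	/* Rare path (attach/detach): writers pay the full cost. */
	static void
	example_write_path(void)
	{
		rm_wlock(&example_lock);
		/* ... modify the list ... */
		rm_wunlock(&example_lock);
	}

That pattern matches the measured trend above: making the per-queue
lock an rmlock (mq_bpf2) helps, and making the interface lock an rmlock
as well (mq_bpf3) helps more.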