From: Andrew Gallatin
Date: Tue, 23 Jun 2009 09:42:44 -0400
To: Andre Oppermann
Cc: svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org
Subject: Re: svn commit: r194672 - in head/sys: kern netinet sys

Andre Oppermann wrote:
> Add soreceive_stream(), an optimized version of soreceive() for
> stream (TCP) sockets.
<....>
>
> Testers, especially with 10GigE gear, are welcome.

Awesome!

On my very weak, ancient consumer-grade Athlon64 test machine
(AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ (2050.16-MHz K8-class CPU))
using mxge and LRO, I see a roughly 700Mb/s increase in bandwidth,
from 7.7Gb/s to 8.4Gb/s. For what it's worth, this finally gives
FreeBSD performance parity with Linux on this hardware for 10GbE
single-stream receive.

TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to venice-my (192.168.1.15) port 0 AF_INET
Recv   Send   Send                          Utilization      Service Demand
Socket Socket Message  Elapsed              Send     Recv    Send    Recv
Size   Size   Size     Time     Throughput  local    remote  local   remote
bytes  bytes  bytes    secs.    10^6bits/s  % S      % C     us/KB   us/KB

before:
65536  65536  65536    60.01    7709.14     13.30    79.60   0.283   1.692
after:
65536  65536  65536    60.01    8403.86     14.66    81.63   0.286   1.592

This is consistent across runs.

Lockstat output for 10 seconds in the middle of a run is very
interesting and shows a huge reduction in lock contention.

Before:

Adaptive mutex spin: 369333 events in 10.017 seconds (36869 events/sec)

 Count indv cuml rcnt     nsec Lock                   Caller
-------------------------------------------------------------------------------
303685  82%  82% 0.00     1080 0xffffff000f2f98d0     recvit+0x21
 63847  17% 100% 0.00       25 0xffffff000f2f98d0     ip_input+0xad
  1788   0% 100% 0.00      172 0xffffff0001c57c08     intr_event_execute_handlers+0x100
     8   0% 100% 0.00      389 vm_page_queue_mtx      trap+0x4ce
     1   0% 100% 0.00       30 0xffffff8000251598     ithread_loop+0x8e
     1   0% 100% 0.00      720 0xffffff8000251598     uhub_read_port_status+0x2d
     1   0% 100% 0.00     1639 0xffffff000f477190     vm_fault+0x112
     1   0% 100% 0.00        1 0xffffff001fecce10     mxge_intr+0x425
     1   0% 100% 0.00     1332 0xffffff0001845600     clnt_reconnect_call+0x105
-------------------------------------------------------------------------------

Adaptive mutex block: 89 events in 10.017 seconds (9 events/sec)

 Count indv cuml rcnt     nsec Lock                   Caller
-------------------------------------------------------------------------------
    83  93%  93% 0.00    20908 0xffffff000f2f98d0     tcp_input+0xd96
     3   3%  97% 0.00    45234 0xffffff8000259f08     fork_exit+0x118
     3   3% 100% 0.00    44862 0xffffff8000251598     fork_exit+0x118
-------------------------------------------------------------------------------

After:

Adaptive mutex spin: 105102 events in 10.020 seconds (10490 events/sec)

 Count indv cuml rcnt     nsec Lock                   Caller
-------------------------------------------------------------------------------
 75886  72%  72% 0.00     2860 0xffffff0001fdde20     ip_input+0xad
 28418  27%  99% 0.00     1355 0xffffff0001fdde20     recvit+0x21
   779   1% 100% 0.00      171 0xffffff0001642808     intr_event_execute_handlers+0x100
     7   0% 100% 0.00      670 vm_page_queue_mtx      trap+0x4ce
     5   0% 100% 0.00       46 0xffffff001fecce10     mxge_intr+0x425
     1   0% 100% 0.00      105 vm_page_queue_mtx      trap_pfault+0x142
     1   0% 100% 0.00      568 0xffffff8000251598     usb_process+0xd8
     1   0% 100% 0.00      880 0xffffff8000251598     ithread_loop+0x8e
     1   0% 100% 0.00      233 0xffffff001a224578     vm_fault+0x112
     1   0% 100% 0.00       60 0xffffff001a1759b8     syscall+0x28f
     1   0% 100% 0.00      809 0xffffff0001846000     clnt_reconnect_call+0x105
     1   0% 100% 0.00     1139 0xffffff0001fdde20     kern_recvit+0x1d4
-------------------------------------------------------------------------------

Adaptive mutex block: 88 events in 10.020 seconds (9 events/sec)

 Count indv cuml rcnt     nsec Lock                   Caller
-------------------------------------------------------------------------------
    80  91%  91% 0.00    25891 0xffffff0001fdde20     tcp_input+0xd96
     3   3%  94% 0.00    45979 0xffffff8000259f08     fork_exit+0x118
     3   3%  98% 0.00    45886 0xffffff8000251598     fork_exit+0x118
     1   1%  99% 0.00    38254 0xffffff8000259f08     intr_event_execute_handlers+0x100
     1   1% 100% 0.00    79858 0xffffff001a1760f8     kern_wait+0x7ee
-------------------------------------------------------------------------------

Drew
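
For anyone who wants to sanity-check the arithmetic behind "roughly 700Mb/s" and the drop in spin events, here is a minimal sketch in Python. It only re-derives ratios from the figures quoted in this message; the helper name pct_change and the variable names are illustrative and do not come from netperf or lockstat.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Figures copied from the netperf and lockstat output in this message.
throughput_mbps = {"before": 7709.14, "after": 8403.86}   # 10^6 bits/s
spin_events = {"before": 369333, "after": 105102}         # adaptive mutex spin, total
recvit_spins = {"before": 303685, "after": 28418}         # spin events at recvit+0x21

# Absolute throughput gain in Mb/s.
gain_mbps = throughput_mbps["after"] - throughput_mbps["before"]

# Relative changes, in percent (negative means a reduction).
throughput_pct = pct_change(throughput_mbps["before"], throughput_mbps["after"])
spin_pct = pct_change(spin_events["before"], spin_events["after"])
recvit_pct = pct_change(recvit_spins["before"], recvit_spins["after"])

if __name__ == "__main__":
    print(f"throughput gain:    {gain_mbps:.2f} Mb/s ({throughput_pct:+.1f}%)")
    print(f"total spin events:  {spin_pct:+.1f}%")
    print(f"recvit spin events: {recvit_pct:+.1f}%")
```

The two lockstat samples cover slightly different windows (10.017 vs 10.020 seconds), so comparing raw event counts is only an approximation; the per-second rates printed by lockstat lead to essentially the same ratio.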