From owner-freebsd-net@FreeBSD.ORG Sun Jul 27 02:20:01 2014 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 97A05FF6 for ; Sun, 27 Jul 2014 02:20:01 +0000 (UTC) Received: from mail-qa0-x229.google.com (mail-qa0-x229.google.com [IPv6:2607:f8b0:400d:c00::229]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 56AAB2892 for ; Sun, 27 Jul 2014 02:20:01 +0000 (UTC) Received: by mail-qa0-f41.google.com with SMTP id j7so6365911qaq.28 for ; Sat, 26 Jul 2014 19:20:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=8JbSqMnO8Trhw5HwA6kfRPI9AwCq2m9EBqZkr6LBB7Y=; b=rN3fYlnnDt1j8aG5OvrWC8rHTrBnF5h3zOosr2t8CNW6FT4j6/IjQCviNSFRJlVG2u M6mztOoYqHigTkVgPY/gTBh8E0sVCUPt7YS+wFuRwIbqS0ppNh9FlMnYlPJe67gLH7qa iivG6KnsIxp76hLWFVm6HLUfCyFW7l6VOcrueA5jQ0jZ1oXq5wSeeyLjhZbbQC4gizAL KoIYvYeokN92Wm/XTD3IdM16fKMtVxVP5FNnRQItyzRZkOiMaQ3tud8GvUXKfWaIuHl8 X6ilwwGwpv18RpNaIKV/ouBMOIWz/knKsP3ZKbix7dzuQD44ikwbzgcVmyphFR3BzVyC 6Hpw== MIME-Version: 1.0 X-Received: by 10.224.71.198 with SMTP id i6mr44228358qaj.76.1406427600433; Sat, 26 Jul 2014 19:20:00 -0700 (PDT) Sender: adrian.chadd@gmail.com Received: by 10.224.1.6 with HTTP; Sat, 26 Jul 2014 19:20:00 -0700 (PDT) In-Reply-To: <53D4600A.1010505@gmail.com> References: <53CE80DD.9090109@gmail.com> <53D4600A.1010505@gmail.com> Date: Sat, 26 Jul 2014 19:20:00 -0700 X-Google-Sender-Auth: 8mHNfNI39Mk1Q8QfNwzsB2wNuZM Message-ID: Subject: Re: fastforward/routing: a 3 million packet-per-second system? From: Adrian Chadd To: John Jasen Content-Type: text/plain; charset=UTF-8 Cc: FreeBSD Net X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 Jul 2014 02:20:01 -0000 Flowtable is enabled? That's odd, it shouldn't be showing up like that. -a On 26 July 2014 19:12, John Jasen wrote: > flowtables enabled. This is a 30 second burst of lock profiling, > processed via: > head -2 debug.lock.pref.stats.out-20140726-1 ; cat > debug.lock.pref.stats.out-20140726-1 | sort -nk4 | tail > debug.lock.prof.stats: > > max wait_max total wait_total count avg wait_avg > cnt_hold cnt_lock name > 0 0 4758 31 22444 0 0 > 0 103 /usr/src/sys/kern/sched_ule.c:886 (spin mutex:sched lock 11) > 1 1 770 45 2833 0 0 > 0 115 /usr/src/sys/kern/sched_ule.c:886 (spin mutex:sched lock 6) > 3 16 50 54 30 1 1 > 0 8 /usr/src/sys/dev/usb/usb_device.c:2783 (sleep mutex:Giant) > 0 14 5 61 28 0 2 > 0 13 /usr/src/sys/dev/usb/usb_device.c:2758 (sleep mutex:Giant) > 1 1 3219 68 10452 0 0 > 0 161 /usr/src/sys/kern/sched_ule.c:886 (spin mutex:sched lock 7) > 12 4 446 100 154 2 0 > 0 46 /usr/src/sys/kern/kern_condvar.c:145 (sleep mutex:Giant) > 1 0 84164 976 405264 0 0 > 0 2681 /usr/src/sys/kern/subr_turnstile.c:552 (spin mutex:turnstile chain) > 4 3 12812430 262856 68940690 0 0 0 > 546014 /usr/src/sys/net/route.c:439 (sleep mutex:rtentry) > 3 4 4683374 1274675 68940690 0 0 0 > 3433615 /usr/src/sys/netinet/ip_fastfwd.c:593 (sleep mutex:rtentry) > 5 6 3513959 2106043 68940690 0 0 0 > 7509422 /usr/src/sys/netinet/in_rmx.c:114 (sleep mutex:rtentry) > > The high level overview of pmc output is also attached. > > Profile trace for function: __rw_rlock() [19.54%] > Profile trace for function: _rw_runlock_cookie() [9.25%] > Profile trace for function: __mtx_unlock_flags() [5.97%] > Profile trace for function: memcpy() [5.68%] > Profile trace for function: bcopy() [5.53%] > Profile trace for function: bcmp() [5.09%] > Profile trace for function: _mtx_lock_spin_cookie() [4.77%] > Profile trace for function: ip_fastforward() [4.27%] > Profile trace for function: t4_eth_tx() [3.37%] > Profile trace for function: rn_match() [2.92%] > Profile trace for function: ether_output() [2.79%] > Profile trace for function: rtalloc1_fib() [2.35%] > Profile trace for function: get_scatter_segment() [1.68%] > Profile trace for function: uma_zalloc_arg() [1.65%] > Profile trace for function: rtfree() [1.58%] > Profile trace for function: bzero() [1.54%] > Profile trace for function: ether_nh_input() [1.47%] > Profile trace for function: cxgbe_transmit() [1.39%] > Profile trace for function: service_iq() [1.26%] > Profile trace for function: rtalloc_ign_fib() [1.21%] > Profile trace for function: __mtx_lock_sleep() [1.12%] > Profile trace for function: arpresolve() [1.11%] > Profile trace for function: uma_zfree_arg() [1.02%] > Profile trace for function: reclaim_tx_descs() [0.92%] > Profile trace for function: _mtx_trylock_flags_() [0.77%] > Profile trace for function: bounce_bus_dmamap_load_buffer() [0.76%] > Profile trace for function: ether_demux() [0.72%] > Profile trace for function: mb_ctor_mbuf() [0.59%] > Profile trace for function: in_localip() [0.51%] > > As usual, to save everyone's mail INBOX, the full output is available on > request. > > Thanks again! > > > > > On 07/25/2014 04:53 PM, Adrian Chadd wrote: >> Yeah: >> >> Adrians-MacBook-Pro:Downloads adrian$ head -2 >> debug.lock.prof.stats.out.20140725 ; cat >> debug.lock.prof.stats.out.20140725 | sort -nk4 | tail -10 >> >> debug.lock.prof.stats: >> >> max wait_max total wait_total count avg wait_avg >> cnt_hold cnt_lock name >> >> 6 3 419 145 160 2 0 >> 0 63 /usr/src/sys/kern/kern_condvar.c:145 (sleep mutex:Giant) >> >> 282 133 991 215 8 123 26 >> 0 2 /usr/src/sys/modules/cxgbe/if_cxgbe/../../../dev/cxgbe/t4_main.c:6657 >> (sleep mutex:cxl3 txq26) >> >> 69 72 71 250 5 14 50 >> 0 4 /usr/src/sys/modules/cxgbe/if_cxgbe/../../../dev/cxgbe/t4_main.c:6657 >> (sleep mutex:cxl1 txq37) >> >> 281 197 1638 286 13 126 22 >> 0 2 /usr/src/sys/modules/cxgbe/if_cxgbe/../../../dev/cxgbe/t4_main.c:6657 >> (sleep mutex:cxl1 txq46) >> >> 351 182 2416 499 38 63 13 >> 0 10 /usr/src/sys/modules/cxgbe/if_cxgbe/../../../dev/cxgbe/t4_main.c:6657 >> (sleep mutex:cxl3 txq17) >> >> 276 193 802 643 10 80 64 >> 0 5 /usr/src/sys/modules/cxgbe/if_cxgbe/../../../dev/cxgbe/t4_main.c:6657 >> (sleep mutex:cxl3 txq27) >> >> 0 1 98578 1341 482441 0 0 >> 0 3767 /usr/src/sys/kern/subr_turnstile.c:552 (spin mutex:turnstile >> chain) >> >> 7 13 11543138 470545 63952832 0 0 >> 0 815777 /usr/src/sys/net/route.c:439 (sleep mutex:rtentry) >> >> 6 15 3943582 1545195 63952779 0 0 >> 0 3439254 /usr/src/sys/netinet/ip_fastfwd.c:593 (sleep mutex:rtentry) >> >> 7 17 3271389 2258698 63952832 0 0 >> 0 6761237 /usr/src/sys/netinet/in_rmx.c:114 (sleep mutex:rtentry) >> >> .. try FLOWTABLE. >> >> The in_rmx.c is the hook to check for temporary routes installed by >> redirect ICMP messages. It's .. not very pretty. Just use FLOWTABLE >> for now and see if it improves things. >> >> (Yes, we likely can do better on the rtentry locking..) >> >> >> -a >> >> >> On 25 July 2014 13:51, Adrian Chadd wrote: >>> Ugh, the forwarding table stupidity. Try enabling FLOWTABLE as an option. >>> >>> I really dislike how the rtentry locking works. But that isn't a >>> rwlock - i'll have to look at your full lock profiling output to see. >>> >>> >>> -a >>> >>> >>> On 25 July 2014 09:20, John Jasen wrote: >>>> Based on advice I received, I've collected lock profile debugging output, >>>> and pmcannotate'd data from the system while it was processing about 3 >>>> million packets/second. >>>> >>>> Combined, the files are about 325k in size, so I'll submit highlights here. >>>> I can provide the raw files to interested parties privately. >>>> >>>> pmcannotate summary output: >>>> >>>> grep ^Profile pmcannotate.20140725 >>>> Profile trace for function: __rw_rlock() [17.04%] >>>> Profile trace for function: __mtx_unlock_flags() [9.10%] >>>> Profile trace for function: _rw_runlock_cookie() [7.67%] >>>> Profile trace for function: sched_idletd() [5.73%] >>>> Profile trace for function: memcpy() [5.64%] >>>> Profile trace for function: bcopy() [5.04%] >>>> Profile trace for function: bcmp() [5.01%] >>>> Profile trace for function: __mtx_lock_flags() [3.66%] >>>> Profile trace for function: t4_eth_tx() [3.25%] >>>> Profile trace for function: lock_profile_release_lock() [2.73%] >>>> Profile trace for function: ip_fastforward() [2.68%] >>>> Profile trace for function: ether_output() [2.50%] >>>> Profile trace for function: get_scatter_segment() [1.75%] >>>> Profile trace for function: rn_match() [1.74%] >>>> Profile trace for function: _mtx_lock_spin_cookie() [1.53%] >>>> Profile trace for function: lock_profile_obtain_lock_success() [1.49%] >>>> Profile trace for function: cxgbe_transmit() [1.37%] >>>> Profile trace for function: uma_zalloc_arg() [1.31%] >>>> Profile trace for function: bzero() [1.30%] >>>> Profile trace for function: service_iq() [1.26%] >>>> Profile trace for function: ether_nh_input() [1.23%] >>>> Profile trace for function: __mtx_lock_sleep() [1.19%] >>>> Profile trace for function: arpresolve() [1.07%] >>>> Profile trace for function: uma_zfree_arg() [0.95%] >>>> Profile trace for function: reclaim_tx_descs() [0.87%] >>>> Profile trace for function: _mtx_trylock_flags_() [0.80%] >>>> Profile trace for function: bounce_bus_dmamap_load_buffer() [0.72%] >>>> Profile trace for function: ether_demux() [0.64%] >>>> Profile trace for function: mb_ctor_mbuf() [0.63%] >>>> Profile trace for function: rtalloc1_fib() [0.54%] >>>> >>>> sysctl debug.lock.prof.stats summary: (some of the highest hit counts, >>>> especially in cnt_hold: >>>> >>>> 7 17 3271389 2258698 63952832 0 0 0 >>>> 6761237 /usr/src/sys/netinet/in_rmx.c:114 (sleep mutex:rtentry) >>>> >>>> 7 13 11543138 470545 63952832 0 0 0 >>>> 815777 /usr/src/sys/net/route.c:439 (sleep mutex:rtentry) >>>> >>>> 6 15 3943582 1545195 63952779 0 0 0 >>>> 3439254 /usr/src/sys/netinet/ip_fastfwd.c:593 (sleep mutex:rtentry >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> On Tue, Jul 22, 2014 at 11:18 AM, John Jasen wrote: >>>> >>>>> Feedback and/or tips and tricks more than welcome. >>>>> >>>> >>>> _______________________________________________ >>>> freebsd-net@freebsd.org mailing list >>>> http://lists.freebsd.org/mailman/listinfo/freebsd-net >>>> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" >