Date: Fri, 25 Jul 2014 13:53:30 -0700
Subject: Re: fastforward/routing: a 3 million packet-per-second system?
From: Adrian Chadd
To: John Jasen
Cc: FreeBSD Net

Yeah:

Adrians-MacBook-Pro:Downloads adrian$ head -2 debug.lock.prof.stats.out.20140725 ; cat debug.lock.prof.stats.out.20140725 | sort -nk4 | tail -10
debug.lock.prof.stats:
  max wait_max      total wait_total      count   avg wait_avg cnt_hold cnt_lock name
    6        3        419        145        160     2        0        0       63 /usr/src/sys/kern/kern_condvar.c:145 (sleep mutex:Giant)
  282      133        991        215          8   123       26        0        2 /usr/src/sys/modules/cxgbe/if_cxgbe/../../../dev/cxgbe/t4_main.c:6657 (sleep mutex:cxl3 txq26)
   69       72         71        250          5    14       50        0        4 /usr/src/sys/modules/cxgbe/if_cxgbe/../../../dev/cxgbe/t4_main.c:6657 (sleep mutex:cxl1 txq37)
  281      197       1638        286         13   126       22        0        2 /usr/src/sys/modules/cxgbe/if_cxgbe/../../../dev/cxgbe/t4_main.c:6657 (sleep mutex:cxl1 txq46)
  351      182       2416        499         38    63       13        0       10 /usr/src/sys/modules/cxgbe/if_cxgbe/../../../dev/cxgbe/t4_main.c:6657 (sleep mutex:cxl3 txq17)
  276      193        802        643         10    80       64        0        5 /usr/src/sys/modules/cxgbe/if_cxgbe/../../../dev/cxgbe/t4_main.c:6657 (sleep mutex:cxl3 txq27)
    0        1      98578       1341     482441     0        0        0     3767 /usr/src/sys/kern/subr_turnstile.c:552 (spin mutex:turnstile chain)
    7       13   11543138     470545   63952832     0        0        0   815777 /usr/src/sys/net/route.c:439 (sleep mutex:rtentry)
    6       15    3943582    1545195   63952779     0        0        0  3439254 /usr/src/sys/netinet/ip_fastfwd.c:593 (sleep mutex:rtentry)
    7       17    3271389    2258698   63952832     0        0        0  6761237 /usr/src/sys/netinet/in_rmx.c:114 (sleep mutex:rtentry)

.. try FLOWTABLE.

The in_rmx.c is the hook to check for temporary routes installed by
redirect ICMP messages. It's .. not very pretty.

Just use FLOWTABLE for now and see if it improves things.

(Yes, we likely can do better on the rtentry locking..)
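For anyone who wants to rank the lock profile by a different column, here is a minimal Python sketch of the same ranking the sort/tail pipeline above does. It assumes the ten-column layout shown in the header line of the output. The names parse_lock_prof, top_locks and SAMPLE are illustrative only and not part of any FreeBSD tool, and SAMPLE carries just four rows copied from the output above.

```python
# Numeric columns, in the order printed by the debug.lock.prof.stats header.
COLUMNS = ["max", "wait_max", "total", "wait_total", "count",
           "avg", "wait_avg", "cnt_hold", "cnt_lock"]

# Four rows copied from the output in this thread (illustrative subset).
SAMPLE = """debug.lock.prof.stats:
max wait_max total wait_total count avg wait_avg cnt_hold cnt_lock name
0 1 98578 1341 482441 0 0 0 3767 /usr/src/sys/kern/subr_turnstile.c:552 (spin mutex:turnstile chain)
7 13 11543138 470545 63952832 0 0 0 815777 /usr/src/sys/net/route.c:439 (sleep mutex:rtentry)
6 15 3943582 1545195 63952779 0 0 0 3439254 /usr/src/sys/netinet/ip_fastfwd.c:593 (sleep mutex:rtentry)
7 17 3271389 2258698 63952832 0 0 0 6761237 /usr/src/sys/netinet/in_rmx.c:114 (sleep mutex:rtentry)
"""


def parse_lock_prof(text):
    """Turn lock profile text into a list of dicts, one per data row."""
    rows = []
    n = len(COLUMNS)
    for line in text.splitlines():
        # Split off the nine numeric fields; the remainder is the lock name.
        fields = line.split(None, n)
        if len(fields) <= n:
            continue  # banner line or blank line
        if not all(f.isdigit() for f in fields[:n]):
            continue  # header line
        row = dict(zip(COLUMNS, (int(f) for f in fields[:n])))
        row["name"] = fields[n].strip()
        rows.append(row)
    return rows


def top_locks(rows, key="wait_total", n=10):
    """Return the n rows with the largest value in the given column."""
    return sorted(rows, key=lambda r: r[key], reverse=True)[:n]


if __name__ == "__main__":
    for row in top_locks(parse_lock_prof(SAMPLE), key="wait_total", n=3):
        print(row["wait_total"], row["cnt_lock"], row["name"])
```

Passing key="cnt_lock" or key="total" instead of "wait_total" gives the other rankings without re-running sort with a different field number.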
-a

On 25 July 2014 13:51, Adrian Chadd wrote:
> Ugh, the forwarding table stupidity. Try enabling FLOWTABLE as an option.
>
> I really dislike how the rtentry locking works. But that isn't a
> rwlock - I'll have to look at your full lock profiling output to see.
>
> -a
>
> On 25 July 2014 09:20, John Jasen wrote:
>> Based on advice I received, I've collected lock profile debugging output,
>> and pmcannotate'd data from the system while it was processing about 3
>> million packets/second.
>>
>> Combined, the files are about 325k in size, so I'll submit highlights here.
>> I can provide the raw files to interested parties privately.
>>
>> pmcannotate summary output:
>>
>> grep ^Profile pmcannotate.20140725
>> Profile trace for function: __rw_rlock() [17.04%]
>> Profile trace for function: __mtx_unlock_flags() [9.10%]
>> Profile trace for function: _rw_runlock_cookie() [7.67%]
>> Profile trace for function: sched_idletd() [5.73%]
>> Profile trace for function: memcpy() [5.64%]
>> Profile trace for function: bcopy() [5.04%]
>> Profile trace for function: bcmp() [5.01%]
>> Profile trace for function: __mtx_lock_flags() [3.66%]
>> Profile trace for function: t4_eth_tx() [3.25%]
>> Profile trace for function: lock_profile_release_lock() [2.73%]
>> Profile trace for function: ip_fastforward() [2.68%]
>> Profile trace for function: ether_output() [2.50%]
>> Profile trace for function: get_scatter_segment() [1.75%]
>> Profile trace for function: rn_match() [1.74%]
>> Profile trace for function: _mtx_lock_spin_cookie() [1.53%]
>> Profile trace for function: lock_profile_obtain_lock_success() [1.49%]
>> Profile trace for function: cxgbe_transmit() [1.37%]
>> Profile trace for function: uma_zalloc_arg() [1.31%]
>> Profile trace for function: bzero() [1.30%]
>> Profile trace for function: service_iq() [1.26%]
>> Profile trace for function: ether_nh_input() [1.23%]
>> Profile trace for function: __mtx_lock_sleep() [1.19%]
>> Profile trace for function: arpresolve() [1.07%]
>> Profile trace for function: uma_zfree_arg() [0.95%]
>> Profile trace for function: reclaim_tx_descs() [0.87%]
>> Profile trace for function: _mtx_trylock_flags_() [0.80%]
>> Profile trace for function: bounce_bus_dmamap_load_buffer() [0.72%]
>> Profile trace for function: ether_demux() [0.64%]
>> Profile trace for function: mb_ctor_mbuf() [0.63%]
>> Profile trace for function: rtalloc1_fib() [0.54%]
>>
>> sysctl debug.lock.prof.stats summary (some of the highest hit counts,
>> especially in cnt_hold):
>>
>> 7 17 3271389 2258698 63952832 0 0 0
>> 6761237 /usr/src/sys/netinet/in_rmx.c:114 (sleep mutex:rtentry)
>>
>> 7 13 11543138 470545 63952832 0 0 0
>> 815777 /usr/src/sys/net/route.c:439 (sleep mutex:rtentry)
>>
>> 6 15 3943582 1545195 63952779 0 0 0
>> 3439254 /usr/src/sys/netinet/ip_fastfwd.c:593 (sleep mutex:rtentry)
>>
>> On Tue, Jul 22, 2014 at 11:18 AM, John Jasen wrote:
>>
>>> Feedback and/or tips and tricks more than welcome.
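
A rough way to read the pmcannotate summary quoted above is to add up the share taken by the lock primitives themselves. The sketch below is a minimal Python example under a stated assumption: a function counts as "locking" when its name starts with one of the rwlock or mutex prefixes seen in the listing (__rw_, _rw_, __mtx_, _mtx_). That prefix rule, the names parse_profile and lock_share, and the SAMPLE text are illustrative choices and not part of pmcannotate. SAMPLE holds every entry from the quoted summary that matches those prefixes plus a few that do not, and on it the lock primitives sum to roughly 41% of samples.

```python
import re

# Matches lines such as: Profile trace for function: __rw_rlock() [17.04%]
PROFILE_RE = re.compile(r"Profile trace for function: (\S+)\(\) \[([\d.]+)%\]")

# Assumption: these name prefixes identify rwlock/mutex primitives.
LOCK_PREFIXES = ("__rw_", "_rw_", "__mtx_", "_mtx_")

# Subset of the summary quoted in this thread.
SAMPLE = """
Profile trace for function: __rw_rlock() [17.04%]
Profile trace for function: __mtx_unlock_flags() [9.10%]
Profile trace for function: _rw_runlock_cookie() [7.67%]
Profile trace for function: sched_idletd() [5.73%]
Profile trace for function: memcpy() [5.64%]
Profile trace for function: __mtx_lock_flags() [3.66%]
Profile trace for function: lock_profile_release_lock() [2.73%]
Profile trace for function: ip_fastforward() [2.68%]
Profile trace for function: _mtx_lock_spin_cookie() [1.53%]
Profile trace for function: __mtx_lock_sleep() [1.19%]
Profile trace for function: _mtx_trylock_flags_() [0.80%]
"""


def parse_profile(text):
    """Return (function_name, percent) pairs found in the summary text."""
    return [(m.group(1), float(m.group(2))) for m in PROFILE_RE.finditer(text)]


def lock_share(samples):
    """Sum the percentages of functions whose names look like lock primitives."""
    return sum(pct for name, pct in samples if name.startswith(LOCK_PREFIXES))


if __name__ == "__main__":
    samples = parse_profile(SAMPLE)
    print("lock primitives: %.2f%% of samples" % lock_share(samples))
```

The lock_profile_* functions are left out on purpose: they look like overhead from the lock profiling instrumentation rather than from the forwarding path, though that reading is an inference from their names.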