From owner-freebsd-net@freebsd.org Tue May 8 17:37:58 2018
Message-ID: <5AF1E073.5010701@omnilan.de>
Date: Tue, 08 May 2018 19:37:55 +0200
From: Harry Schmalzbauer
Organization: OmniLAN
To: Sean Bruno
CC: Kevin Bowling, "freebsd-net@freebsd.org", Stephen Hurd
Subject: Re: iflib-if_em tests with HEAD and lagg panic [Was: Re: svn commit: r333338 - in stable/11/sys: dev/bnxt kern net sys]
References: <201805072142.w47LgN1R041002@repo.freebsd.org>
 <5AF16B8B.7030703@omnilan.de> <5AF17134.7020602@omnilan.de>
 <5AF1CF0F.4040909@omnilan.de>
 <65972f0d-2873-42ea-464c-a3db543abafb@freebsd.org>
In-Reply-To: <65972f0d-2873-42ea-464c-a3db543abafb@freebsd.org>
List-Id: Networking and TCP/IP with FreeBSD

Regarding Sean Bruno's message of 08.05.2018 18:44 (localtime):
>
>
> On 05/08/18 10:23, Harry Schmalzbauer wrote:
>> Regarding Kevin Bowling's message of 08.05.2018 11:52 (localtime):
>> …
>>>> But if the simple iflib/hw-support test with kawela+hartwell helps I'm
>>>> happy to do.
>>>
>>> At this point it would be helpful, we think e1000 is nearing pretty
>>> good shape and I need to become familiar with any outstanding bugs.
>>
>> I started with hartwell:
>> em1: attach_pre capping queues at 2
>>
>> Current cap: 0x460b
>> em1: using 1024 tx descriptors and 1024 rx descriptors
>> em1: msix_init qsets capped at 2
>> em1: pxm cpus: 2 queue msgs: 4 admincnt: 1
>> em1: using 2 rx queues 2 tx queues
>> em1: Using MSIX interrupts with 3 vectors
>> em1: allocated for 2 tx_queues
>> em1: allocated for 2 rx_queues
>> em1: Ethernet address: 00:1b:21:3e:90:52
>> em1: netmap queues/slots: TX 2/1024, RX 2/1024
>> dev.em.1.iflib.driver_version: 7.6.1-k
>> dev.em.1.queue_rx_1.rx_irq: 0
>> dev.em.1.queue_rx_1.rxd_tail: 607
>> dev.em.1.queue_rx_1.rxd_head: 21
>> dev.em.1.queue_rx_0.rx_irq: 0
>> dev.em.1.queue_rx_0.rxd_tail: 410
>> dev.em.1.queue_rx_0.rxd_head: 412
>> dev.em.1.queue_tx_1.tx_irq: 0
>> dev.em.1.queue_tx_1.txd_tail: 8
>> dev.em.1.queue_tx_1.txd_head: 8
>> dev.em.1.queue_tx_0.tx_irq: 0
>> dev.em.1.queue_tx_0.txd_tail: 428
>> dev.em.1.queue_tx_0.txd_head: 428
>>
>> Looks good so far, no problems with simple line speed (NFS4) copies.
>>
>> According to the i217 (Clarkville) Datasheet, it also supports 2 queues:
>> Table 63. Intel® Ethernet Controller I217 Capability PHY Address 01,
>> Page 776, Register 19
>> But it probably was never supported, at least I haven't ever checked
>> pre-iflib.
>> Here's the Clarkville:
>> em0: attach_pre capping queues at 1
>> em0: using 1024 tx descriptors and 1024 rx descriptors
>> em0: msix_init qsets capped at
>> em0: PCIY_MSIX capability not found; or rid 0 == 0.
>> em0: Using an MSI interrupt
>> em0: allocated for 1 tx_queues
>> em0: allocated for 1 rx_queues
>> em0: Ethernet address: 54:be:f7:0b:d7:4e
>> em0: netmap queues/slots: TX 1/1024, RX 1/1024
>>
>> Since it's not much effort here, I also tried LACP, which panicked.
>> vmcore available, but what debugger to use these days? kgdb seems to be
>> replaced...
>>
>> -harry
>> _____________
>
> /usr/libexec/kgdb should be the old kgdb that you are used to. Most of
> us have switched to using devel/gdb from ports.

Thanks, silly me, it's in libexec, not in my path...
Unfortunately I have no clue about those essential C tools, so it
doesn't make much sense for me to waste energy installing devel/gdb ;-)
While I'm wondering why/how LLVM/gdb can be mixed...
pure lack of essentials :-(

So back to the iflib-if_em panic. It happens after setting up an
if_lagg(4) interface and firing dhclient to get a lease. The lagg
consists of an add-on 82574 and the on-board (PCH)+i217 NIC. The i217
was assigned a locally administered Ethernet address and used as the
first laggport, so the private MAC was (successfully) set on both NICs:

Sleeping on "e1000_delay" with the following non-sleepable locks held:
exclusive rm if_lagg rmlock (if_lagg rmlock) r = 0 (0xfffff80014228c08) locked @ /usr/src/sys/net/if_lagg.c:1433
stack backtrace:
#0 0xffffffff80701113 at witness_debugger+0x73
#1 0xffffffff807024f1 at witness_warn+0x461
#2 0xffffffff806a42cc at _sleep+0x6c
#3 0xffffffff806a4b34 at pause_sbt+0x144
#4 0xffffffff80440e21 at e1000_write_phy_reg_mdic+0xf1
#5 0xffffffff804446bf at e1000_enable_phy_wakeup_reg_access_bm+0x2f
#6 0xffffffff80432e0a at e1000_update_mc_addr_list_pch2lan+0x3a
#7 0xffffffff8041408f at em_if_multi_set+0x1bf
#8 0xffffffff807bc02e at iflib_if_ioctl+0xfe
#9 0xffffffff82111a15 at lagg_ioctl+0x115
#10 0xffffffff807dd348 at inm_release_task+0x218
#11 0xffffffff806dea29 at gtaskqueue_run_locked+0x139
#12 0xffffffff806de7a8 at gtaskqueue_thread_loop+0x88
#13 0xffffffff80659d84 at fork_exit+0x84
#14 0xffffffff809b767e at fork_trampoline+0xe
Sleeping thread (tid 100017, pid 0) owns a non-sleepable lock
KDB: stack backtrace of thread 100017:
sched_switch() at sched_switch+0x945/frame 0xfffffe00750dc5d0
mi_switch() at mi_switch+0x18c/frame 0xfffffe00750dc600
sleepq_switch() at sleepq_switch+0x10d/frame 0xfffffe00750dc640
sleepq_timedwait() at sleepq_timedwait+0x50/frame 0xfffffe00750dc680
_sleep() at _sleep+0x307/frame 0xfffffe00750dc730
pause_sbt() at pause_sbt+0x144/frame 0xfffffe00750dc780
e1000_write_phy_reg_mdic() at e1000_write_phy_reg_mdic+0xf1/frame 0xfffffe00750dc7c0
e1000_enable_phy_wakeup_reg_access_bm() at e1000_enable_phy_wakeup_reg_access_bm+0x2f/frame 0xfffffe00750dc7e0
e1000_update_mc_addr_list_pch2lan() at e1000_update_mc_addr_list_pch2lan+0x3a/frame 0xfffffe00750dc820
em_if_multi_set() at em_if_multi_set+0x1bf/frame 0xfffffe00750dc870
iflib_if_ioctl() at iflib_if_ioctl+0xfe/frame 0xfffffe00750dc8e0
lagg_ioctl() at lagg_ioctl+0x115/frame 0xfffffe00750dc990
inm_release_task() at inm_release_task+0x218/frame 0xfffffe00750dc9f0
gtaskqueue_run_locked() at gtaskqueue_run_locked+0x139/frame 0xfffffe00750dca40
gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0x88/frame 0xfffffe00750dca70
fork_exit() at fork_exit+0x84/frame 0xfffffe00750dcab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00750dcab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
panic: sleeping thread
cpuid = 3
time = 1525794682
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe008fe180e0
vpanic() at vpanic+0x1a3/frame 0xfffffe008fe18140
panic() at panic+0x43/frame 0xfffffe008fe181a0
propagate_priority() at propagate_priority+0x335/frame 0xfffffe008fe181e0
turnstile_wait() at turnstile_wait+0x38d/frame 0xfffffe008fe18230
__mtx_lock_sleep() at __mtx_lock_sleep+0x1e1/frame 0xfffffe008fe182b0
__mtx_lock_flags() at __mtx_lock_flags+0xf9/frame 0xfffffe008fe18300
_rm_rlock() at _rm_rlock+0x280/frame 0xfffffe008fe18330
_rm_rlock_debug() at _rm_rlock_debug+0x14c/frame 0xfffffe008fe18380
lagg_transmit() at lagg_transmit+0x38/frame 0xfffffe008fe183f0
ether_output_frame() at ether_output_frame+0xaa/frame 0xfffffe008fe18420
ether_output() at ether_output+0x68b/frame 0xfffffe008fe184c0
arprequest() at arprequest+0x474/frame 0xfffffe008fe185c0
arp_ifinit() at arp_ifinit+0x58/frame 0xfffffe008fe18600
ether_ioctl() at ether_ioctl+0x1d1/frame 0xfffffe008fe18630
lagg_ioctl() at lagg_ioctl+0x602/frame 0xfffffe008fe186e0
in_control() at in_control+0x8f5/frame 0xfffffe008fe18780
ifioctl() at ifioctl+0x19c6/frame 0xfffffe008fe18850
kern_ioctl() at kern_ioctl+0x2b9/frame 0xfffffe008fe188b0
sys_ioctl() at sys_ioctl+0x168/frame 0xfffffe008fe18980
amd64_syscall() at amd64_syscall+0x2cc/frame 0xfffffe008fe18ab0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe008fe18ab0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x8004820ba, rsp = 0x7fffffffe1c8, rbp = 0x7fffffffe210 ---
KDB: enter: panic

Hope this helps,

-harry
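
P.S. For reading the traces above without a debugger, here is a minimal sketch (plain Python, not part of any FreeBSD tooling) that pulls the function names out of a pasted WITNESS or KDB backtrace and checks whether the sleeping call sits deeper in the stack than the lagg ioctl path. The helper names frames() and sleeps_below() are made up for illustration, and the idea that the if_lagg rmlock is taken somewhere under lagg_ioctl is an assumption based only on the "locked @ /usr/src/sys/net/if_lagg.c:1433" line. SAMPLE holds a few frames copied from the first trace.

```python
import re

# Two frame styles appear in the report:
#   WITNESS: "#3 0xffffffff806a4b34 at pause_sbt+0x144"
#   KDB:     "pause_sbt() at pause_sbt+0x144/frame 0xfffffe00750dc780"
FRAME = re.compile(
    r"#\d+\s+0x[0-9a-f]+\s+at\s+(\w+)\+0x[0-9a-f]+"
    r"|(\w+)\(\)\s+at\s+\w+\+0x[0-9a-f]+/frame"
)


def frames(trace):
    """Return function names in printed order (innermost frame first)."""
    return [m.group(1) or m.group(2) for m in FRAME.finditer(trace)]


def sleeps_below(trace, sleeper="pause_sbt", caller="lagg_ioctl"):
    """True if `sleeper` was called (directly or indirectly) from `caller`.

    Assumption: `caller` is the frame that holds the non-sleepable lock.
    """
    names = frames(trace)
    if sleeper not in names or caller not in names:
        return False
    return names.index(sleeper) < names.index(caller)


# A few frames copied from the WITNESS backtrace in this mail.
SAMPLE = """
#3 0xffffffff806a4b34 at pause_sbt+0x144
#4 0xffffffff80440e21 at e1000_write_phy_reg_mdic+0xf1
#7 0xffffffff8041408f at em_if_multi_set+0x1bf
#8 0xffffffff807bc02e at iflib_if_ioctl+0xfe
#9 0xffffffff82111a15 at lagg_ioctl+0x115
"""

if __name__ == "__main__":
    print(" <- ".join(frames(SAMPLE)))
    print("sleeps under lagg_ioctl:", sleeps_below(SAMPLE))
```

Read that way, the first trace says that em_if_multi_set() ends up in pause_sbt() via the PHY register write while the lagg rmlock is still held, which is what WITNESS complains about.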