From owner-svn-src-head@freebsd.org Sun Jul 22 18:08:23 2018
Delivered-To: svn-src-head@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
	by mailman.ysv.freebsd.org (Postfix) with ESMTP id 361E11051D9E;
	Sun, 22 Jul 2018 18:08:23 +0000 (UTC)
	(envelope-from marius@alchemy.franken.de)
Received: from alchemy.franken.de (alchemy.franken.de [194.94.249.214])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "alchemy.franken.de", Issuer "alchemy.franken.de" (not verified))
	by mx1.freebsd.org (Postfix) with ESMTPS id BB69D7E5E4;
	Sun, 22 Jul 2018 18:08:22 +0000 (UTC)
	(envelope-from marius@alchemy.franken.de)
Received: from alchemy.franken.de (localhost [127.0.0.1])
	by alchemy.franken.de (8.15.2/8.15.2/ALCHEMY.FRANKEN.DE) with ESMTPS id w6MI2cK3091010
	(version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
	Sun, 22 Jul 2018 20:02:38 +0200 (CEST)
	(envelope-from marius@alchemy.franken.de)
Received: (from marius@localhost)
	by alchemy.franken.de (8.15.2/8.15.2/Submit) id w6MI2c0S091009;
	Sun, 22 Jul 2018 20:02:38 +0200 (CEST)
	(envelope-from marius)
Date: Sun, 22 Jul 2018 20:02:38 +0200
From: Marius Strobl
To: Alexander Leidinger
Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: Re: svn commit: r336313 - in head/sys: dev/bnxt dev/e1000 dev/ixgbe dev/ixl net sys
Message-ID: <20180722180238.GY21523@alchemy.franken.de>
References: <201807151904.w6FJ4NNg039896@repo.freebsd.org>
	<20180718223313.Horde.lYE8PRYqLdkrN3QMTTHx3aV@webmail.leidinger.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20180718223313.Horde.lYE8PRYqLdkrN3QMTTHx3aV@webmail.leidinger.net>
User-Agent: Mutt/1.9.2 (2017-12-15)
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2
	(alchemy.franken.de [0.0.0.0]); Sun, 22 Jul 2018 20:02:38 +0200 (CEST)
X-BeenThere: svn-src-head@freebsd.org
X-Mailman-Version: 2.1.27
Precedence: list
List-Id: SVN commit messages for the src tree for head/-current
X-List-Received-Date: Sun, 22 Jul 2018 18:08:23 -0000

On Wed, Jul 18, 2018 at 10:33:13PM +0200, Alexander Leidinger wrote:
> Quoting Marius Strobl (from Sun, 15 Jul 2018
> 19:04:23 +0000 (UTC)):
>
> > Author: marius
> > Date: Sun Jul 15 19:04:23 2018
> > New Revision: 336313
> > URL: https://svnweb.freebsd.org/changeset/base/336313
> >
> > Log:
> >   Assorted TSO fixes for em(4)/iflib(9) and dead code removal:
> [...]
> >   Okayed by: sbruno@ at 201806 DevSummit Transport Working Group [1]
> >   Reviewed by: sbruno (earlier version), erj
> >   PR: 219428 (part of; comment #10) [1], 220997 (part of; comment #3)
>
> Hi Marius,
>
> thanks a lot for this change, it improves the situation (PR 220997) a
> lot. The system is running at r336329, as such I don't have your
> change r336356 yet on the system. Maybe the 2 panics (more below) I've
> seen are fixed by this. Before I try your second change (surely not
> before the WE), here at least the report in case it is related to your
> changes and not related to r336313:
>
> I got 2 panics, both within 6 minutes (based upon the timestamp of the
> coredumps in the filesystem):
>
> 1)
> panic: Assertion ifsd_m[next] == NULL failed at /usr/src/sys/net/iflib.c:3151
> cpuid = 2
> time = 1531944124
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe008af85850
> vpanic() at vpanic+0x1a3/frame 0xfffffe008af858b0
> doadump() at doadump/frame 0xfffffe008af85930
> iflib_txq_drain() at iflib_txq_drain+0xe58/frame 0xfffffe008af85aa0
> ifmp_ring_check_drainage() at ifmp_ring_check_drainage+0x16c/frame
> 0xfffffe008af85b00
> _task_fn_tx() at _task_fn_tx+0x76/frame 0xfffffe008af85b30
> gtaskqueue_run_locked() at gtaskqueue_run_locked+0x139/frame
> 0xfffffe008af85b80
> gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0x88/frame
> 0xfffffe008af85bb0
> fork_exit() at fork_exit+0x84/frame 0xfffffe008af85bf0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe008af85bf0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> Uptime: 1d22h51m17s
> Dumping 2990 out of 8037 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
>
> __curthread () at ./machine/pcpu.h:230
> 230         __asm("movq %%gs:%1,%0" : "=r" (td)
> (kgdb) #0 __curthread () at ./machine/pcpu.h:230
> #1 doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:366
> #2 0xffffffff80485ea1 in kern_reboot (howto=260)
>     at /usr/src/sys/kern/kern_shutdown.c:446
> #3 0xffffffff80486483 in vpanic (fmt=, ap=0xfffffe008af858f0)
>     at /usr/src/sys/kern/kern_shutdown.c:863
> #4 0xffffffff804861f0 in kassert_panic (
>     fmt=0xffffffff807e085f "Assertion %s failed at %s:%d")
>     at /usr/src/sys/kern/kern_shutdown.c:749
> #5 0xffffffff8059cd78 in iflib_busdma_load_mbuf_sg (flags=0,
>     txq=, tag=, map=,
>     m0=, segs=, nsegs=,
>     max_segs=) at /usr/src/sys/net/iflib.c:3151
> #6 iflib_encap (txq=0xfffff800028dc000, m_headp=0xfffffe00959bdd30)
>     at /usr/src/sys/net/iflib.c:3321
> #7 iflib_txq_drain (r=0xfffffe00959ba000, cidx=,
>     pidx=41319936) at /usr/src/sys/net/iflib.c:3636
> #8 0xffffffff805a0f4c in drain_ring_lockless (r=, os=...,
>     prev=, budget=)
>     at /usr/src/sys/net/mp_ring.c:199
> #9 ifmp_ring_check_drainage (r=, budget=32)
>     at /usr/src/sys/net/mp_ring.c:502
> #10 0xffffffff80599c46 in _task_fn_tx (context=)
>     at /usr/src/sys/net/iflib.c:3747
> #11 0xffffffff804cd2c9 in gtaskqueue_run_locked (queue=0xfffff800025e0d00)
>     at /usr/src/sys/kern/subr_gtaskqueue.c:332
> #12 0xffffffff804cd048 in gtaskqueue_thread_loop (arg=)
>     at /usr/src/sys/kern/subr_gtaskqueue.c:507
> #13 0xffffffff8044cc34 in fork_exit (
>     callout=0xffffffff804ccfc0 ,
>     arg=0xfffffe0007ffd038, frame=0xfffffe008af85c00)
>     at /usr/src/sys/kern/kern_fork.c:1057
> (kgdb) up 5
> #5 0xffffffff8059cd78 in iflib_busdma_load_mbuf_sg (flags=0,
>     txq=, tag=,
>     map=, m0=, segs=,
>     nsegs=, max_segs=)
>     at /usr/src/sys/net/iflib.c:3151
> 3151        MPASS(ifsd_m[next] == NULL);
> (kgdb) list
> 3146        /*
> 3147         * see if we can't be smarter about physically
> 3148         * contiguous mappings
> 3149         */
> 3150        next = (pidx + count) & (ntxd-1);
> 3151        MPASS(ifsd_m[next] == NULL);
> 3152 #if MEMORY_LOGGING
> 3153        txq->ift_enqueued++;
> 3154 #endif
> 3155        ifsd_m[next] = m;
> (kgdb) print ifsd_m
> $1 = (struct mbuf **) 0xfffffe00959b8000
> (kgdb) print next
> $2 =
> (kgdb) print pidx
> $3 = 277
> (kgdb) print count
> $4 = 0
> (kgdb) print ntxd
> $5 =
>
>
> 2)
> Unread portion of the kernel message buffer:
> panic: Assertion ifsd_m[next] == NULL failed at /usr/src/sys/net/iflib.c:3151
> cpuid = 2
> time = 1531944550
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe008af85850
> vpanic() at vpanic+0x1a3/frame 0xfffffe008af858b0
> doadump() at doadump/frame 0xfffffe008af85930
> iflib_txq_drain() at iflib_txq_drain+0xe58/frame 0xfffffe008af85aa0
> ifmp_ring_check_drainage() at ifmp_ring_check_drainage+0x16c/frame
> 0xfffffe008af85b00
> _task_fn_tx() at _task_fn_tx+0x76/frame 0xfffffe008af85b30
> gtaskqueue_run_locked() at gtaskqueue_run_locked+0x139/frame
> 0xfffffe008af85b80
> gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0x88/frame
> 0xfffffe008af85bb0
> fork_exit() at fork_exit+0x84/frame 0xfffffe008af85bf0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe008af85bf0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> Uptime: 5m27s
> Dumping 1555 out of 8037 MB:..2%..11%..21%..31%..41%..51%..61%..71%..81%..91%
>
> __curthread () at ./machine/pcpu.h:230
> 230         __asm("movq %%gs:%1,%0" : "=r" (td)
> (kgdb) bt
> #0 __curthread () at ./machine/pcpu.h:230
> #1 doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:366
> #2 0xffffffff80485ea1 in kern_reboot (howto=260) at
> /usr/src/sys/kern/kern_shutdown.c:446
> #3 0xffffffff80486483 in vpanic (fmt=, ap=0xfffffe008af858f0)
>     at /usr/src/sys/kern/kern_shutdown.c:863
> #4 0xffffffff804861f0 in kassert_panic (fmt=0xffffffff807e085f
> "Assertion %s failed at %s:%d")
>     at /usr/src/sys/kern/kern_shutdown.c:749
> #5 0xffffffff8059cd78 in iflib_busdma_load_mbuf_sg (flags=0,
>     txq=, tag=,
>     map=, m0=, segs=,
>     nsegs=, max_segs=)
>     at /usr/src/sys/net/iflib.c:3151
> #6 iflib_encap (txq=0xfffff800028fe000, m_headp=0xfffffe00959bdde8)
>     at /usr/src/sys/net/iflib.c:3321
> #7 iflib_txq_drain (r=0xfffffe00959ba000, cidx=,
>     pidx=42948608) at /usr/src/sys/net/iflib.c:3636
> #8 0xffffffff805a0f4c in drain_ring_lockless (r=,
>     os=..., prev=,
>     budget=) at /usr/src/sys/net/mp_ring.c:199
> #9 ifmp_ring_check_drainage (r=, budget=32) at
> /usr/src/sys/net/mp_ring.c:502
> #10 0xffffffff80599c46 in _task_fn_tx (context=) at
> /usr/src/sys/net/iflib.c:3747
> #11 0xffffffff804cd2c9 in gtaskqueue_run_locked (queue=0xfffff800025a2200)
>     at /usr/src/sys/kern/subr_gtaskqueue.c:332
> #12 0xffffffff804cd048 in gtaskqueue_thread_loop (arg=)
>     at /usr/src/sys/kern/subr_gtaskqueue.c:507
> #13 0xffffffff8044cc34 in fork_exit (callout=0xffffffff804ccfc0
> , arg=0xfffffe0007ffd038,
>     frame=0xfffffe008af85c00) at /usr/src/sys/kern/kern_fork.c:1057
> #14
> (kgdb) up 5
> #5 0xffffffff8059cd78 in iflib_busdma_load_mbuf_sg (flags=0,
>     txq=, tag=,
>     map=, m0=, segs=,
>     nsegs=, max_segs=)
>     at /usr/src/sys/net/iflib.c:3151
> 3151        MPASS(ifsd_m[next] == NULL);
> (kgdb) print ifsd_m
> $1 = (struct mbuf **) 0xfffffe00959b8000
> (kgdb) print pidx
> $2 = 707
> (kgdb) print count
> $3 = 0

Hrm, so far I can neither see how iflib(9) could get into that state
nor reproduce the panic, not even with a LEM-class MAC. Is that an old
or a new problem? If the latter, please try with r336612. The fix in
r336356 is only relevant for IGB-class devices, so it doesn't apply to
your machine unless the above panics are from hardware other than what
PR 220997 is about.

Marius
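
For anyone reading the two dumps: a minimal sketch of the slot computation that the failing MPASS guards, under the assumption that the ring size is a power of two. ntxd was not printable in either dump, so the 1024 used in the usage note is purely hypothetical, and ring_next() and slot_is_free() are illustrative helpers, not iflib(9) functions. If the printed locals are accurate, count = 0 means next is simply pidx, i.e. the assertion would have fired on the producer slot itself (277 and 707 respectively).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring line 3150 of the listing:
 *     next = (pidx + count) & (ntxd-1);
 * The mask only wraps correctly when ntxd is a power of two
 * (assumption; the real ntxd was not printable in the dumps).
 */
static inline uint32_t
ring_next(uint32_t pidx, uint32_t count, uint32_t ntxd)
{
	return ((pidx + count) & (ntxd - 1));
}

/*
 * Hypothetical helper expressing the condition line 3151 asserts:
 * the slot about to be filled must not still hold an mbuf pointer.
 * Returns 1 if the slot is free (NULL), 0 if it is still occupied.
 */
static inline int
slot_is_free(void *const *ring, uint32_t pidx, uint32_t count, uint32_t ntxd)
{
	return (ring[ring_next(pidx, count, ntxd)] == NULL);
}
```

With an assumed ring of 1024 entries, ring_next(277, 0, 1024) yields 277 and ring_next(1023, 1, 1024) wraps to 0; slot_is_free() returns 0 for a slot whose previous pointer was never cleared, which is the situation the panic reports.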