Date: Fri, 25 Apr 2014 09:58:31 -0600
From: Alan Somers <asomers@freebsd.org>
To: Adrian Chadd <adrian@freebsd.org>
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject: Re: svn commit: r253687 - head/sys/net
Message-ID: <CAOtMX2iXzXY1zAebqxJYGXw-_HRmRGmkpw7fgyLkvavJGcZy=g@mail.gmail.com>
In-Reply-To: <201307261941.r6QJfEMO087844@svn.freebsd.org>
References: <201307261941.r6QJfEMO087844@svn.freebsd.org>

On Fri, Jul 26, 2013 at 1:41 PM, Adrian Chadd <adrian@freebsd.org> wrote:
> Author: adrian
> Date: Fri Jul 26 19:41:13 2013
> New Revision: 253687
> URL: http://svnweb.freebsd.org/changeset/base/253687
>
> Log:
>   Break out the static, global LACP debug options into a per-lagg unit
>   sysctl tree.
>
>   * Create a net.link.lagg.X.lacp node

I think this introduced a lock order reversal.

>   * Add a debug node under that for tx_test and rx_test
>   * Add lacp_strict_mode, defaulting to 1
>
>   tx_test and rx_test are still a bitmap of unit numbers for now.
>   At some point it would be nice to create child nodes of the lagg bundle
>   for each sub-interface, and then populate those with various knobs
>   and statistics.
>
>   Sponsored by: Netflix
>
> Modified:
>   head/sys/net/ieee8023ad_lacp.c
>   head/sys/net/ieee8023ad_lacp.h
>   head/sys/net/if_lagg.c
>   head/sys/net/if_lagg.h
>
> Modified: head/sys/net/ieee8023ad_lacp.c
> ==============================================================================
> --- head/sys/net/ieee8023ad_lacp.c Fri Jul 26 19:11:08 2013 (r253686)
> +++ head/sys/net/ieee8023ad_lacp.c Fri Jul 26 19:41:13 2013 (r253687)

<Extra chunks elided>

;
> @@ -765,10 +791,19 @@ lacp_attach(struct lagg_softc *sc)
>
>         lsc->lsc_hashkey = arc4random();
>         lsc->lsc_active_aggregator = NULL;
> +       lsc->lsc_strict_mode = 1;
>         LACP_LOCK_INIT(lsc);
>         TAILQ_INIT(&lsc->lsc_aggregators);
>         LIST_INIT(&lsc->lsc_ports);
>
> +       /* Create a child of the parent lagg interface */
> +       oid = SYSCTL_ADD_NODE(&sc->ctx, SYSCTL_CHILDREN(sc->sc_oid),
> +           OID_AUTO, "lacp", CTLFLAG_RD, NULL, "LACP");

This line grabs a sleepable lock, but we already had a nonsleepable lock
further up the stack, acquired in lagg_ioctl().

> +
> +       /* Attach sysctl nodes */
> +       lacp_attach_sysctl(lsc, oid);
> +       lacp_attach_sysctl_debug(lsc, oid);
> +
>         callout_init_mtx(&lsc->lsc_transit_callout, &lsc->lsc_mtx, 0);
>         callout_init_mtx(&lsc->lsc_callout, &lsc->lsc_mtx, 0);
>

Here's the warning from Witness, as well as a warning from UMA. Many more
UMA warnings followed.

lock order reversal: (sleepable after non-sleepable)
 1st 0xfffff8000252ca08 if_lagg rmlock (if_lagg rmlock) @ /usr/home/alans/freebsd/head/sys/modules/if_lagg/../../net/if_lagg.c:1040
 2nd 0xffffffff814ef4e0 sysctl lock (sysctl lock) @ /usr/home/alans/freebsd/head/sys/kern/kern_sysctl.c:474
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00977485b0
kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe0097748660
witness_checkorder() at witness_checkorder+0xdc2/frame 0xfffffe00977486f0
_sx_xlock() at _sx_xlock+0x75/frame 0xfffffe0097748730
sysctl_add_oid() at sysctl_add_oid+0x4a/frame 0xfffffe0097748780
lacp_attach() at lacp_attach+0xf7/frame 0xfffffe00977487f0
lagg_lacp_attach() at lagg_lacp_attach+0x88/frame 0xfffffe0097748810
lagg_ioctl() at lagg_ioctl+0x98a/frame 0xfffffe00977488f0
in_control() at in_control+0x38e/frame 0xfffffe0097748970
ifioctl() at ifioctl+0xba2/frame 0xfffffe0097748a30
kern_ioctl() at kern_ioctl+0x22b/frame 0xfffffe0097748a90
sys_ioctl() at sys_ioctl+0x13c/frame 0xfffffe0097748ae0
amd64_syscall() at amd64_syscall+0x25a/frame 0xfffffe0097748bf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe0097748bf0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x800fa045a, rsp = 0x7fffffffe118, rbp = 0x7fffffffe1a0 ---
uma_zalloc_arg: zone "128" with the following non-sleepable locks held:
exclusive rm if_lagg rmlock (if_lagg rmlock) r = 0 (0xfffff8000252ca08) locked @ /usr/home/alans/freebsd/head/sys/modules/if_lagg/../../net/if_lagg.c:1040
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0097748500
kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe00977485b0
witness_warn() at witness_warn+0x4b5/frame 0xfffffe0097748670
uma_zalloc_arg() at uma_zalloc_arg+0x3b/frame 0xfffffe00977486e0
malloc() at malloc+0x194/frame 0xfffffe0097748730
sysctl_add_oid() at sysctl_add_oid+0x11f/frame 0xfffffe0097748780
lacp_attach() at lacp_attach+0xf7/frame 0xfffffe00977487f0
lagg_lacp_attach() at lagg_lacp_attach+0x88/frame 0xfffffe0097748810
lagg_ioctl() at lagg_ioctl+0x98a/frame 0xfffffe00977488f0
in_control() at in_control+0x38e/frame 0xfffffe0097748970
ifioctl() at ifioctl+0xba2/frame 0xfffffe0097748a30
kern_ioctl() at kern_ioctl+0x22b/frame 0xfffffe0097748a90
sys_ioctl() at sys_ioctl+0x13c/frame 0xfffffe0097748ae0
amd64_syscall() at amd64_syscall+0x25a/frame 0xfffffe0097748bf0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe0097748bf0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x800fa045a, rsp = 0x7fffffffe118, rbp = 0x7fffffffe1a0 ---

# uname -a
FreeBSD alans-fbsd-head 11.0-CURRENT FreeBSD 11.0-CURRENT #49 r264887M: Thu Apr 24 17:21:48 MDT 2014 alans@ns1.eng.sldomain.com:/vmpool/obj/usr/home/alans/freebsd/head/sys/GENERIC amd64

To reproduce:
ifconfig tap0 create
ifconfig tap1 create
ifconfig tap2 create
ifconfig lagg0 create
ifconfig lagg0 up laggproto lacp laggport tap0 laggport tap1 laggport tap2 192.0.0.2/24

If I create and destroy the lagg in a tight loop, while running
"ifconfig -am" in a tight loop in another terminal, I eventually hit a
general protection fault in __mtx_lock_sleep. I think it might be
related.

Can you reproduce this? Do you have any good ideas for a solution?

-Alan
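
The "sleepable after non-sleepable" rule that Witness reports above can be
modeled in a few lines of userspace C. This is only a sketch under stated
assumptions, not kernel code and not a proposed fix: model_lock, held_list,
order_ok, model_acquire and model_release are hypothetical names invented
for illustration, and the check is a simplification of what Witness does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are FreeBSD APIs. */
enum lock_class { LC_NONSLEEPABLE, LC_SLEEPABLE };

struct model_lock {
	const char *name;
	enum lock_class cls;
};

#define MAX_HELD 8

/* Locks currently held by one (modeled) thread. */
struct held_list {
	const struct model_lock *held[MAX_HELD];
	size_t count;
};

/*
 * A sleepable lock must not be acquired while any non-sleepable lock
 * is held. Every other combination is allowed by this simplified rule.
 */
static bool
order_ok(const struct held_list *hl, const struct model_lock *lk)
{
	if (lk->cls != LC_SLEEPABLE)
		return (true);
	for (size_t i = 0; i < hl->count; i++) {
		if (hl->held[i]->cls == LC_NONSLEEPABLE)
			return (false);
	}
	return (true);
}

/* Record the lock as held, or refuse if the order rule is violated. */
static bool
model_acquire(struct held_list *hl, const struct model_lock *lk)
{
	assert(hl->count < MAX_HELD);
	if (!order_ok(hl, lk))
		return (false);
	hl->held[hl->count++] = lk;
	return (true);
}

/* Drop the lock from the held list if it is present. */
static void
model_release(struct held_list *hl, const struct model_lock *lk)
{
	for (size_t i = 0; i < hl->count; i++) {
		if (hl->held[i] == lk) {
			hl->held[i] = hl->held[--hl->count];
			return;
		}
	}
}
```

With a non-sleepable lock standing in for the if_lagg rmlock and a sleepable
one standing in for the sysctl lock, model_acquire() refuses the second
acquisition while the first is held, and accepts it once the first has been
released or when the two are taken in the opposite order.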