From: John Baldwin
To: freebsd-net@freebsd.org
Cc: Andrew Boyer
Date: Mon, 2 Apr 2012 08:35:00 -0400
Subject: Re: LACP kernel panics: /* unlocking is safe here */
Message-Id: <201204020835.00357.jhb@freebsd.org>
List-Id: Networking and TCP/IP with FreeBSD

On Friday, March 30, 2012 6:04:24 pm Andrew Boyer wrote:
> While investigating a LACP issue, I turned on LACP_DEBUG on a debug kernel. In this configuration it's easy to panic the kernel - just run 'ifconfig lagg0 laggproto lacp' on a lagg that's already in LACP mode and receiving LACP messages.
>
> The problem is that lagg_lacp_detach() drops the lagg wlock (with the comment in the title), which allows incoming LACP messages to get through lagg_input() while the structure is being destroyed in lacp_detach().
>
> There's a very simple fix, but I don't know if it's the best way to fix it. Resetting the protocol before calling sc_detach causes any further incoming packets to be dropped until the lagg gets reconfigured. Thoughts?

This looks sensible.

> Is it safe to just hold on to the lagg wlock across the callout_drain() calls in lacp_detach()? That's what OpenBSD does.

No, callout_drain() can sleep. Also, if this is using callout_init_mtx() or callout_init_rw() (which it probably should), then holding the lock across callout_drain() could deadlock.

> -Andrew
>
> Index: sys/net/if_lagg.c
> ===================================================================
> --- sys/net/if_lagg.c (revision 233707)
> +++ sys/net/if_lagg.c (working copy)
> @@ -952,9 +952,10 @@
>  		}
>  		if (sc->sc_proto != LAGG_PROTO_NONE) {
>  			LAGG_WLOCK(sc);
> +			/* Reset protocol */
> +			sc->sc_proto = LAGG_PROTO_NONE;
>  			error = sc->sc_detach(sc);
> -			/* Reset protocol and pointers */
> -			sc->sc_proto = LAGG_PROTO_NONE;
> +			/* Reset pointers */
>  			sc->sc_detach = NULL;
>  			sc->sc_start = NULL;
>  			sc->sc_input = NULL;
>
> --------------------------------------------------
> Andrew Boyer aboyer@averesystems.com

-- 
John Baldwin
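
The ordering argument in the patch can be illustrated with a small user-space model. This is only a sketch and not FreeBSD kernel code: the demo_* names, the toy lock, and the single simulated packet are all assumptions made for illustration. It assumes, as the thread describes, that the input path drops packets when no protocol is set, and that the detach path releases the write lock while protocol state is being torn down.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for LAGG_PROTO_NONE and LAGG_PROTO_LACP. */
enum demo_proto { DEMO_PROTO_NONE, DEMO_PROTO_LACP };

/* Toy softc; every field name here is illustrative only. */
struct demo_softc {
	bool		wlocked;	/* toy model of the lagg write lock */
	enum demo_proto	proto;		/* plays the role of sc_proto */
	bool		tearing_down;	/* protocol state is being destroyed */
	int		raced;		/* packets that reached dying state */
	int		dropped;	/* packets dropped by the proto check */
};

static void
demo_wlock(struct demo_softc *sc)
{
	assert(!sc->wlocked);
	sc->wlocked = true;
}

static void
demo_wunlock(struct demo_softc *sc)
{
	assert(sc->wlocked);
	sc->wlocked = false;
}

/* Model of the input path: drop the packet unless a protocol is set. */
static void
demo_input(struct demo_softc *sc)
{
	/* A reader can only get in while the write lock is not held. */
	assert(!sc->wlocked);
	if (sc->proto == DEMO_PROTO_NONE)
		sc->dropped++;
	else if (sc->tearing_down)
		sc->raced++;	/* would touch state that detach is freeing */
}

/* Model of the detach path, which drops the lock around the drain step. */
static void
demo_detach(struct demo_softc *sc, bool reset_first)
{
	demo_wlock(sc);
	if (reset_first)
		sc->proto = DEMO_PROTO_NONE;	/* ordering from the patch */
	sc->tearing_down = true;
	demo_wunlock(sc);		/* lock dropped so the drain may sleep */

	demo_input(sc);			/* a packet arrives in the window */

	demo_wlock(sc);
	sc->proto = DEMO_PROTO_NONE;	/* the old ordering resets only here */
	sc->tearing_down = false;
	demo_wunlock(sc);
}

/* Returns how many packets reached protocol state during teardown. */
static int
demo_run(bool reset_first)
{
	struct demo_softc sc = { .proto = DEMO_PROTO_LACP };

	demo_detach(&sc, reset_first);
	return (sc.raced);
}
```

Calling demo_run(false) models the old ordering, where the simulated packet gets past the protocol check during the unlocked window; demo_run(true) models the patched ordering, where the same packet is dropped. The model says nothing about whether other paths can reach the protocol state, so it supports the reordering only as far as the sc_proto check goes.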