From owner-svn-src-all@FreeBSD.ORG Mon Feb 9 19:28:13 2015 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 3FF184B6; Mon, 9 Feb 2015 19:28:13 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 217FC171; Mon, 9 Feb 2015 19:28:13 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.9/8.14.9) with ESMTP id t19JSC3b066300; Mon, 9 Feb 2015 19:28:12 GMT (envelope-from rrs@FreeBSD.org) Received: (from rrs@localhost) by svn.freebsd.org (8.14.9/8.14.9/Submit) id t19JSC5P066293; Mon, 9 Feb 2015 19:28:12 GMT (envelope-from rrs@FreeBSD.org) Message-Id: <201502091928.t19JSC5P066293@svn.freebsd.org> X-Authentication-Warning: svn.freebsd.org: rrs set sender to rrs@FreeBSD.org using -f From: Randall Stewart Date: Mon, 9 Feb 2015 19:28:12 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r278472 - in head/sys: netinet netinet6 X-SVN-Group: head MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 09 Feb 2015 19:28:13 -0000 Author: rrs Date: Mon Feb 9 19:28:11 2015 New Revision: 278472 URL: https://svnweb.freebsd.org/changeset/base/278472 Log: This fixes a bug in the way that the LLE timers for nd6 and arp were being used. They basically would pass in the mutex to the callout_init. Because they used this method to the callout system, it was possible to "stop" the callout. When flushing the table and you stopped the running callout, the callout_stop code would return 1 indicating that it was going to stop the callout (that was about to run on the callout_wheel blocked by the function calling the stop). Now when 1 was returned, it would lower the reference count one extra time for the stopped timer, then a few lines later delete the memory. Of course the callout_wheel was stuck in the lock code and would then crash since it was accessing freed memory. By using callout_init(c, 1) we always get a 0 back and the reference counting bug does not rear its head. We do have to make a few adjustments to the callouts themselves though to make sure it does the proper thing if rescheduled as well as gets the lock. Commented upon by hiren and sbruno See Phabricator D1777 for more details. Commented upon by hiren and sbruno Reviewed by: adrian, jhb and bz Sponsored by: Netflix Inc. Modified: head/sys/netinet/if_ether.c head/sys/netinet/in.c head/sys/netinet6/in6.c head/sys/netinet6/nd6.c Modified: head/sys/netinet/if_ether.c ============================================================================== --- head/sys/netinet/if_ether.c Mon Feb 9 19:21:54 2015 (r278471) +++ head/sys/netinet/if_ether.c Mon Feb 9 19:28:11 2015 (r278472) @@ -166,10 +166,28 @@ arptimer(void *arg) struct ifnet *ifp; if (lle->la_flags & LLE_STATIC) { - LLE_WUNLOCK(lle); return; } - + LLE_WLOCK(lle); + if (callout_pending(&lle->la_timer)) { + /* + * Here we are a bit odd here in the treatment of + * active/pending. If the pending bit is set, it got + * rescheduled before I ran. The active + * bit we ignore, since if it was stopped + * in ll_tablefree() and was currently running + * it would have return 0 so the code would + * not have deleted it since the callout could + * not be stopped so we want to go through + * with the delete here now. If the callout + * was restarted, the pending bit will be back on and + * we just want to bail since the callout_reset would + * return 1 and our reference would have been removed + * by arpresolve() below. + */ + LLE_WUNLOCK(lle); + return; + } ifp = lle->lle_tbl->llt_ifp; CURVNET_SET(ifp->if_vnet); Modified: head/sys/netinet/in.c ============================================================================== --- head/sys/netinet/in.c Mon Feb 9 19:21:54 2015 (r278471) +++ head/sys/netinet/in.c Mon Feb 9 19:28:11 2015 (r278472) @@ -962,8 +962,7 @@ in_lltable_new(const struct sockaddr *l3 lle->base.lle_refcnt = 1; lle->base.lle_free = in_lltable_free; LLE_LOCK_INIT(&lle->base); - callout_init_rw(&lle->base.la_timer, &lle->base.lle_lock, - CALLOUT_RETURNUNLOCKED); + callout_init(&lle->base.la_timer, 1); return (&lle->base); } Modified: head/sys/netinet6/in6.c ============================================================================== --- head/sys/netinet6/in6.c Mon Feb 9 19:21:54 2015 (r278471) +++ head/sys/netinet6/in6.c Mon Feb 9 19:28:11 2015 (r278472) @@ -2047,8 +2047,7 @@ in6_lltable_new(const struct sockaddr *l lle->base.lle_refcnt = 1; lle->base.lle_free = in6_lltable_free; LLE_LOCK_INIT(&lle->base); - callout_init_rw(&lle->base.ln_timer_ch, &lle->base.lle_lock, - CALLOUT_RETURNUNLOCKED); + callout_init(&lle->base.ln_timer_ch, 1); return (&lle->base); } Modified: head/sys/netinet6/nd6.c ============================================================================== --- head/sys/netinet6/nd6.c Mon Feb 9 19:21:54 2015 (r278471) +++ head/sys/netinet6/nd6.c Mon Feb 9 19:28:11 2015 (r278472) @@ -473,9 +473,28 @@ nd6_llinfo_timer(void *arg) KASSERT(arg != NULL, ("%s: arg NULL", __func__)); ln = (struct llentry *)arg; - LLE_WLOCK_ASSERT(ln); + LLE_WLOCK(ln); + if (callout_pending(&ln->la_timer)) { + /* + * Here we are a bit odd here in the treatment of + * active/pending. If the pending bit is set, it got + * rescheduled before I ran. The active + * bit we ignore, since if it was stopped + * in ll_tablefree() and was currently running + * it would have return 0 so the code would + * not have deleted it since the callout could + * not be stopped so we want to go through + * with the delete here now. If the callout + * was restarted, the pending bit will be back on and + * we just want to bail since the callout_reset would + * return 1 and our reference would have been removed + * by nd6_llinfo_settimer_locked above since canceled + * would have been 1. + */ + LLE_WUNLOCK(ln); + return; + } ifp = ln->lle_tbl->llt_ifp; - CURVNET_SET(ifp->if_vnet); if (ln->ln_ntick > 0) {