Date: Wed, 4 Feb 2015 22:26:57 +0000
To: freebsd-net@freebsd.org
From: "rrs (Randall Stewart)"
Subject: [Differential] [Commented On] D1777: Associated fix for arp/nd6 timer usage.

rrs added a comment.

I don't think this is a refcnt issue, bz. The root of this is a hole in the way the callout code works. Basically, there is a window when:

a) The callout wheel is executing and sees that a "lock" has been configured on the callout, so it releases the callout wheel lock and then goes to take the lock the callout was init'd with.

b) At that moment some other CPU holds that lock (the one that was init'd on the callout) and runs a callout_stop (not drain). This stops the callout from running (which it can do): it sets a flag on the callout and returns to the caller. The caller (lle in this case) proceeds to drop its ref cnt, since the callout was stopped and so won't be run. In the end it frees the memory.

c) Now the callout wheel from a) resumes and dereferences the lock, which lives in the memory that was just freed.

This window is not avoidable with the way the current callout code is architected. It can only be avoided by the caller taking the lock itself, not the callout system. That way the callout system won't dereference the lock and blow up when it hits deleted memory.

There may be other ways to fix this, but I don't know how we can change the callout system to handle it.
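
The window in a) to c) can be replayed in a small userland model. This is only a sketch, not kernel code: every name in it (sim_lock, sim_entry, sim_callout, sim_stop, sim_race) is hypothetical, freeing is modeled with a flag instead of a real free(), and the rule that a stop fails once a lockless callout has been dispatched is an assumption about the intended semantics rather than something taken from the callout source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userland stand-ins; none of these are kernel types. */
struct sim_lock { bool held; };

struct sim_entry {              /* plays the role of the lle */
    struct sim_lock lock;       /* lock embedded in the entry */
    int refcnt;                 /* one reference is held for the pending callout */
    bool freed;                 /* set instead of really calling free() */
};

struct sim_callout {
    struct sim_lock *lock;      /* lock handed to the callout at init, or NULL */
    bool dispatched;            /* wheel picked it up and dropped the wheel lock */
    bool cancelled;             /* flag set by a successful stop */
};

/* Model of a non-draining stop. Returns true if the callout will not run. */
static bool sim_stop(struct sim_callout *c)
{
    if (c->dispatched && c->lock == NULL)
        return false;           /* assumed: handler is already on its way */
    c->cancelled = true;        /* wheel still has to take c->lock, so cancel */
    return true;
}

/* Replays steps a), b), c); returns true if the lock is touched after free. */
static bool sim_race(bool callout_owns_lock)
{
    struct sim_entry e = { .lock = { false }, .refcnt = 1, .freed = false };
    struct sim_callout c = { callout_owns_lock ? &e.lock : NULL, false, false };

    /* a) the wheel picks the callout up and drops the wheel lock */
    c.dispatched = true;

    /* b) another CPU holds the entry lock and stops the callout */
    e.lock.held = true;
    if (sim_stop(&c))
        e.refcnt--;             /* callout "won't run", so drop its reference */
    e.lock.held = false;
    if (e.refcnt == 0)
        e.freed = true;         /* the memory is purged */

    /* c) the wheel (or, with no callout lock, the handler) takes the lock */
    bool touched_freed = e.freed;
    if (!touched_freed) {
        e.lock.held = true;     /* safe: the callout's reference is still held */
        e.refcnt--;             /* handler is done with the entry */
        e.lock.held = false;
    }
    assert(e.refcnt >= 0);
    return touched_freed;
}
```

In this model sim_race(true) reports a touch of freed memory, while sim_race(false), where the handler takes the lock itself, does not, because the failed stop leaves the callout's reference in place.
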
Even Hans's rewrite has this same problem if you use callout_stop and not callout_drain*.

REVISION DETAIL
  https://reviews.freebsd.org/D1777

To: rrs, jhb, imp, sbruno, gnn, rwatson, lstewart, kostikbel, adrian, bz
Cc: bz, emaste, hiren, julian, hselasky, freebsd-net