From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 204340] [panic] nfsd, em, msix, fatal trap 9
Date: Mon, 16 Nov 2015 00:42:04 +0000
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 10.2-RELEASE
X-Bugzilla-Keywords: IntelNetworking

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204340

Rick Macklem changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|New                         |In Progress
                 CC|                            |rmacklem@FreeBSD.org
           Assignee|freebsd-bugs@FreeBSD.org    |rmacklem@FreeBSD.org

--- Comment #2 from Rick Macklem ---
Created attachment 163160
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=163160&action=edit
patch that might fix this problem

I think this crash might have been caused by a race between
svcpool_destroy() and the socket upcall.

The code in svcpool_destroy() assumes that SVC_RELEASE(xprt) drops the
reference count to 0, so that SVC_DESTROY() is called.
--> SVC_DESTROY() shuts down the socket upcall.
--> If the reference count doesn't go to 0, svcpool_destroy() will
    mtx_destroy() the mutexes prematurely.

I am not sure, but the race might have been introduced by r267228. Prior
to that revision there was a single mutex for the pool, held while all
xprts were unregistered. After r267228 there is a group of mutexes, and
the code only holds one at a time, so I think an xprt might get
re-registered on another group after that group has had all of its xprts
de-registered.

The attached little patch moves the mtx_lock() calls to a separate loop
before the xprt_unregister loops, so that all locks are held while all
xprts are de-registered.

I've added mav@ to the cc list, since he might be the guy who actually
understands this.

Anyhow, if you could test the attached patch with MSI interrupts
re-enabled and see if the crashes go away, that would be great.
(I don't think that this indicates that the em(4) driver is broken. I
suspect that it just affects the timing of the interrupts, which then
tripped over this race.)
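
The locking change described in the comment can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions, not the attached patch and not the kernel code: every name here (`demo_mtx`, `demo_group`, `demo_pool`, `demo_try_register`, `demo_run`, `NGROUPS`, `NXPRT`) is hypothetical, the "mutex" is a plain flag, and the racing socket upcall is simulated by an inline call after each group is swept. It compares sweeping the groups with one lock held at a time against taking every group lock in a separate loop first.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NGROUPS 4   /* hypothetical number of thread groups in the pool */
#define NXPRT   3   /* hypothetical number of transports per group */

/* Toy stand-in for a kernel mutex; the whole model is single-threaded. */
struct demo_mtx { bool held; };

struct demo_group {
    struct demo_mtx lock;
    bool registered[NXPRT];
};

struct demo_pool {
    struct demo_group groups[NGROUPS];
};

static void demo_lock(struct demo_mtx *m)   { assert(!m->held); m->held = true; }
static void demo_unlock(struct demo_mtx *m) { assert(m->held);  m->held = false; }

/* Models an upcall re-registering a transport; it needs the group lock. */
static bool demo_try_register(struct demo_pool *p, int g, int i)
{
    struct demo_mtx *m = &p->groups[g].lock;

    if (m->held)
        return false;   /* a real thread would block here instead */
    demo_lock(m);
    p->groups[g].registered[i] = true;
    demo_unlock(m);
    return true;
}

/* Returns how many transports are still registered after the teardown sweep. */
int demo_run(bool hold_all)
{
    struct demo_pool p;
    int g, i, left = 0;

    memset(&p, 0, sizeof(p));
    for (g = 0; g < NGROUPS; g++)
        for (i = 0; i < NXPRT; i++)
            p.groups[g].registered[i] = true;

    /* Patched ordering in this model: take every group lock up front. */
    if (hold_all)
        for (g = 0; g < NGROUPS; g++)
            demo_lock(&p.groups[g].lock);

    for (g = 0; g < NGROUPS; g++) {
        if (!hold_all)
            demo_lock(&p.groups[g].lock);
        memset(p.groups[g].registered, 0, sizeof(p.groups[g].registered));
        if (!hold_all)
            demo_unlock(&p.groups[g].lock);
        /* Simulated upcall racing with teardown; group 0 is already swept. */
        (void)demo_try_register(&p, 0, 0);
    }

    for (g = 0; g < NGROUPS; g++)
        for (i = 0; i < NXPRT; i++)
            left += p.groups[g].registered[i] ? 1 : 0;

    if (hold_all)
        for (g = 0; g < NGROUPS; g++)
            demo_unlock(&p.groups[g].lock);
    return left;
}
```

In this model, `demo_run(false)` leaves one transport registered on a group that was already swept, which corresponds to the case where the reference count would not reach 0. `demo_run(true)` leaves none, because the simulated upcall cannot acquire any group lock during the sweep. The model says nothing about what happens once the locks are dropped again.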