From owner-freebsd-net@freebsd.org Tue Nov 17 01:29:31 2015
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
From: bugzilla-noreply@freebsd.org
To: freebsd-net@FreeBSD.org
Subject: [Bug 204340] [panic] nfsd, em, msix, fatal trap 9
Date: Tue, 17 Nov 2015 01:29:31 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 10.2-RELEASE
X-Bugzilla-Keywords: IntelNetworking, crash
X-Bugzilla-Severity: Affects Only Me
X-Bugzilla-Who: rmacklem@FreeBSD.org
X-Bugzilla-Status: In Progress
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: rmacklem@FreeBSD.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: mfc-stable9? mfc-stable10?
X-Bugzilla-Changed-Fields: attachments.isobsolete attachments.created
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD
X-List-Received-Date: Tue, 17 Nov 2015 01:29:32 -0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204340

Rick Macklem changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #163160|0                           |1
        is obsolete|                            |

--- Comment #4 from Rick Macklem ---
Created attachment 163217
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=163217&action=edit
Patch to fix sg_threadcount++ so the mutex is held when done

I took a closer look at svc.c. The race I suspect can only occur if
svc_run() doesn't wait for all threads to terminate. I can also now see
that the first patch wouldn't have fixed anything, so I'm not surprised
it didn't work.

The only thing I can see that is broken in the code and might allow
svc_run() to return before all threads have terminated is:
- The thread count sg_threadcount is incremented when the sg_lock mutex
  isn't held.
  --> This could conceivably result in a corrupted sg_threadcount, which
      would allow svc_run() to return before all threads have terminated.

This second patch fixes the code so that sg_threadcount++ is always done
with the sg_lock mutex held. If you can test this one instead of the last
one, that would be appreciated.

I do know this patch fixes the above problem, but I don't know if this is
the cause of your crashes.
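
To illustrate the locking rule described above, here is a minimal userland
sketch using pthreads. It is not the attached patch and not the svc.c code:
struct group, worker(), start_worker() and run_group() are hypothetical names,
and the lock and count fields only stand in for sg_lock and sg_threadcount.
The sketch assumes the count is bumped before the thread is created. Every
change to the count happens with the lock held, so the waiter cannot see a
count of zero while workers are still running.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a service thread group; not the kernel struct. */
struct group {
    pthread_mutex_t lock;        /* plays the role of sg_lock */
    pthread_cond_t  cv;          /* signalled when the last worker leaves */
    int             threadcount; /* plays the role of sg_threadcount */
    int             finished;    /* workers that ran to completion */
};

static struct group grp = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0
};

static void *worker(void *arg)
{
    struct group *g = arg;

    pthread_mutex_lock(&g->lock);
    g->finished++;
    /* Decrement under the lock; wake the waiter when the count hits zero. */
    if (--g->threadcount == 0)
        pthread_cond_signal(&g->cv);
    pthread_mutex_unlock(&g->lock);
    return NULL;
}

static int start_worker(struct group *g)
{
    pthread_t td;
    int error;

    /* Count the new thread with the lock held, before it can run. */
    pthread_mutex_lock(&g->lock);
    g->threadcount++;
    pthread_mutex_unlock(&g->lock);

    error = pthread_create(&td, NULL, worker, g);
    if (error != 0) {
        /* The thread never started, so undo the count under the lock. */
        pthread_mutex_lock(&g->lock);
        if (--g->threadcount == 0)
            pthread_cond_signal(&g->cv);
        pthread_mutex_unlock(&g->lock);
        return error;
    }
    pthread_detach(td);
    return 0;
}

/* Start nthreads workers and return only after all of them have finished. */
int run_group(int nthreads)
{
    int i, finished;

    pthread_mutex_lock(&grp.lock);
    grp.finished = 0;
    pthread_mutex_unlock(&grp.lock);

    for (i = 0; i < nthreads; i++)
        (void)start_worker(&grp);

    pthread_mutex_lock(&grp.lock);
    while (grp.threadcount > 0)
        pthread_cond_wait(&grp.cv, &grp.lock);
    assert(grp.threadcount == 0);
    finished = grp.finished;
    pthread_mutex_unlock(&grp.lock);
    return finished;
}
```

If the increment in start_worker() were done without the lock, it could race
with a decrement in worker() and lose an update, which is the kind of
corrupted count that would let the waiter return early.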

Also, this bug seems to have existed in the code forever and all that
r267228 did was switch from not holding the pool mutex to not holding the
group mutex. So Alexander, you are off the hook, I think. ;-)
I've left you on the cc, since you probably know this code better than
anyone else and might have some insight w.r.t. this crash.

-- 
You are receiving this mail because:
You are on the CC list for the bug.