Date: Mon, 28 May 2007 16:20:23 +0000 (UTC)
From: Tor Egge
To: marcus@marcuscom.com
Cc: attilio@freebsd.org, jroberson@chesapeake.net, current@freebsd.org
Subject: Re: Panic on -CURRENT after LDT changes
Message-Id: <20070528.162023.41711345.Tor.Egge@cvsup.no.freebsd.org>
In-Reply-To: <1180140483.94117.24.camel@shumai.marcuscom.com>
References: <1180138048.94117.17.camel@shumai.marcuscom.com>
	<465780A3.8040603@FreeBSD.org>
	<1180140483.94117.24.camel@shumai.marcuscom.com>

> > Could you please try this better approach on a VANILLA kernel and say if
> > it still works for you:
> > http://users.gufi.org/~rookie/works/patches/schedlock/ldt2.diff
>
> Still works, no crash.

I got similar crashes (page fault in i386_ldt_grow) and tried this patch.

During testing, my development machine repeatedly got the following panic:

  spin lock 0xa0ae4378 (descriptor tables) held by 0xadf77360 (tid 100161) too long
  exclusive spin mutex descriptor tables r = 0 (0xa0ae4378) locked @ i386/i386/sys_machdep.c:414
  panic: spin lock held too long
  cpuid = 0

This looked like a lock leak, with user_ldt_free() as the suspect, since
it initially appeared to be able to return with dt_lock still held.  But
that path seems to be impossible, since the callers first check that
mdp->md_ldt is non-NULL.

During the hunt for the real reason, I found that unsharing of the user
LDT in cpu_fork() seems broken, since the call to user_ldt_free() frees
the newly allocated user LDT.

Finally, I found that i386_ldt_grow() called smp_rendezvous() without
temporarily unlocking dt_lock.  That caused a deadlock.  Adding a
temporary unlock of dt_lock seems to solve the problem for me.

smp_rendezvous_action() fails to make a local copy of
smp_rv_teardown_func before bumping smp_rv_waiters[1], thus the other
CPUs might end up calling the teardown function for the next rendezvous
instead of the teardown function for the current rendezvous.

- Tor Egge
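
The smp_rv_teardown_func point above can be illustrated with a small userland sketch. This is not the kernel code: it is a single-threaded simulation in which every name (rv_teardown_func, rv_waiters_exit, start_next_rendezvous(), simulate(), and so on) is a hypothetical stand-in, and the race window is forced open by calling start_next_rendezvous() directly after the counter bump. It only shows why the pointer should be copied before the bump.

```c
#include <stddef.h>

typedef void (*rv_func_t)(void *);

/* Hypothetical stand-ins for the shared rendezvous state. */
static rv_func_t rv_teardown_func;  /* models smp_rv_teardown_func */
static void *rv_func_arg;           /* models the shared function argument */
static int rv_waiters_exit;         /* models smp_rv_waiters[1] */

static char last_teardown;          /* records which teardown actually ran */

static void teardown_current(void *arg) { (void)arg; last_teardown = 'A'; }
static void teardown_next(void *arg)    { (void)arg; last_teardown = 'B'; }

/*
 * Simulates the initiating CPU: once the exit count is complete it may
 * reuse the shared variables for the next rendezvous.
 */
static void start_next_rendezvous(void)
{
    rv_teardown_func = teardown_next;
}

/* Racy order: the shared pointer is read after the counter bump. */
static void rv_exit_racy(void)
{
    rv_waiters_exit++;
    start_next_rendezvous();        /* race window, forced open here */
    if (rv_teardown_func != NULL)
        rv_teardown_func(rv_func_arg);
}

/* Fixed order: local copies are taken before the counter bump. */
static void rv_exit_fixed(void)
{
    rv_func_t local_teardown = rv_teardown_func;
    void *local_arg = rv_func_arg;

    rv_waiters_exit++;
    start_next_rendezvous();        /* same window, now harmless */
    if (local_teardown != NULL)
        local_teardown(local_arg);
}

/* Runs one simulated exit phase and reports which teardown ran. */
char simulate(int use_fixed)
{
    rv_teardown_func = teardown_current;
    rv_func_arg = NULL;
    rv_waiters_exit = 0;
    last_teardown = '?';
    if (use_fixed)
        rv_exit_fixed();
    else
        rv_exit_racy();
    return last_teardown;
}
```

In this simulation, simulate(0) ends up running the teardown that belongs to the next rendezvous, while simulate(1) runs the one for the current rendezvous. A real fix would also need the appropriate atomic operations and memory barriers, which this sketch leaves out.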