Date: Sun, 27 Jun 1999 01:15:43 -0700 (PDT)
From: Matthew Dillon
Message-Id: <199906270815.BAA10773@apollo.backplane.com>
To: Peter Wemm
Cc: current@FreeBSD.ORG, mckusick@mckusick.com
Subject: Re: BUF_LOCK() related panic..
References: <19990627075755.901EA82@overcee.netplex.com.au>

Ah, yes, some of us were just discussing this in a small mailing list.
Hopefully Kirk will pick up on it soon. Ah well.. someone else gets to
bear the brunt of it for a change :-). Kirk doesn't have an SMP box so
he didn't see the bug.

I have tentatively tracked the problem down to the apparent inability of
lockmgr() locks to function from interrupts, even when used in a
non-blocking manner, due to the simplelocks it uses internally. The new
buffer cache code Kirk committed switched from B_BUSY (manually
implemented locks) to lockmgr() locks.

I think what is going on is that mainline code is getting a simplelock
and then an interrupt is coming along and also trying to get the same
lock, but I can't be sure because my DDB backtraces are somewhat munged.

I was hoping someone would come up with a quick hack to solve the
problem, but barring that I do have a big nasty patch on my site

    http://www.backplane.com/FreeBSD4/

which replaces the use of lockmgr() locks in the buffer cache code with
the new SMP qlocks I began working on yesterday - but I wasn't intending
on submitting it for commit for a while -- even the SMP guys haven't
seen it yet!

This patch is currently only suitable for compiling a new i386 -CURRENT
kernel; it will break buildworld and it will also break alpha builds.
And it has not been well tested yet... it is running a whole lot of
brand new untested code and I'm amazed that it works as well as it
does :-)

If you do not want to get that involved, you can turn off SMP on the
system and it should boot ok. Also be sure to completely recompile your
modules (/usr/src/sys/modules), and when cvs updating be sure to update
/usr/src/contrib/sys as well as /usr/src/sys/ if you are using
softupdates. The size of struct buf has changed radically.

                                        -Matt
                                        Matthew Dillon

:I pressed reset, even break-to-debugger didn't work..
:
:To provoke it, do this once or twice:
:
:root:[3:21pm]/etc-104# passwd root
:Changing local password for root.
:New password:
:Retype new password:
:passwd: updating the database...
:Read from remote host beast: Operation timed out
:Connection to beast closed.
:
:I'll look into this more shortly...
:
:Cheers,
:-Peter
:--
:Peter Wemm - peter@FreeBSD.org; peter@yahoo-inc.com; peter@netplex.com.au