From: Peter Schuller
To: freebsd-current@freebsd.org
Cc: Pawel Jakub Dawidek, current@freebsd.org
Date: Mon, 21 Jan 2008 19:22:20 +0100
Subject: Re: (ZFS?): panic: lockmgr: locking against myself
Message-Id: <200801211922.29463.peter.schuller@infidyne.com>
In-Reply-To: <200801011857.57757.peter.schuller@infidyne.com>
References: <200707282028.37102.peter.schuller@infidyne.com>
 <200707310126.06923.peter.schuller@infidyne.com>
 <200801011857.57757.peter.schuller@infidyne.com>

> I *think* I just experienced the same problem on 7.0-BETA3, except the
> kernel does not have WITNESS/INVARIANTS so I just get a hack instead of a
> panic. I wanted to post with the information I have for completeness; I
> realize what follows is a bunch of anecdotal mumbo-jumbo.

So I can now confirm this problem on 7.0-RC1 on the machine where I
originally saw this happen.

If I could trigger this in a debuggable environment I would try to get
much more interesting information, but as this was during time-limited
access to the machine in a noisy colocation facility, with people waiting
on the machine to come back up, I was not in a position to do very much.
Instead I will again, as an added data point, provide an approximate
timeline below. As previously observed, it seems to be triggered by changes
in the availability of disks and/or zpool configuration, with cold reboots
somehow mitigating the problem.

Note that the drives are likely to have been moved around a bit logically
(but not physically), due to the level of indirection and drive number
allocation resulting from the single-disk raid0 virtual devices on the
hardware RAID controller.

Timeline:

* Machine running 7-CURRENT from the September/October era.
* One disk in a three-way zfs mirror (tank, on which the root fs is) gets
  kicked out.
* For probably unrelated reasons, the machine crashes with a kmem_alloc
  error (this was the first time ever on this machine). I don't have
  details; this was observed by colocation personnel.
* Machine is rebooted and panics as described in this thread.
* I arrive on-site and reboot again just for kicks. Same problem.
* I physically remove the broken disk and replace it with the new one, and
  add the virtual disk in the RAID controller BIOS (recap: this is a Dell
  2950).
* Now it boots again.
* I zpool replace tank label/tank3 label/tank3r1 (after various
  disklabel/glabel action).
* make installkernel (7.0-RC1)
* Reboot with resilvering/replacement still in progress.
* Panic on boot.
* Tried a cold reboot (turn off, turn on) -> it now boots again without a
  panic.
* make installworld.
* At this point I no longer remember whether it booted again or whether I
  had to do another cold reboot.
* Machine has not been rebooted again since resilvering completed.

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller'
Key retrieval: Send an E-Mail to getpgpkey@scode.org
E-Mail: peter.schuller@infidyne.com Web: http://www.scode.org