From owner-freebsd-stable@FreeBSD.ORG Wed Apr 8 15:32:38 2009
Return-Path:
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 14F5F10656BE;
    Wed, 8 Apr 2009 15:32:38 +0000 (UTC)
    (envelope-from marck@rinet.ru)
Received: from woozle.rinet.ru (woozle.rinet.ru [195.54.192.68])
    by mx1.freebsd.org (Postfix) with ESMTP id 77D318FC1B;
    Wed, 8 Apr 2009 15:32:36 +0000 (UTC)
    (envelope-from marck@rinet.ru)
Received: from localhost (localhost [127.0.0.1])
    by woozle.rinet.ru (8.14.3/8.14.3) with ESMTP id n38FWZhp013627;
    Wed, 8 Apr 2009 19:32:35 +0400 (MSD)
    (envelope-from marck@rinet.ru)
Date: Wed, 8 Apr 2009 19:32:35 +0400 (MSD)
From: Dmitry Morozovsky
To: Pawel Jakub Dawidek
In-Reply-To:
Message-ID:
References: <20090407101324.GA1473@garage.freebsd.pl>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
X-NCC-RegID: ru.rinet
X-OpenPGP-Key-ID: 6B691B03
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.0.1
    (woozle.rinet.ru [0.0.0.0]); Wed, 08 Apr 2009 19:32:36 +0400 (MSD)
Cc: freebsd-stable@freebsd.org
Subject: Re: RELENG_7/i386: ZFS constant panic on file system writes
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 08 Apr 2009 15:32:38 -0000

On Wed, 8 Apr 2009, Dmitry Morozovsky wrote:

DM> PJD> > DM> could you please help me a bit with *very* unpleasant situation: one of my
DM> PJD> > DM> servers with very large ZFS reboots on most write requests to one (largest,
DM> PJD> > DM> which effectively prohibits recreating) ZFS file system with
DM> PJD> > DM>
DM> PJD> > DM> panic: avl_find() succeeded inside avl_add()
DM> PJD> >
DM> PJD> > Is there a way I can clear the directory in question? Even the latest -current
DM> PJD> > panics when I try to access the directory containing this file.
DM> PJD>
DM> PJD> Could you try running 'zpool scrub' on this pool? Nothing better comes
DM> PJD> to my mind, it looks like some kind of internal inconsistency and
DM> PJD> hopefully scrub will be able to find it. Could you also show 'zpool status'
DM> PJD> output?
DM>
DM> zpool status is showing everything ok:
DM>
DM> marck@moose:~> zpool status
DM>   pool: m
DM>  state: ONLINE
DM>  scrub: none requested
DM> config:
DM>
DM>         NAME        STATE     READ WRITE CKSUM
DM>         m           ONLINE       0     0     0
DM>           raidz1    ONLINE       0     0     0
DM>             ad4h    ONLINE       0     0     0
DM>             ad6h    ONLINE       0     0     0
DM>             ad8h    ONLINE       0     0     0
DM>             ad10h   ONLINE       0     0     0
DM>             ad12h   ONLINE       0     0     0
DM>
DM> errors: No known data errors
DM>
DM> will try scrub, thank you!

Unfortunately, it does not help:

 scrub: scrub completed with 0 errors on Wed Apr 8 19:04:51 2009

and then

root@moose:~# ls -la /ar/nfstat/nfc/.bad/200807
total 9089
drwxr-xr-x  3 rscript  wheel        4 Nov  5 21:01 ./
d---------  3 root     wheel        3 Apr  7 14:29 ../
drwxr-xr-x  2 rscript  wheel       36 Apr  2 22:12 daily/
-rw-r--r--  1 rscript  wheel  9207828 Aug  1  2008 total.200807
root@moose:~# ls -la /ar/nfstat/nfc/.bad/200807/daily/
panic: avl_find() succeeded inside avl_add()
cpuid = 2
[-- marck@localhost detached -- Wed Apr 8 19:28:13 2009]
[-- marck@localhost attached -- Wed Apr 8 19:28:15 2009]
[halt sent]
KDB: enter: Line break on console
[thread pid 153 tid 100152 ]
Stopped at      kdb_enter_why+0x3a:     movl    $0,kdb_why
db> reboot
cpu_reset: Restarting BSP
cpu_reset_proxy: Stopped CPU 1

I can set up an account for you to serial console for this server, if it can
help...

--
Sincerely,
D.Marck                                     [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer:                                 marck@FreeBSD.org ]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------