From owner-freebsd-stable@FreeBSD.ORG  Wed Apr  8 15:32:38 2009
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 14F5F10656BE;
	Wed,  8 Apr 2009 15:32:38 +0000 (UTC) (envelope-from marck@rinet.ru)
Received: from woozle.rinet.ru (woozle.rinet.ru [195.54.192.68])
	by mx1.freebsd.org (Postfix) with ESMTP id 77D318FC1B;
	Wed,  8 Apr 2009 15:32:36 +0000 (UTC) (envelope-from marck@rinet.ru)
Received: from localhost (localhost [127.0.0.1])
	by woozle.rinet.ru (8.14.3/8.14.3) with ESMTP id n38FWZhp013627;
	Wed, 8 Apr 2009 19:32:35 +0400 (MSD) (envelope-from marck@rinet.ru)
Date: Wed, 8 Apr 2009 19:32:35 +0400 (MSD)
From: Dmitry Morozovsky <marck@rinet.ru>
To: Pawel Jakub Dawidek <pjd@freebsd.org>
In-Reply-To: <alpine.BSF.2.00.0904081116070.81716@woozle.rinet.ru>
Message-ID: <alpine.BSF.2.00.0904081928410.81716@woozle.rinet.ru>
References: <alpine.BSF.2.00.0904030028520.30283@woozle.rinet.ru>
	<alpine.BSF.2.00.0904071358430.70511@woozle.rinet.ru>
	<20090407101324.GA1473@garage.freebsd.pl>
	<alpine.BSF.2.00.0904081116070.81716@woozle.rinet.ru>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
X-NCC-RegID: ru.rinet
X-OpenPGP-Key-ID: 6B691B03
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.0.1
	(woozle.rinet.ru [0.0.0.0]); Wed, 08 Apr 2009 19:32:36 +0400 (MSD)
Cc: freebsd-stable@freebsd.org
Subject: Re: RELENG_7/i386: ZFS constant panic on file system writes
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 08 Apr 2009 15:32:38 -0000

On Wed, 8 Apr 2009, Dmitry Morozovsky wrote:

DM> PJD> > DM> could you please help me a bit with *very* unpleasant situation: one of my 
DM> PJD> > DM> servers with very large ZFS reboots on most write requests to one (largest, 
DM> PJD> > DM> which effectively prohibits recreating) ZFS file system with
DM> PJD> > DM> 
DM> PJD> > DM> panic: avl_find() succeeded inside avl_add()
DM> PJD> > 
DM> PJD> > Is there a way I can clear the directory in question? Even the latest -current 
DM> PJD> > panics when I try to access the directory containing this file.
DM> PJD> 
DM> PJD> Could you try running 'zpool scrub' on this pool? Nothing better comes
DM> PJD> to my mind, it looks like some kind of internal inconsistency and
DM> PJD> hopefully scrub will be able to find it. Could you also show 'zpool status'
DM> PJD> output?
DM> 
DM> zpool status is showing everything ok:
DM> 
DM> marck@moose:~> zpool status
DM>   pool: m
DM>  state: ONLINE
DM>  scrub: none requested
DM> config:
DM> 
DM> 	NAME        STATE     READ WRITE CKSUM
DM> 	m           ONLINE       0     0     0
DM> 	  raidz1    ONLINE       0     0     0
DM> 	    ad4h    ONLINE       0     0     0
DM> 	    ad6h    ONLINE       0     0     0
DM> 	    ad8h    ONLINE       0     0     0
DM> 	    ad10h   ONLINE       0     0     0
DM> 	    ad12h   ONLINE       0     0     0
DM> 
DM> errors: No known data errors
DM> 
DM> will try scrub, thank you!

Unfortunately, the scrub did not help; it completed without finding anything:

 scrub: scrub completed with 0 errors on Wed Apr  8 19:04:51 2009

and then listing the affected directory still panics the machine:

root@moose:~# ls -la /ar/nfstat/nfc/.bad/200807
total 9089
drwxr-xr-x  3 rscript  wheel        4 Nov  5 21:01 ./
d---------  3 root     wheel        3 Apr  7 14:29 ../
drwxr-xr-x  2 rscript  wheel       36 Apr  2 22:12 daily/
-rw-r--r--  1 rscript  wheel  9207828 Aug  1  2008 total.200807
root@moose:~# ls -la /ar/nfstat/nfc/.bad/200807/daily/


panic: avl_find() succeeded inside avl_add()
cpuid = 2
[-- marck@localhost detached -- Wed Apr  8 19:28:13 2009]
[-- marck@localhost attached -- Wed Apr  8 19:28:15 2009]
[halt sent]
KDB: enter: Line break on console
[thread pid 153 tid 100152 ]
Stopped at      kdb_enter_why+0x3a:     movl    $0,kdb_why
db> reboot
cpu_reset: Restarting BSP
cpu_reset_proxy: Stopped CPU 1


I can set up an account for you with serial console access to this server, if
that would help...

-- 
Sincerely,
D.Marck                                     [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer:                                 marck@FreeBSD.org ]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------