From: Henri Hennebert
Organization: RestartSoft
Date: Sun, 06 Jan 2008 17:47:29 +0100
To: Kris Kennaway
Cc: freebsd-current@FreeBSD.org, Peter Schuller, Ivan Voras, Brooks Davis
Subject: Re: When will ZFS become stable?
Message-ID: <47810621.8080406@restart.be>
In-Reply-To: <4780FBE2.8040208@FreeBSD.org>

Kris Kennaway wrote:
> Henri Hennebert wrote:
>> Kris Kennaway wrote:
>>> Ivan Voras wrote:
>>>> On 06/01/2008, Peter Schuller wrote:
>>>>>> This number is not so large. It seems to be easily crashed by rsync,
>>>>>> for example (speaking from my own experience, and also some of my
>>>>>> colleagues).
>>>>> I can definitely say this is not *generally* true, as I do a lot of
>>>>> rsyncing/rdiff-backup:ing and similar stuff (with many files /
>>>>> large files) on ZFS without any stability issues. Problems for me
>>>>> have been limited to 32bit and the memory exhaustion issue rather
>>>>> than "hard" issues.
>>>>
>>>> It's not generally true since kmem problems with rsync are often hard
>>>> to repeat - I have them on one machine, but not on another, similar
>>>> machine. This nonrepeatability is also a part of the problem.
>>>>
>>>>> But perhaps that's all you are referring to.
>>>>
>>>> Mostly. I did have a ZFS crash with rsync that wasn't kmem related,
>>>> but only once.
>>>
>>> kmem problems are just tuning. They are not indicative of stability
>>> problems in ZFS.
>>> Please report any further non-kmem panics you
>>> experience.
>>
>> I encounter 2 times a deadlock during high I/O activity (the last one
>> during rsync + rm -r on a 5GB hierarchy (openoffice-2/work).
>>
>> I was running with this patch:
>> http://people.freebsd.org/~pjd/patches/zgd_done.patch
>> db> show allpcpu
>> Current CPU: 1
>>
>> cpuid = 0
>> curthread = 0xa5ebe440: pid 3422 "txg_thread_enter"
>> curpcb = 0xeb175d90
>> fpcurthread = none
>> idlethread = 0xa5529aa0: pid 12 "idle: cpu0"
>> APIC ID = 0
>> currentldt = 0x50
>>
>> cpuid = 1
>> curthread = 0xa56ab220: pid 47 "arc_reclaim_thread"
>> curpcb = 0xe6837d90
>> fpcurthread = none
>> idlethread = 0xa5529880: pid 11 "idle: cpu1"
>> APIC ID = 1
>> currentldt = 0x50
>>
>> With the 2 times arc_reclaim_thread `running`
>
> Backtraces of the affected processes (or just alltrace) are usually
> required to proceed with debugging,

Noted for next time.

> and lock status is also often vital
> (show alllocks, requires witness).

I am adding it to my kernel config.

> Also, in the case when threads are
> actually running (not deadlocked), then it is often useful to repeatedly
> break/continue and sample many backtraces to try and determine where the
> threads are looping.

I did this after the second deadlock: arc_reclaim_thread was always
there and the second CPU was idle.

Henri

>
> Kris
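
P.S. For the "break/continue and sample many backtraces" suggestion, here is a rough sketch of how the captured samples could be tallied afterwards to see which functions keep showing up. It is only an illustration: it assumes each sample was saved as text and that frames look like "name(args) at name+0xoffset", and the names `frames`, `hot_frames` and the `foo_*` functions are made up for the example, not taken from the traces in this thread.

```python
import re
from collections import Counter

# Assumed frame format: "name(args) at name+0xoffset".
# Adjust the pattern if the captured traces look different.
FRAME_RE = re.compile(r"\bat\s+([A-Za-z_]\w*)\+0x[0-9a-fA-F]+")


def frames(trace_text):
    """Return the function names found in one captured backtrace."""
    return FRAME_RE.findall(trace_text)


def hot_frames(samples):
    """Count in how many samples each function appears (once per sample)."""
    counts = Counter()
    for text in samples:
        # A set keeps recursion within one sample from inflating the count.
        counts.update(set(frames(text)))
    return counts.most_common()


if __name__ == "__main__":
    # Hypothetical samples, not taken from the machine discussed here.
    samples = [
        "foo_scan(a5,1) at foo_scan+0x2b\nfoo_loop(0) at foo_loop+0x90\n",
        "foo_wait(a5) at foo_wait+0x11\nfoo_loop(0) at foo_loop+0x94\n",
        "foo_scan(a5,2) at foo_scan+0x40\nfoo_loop(0) at foo_loop+0x90\n",
    ]
    for name, n in hot_frames(samples):
        print(f"{n}/{len(samples)}  {name}")
```

A function that appears in nearly every sample is a reasonable first place to look for the loop; one that appears only occasionally is more likely a callee that is entered and left again.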