From: Lorenzo Perone <lopez.on.the.lists@yellowspace.net>
To: freebsd-fs@freebsd.org
Subject: Re: ZFS lockup in "zfs" state
Date: Tue, 3 Jun 2008 14:45:26 +0200
In-Reply-To: <48446C42.4070208@mawer.org>
Message-Id: <38DAE942-319A-4A44-A8F6-491D4269A8E7@yellowspace.net>

Hello,

just to add one more voice to the issue: I'm experiencing the lockups
with zfs too.

Environment: development test machine, amd64, 3 GHz AMD, 2 GB RAM,
running FreeBSD/amd64 7.0-STABLE #8, Sat Apr 26 10:10:53 CEST 2008,
with one 400 GB SATA disk devoted completely to a zpool (no RAID of
any kind). This disk holds 5 filesystems which get rsynced on a daily
basis from various other development hosts. Some of the filesystems
are NFS-exported.

/boot/loader.conf contains:

    vm.kmem_size=900M
    vm.kmem_size_max=900M
    vfs.zfs.arc_max=300M
    vfs.zfs.prefetch_disable=1

The disk itself has no known hardware problems.

A script controlled by cron makes a daily or weekly snapshot of the
filesystems (at 2:30 AM). Before that, a "housekeeping" script checks
the available space and, if it drops below a certain threshold,
destroys older snapshots (at 1:30 AM). The rsyncs to the pool all
happen a few hours later (4:30 AM).

I've seen lockups periodically, where I could do nothing but
hard-reboot the machine to get it unstuck. Other filesystems were
still usable, but any process trying to access the zpool would hang.

The very first hang came about 3 months after 7.0-BETA4, which is when
I first set up the pools. I then csupped and rebuilt world and kernel
periodically, the last time at the end of April. After that I got the
lockups more often, that is, after at most 2 weeks. I've noticed that
since I lowered the threshold of the "housekeeping" script, it hasn't
locked up for about 3 weeks. That seems to point at a problem with
zfs destroy fs@snapshot - or at something else my script does, so
here's a link to it:

http://lorenzo.yellowspace.net/zfs_housekeeping.sh.txt

I haven't seen any adX timeouts or other suspicious console messages
so far.
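In case it helps, here is roughly what the housekeeping part boils
down to. This is only a simplified sketch, not the actual script (see
the link above for that); the pool name "tank", the 10 GB threshold
and the reliance on "zfs get -p" for byte-exact values are just
examples for illustration:

    #!/bin/sh
    # Simplified sketch of the housekeeping logic: if free space in the
    # pool drops below a threshold, destroy snapshots oldest-first until
    # enough space is available again. Pool name and threshold are
    # placeholders, not the real values.

    POOL=tank
    MIN_FREE_GB=10

    min_free=$((MIN_FREE_GB * 1024 * 1024 * 1024))

    # Free space in the pool, in bytes (zfs get -p gives exact numbers).
    avail=$(zfs get -Hp -o value available "$POOL")

    if [ "$avail" -lt "$min_free" ]; then
        # Snapshots sorted by creation time, oldest first.
        zfs list -H -t snapshot -o name -s creation -r "$POOL" |
        while read snap; do
            echo "destroying $snap"
            zfs destroy "$snap"
            avail=$(zfs get -Hp -o value available "$POOL")
            [ "$avail" -ge "$min_free" ] && break
        done
    fi

The lockups seem to correlate with how often this zfs destroy loop
actually has work to do, which is why I suspect snapshot destruction
in particular.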
If there is anything I can provide to help nail down the zfs problems,
please let me know and I'll do my best...

Thanx to everyone working on this great OS and on this cute
file/volsystem :)

Regards,

Lorenzo

On 02.06.2008, at 23:55, Antony Mawer wrote:

> Jeremy Chadwick wrote:
>> On Mon, Jun 02, 2008 at 04:04:12PM +1000, Andrew Hill wrote:
> ...
>>> unfortunately i couldn't get a backtrace or core dump for 'political'
>>> reasons (the system was required for use by others) but i'll see if i
>>> can get a panic happening after-hours to get some more info...
>>
>> I can't tell you what to do or how to do your job, but honestly you
>> should be pulling this system out of production and replacing it with a
>> different one, or a different implementation, or a different OS. Your
>> users/employees are probably getting ticked off at the crashes, and it
>> probably irritates you too. The added benefit is that you could get
>> Scott access to the box.
>
> It's a home fileserver rather than a production "work" system, so
> the challenge is finding another system with an equivalent amount of
> storage.. :-) As one knows these things are often hard enough to
> procure out of a company budget, let alone out of ones own pocket!
>
> --Antony
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"