From: Lorenzo Perone <lopez.on.the.lists@yellowspace.net>
To: freebsd-fs@freebsd.org
Subject: Re: ZFS lockup in "zfs" state
Date: Tue, 3 Jun 2008 14:45:26 +0200
In-Reply-To: <48446C42.4070208@mawer.org>
Message-Id: <38DAE942-319A-4A44-A8F6-491D4269A8E7@yellowspace.net>

Hello,

just to add one more voice to the issue: I'm experiencing the lockups
with zfs too.

Environment: development test machine, amd64, 3 GHz AMD, 2 GB RAM,
running FreeBSD/amd64 7.0-STABLE #8, Sat Apr 26 10:10:53 CEST 2008,
with one 400 GB SATA disk devoted completely to a zpool (no RAID of
any kind). This disk holds 5 filesystems which get rsynced on a daily
basis from various other development hosts. Some of the filesystems
are NFS-exported.

/boot/loader.conf contains:

    vm.kmem_size=900M
    vm.kmem_size_max=900M
    vfs.zfs.arc_max=300M
    vfs.zfs.prefetch_disable=1

The disk itself has no known hardware problems.

A script controlled by cron makes a daily or weekly snapshot of the
filesystems (at 2:30 AM). Before that, a "housekeeping" script checks
the available space and, if it drops below a certain threshold,
destroys older snapshots (at 1:30 AM). The rsyncs to the pool all
happen a few hours later (4:30 AM).

I've seen lockups periodically, where I could do nothing but
hard-reboot the machine to get it unstuck. Other filesystems were
still usable, but any process trying to access the zpool would hang.

The very first hang came about 3 months after 7.0-BETA4, which is when
I first set up the pools. I then csupped and rebuilt world and kernel
periodically, the last time at the end of April. After that I got the
lockups more often, that is, after at most 2 weeks. I've noticed that
since I lowered the threshold of the "housekeeping" script, it hasn't
locked up for about 3 weeks. That seems to point at a problem with
zfs destroy fs@snapshot - or at something else my script does, so
here's a link to it:

http://lorenzo.yellowspace.net/zfs_housekeeping.sh.txt

I haven't seen any adX timeouts or other suspicious console messages
so far.
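In case it helps, here is roughly what the housekeeping part boils
down to. This is only a simplified sketch, not the actual script (see
the link above for that); the pool name "tank", the 10 GB threshold
and the reliance on "zfs get -p" for byte-exact values are just
examples for illustration:

    #!/bin/sh
    # Simplified sketch of the housekeeping logic: if free space in the
    # pool drops below a threshold, destroy snapshots oldest-first until
    # enough space is available again. Pool name and threshold are
    # placeholders, not the real values.

    POOL=tank
    MIN_FREE_GB=10

    min_free=$((MIN_FREE_GB * 1024 * 1024 * 1024))

    # Free space in the pool, in bytes (zfs get -p gives exact numbers).
    avail=$(zfs get -Hp -o value available "$POOL")

    if [ "$avail" -lt "$min_free" ]; then
        # Snapshots sorted by creation time, oldest first.
        zfs list -H -t snapshot -o name -s creation -r "$POOL" |
        while read snap; do
            echo "destroying $snap"
            zfs destroy "$snap"
            avail=$(zfs get -Hp -o value available "$POOL")
            [ "$avail" -ge "$min_free" ] && break
        done
    fi

The lockups seem to correlate with how often this zfs destroy loop
actually has work to do, which is why I suspect snapshot destruction
in particular.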
If there is anything I can provide to help nail down the zfs problems,
please let me know and I'll do my best...

Thanx to everyone working on this great OS and on this cute
file/volsystem :)

Regards,

Lorenzo

On 02.06.2008, at 23:55, Antony Mawer wrote:

> Jeremy Chadwick wrote:
>> On Mon, Jun 02, 2008 at 04:04:12PM +1000, Andrew Hill wrote:
> ...
>>> unfortunately i couldn't get a backtrace or core dump for 'political'
>>> reasons (the system was required for use by others) but i'll see if i
>>> can get a panic happening after-hours to get some more info...
>>
>> I can't tell you what to do or how to do your job, but honestly you
>> should be pulling this system out of production and replacing it with a
>> different one, or a different implementation, or a different OS. Your
>> users/employees are probably getting ticked off at the crashes, and it
>> probably irritates you too. The added benefit is that you could get
>> Scott access to the box.
>
> It's a home fileserver rather than a production "work" system, so
> the challenge is finding another system with an equivalent amount of
> storage.. :-) As one knows these things are often hard enough to
> procure out of a company budget, let alone out of ones own pocket!
>
> --Antony
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"