From owner-freebsd-current@FreeBSD.ORG Tue Jan 11 21:38:53 2005
Date: Tue, 11 Jan 2005 16:38:41 -0500
From: Craig Reyenga
To: Matt Reimer
Cc: freebsd-current@freebsd.org
Subject: Re: Reproducible filesystem deadlock on RELENG_5

On Mon, Jan 10, 2005 at 03:47:18PM -0800, Matt Reimer wrote:
> On a UP machine (P4, 128M RAM) running RELENG_5 (as of Friday), I am
> seeing what looks like a hang or deadlock on a filesystem with 10
> snapshots. Our problems began when we ran out of disk space, resulting
> in a series of these log messages:
>
> kernel: pid 39 (bufdaemon), uid 0 inumber 7277783 on /backup: filesystem full
> kernel: initiate_write_filepage: already started
>
> So I tried to delete a snapshot to free up some space, but then the
> kernel began panicking. In my effort to work around the panic, I
> disabled softupdates. Then I came across the identical panic in a post
> by Kris Kennaway
> (http://lists.freebsd.org/pipermail/freebsd-current/2004-September/036946.html),
> which he fixed by increasing KSTACK_PAGES. After increasing it to 31,
> the kernel no longer panics, but filesystem access now seems to
> deadlock: if I so much as touch a file into existence on that
> partition, the touch command hangs in state 'wdrain', and other
> attempts to access that filesystem hang as well. This problem is 100%
> reproducible.
>
> How to proceed? Serial console access is available if someone wants to
> tackle it.
>
> Matt

[sniiip]

Hi,

I am seeing this too, and it would appear that I've been beaten to
sending a message about it. :)
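(An aside for anyone else chasing the panic Matt describes: KSTACK_PAGES
is a kernel config option, not a run-time sysctl, so bumping it means a
kernel rebuild. A rough, untested sketch follows; MYKERNEL stands in for
your own config name, and 31 is simply the value Matt used.

    # in e.g. /usr/src/sys/i386/conf/MYKERNEL
    options KSTACK_PAGES=31    # i386 default is 2, if I remember right

    # rebuild and install the kernel
    cd /usr/src
    make buildkernel installkernel KERNCONF=MYKERNEL

Reboot afterwards for the larger kernel stacks to take effect.)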
This is what my /var/log/kernel had to say after rebooting from a
(live|dead)lock:

Jan 11 00:23:08 burnout kernel: pid 44 (pagedaemon), uid 0 inumber 8298 on /var: filesystem full
Jan 11 00:23:08 burnout kernel: vnode_pager_putpages: I/O error 28
Jan 11 00:23:08 burnout kernel: vnode_pager_putpages: residual I/O 65536 at 21872
Jan 11 00:23:08 burnout kernel: pid 44 (pagedaemon), uid 0 inumber 8298 on /var: filesystem full
Jan 11 00:23:08 burnout kernel: vnode_pager_putpages: I/O error 28
Jan 11 00:23:08 burnout kernel: vnode_pager_putpages: residual I/O 65536 at 21872

Over and over and over. Of course, this log is on the /var FS itself.

FreeBSD burnout 5.3-RELEASE-p2 FreeBSD 5.3-RELEASE-p2 #1: Wed Jan 5 18:44:27 EST 2005 craig@burnout:/usr/obj/usr/src/sys/BURNOUT5 i386

I'm not sure what other info to paste; my vfs.* sysctls are all at
their defaults, except for vfs.usermount=1.

-Craig
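P.S. If anyone wants to poke at a wedged box: hitting ^T (SIGINFO) in
the hung terminal prints the wait channel, and the stuck processes can
be listed from another session. A rough sketch, nothing exotic:

    # the MWCHAN column in ps -l output shows the wait channel
    ps -axl | head -1
    ps -axl | grep '[w]drain'    # anything sleeping in wdrain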