From owner-freebsd-stable@FreeBSD.ORG Tue Jan 16 20:54:03 2007
Message-ID: <45AD3507.402@withagen.nl>
Date: Tue, 16 Jan 2007 21:26:47 +0100
From: Willem Jan Withagen
To: Doug Ambrisko
Cc: Scott Oertel, Willem Jan Withagen, freebsd-stable@freebsd.org, Kris Kennaway
Subject: Re: running mksnap_ffs
References: <200701161934.l0GJY1mh057095@ambrisko.com>
In-Reply-To: <200701161934.l0GJY1mh057095@ambrisko.com>
List-Id: Production branch of FreeBSD source code

Doug Ambrisko wrote:
> | > or things can get wedged. We have some other patches as well that might
> | > be required. As a hack on a local server we have been using snap shots
> | > to do a "hot" back-up of a data base each morning. This is based on
> | > 6.x.
> |
> | What do you mean by "get wedged"? Are you seeing a deadlock, and if
> | so then what are the details? When you say 6.x, do you mean
> | up-to-date RELENG_6? There were various snapshot deadlock fixes
> | committed over the past year including some in the past few months.
>
> The file-system would come to a stop, processes stuck on bio, snap-shots
> not finishing etc. This was caused by the system running out of usable
> buffers. The change forces them to be flushed every so often. This is
> independent of locking. 10 might be too aggressive. Some scaling of
> nbuf would probably be better.

When I run mksnap_ffs, it runs to the point where ANY access to the
filesystem locks up the accessing process. The only way to get the
file system back is a hard reboot; trying to do it the gentle way
locks up the whole system.

I'm deferring further testing until I have time to upgrade to
6.2-RELEASE and put some of the debug options in the kernel.

On the other hand, this is my main fileserver, so testing too much is
rather dangerous, and running fsck on 1.5T is very tedious.

--WjW