From owner-freebsd-stable@FreeBSD.ORG Mon Mar 7 17:10:50 2005
Delivered-To: freebsd-stable@freebsd.org
Received: by hub.freebsd.org (Postfix, from userid 758)
	id 180D416A4CF; Mon, 7 Mar 2005 17:10:50 +0000 (GMT)
Date: Mon, 7 Mar 2005 17:10:50 +0000
From: Kris Kennaway
To: Paul Mather
Message-ID: <20050307171049.GN22873@hub.freebsd.org>
References: <20050307151733.I4264@woozle.rinet.ru>
	<1110214682.63484.9.camel@zappa.Chelsea-Ct.Org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1110214682.63484.9.camel@zappa.Chelsea-Ct.Org>
User-Agent: Mutt/1.4.2.1i
cc: freebsd-stable@freebsd.org
cc: Dmitry Morozovsky
Subject: Re: RELENG_5, snapshots and disk lock time
List-Id: Production branch of FreeBSD source code

On Mon, Mar 07, 2005 at 11:58:02AM -0500, Paul Mather wrote:
> On Mon, 2005-03-07 at 15:21 +0300, Dmitry Morozovsky wrote:
> > Dear colleagues,
> >
> > dumping the snapshot of 140G ufs2 file system under contemporary RELENG_5 I
> > found that during mksnap_ffs file system is unresponsive even for reading for
> > more than 3 minutes (it's on modern SATA disk with 50+ MBps linear transfer).
> > Is it normal?
>
> Oddly enough, this happened to me last night on a RELENG_5 system.  In
> my case, things were so bad that mksnap_ffs appeared to wedge
> everything, meaning I'll have to make a trek in to where the machine is
> located and press the ol' reset button to get things going again. :-(

Yes, this is normal.  See the documentation about the snapshots
implementation (a README in the kernel source tree, I think, and a paper
written by Kirk).

> The machine in question makes and mounts snapshots of all its
> filesystems for backup each night via Tivoli TSM.  This has worked
> flawlessly for many months.
> Last night, I had many BitTorrent sessions
> active on the filesystem that wedged.  I guess the activity broke the
> snapshot mechanism. :-(  The odd thing is that it survived the night
> before, when there were also BitTorrent sessions active.

It's possible there are still deadlock conditions in the snapshot
code.  Some familiarity with DDB would help to diagnose this (see the
chapter on kernel debugging in the developers' handbook).  You'd need
to work with Kirk to debug these, if you're willing.

> I wonder how much activity mksnap_ffs can take?

I don't think this is the issue, directly.

Kris