Date: Thu, 10 Jan 2013 01:47:29 +0400 (MSK)
From: Dmitry Morozovsky
To: Konstantin Belousov
Cc: freebsd-fs@freebsd.org
Subject: Re: zfs -> ufs rsync: livelock in wdrain state
References: <20130108001231.GB82219@kib.kiev.ua>

On Tue, 8 Jan 2013, Dmitry Morozovsky wrote:

> > Are there any kernel messages about the disk system ?
> >
> > The wdrain means that the amount of the dirty buffers accumulated exceeds
> > the allowed maximum. The transient 'wdrain' state is normal on a machine
> > doing lot of writes to a filesystem using buffer cache, say UFS. Failure
> > to clean the dirty buffers is usually related to the disk i/o stalling.
> >
> > It cannot be denied that a bug could cause stuck 'wdrain' state, but
> > in the last five or so years all the cases I investigated were due to
> > disks.
>
> Yes, it seems so:
>
> root@moose:~# camcontrol devlist
> load: 0.03 cmd: camcontrol 49735 [devfs] 2.68r 0.00u 0.00s 0% 820k
>
> and then machine is in well known "hardly alive" state: TCP connects
> established, process switching does not go.
>
> Will investigate the hardware, thank you.

It seems a flaky eSATA cable was the reason the drive sometimes got lost.

Sorry for the noise.

-- 
Sincerely,
D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer: marck@FreeBSD.org ]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------