From owner-freebsd-fs@FreeBSD.ORG  Wed Jan  9 21:47:37 2013
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by hub.freebsd.org (Postfix) with ESMTP id 4ABE3411
 for <freebsd-fs@freebsd.org>; Wed,  9 Jan 2013 21:47:37 +0000 (UTC)
 (envelope-from marck@rinet.ru)
Received: from woozle.rinet.ru (woozle.rinet.ru [195.54.192.68])
 by mx1.freebsd.org (Postfix) with ESMTP id CCFE6FF0
 for <freebsd-fs@freebsd.org>; Wed,  9 Jan 2013 21:47:36 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1])
 by woozle.rinet.ru (8.14.5/8.14.5) with ESMTP id r09LlTHL000789;
 Thu, 10 Jan 2013 01:47:29 +0400 (MSK) (envelope-from marck@rinet.ru)
Date: Thu, 10 Jan 2013 01:47:29 +0400 (MSK)
From: Dmitry Morozovsky <marck@rinet.ru>
To: Konstantin Belousov <kostikbel@gmail.com>
Subject: Re: zfs -> ufs rsync: livelock in wdrain state
In-Reply-To: <alpine.BSF.2.00.1301081127340.7949@woozle.rinet.ru>
Message-ID: <alpine.BSF.2.00.1301100146320.99812@woozle.rinet.ru>
References: <alpine.BSF.2.00.1301080013520.7949@woozle.rinet.ru>
 <20130108001231.GB82219@kib.kiev.ua>
 <alpine.BSF.2.00.1301081127340.7949@woozle.rinet.ru>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)
X-NCC-RegID: ru.rinet
X-OpenPGP-Key-ID: 6B691B03
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7
 (woozle.rinet.ru [0.0.0.0]); Thu, 10 Jan 2013 01:47:29 +0400 (MSK)
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
 <mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 09 Jan 2013 21:47:37 -0000

On Tue, 8 Jan 2013, Dmitry Morozovsky wrote:

> > Are there any kernel messages about the disk system ?
> > 
> > The wdrain means that the amount of the dirty buffers accumulated exceeds
> > the allowed maximum. The transient 'wdrain' state is normal on a machine
> > doing a lot of writes to a filesystem using the buffer cache, say UFS. Failure
> > to clean the dirty buffers is usually related to the disk i/o stalling.
> > 
> > It cannot be denied that a bug could cause stuck 'wdrain' state, but
> > in the last five or so years all the cases I investigated were due to
> > disks.
> 
> Yes, it seems so:
> 
> root@moose:~# camcontrol devlist
> load: 0.03  cmd: camcontrol 49735 [devfs] 2.68r 0.00u 0.00s 0% 820k
> 
> and then the machine is in the well-known "hardly alive" state: TCP connections
> are established, but process switching does not happen.
> 
> Will investigate the hardware, thank you.

It seems a flaky eSATA cable was the culprit: the drive would sometimes get lost.

Sorry for the noise.

-- 
Sincerely,
D.Marck                                     [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer:                                 marck@FreeBSD.org ]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------