From owner-freebsd-fs@FreeBSD.ORG Tue Jan 8 00:12:41 2013
Date: Tue, 8 Jan 2013 02:12:31 +0200
From: Konstantin Belousov
To: Dmitry Morozovsky
Cc: freebsd-fs@freebsd.org
Subject: Re: zfs -> ufs rsync: livelock in wdrain state
Message-ID: <20130108001231.GB82219@kib.kiev.ua>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="gPVs24VLDFKgHP1I"
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)

--gPVs24VLDFKgHP1I
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Tue, Jan
08, 2013 at 12:19:15AM +0400, Dmitry Morozovsky wrote:
> Dear colleagues,
>
> I have archive server with pretty large ZFS (24*2T in single raidz2 raidgroup)
>
> Sometimes we moved really old archives to external SATA drives, which are
> formatted with UFS2/SU. Files are copied via rsync
>
> The system in question is stable/8; upgrade to stable/9 is planned, but not yet
> completed.
>
> Now, during last rsync, the process is stuck as
>
> dump.2012062219.bin.gz
> 3208015437 100% 102.42MB/s 0:00:29 (xfer#66, to-check=196/721)
> dump.2012062220.bin.gz
> load: 0.01 cmd: rsync 47543 [wdrain] 1904.69r 443.01u 241.12s 0% 1736k
> ^C
> rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at rsync.c(645)
> [sender=3.0.9]
>
> As we can see, rsync writer stops in wdrain state.
>
> I terminated it by ^C in terminal session, as it was not autogenerated
> backup.
>
> Now, zfs and other system is working seemingly well, but trying to sync
> manually stucks console forever:
>
> root@moose:/ar# sync
> load: 0.00 cmd: sync 67229 [wdrain] 468.17r 0.00u 0.00s 0% 596k
>
> Any hints? Quick searching throug freebsd mailing lists and/or open PRs does
> not reveal much.
>
Are there any kernel messages about the disk system?

The 'wdrain' state means that the amount of accumulated dirty buffers
exceeds the allowed maximum. A transient 'wdrain' state is normal on a
machine doing a lot of writes to a filesystem that uses the buffer cache,
such as UFS.

Failure to clean the dirty buffers is usually related to disk I/O
stalling. It cannot be denied that a bug could cause a stuck 'wdrain'
state, but in the last five or so years all the cases I investigated
were due to disks.
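To illustrate why a stalled disk turns the normally transient wait into a
permanent one, here is a toy model in Python. It is only a sketch and not
the kernel code: the class, the function, and the two limits (hi_limit,
lo_limit) are hypothetical names chosen for the example, and the numbers
are arbitrary. A writer may queue data until the pending amount exceeds the
upper limit, then it has to wait until completions bring the amount back
down to the lower limit.

```python
class WriteThrottle:
    """Toy model of a writer throttle. Hypothetical, not the FreeBSD code."""

    def __init__(self, hi_limit, lo_limit):
        self.hi_limit = hi_limit   # assumed cap on pending (not yet written) bytes
        self.lo_limit = lo_limit   # writers resume once pending drops to this level
        self.pending = 0           # bytes queued to the disk and not yet completed
        self.draining = False      # True while writers must wait (the "wdrain" analogue)

    def can_write(self):
        """Return True if a writer may queue more data right now."""
        if self.pending > self.hi_limit:
            self.draining = True
        elif self.pending <= self.lo_limit:
            self.draining = False
        return not self.draining

    def queue_write(self, nbytes):
        self.pending += nbytes

    def complete_write(self, nbytes):
        self.pending = max(0, self.pending - nbytes)


def run(disk_completes, steps=10, chunk=4):
    """Simulate `steps` ticks; return (ticks spent waiting, still waiting?)."""
    throttle = WriteThrottle(hi_limit=16, lo_limit=8)
    waited = 0
    for _ in range(steps):
        if throttle.can_write():
            # The writer produces data twice as fast as the disk drains it.
            throttle.queue_write(chunk * 2)
        else:
            waited += 1
        if disk_completes:
            throttle.complete_write(chunk)
    return waited, throttle.draining


if __name__ == "__main__":
    print("healthy disk:", run(disk_completes=True))
    print("stalled disk:", run(disk_completes=False))
```

With completions arriving, the writer waits for a few ticks and then
proceeds again, which corresponds to the transient case. With
disk_completes=False nothing ever drains, so every later tick is spent
waiting, much like the sync above that never returns.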

--gPVs24VLDFKgHP1I
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (FreeBSD)

iQIcBAEBAgAGBQJQ62RuAAoJEJDCuSvBvK1BQbAP/2bUyXPL/GfvgXG/GiaIWBZm
75vlOyeNlQ7+zAR+Z++BmQUCnNPCSAbzEDlmfJ4nxcCCFBG/2slDdcHUsMr6osu5
/20G9UaBRt+tvjhlXiIAU6JgIKyv3o/DDEVTd4RW1lJmVDlFPQVqD9EK4tq/HITf
BefQVznBHZHCyBs93YapOtghpJak81/nIMBTwLHe2lTuMTRaP1R8lhqK8TeputHr
FcC70CyBwPz1oJqyHVu1fOcqMUWXZOGn0rlYmtv236Ba8z7W5p8wiSw70o4JSrqJ
KN4rTzwtC8NsG7c/TaeAqzrMeSnvjBMwIC9SuoK1xhxUZxzCrZklrQEgaVeO2g6V
BH4+1yEZDUPdXBvS+7TKA2fHd8cGdGFnil4mkMY2xRt9zpOPg5rrNP0Ubc4/3C+d
wDj0LKPE/Uiq2LFlJQxg8cD8yyzoIb7T+4AuFqelGnwkvpgbbq7AQtXedY8afwBq
qdeW2Zb3l3qMsF/IUoa1UFtQNPK4hLfcOuATVTPGufyCOwLwNIq13EQwsTQaxJc5
v9l9cU4m3pUybqAGFfMYkM7/W2jd/v9dfMhN9P2pz8HP5UzyoWNfMNYaNaYmd5eZ
OeeHyOmPYpkMWlAK/ok+AIDV+qOxynqM532BzK85uk4BWM7Hi8yncT2wxer9N+NZ
t5O43VdHbtTQIut0ZWPs
=Urrw
-----END PGP SIGNATURE-----

--gPVs24VLDFKgHP1I--