Date:      Tue, 9 Mar 2010 13:43:58 +0100
From:      Stefan Bethke <stb@lassitu.de>
To:        Pawel Jakub Dawidek <pjd@FreeBSD.org>
Cc:        FreeBSD Stable <freebsd-stable@freebsd.org>
Subject:   Re: Many processes stuck in zfs
Message-ID:  <AD4C8E44-30AE-429D-BC10-B9567090324E@lassitu.de>
In-Reply-To: <20100309122954.GE3155@garage.freebsd.pl>
References:  <864468D4-DCE9-493B-9280-00E5FAB2A05C@lassitu.de> <20100309122954.GE3155@garage.freebsd.pl>

On 09.03.2010 at 13:29, Pawel Jakub Dawidek wrote:

> On Tue, Mar 09, 2010 at 10:15:53AM +0100, Stefan Bethke wrote:
>> Over the past couple of months, I've more or less regularly observed
>> machines having more and more processes stuck in the zfs wchan.  The
>> processes never recover from that, and trying to reboot only gets the
>> entire system stuck, without any console messages.  I can enter the
>> debugger, and I have saved a couple of dumps.
>>
>> The situation seems to be triggered by zfs receive'ing snapshots from
>> the sister machine (both synchronize their active ZFS filesystems to
>> each other, using zfs send and zfs receive).  It appears it's the
>> receiving causing trouble.
>>
>> Both machines run 8-stable from mid-February, with a single-disk ZFS
>> pool, with ARC limited to 512M, prefetch and ZIL disabled via
>> loader.conf.
>>
>> What should I be looking at to further diagnose?
>
> What kind of hardware do you have there? There is 3-way deadlock I've a
> fix for which would be hard to trigger on single or dual core machines.

FreeBSD lokschuppen.zs64.net 8.0-STABLE FreeBSD 8.0-STABLE #24: Sat Feb 13 11:20:03 UTC 2010     root@lokschuppen.zs64.net:/usr/obj/usr/src/sys/EISENBOOT  amd64
Copyright (c) 1992-2010 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.0-STABLE #24: Sat Feb 13 11:20:03 UTC 2010
    root@lokschuppen.zs64.net:/usr/obj/usr/src/sys/EISENBOOT amd64
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Core(TM)2 Duo CPU     E7300  @ 2.66GHz (2666.65-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x10676  Stepping = 6
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x8e39d<SSE3,DTES64,MON,DS_CPL,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1>
  AMD Features=0x20100800<SYSCALL,NX,LM>
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant
real memory  = 4294967296 (4096 MB)
avail memory = 4081422336 (3892 MB)


> Feel free to try the fix:
>
> 	http://people.freebsd.org/~pjd/patches/zfs_3way_deadlock.patch

I'll give it a shot on one of the two boxes.


Stefan

-- 
Stefan Bethke <stb@lassitu.de>   Fon +49 151 14070811





