Date: Tue, 9 Mar 2010 13:43:58 +0100
From: Stefan Bethke <stb@lassitu.de>
To: Pawel Jakub Dawidek <pjd@FreeBSD.org>
Cc: FreeBSD Stable <freebsd-stable@freebsd.org>
Subject: Re: Many processes stuck in zfs
Message-ID: <AD4C8E44-30AE-429D-BC10-B9567090324E@lassitu.de>
In-Reply-To: <20100309122954.GE3155@garage.freebsd.pl>
References: <864468D4-DCE9-493B-9280-00E5FAB2A05C@lassitu.de> <20100309122954.GE3155@garage.freebsd.pl>

On 09.03.2010 at 13:29, Pawel Jakub Dawidek wrote:

> On Tue, Mar 09, 2010 at 10:15:53AM +0100, Stefan Bethke wrote:
>> Over the past couple of months, I've more or less regularly observed
>> machines having more and more processes stuck in the zfs wchan. The
>> processes never recover from that, and trying to reboot only gets the
>> entire system stuck, without any console messages. I can enter the
>> debugger, and I have saved a couple of dumps.
>>
>> The situation seems to be triggered by zfs receive'ing snapshots from
>> the sister machine (both synchronize their active ZFS filesystems to
>> each other, using zfs send and zfs receive). It appears it's the
>> receiving causing trouble.
>>
>> Both machines run 8-stable from mid-February, with a single-disk ZFS
>> pool, with ARC limited to 512M, prefetch and ZIL disabled via
>> loader.conf.
>>
>> What should I be looking at to further diagnose?
>
> What kind of hardware do you have there? There is 3-way deadlock I've a
> fix for which would be hard to trigger on single or dual core machines.

FreeBSD lokschuppen.zs64.net 8.0-STABLE FreeBSD 8.0-STABLE #24: Sat Feb 13 11:20:03 UTC 2010 root@lokschuppen.zs64.net:/usr/obj/usr/src/sys/EISENBOOT amd64

Copyright (c) 1992-2010 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.0-STABLE #24: Sat Feb 13 11:20:03 UTC 2010
    root@lokschuppen.zs64.net:/usr/obj/usr/src/sys/EISENBOOT amd64
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Core(TM)2 Duo CPU E7300 @ 2.66GHz (2666.65-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x10676  Stepping = 6
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x8e39d<SSE3,DTES64,MON,DS_CPL,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1>
  AMD Features=0x20100800<SYSCALL,NX,LM>
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant
real memory  = 4294967296 (4096 MB)
avail memory = 4081422336 (3892 MB)

> Feel free to try the fix:
>
> http://people.freebsd.org/~pjd/patches/zfs_3way_deadlock.patch

I'll give it a shot on one of the two boxes.


Stefan

-- 
Stefan Bethke <stb@lassitu.de>   Fon +49 151 14070811