From: Kai Gallasch
Date: Thu, 7 Oct 2010 02:15:33 +0200
To: freebsd-fs@freebsd.org
Subject: Re: Locked up processes after upgrade to ZFS v15

On 06.10.2010 at 14:28, Kai Gallasch wrote:

> Hi.
>
> Two days ago I upgraded my server to 8.1-STABLE (amd64) and upgraded ZFS from v14 to v15.
> After zpool & zfs upgrade the server was running stable for about half a day, but then apache processes running inside jails would lock up and could not be terminated any more.
>
> In the end apache (both worker and prefork) itself locked up, because it lost control of its child processes.

Sorry for replying to my own mail, but there is some new information on this issue: 'zfs send' triggered a panic:

MCA: Bank 0, Status 0xf600000000010015
MCA: Global Cap 0x0000000000000106, Status 0x0000000000000004
MCA: Vendor "AuthenticAMD", ID 0x100f23, APIC ID 2
MCA: CPU 2 UNCOR PCC OVER DTLB L1 error
MCA: Address 0xff80d4611000

Fatal trap 28: machine check trap while in kernel mode
cpuid = 2; apic id = 02
instruction pointer     = 0x20:0xffffffff80e60f25
stack pointer           = 0x28:0xffffff832a2e17d0
frame pointer           = 0x28:0xffffff832a2e1a40
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, IOPL = 0
current process         = 0 (zio_write_issue_0)
[thread pid 0 tid 101159 ]
Stopped at      lzjb_compress+0x165:    addq    $0x1,%rdx
db> bt
Tracing pid 0 tid 101159 td 0xffffff00aa64a3e0
lzjb_compress() at lzjb_compress+0x165
zio_compress_data() at zio_compress_data+0xbe
zio_write_bp_init() at zio_write_bp_init+0xc2
zio_execute() at zio_execute+0x77
zio_ready() at zio_ready+0x162
zio_execute() at zio_execute+0x77
taskq_run_safe() at taskq_run_safe+0x13
taskqueue_run() at taskqueue_run+0x91
taskqueue_thread_loop() at taskqueue_thread_loop+0x3f
fork_exit() at fork_exit+0x12a
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff832a2e1d30, rbp = 0 ---

I know this one well: "CPU 2 UNCOR PCC OVER DTLB L1 error". In the past this particular server had problems with FreeBSD 8.0-REL when "super pages" were enabled. The workaround was to set vm.pmap.pg_ps_enabled="0" in /boot/loader.conf.

Later on, with 8.0-STABLE, setting the tunable was no longer necessary, because a workaround for this was committed to src/sys.

So, just to test this, I have again set vm.pmap.pg_ps_enabled="0" and will see whether processes still lock up.

Regards,
Kai.
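
For reference, the MCA summary line in the panic above can be reproduced by hand from the raw bank status word. The following is only a minimal sketch in Python, assuming the architectural x86 machine-check status layout (flag bits 57 to 63, and the TLB error code format 0000 0000 0001 TTLL in the low 16 bits). The function and table names are made up for illustration and are not taken from the FreeBSD kernel source.

```python
# Decode an x86 machine-check bank status word (MCi_STATUS).
# Bit positions follow the architectural MCA layout; all names here
# are illustrative and do not come from the FreeBSD source tree.

STATUS_FLAGS = [
    (63, "VAL"),    # register holds a valid error record
    (62, "OVER"),   # an earlier error record was overwritten
    (61, "UNCOR"),  # error was not corrected by hardware
    (60, "EN"),     # reporting was enabled for this error
    (59, "MISCV"),  # companion MISC register is valid
    (58, "ADDRV"),  # companion ADDR register is valid
    (57, "PCC"),    # processor context may be corrupt
]

TLB_KINDS = {0: "ITLB", 1: "DTLB", 2: "GTLB"}    # TT field, 3 is reserved
CACHE_LEVELS = {0: "L0", 1: "L1", 2: "L2", 3: "LG"}  # LL field


def decode_flags(status):
    """Return the names of all flag bits set in the status word."""
    return [name for bit, name in STATUS_FLAGS if (status >> bit) & 1]


def decode_tlb_error(status):
    """Decode the low 16 bits if they match the TLB error format.

    Returns a short description, or None for any other error class.
    """
    code = status & 0xFFFF
    if (code & 0xFFF0) != 0x0010:  # TLB errors: 0000 0000 0001 TTLL
        return None
    tt = (code >> 2) & 0x3
    ll = code & 0x3
    return f"{TLB_KINDS.get(tt, 'TLB?')} {CACHE_LEVELS[ll]} error"


if __name__ == "__main__":
    status = 0xF600000000010015  # "Bank 0, Status" from the panic above
    print(" ".join(decode_flags(status)), "/", decode_tlb_error(status))
```

With the status value from the panic, the set flags include UNCOR, PCC and OVER, and the low word decodes to a DTLB L1 error, which agrees with the "CPU 2 UNCOR PCC OVER DTLB L1 error" line. ADDRV is also set, which is consistent with the kernel printing an "MCA: Address" line.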