From: Andriy Gapon
Date: Thu, 19 Jan 2012 22:10:20 +0200
To: Martin Ranne
Cc: "freebsd-fs@freebsd.org"
Subject: Re: zpool import reboots computer
Message-ID: <4F1878AC.6060704@FreeBSD.org>
In-Reply-To: <39C592E81AEC0B418EAD826FC1BBB09B25253F@mailgate>
References: <39C592E81AEC0B418EAD826FC1BBB09B25031D@mailgate>
 <4F18459F.7040309@FreeBSD.org>
 <39C592E81AEC0B418EAD826FC1BBB09B252444@mailgate>
 <4F1858FE.7020509@FreeBSD.org>
 <39C592E81AEC0B418EAD826FC1BBB09B25253F@mailgate>

on 19/01/2012 21:58 Martin Ranne said the following:
> On 2012-01-19 18:55, Andriy Gapon wrote:
> on 19/01/2012 19:36 Martin Ranne said the following:
> On 2012-01-19 17:32, Andriy Gapon wrote:
> on 19/01/2012 17:36 Martin Ranne said the following:
>>>>> I had a failure in one server where I am trying to determine if it is
>>>>> memory or cpu. It shows up as memory failure in memtest86. The result
>>>>> is that it managed to damage the zpool which is a raidz2 with 6 disks.
>
>>>>> If I boot from a FreeBSD 9.0-RELEASE usb stick and import it with
>>>>> zpool import -f -R /mnt/zroot zroot it will reboot the computer. I have
>>>>> also tried to import it in another computer which is running 9-STABLE
>>>>> with the same result. On the second computer I used
>>>>> zpool import -f -R /mnt/zroot "zpool-id" serv06zroot
>
>>>>> Can I get some help on how to be able to debug this and in the end be
>>>>> able to import it to repair it.
>
>>>>> Data for the second computer can be found attached. The disks in
>>>>> question are da0 to da5 in this.
>
>>>> And the panic message is?
>
>>> I am trying to get a crash dump but it hangs when dumping.
>
>> Alternatives:
>> - serial console
>> - digital camera
>> - eyes plus pen and paper
>
> Finally here it is. Is there anything I can do in the debugger to make it
> possible to find what is crashing in there?
>
> Fatal trap 12: page fault while in kernel mode
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; cpuid = 2; apic id = 00
> apic id = 02
> fault virtual address = 0x88
> fault virtual address = 0x38
> fault code = supervisor read data, page not present
> fault code = supervisor read data, page not present
> instruction pointer = 0x20:0xffffffff814a7ef5
> instruction pointer = 0x20:0xffffffff814872a1
> stack pointer = 0x28:0xffffff8c10252ad0
> stack pointer = 0x28:0xffffff8c0d564f00
> frame pointer = 0x28:0xffffff8c10252b40
> frame pointer = 0x28:0xffffff8c0d564f30
> code segment = base 0x0, limit 0xfffff, type 0x1b
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = processor eflags = interrupt enabled, interrupt enabled, resume, resume, IOPL = 0
> IOPL = 0
> current process = current process = 2659 (zpool)
> 0 [ thread pid 2659 tid 100592 ]

Hmm, two traps running almost perfectly in parallel...

> stopped at zio_vdev_child_io+0x25: cmpq $0,0x88(%r10)
> db>

At least the 'bt' command.
It could be that the panic is caused by a corrupted vdev label, but I'm not
sure...

-- 
Andriy Gapon