Date: Sat, 21 Jan 2012 16:20:05 +0200
From: Andriy Gapon
To: Martin Ranne
Cc: "freebsd-fs@freebsd.org"
Subject: Re: zpool import reboots computer
Message-ID: <4F1AC995.7050506@FreeBSD.org>
In-Reply-To: <39C592E81AEC0B418EAD826FC1BBB09B25284B@mailgate>

on 20/01/2012 11:09 Martin Ranne said the following:
> I tried again to get into the debugger. It will not always work as it freezes before i get to the prompt most of the times but here it is. Any other commands to run in the debugger to get better information to help solve this?
>
> I used the command zpool import -F -f -o readonly=on -R /mnt/serv06 zroot
>
> Result is the following
> Fatal trap 12: page fault while in kernel mode
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; cpuid = 5; apic id = 00
> apic id = 05
> fault virtual address = 0x38
> fault virtual address = 0x88
> fault code = supervisor read data, page not present
> fault code = supervisor read data, page not present
> instruction pointer = 0x20:0xffffffff814872a1
> instruction pointer = 0x20:0xffffffff814a7ef5
> stack pointer = 0x28:0xffffff8c0d564f00
> stack pointer = 0x28:0xffffff8c0ffd7ad0
> frame pointer = 0x28:0xffffff8c0d564f30
> frame pointer = 0x28:0xffffff8c0ffd7b40
> code segment = base 0x0, limit 0xfffff, type 0x1b
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = processor eflags = interrupt enabled, interrupt enabled, resume, resume, IOPL = 0
> IOPL = 0
> current process = current process = 0 (system_task1_3)
> 26[ thread pid 0 tid 100099 ]
> Stopped at vdev_is_dead+0x1: cmpq $0x5,0x28(%rdi)
> db> bt
> Tracing pid 0 tid 100099 td 0xfffffe000e546460
> vdev_is_dead() at vdev_is_dead+0x1
> vdev_mirror_child_select() at vdev_mirror_child_select+0x67
> vdev_mirror_io_start() at vdev_mirror_io_start+0x24c
> zio_vdev_io_start() at zio_vdev_io_start+0x232
> zio_execute() at zio_execute+0xc3
> zio_gang_assemble() at zio_gang_assemble+0x1b
> zio_execute() at zio_execute+0xc3
> arc_read_nolock() at arc_read_nolock+0x6d1
> arc_read() at arc_read+0x93
> traverse_prefetcher() at traverse_prefetcher+0x103
> traverse_visitbp() at traverse_visitbp+0x21c
> traverse_dnode() at traverse_dnode+0x7c
> traverse_visitbp() at traverse_visitbp+0x3ff
> traverse_visitbp() at traverse_visitbp+0x316
> traverse_visitbp() at traverse_visitbp+0x316
> traverse_visitbp() at traverse_visitbp+0x316
> traverse_visitbp() at traverse_visitbp+0x316
> traverse_visitbp() at traverse_visitbp+0x316
> traverse_visitbp() at traverse_visitbp+0x316
> traverse_dnode() at traverse_dnode+0x7c
> traverse_visitbp() at traverse_visitbp+0x48c
> traverse_prefetch_thread() at traverse_prefetch_thread+0x78
> taskq_run() at taskq_run+0x13
> taskqueue_run_locked() at taskqueue_run_locked+0x85
> taskqueue_thread_loop() at taskqueue_thread_loop+0x46
> fork_exit() at fork_exit+0x11f
> fork_trampoline() at fork_trampoline+0xe
> --- trap 0, rip = 0, rsp = 0xffffff8c0d565d00, rbp = 0 ---
> db>

To me it looks like in the vdev_mirror_child_select function mc->mc_vd
could be NULL although the code doesn't expect it.
You can add some code to the function to check if the hypothesis is
correct and to skip a loop if mc->mc_vd is NULL.
Such a hack is probably not needed in general, but given that your pool
could be corrupted, this could be your chance to get access to it.

BTW, restoring from backups is what is usually recommended first in a
situation like this.

-- 
Andriy Gapon
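
For illustration only, here is a minimal userland sketch of the kind of NULL check described above. It is not the actual ZFS source: the struct layouts, the MODEL_* state constants, and the model_* function names are hypothetical stand-ins, and the real vdev_mirror_child_select() has more logic than is modelled here. The idea is simply to report a child whose mc_vd is NULL and to skip it instead of handing the pointer to the "is dead" test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins; NOT the real ZFS definitions. */
enum model_state {
	MODEL_STATE_FAULTED = 5,
	MODEL_STATE_DEGRADED = 6,
	MODEL_STATE_HEALTHY = 7
};

typedef struct vdev {
	int vdev_state;
} vdev_t;

typedef struct mirror_child {
	vdev_t *mc_vd;      /* may unexpectedly be NULL on a damaged pool */
	int     mc_skipped; /* set when this child was passed over */
} mirror_child_t;

/* Models the check that faulted in the trace: dereferences vd. */
static int
model_vdev_is_dead(const vdev_t *vd)
{
	return (vd->vdev_state < MODEL_STATE_DEGRADED);
}

/*
 * Return the index of the first usable child, or -1 if there is none.
 * A child with a NULL mc_vd is logged and skipped rather than dereferenced.
 */
static int
model_child_select(mirror_child_t *mc, int children)
{
	assert(mc != NULL && children >= 0);

	for (int c = 0; c < children; c++) {
		if (mc[c].mc_vd == NULL) {
			/* Confirms the hypothesis if this ever prints. */
			fprintf(stderr, "child %d: mc_vd is NULL, skipping\n", c);
			mc[c].mc_skipped = 1;
			continue;
		}
		if (model_vdev_is_dead(mc[c].mc_vd)) {
			mc[c].mc_skipped = 1;
			continue;
		}
		return (c);
	}
	return (-1);
}
```

If the message is printed during the import, the hypothesis holds. Whether skipping the child is enough to make the pool readable is a separate question that this sketch cannot answer.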