Date:      Sat, 21 Jan 2012 16:20:05 +0200
From:      Andriy Gapon <avg@FreeBSD.org>
To:        Martin Ranne <martin.ranne@kockumsonics.com>
Cc:        "freebsd-fs@freebsd.org" <freebsd-fs@FreeBSD.org>
Subject:   Re: zpool import reboots computer
Message-ID:  <4F1AC995.7050506@FreeBSD.org>
In-Reply-To: <39C592E81AEC0B418EAD826FC1BBB09B25284B@mailgate>
References:  <39C592E81AEC0B418EAD826FC1BBB09B25031D@mailgate> <4F18459F.7040309@FreeBSD.org> <39C592E81AEC0B418EAD826FC1BBB09B252444@mailgate> <4F1858FE.7020509@FreeBSD.org> <39C592E81AEC0B418EAD826FC1BBB09B25253F@mailgate> <4F1878AC.6060704@FreeBSD.org> <39C592E81AEC0B418EAD826FC1BBB09B25284B@mailgate>

on 20/01/2012 11:09 Martin Ranne said the following:
> I tried again to get into the debugger. It does not always work, as it freezes before I get to the prompt most of the time, but here it is. Are there any other commands I should run in the debugger to get better information to help solve this?
> 
> I used the command zpool import -F -f -o readonly=on -R /mnt/serv06 zroot
> 
> Result is the following
> Fatal trap 12: page fault while in kernel mode
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; cpuid = 5; apic id = 00
> apic id = 05
> fault virtual address   = 0x38
> fault virtual address   = 0x88
> fault code              = supervisor read data, page not present
> fault code              = supervisor read data, page not present
> instruction pointer     = 0x20:0xffffffff814872a1
> instruction pointer     = 0x20:0xffffffff814a7ef5
> stack pointer           = 0x28:0xffffff8c0d564f00
> stack pointer           = 0x28:0xffffff8c0ffd7ad0
> frame pointer           = 0x28:0xffffff8c0d564f30
> frame pointer           = 0x28:0xffffff8c0ffd7b40
> code segment            = base 0x0, limit 0xfffff, type 0x1b
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = processor eflags      = interrupt enabled, interrupt enabled, resume, resume, IOPL = 0
> IOPL = 0
> current process         = current process               = 0 (system_task1_3)
> 26[ thread pid 0 tid 100099 ]
> Stopped at      vdev_is_dead+0x1:       cmpq    $0x5,0x28(%rdi)
> db> bt
> Tracing pid 0 tid 100099 td 0xfffffe000e546460
> vdev_is_dead() at vdev_is_dead+0x1
> vdev_mirror_child_select() at vdev_mirror_child_select+0x67
> vdev_mirror_io_start() at vdev_mirror_io_start+0x24c
> zio_vdev_io_start() at zio_vdev_io_start+0x232
> zio_execute() at zio_execute+0xc3
> zio_gang_assemble() at zio_gang_assemble+0x1b
> zio_execute() at zio_execute+0xc3
> arc_read_nolock() at arc_read_nolock+0x6d1
> arc_read() at arc_read+0x93
> traverse_prefetcher() at traverse_prefetcher+0x103
> traverse_visitbp() at traverse_visitbp+0x21c
> traverse_dnode() at traverse_dnode+0x7c
> traverse_visitbp() at traverse_visitbp+0x3ff
> traverse_visitbp() at traverse_visitbp+0x316
> traverse_visitbp() at traverse_visitbp+0x316
> traverse_visitbp() at traverse_visitbp+0x316
> traverse_visitbp() at traverse_visitbp+0x316
> traverse_visitbp() at traverse_visitbp+0x316
> traverse_visitbp() at traverse_visitbp+0x316
> traverse_dnode() at traverse_dnode+0x7c
> traverse_visitbp() at traverse_visitbp+0x48c
> traverse_prefetch_thread() at traverse_prefetch_thread+0x78
> taskq_run() at taskq_run+0x13
> taskqueue_run_locked() at taskqueue_run_locked+0x85
> taskqueue_thread_loop() at taskqueue_thread_loop+0x46
> fork_exit() at fork_exit+0x11f
> fork_trampoline() at fork_trampoline+0xe
> --- trap 0, rip = 0, rsp = 0xffffff8c0d565d00, rbp = 0 ---
> db>


To me it looks like mc->mc_vd in the vdev_mirror_child_select function could be
NULL, although the code doesn't expect that.  You can add some code to the
function to check whether this hypothesis is correct and to skip the current
loop iteration if mc->mc_vd is NULL.
Such a hack is probably not needed in general, but given that your pool could be
corrupted, this could be your chance to get access to it.
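
A minimal sketch of the kind of check described above. It is not the actual
ZFS source: the struct definitions, the `readable` flag, and the function name
mirror_child_select are simplified, hypothetical stand-ins, and only the field
names (mc_vd, mc_tried, mc_skipped, mc_error) follow the real mirror code. It
only illustrates the shape of the hack: log the NULL child, mark it as
unusable, and move on to the next child.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the ZFS types; NOT the real definitions. */
struct vdev {
	int readable;		/* nonzero if the device can be read */
};

struct mirror_child {
	struct vdev *mc_vd;	/* hypothesis: may be NULL on a damaged pool */
	int mc_error;
	int mc_tried;
	int mc_skipped;
};

struct mirror_map {
	int mm_children;	/* number of valid entries in mm_child */
	int mm_preferred;	/* index of the child to try first */
	struct mirror_child mm_child[4];
};

/*
 * Return the index of the first usable child, starting at mm_preferred
 * and wrapping around, or -1 if no child is usable.
 */
static int
mirror_child_select(struct mirror_map *mm)
{
	int i, c;

	assert(mm != NULL);
	for (i = 0, c = mm->mm_preferred; i < mm->mm_children; i++, c++) {
		struct mirror_child *mc;

		if (c >= mm->mm_children)
			c = 0;
		mc = &mm->mm_child[c];
		if (mc->mc_tried || mc->mc_skipped)
			continue;
		/* The added check: report the NULL child and skip it. */
		if (mc->mc_vd == NULL) {
			fprintf(stderr, "mirror child %d has a NULL vdev\n", c);
			mc->mc_error = ENXIO;
			mc->mc_tried = 1;
			mc->mc_skipped = 1;
			continue;
		}
		if (!mc->mc_vd->readable) {
			mc->mc_error = ENXIO;
			mc->mc_tried = 1;
			mc->mc_skipped = 1;
			continue;
		}
		return (c);
	}
	return (-1);
}
```

If the message is printed during the import, the hypothesis is confirmed. In
the kernel the message would go through printf(9) rather than stderr, and the
check would sit just before the first use of mc->mc_vd in the loop.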

BTW, restoring from backups is what is usually recommended first in a situation
like this.

-- 
Andriy Gapon


