Date:      Fri, 20 Jan 2012 09:09:38 +0000
From:      Martin Ranne <martin.ranne@kockumsonics.com>
To:        Andriy Gapon <avg@FreeBSD.org>
Cc:        "freebsd-fs@freebsd.org" <freebsd-fs@FreeBSD.org>
Subject:   RE: zpool import reboots computer
Message-ID:  <39C592E81AEC0B418EAD826FC1BBB09B25284B@mailgate>
In-Reply-To: <4F1878AC.6060704@FreeBSD.org>
References:  <39C592E81AEC0B418EAD826FC1BBB09B25031D@mailgate> <4F18459F.7040309@FreeBSD.org> <39C592E81AEC0B418EAD826FC1BBB09B252444@mailgate> <4F1858FE.7020509@FreeBSD.org> <39C592E81AEC0B418EAD826FC1BBB09B25253F@mailgate> <4F1878AC.6060704@FreeBSD.org>

On 2012-01-19 21:10, Andriy Gapon wrote:
>on 19/01/2012 21:58 Martin Ranne said the following:
>>On 2012-01-19 18:55, Andriy Gapon wrote:
>>on 19/01/2012 19:36 Martin Ranne said the following:
>>>On 2012-01-19 17:32, Andriy Gapon wrote:
>>>on 19/01/2012 17:36 Martin Ranne said the following:
>>>>>>I had a failure in one server where I am trying to determine whether the cause is memory or CPU. It shows up as a memory failure in memtest86. The result is that it managed to damage the zpool, which is a raidz2 with 6 disks.

>>>>>>If I boot from a FreeBSD 9.0-RELEASE USB stick and import it with zpool import -f -R /mnt/zroot zroot, it will reboot the computer. I have also tried to import it on another computer, which is running 9-STABLE, with the same result. On the second computer I used zpool import -f -R /mnt/zroot "zpool-id" serv06zroot

>>>>>>Can I get some help on how to debug this and, in the end, import the pool so that I can repair it?

>>>>>>Data for the second computer can be found attached. The disks in question are da0 to da5.

>>>>>And the panic message is?

>>>>I am trying to get a crash dump but it hangs when dumping.

>>>Alternatives:
>>>- serial console
>>>- digital camera
>>>- eyes plus pen and paper

>>Finally here it is. Is there anything I can do in the debugger to make it possible to find what is crashing in there?

>>Fatal trap 12: page fault while in kernel mode
>>Fatal trap 12: page fault while in kernel mode
>>cpuid = 0; cpuid = 2; apic id = 00
>>apic id = 02
>>fault virtual address        = 0x88
>>fault virtual address        = 0x38
>>fault code                    = supervisor read data, page not present
>>fault code                    = supervisor read data, page not present
>>instruction pointer            = 0x20:0xffffffff814a7ef5
>>instruction pointer            = 0x20:0xffffffff814872a1
>>stack pointer                = 0x28:0xffffff8c10252ad0
>>stack pointer                = 0x28:0xffffff8c0d564f00
>>frame pointer                = 0x28:0xffffff8c10252b40
>>frame pointer                = 0x28:0xffffff8c0d564f30
>>code segment                = base 0x0, limit 0xfffff, type 0x1b
>>code segment                = base 0x0, limit 0xfffff, type 0x1b
>>                            = DPL 0, pres 1, long 1, def32 0, gran 1
>>                            = DPL 0, pres 1, long 1, def32 0, gran 1
>>processor eflags            = processor eflags        = interrupt enabled, interrupt enabled, resume, resume, IOPL = 0
>>IOPL = 0
>>current process                = current process                = 2659 (zpool)
>>0 [ thread pid 2659 tid 100592 ]

>Hmm, two traps running almost perfectly in parallel...

>stopped at        zio_vdev_child_io+0x25: cmpq    $0,0x88(%r10)
>db>

>At least the 'bt' command.

>It could be that the panic is caused by a corrupted vdev label, but I am not sure...

I tried again to get into the debugger. It does not always work, as it freezes before I get to the prompt most of the time, but here it is. Are there any other commands to run in the debugger to get better information to help solve this?

I used the command zpool import -F -f -o readonly=on -R /mnt/serv06 zroot

The result is the following:
Fatal trap 12: page fault while in kernel mode
Fatal trap 12: page fault while in kernel mode
cpuid = 0; cpuid = 5; apic id = 00
apic id = 05
fault virtual address   = 0x38
fault virtual address   = 0x88
fault code              = supervisor read data, page not present
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff814872a1
instruction pointer     = 0x20:0xffffffff814a7ef5
stack pointer           = 0x28:0xffffff8c0d564f00
stack pointer           = 0x28:0xffffff8c0ffd7ad0
frame pointer           = 0x28:0xffffff8c0d564f30
frame pointer           = 0x28:0xffffff8c0ffd7b40
code segment            = base 0x0, limit 0xfffff, type 0x1b
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = processor eflags      = interrupt enabled, interrupt enabled, resume, resume, IOPL = 0
IOPL = 0
current process         = current process               = 0 (system_task1_3)
26[ thread pid 0 tid 100099 ]
Stopped at      vdev_is_dead+0x1:       cmpq    $0x5,0x28(%rdi)
db> bt
Tracing pid 0 tid 100099 td 0xfffffe000e546460
vdev_is_dead() at vdev_is_dead+0x1
vdev_mirror_child_select() at vdev_mirror_child_select+0x67
vdev_mirror_io_start() at vdev_mirror_io_start+0x24c
zio_vdev_io_start() at zio_vdev_io_start+0x232
zio_execute() at zio_execute+0xc3
zio_gang_assemble() at zio_gang_assemble+0x1b
zio_execute() at zio_execute+0xc3
arc_read_nolock() at arc_read_nolock+0x6d1
arc_read() at arc_read+0x93
traverse_prefetcher() at traverse_prefetcher+0x103
traverse_visitbp() at traverse_visitbp+0x21c
traverse_dnode() at traverse_dnode+0x7c
traverse_visitbp() at traverse_visitbp+0x3ff
traverse_visitbp() at traverse_visitbp+0x316
traverse_visitbp() at traverse_visitbp+0x316
traverse_visitbp() at traverse_visitbp+0x316
traverse_visitbp() at traverse_visitbp+0x316
traverse_visitbp() at traverse_visitbp+0x316
traverse_visitbp() at traverse_visitbp+0x316
traverse_dnode() at traverse_dnode+0x7c
traverse_visitbp() at traverse_visitbp+0x48c
traverse_prefetch_thread() at traverse_prefetch_thread+0x78
taskq_run() at taskq_run+0x13
taskqueue_run_locked() at taskqueue_run_locked+0x85
taskqueue_thread_loop() at taskqueue_thread_loop+0x46
fork_exit() at fork_exit+0x11f
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff8c0d565d00, rbp = 0 ---
db>
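
One possible way to read the fault addresses, as a rough sketch only: for an instruction of the form disp(%reg), the faulting address should be the register value plus the displacement, so subtracting the displacement gives the pointer that was dereferenced. The helper name base_register is made up for illustration, and the pairing of fault address 0x38 with vdev_is_dead is an assumption based on the matching instruction pointer and stack addresses in the interleaved output.

```python
# Hypothetical helper (not part of any FreeBSD tool): recover the base
# register value from a page-fault address and the displacement used by
# the faulting instruction, i.e. fault_addr = base + displacement.
def base_register(fault_addr: int, displacement: int) -> int:
    return fault_addr - displacement

# zio_vdev_child_io+0x25: cmpq $0,0x88(%r10), fault virtual address 0x88
r10 = base_register(0x88, 0x88)

# vdev_is_dead+0x1: cmpq $0x5,0x28(%rdi), fault virtual address 0x38
# (assumed pairing, see the note above)
rdi = base_register(0x38, 0x28)

print(hex(r10), hex(rdi))
```

If that pairing is right, both faults look like dereferences of a NULL or near-NULL pointer rather than of a random address, which may fit the suspicion of a damaged vdev label, but that is only a guess.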

//Martin Ranne