Date: Wed, 03 Feb 2010 20:23:15 +0900
From: Stephane LAPIE <stephane.lapie@darkbsd.org>
To: Andriy Gapon <avg@icyb.net.ua>
Cc: freebsd-fs@freebsd.org, Julian Elischer <julian@elischer.org>, freebsd-hardware@freebsd.org
Subject: Re: [zfs][hardware] Reproducible kernel panic in 8.0-STABLE
Message-ID: <4B695CA3.50008@darkbsd.org>
In-Reply-To: <4B68641D.9000201@icyb.net.ua>
References: <4B682972.6030604@darkbsd.org> <4B682F29.90505@icyb.net.ua> <4B686324.2090308@elischer.org> <4B68641D.9000201@icyb.net.ua>

[-- Attachment #1 --]
Andriy Gapon wrote:
> on 02/02/2010 19:38 Julian Elischer said the following:
>> Andriy Gapon wrote:
>>> on 02/02/2010 15:32 Stephane LAPIE said the following:
>>>> I have a case of kernel panic that can be consistently reproduced, and
>>>> which I guess is related to the hardware I'm using (Marvell controllers,
>>>> check my pciconf -lv output below).
>>>>
>>>> The kernel panic message is always, consistently, the following :
>>>>
>>>> Sleeping thread (tid 100021, pid 0) owns a non-sleepable lock
>>> I probably won't be able to help you, but to kickstart debugging could
>>> you please
>>> run 'procstat -t 0' and determine what kernel thread has tid 100021 on
>>> your system?
>> or in the kernel debugger after the panic, do: bt
>
> I think that in this case it may not help. I mean the stack trace.
> Because, I think that this panic happens after the taskqueue thread is done with
> its tasks and is parked waiting.
>
>> you DO have options kdb and ddb right? (I never leave home without them)
>>
>
>

I just rebuilt a kernel with debugger options, and obtained the following
output upon pulling out one disk :

Sleeping thread (tid 100024, pid 0) owns a non-sleepable lock
sched_switch() at sched_switch+0xf8
mi_switch() at mi_switch+0x16f
sleepq_timedwait() at sleepq_timedwait+0x42
_cv_timedwait() at _cv_timedwait+0x129
_sema_timedwait() at _sema_timedwait+0x55
ata_queue_request() at ata_queue_request+0x526
ata_controlcmd() at ata_controlcmd+0xa1
ata_setmode() at ata_setmode+0xdc
ad_init() at ad_init+0x27
ad_reinit() at ad_reinit+0x48
ata_reinit() at ata_reinit+0x268
ata_conn_event() at ata_conn_event+0x49
taskqueue_run() at taskqueue_run+0x93
taskqueue_thread_loop() at taskqueue_thread_loop+0x46
fork_exit() at fork_exit+0x118
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff80000aad30, rbp = 0 ---
panic: sleeping thread
cpuid = 2
KDB: enter: panic
[thread pid 12 tid 100008 ]
Stopped at kdb_enter+0x3d: movq $0,0x4943d0(%rip)

I think the output below is not really relevant though.

db> bt
Tracing pid 12 tid 100008 td 0xffffff000187e000
kdb_enter() at kdb_enter+0x3d
panic() at panic+0x17b
turnstile_adjust() at turnstile_adjust
turnstile_wait() at turnstile_wait+0x1aa
_mtx_lock_sleep() at _mtx_lock_sleep+0xb0
softclock() at softclock+0x2a9
intr_event_execute_handlers() at intr_event_execute_handlers+0xfd
ithread_loop() at ithread_loop+0x8e
fork_exit() at fork_exit+0x118
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff800005ad30, rbp = 0 ---

If there is anything else I can run to obtain further information, all
hints are welcome, though this clearly seems to point to a problem with
my controller event handling, as I initially thought.

I am also very suspicious of that controller because it tends to drop
two disks at exactly the same time, which alas belong to the same raidz1
block (at the BIOS level, the port cannot be reset properly and the disks
are not redetected after this, so I have to go through a cold boot; the
disks themselves could be damaged, but I don't catch any weird readings
in SMART attributes such as Reallocated Sectors).

I am seriously thinking of moving some of these disks to the AHCI
controller on my motherboard, and will resort to using my spares at the
very least in the meantime.

Thanks for your time,
--
Stephane LAPIE, EPITA SRS, Promo 2005
"Even when they have digital readouts, I can't understand them."
--MegaTokyo

[-- Attachment #2 --]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAktpXKkACgkQ24Ql8u6TF2PafgCg0KHN21iTsRKK5bicKqrVo4Rv
E68AoKFECb7szXCvNUWvk7k40dKfMI5r
=URPh
-----END PGP SIGNATURE-----
