Date: Sun, 14 Aug 2011 15:43:26 +0100
From: "Steven Hartland" <killing@multiplay.co.uk>
To: "Andriy Gapon" <avg@FreeBSD.org>
Cc: freebsd-stable@FreeBSD.org
Subject: Re: debugging frequent kernel panics on 8.2-RELEASE
Message-ID: <2C4B0D05C8924F24A73B56EA652FA4B0@multiplay.co.uk>
References: <47F0D04ADF034695BC8B0AC166553371@multiplay.co.uk>
 <A71C3ACF01EC4D36871E49805C1A5321@multiplay.co.uk>
 <4E4380C0.7070908@FreeBSD.org>
 <EBC06A239BAB4B3293C28D793329F9CA@multiplay.co.uk>
 <4E43E272.1060204@FreeBSD.org>
 <62BF25D0ED914876BEE75E2ADF28DDF7@multiplay.co.uk>
 <4E440865.1040500@FreeBSD.org>
 <6F08A8DE780545ADB9FA93B0A8AA4DA1@multiplay.co.uk>
 <4E441314.6060606@FreeBSD.org>
----- Original Message -----
From: "Andriy Gapon" <avg@FreeBSD.org>
>
> Maybe test it on couple of machines first just in case I overlooked something
> essential, although I have a report from another use that the patch didn't break
> anything for him (it was tested for an unrelated issue).

We've got this running on ~40 machines and have just had the first panic
since the update. Unfortunately it doesn't seem to have changed anything :(

We have 352 thread entries starting with:-
#0  sched_switch (td=0xffffffff8083e4e0, newtd=0xffffff0012d838c0, flags=Variable "flags" is not available.

23 with:-
cpustop_handler () at atomic.h:285

and 16 with:-
#0  fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:562

The main message being:-
panic: double fault

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
<118>Aug 14 15:13:33 amsbld15 syslogd: exiting on signal 15

Fatal double fault
rip = 0xffffffff8053b691
rsp = 0xffffff8d8f356fb0
rbp = 0xffffff8d8f357210
cpuid = 2; apic id = 02
panic: double fault
cpuid = 2
KDB: stack backtrace:
#0 0xffffffff803bb75e at kdb_backtrace+0x5e
#1 0xffffffff8038956e at panic+0x2ae
#2 0xffffffff805802b6 at dblfault_handler+0x96
#3 0xffffffff8056900d at Xdblfault+0xad
stack: 0xffffff8d8f357000, 4 rsp = 0xffffff800009ae10
Uptime: 2d21h6m18s
Physical memory: 49132 MB
Dumping 17080 MB: 17065...

Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /boot/kernel/zfs.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/zfs.ko
Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /boot/kernel/opensolaris.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/opensolaris.ko
Reading symbols from /boot/kernel/linprocfs.ko...Reading symbols from /boot/kernel/linprocfs.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/linprocfs.ko
Reading symbols from /boot/kernel/nullfs.ko...Reading symbols from /boot/kernel/nullfs.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/nullfs.ko
#0  sched_switch (td=0xffffffff8083e4e0, newtd=0xffffff0012d838c0, flags=Variable "flags" is not available.)
    at /usr/src/sys/kern/sched_ule.c:1858
1858            cpuid = PCPU_GET(cpuid);
(kgdb)
#0  sched_switch (td=0xffffffff8083e4e0, newtd=0xffffff0012d838c0, flags=Variable "flags" is not available.)
    at /usr/src/sys/kern/sched_ule.c:1858
#1  0xffffffff80391a99 in mi_switch (flags=260, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:451
#2  0xffffffff803c5112 in sleepq_timedwait (wchan=0xffffffff8083e080, pri=68)
    at /usr/src/sys/kern/subr_sleepqueue.c:644
#3  0xffffffff80391efb in _sleep (ident=0xffffffff8083e080, lock=0x0, priority=Variable "priority" is not available.)
    at /usr/src/sys/kern/kern_synch.c:230
#4  0xffffffff8053ebc9 in scheduler (dummy=Variable "dummy" is not available.)
    at /usr/src/sys/vm/vm_glue.c:807
#5  0xffffffff80341767 in mi_startup () at /usr/src/sys/kern/init_main.c:254
#6  0xffffffff8016efdc in btext () at /usr/src/sys/amd64/amd64/locore.S:81
#7  0xffffffff80863dc8 in sleepq_chains ()
#8  0xffffffff80848ae0 in cpu_top ()
#9  0x0000000000000000 in ?? ()
#10 0xffffffff8083e4e0 in proc0 ()
#11 0xffffffff80bb3b90 in ?? ()
#12 0xffffffff80bb3b38 in ?? ()
#13 0xffffff0012d838c0 in ?? ()
#14 0xffffffff803aeb19 in sched_switch (td=0x0, newtd=0x0, flags=Variable "flags" is not available.)
    at /usr/src/sys/kern/sched_ule.c:1852
Previous frame inner to this frame (corrupt stack?)

There are some indications that stopping jails could be the cause of the
panics, so on one test box I've enabled INVARIANTS to see if anything shows
up from that.
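For anyone wanting to repeat the per-thread tally quoted above (352 / 23 / 16) on their own dump, the following is a rough Python sketch, not the method actually used here. It assumes the backtraces for all threads have been saved to a text file (for example from "thread apply all bt" in kgdb) and that each thread's innermost frame is printed on a line starting with "#0". The function name tally_top_frames is made up for illustration.

```python
import re
from collections import Counter

# Innermost frame of a backtrace, with or without a leading address, e.g.
#   "#0  sched_switch (td=..."
#   "#0  0xffffffff803bb75e in kdb_backtrace ("
# Assumption: frame lines follow the usual gdb "#N  [addr in] func (" layout.
FRAME0 = re.compile(r"^#0\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\w+)\s*\(")


def tally_top_frames(bt_text):
    """Count how many backtraces start in each function (frame #0 only)."""
    counts = Counter()
    for line in bt_text.splitlines():
        match = FRAME0.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts
```

With the backtraces saved to a hypothetical file bt.txt, tally_top_frames(open("bt.txt").read()).most_common() lists the functions in which the most threads are sitting. Frames that gdb prints without a "#0" prefix are not counted by this sketch.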
    Regards
    Steve

================================================
This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it.

In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337
or return the E.mail to postmaster@multiplay.co.uk.