Date: Wed, 30 Mar 2005 15:52:02 -0500
From: John Baldwin <jhb@FreeBSD.org>
To: Bruce Evans <bde@zeta.org.au>
Cc: Oleg Tarasov <subscriber@osk.com.ua>
Subject: Re: sio interrupt-level buffer overflows
Message-ID: <200503301552.02472.jhb@FreeBSD.org>
In-Reply-To: <20050330155502.E16886@delplex.bde.org>
References: <815955888.20050323113529@osk.com.ua> <1101884216.20050323181742@osk.com.ua> <20050330155502.E16886@delplex.bde.org>

On Wednesday 30 March 2005 01:06 am, Bruce Evans wrote:
> On Wed, 23 Mar 2005, Oleg Tarasov wrote:
> > About my panics. They persist and when this server panics it somehow
> > overloads my network so it stops functioning until reboot. This is
> > very, very bad.
> >
> > Maybe you could tell me where to write, or you could
> > personally tell me what should I do.
> >
> > Using all my theoretical skills I have come to this data I could
> > obtain from my dump:
> >
> > (kgdb) backtrace
> > #0 doadump () at pcpu.h:159
> > #1 0xc060b063 in boot (howto=260) at
> > /usr/src/sys/kern/kern_shutdown.c:397
> > #2 0xc060b389 in panic
> > (fmt=0xc080321d "spin lock held too long") at
> > /usr/src/sys/kern/kern_shutdown.c:553
> > #3 0xc060270c in _mtx_lock_spin (m=0xc08d7800, td=0xc19ca320, opts=0,
> > file=0x0, line=0) at /usr/src/sys/kern/kern_mutex.c:613
> > #4 0xc077c165 in siointr (arg=0xc1ab8800) at
> > /usr/src/sys/dev/sio/sio.c:1710
> > #5 0xc0790ead in intr_execute_handlers
> > (isrc=0xc19b8890, iframe=0xd541ac94) at
> > /usr/src/sys/i386/i386/intr_machdep.c:203
> > #6 0xc07932be in lapic_handle_intr (frame=
> > {if_vec = 52, if_fs = -717160424, if_es = -1067384816, if_ds = 16,
> > if_edi = -1046699232, if_esi = -1064591424, if_ebp = -717116188, if_ebx =
> > -1046425600, if_edx = -1064566184, if_ecx = 0, if_eax = -1046425600,
> > if_eip = -1067440569, if_cs = 8, if_eflags = 582, if_esp = -1045200000,
> > if_ss = 4})
> > at /usr/src/sys/i386/i386/local_apic.c:490
> > #7 0xc078d753 in Xapic_isr1 () at apic_vector.s:110
> > #8 0x00000034 in ?? ()
> > #9 0xd5410018 in ?? ()
> > #10 0xc0610010 in coredump (td=0xc08b9fc0) at vnode_if.h:1244
> > #11 0xc05f6f46 in ithread_loop (arg=0xc1981c80)
> > at /usr/src/sys/kern/kern_intr.c:546
> > #12 0xc05f6001 in fork_exit (callout=0xc05f6df8 <ithread_loop>,
> > arg=0xc1981c80, frame=0xd541ad48) at /usr/src/sys/kern/kern_fork.c:811
> > #13 0xc078d3fc in fork_trampoline () at
> > /usr/src/sys/i386/i386/exception.s:209 ...
>
> I couldn't figure out the problem from this. Your later mail says that
> the problem is caused by ppp not being MPSAFE, at least with sio, so I
> won't do much more with this stack trace, but I wonder about some of the
> strange entries in it:
>
> #13 - #11 are normal.
> #10 is weird. ithread_loop() shouldn't call coredump().
> #8 - #9 seem to be more like stack garbage than module addresses.
> #7 is normal, but it looks like someone broke stack traces for interrupts,
> giving the garbage in #8 - #10.

This is weird as we do match on Xapic_isr as being an interrupt frame. I'm
not sure why that didn't work correctly.

> #0 - #6 are normal if the spin lock is already held by the same CPU that
> is handling the interrupt (except this can't happen :-). I wouldn't
> have thought that broken locking in ppp could cause this.

It's also normal if another CPU is holding the lock and spins with it for
some reason.

>
> Bruce

-- 
John Baldwin <jhb@FreeBSD.org> <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org
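
Both explanations for frames #0 - #6 come down to the same mechanism: the acquire path spins for a bounded time and gives up if the lock never becomes free. The following is a minimal userspace sketch of that idea, not the FreeBSD implementation. The names `spin_lock_bounded`, `spin_unlock`, `struct spin_lock` and the `SPIN_LIMIT` value are all hypothetical, and where this sketch returns false a kernel would instead panic with "spin lock held too long".

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical spin budget; the real kernel limit is different. */
#define SPIN_LIMIT 1000000u

struct spin_lock {
	atomic_uintptr_t owner;	/* 0 when free, otherwise the holder's id */
};

/*
 * Try to take the lock on behalf of `self` (must be nonzero).
 * Returns true on success and false once the spin budget is used up,
 * which is where a kernel would panic instead of returning.
 */
static bool
spin_lock_bounded(struct spin_lock *l, uintptr_t self)
{
	for (uint32_t spins = 0; spins < SPIN_LIMIT; spins++) {
		uintptr_t expected = 0;

		/* Succeeds only if the lock is currently free. */
		if (atomic_compare_exchange_weak(&l->owner, &expected, self))
			return (true);
	}
	return (false);
}

/* Release the lock so that a later acquire can succeed. */
static void
spin_unlock(struct spin_lock *l)
{
	atomic_store(&l->owner, 0);
}
```

If the same caller acquires the lock and then calls `spin_lock_bounded()` again without releasing it, the second call can never succeed, because the only party able to release the lock is the one spinning; that models the same-CPU case. A different holder that never calls `spin_unlock()` exhausts the budget in exactly the same way, so the two cases cannot be told apart from the failure alone.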