Date: Wed, 26 Nov 2003 07:50:26 -0800 (PST)
From: Daniel Lang <dl@leo.org>
To: freebsd-bugs@FreeBSD.org
Subject: Re: kern/59260: Panic by integer divide fault in Thinkpad A31p / IRQ Problem?
Message-ID: <200311261550.hAQFoQNv082831@freefall.freebsd.org>
The following reply was made to PR kern/59260; it has been noted by GNATS.

From: Daniel Lang <dl@leo.org>
To: freebsd-gnats-submit@FreeBSD.org, dl@leo.org
Cc: imp@freebsd.org, jhb@freebsd.org
Subject: Re: kern/59260: Panic by integer divide fault in Thinkpad A31p / IRQ Problem?
Date: Wed, 26 Nov 2003 16:49:07 +0100

Dear Warner, Dear John.

I have tried to live-debug the kernel. Alas with no further
results for me. Possibly you can get more out of it.
Script follows:

======
Following is a script, that documents my live-kernel debugging
in case of such a panic:

Script started on Wed Nov 19 07:50:29 2003
spot:~/tmp/thinkpad-debug#gdb -k kernel.debug
GNU gdb 5.2.1 (FreeBSD)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-undermydesk-freebsd"...
(kgdb) target remote /dev/cuaa0
Remote debugging using /dev/cuaa0
cbb_intr (arg=0xc1d13800) at /usr/src/sys/dev/pccbb/pccbb.c:1126
1126            if (sockevent != 0) {
warning: Unable to find dynamic linker breakpoint function.
GDB will be unable to debug shared library initializers
and track explicitly loaded dynamic code.
warning: shared library handler failed to enable breakpoint
(kgdb) x 0xc0524182
0xc0524182 <cbb_intr+34>:       0xc085c689
(kgdb) l *0xc0524182
0xc0524182 is in cbb_intr (/usr/src/sys/dev/pccbb/pccbb.c:289).
284             bus_space_write_4(sc->bst, sc->bsh, reg, val);
285     }
286
287     static __inline uint32_t
288     cbb_get(struct cbb_softc *sc, uint32_t reg)
289     {
290             return (bus_space_read_4(sc->bst, sc->bsh, reg));
291     }
292
293     static __inline void
(kgdb) up
#1  0xc05a1642 in ithread_loop (arg=0xc1d15880)
    at /usr/src/sys/kern/kern_intr.c:544
544                             ih->ih_handler(ih->ih_argument);
(kgdb) up
#2  0xc05a0634 in fork_exit (callout=0xc05a14b0 <ithread_loop>,
    arg=0xd0201000, frame=0xd0201000) at /usr/src/sys/kern/kern_fork.c:793
793             callout(arg, frame);
(kgdb) up
Initial frame selected; you cannot go up.
(kgdb) down
#1  0xc05a1642 in ithread_loop (arg=0xc1d15880)
    at /usr/src/sys/kern/kern_intr.c:544
544                             ih->ih_handler(ih->ih_argument);
(kgdb) p *arg
Attempt to dereference a generic pointer.
(kgdb) bt
#0  cbb_intr (arg=0xc1d13800) at /usr/src/sys/dev/pccbb/pccbb.c:1126
#1  0xc05a1642 in ithread_loop (arg=0xc1d15880)
    at /usr/src/sys/kern/kern_intr.c:544
#2  0xc05a0634 in fork_exit (callout=0xc05a14b0 <ithread_loop>,
    arg=0xd0201000, frame=0xd0201000) at /usr/src/sys/kern/kern_fork.c:793
(kgdb) down
#0  cbb_intr (arg=0xc1d13800) at /usr/src/sys/dev/pccbb/pccbb.c:1126
1126            if (sockevent != 0) {
(kgdb) l
1121
1122            /*
1123             * This ISR needs work XXX
1124             */
1125            sockevent = cbb_get(sc, CBB_SOCKET_EVENT);
1126            if (sockevent != 0) {
1127                    DPRINTF(("CBB EVENT 0x%x\n", sockevent));
1128                    /* ack the interrupt */
1129                    cbb_setb(sc, CBB_SOCKET_EVENT, sockevent);
1130
(kgdb) p sc
$1 = (struct cbb_softc *) 0xc1d13800
(kgdb) p *sc
$2 = {dev = 0xc4792600, exca = {dev = 0xc4792600, memalloc = 0, mem = {{
        memt = 0, memh = 0, addr = 0, size = 0, realsize = 0, cardaddr = 0,
        kind = 0}, {memt = 0, memh = 0, addr = 0, size = 0, realsize = 0,
        cardaddr = 0, kind = 0}, {memt = 0, memh = 0, addr = 0, size = 0,
        realsize = 0, cardaddr = 0, kind = 0}, {memt = 0, memh = 0, addr = 0,
        size = 0, realsize = 0, cardaddr = 0, kind = 0}, {memt = 0, memh = 0,
        addr = 0, size = 0, realsize = 0, cardaddr = 0, kind = 0}},
    ioalloc = 0, io = {{
        iot = 0, ioh = 0, addr = 0, size = 0, flags = 0, width = 0}, {
        iot = 0, ioh = 0, addr = 0, size = 0, flags = 0, width = 0}},
    bst = 1, bsh = 3701633024, flags = 2, offset = 2048, chipset = 0,
    getb = 0xc04e8bd0 <exca_mem_getb>, putb = 0xc04e8c00 <exca_mem_putb>,
    event_thread = 0x0, mtx = {mtx_object = {lo_class = 0x0, lo_name = 0x0,
        lo_type = 0x0, lo_flags = 0, lo_list = {tqe_next = 0x0,
          tqe_prev = 0x0}, lo_witness = 0x0}, mtx_lock = 0, mtx_recurse = 0},
    cv = {cv_waitq = {tqh_first = 0x0, tqh_last = 0x0}, cv_mtx = 0x0,
      cv_description = 0x0}, pccarddev = 0xc4792100},
  base_res = 0xc47932c0, irq_res = 0xc4793280, intrhand = 0xc4793240,
  bst = 1, bsh = 3701633024, secbus = 1 '\001', subbus = 1 '\001',
  mtx = {mtx_object = {lo_class = 0xc07aa63c, lo_name = 0xc4753450 "cbb1",
      lo_type = 0xc07549e4 "cbb", lo_flags = 196608, lo_list = {
        tqe_next = 0xc4789aa8, tqe_prev = 0xc4767b44},
      lo_witness = 0xc07f2a10}, mtx_lock = 4, mtx_recurse = 0},
  cv = {cv_waitq = {tqh_first = 0xc4783b40, tqh_last = 0xc4783b58},
    cv_mtx = 0xc1d1393c, cv_description = 0xc075fce5 "cbb cv"},
  flags = 1342177280, chipset = 4, rl = {slh_first = 0x0},
  intr_handlers = {stqh_first = 0x0, stqh_last = 0xc1d1397c},
  cbdev = 0xc4792180, event_thread = 0xc4782a98}
(kgdb) p sc->bst
$3 = 1
(kgdb) p sc->bsh
$4 = 3701633024
*
* REMARK:
*
* reg is CBB_SOCKET_EVENT = 0x00
*
* in this case; I looked that up in the source
*
* now from my previous examinations, I know that
* the following is sufficient to reproduce what the
* bus_space_read_4() function does
(kgdb) p (void*)sc->bsh
$5 = (void *) 0xdca27000
(kgdb) p *0xdca27000
$6 = 0
* Now this is the address that is to be read.
* This kind of address was not accessible in a
* post-mortem crash dump as described in the PR,
* but here on the live kernel I can examine it.
* However, the value the pointer points to is 0;
* even if I evaluate the exact line from
* bus_space_read_4(), which is the following,
* I still get '0' as result.
*
* I don't know if bus_space_read_4() should return 0
* or if this should never happen, since somewhere else
* in the code something is divided by this value,
* causing the "integer divide fault"?
*
(kgdb) p (*(volatile u_int32_t *)(0xdca27000))
$7 = 0
*
* well, I was curious and wanted the kernel to
* continue, which briefly worked ...
*
(kgdb) cont
Continuing.
*
* The following happened a few minutes after I 'continued'
* the panicked kernel
*
Program received signal SIGEMT, Emulation trap.
0xc05bb89d in critical_exit () at machine/cpufunc.h:358
358     machine/cpufunc.h: No such file or directory.
        in machine/cpufunc.h
(kgdb) bt
#0  0xc05bb89d in critical_exit () at machine/cpufunc.h:358
#1  0xc05ab8bd in _mtx_unlock_spin_flags (m=0xc07f1760, opts=0,
    file=0x1 <Address 0x1 out of bounds>, line=-1043227520)
    at /usr/src/sys/kern/kern_mutex.c:333
#2  0xc05dc58e in witness_lock (lock=0xc4a58208, flags=8,
    file=0xc4b856c5 "/usr/src/sys/dev/fxp/if_fxp.c", line=1556)
    at /usr/src/sys/kern/subr_witness.c:830
#3  0xc05ab61a in _mtx_lock_flags (m=0xc07f1760, opts=0,
    file=0xc07aa63c "\206þwÀ\t", line=-995786232)
    at /usr/src/sys/kern/kern_mutex.c:221
#4  0xc4b83404 in ?? ()
#5  0xc05a1642 in ithread_loop (arg=0xc1d15880)
    at /usr/src/sys/kern/kern_intr.c:544
#6  0xc05a0634 in fork_exit (callout=0xc05a14b0 <ithread_loop>, arg=0x246,
    frame=0x246) at /usr/src/sys/kern/kern_fork.c:793
(kgdb)
#0  0xc05bb89d in critical_exit () at machine/cpufunc.h:358
#1  0xc05ab8bd in _mtx_unlock_spin_flags (m=0xc07f1760, opts=0,
    file=0x1 <Address 0x1 out of bounds>, line=-1043227520)
    at /usr/src/sys/kern/kern_mutex.c:333
#2  0xc05dc58e in witness_lock (lock=0xc4a58208, flags=8,
    file=0xc4b856c5 "/usr/src/sys/dev/fxp/if_fxp.c", line=1556)
    at /usr/src/sys/kern/subr_witness.c:830
#3  0xc05ab61a in _mtx_lock_flags (m=0xc07f1760, opts=0,
    file=0xc07aa63c "\206þwÀ\t", line=-995786232)
    at /usr/src/sys/kern/kern_mutex.c:221
#4  0xc4b83404 in ?? ()
#5  0xc05a1642 in ithread_loop (arg=0xc1d15880)
    at /usr/src/sys/kern/kern_intr.c:544
#6  0xc05a0634 in fork_exit (callout=0xc05a14b0 <ithread_loop>, arg=0x246,
    frame=0x246) at /usr/src/sys/kern/kern_fork.c:793
(kgdb) call cpu_reset()
*
* OK, I guessed that there was no use in continuing here,
* so I wanted to reset (cpu_reset())
*
Program received signal SIGSEGV, Segmentation fault.
0x00000000 in ?? ()
The program being debugged was signaled while in a function called from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on"
Evaluation of the expression containing the function (cpu_reset) will be
abandoned.
(kgdb) call cpu_reset()

Program received signal SIGSEGV, Segmentation fault.
0x00000000 in ?? ()
The program being debugged was signaled while in a function called from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on"
Evaluation of the expression containing the function (cpu_reset) will be
abandoned.
(kgdb) quit
The program is running.  Exit anyway? (y or n) y

Script done on Wed Nov 19 10:26:49 2003

I then just switched off the Thinkpad manually.

So here is the question that possibly someone is able to answer:

Is "bus_space_read_4()" (or any of these functions) allowed/expected
to return 0, or is this a case that should never happen?

It appears strange to me, because in THIS case (which is a bit different
from the original PR; it seems the panic can occur in slightly different
places), it just seems that the call in cbb_intr():

...
sockevent = cbb_get(sc, CBB_SOCKET_EVENT);
...

returned 0, thus setting 'sockevent = 0;'

I don't see any reason why this should cause an "Integer Divide Fault".

-- 
IRCnet: Mr-Spock - All your .sigs are belong to us -
Daniel Lang * dl@leo.org * +49 89 289 18532 * http://www.leo.org/~dl/
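
To make the question above concrete, here is a minimal user-space sketch of the read that was evaluated by hand in the session: a volatile 32-bit load from handle plus register offset. It is an illustration under assumptions, not the kernel's implementation. The names fake_socket_regs, fake_read_4, fake_intr_has_event and FAKE_SOCKET_EVENT are hypothetical, and an ordinary array stands in for the mapped CardBus registers. It only shows that a load of this form returns 0 whenever the register holds 0, and that the `if (sockevent != 0)` test in the listing then skips the event handling.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the memory-mapped socket registers. */
static volatile uint32_t fake_socket_regs[4];

/* Offset 0x00, matching the value noted for CBB_SOCKET_EVENT in the remark. */
#define FAKE_SOCKET_EVENT 0x00

/*
 * Memory-mapped 32-bit read: add the byte offset to the handle and
 * dereference as volatile, as in the expression evaluated in kgdb.
 */
static uint32_t
fake_read_4(uintptr_t bsh, size_t reg)
{
	return (*(volatile uint32_t *)(bsh + reg));
}

/*
 * Mirrors the shape of the check at pccbb.c:1126 in the listing:
 * returns 1 if any event bit is set, 0 if the register read back as 0.
 */
static int
fake_intr_has_event(uintptr_t bsh)
{
	uint32_t sockevent = fake_read_4(bsh, FAKE_SOCKET_EVENT);

	return (sockevent != 0);
}
```

Calling fake_intr_has_event((uintptr_t)fake_socket_regs) with the array zeroed takes the "no event" path, which suggests that a zero read by itself is an anticipated value for this check. Note also that the decimal handle printed by kgdb, 3701633024, is the same number as 0xdca27000.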