Date: Sat, 24 Jan 2009 13:00:23 +0200
From: Andriy Gapon <avg@icyb.net.ua>
To: FreeBSD Stable <freebsd-stable@freebsd.org>
Subject: panic in callout_reset: bad link in callwheel
Message-ID: <497AF4C7.3080309@icyb.net.ua>
System:
FreeBSD 7.1-STABLE i386 (revision 187025)

Panic message:
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0xd2006ad0
fault code              = supervisor write, page not present
instruction pointer     = 0x20:0xc05623aa
stack pointer           = 0x28:0xdd4f6c34
frame pointer           = 0x28:0xdd4f6c40
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = resume, IOPL = 0
current process         = 13 (swi4: clock)
trap number             = 12
panic: page fault
KDB: stack backtrace:
db_trace_self_wrapper(c074bb2f,dd4f6b14,c05514af,c0749d10,c07b85e0,...) at 0xc0478466 = db_trace_self_wrapper+0x26
kdb_backtrace(c0749d10,c07b85e0,c073b02b,dd4f6b20,dd4f6b20,...) at 0xc057a639 = kdb_backtrace+0x29
panic(c073b02b,c0761cb4,c36104dc,1,1,...) at 0xc05514af = panic+0xaf
trap_fatal(c0761bb6,c,c3a89460,c3a8965c,c,...) at 0xc0705723 = trap_fatal+0x353
trap(dd4f6bf4) at 0xc07060ca = trap+0x10a
calltrap() at 0xc06f463b = calltrap+0x6
--- trap 0xc, eip = 0xc05623aa, esp = 0xdd4f6c34, ebp = 0xdd4f6c40 ---
callout_reset(c3a8552c,13,c0561940,c3a852b8,c3612690,...) at 0xc05623aa = callout_reset+0x14a
realitexpire(c3a852b8,2d6100,c3612690,1,dd4f6cbc,...) at 0xc0561ab6 = realitexpire+0x176
softclock(0,0,c0747617,4a1,0,...) at 0xc0562c25 = softclock+0x235
ithread_loop(c35e5a20,dd4f6d38,0,0,0,...) at 0xc053268b = ithread_loop+0x1cb
fork_exit(c05324c0,c35e5a20,dd4f6d38) at 0xc052eda1 = fork_exit+0xa1
fork_trampoline() at 0xc06f46b0 = fork_trampoline+0x8
--- trap 0, eip = 0, esp = 0xdd4f6d70, ebp = 0 ---

Some debugging:
(kgdb) bt
#0  doadump () at pcpu.h:196
#1  0xc05512b3 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2  0xc05514ff in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:574
#3  0xc0705723 in trap_fatal (frame=0xdd4f6bf4, eva=3523242704) at /usr/src/sys/i386/i386/trap.c:939
#4  0xc07060ca in trap (frame=0xdd4f6bf4) at /usr/src/sys/i386/i386/trap.c:320
#5  0xc06f463b in calltrap () at /usr/src/sys/i386/i386/exception.s:159
#6  0xc05623aa in callout_reset (c=0xc3a8552c, to_ticks=19, ftn=0xc0561940 <realitexpire>, arg=0xc3a852b8) at /usr/src/sys/kern/kern_timeout.c:471
#7  0xc0561ab6 in realitexpire (arg=0xc3a852b8) at /usr/src/sys/kern/kern_time.c:684
#8  0xc0562c25 in softclock (dummy=0x0) at /usr/src/sys/kern/kern_timeout.c:274
#9  0xc053268b in ithread_loop (arg=0xc35e5a20) at /usr/src/sys/kern/kern_intr.c:1088
#10 0xc052eda1 in fork_exit (callout=0xc05324c0 <ithread_loop>, arg=0xc35e5a20, frame=0xdd4f6d38) at /usr/src/sys/kern/kern_fork.c:804
#11 0xc06f46b0 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:264
(kgdb) fr 6
#6  0xc05623aa in callout_reset (c=0xc3a8552c, to_ticks=19, ftn=0xc0561940 <realitexpire>, arg=0xc3a852b8) at /usr/src/sys/kern/kern_timeout.c:471
471     /usr/src/sys/kern/kern_timeout.c: No such file or directory.
        in /usr/src/sys/kern/kern_timeout.c
(kgdb) p *c
$1 = {c_links = {sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0xd2006ad0}}, c_time = 2974104, c_arg = 0xc3a852b8, c_func = 0xc0561940 <realitexpire>, c_mtx = 0x0, c_flags = 22}
(kgdb) p c->c_links.tqe.tqe_prev
$2 = (struct callout **) 0xd2006ad0
(kgdb) p *c->c_links.tqe.tqe_prev
Cannot access memory at address 0xd2006ad0
(kgdb) p callwheel[c->c_time & callwheelmask]
$4 = {tqh_first = 0x0, tqh_last = 0xd2006ad0}

The code:
467             c->c_arg = arg;
468             c->c_flags |= (CALLOUT_ACTIVE | CALLOUT_PENDING);
469             c->c_func = ftn;
470             c->c_time = ticks + to_ticks;
471             TAILQ_INSERT_TAIL(&callwheel[c->c_time & callwheelmask],
472                               c, c_links.tqe);

Additional info:
I recently added some new memory to this system. The memory survived several passes of memtest86 before booting to FreeBSD.
It also survived one pass after the incident. Still, I wouldn't exclude the possibility of it being bad.

Small analysis:
If this is not caused by bad memory, then it probably means that a struct callout was deallocated earlier somewhere (possibly as part of a bigger object) without being unregistered/removed from the callout mechanism. I guess it is quite hard to backtrack that now.
All I can say is that nothing "funny" was happening on the machine in terms of attaching/detaching any HW, loading/unloading modules, or anything like that. Just "normal" work. So it could be something that is always "on", like the network stack or the ata subsystem, etc.

-- 
Andriy Gapon
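
The kgdb output fits the stale-link theory in the analysis. The target bucket shows tqh_first = 0x0 but tqh_last = 0xd2006ad0, whereas an empty TAILQ has tqh_last pointing at its own tqh_first. TAILQ_INSERT_TAIL stores the new element through tqh_last, and 0xd2006ad0 is exactly the fault virtual address of the supervisor write. What follows is only a minimal userland sketch of that mechanism, assuming the TAILQ macros from <sys/queue.h>. The names struct item, link, bucket, bucket_is_sane_empty and tailq_teardown_demo are hypothetical and do not come from kern_timeout.c:

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical stand-in for struct callout; only the list linkage matters. */
struct item {
    int id;
    TAILQ_ENTRY(item) link;     /* plays the role of c_links.tqe */
};

/* Hypothetical stand-in for one callwheel bucket. */
TAILQ_HEAD(bucket, item);

/* A sane empty TAILQ has tqh_last pointing at its own tqh_first. */
static int
bucket_is_sane_empty(struct bucket *b)
{
    return (b->tqh_first == NULL && b->tqh_last == &b->tqh_first);
}

/*
 * Insert one element, inspect the head, then tear down in the safe order.
 * Returns 0 on completion, 1 if malloc fails; invariants are asserted.
 */
static int
tailq_teardown_demo(void)
{
    struct bucket b;
    struct item *it;

    TAILQ_INIT(&b);
    assert(bucket_is_sane_empty(&b));

    it = malloc(sizeof(*it));
    if (it == NULL)
        return (1);
    it->id = 1;

    /*
     * TAILQ_INSERT_TAIL writes the new element through tqh_last
     * (*head->tqh_last = elm) and then makes tqh_last point at the
     * element's own tqe_next field.
     */
    TAILQ_INSERT_TAIL(&b, it, link);
    assert(b.tqh_last == &it->link.tqe_next);

    /*
     * Calling free(it) here without TAILQ_REMOVE would leave b.tqh_last
     * pointing into freed memory; the next TAILQ_INSERT_TAIL on this
     * bucket would then write through that stale pointer.
     */

    /* Safe teardown: unlink first, then free. */
    TAILQ_REMOVE(&b, it, link);
    free(it);
    assert(bucket_is_sane_empty(&b));
    return (0);
}
```

Calling tailq_teardown_demo() from a small main() exercises both invariants. In the kernel the analogous rule would be to stop or drain a callout before freeing the object that embeds it, but which subsystem skipped that step here cannot be read from this dump.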