From owner-freebsd-current@FreeBSD.ORG Thu Nov 3 19:28:55 2005 Return-Path: X-Original-To: freebsd-current@freebsd.org Delivered-To: freebsd-current@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8F93C16A41F for ; Thu, 3 Nov 2005 19:28:55 +0000 (GMT) (envelope-from snezhko@indorsoft.ru) Received: from indor.net.tomline.ru (indor.net.tomline.ru [213.183.100.90]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6079343D45 for ; Thu, 3 Nov 2005 19:28:54 +0000 (GMT) (envelope-from snezhko@indorsoft.ru) Received: from SNEZHKO by indorsoft.ru (MDaemon.PRO.v7.2.2.R) with ESMTP id md50000027567.msg for ; Fri, 04 Nov 2005 01:28:44 +0600 X-AntiVirus: Checked by Dr.Web [version: 4.32b, engine: 4.32b, virus records: 88046, updated: 3.11.2005] To: freebsd-current@freebsd.org References: <20051027022313.R675@kushnir1.kiev.ua> <200511021035.22990.max@love2party.net> <20051103001509.C85422@kushnir1.kiev.ua> <200511030059.05946.max@love2party.net> From: Victor Snezhko Date: Fri, 04 Nov 2005 01:28:39 +0600 In-Reply-To: <200511030059.05946.max@love2party.net> (Max Laier's message of "Thu, 3 Nov 2005 00:58:48 +0100") Message-ID: User-Agent: Gnus/5.110002 (No Gnus v0.2) Emacs/21.3 (windows-nt) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Spam-Processed: indor.net.tomline.ru, Fri, 04 Nov 2005 01:28:45 +0600 (not processed: spam filter disabled) X-Return-Path: snezhko@indorsoft.ru X-MDaemon-Deliver-To: freebsd-current@freebsd.org X-VVS-Spam: false Cc: Max Laier Subject: Re: CURRENT + amd64 + user-ppp = panic X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 03 Nov 2005 19:28:55 -0000 Max Laier writes: >> > That brings me back to ln_timer_ch as the culprit. Could you please >> > build with this patch and get me the console output shortly before the >> > panic? >> >> Here it is: right before panic it prints >> >> For 0xffffff0017531100 -1 ticks >> For 0xffffff0017531100 -1 ticks > > This results in two consecutive callout_stop() calls, but shouldn't hurt as > callout_stop is protected against that. > > Do you get a dump for this? Can you compare this pointer to the global > llinfo_nd6 and see if "c" from the softclock() frame is related (or > maybe the previous item in the list TAILQ). I have finally analysed this. Applied the following patch with more verbose printf against 1.62: --- nd6-1.62.c Fri Nov 4 01:14:50 2005 +++ nd6.c Fri Nov 4 01:02:30 2005 @@ -1,4 +1,4 @@ -/* $FreeBSD: /repoman/r/ncvs/src/sys/netinet6/nd6.c,v 1.62 2005/10/22 05:07:16 suz Exp $ */ +/* $FreeBSD: src/sys/netinet6/nd6.c,v 1.62 2005/10/22 05:07:16 suz Exp $ */ /* $KAME: nd6.c,v 1.144 2001/05/24 07:44:00 itojun Exp $ */ /*- @@ -49,6 +49,7 @@ #include #include #include +#include #include #include @@ -395,6 +396,9 @@ struct llinfo_nd6 *ln; long tick; { + printf("For %p %ld (llinfo_nd6=%p, prev=%p, next=%p)\n", \ + ln, tick, &llinfo_nd6, llinfo_nd6.ln_prev, llinfo_nd6.ln_next); + kdb_enter("in nd6_llinfo_settimer"); if (tick < 0) { ln->ln_expire = 0; ln->ln_ntick = 0; Thanks to John Baldwin, kdb_enter is really helpful. Here are the results of the first call (at lo0 initialization): For 0xc17722c0 -1 (llinfo_nd6 = 0xc090eac0, prev=0xc17722c0, next=0xc17722c0) And here - the results of two consecutive calls (at ppp startup, later in the same session): For 0xc188e840 -1 (llinfo_nd6 = 0xc090eac0, prev=0xc17722c0, next=0xc188e840) For 0xc188e840 -1 (llinfo_nd6 = 0xc090eac0, prev=0xc17722c0, next=0xc17722c0) As we can see, ln formal parameter is certainly related to the global llinfo_nd6, but (see crashdump below) I can't find any relation to the "c" from softclock(), pointers there are distinct from any of llinfo_nd6 members I have seen. I can obtain crashdumps for both consecutive calls if somebody thinks there is something broken here, but it seems unrelated to the trap in question. What else can I do? Here is brief softclock() crashdump brief analysis: ~ # kgdb /usr/obj/usr/src/sys/VVS/kernel.debug /var/crash/vmcore.7 [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"] GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-marcel-freebsd". Unread portion of the kernel message buffer: kernel trap 12 with interrupts disabled Fatal trap 12: page fault while in kernel mode cpuid = 0; apic id = 00 fault virtual address = 0xdeadc0e6 fault code = supervisor read, page not present instruction pointer = 0x20:0xc066e0c2 stack pointer = 0x28:0xcbfd3cac frame pointer = 0x28:0xcbfd3cd4 code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags = resume, IOPL = 0 current process = 12 (swi4: clock sio) panic: from debugger cpuid = 0 Uptime: 45s Dumping 255 MB (2 chunks) chunk 0: 1MB (159 pages) ... ok chunk 1: 255MB (65264 pages) 239 223 207 191 175 159 143 127 111 95 79 63 47 31 15 #0 doadump () at pcpu.h:165 165 pcpu.h: No such file or directory. in pcpu.h (kgdb) bt #0 doadump () at pcpu.h:165 #1 0xc06613fc in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:399 #2 0xc0661711 in panic (fmt=0xc08594a0 "from debugger") at /usr/src/sys/kern/kern_shutdown.c:555 #3 0xc046d151 in db_panic (addr=-1066999614, have_addr=0, count=-1, modif=0xcbfd3aa8 "") at /usr/src/sys/ddb/db_command.c:435 #4 0xc046d0e8 in db_command (last_cmdp=0xc094a1a4, cmd_table=0x0, aux_cmd_tablep=0xc08c009c, aux_cmd_tablep_end=0xc08c00b8) at /usr/src/sys/ddb/db_command.c:404 #5 0xc046d1b0 in db_command_loop () at /usr/src/sys/ddb/db_command.c:455 #6 0xc046edc9 in db_trap (type=12, code=0) at /usr/src/sys/ddb/db_main.c:221 #7 0xc067b408 in kdb_trap (type=12, code=0, tf=0xcbfd3c6c) at /usr/src/sys/kern/subr_kdb.c:485 #8 0xc0823f88 in trap_fatal (frame=0xcbfd3c6c, eva=3735929062) at /usr/src/sys/i386/i386/trap.c:846 #9 0xc0823612 in trap (frame= {tf_fs = 8, tf_es = -872611800, tf_ds = -1067122648, tf_edi = -1049930560, tf_esi = -1051228800, tf_ebp = -872596268, tf_isp = -872596328, tf_ebx = 1, tf_edx = -966781544, tf_ecx = -559038242, tf_eax = 43761, tf_trapno = 12, tf_err = 0, tf_eip = -1066999614, tf_cs = 32, tf_eflags = 589826, tf_esp = -1051359088, tf_ss = 0}) at /usr/src/sys/i386/i386/trap.c:269 #10 0xc0810e4a in calltrap () at /usr/src/sys/i386/i386/exception.s:139 #11 0xc066e0c2 in softclock (dummy=0x0) at /usr/src/sys/kern/kern_timeout.c:220 #12 0xc064ed32 in ithread_execute_handlers (p=0xc1558890, ie=0xc1578580) at /usr/src/sys/kern/kern_intr.c:662 #13 0xc064ee63 in ithread_loop (arg=0xc151f8e0) at /usr/src/sys/kern/kern_intr.c:745 #14 0xc064dfc4 in fork_exit (callout=0xc064edfc , arg=0xc151f8e0, frame=0xcbfd3d38) at /usr/src/sys/kern/kern_fork.c:789 #15 0xc0810eac in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:208 (kgdb) up 11 #11 0xc066e0c2 in softclock (dummy=0x0) at /usr/src/sys/kern/kern_timeout.c:220 220 if (c->c_time != curticks) { (kgdb) list 215 curticks = softticks; 216 bucket = &callwheel[curticks & callwheelmask]; 217 c = TAILQ_FIRST(bucket); 218 while (c) { 219 depth++; 220 if (c->c_time != curticks) { 221 c = TAILQ_NEXT(c, c_links.tqe); 222 ++steps; 223 if (steps >= MAX_SOFTCLOCK_STEPS) { 224 nextsoftcheck = c; (kgdb) print *bucket $1 = {tqh_first = 0xc1891d80, tqh_last = 0xc1891d80} (kgdb) print c $2 = (struct callout *) 0xdeadc0de (kgdb) print *(bucket->tqh_first) $3 = {c_links = {sle = {sle_next = 0xdeadc0de}, tqe = {tqe_next = 0xdeadc0de, tqe_prev = 0xdeadc0de}}, c_time = -559038242, c_arg = 0xdeadc0de, c_func = 0xdeadc0de, c_mtx = 0xdeadc0de, c_flags = -559038242} (kgdb) print steps $4 = 1 -- WBR, Victor V. Snezhko EMail: snezhko@indorsoft.ru