From owner-freebsd-current@FreeBSD.ORG Tue Nov 8 20:52:52 2005 Return-Path: X-Original-To: freebsd-current@freebsd.org Delivered-To: freebsd-current@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id F31C116A41F; Tue, 8 Nov 2005 20:52:51 +0000 (GMT) (envelope-from snezhko@indorsoft.ru) Received: from indor.net.tomline.ru (indor.net.tomline.ru [213.183.100.90]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2117B43D45; Tue, 8 Nov 2005 20:52:49 +0000 (GMT) (envelope-from snezhko@indorsoft.ru) Received: from SNEZHKO by indorsoft.ru (MDaemon.PRO.v7.2.2.R) with ESMTP id md50000028576.msg; Wed, 09 Nov 2005 02:52:40 +0600 X-AntiVirus: Checked by Dr.Web [version: 4.32b, engine: 4.32b, virus records: 126749, updated: 7.11.2005] To: freebsd-current@freebsd.org References: <20051027022313.R675@kushnir1.kiev.ua> <200511030059.05946.max@love2party.net> <200511031500.00839.jhb@freebsd.org> From: Victor Snezhko Date: Wed, 09 Nov 2005 02:52:39 +0600 In-Reply-To: (Victor Snezhko's message of "Fri, 04 Nov 2005 17:01:46 +0600") Message-ID: User-Agent: Gnus/5.110002 (No Gnus v0.2) Emacs/21.3 (windows-nt) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-Spam-Processed: indor.net.tomline.ru, Wed, 09 Nov 2005 02:52:40 +0600 (not processed: spam filter disabled) X-Return-Path: snezhko@indorsoft.ru X-VVS-Spam: false Cc: Max Laier Subject: Re: CURRENT + amd64 + user-ppp = panic X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 08 Nov 2005 20:52:52 -0000 --=-=-= Victor Snezhko writes: >> Tomorrow I'll try to trace all callout-related operations in nd6 >> and/or the whole netinet6. > > Hmmm... trace shows that the callout_stop/callout_drain call > always receives a pointer that has not been initialized via > callout_init, at least not in /usr/src/sys/netinet6/* Sorry, I lied here. Details below. > I'll debug this further and report the results. Hmm, debugging was not so fast... So, here is what I have for the moment: 1) I was dead wrong about stop called on a pointer that was not initialized(there was a typo in one of my printf's). Call trace shows the following: At first, nd6_rtrequest(RTM_ADD) is called, it calls callout_init on the timer and then callout_stop (through nd6_llinfo_settimer). Then nd6_rtrequest(RTM_DELETE) is called, and it calls callout_stop without callout_init. This seemed to be one of the possible cause of the problem, but when I commented out the last call to nd6_llinfo_settimer(ln, -1), nothing changed. 2) I tried to trace where corruption occurs. In order to do this I inserted slow checks into all the code in kern_timeout.c that could modify the callwheel. No success - first 0xdeadc0de's were detected by softclock(), not by any other call. Here is the corresponding patch: --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=kern_timeout.diff --- kern_timeout.c.orig Mon Nov 7 17:26:56 2005 +++ kern_timeout.c Wed Nov 9 02:14:50 2005 @@ -110,6 +110,47 @@ static struct cv callout_wait; static int wakeup_done_ctr; +void callout_check_callwheel(void); + +/* + * callout_check_callwheel() - check all the callwheel for deadc0de'd entries + * + * This code cycles through the callwheel and if it finds + * broken callout, it panics. + * + * callout_lock must be acquired before calling this function + * + */ +void +callout_check_callwheel(void) +{ + struct callout *c, *c2; + struct callout_tailq *bucket; + int i; + static int in_panic=0; + + if (in_panic) return; + + for (i = 0; i < callwheelsize; ++i) { + bucket = &callwheel[i]; + c = TAILQ_FIRST(bucket); + if ((int)c == 0xdeadc0de) { + in_panic = 1; + panic("deadc0de found at the beginning of bucket"); + break; + } + while (c) { + c2 = TAILQ_NEXT(c, c_links.tqe); + if ((int)c2 == 0xdeadc0de) { + in_panic = 1; + panic("c->TAILQ_NEXT = 0xdeadc0de"); + break; + } + c = c2; + } + } +} + /* * kern_timeout_callwheel_alloc() - kernel low level callwheel initialization * @@ -206,6 +247,9 @@ depth = 0; steps = 0; mtx_lock_spin(&callout_lock); + + callout_check_callwheel(); + while (softticks != ticks) { softticks++; /* @@ -335,6 +379,9 @@ avg_mtxcalls += (mtxcalls * 1000 - avg_mtxcalls) >> 8; avg_gcalls += (gcalls * 1000 - avg_gcalls) >> 8; nextsoftcheck = NULL; + + callout_check_callwheel(); + mtx_unlock_spin(&callout_lock); } @@ -365,16 +412,23 @@ mtx_lock_spin(&callout_lock); + callout_check_callwheel(); + /* Fill in the next free callout structure. */ new = SLIST_FIRST(&callfree); if (new == NULL) /* XXX Attempt to malloc first */ panic("timeout table full"); SLIST_REMOVE_HEAD(&callfree, c_links.sle); - + + callout_check_callwheel(); + callout_reset(new, to_ticks, ftn, arg); handle.callout = new; + + callout_check_callwheel(); + mtx_unlock_spin(&callout_lock); return (handle); } @@ -395,8 +449,14 @@ return; mtx_lock_spin(&callout_lock); + + callout_check_callwheel(); + if (handle.callout->c_func == ftn && handle.callout->c_arg == arg) callout_stop(handle.callout); + + callout_check_callwheel(); + mtx_unlock_spin(&callout_lock); } @@ -437,6 +497,9 @@ #endif mtx_lock_spin(&callout_lock); + + callout_check_callwheel(); + if (c == curr_callout) { /* * We're being asked to reschedule a callout which is @@ -463,6 +526,8 @@ cancelled = 1; + callout_check_callwheel(); + /* * Part of the normal "stop a pending callout" process * is to clear the CALLOUT_ACTIVE and CALLOUT_PENDING @@ -481,14 +546,20 @@ if (to_ticks <= 0) to_ticks = 1; + callout_check_callwheel(); + c->c_arg = arg; c->c_flags |= (CALLOUT_ACTIVE | CALLOUT_PENDING); c->c_func = ftn; c->c_time = ticks + to_ticks; + TAILQ_INSERT_TAIL(&callwheel[c->c_time & callwheelmask], c, c_links.tqe); - mtx_unlock_spin(&callout_lock); + callout_check_callwheel(); + + mtx_unlock_spin(&callout_lock); + return (cancelled); } @@ -511,6 +582,9 @@ } mtx_lock_spin(&callout_lock); + + callout_check_callwheel(); + /* * Don't attempt to delete a callout that's not on the queue. */ @@ -547,15 +621,25 @@ } c->c_flags &= ~(CALLOUT_ACTIVE | CALLOUT_PENDING); + callout_check_callwheel(); + if (nextsoftcheck == c) { nextsoftcheck = TAILQ_NEXT(c, c_links.tqe); } + + callout_check_callwheel(); + TAILQ_REMOVE(&callwheel[c->c_time & callwheelmask], c, c_links.tqe); + callout_check_callwheel(); + if (c->c_flags & CALLOUT_LOCAL_ALLOC) { c->c_func = NULL; SLIST_INSERT_HEAD(&callfree, c, c_links.sle); } + + callout_check_callwheel(); + mtx_unlock_spin(&callout_lock); return (1); } @@ -641,6 +725,9 @@ /* don't collide with softclock() */ mtx_lock_spin(&callout_lock); + + callout_check_callwheel(); + for (p = calltodo.c_next; p != NULL; p = p->c_next) { p->c_time -= delta_ticks; @@ -651,6 +738,9 @@ /* take back the ticks the timer didn't use (p->c_time <= 0) */ delta_ticks = -p->c_time; } + + callout_check_callwheel(); + mtx_unlock_spin(&callout_lock); return; --=-=-= pointer comparisons are probably not 64bit-clean, correct me please if those pointer->int casts are bad. If the backtrace is needed I can provide it, just ask. My check triggers panic when called from a softclock(). 3) Recently, thanks to Mark Tinguely, I have found out that I could do additional checks in trash_dtor() in uma_dbg.c No results again. I inserted the following check: --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=uma_dbg.diff --- uma_dbg.c.orig Mon Nov 7 23:05:09 2005 +++ uma_dbg.c Tue Nov 8 17:37:24 2005 @@ -41,6 +41,8 @@ #include #include #include +#include +#include #include #include @@ -86,8 +88,33 @@ { int cnt; u_int32_t *p; + struct callout *c; + struct callout_tailq *bucket; + int i; cnt = size / sizeof(uma_junk); + + mtx_lock_spin(&callout_lock); + + for (i = 0; i < callwheelsize; ++i) { + bucket = &callwheel[i]; + for (c = TAILQ_FIRST(bucket); c != NULL; + c = TAILQ_NEXT(c, c_links.tqe)) { + int c2 = (int)c; + int mem2 = (int)mem; + if ((u_int32_t)c == uma_junk) { + kdb_enter("trash_dtor: uma_junk found in a "\ + "callwheel element"); + break; + } + if (c2 >= mem2 && c2 < mem2 + size) { + kdb_enter("trash_dtor: found invalid "\ + "callwhel element"); + } + } + } + + mtx_unlock_spin(&callout_lock); for (p = mem; cnt > 0; cnt--, p++) *p = uma_junk; --=-=-= (Again not quite 64bit-ready, the same nasty pointer casts) Here is the backtrace: --=-=-= Content-Disposition: inline; filename=bt2.txt /var/crash # kgdb /usr/obj/usr/src/sys/VVS/kernel vmcore.24 [... skipped ...] Unread portion of the kernel message buffer: KDB: enter: trash_dtor: uma_junk found in a callwheel element panic: from debugger cpuid = 0 Uptime: 4m2s Dumping 63 MB (3 chunks) chunk 0: 1MB (159 pages) ... ok chunk 1: 62MB (15856 pages) 46 30 14 ... ok chunk 2: 1MB (256 pages) #0 doadump () at pcpu.h:165 165 pcpu.h: No such file or directory. in pcpu.h (kgdb) bt #0 doadump () at pcpu.h:165 #1 0xc0660824 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:399 #2 0xc0660b39 in panic (fmt=0xc0856fa0 "from debugger") at /usr/src/sys/kern/kern_shutdown.c:555 #3 0xc046cee1 in db_panic (addr=-1066954485, have_addr=0, count=-1, modif=0xc60ac934 "") at /usr/src/sys/ddb/db_command.c:434 #4 0xc046ce78 in db_command (last_cmdp=0xc0947a84, cmd_table=0x0, aux_cmd_tablep=0xc08bda78, aux_cmd_tablep_end=0xc08bda94) at /usr/src/sys/ddb/db_command.c:403 #5 0xc046cf40 in db_command_loop () at /usr/src/sys/ddb/db_command.c:454 #6 0xc046eb59 in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:221 #7 0xc06793a4 in kdb_trap (type=3, code=0, tf=0xc60acacc) at /usr/src/sys/kern/subr_kdb.c:473 #8 0xc0821630 in trap (frame= {tf_fs = 8, tf_es = 40, tf_ds = 40, tf_edi = -1053265664, tf_esi = -1053265920, tf_ebp = -972371188, tf_isp = -972371208, tf_ebx = -559038242, tf_edx = 0, tf_ecx = -1056755712, tf_eax = 62, tf_trapno = 3, tf_err = 0, tf_eip = -1066954485, tf_cs = 32, tf_eflags = 646, tf_esp = -972371156, tf_ss = -1065665882}) at /usr/src/sys/i386/i386/trap.c:610 #9 0xc080ecba in calltrap () at /usr/src/sys/i386/i386/exception.s:139 #10 0xc067910b in kdb_enter (msg=0x3e
) ---Type to continue, or q to quit--- at cpufunc.h:60 #11 0xc07b3aa6 in trash_dtor (mem=0xc1387000, size=256, arg=0x0) at /usr/src/sys/vm/uma_dbg.c:104 #12 0xc0657dd8 in mb_dtor_mbuf (mem=0xc1387000, size=256, arg=0x0) at /usr/src/sys/kern/kern_mbuf.c:245 #13 0xc07b23e4 in uma_zfree_arg (zone=0xc1042960, item=0xc1387000, udata=0x0) at /usr/src/sys/vm/uma_core.c:2270 #14 0xc069b9ba in soreceive (so=0xc143d9bc, psa=0xc60acbe8, uio=0xc60acbf4, mp0=0x0, controlp=0x0, flagsp=0xc60acc74) at uma.h:303 #15 0xc06a08c7 in kern_recvit (td=0xc12f3320, s=5, mp=0xc60acc5c, namelenp=0xbfbfdacc, segflg=62) at /usr/src/sys/kern/uipc_syscalls.c:986 #16 0xc06a0af6 in recvit (td=0xc12f3320, s=5, mp=0xc60acc5c, namelenp=0xbfbfdacc) at /usr/src/sys/kern/uipc_syscalls.c:1094 #17 0xc06a0b64 in recvfrom (td=0xc12f3320, uap=0xc60acd04) at /usr/src/sys/kern/uipc_syscalls.c:1131 #18 0xc0821e96 in syscall (frame= {tf_fs = 59, tf_es = 59, tf_ds = 59, tf_edi = 134570084, tf_esi = 134570080, tf_ebp = -1077940616, tf_isp = -972370588, tf_ebx = 134546020, tf_edx = -1077940864, tf_ecx = 5, tf_eax = 29, tf_trapno = 32, tf_err = 2, tf_eip = 672166495, tf_cs = 51, tf_eflags = 66198, tf_esp = -1077945812, tf_ss = 59}) at /usr/src/sys/i386/i386/trap.c:1001 ---Type to continue, or q to quit--- #19 0xc080ed0f in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:200 #20 0x00000033 in ?? () Previous frame inner to this frame (corrupt stack?) (kgdb) up 11 #11 0xc07b3aa6 in trash_dtor (mem=0xc1387000, size=256, arg=0x0) at /usr/src/sys/vm/uma_dbg.c:104 104 kdb_enter("trash_dtor: uma_junk found in a "\ (kgdb) list 99 for (c = TAILQ_FIRST(bucket); c != NULL; 100 c = TAILQ_NEXT(c, c_links.tqe)) { 101 int c2 = (int)c; 102 int mem2 = (int)mem; 103 if ((u_int32_t)c == uma_junk) { 104 kdb_enter("trash_dtor: uma_junk found in a "\ 105 "callwheel element"); 106 break; 107 } 108 if (c2 >= mem2 && c2 < mem2 + size) { (kgdb) up 1 #12 0xc0657dd8 in mb_dtor_mbuf (mem=0xc1387000, size=256, arg=0x0) at /usr/src/sys/kern/kern_mbuf.c:245 245 trash_dtor(mem, size, arg); (kgdb) list 240 241 m = (struct mbuf *)mem; 242 if ((m->m_flags & M_PKTHDR) != 0) 243 m_tag_delete_chain(m, NULL); 244 #ifdef INVARIANTS 245 trash_dtor(mem, size, arg); 246 #endif 247 } 248 249 /* XXX Only because of stats */ (kgdb) up 1 #13 0xc07b23e4 in uma_zfree_arg (zone=0xc1042960, item=0xc1387000, udata=0x0) at /usr/src/sys/vm/uma_core.c:2270 2270 zone->uz_dtor(item, keg->uk_size, udata); (kgdb) list 2265 #endif 2266 CTR2(KTR_UMA, "uma_zfree_arg thread %x zone %s", curthread, 2267 zone->uz_name); 2268 2269 if (zone->uz_dtor) 2270 zone->uz_dtor(item, keg->uk_size, udata); 2271 #ifdef INVARIANTS 2272 ZONE_LOCK(zone); 2273 if (keg->uk_flags & UMA_ZONE_MALLOC) 2274 uma_dbg_free(zone, udata, item); (kgdb) up 1 #14 0xc069b9ba in soreceive (so=0xc143d9bc, psa=0xc60acbe8, uio=0xc60acbf4, mp0=0x0, controlp=0x0, flagsp=0xc60acc74) at uma.h:303 303 uma_zfree_arg(zone, item, NULL); (kgdb) list 298 static __inline void uma_zfree(uma_zone_t zone, void *item); 299 300 static __inline void 301 uma_zfree(uma_zone_t zone, void *item) 302 { 303 uma_zfree_arg(zone, item, NULL); 304 } 305 306 /* 307 * XXX The rest of the prototypes in this header are h0h0 magic for the VM. (kgdb) up 1 #15 0xc06a08c7 in kern_recvit (td=0xc12f3320, s=5, mp=0xc60acc5c, namelenp=0xbfbfdacc, segflg=62) at /usr/src/sys/kern/uipc_syscalls.c:986 986 error = so->so_proto->pr_usrreqs->pru_soreceive(so, &fromsa, &auio, (kgdb) list 981 #ifdef KTRACE 982 if (KTRPOINT(td, KTR_GENIO)) 983 ktruio = cloneuio(&auio); 984 #endif 985 len = auio.uio_resid; 986 error = so->so_proto->pr_usrreqs->pru_soreceive(so, &fromsa, &auio, 987 (struct mbuf **)0, mp->msg_control ? &control : (struct mbuf **)0, 988 &mp->msg_flags); 989 if (error) { 990 if (auio.uio_resid != (int)len && (error == ERESTART || (kgdb) print *(so->so_proto->pr_usrreqs) $5 = {__Break_the_struct_layout_for_now = 0, pru_abort = 0xc06a24ec , pru_accept = 0xc06a2600 , pru_attach = 0xc06a26b4 , pru_bind = 0xc06a26d0 , pru_connect = 0xc06a2740 , pru_connect2 = 0xc06a27c8 , pru_control = 0xc069f300 , pru_detach = 0xc06a2838 , pru_disconnect = 0xc06a28a0 , pru_listen = 0xc06a2908 , pru_peeraddr = 0xc06a2980 , pru_rcvd = 0xc06a2a34 , pru_rcvoob = 0xc069f348 , pru_send = 0xc06a2bc4 , pru_sense = 0xc06a3024 , pru_shutdown = 0xc06a30f0 , pru_sockaddr = 0xc06a3168 , pru_sosend = 0xc069ad84 , pru_soreceive = 0xc069b5b0 , pru_sopoll = 0xc069d090 , pru_sosetlabel = 0xc069f3b8 } (kgdb) up 1 #16 0xc06a0af6 in recvit (td=0xc12f3320, s=5, mp=0xc60acc5c, namelenp=0xbfbfdacc) at /usr/src/sys/kern/uipc_syscalls.c:1094 1094 return (kern_recvit(td, s, mp, namelenp, UIO_USERSPACE)); (kgdb) list 1089 int s; 1090 struct msghdr *mp; 1091 void *namelenp; 1092 { 1093 1094 return (kern_recvit(td, s, mp, namelenp, UIO_USERSPACE)); 1095 } 1096 1097 /* 1098 * MPSAFE --=-=-= The thing I want to note is that callwheel corruptions after applying this patch stop to be reproducible. They may occur, may not. 4) I inserted a kdb_enter() call in the beginning of nd6_llinfo_settimer(). Then tried to step some functions up the stack. Pressed "n" a couple of times, then exit via "c". Panic didn't occur! (It would certainly occur if I exited the kdb immediately after kdb_enter). Don't know what to think about the last. Improper locking? Freeing a callwheel entry and then it's slow removal? Strange... P.S.: Perharps I should file a PR about all this as the problem turned out to be not so easy. To what degree should I include the details above? -- WBR, Victor V. Snezhko EMail: snezhko@indorsoft.ru --=-=-=--