Date: Wed, 20 Jul 2005 11:41:56 -0400
From: Edwin <edwin@verolan.com>
To: Giorgos Keramidas <keramida@freebsd.org>
Cc: Edwin <edwin@verolan.com>, freebsd-hackers@freebsd.org
Subject: Re: help w/panic under heavy load - 5.4
Message-ID: <20050720154156.GA26755@asx01.verolan.com>
In-Reply-To: <20050720100623.GA1470@beatrix.daedalusnetworks.priv>
References: <20050719034215.GB20752@asx01.verolan.com> <200507191120.37526.jhb@FreeBSD.org> <20050720020302.GA24474@asx01.verolan.com> <20050720100623.GA1470@beatrix.daedalusnetworks.priv>

Hi Giorgos,

Yes - I'm using polling, but it still panics even w/ polling disabled or
not compiled in. It is still reproducible with the same scenario (high load -
actually, not even really high load, just relative load - small network
packets).

I did both (output included below):
- disabled polling via sysctl
- re-compiled a new kernel w/o the option

It appears to still be the same error - the traces are the same w/ the
exception of sis_poll versus sis_intr.

I have tried various options in my kernel before posting - w/ and w/o
ipf, ipfw, polling - and they didn't seem to make a difference. But then
again, I wasn't getting traces from DDB w/ INVARIANTS at that point, so I
can't say for sure.

I'm trying to understand the particulars of this. I get the null
pointer part, but as to ip_fragment: is it fragmenting mbufs to handle IP
packets during switching, and failing while trying to copy data past the
end of the chain?

Thanks!
/edwin

Giorgos Keramidas (keramida@freebsd.org) wrote:
<....>
> > ether_input(c0f90000,c11fee00,c0f902d0,0,c08336ab) at ether_input+0x25d
> > sis_rxeof(c0f90000,1,5,c08e5500,c76d1ce0) at sis_rxeof+0x1ab
> > sis_poll(c0f90000,0,5) at sis_poll+0x7f
> > netisr_poll(0) at netisr_poll+0x188
> > swi_net(0) at swi_net+0x81
> > ithread_loop(c0ec6580,c76d1d48,c0ec6580,c060030c,0) at ithread_loop+0x124
> > fork_exit(c060030c,c0ec6580,c76d1d48) at fork_exit+0xa4
> > fork_trampoline() at fork_trampoline+0x8
> > --- trap 0x1, eip = 0, esp = 0xc76d1d7c, ebp = 0 ---
>
> Both tracebacks contain sis_poll() somewhere in the call stack? Are you
> using POLLING? If yes, can you try without POLLING and see if the crash
> can still be reproduced?
>
> - Giorgos
>

DDB output from disabling polling via sysctl - trace

fb54c# sysctl kern.polling.enable=0
kern.polling.enable: 1 -> 0
fb54c# panic: m_copym, offset > size of mbuf chain
KDB: enter: panic
[thread pid 21 tid 100015 ]
Stopped at kdb_enter+0x2b: nop
db> where
Tracing pid 21 tid 100015 td 0xc0ecc780
kdb_enter(c0821a6a) at kdb_enter+0x2b
panic(c0826049,0,c076b79c,c102b400,100) at panic+0xbb
m_copym(0,5dc,5c8,1,14) at m_copym+0x60
ip_fragment(c11bd80e,c76bfc6c,5dc,0,1) at ip_fragment+0x214
ip_fastforward(c11ab100) at ip_fastforward+0x6ed
ether_demux(c0f90000,c11ab100,52,c0f8abc0,29) at ether_demux+0x259
ether_input(c0f90000,c11ab100,c0f902d0,0,c08336ab) at ether_input+0x25d
sis_rxeof(c0f90000) at sis_rxeof+0x1ab
sis_intr(c0f90000) at sis_intr+0xf3
ithread_loop(c0ec6880,c76bfd48,c0ec6880,c060030c,0) at ithread_loop+0x124
fork_exit(c060030c,c0ec6880,c76bfd48) at fork_exit+0xa4
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xc76bfd7c, ebp = 0 ---
db> call doadump
Dumping 128 MB
 16 32 48 64 80 96 112
Dump complete
0xf
db>

mbsd05# kgdb kernel.debug /tmp/crash/vmcore.3
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".
#0 doadump () at pcpu.h:159
159 __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb) where
#0 doadump () at pcpu.h:159
#1 0xc04611f6 in db_fncall (dummy1=0, dummy2=0, dummy3=-1, dummy4=0xc76bf9f4 "(�k�") at /usr/src/sys/ddb/db_command.c:531
#2 0xc0461004 in db_command (last_cmdp=0xc08c9264, cmd_table=0x0, aux_cmd_tablep=0xc08483b8, aux_cmd_tablep_end=0xc08483d4) at /usr/src/sys/ddb/db_command.c:349
#3 0xc04610cc in db_command_loop () at /usr/src/sys/ddb/db_command.c:455
#4 0xc0462c51 in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:221
#5 0xc0627af2 in kdb_trap (type=3, code=0, tf=0xc76bfb30) at /usr/src/sys/kern/subr_kdb.c:468
#6 0xc07b6394 in trap (frame=
      {tf_fs = -949288936, tf_es = -1067319280, tf_ds = -1065222128, tf_edi = 1, tf_esi = -1065197495, tf_ebp = -949224592, tf_isp = -949224612, tf_ebx = -949224548, tf_edx = 0, tf_ecx = -1060921344, tf_eax = 18, tf_trapno = 3, tf_err = 0, tf_eip = -1067288461, tf_cs = -1065222136, tf_eflags = 658, tf_esp = -949224560, tf_ss = -1067376657}) at /usr/src/sys/i386/i386/trap.c:584
#7 0xc07a69ca in calltrap () at /usr/src/sys/i386/i386/exception.s:140
#8 0xc76b0018 in ?? ()
#9 0xc0620010 in schedcpu () at /usr/src/sys/kern/sched_4bsd.c:461
#10 0xc0611fef in panic (fmt=0xc0820008 "default") at /usr/src/sys/kern/kern_shutdown.c:550
#11 0xc0641a2c in m_copym (m=0x0, off0=1500, len=1480, wait=1) at /usr/src/sys/kern/uipc_mbuf.c:385
#12 0xc069b694 in ip_fragment (ip=0xc11bd80e, m_frag=0xc76bfc6c, mtu=-1056787456, if_hwassist_flags=0, sw_csum=1) at /usr/src/sys/netinet/ip_output.c:967
#13 0xc06933c1 in ip_fastforward (m=0xc11ab100) at /usr/src/sys/netinet/ip_fastfwd.c:572
#14 0xc0672a59 in ether_demux (ifp=0xc0f90000, m=0xc11ab100) at /usr/src/sys/net/if_ethersubr.c:770
#15 0xc06727f5 in ether_input (ifp=0xc0f90000, m=0xc11ab100) at /usr/src/sys/net/if_ethersubr.c:631
#16 0xc0713507 in sis_rxeof (sc=0xc0f90000) at /usr/src/sys/pci/if_sis.c:1636
#17 0xc071398f in sis_intr (arg=0xc0f90000) at /usr/src/sys/pci/if_sis.c:1841
#18 0xc0600430 in ithread_loop (arg=0xc0ec6880) at /usr/src/sys/kern/kern_intr.c:547
#19 0xc05ff8a4 in fork_exit (callout=0xc060030c <ithread_loop>, arg=0xc0ec6880, frame=0xc76bfd48) at /usr/src/sys/kern/kern_fork.c:791
#20 0xc07a6a2c in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:209
(kgdb) f 12
#12 0xc069b694 in ip_fragment (ip=0xc11bd80e, m_frag=0xc76bfc6c, mtu=-1056787456, if_hwassist_flags=0, sw_csum=1) at /usr/src/sys/netinet/ip_output.c:967
967             m->m_next = m_copy(m0, off, len);
(kgdb) l
962                     len = ip->ip_len - off;
963                     m->m_flags |= M_LASTFRAG;
964             } else
965                     mhip->ip_off |= IP_MF;
966             mhip->ip_len = htons((u_short)(len + mhlen));
967             m->m_next = m_copy(m0, off, len);
968             if (m->m_next == NULL) {        /* copy failed */
969                     m_free(m);
970                     error = ENOBUFS;        /* ??? */
971                     ipstat.ips_odropped++;
(kgdb) f 11
#11 0xc0641a2c in m_copym (m=0x0, off0=1500, len=1480, wait=1) at /usr/src/sys/kern/uipc_mbuf.c:385
385             KASSERT(m != NULL, ("m_copym, offset > size of mbuf chain"));
(kgdb) l
380     KASSERT(len >= 0, ("m_copym, negative len %d", len));
381     MBUF_CHECKSLEEP(wait);
382     if (off == 0 && m->m_flags & M_PKTHDR)
383             copyhdr = 1;
384     while (off > 0) {
385             KASSERT(m != NULL, ("m_copym, offset > size of mbuf chain"));
386             if (off < m->m_len)
387                     break;
388             off -= m->m_len;
389             m = m->m_next;
(kgdb) i loc
n = (struct mbuf *) 0xc102b400
np = (struct mbuf **) 0xc102b454
off = 1432
top = (struct mbuf *) 0x1
copyhdr = 0
(kgdb) p m
$1 = (struct mbuf *) 0x0
(kgdb)

*** end of sysctl polling disabled ***

*** begin of no POLLING option kernel ***

fb54c# panic: m_copym, offset > size of mbuf chain
KDB: enter: panic
[thread pid 21 tid 100015 ]
Stopped at kdb_enter+0x2b: nop
db> where
Tracing pid 21 tid 100015 td 0xc0ecc780
kdb_enter(c081f47b) at kdb_enter+0x2b
panic(c0823a5a,0,c07694e0,c102bb00,100) at panic+0xbb
m_copym(0,5dc,5c8,1,14) at m_copym+0x60
ip_fragment(c130100e,c76bfc6c,5dc,0,1) at ip_fragment+0x214
ip_fastforward(c12e5700) at ip_fastforward+0x6ed
ether_demux(c0f90000,c12e5700,52,c0f8ab18,22) at ether_demux+0x259
ether_input(c0f90000,c12e5700,c0f902cc,0,c08310a3) at ether_input+0x25d
sis_rxeof(c0f90000) at sis_rxeof+0x18b
sis_intr(c0f90000) at sis_intr+0xa3
ithread_loop(c0ec6880,c76bfd48,c0ec6880,c05fedf0,0) at ithread_loop+0x124
fork_exit(c05fedf0,c0ec6880,c76bfd48) at fork_exit+0xa4
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xc76bfd7c, ebp = 0 ---
db> call doadump
Dumping 128 MB
 16 32 48 64 80 96 112
Dump complete
0xf
db> reset

mbsd05# kgdb kernel.debug /tmp/crash/vmcore.5
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".
#0 doadump () at pcpu.h:159
159 __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb) where
#0 doadump () at pcpu.h:159
#1 0xc0461106 in db_fncall (dummy1=0, dummy2=0, dummy3=43, dummy4=0xc76bf9f4 "(�k�") at /usr/src/sys/ddb/db_command.c:531
#2 0xc0460f14 in db_command (last_cmdp=0xc08c6764, cmd_table=0x0, aux_cmd_tablep=0xc0845d60, aux_cmd_tablep_end=0xc0845d7c) at /usr/src/sys/ddb/db_command.c:349
#3 0xc0460fdc in db_command_loop () at /usr/src/sys/ddb/db_command.c:455
#4 0xc0462b61 in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:221
#5 0xc06265d6 in kdb_trap (type=3, code=0, tf=0xc76bfb30) at /usr/src/sys/kern/subr_kdb.c:468
#6 0xc07b40bc in trap (frame=
      {tf_fs = -949288936, tf_es = -1067319280, tf_ds = -1065222128, tf_edi = 1, tf_esi = -1065207206, tf_ebp = -949224592, tf_isp = -949224612, tf_ebx = -949224548, tf_edx = 0, tf_ecx = -1060921344, tf_eax = 18, tf_trapno = 3, tf_err = 0, tf_eip = -1067293865, tf_cs = -1065222136, tf_eflags = 658, tf_esp = -949224560, tf_ss = -1067382061}) at /usr/src/sys/i386/i386/trap.c:584
#7 0xc07a470a in calltrap () at /usr/src/sys/i386/i386/exception.s:140
#8 0xc76b0018 in ?? ()
#9 0xc0620010 in blist_free (bl=0xc76bfb9c, blkno=Unhandled dwarf expression opcode 0x93
) at /usr/src/sys/kern/subr_blist.c:245
#10 0xc0610ad3 in panic (fmt=0xc0820008 "_scope_sys") at /usr/src/sys/kern/kern_shutdown.c:550
#11 0xc0640510 in m_copym (m=0x0, off0=1500, len=1480, wait=1) at /usr/src/sys/kern/uipc_mbuf.c:385
#12 0xc069a170 in ip_fragment (ip=0xc130100e, m_frag=0xc76bfc6c, mtu=-1056785664, if_hwassist_flags=0, sw_csum=1) at /usr/src/sys/netinet/ip_output.c:967
#13 0xc0691e9d in ip_fastforward (m=0xc12e5700) at /usr/src/sys/netinet/ip_fastfwd.c:572
#14 0xc067153d in ether_demux (ifp=0xc0f90000, m=0xc12e5700) at /usr/src/sys/net/if_ethersubr.c:770
#15 0xc06712d9 in ether_input (ifp=0xc0f90000, m=0xc12e5700) at /usr/src/sys/net/if_ethersubr.c:631
#16 0xc0711963 in sis_rxeof (sc=0xc0f90000) at /usr/src/sys/pci/if_sis.c:1636
#17 0xc0711c53 in sis_intr (arg=0xc0f90000) at /usr/src/sys/pci/if_sis.c:1841
#18 0xc05fef14 in ithread_loop (arg=0xc0ec6880) at /usr/src/sys/kern/kern_intr.c:547
#19 0xc05fe388 in fork_exit (callout=0xc05fedf0 <ithread_loop>, arg=0xc0ec6880, frame=0xc76bfd48) at /usr/src/sys/kern/kern_fork.c:791
#20 0xc07a476c in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:209
(kgdb) f 11
#11 0xc0640510 in m_copym (m=0x0, off0=1500, len=1480, wait=1) at /usr/src/sys/kern/uipc_mbuf.c:385
385             KASSERT(m != NULL, ("m_copym, offset > size of mbuf chain"));
(kgdb) p m
$1 = (struct mbuf *) 0x0
(kgdb)

*** end of no POLLING option kernel ***
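
For anyone following along, the offset walk in the m_copym() listing can be replayed in user space to see how the kgdb values fit together (off0=1500, local off = 1432, m = NULL). This is only a rough sketch, not kernel code. `struct toy_mbuf`, `skip_to_offset`, `frag_payload` and `replay_panic` are made-up names. The 68-byte chain length is an inference from 1500 - 1432, on the assumption that the local `off` reflects the bytes already skipped. The MTU of 1500 comes from the 0x5dc in the DDB trace, the 20-byte header length assumes an IP header without options, and the payload formula is how ip_fragment() is assumed to size each fragment.

```c
#include <stddef.h>
#include <stdio.h>

/* Toy stand-in for struct mbuf: only the two fields the walk touches. */
struct toy_mbuf {
	struct toy_mbuf *m_next;
	int m_len;
};

/*
 * Mirrors the offset-skipping loop from the m_copym() listing.
 * Returns the mbuf holding byte `off`, or NULL if the chain is shorter
 * than `off`. *remaining gets whatever is left of `off` at that point.
 */
static struct toy_mbuf *
skip_to_offset(struct toy_mbuf *m, int off, int *remaining)
{
	while (off > 0) {
		if (m == NULL)		/* where the kernel KASSERT fires */
			break;
		if (off < m->m_len)
			break;
		off -= m->m_len;
		m = m->m_next;
	}
	*remaining = off;
	return (m);
}

/* Payload bytes per fragment, rounded down to a multiple of 8 (assumed). */
static int
frag_payload(int mtu, int hlen)
{
	return ((mtu - hlen) & ~7);
}

/* Replays the failing call; returns 1 if the walk ran off the chain. */
static int
replay_panic(void)
{
	int mtu = 1500;				/* 0x5dc in the DDB trace */
	int hlen = 20;				/* assumed: no IP options */
	int len = frag_payload(mtu, hlen);
	int off0 = hlen + len;			/* start of 2nd fragment */
	struct toy_mbuf only = { NULL, 68 };	/* assumed chain: 68 bytes */
	int remaining;
	struct toy_mbuf *m = skip_to_offset(&only, off0, &remaining);

	printf("len=%d off0=%d m=%s remaining=%d\n",
	    len, off0, m != NULL ? "non-NULL" : "NULL", remaining);
	return (m == NULL);
}
```

Calling replay_panic() from a small main() should report a NULL mbuf with 1432 bytes of offset still to skip, which lines up with the kgdb locals. If the inference holds, the IP header claimed far more data than the mbuf chain actually carried. The sketch does not show why that happened.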