From: Olafur Osvaldsson
Date: Fri, 1 Jul 2005 12:04:13 +0000
To: freebsd-hackers@FreeBSD.org
Cc: rwatson@FreeBSD.org
Subject: Help regarding panics within sb* (sbdrop) functions in FreeBSD 5
Message-ID: <20050701120413.GL14411@isnic.is>

Hi,

For some time now a few of my FreeBSD systems have been crashing. They
seem to panic in the sb* (sbdrop) functions, and I have at least two
dumps that point to this. Both panics seem to be caused by an unexpected
NULL value.

This is on 5.4-RELEASE-p1:

FreeBSD 5.4-RELEASE-p1 #6: Tue Jun 28 13:42:14 UTC 2005

These crashes have been happening since the early 5 series.

Every time one of these crashes occurs, there are entries like these in
the system log:

Jun 21 14:21:38 bastet kernel: bge0: discard frame w/o packet header
Jun 21 14:23:39 bastet kernel: bge0: discard frame w/o packet header
Jun 21 14:23:59 bastet kernel: bge0: discard frame w/o packet header
Jun 21 14:24:33 bastet kernel: me w/o packet header
Jun 21 14:24:38 bastet kernel: bge0: discard frame w/o packet header

========================================================================

For instance, the last panic was in sbdrop_locked:

#0  doadump () at pcpu.h:159
#1  0xc060c5ef in boot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:410
#2  0xc060c915 in panic (fmt=0xc081c602 "sbdrop")
    at /usr/src/sys/kern/kern_shutdown.c:566
#3  0xc0647000 in sbdrop_locked (sb=0xc5620484, len=3793)
    at /usr/src/sys/kern/uipc_socket2.c:1149
#4  0xc06a3957 in tcp_input (m=0xc39d4600, off0=20)
    at /usr/src/sys/netinet/tcp_input.c:2195
#5  0xc069b2e5 in ip_input (m=0xc39d4600)
    at /usr/src/sys/netinet/ip_input.c:776
#6  0xc067e01b in netisr_processqueue (ni=0xc08da998)
    at /usr/src/sys/net/netisr.c:233
#7  0xc067e216 in swi_net (dummy=0x0)
    at /usr/src/sys/net/netisr.c:346
#8  0xc05f8599 in ithread_loop (arg=0xc351d400)
    at /usr/src/sys/kern/kern_intr.c:547
#9  0xc05f7635 in fork_exit (callout=0xc05f8440, arg=0xc351d400,
    frame=0xe684dd48) at /usr/src/sys/kern/kern_fork.c:791
#10 0xc079fe1c in fork_trampoline ()
    at /usr/src/sys/i386/i386/exception.s:209

#3  0xc0647000 in sbdrop_locked (sb=0xc5620484, len=3793)
    at /usr/src/sys/kern/uipc_socket2.c:1149
1149                            panic("sbdrop");
(kgdb) list
1144
1145            next = (m = sb->sb_mb) ? m->m_nextpkt : 0;
1146            while (len > 0) {
1147                    if (m == 0) {
1148                            if (next == 0)
1149                                    panic("sbdrop");
1150                            m = next;
1151                            next = m->m_nextpkt;
1152                            continue;
1153                    }
1154                    if (m->m_len > len) {
1155                            m->m_len -= len;
1156                            m->m_data += len;
1157                            sb->sb_cc -= len;
1158                            if (m->m_type != MT_DATA && m->m_type != MT_HEADER &&
1159                                m->m_type != MT_OOBDATA)
1160                                    sb->sb_ctl -= len;
1161                            break;
1162                    }
1163                    len -= m->m_len;
1164                    sbfree(sb, m);
1165                    m = m_free(m);
1166            }

My questions are these:

Is it not possible for m_nextpkt to be NULL when the current packet is
the last one in the buffer?

Why does that require a panic? Could the code not drop the frames and
wait for a resend instead?

========================================================================

Another dump I have, from earlier on this same machine, panics in
sbappendaddr_locked:

#0  doadump () at pcpu.h:159
#1  0xc060560f in boot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:397
#2  0xc0605935 in panic (fmt=0xc080f8d8 "sbappendaddr_locked")
    at /usr/src/sys/kern/kern_shutdown.c:553
#3  0xc063f6f9 in sbappendaddr_locked (sb=0xc923941c, asa=0xc086cd60,
    m0=0xc9ca0100, control=0xc9ca0100)
    at /usr/src/sys/kern/uipc_socket2.c:934
#4  0xc0696082 in raw_append (last=0x0, ip=0xc4054810, n=0xc9ca0100)
    at /usr/src/sys/netinet/raw_ip.c:169
#5  0xc06962ed in rip_input (m=0xc9ca0100, off=20)
    at /usr/src/sys/netinet/raw_ip.c:231
#6  0xc0692b71 in ip_input (m=0xc9ca0100)
    at /usr/src/sys/netinet/ip_input.c:739
#7  0xc06761ef in netisr_processqueue (ni=0xc08ccf38)
    at /usr/src/sys/net/netisr.c:233
#8  0xc06763ea in swi_net (dummy=0x0)
    at /usr/src/sys/net/netisr.c:346
#9  0xc05f14fd in ithread_loop (arg=0xc3491d00)
    at /usr/src/sys/kern/kern_intr.c:547
#10 0xc05f05ad in fork_exit (callout=0xc05f13a4, arg=0xc3491d00,
    frame=0xe680ad48) at /usr/src/sys/kern/kern_fork.c:811
#11 0xc079558c in fork_trampoline ()
    at /usr/src/sys/i386/i386/exception.s:209

#3  0xc063f6f9 in sbappendaddr_locked (sb=0xc923941c, asa=0xc086cd60,
    m0=0xc9ca0100, control=0xc9ca0100)
    at /usr/src/sys/kern/uipc_socket2.c:934
934                     panic("sbappendaddr_locked");
(kgdb) list
929             int space = asa->sa_len;
930
931             SOCKBUF_LOCK_ASSERT(sb);
932
933             if (m0 && (m0->m_flags & M_PKTHDR) == 0)
934                     panic("sbappendaddr_locked");
935             if (m0)
936                     space += m0->m_pkthdr.len;
937             space += m_length(control, &n);
938

========================================================================

For me these crashes have only happened on machines running BIND 9. As
far as I can remember, everyone who has contacted me about this is also
running BIND on the machines that log the "discard frame w/o packet
header" message before the crash. This seems to indicate that BIND is
triggering something that is not expected.

Unfortunately this is not repeatable as far as I know, but I'm willing
to do almost anything to help find the cause of this problem and fix it.
I'm not an experienced hacker, so please bear with me.

I know that quite a few people are experiencing this problem, since many
have sent me queries about the "discard frame w/o packet header" message
since I asked about it on the lists a long time ago.

/Oli

-- 
Olafur Osvaldsson
Systems Administrator
Internet a Islandi hf.
Tel: +354 525-5291
Email: oli@isnic.is
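
For reference, the sbdrop_locked loop listed above can be modelled in
user space. What follows is a minimal sketch under stated assumptions,
not the kernel code: struct mbuf and struct sockbuf are cut down to the
fields the loop touches, mb_new() and model_sbdrop() are hypothetical
names, the m_data and sb_ctl bookkeeping is left out, and the panic is
replaced by a -1 return. Going by the listing alone, the loop only runs
while len > 0, so reaching the end of the chain after an exact drain is
harmless; the panic branch is taken only when the caller asks to drop
more bytes than the chain actually holds.

```c
#include <stdio.h>
#include <stdlib.h>

/* Simplified stand-ins for the kernel structures (assumption: only the
 * fields used by the loop are modelled). */
struct mbuf {
    struct mbuf *m_next;    /* next mbuf in the same record */
    struct mbuf *m_nextpkt; /* first mbuf of the next record */
    int m_len;              /* bytes held by this mbuf */
};

struct sockbuf {
    struct mbuf *sb_mb;     /* head of the mbuf chain */
    int sb_cc;              /* byte count the buffer claims to hold */
};

/* Hypothetical helper: allocate one mbuf that holds len bytes. */
static struct mbuf *mb_new(int len)
{
    struct mbuf *m = calloc(1, sizeof(*m));

    if (m == NULL)
        abort();
    m->m_len = len;
    return m;
}

/*
 * User-space model of the loop in the listing. Returns 0 on success and
 * -1 at the point where the kernel would call panic("sbdrop").
 */
static int model_sbdrop(struct sockbuf *sb, int len)
{
    struct mbuf *m, *next, *dead;

    next = (m = sb->sb_mb) ? m->m_nextpkt : NULL;
    while (len > 0) {
        if (m == NULL) {
            if (next == NULL) {
                /* Chain exhausted with bytes still to drop. */
                sb->sb_mb = NULL;
                return -1;
            }
            m = next;
            next = m->m_nextpkt;
            continue;
        }
        if (m->m_len > len) {
            /* Partial drop: trim the front of this mbuf and stop. */
            m->m_len -= len;
            sb->sb_cc -= len;
            break;
        }
        /* Whole mbuf consumed: account for it and free it. */
        len -= m->m_len;
        sb->sb_cc -= m->m_len;
        dead = m;
        m = m->m_next;
        free(dead);
    }
    if (m != NULL) {
        sb->sb_mb = m;
        m->m_nextpkt = next;
    } else {
        sb->sb_mb = next;
    }
    return 0;
}
```

With a chain of two mbufs holding 100 and 50 bytes, model_sbdrop(&sb, 120)
returns 0 and leaves 30 bytes in the buffer. A further call with len=3793
(the value in the backtrace) returns -1, because the request is larger
than what remains in the chain.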