From owner-freebsd-net Wed Mar 21 5:52: 6 2001
Delivered-To: freebsd-net@freebsd.org
Received: from ady.warpnet.ro (ady.warpnet.ro [194.102.224.8])
	by hub.freebsd.org (Postfix) with ESMTP
	id 8BAEB37B719; Wed, 21 Mar 2001 05:51:40 -0800 (PST)
	(envelope-from ady@warpnet.ro)
Received: from localhost (ady@localhost)
	by ady.warpnet.ro (8.9.3/8.9.3) with ESMTP id PAA79098;
	Wed, 21 Mar 2001 15:57:59 +0200 (EET)
	(envelope-from ady@warpnet.ro)
Date: Wed, 21 Mar 2001 15:57:59 +0200 (EET)
From: Adrian Penisoara
To: Bosko Milekic
Cc: freebsd-stable@FreeBSD.ORG, freebsd-net@FreeBSD.ORG
Subject: Re: Kernel crush due to frag attack
In-Reply-To: <00d001c09f8d$8ee4d360$becbca18@jehovah>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-net@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Hi,

  Coming back to you with a reply I should have given earlier. See
below.

  In the meantime I have applied one of Ruslan Emirov's patches and
have had no panics since. Clearly not a hardware issue but faulty
code :) ... I also see that a patch has been applied (by jkh) in the
CVS tree for this matter; I hope it solves the problem, but I haven't
checked yet.

On Sun, 25 Feb 2001, Bosko Milekic wrote:

>
> Adrian Penisoara wrote:
>
> > Hi,
> >
> >   As we are facing a heavy fragment attack (40-60 byte packets in a
> > ~1000 pkts/sec flow) I see some sporadic panics. Kernel/world is
> > 4.2-STABLE as of 18 Jan 2001 -- it's a production machine and I
> > haven't yet had the chance for another update; if it's been fixed
> > in the meantime I would be glad to hear it...
> >
> >   I have attached a gdb trace and a snip of a tcpdump log. When I
> > rebuilt the kernel with debug options it seemed to crash less
> > often. I remember that at the time of this panic I had an ipfw
> > rule to deny IP fragments.
>
>   This is one of those "odd" faults I've seen in -STABLE sometimes.
> Thanks to the good debugging information you've provided, the
> following is to be noted:
>
> #16 0xc014de98 in m_copym (m=0xc07e7c00, off0=0, len=40, wait=1)
>     at ../../kern/uipc_mbuf.c:621
> 621                             n->m_pkthdr.len -= off0;
> (kgdb) list
> 616                     if (n == 0)
> 617                             goto nospace;
> 618                     if (copyhdr) {
> 619                             M_COPY_PKTHDR(n, m);
> 620                             if (len == M_COPYALL)
> 621                                     n->m_pkthdr.len -= off0;  <-- fault happens here (XXX)
> 622                             else
> 623                                     n->m_pkthdr.len = len;
> 624                             copyhdr = 0;
> 625                     }
> (kgdb) print n
> $1 = (struct mbuf *) 0x661c20
> (kgdb) print *n
> cannot read proc at 0
> (kgdb) print m
> $2 = (struct mbuf *) 0xc07e7c00
>
>   Where the fault happens (XXX), the possible problem is that the
> mbuf pointer n is bad, and as printed from the debugger, it does
> appear to be bad. However, there are two things to note:
>
> 1. The fault virtual address displayed in the trap message:
>
>      Fatal trap 12: page fault while in kernel mode
>      fault virtual address   = 0x89c0c800
>      [...]
>
>    is different from the one printed in your analysis (even though
>    0x89c0c800 seems bogus as well, although it is at a correct
>    boundary).
>
> 2. Nothing bad happens in M_COPY_PKTHDR(), which dereferences an
>    equivalent pointer.
>
>   Something seriously evil is happening here and, unfortunately, I
> have no idea what.
>
>   Does this only happen on this one machine? Or is it reproducible
> on several different machines? I used to stress test -STABLE for
> mbuf starvation and never stumbled upon one of these `spontaneous
> pointer deaths' myself, although I have seen other weird problems
> reported by other people, but only in RELENG_3.

 So far I have seen this only on our server (other machines don't get
the fragments, as they are dropped on entry at our gateway).

>
>   If you cannot reproduce it on any other machines, I would start
> looking at possibly bad hardware... unless someone else sees
> something I'm not.

 As I said, since I applied Ruslan Emirov's patch the panics have not
recurred.
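
 For anyone following the quoted listing, here is a minimal userspace
sketch of the header-length branch at lines 616-625. It is not the
kernel code: struct mbuf_sketch, struct pkthdr_sketch and
copy_pkthdr_len() are hypothetical stand-ins, the struct assignment
only approximates M_COPY_PKTHDR(), and the M_COPYALL value is assumed
for illustration. It shows that the n == 0 test catches a failed
allocation but cannot catch a non-null garbage pointer such as the
0x661c20 printed above, and that with len=40 (as in frame #16) the
else branch is the one that would normally run.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sentinel meaning "copy to the end of the chain". */
#define M_COPYALL 1000000000

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct pkthdr_sketch {
	int len;			/* total packet length */
};

struct mbuf_sketch {
	struct pkthdr_sketch m_pkthdr;
};

/*
 * Mirrors the shape of lines 616-625 in the quoted listing.
 * Returns 0 on success, -1 when n is NULL (the "nospace" case).
 * A non-null but invalid n passes the check and would fault on the
 * first write through it, just as any dereference of n would.
 */
static int
copy_pkthdr_len(struct mbuf_sketch *n, const struct mbuf_sketch *m,
    int off0, int len)
{
	assert(m != NULL);		/* the source mbuf must be valid */

	if (n == NULL)
		return (-1);

	n->m_pkthdr = m->m_pkthdr;	/* stand-in for M_COPY_PKTHDR(n, m) */
	if (len == M_COPYALL)
		n->m_pkthdr.len -= off0;	/* everything after the offset */
	else
		n->m_pkthdr.len = len;		/* exactly len bytes requested */
	return (0);
}
```

 Since the header copy and the length update both write through the
same n, a pointer that survives one and faults on the other is
consistent with the observation quoted above that something is
changing underneath the code.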

 RGDS,
 Ady (@warpnet.ro)