From owner-freebsd-hackers Thu Dec 18 14:56:46 1997
Return-Path:
Received: (from root@localhost)
        by hub.freebsd.org (8.8.7/8.8.7) id OAA00245
        for hackers-outgoing; Thu, 18 Dec 1997 14:56:46 -0800 (PST)
        (envelope-from owner-freebsd-hackers)
Received: from smtp03.primenet.com (smtp03.primenet.com [206.165.6.133])
        by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id OAA29773
        for ; Thu, 18 Dec 1997 14:51:18 -0800 (PST)
        (envelope-from tlambert@usr01.primenet.com)
Received: (from daemon@localhost)
        by smtp03.primenet.com (8.8.8/8.8.8) id PAA02707;
        Thu, 18 Dec 1997 15:51:16 -0700 (MST)
Received: from usr01.primenet.com(206.165.6.201)
        via SMTP by smtp03.primenet.com, id smtpd002686;
        Thu Dec 18 15:51:09 1997
Received: (from tlambert@localhost)
        by usr01.primenet.com (8.8.5/8.8.5) id PAA11470;
        Thu, 18 Dec 1997 15:51:09 -0700 (MST)
From: Terry Lambert
Message-Id: <199712182251.PAA11470@usr01.primenet.com>
Subject: Re: panic: blkfree: freeling free block/frag
To: ken@plutotech.com (Kenneth Merry)
Date: Thu, 18 Dec 1997 22:51:08 +0000 (GMT)
Cc: tlambert@primenet.com, ivt@gamma.ru, mike@smith.net.au,
        sclawson@bottles.cs.utah.edu, freebsd-hackers@FreeBSD.ORG
In-Reply-To: <199712182144.OAA27429@panzer.plutotech.com>
        from "Kenneth Merry" at Dec 18, 97 02:44:30 pm
X-Mailer: ELM [version 2.4 PL23]
Content-Type: text
Sender: owner-freebsd-hackers@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

> > > > I do not believe this is a single bit error.  I believe this is the same
> > > > problem I have been seeing.
> > > >
> > > > Does your ethernet hardware address begin with 00 00?
> > >
> > > news.gamma.ru (158.250.39.26) at 0:0:c0:a4:2e:61
> > > Is there a problem with 0:0 ?
> >
> > Look at your corrupt variables and nearby variables in the stack, and
> > see if your ethernet address is being blown onto the stack somewhere.
> >
> > There is no problem with the 0:0, but it reinforces my feeling that
> > this could be resulting from a trashed kernel stack.
>
> I doubt that his ethernet address indicates any problems.  00:00:c0
> is an SMC ethernet address prefix.  I've got two SMC 10/100 cards in one of
> my systems, and a SMC Elite Ultra 16 (or something to that effect) in
> another system, and all of the ethernet addresses begin with 00:00:c0.

You are missing it.

There exists a bug such that a struct sockaddr can be written to an
arbitrary kernel stack.  This struct sockaddr contains a source addr,
a dest addr (your machine), a protocol type of 0x8000, and so on.

I have a system dump that proves this is occurring.  I still don't know
why, but I *do* know that it is.  Probably it involves a stack address
being erroneously stored over a sleep and then referenced at interrupt
time.

I am asking him to look to see if his problem is because his kernel
stack was tromped by this bug.  To do that, he has to look at the
kernel stack around the error and see if he sees something that looks
like a sockaddr has been written to his stack.  He will have to x/16x
(or whatever) his stack, and ignore whether or not something belongs in
one variable or another.  Where things belong and where things are are
two different things for this crash.

The reason I asked about his address at all was to see if his address
could account for the particular instance of damage he reported.  I
think it can, because the thing is in reverse order (i.e., not Intel
byte order; i.e., network byte order; i.e., God's byte order).  This
means that it will blow zeros at the front of anything it only
partially corrupts.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
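
As a rough illustration of the stack check described above (dump the
words around the damage, then look for the ethernet address regardless
of which variable it landed in), here is a minimal Python sketch.  It
is not taken from any actual crash dump: the word values are made up,
the function names are hypothetical, and it assumes the debugger printed
the stack as 32-bit words on a little-endian (Intel) machine, so each
word has to be unpacked back into memory order before searching.

```python
import struct


def words_to_bytes(words):
    """Turn 32-bit words, as a debugger prints them on a little-endian
    machine, back into the bytes as they actually sit in memory."""
    return b"".join(struct.pack("<I", w) for w in words)


def parse_mac(text):
    """Parse a colon-separated address such as '0:0:c0:a4:2e:61'."""
    parts = text.split(":")
    if len(parts) != 6:
        raise ValueError("expected six colon-separated octets")
    return bytes(int(p, 16) for p in parts)


def find_mac(words, mac_text):
    """Return every byte offset in the dump where the address appears
    in wire (network) order, ignoring variable boundaries."""
    memory = words_to_bytes(words)
    mac = parse_mac(mac_text)
    hits = []
    start = memory.find(mac)
    while start != -1:
        hits.append(start)
        start = memory.find(mac, start + 1)
    return hits


# Made-up stack words for illustration only (assumption, not real data).
dump = [0xF01A2B3C, 0xA4C00000, 0x0000612E, 0xDEADBEEF]

if __name__ == "__main__":
    for offset in find_mac(dump, "0:0:c0:a4:2e:61"):
        print("address found at byte offset", offset)
```

Note that in the made-up dump the address is split across two printed
words (0xa4c00000 and 0x0000612e), which is why eyeballing the word
values alone can miss it.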