Date: Thu, 8 Feb 2001 09:11:52 -0800 (PST)
From: Rich Wales <richw@webcom.com>
To: Luigi Rizzo <rizzo@aciri.org>
Cc: bmilekic@technokratis.com, luigi@FreeBSD.ORG, freebsd-net@FreeBSD.ORG
Subject: Re: Fw: if_ed.c && BRIDGE
Message-ID: <20010208163904.85396.richw@wyattearp.stanford.edu>
In-Reply-To: <200102080800.f1880s410219@iguana.aciri.org>

Luigi Rizzo wrote:

> interesting dump... because it shows a bogus "length"
> parameter passed to ed_pio_readmem().

Bosko and I were discussing my problem offline a couple of weeks ago,
and in the course of a single morning I managed to create about 15
crashes. A couple of them were just like this latest one -- in
ed_pio_readmem() -- but most of the others were in rl_encap(), which
was called via a chain including rl_start(), bdg_forward(),
ether_input(), and ed_get_packet().

It looks, to me, like some sort of data corruption that is causing a
crash later on (but not immediately). And, as I reported earlier,
these crashes went away completely after I commented out the
bridge-specific code in if_ed.c (per a suggestion from Bosko).

I don't have crash dumps from those earlier crashes -- I was (and
still am) having unresolved problems getting the system to copy a
crash dump into swap space automatically after a panic, and I hadn't
yet hit upon the kludge of typing "call dumpsys" at the DDB prompt --
but I did write down some tracing info from DDB for the earlier
crashes. See below for what I have; please understand that this is
all I have.

> Can you by chance find out what is the "len" value passed
> to ed_get_packet? The printout in line #10 below is
> partially deleted by some error message.

Sure. I did "up 10", followed by "print len", and the value of this
parameter was reported as 10.

> Now, NE2000 clones if you look at the driver are known
> for occasionally swapping the bytes of the length, but
> the driver was supposed to take care of this. Evidently
> there is some odd thing...

Please note that I've tried two different NE2000 clone cards -- one
ISA, one PCI -- and I got the same kinds of crashes from both. (The
crash I described last night was with the PCI card.)

Also, as I said, these crashes went away completely after I commented
out the BRIDGE-specific code in if_ed.c, per Bosko's recommendation.

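For illustration only, the kind of length sanity check Luigi alludes
to could be sketched roughly as follows. This is NOT the code in
if_ed.c; the function names and the 60/1514 byte bounds are
assumptions made up for the sketch. It accepts a reported length if
it is in range, tries the byte-swapped value if it is not, and
otherwise rejects the packet.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bounds, not taken from if_ed.c: smallest and largest
 * frame length (without CRC) that this sketch will accept. */
#define SKETCH_MIN_LEN 60u
#define SKETCH_MAX_LEN 1514u

/* Swap the two bytes of a 16-bit value. */
uint16_t sketch_swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Nonzero if len lies within the assumed frame-size bounds. */
int sketch_len_ok(uint16_t len)
{
    return len >= SKETCH_MIN_LEN && len <= SKETCH_MAX_LEN;
}

/*
 * Return a usable length, or 0 if the reported value cannot be
 * trusted. If the value as reported is out of range but its
 * byte-swapped form is in range, assume the card swapped the bytes.
 */
uint16_t sketch_fix_len(uint16_t reported)
{
    uint16_t swapped;

    /* Sanity check on the assumed bounds themselves. */
    assert(SKETCH_MIN_LEN <= SKETCH_MAX_LEN);

    if (sketch_len_ok(reported))
        return reported;

    swapped = sketch_swap16(reported);
    if (sketch_len_ok(swapped))
        return swapped;

    return 0;   /* caller should drop the packet */
}
```

Under these assumed bounds, a reported length of 10 is rejected
outright, since neither 10 nor its byte-swapped form is a plausible
frame size. Whether the real driver should treat such a value the
same way is a separate question.
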
I'm willing to agree with Bosko that leaving out this code is a
temporary workaround, and not a true fix -- but do you think it might
be possible to take it out in -STABLE for now, until the problem is
found?

Rich Wales         richw@webcom.com         http://www.webcom.com/richw/

========================================================================

(5 times)
Fatal trap 12: page fault while in kernel mode
    rl_encap + 0x120
    rl_start + 0x23
    bdg_forward + 0x468
    ether_input + 0x7b
    ed_get_packet + 0x3b8
    edintr + 0x5af
    Xresume5 + 0x2b

(4 times)
Fatal trap 12: page fault while in kernel mode
    rl_encap + 0x78
    rl_start + 0x23
    bdg_forward + 0x468
    ether_input + 0x7b
    ed_get_packet + 0x3b8
    edintr + 0x5af
    Xresume5 + 0x2b

(2 times)
Fatal trap 12: page fault while in kernel mode
    rl_encap + 0x117
    rl_start + 0x23
    bdg_forward + 0x468
    ether_input + 0x7b
    ed_get_packet + 0x3b8
    edintr + 0x5af
    Xresume5 + 0x2b

(2 times)
Fatal trap 12: page fault while in kernel mode
    ed_pio_readmem + 0x161
    ed_get_packet + 0x393
    edintr + 0x5af
    Xresume5 + 0x2b

(1 time)
Fatal trap 12: page fault while in kernel mode
    bpf_mtap + 0x18
    ether_input + 0x2f
    pcn_rxeof + 0xe1
    pcn_intr + 0x8a
    Xresume9 + 0x2b

(1 time)
Fatal trap 12: page fault while in kernel mode
    (DDB went into endless loop)

========================================================================

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-net" in the body of the message