Date: Mon, 13 Oct 2003 09:28:38 -0700
From: "Kevin Oberman" <oberman@es.net>
To: Sam Leffler <sam@errno.com>
Cc: Andre Guibert de Bruet <andy@siliconlandmark.com>
Subject: Re: What's up with the IP stack?
Message-ID: <20031013162838.73FE25D07@ptavv.es.net>
In-Reply-To: Message from Sam Leffler <sam@errno.com> of "Sun, 12 Oct 2003 11:56:53 PDT." <200310121156.53425.sam@errno.com>
> From: Sam Leffler <sam@errno.com>
> Date: Sun, 12 Oct 2003 11:56:53 -0700
> Sender: owner-freebsd-current@freebsd.org
>
> On Sunday 12 October 2003 11:03 am, Andre Guibert de Bruet wrote:
> > On Sun, 12 Oct 2003, Josef Karthauser wrote:
> > > On Sun, Oct 12, 2003 at 02:48:01PM +0200, Soren Schmidt wrote:
> > > > It seems Josef Karthauser wrote:
> > > > > I've just built and installed a new kernel, the first since Aug 6th.
> > > > > There appears to be a problem with the IP stack. What happens is
> > > > > that everything is fine for a few hours, and then the IP stack stops
> > > > > working. I can no longer ping anything on the local network, my
> > > > > default route drops out (which is probably dhclient's doing).
> > > > > Perhaps it is ARP that is broken, it's hard to tell. All I know is
> > > > > that I need to reboot to make it work again.
> > > > >
> > > > > Is anyone else experiencing this kind of problem?
> > > >
> > > > Do you have dummynet included in the kernel ?
> > > > That has been broken for me since sam's latest commit as a backout
> > > > of ip_dummynet.c fixes the problem for me...
> > >
> > > No, I've not got dummynet in there. My current kernel config is:
> >
> > I experienced this a week ago. I found that ifconfig'ing the interface
> > down and back up again "fixed" the problem. I've since reverted to a
> > kernel compiled on September 25th.
>
> It would be good to know more details; I still don't have much to go on. Try
> to identify, for example, if the problem is specific to a particular
> device/interface or feature you're using (e.g dummynet). If you have ddb in
> your system, then when the system gets into a bad state break into the
> debugger and look for threads that are blocked on locks. If you have witness
> in your kernel then show locks would also be useful. If you don't have
> witness in your system then rebuild your kernel with it.
>
> The most recent round of changes were to lock the routing table. These went
> in 10/3 and were extensive. They could easily be the problem but w/o more
> info I can't really help.

Just a few more data points. I am seeing the problem on my ThinkPad T30
only on the wireless interface. I have never seen it when connected by
10/100 via fxp0.

When I see this I can reach some LAN hosts, but not others. I can always
seem to reach the access point. I can usually, but not always, reach
most other systems on the LAN, but not the gateway router, a Sonic Wall
firewall. I have logged onto another system and then connected to the
firewall, so it looks like the physical path is OK.

The problem is intermittent and I have only scattered data. I've been
seeing it since about the beginning of October. I was blaming it on
hardware, but now that I see these reports, maybe it's not. (I just
replaced my Apple Airport AP with a D-Link, so there is something to
suspect.)

In my case things just start working again. The pause can vary from a
few seconds to about 10 minutes.

netstat -rnf inet and arp -a output both look to be fine.

-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: oberman@es.net			Phone: +1 510 486-8634