Date: Thu, 13 Dec 2001 13:01:58 -0600 (CST)
From: Ryan Thompson
To: Anthony Atkielski
Cc: FreeBSD Questions
Subject: Re: Uptime not so good after all -- why does my net connection go dead?

Anthony Atkielski wrote to FreeBSD Questions:

> I thought my FreeBSD system was going to stay up forever, based on
> what I had heard,

Yes, and it should, barring hardware problems, pilot error, extended
power outages, or managerial downtime.

> but I had to boot it today. For the umpteenth time, the OS
> abruptly and silently decided to stop communicating with my
> router. It had no trouble talking to the other PC on my LAN, but
> it absolutely would not talk to the router. As far as I could
> tell, it would not respond to traffic from the router, nor would
> it send traffic to the router.

To give you a more detailed response, we'll need to see what's
actually going on with FreeBSD. You're reporting, for the most part,
application-level symptoms. ICMP echo requests (ping) in this case
aren't much different. If the problem is with your LAN, you need to
go to the link layer...

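One link-layer check worth making is whether the router and the NT
machine have cached the same MAC address for the FreeBSD box's IP as
ifconfig(8) reports. A rough sketch of that comparison in Python
follows. The helper names and every address in it are hypothetical,
and it assumes the ARP entries were copied by hand from each device,
since tools print MACs in different styles (colons or dashes, with or
without leading zeros).

```python
import re


def normalize_mac(mac):
    """Return a MAC as six zero-padded lowercase hex octets joined by ':'."""
    parts = re.split(r"[:\-]", mac.strip())
    if len(parts) != 6:
        raise ValueError("not a 6-octet MAC address: %r" % mac)
    octets = [int(p, 16) for p in parts]  # ValueError on non-hex input
    if any(o > 0xFF for o in octets):
        raise ValueError("octet out of range in MAC address: %r" % mac)
    return ":".join("%02x" % o for o in octets)


def arp_mismatches(local_mac, arp_views):
    """Return the peers whose cached MAC differs from the interface's own.

    local_mac -- MAC shown for the interface on the FreeBSD box
    arp_views -- {peer name: MAC that peer has cached for the box's IP}
    """
    want = normalize_mac(local_mac)
    return {
        peer: normalize_mac(mac)
        for peer, mac in arp_views.items()
        if normalize_mac(mac) != want
    }


if __name__ == "__main__":
    # Hypothetical values, not taken from the thread.
    views = {
        "router": "00:10:5a:aa:bb:cc",  # what the router has cached
        "nt": "00-50-DA-01-02-03",      # what the NT machine has cached
    }
    print(arp_mismatches("0:50:da:1:2:3", views))
```

Any peer that shows up in the result is resolving the IP to some other
piece of hardware, which is where to look next.
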
From the router, AND the NT machine, try ARP lookups for the FreeBSD
machine's public IP address. Do you get the same MAC address as is
shown in the output of ifconfig(8) in FreeBSD? If not, then perhaps
your router has claimed the IP, or the IP was assigned to another
machine, etc., and you need to pinpoint that. This sort of thing can
happen behind your back.

On the FreeBSD box, put your NIC in promiscuous mode and start
analyzing frames. What actually gets sent out on the wire? Is the
machine seeing the IP packets, but not actually passing them up to
the transport layer? Or maybe it just isn't sending anything out?

I assume your IP address and netmask are set correctly with
ifconfig(8)? Does the router agree with you in terms of netmask?

The output of `netstat -rn` would be extremely helpful. The output
and network config of the router would also be helpful.

Some things you can do:

Try plugging your FreeBSD machine directly into a port on your
router, and unplugging everything else (except your uplink :-). If
THIS works, then another device on the wire is misbehaving.

> - It's not the FreeBSD machine's NIC; the NIC continues to talk to the NT
> machine, and I can also make it work with the router by adding a new IP
> address to the interface ("ifconfig xl0 xxx.xxx.xxx.xxx alias").

This suggests that something is wrong with ARP and/or the routing
tables on the FreeBSD machine or the router.

> Nothing seemed to make the problem go away, so after two weeks of
> continuous uptime, I finally bit the bullet and rebooted the
> machine. The problem was gone when the machine came back up. I
> did not power-cycle the hardware.

I'd hardly be "biting the bullet" after 2 weeks:

$ uptime
12:41PM up 261 days, 9:56, 3 users, load averages: 2.37, 2.46, 2.42
$ uname -a
FreeBSD ren.sasknow.com 3.5-STABLE FreeBSD 3.5-STABLE #0: Sun Mar 25 22:28:19 CST 2001 hutenosa@ren.sasknow.com:/usr/src/sys/compile/REN i386

After 10 months or so, I think twice about rebooting.

In that time, this machine has survived two power failures, several
brownouts, one particularly memorable surge, a dead CPU fan, and
experimental code that resulted in a fork bomb which filled up the
proc table, exhausted the swap space, and killed just about
everything that was running on the machine, not to mention the abuse
it takes from all of our web clients :-)

And 261 days isn't anywhere near what a properly maintained FreeBSD
system can achieve, but it definitely shows that long uptimes are
sustainable.

10 months ago, the system was taken down to be moved to a different
room and connected to a different UPS. I had a kernel upgrade ready
for that. Total downtime < 5 min. If not for the "managerial
decisions" I have made, this system probably wouldn't have been down
at all in the 4 years since it was installed.

FWIW, you most likely did NOT have to reboot the FreeBSD machine :-)
There are plenty of problems that can be "solved" by a reboot, but
the vast majority of those can be solved WITHOUT a reboot if you know
what to fix. That is how many UNIX systems stay operational for
several months or even a few years.

> This means that the NT machine still holds the record for uptime
> by a very handsome margin (several weeks).
>
> I'd like to know exactly what is happening inside FreeBSD when it
> decides to consign this particular IP address to the Twilight Zone
> for one particular destination/source (the router).

Sure, send answers to the questions I've posed, and we'll be able to
get much closer to an explanation.

> Obviously, this is a mission-critical issue, as no production
> system can afford to be completely deprived of external network
> connectivity.
>
> I used to have this problem a lot more until I discovered that the
> router was sending out DHCP and RIP traffic to the LAN. I turned
> that off and the problem _seemed_ to go away. Unfortunately, it
> looks like it simply became less frequent instead. Once in two
> weeks is still completely unacceptable, however.

Which is exactly why you'll have to fix it! :-)

Hope this helps,
- Ryan

--
  Ryan Thompson
  Network Administrator, Accounts

  SaskNow Technologies - http://www.sasknow.com
  #106-380 3120 8th St E - Saskatoon, SK - S7H 0W2

  Tel: 306-664-3600   Fax: 306-664-1161   Saskatoon
  Toll-Free: 877-727-5669 (877-SASKNOW)   North America