Date:      Thu, 13 Dec 2001 17:05:22 -0500 (EST)
From:      Kenneth Wayne Culver <culverk@wam.umd.edu>
To:        Ryan Thompson <ryan@sasknow.com>
Cc:        Anthony Atkielski <anthony@atkielski.com>, FreeBSD Questions <freebsd-questions@FreeBSD.ORG>
Subject:   Re: Uptime not so good after all -- why does my net connection go dead?
Message-ID:  <Pine.GSO.4.21.0112131704390.4337-100000@rac4.wam.umd.edu>
In-Reply-To: <20011213122631.L94416-100000@catalyst.sasknow.net>

This whole thing basically sounds to me like some form of
misconfiguration, either of the router, or of the FreeBSD machine, or
both. I've had uptimes greater than 2 years, and then the machine only
went down because of a power failure.

Ken

On Thu, 13 Dec 2001, Ryan Thompson wrote:

> Anthony Atkielski wrote to FreeBSD Questions:
> 
> > I thought my FreeBSD system was going to stay up forever, based on
> > what I had heard,
> 
> Yes, and it should, barring hardware problems, pilot error, an
> extended power outage, or managerial downtime.
> 
> > but I had to boot it today.  For the umpteenth time, the OS
> > abruptly and silently decided to stop communicating with my
> > router.  It had no trouble talking to the other PC on my LAN, but
> > it absolutely would not talk to the router.  As far as I could
> > tell, it would not respond to traffic from the router, nor would
> > it send traffic to the router.
> 
> To give you a more detailed response, we'll need to see what's
> actually going on with FreeBSD. You're reporting, for the most part,
> application-level symptoms. ICMP echo requests (ping) in this case
> aren't much different. If the problem is with your LAN, you need to go
> to the link layer...
> 
> From the router AND the NT machine, try ARP lookups for the FreeBSD
> machine's public IP address. Do you get the same MAC address as is
> shown in the output of ifconfig(8) in FreeBSD? If not, then perhaps
> your router has claimed the IP, or the IP was assigned to another
> machine, etc, and you need to pinpoint that. This sort of thing can
> happen behind your back.
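
A rough sketch of the MAC comparison described above, assuming you have already copied the address strings by hand from the arp output on the router or NT machine and from ifconfig(8) on the FreeBSD box. Different tools print the same address differently (hyphens vs. colons, upper vs. lower case, dropped leading zeros), so a plain string compare can mislead. The function names and the sample addresses are made up for illustration.

```python
import string


def normalize_mac(mac: str) -> str:
    """Return a MAC address as six zero-padded, lowercase hex octets joined by ':'."""
    parts = mac.strip().replace("-", ":").split(":")
    if len(parts) != 6:
        raise ValueError(f"not a 6-octet MAC address: {mac!r}")
    for part in parts:
        # Each octet must be one or two hex digits, nothing else.
        if not 1 <= len(part) <= 2 or any(c not in string.hexdigits for c in part):
            raise ValueError(f"bad octet {part!r} in MAC address {mac!r}")
    return ":".join(f"{int(part, 16):02x}" for part in parts)


def same_mac(a: str, b: str) -> bool:
    """True if two differently formatted strings name the same MAC address."""
    return normalize_mac(a) == normalize_mac(b)


# Hypothetical values: the first as an NT-style arp listing might print it,
# the second as a BSD-style arp listing might print it.
print(same_mac("00-10-4B-AA-BB-CC", "0:10:4b:aa:bb:cc"))
```

If this prints False for the addresses you collected, some other device is probably answering ARP for the FreeBSD machine's IP.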
> 
> On the FreeBSD box, put your NIC in promiscuous mode and start
> analyzing frames. What actually gets sent out on the wire? Is the
> machine seeing the IP packets, but not actually passing them up to the
> transport layer? Or maybe it just isn't sending anything out?
> 
> I assume your IP address and netmask are set correctly with
> ifconfig(8)? Does the router agree with you in terms of netmask?
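
To illustrate the netmask question above, here is a minimal sketch that checks whether the host and the router each consider the other to be on the local subnet. The addresses and masks are placeholders from the documentation range, not values from this thread, and the function names are hypothetical; substitute whatever ifconfig(8) and the router's config actually report.

```python
import ipaddress


def on_link(local_ip: str, local_mask: str, peer_ip: str) -> bool:
    """True if peer_ip falls inside the subnet implied by local_ip/local_mask."""
    # strict=False lets us pass a host address rather than the network address.
    net = ipaddress.ip_network(f"{local_ip}/{local_mask}", strict=False)
    return ipaddress.ip_address(peer_ip) in net


def sides_agree(host_ip: str, host_mask: str, router_ip: str, router_mask: str) -> bool:
    """True only if each side sees the other as directly reachable."""
    return (on_link(host_ip, host_mask, router_ip)
            and on_link(router_ip, router_mask, host_ip))


# Placeholder example: the host uses a /24 mask, the router a /25 mask.
# The host thinks the router is local, but the router does not think
# the host is, so traffic in one direction can be sent the wrong way.
print(sides_agree("192.0.2.10", "255.255.255.0",
                  "192.0.2.129", "255.255.255.128"))
```

A False result here points at a mask mismatch on one side rather than anything wrong inside the FreeBSD network stack.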
> 
> The output of `netstat -rn` would be extremely helpful. The output and
> network config of the router would also be helpful.
> 
> 
> Some things you can do:
> 
> Try plugging your FreeBSD machine directly into a port on your router,
> and unplugging everything else (except your uplink :-). If THIS works,
> then another device on the wire is misbehaving.
> 
> 
> > - It's not the FreeBSD machine's NIC; the NIC continues to talk to the NT
> > machine, and I can also make it work with the router by adding a new IP
> > address to the interface ("ifconfig xl0 xxx.xxx.xxx.xxx alias").
> 
> This suggests that either something is wrong with ARP, and/or the
> routing tables on the FreeBSD machine or the router.
> 
> 
> > Nothing seemed to make the problem go away, so after two weeks of
> > continuous uptime, I finally bit the bullet and rebooted the
> > machine.  The problem was gone when the machine came back up.  I
> > did not power-cycle the hardware.
> 
> I'd hardly be "biting the bullet" after 2 weeks:
> $ uptime
> 12:41PM  up 261 days,  9:56, 3 users, load averages: 2.37, 2.46, 2.42
> $ uname -a
> FreeBSD ren.sasknow.com 3.5-STABLE FreeBSD 3.5-STABLE #0: Sun Mar 25
> 22:28:19 CST 2001 hutenosa@ren.sasknow.com:/usr/src/sys/compile/REN i386
> 
> After 10 months or so, I think twice about rebooting. In this time,
> this machine has survived two power failures, several brownouts, one
> particularly memorable surge, a dead CPU fan, experimental code which
> resulted in a fork bomb that filled up the proc table, exhausted the
> swap space, and killed just about everything that was running on the
> machine, not to mention the abuse it takes from all of our web clients
> :-) And, 261 days isn't anywhere near the potential a properly
> maintained FreeBSD system can achieve, but it definitely shows it is
> sustainable.
> 
> 10 months ago, the system was taken down to be moved to a different
> room and be connected to a different UPS. I had a kernel upgrade ready
> for that. Total downtime < 5 min. If not for the "managerial
> decisions" I have made, this system probably wouldn't have been down
> for the past 4 years (since it was installed).
> 
> FWIW, you most likely did NOT have to reboot the FreeBSD machine :-) There
> are plenty of problems that can be "solved" by a reboot, but the vast
> majority of those can be solved WITHOUT a reboot if you know what to
> fix. That is how many UNIX systems stay operational for several months
> or even a few years.
> 
> 
> > This means that the NT machine still holds the record for uptime
> > by a very handsome margin (several weeks).
> >
> > I'd like to know exactly what is happening inside FreeBSD when it
> > decides to consign this particular IP address to the Twilight Zone
> > for one particular destination/source (the router).
> 
> Sure, send answers to the questions I've posed, and we'll be able to
> get much closer to an explanation.
> 
> 
> > Obviously, this is a mission-critical issue, as no production
> > system can afford to be completely deprived of external network
> > connectivity.
> >
> > I used to have this problem a lot more until I discovered that the
> > router was sending out DHCP and RIP traffic to the LAN.  I turned
> > that off and the problem _seemed_ to go away.  Unfortunately, it
> > looks like it simply became less frequent instead.  Once in two
> > weeks is still completely unacceptable, however.
> 
> Which is exactly why you'll have to fix it! :-)
> 
> 
> 
> Hope this helps,
> - Ryan
> 
> -- 
>   Ryan Thompson <ryan@sasknow.com>
>   Network Administrator, Accounts
> 
>   SaskNow Technologies - http://www.sasknow.com
>   #106-380 3120 8th St E - Saskatoon, SK - S7H 0W2
> 
>         Tel: 306-664-3600   Fax: 306-664-1161   Saskatoon
>   Toll-Free: 877-727-5669     (877-SASKNOW)     North America
> 
> 
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-questions" in the body of the message
> 





