Date:      Fri, 1 Oct 2004 20:23:04 -0400
From:      Jim Durham <durham@jcdurham.com>
To:        freebsd-hackers@freebsd.org
Cc:        Kris Kennaway <kris@obsecurity.org>
Subject:   Re: Sudden Reboots
Message-ID:  <200410012023.04922.durham@jcdurham.com>
In-Reply-To: <20041001223802.GA90717@xor.obsecurity.org>
References:  <200409301003.00492.durham@jcdurham.com> <20041001223802.GA90717@xor.obsecurity.org>

On Friday 01 October 2004 06:38 pm, Kris Kennaway wrote:
> On Thu, Sep 30, 2004 at 10:03:00AM -0400, Jim Durham wrote:
> > I have had this problem now with at least 3 FreeBSD servers over a period
> > of about 2 years. I had put it down to some hardware problem but it seems
> > to be too much of a coincidence with 3 different machines doing the same
> > thing.
> >
> > The first time was when I put 4.5-RELEASE on a brand new Dell Poweredge
> > 2650. I ran it on the bench for a week or so, then decided all was well
> > and put it in the server rack and started doing the company's email
> > service on it. After a few weeks, it suddenly would 'reboot' for no
> > apparent reason. No log entries, nothing at all except the usual stuff in
> > /var/log/messages about '/ was not unmounted correctly', etc. Just like
> > you had pulled the power plug.
> >
> > The 2nd instance was a server that I maintain for an ISP that was a
> > mirror image of their primary server, a 'hot spare' so to speak. The
> > primary, running the same software was solid, but the backup would reboot
> > at about 5:20 every morning with the same syndrome..no log entries of any
> > sort and just the usual entries in /var/log/messages saying that the /
> > partition was not unmounted properly. The odd thing was that it was
> > happening at virtually the same time every morning.
> >
> >  I upgraded both systems to the latest -RELEASE and it made no
> > difference. Then, they both just *stopped doing it by themselves* with no
> > apparent correlation to anything installed software-wise. Neither server
> > has had any problem for over a year now.
> >
> > The 3rd instance is happening now. Another server I maintain for my
> > 'night job' is doing the same thing for a customer. It just 'stops' like
> > you pulled the power plug. However, this time I thought to check using
> > 'last' and found that I had accidentally left an ssh session open and
> > that entry said 'crash'. There are no other log entries I can find
> > related to the 'reboot'.
>
> Do you have ddb enabled?  If not, the machine may be panicking and
> rebooting automatically. 

No. Not on any of the 3 boxes. Like I said, the problem has gone away and not 
returned on the Dell and the ISP's box and the loads on those boxes are 
always increasing and they've been fine for over a year now. It was just when 
this same thing started with a customer's server box that I started to wonder 
if it was some very intermittent problem in the kernel.

> Actual "spontaneous reboots" are very rare 

These are very rare.... except they seem to happen about once a day for a 
while and then stop... very strange..

> and usually caused by hardware problems (e.g. faulty power supply,
> overheating CPU, bad RAM). 

Possible, but if so, the hardware fixed itself on the first two boxes I 
mentioned. 

> Enable DDB, and see what happens the next 
> time it crashes.

I'll try that on the one that's doing it now. Any suggestions as to how to log 
this to get the most info? I've not played with ddb, but I'll read the docs 
and get it going.

Thanks much to all who responded!
>
> Kris
-- 
-Jim


