Date:      Thu, 21 Mar 2002 22:32:23 -0800
From:      Terry Lambert <tlambert2@mindspring.com>
To:        ? ?? <talley@neowiz.com>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: a question related to NFS
Message-ID:  <3C9ACFF7.88AAE8B8@mindspring.com>
References:  <FDE6222CD314D5118B1600508B6A8F9B3454A7@msg.ds.neowiz.com>

If it just plain reboots, with no message or anything, then
you will have to put debugging into the double fault handler.

For most cases, you need to enable DDB and breaking to the
debugger on a panic, and the panic will cause the node to
stop functioning entirely, until a human has examined the
problem.

As for "just rebooting": if it happens around a certain time,
then you need to be there around that time.  We had a problem
with one system that turned out to be the janitor plugging
his floor buffer into the same circuit as the external disk
enclosure.  8-).

In any case, if it's a FreeBSD box rebooting, you need to
trap the reboot.

One thing you could do is, on reboot, send a signal to a
different machine, using a program in rc.local, and have
that different machine doing a tcpdump.  That way you can
capture the packet sequence that results in the reboot and
recreate it at will, which would help immensely in debugging
it.
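
A minimal sketch of that notification, in Python.  Everything here
is an assumption for illustration: the function names, the choice of
UDP, and the message text are hypothetical, not something FreeBSD
provides.  The rebooting box would call send_boot_notice() from
rc.local; the capture machine would bind the socket, start its
tcpdump, and block in wait_for_boot_notice().

```python
import socket


def send_boot_notice(host, port, message=b"rebooted"):
    """Send one UDP datagram saying this machine has just booted.

    Meant to be run from rc.local on the box that keeps rebooting.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message, (host, port))


def open_notice_socket(port, bind_addr="0.0.0.0"):
    """Bind the UDP socket the capture machine listens on."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    return sock


def wait_for_boot_notice(sock, timeout=None):
    """Block until a notice arrives; return (sender address, payload).

    Raises socket.timeout if nothing arrives within `timeout` seconds.
    """
    sock.settimeout(timeout)
    data, (addr, _port) = sock.recvfrom(1024)
    return addr, data
```

When wait_for_boot_notice() returns, the capture machine knows the
reboot has just happened, so it can stop the tcpdump and keep the
capture file; the packets just before the gap are the interesting
ones.  UDP is unreliable, so sending the notice a few times would
be prudent.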

IMO, you are actually hitting a resource barrier that you
don't know about, and that's why you are getting the failure.


Do you know if there is a panic message on the reboot, or not?

Also, the "netstat -m" only helps if you do it every minute
up to the crash, saving them off, so that you can plot the
values and see if there is a trend.  If there isn't, then
the next thing you will need to plot is vmstat -m values,
and so on.
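
A rough sketch of the "save them off and look for a trend" step, in
Python.  It assumes the "netstat -m" output format quoted further
down in this message; the function names are hypothetical, and the
least-squares slope is just one simple way to spot a trend, not the
only one.

```python
import re

# Matches lines such as "885/2048/34816 mbufs in use (current/peak/max):"
USAGE_RE = re.compile(r"(\d+)/(\d+)/(\d+) (mbufs|mbuf clusters) in use")
DENIED_RE = re.compile(r"(\d+) requests for memory denied")


def parse_netstat_m(text):
    """Extract (current, peak, max) usage and the denied count from one sample."""
    sample = {}
    for current, peak, maximum, kind in USAGE_RE.findall(text):
        key = "clusters" if kind == "mbuf clusters" else "mbufs"
        sample[key] = (int(current), int(peak), int(maximum))
    denied = DENIED_RE.search(text)
    if denied:
        sample["denied"] = int(denied.group(1))
    return sample


def slope(values):
    """Least-squares slope of the values against their sample index.

    With one sample per minute this is the average change per minute.
    """
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den
```

Feed each saved sample through parse_netstat_m(), collect the
"current" figures in time order, and pass them to slope().  A slope
that stays clearly positive right up to the crash suggests the
resource is being exhausted; a flat one means it is time to move on
to the next counter.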

Basically, until you capture the failure, all we can do is
be sympathetic about the fact you had one, because the only
information we have to help you with is what you give us:
there is no well known bug that could be causing the behaviour
you are seeing, so it has to be peculiar to your situation.

-- Terry

? ?? wrote:
>
> 1) one of server(NAS)  running on NeApp and the other server(SAN) running on Hitach
>
> 2) the client running on FreeBSD
>
> 3) the client rebooting once or twice in peak time. ( there was no error log)
>
> 4) netstat -rm say
> boardr4 /sayclub 1 % netstat -rm
> 885/2048/34816 mbufs in use (current/peak/max):
>         730 mbufs allocated to data
>         155 mbufs allocated to packet headers
> 581/1104/8704 mbuf clusters in use (current/peak/max)
> 2720 Kbytes allocated to network (10% of mb_map in use)
> 0 requests for memory denied
> 0 requests for memory delayed
> 0 calls to protocol drain routines
>
> 5) I thought it's not mbuf problem.
>     could you help me once more I find out about this problem ?
>
> -----Original Message-----
> From: Terry Lambert [mailto:tlambert2@mindspring.com]
> Sent: Wednesday, March 20, 2002 5:30 PM
> To: =C2=B1=C3=A8 =C3=80=C2=B1=C3=81=C2=A4
> Cc: freebsd-fs@freebsd.org
> Subject: Re: a question related to NFS
>
> Please don't send MIME to the list.  It makes the archives
> unreadable.
>
> > I have been operating NFS server that mount data from storage(NAS)
> > on FreeBSD4.2-RELEASE.
> >
> > I got problem that it automatically reboot itself at peak time more
> > than once in a day.
> > I guessed that is kind of FreeBSD BUG related to NFS.
> >
> > So, I upgrade FreeBSD version FreeBSD4.5-RELEASE because the release
> > note mentoned "A number of bugs in the filesystem code, discovered
> > through the use of the fsx filesystem test tool, have been fixed.
> > Under certain circumstances (primarily related to use of NFS), these
> > bugs could cause data corruption or kernel panics."
> >
> > But It still reboot itself once or twice in a day after upgrade.
> > there were no hints in /var/log/messages. It just automatically reboot
> > at peak time suddenly.
> >
> > Has anyone heard of this kind of problem?
>
> 1)      Is the NAS running FreeBSD?
>
> 2)      Is the client running FreeBSD?
>
> 3)      Is the NAS rebooting, or is the client rebooting?
>
> 4)      If it's a FreeBSD machine rebooting, what does the
>         "netstat -m" say, for several samples leading up to
>         the reboot?
>
> If it's a FreeBSD box rebooting, you are probably running
> out of mbufs for some reason (probably bad tuning for the
> load you are putting on it, but also possibly a bug).  If
> it's a bug, everyone would be complaining about it, so it's
> probably not a bug, it's probably tuning, if it's the FreeBSD
> box.
>
> -- Terry

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message



