Date:      Tue, 23 May 2006 10:18:44 -0400
From:      "Rong-en Fan" <grafan@gmail.com>
To:        "Howard Leadmon" <howard@leadmon.net>
Cc:        Konstantin Belousov <kostikbel@gmail.com>, freebsd-stable@freebsd.org, Kris Kennaway <kris@obsecurity.org>
Subject:   Re: Trouble with NFSd under 6.1-Stable, any ideas?
Message-ID:  <6eb82e0605230718l337a58efmc85637afdec4fffb@mail.gmail.com>
In-Reply-To: <013b01c67e71$23aaacf0$071872cf@Leadmon.local>
References:  <6eb82e0605230556n31b86e55y1b07a2ef6ad9ca14@mail.gmail.com> <013b01c67e71$23aaacf0$071872cf@Leadmon.local>

On 5/23/06, Howard Leadmon <howard@leadmon.net> wrote:
>
>
> > > > If there are any thing I can provide to help tracking this down.
> > > > Please let me know. By the way, I tried with truss/kdump to see what
> > > > happens when nfsd eats lot of CPUs, but in vain. They do not return
> > > > anything.
> > > >
> > > I tried your recipe on 7-CURRENT with locally exported fs, remounted
> > > over nfs. I did not get the behaviour your described.
> >
> > As noted in my previous thread, I have another 6.1-RELEASE
> > nfs server, which does not have this problem.
> >
> > > Could you, please, provide the backtrace for the nfsd that eats the
> > > CPU (from the ddb). I think it would be helpful to get several
> > > backtraces (i.e., bt <nfsd pid>, cont, bt <nfsd pid> ...) to see where
> > > it running.
> >
> > I'm afraid that I can not do that. Last time I tried breaking
> > into ddb (on 5.x), it hangs my serial console and the server
> > is miles away :-( . Perhaps we can ask Howard to do that?
>
>  I am more than willing to do that, as this machine runs here with me, so if
> needed I can easily get on a console, or perform a reboot.  Can one of you
> shed a little light on exactly what I need to do, and how to do this?  I ask
> as I have never used this ddb stuff, so not clue one on how to go about
> getting the information your looking to find.  Guess I have been lucky, and
> just never had an issue that took things to this level.

At a minimum, you need to add the following to your kernel configuration:

options         KDB
options         DDB

Recompile the kernel and reboot. You had better set up a serial console
so that you can easily copy the ddb output. To do that, put "device sio"
in your kernel configuration and add the following to the files
below:

/boot.config
-Dh

/boot/loader.conf
comconsole_speed=115200
machdep.conspeed=115200

/etc/ttys
ttyd0   "/usr/libexec/getty std.115200" cons25  on secure

On the other machine, /etc/remote:
com1:dv=/dev/cuad0:br#115200:pa=none:

Then use "tip com1" to attach to the NFS server. The above settings
assume that the serial console is on COM1 on both the NFS server and
the client. If that's not the case, please follow the Handbook on how
to set up a serial console on a port other than COM1. To break into
ddb, either press ctrl+alt+esc on the console keyboard or send a BREAK
(I think ^b will do) over the serial line. After that, you should see

db>

Then first use "ps" to find the nfsd pid (it is better to note the pid
that eats lots of CPU before entering ddb). After that, do what
Konstantin suggests. I have never tried "cont" in ddb. I guess it
returns execution to the kernel, so you need to break into ddb again
to do another "bt <pid>".
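
To note the busy nfsd pid before breaking into ddb, something like the
following might help. This is only a rough sketch: the helper name
busiest_nfsd is made up for illustration, and it assumes ps is asked for
exactly three columns (pid, %cpu, command name) so that the nfsd
processes show up with "nfsd" in the third column.

```shell
# Hypothetical helper: reads "pid pcpu comm" lines on stdin and prints
# the pid of the nfsd process with the highest %CPU.
busiest_nfsd() {
    # Keep only nfsd rows, put %CPU first, sort numerically descending,
    # then print the pid of the top row.
    awk '$3 == "nfsd" { print $2, $1 }' | sort -rn | head -n 1 | awk '{ print $2 }'
}
```

It would be used as "ps -axo pid,pcpu,comm | busiest_nfsd" on the NFS
server while the problem is happening; the pid it prints is the one to
pass to "bt" at the db> prompt.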

By the way, could you verify that backing out vfs_lookup.c rev 1.90
helps in your situation? If not, maybe we are seeing different problems,
and then I have to figure out how to make my serial console work
here.

Thanks,
Rong-En Fan


