Date: Thu, 24 Sep 2015 17:24:35 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Frank de Bot <lists@searchy.net>
Cc: freebsd-stable@FreeBSD.org
Subject: Re: kernel process [nfscl] high cpu
Message-ID: <486700591.10025468.1443129875818.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <5603FC85.8070105@searchy.net>
References: <1031959302.30289198.1430594914473.JavaMail.root@uoguelph.ca> <5603AE3D.5090407@searchy.net> <1887696626.8730412.1443097925392.JavaMail.zimbra@uoguelph.ca> <5603FC85.8070105@searchy.net>
Frank de Bot wrote:
> Rick Macklem wrote:
> > Frank de Bot wrote:
> >> Rick Macklem wrote:
> >>> Frank de Bot wrote:
> >>>> Hi,
> >>>>
> >>>> On a 10.1-RELEASE-p9 server I have several NFS mounts used for a
> >>>> jail. Because it's a server used only for testing, the load is low.
> >>>> But the [nfscl] process is hogging a CPU after a while. This happens
> >>>> pretty fast, within 1 or 2 days. I notice the high CPU usage of the
> >>>> process when I want to do some tests after a little while (those 1
> >>>> or 2 days).
> >>>>
> >>>> My jail.conf looks like:
> >>>>
> >>>> exec.start = "/bin/sh /etc/rc";
> >>>> exec.stop = "/bin/sh /etc/rc.shutdown";
> >>>> exec.clean;
> >>>> mount.devfs;
> >>>> exec.consolelog = "/var/log/jail.$name.log";
> >>>> #mount.fstab = "/usr/local/etc/jail.fstab.$name";
> >>>>
> >>>> test01 {
> >>>>     host.hostname = "test01_hosting";
> >>>>     ip4.addr = somepublicaddress;
> >>>>     ip4.addr += someprivateaddress;
> >>>>
> >>>>     mount = "10.13.37.2:/tank/hostingbase /opt/jails/test01 nfs nfsv4,minorversion=1,pnfs,ro,noatime 0 0";
> >>>>     mount += "10.13.37.2:/tank/hosting/test /opt/jails/test01/opt nfs nfsv4,minorversion=1,pnfs,noatime 0 0";
> >>>>
> >>>>     path = "/opt/jails/test01";
> >>>> }
> >>>>
> >>>> The last test was with NFS 4.1; I also tried NFS 4.0 with the same
> >>>> result. In the read-only nfs share there are symbolic links pointing
> >>>> to the read-write share for logging, storing .run files, etc. When I
> >>>> monitor my network interface with tcpdump, there is little nfs
> >>>> traffic; there is only activity when I actually access the shares.
> >>>>
> >>>> What is causing nfscl to run around in circles, hogging the CPU (it
> >>>> makes the system slow to respond too), and how can I find out what
> >>>> the cause is?
> >>>>
> >>> Well, the nfscl does server->client RPCs referred to as callbacks. I
> >>> have no idea what the implications of running it in a jail are, but
> >>> I'd guess that these server->client RPCs get blocked somehow, etc...
> >>> (The NFSv4.0 mechanism requires a separate IP address that the server
> >>> can connect to on the client. For NFSv4.1, it should use the same TCP
> >>> connection as is used for the client->server RPCs. The latter seems
> >>> like it should work, but there is probably some glitch.)
> >>>
> >>> ** Just run without the nfscl daemon (it is only needed for
> >>> delegations or pNFS).
> >>
> >> How can I disable the nfscl daemon?
> >>
> > Well, the daemon for the callbacks is called nfscbd.
> > You should check via "ps ax" to see if you have it running.
> > (For NFSv4.0 you probably don't want it running, but for NFSv4.1 you do
> > need it. pNFS won't work at all without it, but unless you have a
> > server that supports pNFS, it won't work anyhow. Unless your server is
> > a clustered Netapp Filer, you should probably not have the "pnfs"
> > option.)
> >
> > To run the "nfscbd" daemon you can set:
> >     nfscbd_enable="TRUE"
> > in your /etc/rc.conf, which will start it on boot.
> > Alternatively, just type "nfscbd" as root.
> >
> > The "nfscl" thread is always started when an NFSv4 mount is done. It
> > does an assortment of housekeeping things, including a Renew op to make
> > sure the lease doesn't expire. If for some reason the jail blocks these
> > Renew RPCs, it will try to do them over and over and ... because having
> > the lease expire is bad news for NFSv4. How could you tell?
> > Well, capturing packets between the client and server, then looking at
> > them in wireshark is probably the only way. (Or maybe a large count for
> > Renew in the output from "nfsstat -e".)
> >
> > "nfscbd" is optional for NFSv4.0. Without it, you simply don't do
> > callbacks/delegations.
> > For NFSv4.1 it is pretty much required, but it doesn't need a separate
> > server->client TCP connection.
> > --> I'd enable it for NFSv4.1, but disable it for NFSv4.0, at least as
> > a starting point.
> >
> > And as I said before, none of this is tested within jails, so I have no
> > idea what effect the jails have. Someone who understands jails might
> > have some insight w.r.t. this?
> >
> > rick
> >
> Since last time I haven't tried to use pnfs and have just stuck with
> nfsv4.0. nfscbd is not running. The server is now running 10.2. The
> number of Renews is not very high (56k; Getattr, for example, is 283M).
> Viewed with wireshark, the renew calls look good and the nfs status is
> ok.
>
> Is there a way to know what [nfscl] is active with?
>
Btw, I'm an old-school debugger, which means I'd add a bunch of
"printf()s" to the function called nfscl_renewthread() in
sys/fs/nfsclient/nfs_clstate.c. (That's the nfscl thread. It should only
do the "for(;;)" loop once/sec, but if you get lots of loop iterations,
you might be able to isolate why via printf()s.) You did say it was a
test system. (A rough sketch of what such instrumentation might look like
is appended at the end of this message.)

Good luck with it, rick

> I do understand nfs + jails could have issues, but I'd like to
> understand them.
>
>
> Frank
>
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>
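The snippet below is a minimal, untested sketch of the kind of printf()
instrumentation described above. It is not the actual body of
nfscl_renewthread(); the function name, the local variables and the
reporting threshold are made up for illustration. The idea is simply to
count loop iterations and log when the loop spins much faster than the
expected once-per-second rate:

/*
 * Illustrative only -- not the real code from nfs_clstate.c.
 * Hypothetical instrumentation for the for(;;) loop in
 * nfscl_renewthread(): count iterations per second and complain when
 * the loop runs much more often than once per second.
 */
#include <sys/param.h>
#include <sys/systm.h>          /* kernel printf() */
#include <sys/time.h>           /* time_uptime */

static void
nfscl_renewthread_sketch(void)
{
        u_int iters = 0;
        time_t lastsec = time_uptime;

        for (;;) {
                if (time_uptime != lastsec) {
                        /* A new second has started; report and reset. */
                        if (iters > 5)
                                printf("nfscl: %u loop iterations in 1 sec\n",
                                    iters);
                        lastsec = time_uptime;
                        iters = 0;
                }
                iters++;

                /*
                 * ... the real thread's renew/housekeeping work and its
                 * roughly once-per-second sleep would go here ...
                 */
        }
}

If messages like that show up repeatedly in dmesg while [nfscl] is eating
a CPU, further printf()s at the points where the real loop decides to keep
running should narrow down which condition is waking it.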
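For reference, the advice quoted above (no "pnfs" option unless the server
actually supports pNFS; nfscbd enabled for NFSv4.1 but not for NFSv4.0)
maps onto the original jail.conf mounts roughly as follows. This is only
an untested sketch that reuses the addresses and paths from the original
configuration:

# NFSv4.0 starting point (nfscbd not running; callbacks/delegations unused):
mount  = "10.13.37.2:/tank/hostingbase /opt/jails/test01 nfs nfsv4,ro,noatime 0 0";
mount += "10.13.37.2:/tank/hosting/test /opt/jails/test01/opt nfs nfsv4,noatime 0 0";

# NFSv4.1 starting point (nfscbd required on the client):
mount  = "10.13.37.2:/tank/hostingbase /opt/jails/test01 nfs nfsv4,minorversion=1,ro,noatime 0 0";
mount += "10.13.37.2:/tank/hosting/test /opt/jails/test01/opt nfs nfsv4,minorversion=1,noatime 0 0";

# For the NFSv4.1 case, in the client's /etc/rc.conf:
nfscbd_enable="TRUE"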