Date: Tue, 2 Jun 2020 04:30:23 +0000
From: Rick Macklem <rmacklem@uoguelph.ca>
To: "Rodney W. Grimes" <freebsd-rwg@gndrsh.dnsmgr.net>
Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, Mark Johnston <markj@FreeBSD.org>, "patrykkotlowski@gmail.com" <patrykkotlowski@gmail.com>
Subject: Re: how to fix an interesting issue with mountd?
Message-ID: <YTBPR01MB36647DA1465C7D35CDCE9681DD8B0@YTBPR01MB3664.CANPRD01.PROD.OUTLOOK.COM>

Rodney Grimes wrote:
>> Hi,
>>
>> I'm posting this one to freebsd-net@ since it seems vaguely similar
>> to a network congestion problem and thought that network types
>> might have some ideas w.r.t. fixing it?
>>
>> PR#246597 - Reports a problem (which if I understand it is) where a sighup
>> is posted to mountd and then another sighup is posted to mountd while
>> it is reloading exports and the exports are not reloaded again.
>>   --> The simple patch in the PR fixes the above problem, but I think will
>>       aggravate another one.
>> For some NFS servers, it can take minutes to reload the exports file(s).
>> (I believe Peter Erriksonn has a server with 80000+ file systems exported.)
>> r348590 reduced the time taken, but it is still minutes, if I recall correctly.
Actually, my recollection w.r.t. the times was way off.
I just looked at the old PR#237860 and, without r348590, it was 16 seconds
(aka seconds, not minutes) and with r348590 that went down to a fraction
of a second (there was no exact number in the PR, but I noted milliseconds in
the commit log entry).

I still think there is a risk of doing the reloads repeatedly.

>> --> If you apply the patch in the PR and sighups are posted to mountd as
>>     often as it takes to reload the exports file(s), it will simply reload the
>>     exports file(s) over and over and over again, instead of processing
>>     Mount RPC requests.
>>
>> So, finally to the interesting part...
>> - It seems that the code needs to be changed so that it won't "forget"
>>   sighup(s) posted to it, but it should not reload the exports file(s) too
>>   frequently.
>> --> My thoughts are something like:
>>   - Note that sighup(s) were posted while reloading the exports file(s) and
>>     do the reload again, after some minimum delay.
>>     --> The minimum delay might only need to be 1 second to allow some
>>         RPCs to be processed before reload happens again.
>>     Or
>>     --> The minimum delay could be some fraction of how long a reload takes.
>>         (The code could time the reload and use that to calculate how long to
>>         delay before doing the reload again.)
>>
>> Any ideas or suggestions? rick
>> ps: I've actually known about this for some time, but since I didn't have a good
>>     solution...
>
>Build a system that allows adding and removing entries from the
>in mountd exports data so that you do not have to do a full
>reload every time one is added or removed?
>
>Build a system that used 2 exports tables, the active one, and the
>one that was being loaded, so that you can process RPC's and reloads
>at the same time.
Well, r348590 modified mountd so that it built a new set of linked list
structures from the modified exports file(s) and then compared them with
the old ones, only doing updates to the kernel exports for changes.

It still processes the entire exports file each time, to produce the in-mountd
memory linked lists (using hash tables and a binary tree).

Peter did send me a patch to use a db frontend, but he felt the only
performance improvements would be related to ZFS.
Since ZFS is something I avoid like the plague I never pursued it.
(If anyone willing to do ZFS stuff wants to pursue this,
just email and I can send you the patch.)
Here's a snippet of what he said about it.
> It looks like a very simple patch to create and even though it wouldn't really
> improve the speed for the work that mountd does it would make possible really
> drastic speed improvements in the zfs commands. They (zfs commands) currently
> reads the thru text-based exports file multiple times when you do work with zfs
> filesystems (mounting/sharing/changing share options etc). Using a db based
> exports file for the zfs exports (b-tree based probably) would allow the zfs code
> to be much faster.

At this point, I am just interested in fixing the problem in the PR, rick
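
For what it's worth, here is a rough sketch in C of the "don't forget the sighup, but don't reload too often" idea from the quoted text above. It is only an illustration, not the patch in the PR and not code from mountd: the names (reload_state, reload_delay_ms, reload_note_signal, ...) and both tunables are made up, and the caller is assumed to supply the current time in milliseconds from some monotonic clock. It combines the two delay suggestions by taking the larger of a 1 second floor and a fraction of the measured reload time.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tunables; neither value comes from mountd or the PR. */
#define RELOAD_MIN_DELAY_MS	1000	/* the "1 second" floor */
#define RELOAD_DELAY_PCT	50	/* percentage of the last reload time */

/* Set by the handler; the handler does nothing else. */
static volatile sig_atomic_t got_sighup;

static void
sighup_handler(int sig)
{
	(void)sig;
	got_sighup = 1;
}

struct reload_state {
	bool	pending;	/* a sighup is waiting to be served */
	int64_t	not_before_ms;	/* earliest time the next reload may start */
};

/* Delay to enforce after a reload that took reload_ms milliseconds. */
static int64_t
reload_delay_ms(int64_t reload_ms)
{
	int64_t scaled;

	assert(reload_ms >= 0);
	scaled = reload_ms * RELOAD_DELAY_PCT / 100;
	return (scaled > RELOAD_MIN_DELAY_MS ? scaled : RELOAD_MIN_DELAY_MS);
}

/* Fold a sighup seen by the handler into the state (main loop only). */
static void
reload_note_signal(struct reload_state *st)
{
	if (got_sighup) {
		got_sighup = 0;
		st->pending = true;
	}
}

/* True when a reload is wanted and the minimum delay has passed. */
static bool
reload_due(const struct reload_state *st, int64_t now_ms)
{
	return (st->pending && now_ms >= st->not_before_ms);
}

/* Call just before reloading; a sighup arriving later is remembered. */
static void
reload_begin(struct reload_state *st)
{
	st->pending = false;
}

/* Call when the reload finishes; start_ms/end_ms bracket the reload. */
static void
reload_end(struct reload_state *st, int64_t start_ms, int64_t end_ms)
{
	st->not_before_ms = end_ms + reload_delay_ms(end_ms - start_ms);
}
```

The main loop would call reload_note_signal() each time around, keep servicing Mount RPCs while reload_due() is false, and wrap the actual reload in reload_begin()/reload_end(). Any number of sighups posted during a reload collapse into one follow-up reload, which starts only after the delay has expired.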
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?YTBPR01MB36647DA1465C7D35CDCE9681DD8B0>