Date:      Tue, 2 Jun 2020 04:30:23 +0000
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        "Rodney W. Grimes" <freebsd-rwg@gndrsh.dnsmgr.net>
Cc:        "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, Mark Johnston <markj@FreeBSD.org>, "patrykkotlowski@gmail.com" <patrykkotlowski@gmail.com>
Subject:   Re: how to fix an interesting issue with mountd?
Message-ID:  <YTBPR01MB36647DA1465C7D35CDCE9681DD8B0@YTBPR01MB3664.CANPRD01.PROD.OUTLOOK.COM>

Rodney Grimes wrote:
>> Hi,
>>
>> I'm posting this one to freebsd-net@ since it seems vaguely similar
>> to a network congestion problem and thought that network types
>> might have some ideas w.r.t. fixing it?
>>
>> PR#246597 - Reports a problem where (if I understand it correctly) a
>>    sighup is posted to mountd and then another sighup is posted to
>>    mountd while it is reloading exports, and the exports are not
>>    reloaded again.
>>    --> The simple patch in the PR fixes the above problem, but I think
>>           it will aggravate another one.
>> For some NFS servers, it can take minutes to reload the exports file(s).
>> (I believe Peter Erriksonn has a server with 80000+ file systems
>> exported.)
>> r348590 reduced the time taken, but it is still minutes, if I recall
>> correctly.
Actually, my recollection w.r.t. the times was way off.
I just looked at the old PR#237860 and, without r348590, it was 16 seconds
(i.e. seconds, not minutes) and with r348590 that went down to a fraction
of a second (there was no exact number in the PR, but I noted milliseconds
in the commit log entry).

I still think there is a risk of doing the reloads repeatedly.

>> --> If you apply the patch in the PR and sighups are posted to mountd as
>>        often as it takes to reload the exports file(s), it will simply
>>        reload the exports file(s) over and over and over again, instead
>>        of processing Mount RPC requests.
>>
>> So, finally to the interesting part...
>> - It seems that the code needs to be changed so that it won't "forget"
>>   sighup(s) posted to it, but it should not reload the exports file(s)
>>   too frequently.
>> --> My thoughts are something like:
>>   - Note that sighup(s) were posted while reloading the exports file(s)
>>     and do the reload again, after some minimum delay.
>>     --> The minimum delay might only need to be 1 second to allow some
>>            RPCs to be processed before reload happens again.
>>      Or
>>     --> The minimum delay could be some fraction of how long a reload
>>           takes. (The code could time the reload and use that to
>>           calculate how long to delay before doing the reload again.)
>>
>> Any ideas or suggestions? rick
>> ps: I've actually known about this for some time, but since I didn't
>>      have a good solution...
>
>Build a system that allows adding and removing entries from the
>in-mountd exports data so that you do not have to do a full
>reload every time one is added or removed?
>
>Build a system that uses 2 exports tables, the active one, and the
>one that is being loaded, so that you can process RPCs and reloads
>at the same time.
Well, r348590 modified mountd so that it built a new set of linked list
structures from the modified exports file(s) and then compared them with
the old ones, only doing updates to the kernel exports for changes.

It still processes the entire exports file each time, to produce the
in-memory linked lists in mountd (using hash tables and a binary tree).
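
To illustrate the "build the new set, compare with the old, only push changes" approach described above, here is a minimal sketch. It is not the r348590 code: it assumes the old and new exports have already been flattened into arrays sorted by path, and the names exp_entry, exp_delta and exports_diff are hypothetical, chosen only for this example. It just counts what a real implementation would add, remove or update in the kernel.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flattened export: a path plus its option string. */
struct exp_entry {
	const char *path;
	const char *opts;
};

/* Counts of what would have to be pushed to the kernel. */
struct exp_delta {
	size_t added;
	size_t removed;
	size_t changed;
	size_t unchanged;
};

/*
 * Walk two arrays that are sorted by path (strcmp order) in lockstep
 * and classify every entry.  Only added/removed/changed entries would
 * need a kernel update; unchanged ones are skipped.
 */
struct exp_delta
exports_diff(const struct exp_entry *oldv, size_t nold,
    const struct exp_entry *newv, size_t nnew)
{
	struct exp_delta d = { 0, 0, 0, 0 };
	size_t i = 0, j = 0;

	while (i < nold && j < nnew) {
		int c = strcmp(oldv[i].path, newv[j].path);

		if (c < 0) {
			/* Path exists only in the old set. */
			d.removed++;
			i++;
		} else if (c > 0) {
			/* Path exists only in the new set. */
			d.added++;
			j++;
		} else {
			/* Same path: compare the options. */
			if (strcmp(oldv[i].opts, newv[j].opts) != 0)
				d.changed++;
			else
				d.unchanged++;
			i++;
			j++;
		}
	}
	/* Whatever is left over on either side is a pure remove or add. */
	d.removed += nold - i;
	d.added += nnew - j;
	return (d);
}
```

Note that the whole exports file still has to be parsed to produce the new array; the comparison only avoids redundant kernel updates.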

Peter did send me a patch to use a db frontend, but he felt the only
performance improvements would be related to ZFS.
Since ZFS is something I avoid like the plague, I never pursued it.
(If anyone willing to do ZFS stuff wants to pursue this,
just email and I can send you the patch.)
Here's a snippet of what he said about it.
>  It looks like a very simple patch to create and even though it wouldn't
>  really improve the speed for the work that mountd does it would make
>  possible really drastic speed improvements in the zfs commands. They
>  (zfs commands) currently read through the text-based exports file
>  multiple times when you do work with zfs filesystems
>  (mounting/sharing/changing share options etc). Using a db based
>  exports file for the zfs exports (b-tree based probably) would allow
>  the zfs code to be much faster.

At this point, I am just interested in fixing the problem in the PR, rick
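
A minimal sketch of the idea quoted above (do not forget a SIGHUP that arrives during a reload, but wait before reloading again) might look like the following. It is not the patch in the PR and not mountd's code. The names reload_state, reload_should_start and reload_finished are hypothetical, and the two constants are assumptions: a 1 second floor and a delay of half the measured reload time, combining both suggested variants by taking the larger of the two. Times are passed in as seconds so the logic stays independent of how the caller reads its clock.

```c
#include <signal.h>
#include <stdbool.h>

#define RELOAD_MIN_DELAY	1.0	/* assumed floor, in seconds */
#define RELOAD_FRACTION		0.5	/* assumed fraction of reload time */

/* Set by the handler; the handler does nothing else. */
static volatile sig_atomic_t got_sighup = 0;

void
huphandler(int sig)
{
	(void)sig;
	got_sighup = 1;
}

struct reload_state {
	bool	pending;	/* a SIGHUP is waiting to be served */
	double	earliest;	/* no reload may start before this time */
};

/*
 * Called from the main loop.  Picks up any SIGHUP posted since the
 * last call and returns true when a reload should start now.  The
 * pending flag is cleared when the reload starts, so a SIGHUP that
 * arrives while the reload is running is remembered for later.
 */
bool
reload_should_start(struct reload_state *rs, double now)
{
	if (got_sighup) {
		got_sighup = 0;
		rs->pending = true;
	}
	if (!rs->pending || now < rs->earliest)
		return (false);
	rs->pending = false;
	return (true);
}

/*
 * Called when a reload completes.  The next reload is held off for
 * the larger of RELOAD_MIN_DELAY and a fraction of the time this
 * reload took, leaving a window for Mount RPCs to be processed.
 */
void
reload_finished(struct reload_state *rs, double start, double end)
{
	double delay = (end - start) * RELOAD_FRACTION;

	if (delay < RELOAD_MIN_DELAY)
		delay = RELOAD_MIN_DELAY;
	rs->earliest = end + delay;
}
```

With this shape, any number of SIGHUPs posted during a reload collapse into a single follow-up reload, and the main loop can use rs->earliest to bound how long it waits for RPCs before checking again.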
=0A=


