Date: Wed, 5 Jun 2019 09:50:06 +0200
From: Peter Eriksson <pen@lysator.liu.se>
To: Rick Macklem <rmacklem@uoguelph.ca>, Alexander Motin <mav@FreeBSD.org>, "mmacy@ixsystems.com" <mmacy@ixsystems.com>, "ryan@ixsystems.com" <ryan@ixsystems.com>, "pjd@freebsd.org" <pjd@freebsd.org>, "freebsd-fs@freebsd.org" <freebsd-fs@FreeBSD.org>
Subject: Re: RFC: patching fsshare in ZFS
Message-ID: <FFB04DDA-8DC8-4DBA-89AB-943E9638175D@lysator.liu.se>
In-Reply-To: <YQXPR01MB3128B7972F1DCDF2A163859DDD160@YQXPR01MB3128.CANPRD01.PROD.OUTLOOK.COM>
References: <YQXPR01MB3128B7972F1DCDF2A163859DDD160@YQXPR01MB3128.CANPRD01.PROD.OUTLOOK.COM>

Hi all!

I’ve been experimenting a little with adding support for a simple Berkeley DB-based “exports” database to mountd in order to speed things up for the ZFS share code. The changes to mountd are fairly simple, and the corresponding changes were pretty simple to add to the ZFS code too, last I tried it. It speeds things up quite a bit: no need to do linear searches through the /etc/zfs/exports file and no need to rewrite the file for changes either… With N*10000 NFS-shared filesystems like we have, this can be pretty nice to have.

My current DB-based code supports multiple exports entries per filesystem by separating the “rows” in the database entry for a filesystem with NUL characters.

Let me know if there is some interest in this from others than just me.

- Peter

> 2 - Peter has some NFS servers with 20000-72000+ file systems being exported.
>     The current code in fsshare.c copies the exports file and then appends the new
>     entry for a file system and then replaces the exports file with the new one.
>     I think this file copying happens for every file system, which seems like a lot
>     of overhead. (I forget what Peter said w.r.t. how long this takes, but I think it
>     was quite a while.)
>     My guess is that Pawel did this so that the update to the file would happen
>     atomically.
>     It seems to me that if mountd held a read lock on the export file while reading it
>     and fsshare() held a write lock on the file while appending the new entry, that
>     the file copying could be avoided?
>     - The main problem I see w.r.t. doing this is that an old mountd binary that doesn't
>       read lock the file could be broken by the fsshare() change.
>       --> One way to avoid this would be to have the new mountd write more than
>           just the pid in the MOUNTD_PID file so that fsshare() could tell if mountd was
>           going to be read locking the file.
>           OR
>           Just don't MFC the change and assume that the new mountd would be
>           released when the new fsshare() is (in FreeBSD13?).
>
> Anyhow, I can tweak mountd.c and fsshare.c, but that's as far as I can take it.
>
> Others would need to do testing and whatever it takes to get a change to fsshare.c
> into the ZFS sources.
>
> So, what do you think about this?
> rick
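
To make the NUL-separated “rows” idea above a bit more concrete, here is a minimal sketch in C of how a reader could walk the rows stored in one database value. This is not the actual mountd or ZFS code: the function name foreach_export_row, the callback type row_cb, and the assumption that the value is simply a byte buffer with a known length (as a key/value lookup would return) are all hypothetical and only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: receives one row and its length. */
typedef void (*row_cb)(const char *row, size_t rowlen, void *arg);

/*
 * Hypothetical helper, not taken from mountd or fsshare.c.
 * Walks the NUL-separated rows in one database value (data, len),
 * calls cb (if not NULL) for each non-empty row, and returns the
 * number of non-empty rows found.  A row passed to cb is only
 * guaranteed to be rowlen bytes long; it may not be NUL-terminated
 * if the value does not end in a NUL byte.
 */
size_t
foreach_export_row(const char *data, size_t len, row_cb cb, void *arg)
{
	size_t count = 0, start = 0, i;

	assert(data != NULL || len == 0);
	for (i = 0; i <= len; i++) {
		/* A row ends at a NUL byte or at the end of the value. */
		if (i == len || data[i] == '\0') {
			if (i > start) {
				if (cb != NULL)
					cb(data + start, i - start, arg);
				count++;
			}
			start = i + 1;
		}
	}
	return (count);
}
```

With a layout like this, a filesystem that has several exports lines still needs only one key lookup, and each row can then be handed to whatever already parses a single exports line. Empty rows (two NULs in a row, or a trailing NUL) are skipped in this sketch; whether the real code treats them that way is an assumption.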