Date: Fri, 28 Dec 2018 00:20:08 +0000
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Peter Eriksson <peter@ifm.liu.se>, "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: Re: Suggestion for hardware for ZFS fileserver
Message-ID: <YQBPR01MB0388B1A87193C374F69E6F86DDB70@YQBPR01MB0388.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <YQBPR01MB038805DBCCE94383219306E1DDB80@YQBPR01MB0388.CANPRD01.PROD.OUTLOOK.COM>
References: <CAEW%2BogZnWC07OCSuzO7E4TeYGr1E9BARKSKEh9ELCL9Zc4YY3w@mail.gmail.com> <C839431D-628C-4C73-8285-2360FE6FFE88@gmail.com> <CAEW%2BogYWKPL5jLW2H_UWEsCOiz=8fzFcSJ9S5k8k7FXMQjywsw@mail.gmail.com> <4f816be7-79e0-cacb-9502-5fbbe343cfc9@denninger.net>, <3160F105-85C1-4CB4-AAD5-D16CF5D6143D@ifm.liu.se>, <YQBPR01MB038805DBCCE94383219306E1DDB80@YQBPR01MB0388.CANPRD01.PROD.OUTLOOK.COM>
I wrote:
>Peter Eriksson wrote:
>[good stuff snipped]
>>This has caused some interesting problems...
>>
>>First thing we noticed was that booting would take forever... Mounting the 20-100k
>>filesystems _and_ enabling them to be shared via NFS is not done efficiently at all
>>(for each filesystem it re-reads /etc/zfs/exports (a couple of times) before
>>appending one line to the end). Repeat 20-100,000 times... Not to mention the big
>>kernel lock for NFS "hold all NFS activity while we flush and reinstall all sharing
>>information per filesystem" being done by mountd...
>Yes, /etc/exports and mountd were implemented in the 1980s, when a dozen
>file systems would have been a large server. Scaling to 10,000 or more file
>systems wasn't even conceivable back then.
>Wish list item #1: A BerkeleyDB-based 'sharetab' that replaces the horribly
>slow /etc/zfs/exports text file.
>Wish list item #2: A reimplementation of mountd and the kernel interface to allow
>a "diff" between the contents of the DB-based sharetab above to be input into the
>kernel instead of the brute-force way it's done now.
>The parser in mountd for /etc/exports is already an ugly beast and I think
>implementing a "diff" version will be difficult, especially figuring out what needs
>to be deleted.
>
>I do have a couple of questions related to this:
>1 - Would your case work if there was an "add these lines to /etc/exports"?
>    (Basically adding entries for file systems, but not trying to delete anything
>    previously exported. I am not a ZFS guy, but I think ZFS just generates another
>    exports file and then gets mountd to export everything again.)
>2 - Are all (or maybe most) of these ZFS file systems exported with the same
>    arguments?
>    - Here I am thinking that a "default-for-all-ZFS-filesystems" line could be
>      put in /etc/exports that would apply to all ZFS file systems not exported
>      by explicit lines in the exports file(s).
>      This would be fairly easy to implement and would avoid trying to handle
>      1000s of entries.
>
>In particular, #2 above could be easily implemented on top of what is already
>there, using a new type of line in /etc/exports and handling that as a special
>case by the NFS server code, when no specific export for the file system to the
>client is found.

Unfortunately, it doesn't sound like #2 above would be useful for Peter. Although
it is easy to implement a single default export for all ZFS file systems not already
exported, it would not be easy to say "export all file systems below /foo/bar this
way", since the kernel code basically doesn't know the directory structure. It only
has vnodes for file objects and mount points to work with. (The kernel exports hang
off of the mount points.)

>>(I've written some code that implements item #1 above and it helps quite a bit.
>>Nothing near production quality yet though. I have looked at item #2 a bit too but
>>not done anything about it.)
Btw, this "item #2" (Peter's wish list item) is not the #2 I am referring to above.

[more good stuff snipped]

rick
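
To make the two ideas above a bit more concrete, here is a rough sketch in
Python. It is not mountd or kernel code, and everything in it is hypothetical:
the function names (parse_exports, diff_exports, lookup_export), the simplified
one-path-per-line export format, and the idea of passing the file system type
as a string are all assumptions made for illustration. Real /etc/exports lines
are more complicated than this. diff_exports shows the sort of "diff" between
two export tables that wish list item #2 asks for, including the entries that
would have to be deleted. lookup_export shows the default-for-all-ZFS fallback;
it is keyed by mount point only, which is why a rule like "everything below
/foo/bar" does not fit into it.

```python
# Hypothetical sketch only: not mountd or NFS server code.
# An export table is modeled as a dict: mount point -> option string.

def parse_exports(text):
    """Parse simplified 'mountpoint options...' lines (assumed format).

    Blank lines and '#' comments are skipped. Real exports(5) syntax
    (several paths per line, continuation lines) is not handled.
    """
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        table[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return table


def diff_exports(old, new):
    """Return (to_delete, to_add, to_update) as sorted lists of mount points."""
    to_delete = sorted(p for p in old if p not in new)
    to_add = sorted(p for p in new if p not in old)
    to_update = sorted(p for p in new if p in old and old[p] != new[p])
    return to_delete, to_add, to_update


def lookup_export(explicit, zfs_default, mount_point, fstype):
    """Find export options for a mount point.

    An explicit entry wins. Otherwise a ZFS file system falls back to
    the single default entry, if one was configured. The lookup never
    looks at the path hierarchy, only at the mount point itself.
    """
    if mount_point in explicit:
        return explicit[mount_point]
    if fstype == "zfs" and zfs_default is not None:
        return zfs_default
    return None


if __name__ == "__main__":
    old = parse_exports("/tank/a -maproot=root\n/tank/b -ro\n")
    new = parse_exports("/tank/b -ro -network 10.0.0.0\n/tank/c -ro\n")
    print(diff_exports(old, new))
    print(lookup_export(new, "-ro", "/tank/home/u1", "zfs"))
```

With a diff like this, only the changed entries would need to be pushed to the
kernel instead of flushing and reloading everything, although the hard part
noted above (getting a reliable "old" table out of the existing parser) is not
addressed by the sketch.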