Date: Fri, 21 Dec 2018 23:49:58 +0000
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Peter Eriksson <peter@ifm.liu.se>, "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: Re: Suggestion for hardware for ZFS fileserver
Message-ID: <YQBPR01MB038805DBCCE94383219306E1DDB80@YQBPR01MB0388.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <3160F105-85C1-4CB4-AAD5-D16CF5D6143D@ifm.liu.se>
References: <CAEW%2BogZnWC07OCSuzO7E4TeYGr1E9BARKSKEh9ELCL9Zc4YY3w@mail.gmail.com> <C839431D-628C-4C73-8285-2360FE6FFE88@gmail.com> <CAEW%2BogYWKPL5jLW2H_UWEsCOiz=8fzFcSJ9S5k8k7FXMQjywsw@mail.gmail.com> <4f816be7-79e0-cacb-9502-5fbbe343cfc9@denninger.net>, <3160F105-85C1-4CB4-AAD5-D16CF5D6143D@ifm.liu.se>
Peter Eriksson wrote:
[good stuff snipped]
>This has caused some interesting problems…
>
>First thing we noticed was that booting would take forever… Mounting the 20-100k
>filesystems _and_ enabling them to be shared via NFS is not done efficiently at all (for
>each filesystem it re-reads /etc/zfs/exports (a couple of times) before appending one
>line to the end). Repeat 20-100,000 times… Not to mention the big kernel lock for
>NFS “hold all NFS activity while we flush and reinstall all sharing information per
>filesystem” being done by mountd…

Yes, /etc/exports and mountd were implemented in the 1980s, when a dozen
file systems would have been a large server. Scaling to 10,000 or more file
systems wasn't even conceivable back then.

>Wish list item #1: A BerkeleyDB-based ’sharetab’ that replaces the horribly
>slow /etc/zfs/exports text file.
>Wish list item #2: A reimplementation of mountd and the kernel interface to allow
>a “diff” between the contents of the DB-based sharetab above be input into the
>kernel instead of the brute-force way it’s done now..

The parser in mountd for /etc/exports is already an ugly beast and I think
implementing a "diff" version will be difficult, especially figuring out what
needs to be deleted.

I do have a couple of questions related to this:
1 - Would your case work if there was an "add these lines to /etc/exports"?
    (Basically adding entries for file systems, but not trying to delete anything
    previously exported. I am not a ZFS guy, but I think ZFS just generates
    another exports file and then gets mountd to export everything again.)
2 - Are all (or maybe most) of these ZFS file systems exported with the same
    arguments?
    - Here I am thinking that a "default-for-all-ZFS-filesystems" line could be
      put in /etc/exports that would apply to all ZFS file systems not exported
      by explicit lines in the exports file(s).
    This would be fairly easy to implement and would avoid trying to handle
    1000s of entries.

In particular, #2 above could be easily implemented on top of what is already
there, using a new type of line in /etc/exports and handling that as a special
case by the NFS server code, when no specific export for the file system to the
client is found.

>(I’ve written some code that implements item #1 above and it helps quite a bit.
>Nothing near production quality yet though. I have looked at item #2 a bit too but
>not done anything about it.)

[more good stuff snipped]

Btw, although I put the questions here, I think a separate thread discussing
how to scale to 10000+ file systems might be useful. (On freebsd-fs@ or
freebsd-current@. The latter sometimes gets the attention of more developers.)

rick
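
For illustration only, the lookup order that question 2 implies can be sketched as follows. This is Python for brevity, not code from mountd or the NFS server, and every name in it (lookup_export, explicit, zfs_mounts, zfs_default) is hypothetical. Export options are treated as opaque strings, and the example values are placeholders.

```python
from typing import Dict, Optional, Set


def lookup_export(
    path: str,
    explicit: Dict[str, str],
    zfs_mounts: Set[str],
    zfs_default: Optional[str],
) -> Optional[str]:
    """Return the export options for a mount point, or None if not exported.

    Assumed lookup order (a sketch of the idea, not an existing interface):
      1. an explicit per-file-system entry always wins;
      2. otherwise, a ZFS file system falls back to the single default line;
      3. anything else is not exported.
    """
    if path in explicit:
        return explicit[path]
    if zfs_default is not None and path in zfs_mounts:
        return zfs_default
    return None


# Hypothetical data: one explicit entry, thousands of file systems could
# share the single default without each needing a line of its own.
explicit = {"/tank/special": "explicit-options"}
zfs_mounts = {"/tank/special", "/tank/home/alice", "/tank/home/bob"}
zfs_default = "default-options"

for p in ("/tank/special", "/tank/home/alice", "/usr/local"):
    print(p, "->", lookup_export(p, explicit, zfs_mounts, zfs_default))
```

The point of the sketch is that the table of explicit entries stays small no matter how many ZFS file systems exist, since the fallback is consulted only when no specific export is found.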