Date:      Fri, 21 Dec 2018 23:49:58 +0000
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Peter Eriksson <peter@ifm.liu.se>, "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject:   Re: Suggestion for hardware for ZFS fileserver
Message-ID:  <YQBPR01MB038805DBCCE94383219306E1DDB80@YQBPR01MB0388.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <3160F105-85C1-4CB4-AAD5-D16CF5D6143D@ifm.liu.se>
References:  <CAEW%2BogZnWC07OCSuzO7E4TeYGr1E9BARKSKEh9ELCL9Zc4YY3w@mail.gmail.com> <C839431D-628C-4C73-8285-2360FE6FFE88@gmail.com> <CAEW%2BogYWKPL5jLW2H_UWEsCOiz=8fzFcSJ9S5k8k7FXMQjywsw@mail.gmail.com> <4f816be7-79e0-cacb-9502-5fbbe343cfc9@denninger.net>, <3160F105-85C1-4CB4-AAD5-D16CF5D6143D@ifm.liu.se>

Peter Eriksson wrote:
[good stuff snipped]
>This has caused some interesting problems...
>
>First thing we noticed was that booting would take forever... Mounting the
>20-100k filesystems _and_ enabling them to be shared via NFS is not done
>efficiently at all (for each filesystem it re-reads /etc/zfs/exports (a
>couple of times) before appending one line to the end). Repeat 20-100,000
>times... Not to mention the big kernel lock for NFS "hold all NFS activity
>while we flush and reinstall all sharing information per filesystem" being
>done by mountd...
Yes, /etc/exports and mountd were implemented in the 1980s, when a dozen
file systems would have been a large server. Scaling to 10,000 or more file
systems wasn't even conceivable back then.
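
To put rough numbers on the boot-time cost described above, here is a
minimal Python sketch. It is not mountd or ZFS code; it only counts lines
read, assuming the exports file holds one line per already-shared file
system and is re-read twice ("a couple of times") before each append. The
function names and the rereads_per_share parameter are made up for
illustration.

```python
def lines_read_per_filesystem(n_filesystems: int, rereads_per_share: int = 2) -> int:
    """Lines read when the exports file is re-read before every append.

    Assumption: one line per file system, and rereads_per_share full
    re-reads of the file each time one more file system is shared.
    """
    total = 0
    for already_shared in range(n_filesystems):
        # The file currently holds one line per file system shared so far.
        total += rereads_per_share * already_shared
    return total


def lines_read_batched(n_filesystems: int) -> int:
    """Lines read if the complete file is written once and read once."""
    return n_filesystems


if __name__ == "__main__":
    for n in (20_000, 100_000):
        print(n, lines_read_per_filesystem(n), lines_read_batched(n))
```

Under those assumptions, 20,000 file systems cost about 400 million line
reads with the per-filesystem approach, versus 20,000 for a single pass.
The growth is quadratic, which fits boot times becoming unreasonable at
this scale.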

>Wish list item #1: A BerkeleyDB-based 'sharetab' that replaces the horribly
>slow /etc/zfs/exports text file.
>Wish list item #2: A reimplementation of mountd and the kernel interface to
>allow a "diff" between the contents of the DB-based sharetab above to be
>input into the kernel instead of the brute-force way it's done now.
The parser in mountd for /etc/exports is already an ugly beast and I think
implementing a "diff" version will be difficult, especially figuring out
what needs to be deleted.
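
For illustration only, this is what a "diff" looks like in the simplest
possible model, where each file system has exactly one export entry and
the table is a plain mapping from path to an option string. That model is
an assumption: real /etc/exports lines can list several directories,
hosts and networks, which is where the hard part is. The names
diff_exports, old and new are hypothetical.

```python
from typing import Dict, List, Tuple


def diff_exports(
    old: Dict[str, str], new: Dict[str, str]
) -> Tuple[Dict[str, str], List[str], Dict[str, str]]:
    """Compare two export tables keyed by file system path.

    Returns (added, removed, changed):
      added   - paths only in the new table, with their options
      removed - paths only in the old table (these need deleting)
      changed - paths in both tables whose options differ
    """
    added = {path: opts for path, opts in new.items() if path not in old}
    removed = sorted(path for path in old if path not in new)
    changed = {
        path: opts
        for path, opts in new.items()
        if path in old and old[path] != opts
    }
    return added, removed, changed


if __name__ == "__main__":
    old = {"/tank/a": "-maproot=root", "/tank/b": "-ro"}
    new = {"/tank/b": "-ro -network 10.0.0.0/8", "/tank/c": "-ro"}
    added, removed, changed = diff_exports(old, new)
    print("add:", added)
    print("delete:", removed)
    print("update:", changed)
```

Even in this toy form, the "removed" list only works because every entry
has a unique key. Once one line can cover several file systems or
clients, deciding what a deletion means gets much less obvious.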

I do have a couple of questions related to this:
1 - Would your case work if there was an "add these lines to /etc/exports"?
     (Basically adding entries for file systems, but not trying to delete
      anything previously exported. I am not a ZFS guy, but I think ZFS just
      generates another exports file and then gets mountd to export
      everything again.)
2 - Are all (or maybe most) of these ZFS file systems exported with the same
      arguments?
      - Here I am thinking that a "default-for-all-ZFS-filesystems" line
         could be put in /etc/exports that would apply to all ZFS file
         systems not exported by explicit lines in the exports file(s).
      This would be fairly easy to implement and would avoid trying to
      handle 1000s of entries.

In particular, #2 above could be easily implemented on top of what is
already there, using a new type of line in /etc/exports and handling that
as a special case by the NFS server code, when no specific export for the
file system to the client is found.
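
The lookup order could be sketched roughly as follows. This is a Python
illustration of the idea only, not the kernel's export lookup and not an
existing /etc/exports syntax. All names (lookup_export, explicit,
zfs_filesystems, zfs_default) are hypothetical, and matching a client by
exact string is a simplification of real host/network matching.

```python
from typing import Dict, Optional, Set, Tuple


def lookup_export(
    fs: str,
    client: str,
    explicit: Dict[Tuple[str, str], str],
    zfs_filesystems: Set[str],
    zfs_default: Optional[str],
) -> Optional[str]:
    """Return the export options for (fs, client), or None if not exported.

    1. An explicit entry for this file system and client always wins.
    2. Otherwise, if the file system is ZFS and a default line exists,
       the default options apply.
    3. Otherwise the file system is not exported to this client.
    """
    options = explicit.get((fs, client))
    if options is not None:
        return options
    if fs in zfs_filesystems and zfs_default is not None:
        return zfs_default
    return None


if __name__ == "__main__":
    explicit = {("/tank/special", "10.0.0.5"): "-maproot=root"}
    zfs = {"/tank/special", "/tank/home/alice"}
    default = "-sec=krb5"
    print(lookup_export("/tank/special", "10.0.0.5", explicit, zfs, default))
    print(lookup_export("/tank/home/alice", "10.0.0.9", explicit, zfs, default))
    print(lookup_export("/ufs/data", "10.0.0.9", explicit, zfs, default))
```

The point of the design is that the table of explicit entries stays
small: thousands of ZFS file systems with identical arguments are covered
by the single default, and only the exceptions need their own lines.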

>(I've written some code that implements item #1 above and it helps quite a
>bit. Nothing near production quality yet though. I have looked at item #2 a
>bit too but not done anything about it.)
[more good stuff snipped]
Btw, although I put the questions here, I think a separate thread discussing
how to scale to 10000+ file systems might be useful. (On freebsd-fs@ or
freebsd-current@. The latter sometimes gets the attention of more
developers.)

rick
