Date:      Wed, 6 Mar 2019 21:31:50 +0000
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        "bugzilla-noreply@freebsd.org" <bugzilla-noreply@freebsd.org>, "fs@FreeBSD.org" <fs@FreeBSD.org>
Subject:   Re: [Bug 235774] [FUSE]: Need to evict invalidated cache contents on fuse_write_directbackend()
Message-ID:  <QB1PR01MB35379C69D7EE70F000B1975FDD730@QB1PR01MB3537.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <bug-235774-3630-gv5OeBYwCK@https.bugs.freebsd.org/bugzilla/>
References:  <bug-235774-3630@https.bugs.freebsd.org/bugzilla/>, <bug-235774-3630-gv5OeBYwCK@https.bugs.freebsd.org/bugzilla/>

--- Comment #4 from Conrad Meyer <cem@freebsd.org> ---
>I think fuse's IO_DIRECT path is a mess.  Really all IO should go through the
>buffer cache, and B_DIRECT and ~B_CACHE are just flags that control the
>buffer's lifetime once the operation is complete.  Removing the "direct"
>backends entirely (except as implementation details of strategy()) would
>simplify and correct the caching logic.
Hmm, I'm not sure that I agree that all I/O should go through the buffer cache,
in general. (I won't admit to knowing the fuse code well enough to comment
specifically on it.)

There is a code path on the NFS client that bypasses the buffer cache for
O_DIRECT, but it isn't enabled by default, so I doubt many use it. (It is
enabled via a sysctl which defaults to 0.)

I can see an argument for enabling it, since having the NFS (or FUSE) client do
a large amount of writing to a file can flood the buffer cache, and avoiding
this for the case where the client won't be reading the file would be nice.
What I am not sure of is whether O_DIRECT is a good indicator of "doing a lot
of writing that won't be read back".

>Looking at UFS; it really only has a non-bufcache "rawread" path that uses
>pbufs (and flushes all dirty bufs on the vnode first!).  There is no equivalent
>for O_DIRECT writes.  And ffs_rawread basically duplicates the ordinary read
>path for extremely limited cases (single iov, must be sector sized/aligned,
>etc); it's unclear to me why it exists.
>
>ffs_write() just uses the ordinary buf cache, paying attention to ioflag &
>IO_DIRECT and using vfs_bio_set_flags(, ioflag) to propagate it to b_flags &
>B_DIRECT.  (B_DIRECT causes the buffer to be released immediately when it is
>freed, instead of being cached.)
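
To make the quoted description concrete, here is a toy userland model of that
propagation. It is not the kernel code: the struct, the function names and the
MODEL_* flag values are made up for illustration, and the real
vfs_bio_set_flags() may well do more than handle this one flag.

```c
#include <stdbool.h>

/* Illustrative values only; do not assume they match the kernel headers. */
#define MODEL_IO_DIRECT 0x0010u   /* stands in for ioflag & IO_DIRECT */
#define MODEL_B_DIRECT  0x2000u   /* stands in for b_flags & B_DIRECT */

/* Hypothetical stand-in for struct buf; only the flags word is modelled. */
struct model_buf {
    unsigned int b_flags;
};

/* Copy the "direct" hint from the I/O flags onto the buffer. */
void model_set_flags(struct model_buf *bp, unsigned int ioflag)
{
    if ((ioflag & MODEL_IO_DIRECT) != 0)
        bp->b_flags |= MODEL_B_DIRECT;
}

/*
 * Decide what happens when the buffer is freed: a buffer marked direct is
 * released right away, anything else stays in the cache.
 */
bool model_keep_cached(const struct model_buf *bp)
{
    return (bp->b_flags & MODEL_B_DIRECT) == 0;
}
```

The point of the model is that the write still goes through a buffer either
way; the flag only changes the buffer's lifetime afterwards.
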
>
>I think we should probably learn from UFS for FUSE's IO modes:
>
>1. Keep and enable the direct_io option, for users who truly want to bypass the
>buf cache entirely.  Preferably this is a per-mountpoint option rather than a
>global, but that is an orthogonal enhancement.  Confusingly, this is distinct
>from opening a file O_DIRECT.  Maybe the sysctl/option can be renamed.  "raw
>io?"
>
>2. Do not actually use the "direct" paths in FUSE outside of global direct_io
>mode (or a future MP-specific always-direct mode).
>
>3. A caveat here is: FUSE filesystems (?)don't have a native sector/block size,
>but the buf cache is in block units.  And, we translate O_WRONLY opens into
>FUSE FUFH_WRONLY opens.  So there will be some trickiness in partial block
>writes with a O_WRONLY handle when the block is not in cache.  Today that is
>sidestepped by invoking direct mode, but shouldn't be.
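
The trickiness in point 3 comes down to simple arithmetic: a write that does
not start and end on block boundaries only partly covers some block, and if
that block is not in the cache, the rest of it would have to be read first,
which is presumably the problem with a write-only handle. A minimal sketch of
that check, assuming a fixed, nonzero block size; the function name is
hypothetical, and it ignores the case of a write at or past EOF, where there is
nothing to read back:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if the write [off, off + len) only partly covers at least
 * one block of size bsize, i.e. a block-based cache would need the old
 * contents of that block before it could apply the write.
 */
bool write_is_partial_block(uint64_t off, uint64_t len, uint64_t bsize)
{
    if (len == 0)
        return false;               /* nothing written, nothing to merge */
    return (off % bsize) != 0 || ((off + len) % bsize) != 0;
}
```

For example, with a 4096 byte block, a 50 byte write at offset 100 is partial,
while an 8192 byte write at offset 8192 covers two whole blocks and is not.
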
>
>Anyway, this is all future cleanup ideas for this area.  For the more limited
>scope of fixing just this PR, we can probably draw inspiration from
>ffs_rawread_sync().

rick