Date:      Sat, 29 Jan 2022 06:37:35 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        peterj@freebsd.org
Cc:        freebsd-fs@freebsd.org, freebsd-geom@freebsd.org
Subject:   Re: bio re-ordering
Message-ID:  <YfTEj1KLhQhoR3xP@kib.kiev.ua>
In-Reply-To: <YfTCs7j3TPZFcFCD@server.rulingia.com>
References:  <YfTCs7j3TPZFcFCD@server.rulingia.com>

On Sat, Jan 29, 2022 at 03:29:39PM +1100, peterj@freebsd.org wrote:
> I'm working on a GEOM Gate network client to better handle high-latency
> connections and have some questions regarding bio ordering assumptions
> (alternatively, how much should I be able to re-order bio requests without
> breaking things).  Within geom_gate, an incoming bio request is retrieved
> from the kernel using a G_GATE_CMD_START ioctl, processed in userland
> (typically by forwarding it to a remote system) and then returned via a
> G_GATE_CMD_DONE ioctl.  My GEOM Gate client can reorder requests quite
> aggressively and I suspect it's breaking some kernel assumptions regarding
> bio behaviour.  The following questions assume that BIO_READ, BIO_WRITE and
> BIO_FLUSH are valid but BIO_DELETE isn't supported.
> 
> a) In the absence of BIO_FLUSH operations, what (if any) are the limits on
>    reordering operations?  Given a block that initially contains A, followed
>    by a write B, read and write C, is there any constraint on which content
>    the read returns?
There are no limits.  Other software layers, or the hardware itself, can
process requests in arbitrary order.  This is why things are typically
done in the completion handler, and it is part of the reason for the
complexity of UFS SU (soft updates).
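
To make the "no limits" point concrete, here is a small userland model (plain
Python, not kernel code).  It assumes each request is applied at one instant
and that the three requests from the question may complete in any order; the
function name and the tuple encoding are made up for this illustration.

```python
from itertools import permutations

def possible_read_results(initial, ops):
    """Collect every value the read can observe over all completion orders."""
    results = set()
    for order in permutations(ops):
        block = initial
        for kind, value in order:
            if kind == "write":
                block = value          # the write lands on the block
            else:
                results.add(block)     # the read sees whatever is there now
    return results

# Block initially holds A; issued in order: write B, read, write C.
ops = [("write", "B"), ("read", None), ("write", "C")]
print(sorted(possible_read_results("A", ops)))  # ['A', 'B', 'C']
```

In this model the read may return A, B or C, so the consumer cannot infer
anything from issue order alone and has to wait for completions instead.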

> 
> b) Are individual BIO_READ and BIO_WRITE operations expected to be atomic
>    with respect to other BIO_WRITE operations?  Given 2 adjacent blocks that
>    initially contain AB, and successive write CD, read and write EF
>    operations to those blocks, is it expected that the read would return CD
>    (or maybe AD or EF, assuming that's valid from the previous question) or
>    could the write operations partially complete in different orders,
>    resulting in something like AD, CF, EB etc?
No.  At the very least, underlying entities can split a request into
several, each of which is ordered individually.  Typically, it is
higher-level code that ensures that there are no concurrent modifications
of the same block.  For instance, we exclusively lock vnodes and buffers
around metadata updates.  Similarly, we lock buffers until the data is
written to the device.
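
A sketch of why splitting rules out atomicity, again as a userland Python
model rather than anything taken from the kernel.  It assumes every request
is split into one sub-request per block and that all sub-requests may complete
in any order; `split` and `possible_reads` are hypothetical names.

```python
from itertools import permutations

def split(kind, data):
    """Split a multi-block request into one sub-request per block."""
    return [(kind, blkno, value) for blkno, value in enumerate(data)]

def possible_reads(initial, requests):
    """Enumerate what a split read can return over all sub-request orders."""
    subops = [op for kind, data in requests for op in split(kind, data)]
    seen = set()
    for order in permutations(subops):
        disk = list(initial)
        got = [None] * len(initial)
        for kind, blkno, value in order:
            if kind == "write":
                disk[blkno] = value
            else:
                got[blkno] = disk[blkno]
        seen.add("".join(got))
    return seen

# Blocks initially hold AB; issued in order: write CD, read, write EF.
# The "??" payload of the read is only a placeholder giving its length.
requests = [("write", "CD"), ("read", "??"), ("write", "EF")]
print(sorted(possible_reads("AB", requests)))
```

Under these assumptions the mixed results from the question (AD, CF, EB) all
show up, which is why the exclusion has to come from locking above the
device.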

> 
> c) I assume that a BIO_FLUSH should not return DONE until all preceding
>    write operations issued have completed.  Is it required that write
>    operations issued after the BIO_FLUSH must not complete before the
>    BIO_FLUSH completes?
UFS SU relies on BIO_FLUSH being a full barrier.
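
One way a reordering client could honour that, sketched as a userland Python
model.  This is my reading of "full barrier" as "nothing moves across a
flush in either direction", not code from geom_gate; the function name and
the request encoding are hypothetical.

```python
import random

def reorder_with_barriers(requests, rng):
    """Shuffle requests freely, but never move one across a flush."""
    out, segment = [], []
    for req in requests:
        if req[0] == "flush":
            rng.shuffle(segment)   # anything goes before the barrier
            out.extend(segment)
            out.append(req)        # flush completes after all earlier requests
            segment = []
        else:
            segment.append(req)
    rng.shuffle(segment)           # requests after the last flush
    out.extend(segment)
    return out

issued = [("write", 1), ("write", 2), ("flush", None),
          ("write", 3), ("read", 1)]
print(reorder_with_barriers(issued, random.Random(0)))
```

Requests issued before the flush complete before it, and requests issued
after it complete after it; inside each segment the order is still arbitrary.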


