Date: Thu, 17 Feb 2022 17:48:14 -0800
From: John-Mark Gurney <jmg@funkthat.com>
To: Peter Jeremy <peterj@freebsd.org>
Cc: FreeBSD FS <freebsd-fs@freebsd.org>, "freebsd-geom@FreeBSD.org" <freebsd-geom@freebsd.org>
Subject: Re: bio re-ordering
Message-ID: <20220218014814.GJ97875@funkthat.com>
In-Reply-To: <Yf5IUCWW/tgI/Cse@server.rulingia.com>
References: <YfTCs7j3TPZFcFCD@server.rulingia.com> <YfTEj1KLhQhoR3xP@kib.kiev.ua> <CANCZdfoqQ3Ze%2BcMTsk_ho9x8hsSM9=fTavSao%2BUtwc2nSAEJpQ@mail.gmail.com> <Yfo3i9Yy/uCUpss1@server.rulingia.com> <CANCZdfqBQOvzMCrJxWq9GzqCKyK_AubBE1CxAW5FULnE7D_jrg@mail.gmail.com> <b75872f4-521b-5eab-68d0-4b1c04a10add@FreeBSD.org> <CANCZdfp=0rbBkr4SoXhvn7hrQniPQzTeZra2HGBwXDGsJjN8XQ@mail.gmail.com> <9848cde6-5c12-cdd4-e722-42fe26fa0349@FreeBSD.org> <Yf5IUCWW/tgI/Cse@server.rulingia.com>

Peter Jeremy wrote this message on Sat, Feb 05, 2022 at 20:50 +1100:
> On 2022-Feb-02 11:49:44 +0200, Andriy Gapon <avg@freebsd.org> wrote:
> >On 02/02/2022 11:14, Warner Losh wrote:
> >> On Wed, Feb 2, 2022 at 2:05 AM Andriy Gapon <avg@freebsd.org
> >> <mailto:avg@freebsd.org>> wrote:
> >> Hmm... it looks like both the old and new (Open)ZFS use BIO_FLUSH command
> >> without BIO_ORDERED flag.  Not sure if it happens to do the right thing anyway
> >> or not.
> >>
> >>
> >> It's an unordered flush then. The flush will happen whenever. I have a vague
> >> memory that ZFS will only issue this command in cases where there's no other I/O
> >> pending.
> >
> >I think that there is still a potential problem that an earlier write request
> >might get re-ordered after the flush.
> >I think that we should add BIO_ORDERED for correctness.
>
> I've raised https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=261731 to
> make geom_gate support BIO_ORDERED.  Exposing the BIO_ORDERED flag to
> userland is quite easy (once a decision is made as to how to do that).
> Enhancing the geom_gate clients to correctly implement BIO_ORDERED is
> somewhat harder.

The clients are single threaded wrt IOs, so I don't think updating them
is required.

I do have patches to improve things by making ggated multithreaded to
improve IOPs, and so making this improvement would allow those patches
to be useful.

I do have a question though: what are the exact semantics of _ORDERED?
Do all the previous IOs have to be ack'd/received by the kernel before
the _ORDERED bio is executed, OR, once ggated (for example) has received
notification that the writes before an _ORDERED bio have completed, can
it then execute the _ORDERED command w/o the other side having received
that notification?
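
To make the question concrete, here is a minimal sketch of one possible reading, assuming _ORDERED acts as a full barrier: everything queued before it finishes first, and it finishes before anything queued after it starts. That reading is an assumption, not something this thread settles. The `Bio` class and `barrier_batches` function are hypothetical names for illustration only; they are not the kernel's struct bio or any ggated code. Whether a completed batch must also be acknowledged to the kernel before the next one starts is exactly the open question, and the sketch does not model it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bio:
    """Hypothetical stand-in for a request; not the kernel's struct bio."""
    name: str
    ordered: bool = False


def barrier_batches(bios: List[Bio]) -> List[List[Bio]]:
    """Split a queue into batches that a pool of workers could run in turn.

    Requests inside one batch may run concurrently; batches run strictly
    one after another. An ordered request always gets a batch of its own,
    so it never overlaps with requests queued before or after it.
    """
    batches: List[List[Bio]] = []
    current: List[Bio] = []
    for bio in bios:
        if bio.ordered:
            if current:
                # Close the batch of earlier requests first.
                batches.append(current)
                current = []
            batches.append([bio])
        else:
            current.append(bio)
    if current:
        batches.append(current)
    return batches


queue = [Bio("w1"), Bio("w2"), Bio("flush", ordered=True), Bio("w3")]
batch_names = [[b.name for b in batch] for batch in barrier_batches(queue)]
```

With this queue, `w1` and `w2` could be handed to separate worker threads, `flush` would wait for both, and `w3` would wait for `flush`.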

The reason I ask is that if the connection is broken before the kernel
ack's the pre-_ORDERED bios, but after the _ORDERED bio has been
written, what are the implications?  I can think of an issue where the
pre-_ORDERED and _ORDERED bios overlap that might cause problems.

Here is the scenario that I'm thinking of:

_WRITE 16 sectors at offset 0
_WRITE _ORDERED 16 sectors at offset 8
connection is now broken
ggate reconnects
kernel reissues both IOs.
_WRITE 16 sectors at offset 0
kernel crashes before the second _WRITE happens and needs to read the
data.

We now have a situation where sectors 16-24 have "new" data, while
sectors 8-16 have "old" data on them, which may corrupt what a FS
thinks.

And right now, the ggate protocol (from what I remember) doesn't have a
way to know when the remote kernel has received notification that an IO
is complete.

I guess this situation isn't any worse than it is right now w/o passing
the _ORDERED flag down though.

> I've done some experiments and OpenZFS doesn't generate BIO_ORDERED
> operations so I've also raised https://github.com/openzfs/zfs/issues/13065
> I haven't looked into how difficult that would be to fix.

-- 
John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
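
The overlapping-write scenario in the message can be replayed with a small simulation. This is only a sketch under stated assumptions: the disk is modeled as a Python list of 24 sectors, each write simply tags the sectors it covers, and the ranges "8-16" and "16-24" are read as half-open (sectors 8..15 and 16..23). The `write` and `replay_scenario` names are hypothetical and do not correspond to any ggate code.

```python
SECTORS = 24  # enough to cover both writes in the scenario


def write(disk, offset, count, tag):
    """Tag `count` sectors starting at `offset` with `tag`."""
    for sector in range(offset, offset + count):
        disk[sector] = tag


def replay_scenario():
    disk = ["-"] * SECTORS

    # Both writes reach the backing store before the connection breaks.
    write(disk, 0, 16, "old")   # _WRITE 16 sectors at offset 0
    write(disk, 8, 16, "new")   # _WRITE _ORDERED 16 sectors at offset 8

    # The connection breaks before the kernel sees either completion.
    # After the reconnect the kernel reissues both IOs, but only the
    # first one lands before the crash.
    write(disk, 0, 16, "old")   # reissued _WRITE 16 sectors at offset 0

    # Crash: the reissued _ORDERED write never happens.
    return disk


disk = replay_scenario()
print("sectors  0..15:", set(disk[0:16]))
print("sectors 16..23:", set(disk[16:24]))
```

After the replay, sectors 8..15 hold data from the first write again while sectors 16..23 still hold data from the _ORDERED write, which is the mixed state the message warns about.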