Date:      Sat, 6 Mar 2021 08:37:24 +0200
From:      Christos Chatzaras <chris@cretaforce.gr>
To:        FreeBSD-STABLE Mailing List <freebsd-stable@freebsd.org>
Subject:   Re: Filesystem operations slower in 13.0 than 12.2
Message-ID:  <A04014E7-6B81-466C-8B5B-C0CD58A91E77@cretaforce.gr>
In-Reply-To: <YEK6wbfgMa4vgFTV@kib.kiev.ua>
References:  <202103051842.125IgNl9013402@nuc.oldach.net> <D89C3A09-53F9-420C-B92F-17B7DE6D8B2A@cretaforce.gr> <YEK6wbfgMa4vgFTV@kib.kiev.ua>

Hello Konstantin,

> On 6 Mar 2021, at 01:12, Konstantin Belousov <kostikbel@gmail.com> wrote:
>
> There were (are) bugs in FreeBSD UFS SU < 13:
> - some LoR existed in the SU code, where it needed to lock a containing
>   directory to provide POSIX guarantees for fsync() while owning the
>   vnode lock. I do not believe it is observable in real-world use.

If you are talking about these changes:

https://svnweb.freebsd.org/base?view=revision&revision=367672

then the only way I could trigger "processes hanging in ufs state" was
while doing Prestashop translations: clicking "Save" removes and
recreates the Prestashop cache in the /var/cache/prod directory. I have
used FreeBSD since 6.x and this was the first time I could trigger it
(maybe it is related to the specific Prestashop version too).

> - in some situations UFS SU in < 13 did not perform the necessary fsync()
>   of the directory, related to the previous item
> The end result was that after a successful fsync() followed by a system
> failure, e.g. power loss or panic, the parent directory of the synced
> vnode would not be synced and the vnode's dirent would not be written to
> permanent storage. This violates the POSIX requirement that after fsync
> the data can be read, since you plainly cannot open the file.
>
> During the development of the patch to fix both the LoR and the related
> omission of fsync, a mistake was made resulting in much more aggressive
> syncing of directories. It was not exactly that, but approximately: on
> most metadata operations that created or removed a directory entry,
> the directory was fully synced. This resulted in a significant
> slowdown, which was eliminated around BETA4..RC1. I.e. most of the fixes
> went into BETA4, but minor parts were only discovered later and were
> ready for RC1.
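
As an aside for readers of the archive, the guarantee described above
(after fsync() succeeds, the file must still be reachable by name after
a crash) can be sketched from the application side. This is only an
illustrative sketch, not the kernel change itself: the helper name
durable_write is hypothetical, and the explicit fsync() of the parent
directory is a conservative pattern for code that does not want to
depend on the filesystem syncing the directory entry on its behalf.

```python
import os
import tempfile


def durable_write(path, data):
    """Write bytes to path, then fsync the file and its parent directory.

    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Flush the file's data and metadata to stable storage.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Also flush the directory that holds the new entry, so the name
    # itself survives a crash even if the filesystem did not sync it.
    parent = os.path.dirname(os.path.abspath(path))
    dirfd = os.open(parent, os.O_RDONLY)
    try:
        os.fsync(dirfd)
    finally:
        os.close(dirfd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "example.txt")
        durable_write(target, b"hello\n")
        with open(target, "rb") as f:
            print(f.read())
```

On a filesystem that already makes the directory entry durable as part
of fsync() on the file, the second fsync() should be redundant but
harmless.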

I am asking these questions to better understand how a FreeBSD developer
works (more specifically, when a bug has not been reported).

1) How did you discover this LoR / fsync omission bug? Did someone else
find and report it (I couldn't find a PR for this)? Was it discovered
by a test suite? Or did you find it while doing other work in this part
of the code?

2) When I reported the slowdown with BETA2 a few weeks ago, you replied
that this was a known bug and that it would be fixed in BETA3 or BETA4.
After the initial patches that made the syncing of directories more
aggressive, how did you discover the slowdown?

> There are still more fsync(dir) calls in 13RC1 than in any 12, by the
> nature of the bug and its fix, but the current belief is that all fsync
> calls left in the flow are required for correctness.

Thank you for explaining these changes.


