Date:      Sat, 6 Mar 2021 01:12:01 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Christos Chatzaras <chris@cretaforce.gr>
Cc:        FreeBSD-STABLE Mailing List <freebsd-stable@freebsd.org>, Kirk McKusick <mckusick@mckusick.com>
Subject:   Re: Filesystem operations slower in 13.0 than 12.2
Message-ID:  <YEK6wbfgMa4vgFTV@kib.kiev.ua>
In-Reply-To: <D89C3A09-53F9-420C-B92F-17B7DE6D8B2A@cretaforce.gr>
References:  <202103051842.125IgNl9013402@nuc.oldach.net> <D89C3A09-53F9-420C-B92F-17B7DE6D8B2A@cretaforce.gr>

On Sat, Mar 06, 2021 at 12:27:55AM +0200, Christos Chatzaras wrote:
> I did some more tests. Finally I see similar results (with the exception of one "portsnap extract" test). Also with 13.0 I can't trigger a bug that I describe here:
> 
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=250576
> 
> ----------------------------------------------------------------------------------------------------------------------------------------------------------
> 
> Command: /usr/bin/time -l rm -fr /usr/ports /usr/src (these tests done with exactly the same hardware - I upgrade 12.2p4 to 13.0-RC1 for the 2nd test)
> 
> FreeBSD 12.2p4
> 
>        12.67 real         0.36 user         1.94 sys
>        13.18 real         0.41 user         1.81 sys
>        12.16 real         0.36 user         1.85 sys

> FreeBSD 13.0-RC1
> 
>        16.71 real         0.63 user         3.02 sys
>        14.53 real         0.48 user         2.98 sys
>        13.97 real         0.70 user         2.85 sys
> 
> Command: /usr/bin/time -l tar xf src.tar (these tests done with 2 different idle servers but with same 4TB HDDs models)
> 
> FreeBSD 12.2p4
> 
>        37.35 real         1.03 user         3.34 sys
> 
> FreeBSD 13.0-RC1
> 
>        44.97 real         1.15 user         3.34 sys
> 
> ----------------------------------------------------------------------------------------------------------------------------------------------------------
> 
> Command: /usr/bin/time -l tar xf ports.tar (these tests done with 2 different idle servers but with same 4TB HDDs models)
> 
> FreeBSD 12.2p4
> 
>        50.80 real         1.55 user         4.62 sys
> 
> FreeBSD 13.0-RC1
> 
>        59.93 real         1.69 user         4.73 sys
> 
> ----------------------------------------------------------------------------------------------------------------------------------------------------------
> 
> 
> Command: /usr/bin/time -l portsnap extract (these tests done with 2 different idle servers but with same 4TB HDDs models)
> 
> FreeBSD 12.2p4
> 
>        99.45 real        34.90 user        59.63 sys
>       100.00 real        34.91 user        59.97 sys
>        82.95 real        35.98 user        60.68 sys
> 
> FreeBSD 13.0-RC1
> 
>       217.43 real        75.67 user       110.97 sys
>       125.50 real        63.00 user        96.47 sys
>       118.93 real        62.91 user        96.28 sys
I trimmed the data above to show the interesting numbers more compactly.
In the portsnap results for 13RC1, the variance is too high to conclude
anything, I think.
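
To put a number on that spread, here is a small sketch that summarizes
the "real" times of the portsnap runs quoted above. The helper name
summarize() is only illustrative, and with three samples per release
the figures are a rough indication, not a statistical test.

```python
import statistics

# "real" (wall-clock) seconds for "portsnap extract", copied from the
# quoted results above.
real_12_2p4 = [99.45, 100.00, 82.95]
real_13_rc1 = [217.43, 125.50, 118.93]


def summarize(samples):
    """Return (mean, sample standard deviation, coefficient of variation)."""
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)
    return mean, stdev, stdev / mean


for label, samples in (("12.2p4", real_12_2p4), ("13.0-RC1", real_13_rc1)):
    mean, stdev, cv = summarize(samples)
    print(f"{label}: mean={mean:.2f}s stdev={stdev:.2f}s cv={cv:.2f}")
```

The relative spread of the 13.0-RC1 runs comes out several times larger
than that of the 12.2p4 runs, driven mostly by the single 217.43 s run.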

There were (are) bugs in FreeBSD UFS SU (soft updates) before 13:
- a LoR (lock order reversal) existed in the SU code, where it needed to
  lock the containing directory to provide the POSIX guarantees for
  fsync() while already owning the vnode lock.  I do not believe it is
  observable in real-world use
- in some situations UFS SU before 13 did not perform the necessary
  fsync() of the directory, which is related to the previous item
The end result was that after a successful fsync() followed by a system
failure, e.g. power loss or a panic, the parent directory of the synced
vnode might not have been synced, so the vnode's dirent was not written
to permanent storage. This violates the POSIX requirement that after
fsync the data can be read, since you simply cannot open the file.
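
The scenario can be sketched from the application side as follows. This
is only an illustration: create_durably() is a hypothetical helper, and
the explicit fsync of the parent directory is the defensive pattern an
application might use when it cannot rely on the filesystem to make the
new directory entry durable as part of fsync() on the file. It is not
a description of what the UFS code does internally.

```python
import os
import tempfile


def create_durably(dirpath, name, data):
    """Create dirpath/name, write data, then flush file and parent directory."""
    path = os.path.join(dirpath, name)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write() may write fewer bytes than requested.
            written = os.write(fd, view)
            view = view[written:]
        # Flush the file's data and inode.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Defensive step: flush the directory that holds the new entry, so the
    # name survives a crash even if fsync() on the file did not cover it.
    dfd = os.open(dirpath, os.O_RDONLY)
    try:
        os.fsync(dfd)
    finally:
        os.close(dfd)
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(create_durably(d, "example.dat", b"hello\n"))
```

With the bug described above, the first fsync() alone could return
successfully while the directory entry was still only in memory.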

During the development of the patch to fix both the LoR and the related
omission of fsync, a mistake was made that resulted in much more
aggressive syncing of directories. It was not exactly this, but
approximately: on most metadata operations that created or removed a
directory entry, the directory was fully synced. This caused a
significant slowdown, which was eliminated around BETA4..RC1. I.e. most
of the fixes went into BETA4, but some minor parts were only discovered
later and were ready for RC1.
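
A rough way to exercise exactly this kind of workload, many operations
that create or remove directory entries, is sketched below. The function
name time_dirent_churn() and the file count are arbitrary choices for
illustration, and the directory must live on the filesystem under test
(the default temporary directory may be on something else entirely).

```python
import os
import tempfile
import time


def time_dirent_churn(dirpath, count=1000):
    """Create and then unlink `count` empty files; return (create_s, remove_s)."""
    names = [os.path.join(dirpath, f"f{i:06d}") for i in range(count)]

    t0 = time.perf_counter()
    for name in names:
        # Each create adds one directory entry.
        fd = os.open(name, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
        os.close(fd)
    t1 = time.perf_counter()

    for name in names:
        # Each unlink removes one directory entry.
        os.unlink(name)
    t2 = time.perf_counter()

    return t1 - t0, t2 - t1


if __name__ == "__main__":
    # Pass dir=... to place the scratch directory on the filesystem to test.
    with tempfile.TemporaryDirectory() as d:
        created, removed = time_dirent_churn(d)
        print(f"create: {created:.3f}s  remove: {removed:.3f}s")
```

Single runs of a loop like this are noisy, so it makes sense to repeat
it several times on an otherwise idle machine before comparing releases.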

There are still more fsync(dir) calls in 13RC1 than there are in any 12,
by the nature of the bug and its fix, but the current belief is that all
fsync calls left in the flow are required for correctness.


