Date:      Tue, 23 Mar 2021 00:56:08 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        Kevin Oberman <rkoberman@gmail.com>
Cc:        Adrian Chadd <adrian.chadd@gmail.com>, FreeBSD-STABLE Mailing List <freebsd-stable@freebsd.org>, Konstantin Belousov <kib@freebsd.org>
Subject:   Re: Filesystem operations slower in 13.0 than 12.2
Message-ID:  <380B1597-C4E7-4CF0-AE04-085D4745BC65@yahoo.com>
In-Reply-To: <CAN6yY1tC_sDQzUDE6f-XxkBk_9-q8tAKxREYUHZUCv_%2B7J1q0w@mail.gmail.com>
References:  <12705C29-53EA-4484-8291-C409AF4B3DE5.ref@yahoo.com> <12705C29-53EA-4484-8291-C409AF4B3DE5@yahoo.com> <CAN6yY1tT%2Bjoi=eyqmSPYS3apSy3-6WVM13z%2BifEzCzqqHY6oLA@mail.gmail.com> <CANCZdfpkXNUcDyLHXufM3qAVbaBV7RW8Oh6bHCQzv3%2BrafHssg@mail.gmail.com> <CAN6yY1sd-7CzGczu_HK2Q8WbUoDOfGSS2%2Bb-Sr08EBMuke=3Bw@mail.gmail.com> <CANCZdfqYb8VdNbk1tdXQPpJGEaMCmGsuFcj7BR5Q3FiOd7Osag@mail.gmail.com> <CAN6yY1uc4CEv66h=rqaOne0saPEBPFaOkWWq9s5qMNu4SMzA8A@mail.gmail.com> <CAN6yY1uu=sAKFSYHW1gcKFxE3yD7-1JgL0N5j1iQsAViQ80g1w@mail.gmail.com> <F94A457A-D179-4BA7-A8A6-82609777EBFA@yahoo.com> <CAN6yY1tJXGCUK%2BCOoDO_9vGVcseogD%2BCrsdpMH9r=hvDVD3uiQ@mail.gmail.com> <CAJ-Vmo=sHn7X5kejcE-eYeQDKKe1=mrruZp4nQJeghdmzBEpeg@mail.gmail.com> <CAN6yY1tC_sDQzUDE6f-XxkBk_9-q8tAKxREYUHZUCv_%2B7J1q0w@mail.gmail.com>



On 2021-Mar-22, at 22:51, Kevin Oberman <rkoberman at gmail.com> wrote:

> On Mon, Mar 22, 2021 at 8:19 AM Adrian Chadd <adrian.chadd@gmail.com> wrote:
>> On Mon, 15 Mar 2021 at 14:58, Kevin Oberman <rkoberman@gmail.com> wrote:
>>
>> > >
>> > > It appears that the messages are associated with reading
>> > > the disk(s), not directly with writing them, where the
>> > > reads take more than "hz * 20" time units to complete.
>> > > (I'm looking at main (14) code.) What might contribute
>> > > to the time taken for the pending read(s)?
>> > >
>> > The reference to hz * 20 woke up a few sleeping memory cells. I forgot that
>> > I cleaned up my loader.conf. It was largely a copy of the one on my
>> > decade-old T520. I commented out "kern.hz=100". I don't recall the details,
>> > but I think it was actually from an even older system, my T42 from before I
>> > retired.
>> >
>> > In any case, restoring this setting has greatly improved the situation. I
>> > now have really bad disk I/O performance on large disk to disk activity
>> > (untarring the firefox distro) instead of terrible performance and the
>> > system freezes have vanished, though I do see pauses in response to clicks
>> > or text entry, but the display remains active and the pauses are short... 1
>> > to 15 seconds, I'd guess. No, I have no idea what this indicates.
>>
>> ... which drive controller is this? Is it just a laptop ATA disk?
>>
>> > I'm still not seeing the performance I was seeing back in February when 40
>> > MB/s for extended intervals was common and I once untarred firefox.tar.gz2
>> > in under a minute and performance seldom dropped below 1.4 MB/s.
>>
>> Did you find a resolution?  I wonder if setting kern.hz is kicking
>> some process(es) to get some time more frequently due to bugs
>> elsewhere in the system (interrupts, IPI handling, wake-ups, etc)
>>
>> -adrian

> No resolution. This is a Lenovo L15 ThinkPad with a 2TB ATAPI drive.

I've not found documentation indicating the "which drive
controller" answer. That may have to be answered from boot
messages or boot -v messages or other such on FreeBSD.
(I've no access to such a machine.)

You might want to put a copy of such a log someplace that
folks could look at it. There may be commands that some
folks would like to see the output of. (I'm not all that
likely to be one that could put such to use but other
folks might be able to.)

Which processor is it: Intel® Celeron®? 10th Generation Intel Core™ i3? i5? i7?

> The current drive is a Seagate.  All testing has been done since I got
> it back from Lenovo in late January. I can read or write the drive at
> reasonable rates that exceed 50 MB/s. Extracting a tar distribution file
> is painful. I have had firefox extracts take over a half hour. Worse, if
> I do other operations while the extract is taking place, I often see a
> 30 second (and, occasionally 60 second) display freezes

I thought that you had reported that use of kern.hz=100
had led to "the system freezes have vanished" and "pauses
are short... 1 to 15 seconds". Did more testing show that
this is not always the case?

> as well as log reports that of "swap_pager: indefinite wait buffer:"

Unfortunately, I do not know how to investigate what is leading to
those messages being generated. Figuring that out would seem to be
important, but I do not know what to monitor to at least potentially
eliminate some possibilities.

One possible thing to look at is something like "gstat -spod"
output spanning the time of the untar. It would at least
indicate if a large queue backlog was accumulating on the
device. And the ms/r and ms/w columns would give a clue if
commands are sitting in the queues for long periods. (The
"d" may be a waste if no BIO_DELETEs are possible. Also, r/s
and ms/r are distinct measurements, not rescaled reciprocals
of each other. The same goes for w/s and ms/w.)

Given the "indefinite wait buffer" messages, I expect
the ms/r and/or ms/w figures to be large at least some
of the time. Knowing how large may be of use to someone.
But I can not eliminate anything with such information.
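
For anyone wanting to pick the large figures out of a saved gstat log
without reading it by eye, here is a minimal sketch. It assumes the output
was captured to a text file in one of gstat's batch modes (see gstat(8)),
and that each sample has a header row starting with "L(q)" and containing
"ms/r" and "ms/w", with the provider name last on each data row. The sample
text, its numbers, and the thresholds are invented for illustration; they
are not from the machine being discussed.

```python
# Hypothetical excerpt of saved gstat batch output. The layout is an
# assumption and every number here is made up for illustration.
SAMPLE = """\
dT: 1.002s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0     12      2     64    8.1     10    640   15.3    9.8| ada0
dT: 1.001s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
  128     40      1     16 21500.0     39   2496 3100.5  100.0| ada0
"""


def find_slow_samples(lines, ms_limit=1000.0, queue_limit=32):
    """Return (name, queue_length, ms_per_read, ms_per_write) for rows that
    reach the queue-length limit or either latency limit (arbitrary defaults)."""
    header = None
    hits = []
    for line in lines:
        tokens = line.split()
        if not tokens or tokens[0].startswith("dT"):
            continue  # blank line or per-sample timing line
        if tokens[0] == "L(q)":
            header = tokens  # remember column order for the rows that follow
            continue
        if header is None or len(tokens) != len(header):
            continue  # not a data row in the layout assumed above
        try:
            queue = int(tokens[0])
            ms_r = float(tokens[header.index("ms/r")])
            ms_w = float(tokens[header.index("ms/w")])
        except ValueError:
            continue  # column missing or value not numeric: skip the row
        if queue >= queue_limit or ms_r >= ms_limit or ms_w >= ms_limit:
            hits.append((tokens[-1], queue, ms_r, ms_w))
    return hits


if __name__ == "__main__":
    for name, queue, ms_r, ms_w in find_slow_samples(SAMPLE.splitlines()):
        print(f"{name}: L(q)={queue} ms/r={ms_r} ms/w={ms_w}")
```

The columns are looked up by header name rather than by fixed position
because -s, -o and -d add columns, so the positions of ms/r and ms/w shift.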

>  This is a bit odd as I have 20G of RAM and am pretty close to no swap
> space activity, but, of course, paging does occur.

With 20 GiBytes of RAM, what is going on at the time that
leads to paging activity? I'm thinking of just untarring
the firefox file, not building firefox or such. Can you
test such an untar in a context that is not otherwise
paging (nor swapping)? If yes, is the behavior different
in any readily noticeable way?
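
One rough way to tell whether an untar ran without swap activity is to save
sysctl counter output before and after the run and compare the two. The
sketch below only diffs two text snapshots of "name: value" lines. The
counter names in the sample (vm.stats.vm.v_swapin and vm.stats.vm.v_swapout)
are my assumption about what is worth watching, so check "sysctl vm.stats.vm"
on the machine for what is actually present. The values are invented.

```python
# Hypothetical before/after snapshots of sysctl output. The counter names
# are an assumption and the values are made up for illustration.
BEFORE = """\
vm.stats.vm.v_swapin: 10
vm.stats.vm.v_swapout: 4
"""
AFTER = """\
vm.stats.vm.v_swapin: 10
vm.stats.vm.v_swapout: 9
"""


def parse_counters(text):
    """Parse 'name: value' lines into a dict, keeping only integer values."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.lstrip("-").isdigit():
            values[name.strip()] = int(value)
    return values


def counter_deltas(before_text, after_text):
    """Return after-minus-before for every counter present in both snapshots."""
    before = parse_counters(before_text)
    after = parse_counters(after_text)
    return {name: after[name] - before[name] for name in after if name in before}


if __name__ == "__main__":
    for name, delta in sorted(counter_deltas(BEFORE, AFTER).items()):
        print(f"{name}: {delta:+d}")
```

All-zero deltas across the untar would suggest the run was not swapping.
Nonzero deltas would point back at the question of what else was running.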

> This system is CometLake and graphics are not supported on 12. I am
> not absolutely sure that there is not a hardware issue even though the
> main board, the disk, and the keyboard/mouse pad have all been replace
> since I received the system back last June. I now wonder what else could
> go wrong.


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



