Date:      Mon, 23 Aug 2021 14:04:49 +0200
From:      Mateusz Guzik <mjguzik@gmail.com>
To:        Alan Somers <asomers@freebsd.org>
Cc:        FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject:   Re: sysctl is too slow
Message-ID:  <CAGudoHFGHdkcU7TnGFpXxzFTs2mbS3JdNdckJ=RG4urU9=k5gw@mail.gmail.com>
In-Reply-To: <CAOtMX2iwRCwykfb=sumDGjWMRZ1HeRJGk2POBTDz12CjsihU1A@mail.gmail.com>
References:  <CAOtMX2h7xkDM=GsPVyiWNcqxfRo7euZuuquSMn-y=PY5zRZNjg@mail.gmail.com> <CAGudoHGxWBLW2D6JX8mQCPwgM=ngt%2B3uZmwxK5p7yM6XeXXjsQ@mail.gmail.com> <CAOtMX2jVzFURn6S0W9ygDpAjEK78ApEjz0C8hQYQG6UWPYY-Zw@mail.gmail.com> <CAGudoHG%2BLjJQjxenNdrcfTLtnnkOr2jC-bpcX_BWtO-CSZTYAw@mail.gmail.com> <CAOtMX2iwRCwykfb=sumDGjWMRZ1HeRJGk2POBTDz12CjsihU1A@mail.gmail.com>

So is this something you plan on fixing?
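
To put a rough number on the linear scans discussed below: if each level is
a linked list and every one of N siblings gets looked up once, the total
walk is N(N+1)/2 node visits. The following is a minimal sketch in portable
C, not kernel code. struct node, build_siblings(), linear_visits(),
search_visits() and total_visits() are names made up for illustration, and a
binary search over a sorted array stands in for an RB tree (same O(log N)
lookup cost, different data structure).

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one sysctl node; not the kernel's structure. */
struct node {
    int number;         /* OID component at this level */
    struct node *next;  /* next sibling in the list */
};

/* Build n siblings numbered 0..n-1, linked in ascending order.
 * The nodes live in one array so they can also be binary searched.
 * The caller frees the returned pointer. */
struct node *build_siblings(size_t n)
{
    struct node *nodes = calloc(n, sizeof(*nodes));
    if (nodes == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        nodes[i].number = (int)i;
        nodes[i].next = (i + 1 < n) ? &nodes[i + 1] : NULL;
    }
    return nodes;
}

/* Linear scan: count the nodes visited until `number` is found. */
size_t linear_visits(const struct node *head, int number)
{
    size_t visits = 0;
    for (const struct node *p = head; p != NULL; p = p->next) {
        visits++;
        if (p->number == number)
            break;
    }
    return visits;
}

/* Binary search over the sorted array: count the probes until `number`
 * is found. This is only a stand-in for a balanced tree lookup. */
size_t search_visits(const struct node *nodes, size_t n, int number)
{
    size_t lo = 0, hi = n, visits = 0;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        visits++;
        if (nodes[mid].number == number)
            break;
        if (nodes[mid].number < number)
            lo = mid + 1;
        else
            hi = mid;
    }
    return visits;
}

/* Look up each of the n siblings once and return the total visit count. */
size_t total_visits(size_t n, int use_search)
{
    struct node *nodes = build_siblings(n);
    size_t total = 0;
    if (nodes == NULL)
        return 0;
    for (size_t i = 0; i < n; i++)
        total += use_search ? search_visits(nodes, n, (int)i)
                            : linear_visits(nodes, (int)i);
    free(nodes);
    return total;
}
```

With n = 5000, total_visits(5000, 0) comes to 12,502,500 node visits, while
total_visits(5000, 1) stays at or below 65,000 (at most 13 probes per
lookup).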

On 8/17/21, Alan Somers <asomers@freebsd.org> wrote:
> Actually, I did get a flamegraph, and only 0.77% of samples were in ZFS.
>
> On Mon, Aug 16, 2021 at 7:19 PM Mateusz Guzik <mjguzik@gmail.com> wrote:
>
>> On 8/16/21, Alan Somers <asomers@freebsd.org> wrote:
>> > Yes, I see what you're talking about now.  There are a bunch of linked
>> > lists in sysctl_find_oid etc.  Good point.
>> > -Alan
>> >
>>
>> You still want to get a flamegraph; chances are most of the problem is in
>> ZFS.
>>
>> > On Mon, Aug 16, 2021 at 1:30 PM Mateusz Guzik <mjguzik@gmail.com> wrote:
>> >
>> >> Last time I checked lookup of a sysctl was very bad with linear scans
>> >> all over.
>> >>
>> >> Short of a complete revamp of the entire thing, I would start with
>> >> replacing the scans with an RB tree at each level. As is, if you indeed
>> >> have 5000 datasets, you are doing increasingly longer walks.
>> >>
>> >> On 8/16/21, Alan Somers <asomers@freebsd.org> wrote:
>> >> > ztop feels very sluggish on a server with 5000 ZFS datasets.  Dtrace shows
>> >> > that almost all of its time is spent in sys_sysctl.  ktrace shows that both
>> >> > ztop and sysctl(8) call sys_sysctl a total of five times for each sysctl
>> >> > they care about:
>> >> >
>> >> > 1) To get the next oid
>> >> > 2) To get the sysctl's name
>> >> > 3) To get the oidfmt
>> >> > 4) To get the size of the value
>> >> > 5) To get the value itself.
>> >> >
>> >> > Each of these steps takes about equal time, and together all five take
>> >> > about 100us.  If the time per call is mostly syscall overhead, then the
>> >> > process could be sped up by 80% by combining all of these things into a
>> >> > single syscall: return the next oid, its name, its format, the size of its
>> >> > value, and optimistically the value itself, assuming the user passed a
>> >> > sufficiently large buffer.
>> >> >
>> >> > Am I missing something?  Is there any other reason why sysctl is so slow?
>> >> > Or should I forget about it, and try to export ZFS's dataset stats through
>> >> > devstat instead?
>> >> > -Alan
>> >> >
>> >>
>> >>
>> >> --
>> >> Mateusz Guzik <mjguzik gmail.com>
>> >>
>> >
>>
>>
>> --
>> Mateusz Guzik <mjguzik gmail.com>
>>
>
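
As a sanity check on the 80% figure in the original message: it follows from
a simple model where every syscall costs the same fixed amount, so merging k
calls into one removes (k-1)/k of the time. A small sketch of that
arithmetic follows. The function names are made up for illustration, and
the 20us per call is just the quoted 100us divided by five, not a
measurement.

```c
/* Time spent per OID, in microseconds, when `calls` equal-cost syscalls
 * are issued for it. */
double per_oid_us(unsigned calls, double per_call_us)
{
    return calls * per_call_us;
}

/* Fraction of time saved by merging `calls` equal-cost syscalls into one.
 * Assumption: fixed per-call cost dominates, as in the quoted estimate. */
double merged_saving(unsigned calls)
{
    if (calls == 0)
        return 0.0;
    return 1.0 - 1.0 / (double)calls;
}
```

For five calls the saving is 0.8: per-OID time drops from about 100us to
about 20us, which is a 5x speedup. The model says nothing about where the
per-call time actually goes.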


-- 
Mateusz Guzik <mjguzik gmail.com>


