Date:      Mon, 2 Nov 2020 10:30:17 -0800
From:      John Baldwin <jhb@FreeBSD.org>
To:        Konstantin Belousov <kostikbel@gmail.com>, "Alexander V. Chernikov" <melifaro@ipfw.ru>
Cc:        freebsd-arch <freebsd-arch@freebsd.org>
Subject:   Re: Versioning support for kernel<>userland sysctl interface
Message-ID:  <efddf8f3-ae00-72f3-a66d-ed603c19a4e1@FreeBSD.org>
In-Reply-To: <20201101183919.GK2654@kib.kiev.ua>
References:  <356181604233241@mail.yandex.ru> <20201101183919.GK2654@kib.kiev.ua>

On 11/1/20 10:39 AM, Konstantin Belousov wrote:
> On Sun, Nov 01, 2020 at 12:47:17PM +0000, Alexander V. Chernikov wrote:
>>
>> Hey folks,
>>
>> I would like to propose a change [1] that introduces versioning support for the data structures exposed to userland by sysctl interface.
>>
>> We have dozens of interfaces exposing various statistics and control data by filling in and exporting structures.
>> net.inet6.icmp6.stats and net.inet6.icmp6.nd6_prlist are good examples of such interaction.
>>
>> Most of these structures do not have version information embedded, which requires us to break compatibility when changing them.
>>
>> The idea behind the change is really simple: append current structure version number to the sysctl OID to get the desired version of the structure.
>>
>> For example, fetching "net.inet6.icmp6.stats" becomes "net.inet6.icmp6.stats.1" (or, code-wise, something like "net.inet6.icmp6.stats." __XSTRING(ICMP6STAT_VER)).
>>
>> The interface satisfies the following properties:
>> 1) preserving backward compatibility
>> 2) allowing for low-cost kernel ABI maintenance
>> 3) allowing for forward compatibility - an application can fetch the list of all supported versions of a structure.
>>
>>
>> Example:
>> 11:25 [1] m@devel0 sysctl -o net.inet6.icmp6.stats
>> net.inet6.icmp6.stats.0: Format:S Length:4328 Dump:0x00000000000000000000000000000000...
>> net.inet6.icmp6.stats.1: Format:S Length:4624 Dump:0x00000000000000000000000000000000...
>>
>> 12:42 [1] m@devel0 ~/test net.inet6.icmp6.stats
>> sysctlnametomib("net.inet6.icmp6.stats")=0 -> 4.28.58.1.
>> sysctl("net.inet6.icmp6.stats")=-1 sz=512
>>
>> 12:43 [1] m@devel0 ~/test net.inet6.icmp6.stats.1
>> sysctlnametomib("net.inet6.icmp6.stats.1")=0 -> 4.28.58.1.1.
>> sysctl("net.inet6.icmp6.stats.1")=-1 sz=512
>>
>>
>>
>>
>> One downside of this change would be the potential need to duplicate structure definitions to be 100% sure we don't break the API. For example, rebuilding & running 3rd-party software may result in an error fetching the necessary structure: an unmodified application built against the latest structure definition will still request the oldest version of the structure.
>>
>> I see multiple approaches to address it:
>> 1) duplicate the structure under a new name (appending a postfix like _v) - works best for small structures
>> 2) do nothing specific - will mostly work for append-only statistics structures
>> 3) rely on kernel warning on calling unversioned sysctls to identify & fix the problematic customers
>>
>> Please take a look at [1] for a more detailed technical description of a change. 
>>
>> Any feedback is highly appreciated.
>>
>> [1] https://reviews.freebsd.org/D27035
> 
> There was some desire to provide backward ABI-compat shims for sysctls during
> ino64 work, https://reviews.freebsd.org/D10439.
> 
> The most prominent idea from that time, AFAIR, was to have another MIB tree
> that would have all the same MIBs but rooted with osrel.  In other words,
> if you accessed e.g. MIB 1.2.3.4, libc internally translates that to MIB
> 1024.<osrel>.1.2.3.4, and kernel applies whatever shims it knows about that
> osrel version.  If there is no compat, call goes directly to 1.2.3.4 handler.
> The osrel value can be taken from the binary ABI note, as an example.
> 
> There was some discussion, but after more work was done on this, it appeared
> that not many sysctls need ABI shims at all, and the interesting cases could
> be adequately handled simply by checking the passed buffer length.
> 
> The osrel approach has a drawback that it ignores possibly different ABI
> of the loaded shared library which might make the call.  On the other hand,
> it avoids introducing additional burden of requiring consumers to learn
> new MIBs and manually handle versions.
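
To make the translation Konstantin describes concrete, here is a minimal
sketch of the MIB rewriting step only.  COMPAT_ROOT and compat_mib_wrap()
are made-up names for illustration; 1024 is just the example root from the
text above, and nothing here is taken from the actual review.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical root of the compat tree; 1024 is the example value above. */
#define COMPAT_ROOT 1024

/*
 * Build {COMPAT_ROOT, osrel, mib[0], ..., mib[len - 1]} in out.
 * Returns the new length, or 0 if out cannot hold len + 2 entries.
 */
static size_t
compat_mib_wrap(const int *mib, size_t len, int osrel, int *out, size_t outcap)
{
	assert(mib != NULL && out != NULL);
	if (outcap < len + 2)
		return (0);
	out[0] = COMPAT_ROOT;
	out[1] = osrel;
	memcpy(out + 2, mib, len * sizeof(mib[0]));
	return (len + 2);
}
```

With this, MIB 1.2.3.4 and some osrel value N become 1024.N.1.2.3.4.
Where osrel comes from (e.g. the binary's ABI note) is left out of the
sketch.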

Some other thoughts were along the lines of having a kind of "sysctl tree"
version that would get bumped when there was an ABI breakage of a node
and then have associated versioned symbols of sysctl() and related symbols
in libc to handle the shared library problem.  However, you'd want to avoid
an explosion of symbol versions.

One of the goals was to keep API compat as much as possible.  I think we
might have also considered having 'sysctl_ver()' that takes an explicit
__FreeBSD_version value and having 'sysctl()' become a macro that passes
in '__FreeBSD_version' to sysctl_ver() so that you encode the desired ABI
in each invocation.  You would have to keep existing symbols in libc that
would pass in a version of 0.  One concern with this approach is that it
removes the public 'sysctl' symbol, which might break configure scripts
and the like.
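
A rough sketch of the macro idea, purely for illustration: every demo_*
name below is hypothetical, DEMO_OSVERSION stands in for __FreeBSD_version,
and the "kernel" side is just a variable that records the version it was
handed so the sketch can be checked without a real sysctl tree.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for __FreeBSD_version; the value is illustrative only. */
#define DEMO_OSVERSION 1300123

/* Records the last ABI version passed in, in place of a real kernel. */
static int demo_last_abi = -1;

/* Hypothetical versioned entry point; a real one would hand abi to the kernel. */
static int
demo_sysctl_ver(int abi, const int *name, unsigned int namelen,
    void *oldp, size_t *oldlenp, const void *newp, size_t newlen)
{
	(void)name; (void)namelen; (void)oldp;
	(void)oldlenp; (void)newp; (void)newlen;
	assert(abi >= 0);
	demo_last_abi = abi;
	return (0);
}

/* Newly compiled callers: the build-time version is baked into each call. */
#define demo_sysctl(name, namelen, oldp, oldlenp, newp, newlen)		\
	demo_sysctl_ver(DEMO_OSVERSION, (name), (namelen), (oldp),	\
	    (oldlenp), (newp), (newlen))

/* Existing binaries keep resolving a real symbol, which passes version 0. */
static int
demo_sysctl_legacy(const int *name, unsigned int namelen,
    void *oldp, size_t *oldlenp, const void *newp, size_t newlen)
{
	return (demo_sysctl_ver(0, name, namelen, oldp, oldlenp, newp, newlen));
}
```

The sketch uses two different names for the macro and the legacy function
to stay simple.  In the scheme described above both would be called
'sysctl', which is exactly where the configure-script concern comes from.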

-- 
John Baldwin


