Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 22 May 2023 11:56:17 +0200
From:      Hans Petter Selasky <hps@selasky.org>
To:        Konstantin Belousov <kostikbel@gmail.com>, Warner Losh <imp@bsdimp.com>
Cc:        "freebsd-arch@freebsd.org" <freebsd-arch@freebsd.org>
Subject:   Re: [RFC] An idea for general kernel post-processing automation in FreeBSD
Message-ID:  <394775eb-1000-60d9-2ddd-5c2aca6c99ea@selasky.org>
In-Reply-To: <ZGq03_RdpqXppA-h@kib.kiev.ua>
References:  <b0461320-fb92-95cf-609b-3550eb30a589@selasky.org> <CANCZdfowk1oQnw30hk%2BA-pSq0_QK52dahF6x3AUi=A76J18R9w@mail.gmail.com> <ZGq03_RdpqXppA-h@kib.kiev.ua>

next in thread | previous in thread | raw e-mail | index | archive | help
On 5/22/23 02:18, Konstantin Belousov wrote:
> On Sun, May 21, 2023 at 05:34:26PM -0600, Warner Losh wrote:
>> On Sun, May 21, 2023 at 2:13 PM Hans Petter Selasky <hps@selasky.org> wrote:
>>
>>> However, if the data in question is sorted at compile time, then zero
>>> time will be spent sorting the data in the kernel. When a kernel module
>>> is loaded and the sysinit/constructors are already sorted by their
>>> subsystem/order/priority, all you need to do is to merge two sorted
>>> lists into another sorted list, and this is pretty much linear.
>>>
>>> The question is, who can sort the sysinits:
>>>
>>
>> The bigger question is "Do we even want to do that?"

Hi,

>> Switching to a faster sort gets us from milliseconds to microseconds
>> without adding a lot of hair.

When you can more than double the benefit by simple means, why not just 
do that?

I know embedded CPUs run at kHz speeds when booting up, and only later 
on when clocks and such are properly setup, you get the GHz speeds.

And Linux pre-sorts the initializers too, by a simple "level". It looks 
like they encode it into the symbol name itself.

>>
>>
>>> 1) "ld" can sort symbols by name, but encoding the priority into the
>>> symbol name is difficult, especially when arithmetic expressions are
>>> used. The C-pre-processor can only concat stuff. No, character
>>> arithmetic is possible. This solution is not possible.
>>>
>>> 2) The constructor attribute in "C" can take a priority argument,
>>> probably 32-bit and we need 64-bits. Sounds great, but how can you edit
>>> these lists and merge the constructors? What about destructors and
>>> priority. Maybe possible.
>>>
>>> 3) The compiler can output strings to a magic file during compilation,
>>> like the name of global variables to collect and sort. The C-compiler
>>> has #error and #warning, but there is no #write_to_file, simply
>>> speaking. Another solution is to store the file output into a separate
>>> section in the object files and use objcopy to extract the text later
>>> on, and process it.
>>>
>>
>> These are all fragile. I don't think the benefit makes the fragility
>> worth it.
> I agree.  Linker tricks are cute but often depend on minor features of
> linkers that break often.

Yes, the linker supported sorting tricks are fragile. My proposal is to 
use the already existing named sections for this. This is supported all 
over the place, gcc, clang and so on.
>>> It may also be another way to fetch PCI/USB device ID information,
>>> instead of using linker hints. Currently when probing USB devices, devd
>>> has a linear list of hints it iterates. That means for every device
>>> being plugged, you need to search all products supported for a match.
>>> This is slow. Instead a tool could at least sort the entries, so a
>>> binary search type function could be used, resulting in O(log2(N))
>>> searching time, instead of O(N).
>>>
>>
>> Except that data is pathologically heterogeneous. There's nothing to sort
>> in any meaningful way. And it's all about the runtime environment, which
>> is impossible to know at build time (today we do it at run time). The linker
>> hints file to know which of many things to load is almost optimal... Each
>> file is different, especially for PCI, which is why we pre-process it once
>> and put it into the linker hints.... So it's pretty good once the system is
>> running, but at startup we parse it multiple times and likely would do more
>> as we move more towards MINIMAL. It's been suggested that we move
> BTW, if we are moving to MINIMAL (which I quite like), then with the
> proposed approach we need to merge at least N + 1 lists, where N is the number
> of modules.  Can the merge be done in place without incurring n**2
> behaviour?

Currently when klds are added during load, a new list is created, 
replacing the current one. Technically not sorted in-place. This leads 
to N**2 processing, but N is the number of modules loaded, and not the 
number of sysinits. You can also merge sysinits from two and two 
modules, leading to log2(N) processing time. Something like bulk-loading 
modules.

> 
>> this into devd, so the data persists in its most useful form for a long
>> time,
>> and that's not a bad suggestion for dealing with the growing number of
>> execs to make this all work... It's very Unix Tooly, but also there's likely
>> a good case to be made to optimize here... There's some ordering issues
>> that would need to be worked out too...
> 
> Overall, my feel is the same: if better sort can be used to speed this up,
> it is better to not create a monster build system.  In kernel, we are not
> limited by ABI or POSIX constraints.
> 

Prototypes need to start somewhere. This is what we want to do and these 
are the requirements. Having a separate debug channel directly from the 
compiler, maybe a debug_printf() attribute, seeing compile time 
constants the same way the compiler does, is more clean.

What comes first? The debug_printf() attribute or the section data hack? 
I guess it is easier to start with an egg, than a chicken.

Linux guys use a perl-script invoking "nm" to read out all initializers 
and generates a linker script file:

https://elixir.bootlin.com/linux/latest/source/scripts/generate_initcall_order.pl

My solution does not require anything more than standard in-base UNIX 
utilities. And given our linker is based on LLVM and not GNU, it's 
probably better to generate C-code than linker script code?

--HPS



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?394775eb-1000-60d9-2ddd-5c2aca6c99ea>