Date: Tue, 29 Sep 2015 09:19:42 -0600
From: Ian Lepore <ian@freebsd.org>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: freebsd-arm@freebsd.org
Subject: Re: Shared page and related goodies for ARMv7
Message-ID: <1443539982.1224.433.camel@freebsd.org>
In-Reply-To: <20150929132332.GH11284@kib.kiev.ua>
References: <20150929132332.GH11284@kib.kiev.ua>

On Tue, 2015-09-29 at 16:23 +0300, Konstantin Belousov wrote:
> As an exercise to get myself more familiar with the ARM architecture,
> I added the shared page for FreeBSD/ARMv7. This provides the standard
> features tied to the shared page, in particular, a non-executable stack
> for the compatible binaries, and fast gettimeofday() and clock_gettime()
> functions. For reference, the measurements on my RPI2 done by
> tools/tools/syscall_timing, show
> for userspace gettimeofday:
> % ./syscall_timing gettimeofday
> Clock resolution: 0.000000053
> test          loop  time         iterations  periteration
> gettimeofday  0     1.009965385  2743838     0.000000368
> gettimeofday  1     1.009899240  2743629     0.000000368
> gettimeofday  2     1.009952833  2538253     0.000000397
> gettimeofday  3     1.009918198  2404272     0.000000420
> gettimeofday  4     1.009875126  2404567     0.000000419
> gettimeofday  5     1.009950700  2405196     0.000000419
> gettimeofday  6     1.009859555  2623534     0.000000384
> gettimeofday  7     1.009911534  2743249     0.000000368
> gettimeofday  8     1.009928618  2743240     0.000000368
> gettimeofday  9     1.009920910  2743227     0.000000368
> for syscall:
> gettimeofday  0     1.009994949  659319      0.000001531
> gettimeofday  1     1.009869846  583343      0.000001731
> gettimeofday  2     1.009899950  583384      0.000001731
> gettimeofday  3     1.009873232  636420      0.000001586
> gettimeofday  4     1.009909639  669715      0.000001507
> gettimeofday  5     1.009941201  669640      0.000001508
> gettimeofday  6     1.009930733  669051      0.000001509
> gettimeofday  7     1.009890005  669064      0.000001509
> gettimeofday  8     1.009915474  669168      0.000001509
> gettimeofday  9     1.009918860  668739      0.000001510
>
> The patch is pretty much straightforward, interesting details are
> listed below.
>
> - The shared page is only enabled for ARMv7 kernels. From my reading of
>   VMSA chapters for ARMv6 and ARMv7, only v7 ensures that there is no
>   cache aliasing for multiple-times mapped page, while v6 requires coloring.
>   Shared page is mapped both at the top of UVA and somewhere in KVA.
> - There is a bug in the generic timer setup, it seems. The CNTKCTL CP15
>   register is core-private, which means that the in-tree code only sets access
>   permissions on the BSP. APs CNTKCTL are left in undefined state, possibly
>   set up to some value by loader. This might allow userspace to reprogram
>   timers on APs. I fixed this by using rendezvous after SMP is started.
> - I have to add explicit directives to create .note.GNU-stack sections in
>   some __eabi files from libcompiler-rt which are linked into libc.
>   Upstream refused to do global change adding the stack note for all asms,
>   recommending to live with --noexecstack assembler option. But we cannot
>   do this for files linked into libc.
> - arm64 would require some additions, I did not tested the build.
>
> It would be useful to test the patch on ARMv6 to ensure that signals and
> gettimeofday() work.

Some things, in no particular order...

I can't do anything with an inline email patch (my mail client destroys
whitespace). Can you send it as an attachment, or put it somewhere on
freefall or something please?

There is no difference between armv6 and armv7 in our world. The only
armv6 chip we support is the one used in the original rpi and it has a
16K 4-way L1 cache which means the page coloring issue disappears and we
can treat it the same as an armv7 chip (different cache ops, but the
caches behave the same).

I just skimmed through the patch quickly and the main thing that jumps
out at me is that what you've done works only on rpi2 and aarch64,
because those are the only platforms that support that timer hardware.
(That means I can't test it, but once I get your patch in a usable form
I can have a shot at implementations for other timers). It's not clear
to me that this scheme can even work on most armv7 hardware because of
the timer hardware involved.
I think it would mean giving userland read access to a whole page worth
of IO space and in some cases there are registers in that range where
reads have side effects whose consequences could be dire (such as
pending-interrupt registers).

-- Ian
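
The remark above that a 16K 4-way L1 cache makes the page coloring issue
disappear can be checked with a little arithmetic. What follows is only a
sketch of that reasoning, not code from the patch or from the tree. It
assumes a virtually indexed cache and 4K pages (the message states neither),
and the names way_size() and page_colors() are made up for illustration.

```c
#include <stddef.h>

/*
 * Bytes covered by one cache way: total cache size divided by the
 * associativity.  For a virtually indexed cache, this is the range of
 * virtual address bits that selects the cache line.
 */
static size_t
way_size(size_t cache_bytes, unsigned ways)
{
	return (cache_bytes / ways);
}

/*
 * Number of page colors (hypothetical helper).  If one way covers no
 * more than one page, every index bit lies inside the page offset, so
 * all virtual mappings of a physical page index the same cache lines
 * and there is a single color, i.e. no aliasing.  Otherwise the number
 * of colors is the way size divided by the page size.
 */
static unsigned
page_colors(size_t cache_bytes, unsigned ways, size_t page_bytes)
{
	size_t ws = way_size(cache_bytes, ways);

	if (ws <= page_bytes)
		return (1);
	return ((unsigned)(ws / page_bytes));
}
```

With the figures from the message, page_colors(16 * 1024, 4, 4096) gives 1:
each way is 4K, the same as the assumed page size, so a page mapped at two
virtual addresses lands on the same cache lines. For contrast, a
hypothetical 32K 4-way cache with 4K pages would give 2 colors and would
need coloring.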