Date: Thu, 25 Nov 2021 20:52:04 -0500
From: Shawn Webb <shawn.webb@hardenedbsd.org>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: David Chisnall <theraven@freebsd.org>, freebsd-current@freebsd.org
Subject: Re: VDSO on amd64
Message-ID: <20211126015204.5pbzmctz2fqha6o4@mutt-hbsd>
In-Reply-To: <YZ/pr4RmMUuD7Rm/@kib.kiev.ua>
References: <YZ72kgvfGR5D%2Bzs2@kib.kiev.ua> <7e7f4ba7-16b3-fa6e-fa1d-e9df957e91f1@FreeBSD.org> <YZ/pr4RmMUuD7Rm/@kib.kiev.ua>

On Thu, Nov 25, 2021 at 09:53:19PM +0200, Konstantin Belousov wrote:
> On Thu, Nov 25, 2021 at 09:35:53AM +0000, David Chisnall wrote:
> > Great news!
> >
> > Note that your example of throwing an exception from a signal handler works
> > because the signal is delivered during a system call. The compiler
> > generates correct unwind tables for calls because any call may throw.
> The syscalls itself are not annotated, I consider fixing this after vdso
> lands.
>
> >
> > If you did something like a division by zero to get a SIGFPE or a
> > null-pointer dereference to get a SIGSEGV then the throw would probably not
> > work (or, rather, would be delivered to the right place but might corrupt
> > some register state). Neither clang nor GCC currently supports non-call
> > exceptions by default.
> Well, yes, the part of it was that the signal was synchronous. I was always
> curious, how good are unwind tables generated by -fasynchronous-unwind-tables
> with this regard.
>
> But still, the fact that unwinder stepped over the signal frame amused me.
>
> >
> > This mechanism is more useful for Java VMs and similar. Some Linux-based
> > implementations (including Android) use this to avoid null-pointer checks in
> > Java.
> >
> > The VDSO mechanism in Linux is also used for providing some syscall
> > implementations. In particular, getting the current approximate time and
> > getting the current CPU (either by reading from the VDSO's data section or
> > by doing a real syscall, without userspace knowing which). It also provides
> > the syscall stub that is used for the kernel transition for all 'real'
> > syscalls. This doesn't matter so much on amd64, but on i386 it lets them
> > select between int 80h, syscall or sysenter, depending on what the hardware
> > supports.
> >
> >
> > A few questions about future plans:
> >
> > - Do you have plans to extend the VDSO to provide system call entry points
> > and fast-path syscalls? It would be really nice if we could move all of the
> > libsyscalls bits into the VDSO so that any compartmentalisation mechanism
> > that wanted to interpose on syscalls just needed to provide a replacement
> > for the VDSO.
> No.
>
> Moving syscall entry point to VDSO is pointless:
> - it would add one more level of indirection before SYSCALL,
> - we do not have slow syscall entry point on amd64 so there is nothing to
>   choose.
>
> And optimizing 32bit binaries (where we could implement slightly faster
> syscall entry) is past its importance.
>
> Basically, we do not have to split libc into libc proper and VDSO, as
> Linux has. We can implement features for syscall boundary from both
> sides of kernel, because libc and kernel are developed under the same
> project. Usermode timehands, fast signal blocks, upcoming rseq support,
> just to name a few of them, all benefit from this model.
>
> VDSO is only needed for us to provide the unwind annotations on the signal
> trampoline, in a way expected by unwinders.
>
> >
> > - It looks as if the Linux VDSO mechanism isn't yet using this. Do you
> > plan on moving it over?
> No.
>
> >
> > - I can't quite tell from kern_sharedpage.c (this file has almost no
> > comments) - is the userspace mapping of the VDSO randomised? This has been
> > done on Linux for a while because the VDSO is an incredibly high-value
> > target for code reuse attacks (it can do system calls and it can restore the
> > entire register state from the contents of an on-stack buffer if you can
> > jump into it).
> Not now. Randomizing shared page location is not too hard, but there are
> some ABI issues to sort out. We live with fixed-mapped shared page for
> more than 10 years.

As a point of reference, HardenedBSD's PaX-inspired ASLR implementation
has randomized the shared page for more than half a decade now without
issue. I suspect FreeBSD will find that randomization of the shared page
(now the VDSO), if applied properly, likely won't break anything.

Thanks,

-- 
Shawn Webb
Cofounder / Security Engineer
HardenedBSD

https://git.hardenedbsd.org/hardenedbsd/pubkeys/-/raw/master/Shawn_Webb/03A4CBEBB82EA5A67D9F3853FF2E67A277F8E1FA.pub.asc