Date:      Thu, 25 Nov 2021 20:52:04 -0500
From:      Shawn Webb <shawn.webb@hardenedbsd.org>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        David Chisnall <theraven@freebsd.org>, freebsd-current@freebsd.org
Subject:   Re: VDSO on amd64
Message-ID:  <20211126015204.5pbzmctz2fqha6o4@mutt-hbsd>
In-Reply-To: <YZ/pr4RmMUuD7Rm/@kib.kiev.ua>
References:  <YZ72kgvfGR5D+zs2@kib.kiev.ua> <7e7f4ba7-16b3-fa6e-fa1d-e9df957e91f1@FreeBSD.org> <YZ/pr4RmMUuD7Rm/@kib.kiev.ua>

On Thu, Nov 25, 2021 at 09:53:19PM +0200, Konstantin Belousov wrote:
> On Thu, Nov 25, 2021 at 09:35:53AM +0000, David Chisnall wrote:
> > Great news!
> >
> > Note that your example of throwing an exception from a signal handler works
> > because the signal is delivered during a system call.  The compiler
> > generates correct unwind tables for calls because any call may throw.
> The syscalls itself are not annotated, I consider fixing this after vdso
> lands.
>
> >
> > If you did something like a division by zero to get a SIGFPE or a
> > null-pointer dereference to get a SIGSEGV then the throw would probably not
> > work (or, rather, would be delivered to the right place but might corrupt
> > some register state).  Neither clang nor GCC currently supports non-call
> > exceptions by default.
> Well, yes, the part of it was that the signal was synchronous.  I was always
> curious, how good are unwind tables generated by -fasynchronous-unwind-tables
> with this regard.
>
> But still, the fact that unwinder stepped over the signal frame amused me.
>
> >
> > This mechanism is more useful for Java VMs and similar.  Some Linux-based
> > implementations (including Android) use this to avoid null-pointer checks in
> > Java.
> >
> > The VDSO mechanism in Linux is also used for providing some syscall
> > implementations.  In particular, getting the current approximate time and
> > getting the current CPU (either by reading from the VDSO's data section or
> > by doing a real syscall, without userspace knowing which). It also provides
> > the syscall stub that is used for the kernel transition for all 'real'
> > syscalls.  This doesn't matter so much on amd64, but on i386 it lets them
> > select between int 80h, syscall or sysenter, depending on what the hardware
> > supports.
> >
> >
> > A few questions about future plans:
> >
> >  - Do you have plans to extend the VDSO to provide system call entry points
> > and fast-path syscalls?  It would be really nice if we could move all of the
> > libsyscalls bits into the VDSO so that any compartmentalisation mechanism
> > that wanted to interpose on syscalls just needed to provide a replacement
> > for the VDSO.
> No.
>
> Moving syscall entry point to VDSO is pointless:
> - it would add one more level of indirection before SYSCALL,
> - we do not have slow syscall entry point on amd64 so there is nothing to
>   choose.
>=20
> And optimizing 32bit binaries (where we could implement slightly faster
> syscall entry) is past its importance.
>
> Basically, we do not have to split libc into libc proper and VDSO, as
> Linux has. We can implement features for syscall boundary from both
> sides of kernel, because libc and kernel are developed under the same
> project.  Usermode timehands, fast signal blocks, upcoming rseq support,
> just to name a few of them, all benefit from this model.
>
> VDSO is only needed for us to provide the unwind annotations on the signal
> trampoline, in a way expected by unwinders.
>
> >
> >  - It looks as if the Linux VDSO mechanism isn't yet using this.  Do you
> > plan on moving it over?
> No.
>
> >
> >  - I can't quite tell from kern_sharedpage.c (this file has almost no
> > comments) - is the userspace mapping of the VDSO randomised?  This has been
> > done on Linux for a while because the VDSO is an incredibly high-value
> > target for code reuse attacks (it can do system calls and it can restore the
> > entire register state from the contents of an on-stack buffer if you can
> > jump into it).
> Not now.  Randomizing shared page location is not too hard, but there are
> some ABI issues to sort out.  We live with fixed-mapped shared page for
> more than 10 years.

As a point of reference, HardenedBSD's PaX-inspired ASLR
implementation has randomized the shared page for more than half a
decade now without issue. I suspect FreeBSD will find that randomizing
the shared page (now the VDSO), if applied properly, won't break
anything.
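
For anyone who wants to see whether such a mapping moves between
executions, here is a minimal sketch. It assumes a Linux-style
/proc/self/maps in which the mapping is labelled "[vdso]"; the helper
names find_mapping and sample_bases are made up for illustration, and
on FreeBSD the shared page would have to be located by other means
(for example from the output of procstat -v).

```python
import os
import subprocess
import sys


def find_mapping(maps_text, name="[vdso]"):
    """Return (start, end) of the first mapping labelled `name`, or None.

    Expects Linux /proc/<pid>/maps lines of the form:
        start-end perms offset dev inode [path]
    """
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[5] == name:
            start, end = (int(part, 16) for part in fields[0].split("-"))
            return start, end
    return None


# Code run in each child process: print the base address of the mapping
# whose label is passed as the first argument.
_CHILD = (
    "import sys\n"
    "for line in open('/proc/self/maps'):\n"
    "    f = line.split()\n"
    "    if len(f) >= 6 and f[5] == sys.argv[1]:\n"
    "        print(f[0].split('-')[0])\n"
    "        break\n"
)


def sample_bases(runs=5, name="[vdso]"):
    """Start `runs` fresh processes and collect the base address each reports."""
    if not os.path.exists("/proc/self/maps"):
        return []  # no Linux-style procfs here; nothing to sample
    bases = []
    for _ in range(runs):
        result = subprocess.run(
            [sys.executable, "-c", _CHILD, name],
            capture_output=True, text=True, check=False,
        )
        out = result.stdout.strip()
        if out:
            bases.append(int(out, 16))
    return bases


if __name__ == "__main__":
    bases = sample_bases()
    print([hex(b) for b in bases])
    print("distinct bases:", len(set(bases)))
```

If every run reports the same base, the mapping is most likely at a
fixed address; several distinct bases suggest it is being randomized.
A handful of runs only hints at this and says nothing about how much
entropy is involved.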

Thanks,

-- 
Shawn Webb
Cofounder / Security Engineer
HardenedBSD

https://git.hardenedbsd.org/hardenedbsd/pubkeys/-/raw/master/Shawn_Webb/03A4CBEBB82EA5A67D9F3853FF2E67A277F8E1FA.pub.asc
