Date: Sun, 29 Jun 2014 19:51:11 +0300 From: Konstantin Belousov <kostikbel@gmail.com> To: Alan Cox <alc@rice.edu> Cc: freebsd-hackers@freebsd.org, Oliver Pinter <oliver.pntr@gmail.com>, Shawn Webb <lattera@gmail.com> Subject: Re: Help With ASLR Message-ID: <20140629165111.GI93733@kib.kiev.ua> In-Reply-To: <53AD976A.2000503@rice.edu> References: <20140626232727.GB1825@pwnie.vrt.sourcefire.com> <53ACE5B4.8070700@rice.edu> <20140627132810.GA8396@pwnie.vrt.sourcefire.com> <53AD976A.2000503@rice.edu>
next in thread | previous in thread | raw e-mail | index | archive | help
--abrXHEFNt7TvgPF5
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
Amusingly, Google classified the thread as spam.
On Fri, Jun 27, 2014 at 11:10:18AM -0500, Alan Cox wrote:
> On 06/27/2014 08:28, Shawn Webb wrote:
> > On Jun 26, 2014 10:32 PM -0500, Alan Cox wrote:
> >> On 06/26/2014 18:27, Shawn Webb wrote:
> >>> Hey All,
> >>>
> >>> I've exchanged a few helpful emails with kib@. He has glanced over the
> >>> ASLR patch we have against 11-CURRENT (which is also attached to this
> >>> email as a reference point). He has suggested some changes to the VM
> >>> subsystem that I'm having trouble getting my head wrapped around.
> >>>
> >>> Essentially, he suggests that instead of doing the randomization in
> >>> various places like the ELF loader and sys_mmap, we should modify
> >>> vm_map_find() in sys/vm/vm_map.c to apply the randomization there.
> >>>
> >>> Here's his suggestion in full (I hope he doesn't mind me directly
> >>> quoting him):
> >>>
> >>> The mapping address randomization can only be performed in the
> >>> vm_map_find(). This is the place where usermode mapping flags are
> >>> already parsed, and the real decision for the placement is done, with
> >>> all the contextual information collected, the fixed requests are not
> >>> passed there. Doing the 'randomization' in vm_mmap() means that the
> >>> mmap command must be parsed twice, which presented patch fails to do
> >>> in any sane way. Only mappings in the usermode maps should be
> >>> randomized.
> >>>
> >>> Current patch does not randomize the mapping location at all, it only
> >>> randomizes the hint address passed from the userspace, which is
> >>> completely useless.
> >>
> >> Kostik, I'm not sure that I understand what you're saying here. The
> >> randomly generated offset is added to the variable "addr" after this
> >> block of code has been executed:
> >>
> >> } else {
> >> /*
> >> * XXX for non-fixed mappings where no hint is
> provided or
> >> * the hint would fall in the potential heap space,
> >> * place it after the end of the largest possible heap.
> >> *
> >> * There should really be a pmap call to determine a
> >> reasonable
> >> * location.
> >> */
> >> PROC_LOCK(td->td_proc);
> >> if (addr =3D=3D 0 ||
> >> (addr >=3D round_page((vm_offset_t)vms->vm_taddr) =
&&
> >> addr < round_page((vm_offset_t)vms->vm_daddr +
> >> lim_max(td->td_proc, RLIMIT_DATA))))
> >> addr =3D round_page((vm_offset_t)vms->vm_daddr=
+
> >> lim_max(td->td_proc, RLIMIT_DATA));
> >> PROC_UNLOCK(td->td_proc);
> >> }
> >>
> >> So, the point at which the eventual call to vm_map_findspace() starts
> >> looking for free space is determined by the randomization, and thus the
> >> chosen address will be influenced by the randomization. The first
> >> mmap() that I do in one execution of the program will be unlikely to
> >> have the same address in the next execution.
I noted this in my text. The model implemented just shifts the address
space. A payload which can infers its current address would be able to
infer the shifted address of the interesting location.
Even if implementing the 'shift' model, vm_map_find() is the proper place,
and not the syscall.
> >>
> >> That said, I think that this block of code would be a better place for
> >> the pax_aslr_mmap() call than were it currently resides. Moving it he=
re
> >> would address the problem with MAP_32BIT mentioned below.
> >>
> >>
> >>> ... It also fails to correct the address to honour
> >>> the placement flags like MAP_32BIT or MAP_ALIGNMENT_MASK.
> >>
> >>
> >> I think that MAP_32BIT is broken, but not MAP_ALIGNMENT_MASK. Whatever
> >> address the patch causes to be passed into vm_map_find(), we will round
> >> it up as necessary to achieve the requested alignment.
> >>
> >> In some sense, the real effect of the map alignment directives,
> >> including automatic alignment for superpages, is going to be to chop o=
ff
> >> address bits that Shawn has worked to randomize. More on this below.
> >> See [*].
> >>
> >>> ... What must
> >>> be done is vm_map_find() requesting vm_map_findspace() for the address
> >>> hole of the size of the requested mapping + (number of address bits to
> >>> randomize << PAGE_SHIFT). Then, the rng value should be obtained and
> >>> final location for the mapping calculated as return value + (rng <<
> >>> PAGE_SHIFT).
> >>
> >>
> >> This seems to be trying to implement a different and more complex sche=
me
> >> than what Shawn is trying to implement. Specifically, suppose that we
> >> have a program that creates 5 mappings with mmap(2). Call them M1, M2,
> >> M3, M4, and M5, respectively. Shawn is perfectly happy for M1 to alwa=
ys
> >> come before M2 in the address space, M2 to always come before M3, and =
so
> >> on. He's not trying to permute their order in the address space from
> >> one execution of the program to the next. He's only trying to change
> >> their addresses from one execution to the next. The impression that I
> >> got during the BSDCan presentation was that this is what other operati=
ng
> >> systems were doing, and it was considered strike a reasonable balance
> >> between run-time overhead and the security benefits.
> >>
> >>
> >>>
> >>> If no address space hole exists which can satisfy the enlarged
> >>> request, either a direct fallback to the initial length should be
> >>> performed (I prefer this), or exponential decrease of the length up to
> >>> the initial size done, and findspace procedure repeated.
> >>>
> >>> Also, the vm_map_find() knows about the superpages hint for the
> >>> mapping being performed, which allows to not break superpages. When
> >>> superpage-aligned mapping is requested, SUPERPAGE_SHIFT (available
> >>> from pagesizes[1]) should be used instead of PAGE_SHIFT in the formula
> >>> above, and probably, a different amount of address bits in the page
> >>> table page level 2 to randomize, selected.
> >>
> >>
> >> [*] I agree with this last observation. I mentioned this possibility
> >> to Shawn after his talk. On machines where pagesizes[1] is defined,
> >> i.e., non-zero, it makes little sense to randomize the low-order addre=
ss
> >> bits that will be "rounded away" by aligning to pagesizes[1] boundary.
> >>
> >>
> >>>
> >>> =3D=3D=3D end of kib@ suggestions =3D=3D=3D
> >>>
> >>> I have a few concerns, though:
> >>>
> >>> 1) vm_map_find is also used when loading certain bits of data in the
> >>> kernel (like KLDs, for example);
I mentioned this in the text you quoted, saying that=20
Only mappings in the usermode maps should be randomized.
See vm_map_t system_map field.
> >>> 2) It's not possible to tell in vm_map_find if we're creating a mappi=
ng
> >>> for the stack, the execbase, or any other suitable mmap call. We apply
> >>> ASLR differently based on those three aspects;
It is possible, I also mentioned it, referencing r267254.
The cow argument contains flags which allow to identify stack mappings.
The image activators specify the fixed base for mapping.
> >>> 3) vm_map_find is also used in initializing the VM system upon boot.
> >>> What impacts would this cause?
No impact, due to #1.
> >>>
> >>> I would really appreciate any feedback. Thank you in advance for any
> >>> help or guidance.
> >>
> >> Shawn, while I'm here, I'll mention that what you were doing with stac=
ks
> >> in pax_aslr_mmap() seemed odd. Can you explain?
> >>=20
> >>
> >
> > Thank you for your input, Alan. From what it sounds like, you're
> > suggesting we don't implement the ASLR functionality in vm_map_find()
> > and we stick with what we have. Please correct me if I'm wrong.
>=20
>=20
> Yes. I interpreted Kostik's email as implicitly proposing a different
> specification for how mmap()-created mappings should be handled than you
> started from. In other words, if you're not trying to permute the order
> in which mappings appear in the address space from execution to
> execution, but only shift the entire group of mappings around a bit,
> then you don't need to modify vm_map_find(). You can stick with
> changing sys_mmap().
>=20
> That said, I'm inferring how you intended mmap()-created mappings to be
> handled from looking at the code and statements that you made during
> your presentation. Are my two emails correctly describing what you
> intended to do with mmap()-created mappings?
>=20
> I do, however, think that you're calling pax_aslr_mmap() at an
> inopportune point in sys_mmap(). Hopefully, that was clear in my
> previous email.
>=20
>=20
> >
> > Did our changes break MAP_32BIT or was MAP_32BIT broken all along? If
> > our ASLR patch breaks it, what would be the best way to fix it?
>=20
>=20
> I suspect but I'm not certain that your patch breaks calls to
> mmap(MAP_32BIT). But, if you move the call to pax_aslr_mmap() to the
> place that I mentioned before, then pax_alsr_mmap() simply won't be
> called when MAP_32BIT or MAP_FIXED are passed to mmap().
>=20
>=20
> >
> > I'm' a little unsure of your question regarding stacks in
> > pax_aslr_mmap(). I don't believe we're doing anything with stacks in
> > that function. We do have the function pax_aslr_stack(), but we're
> > working on providing a more robust stack randomization (which isn't in
> > the patchset I provided).
>=20
> mmap(MAP_STACK) is used to create thread stacks. The following code
> didn't make sense to me:
>=20
> + if (!(td->td_proc->p_vmspace->vm_map.flags &
> MAP_ENTRY_GROWS_DOWN))
> + *addr +=3D td->td_proc->p_vmspace->vm_aslr_delta_=
mmap;
> + else
> + *addr -=3D td->td_proc->p_vmspace->vm_aslr_delta_=
mmap;
>=20
> MAP_ENTRY_GROWS_DOWN is not a valid vm map flag. Moreover, the
> subtraction case doesn't make sense to me.
>=20
>=20
>=20
--abrXHEFNt7TvgPF5
Content-Type: application/pgp-signature
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAEBAgAGBQJTsEP/AAoJEJDCuSvBvK1Bow8P/0PTiM5jsLImsh3qu54gIKvc
0gyS/4TE56fSZK0cQ9Kush0eFRve4z/QRfe++6IM39Xgb3uzhWSEhzpHsMqmtCGf
8QzL3I9wrF0pnMj263z9OEOSpRl3YNAt05TTByyms/7TVL+yRsXgmSBTWIG0+XTp
LctPk02oVyW0E6qqfYdzDes0dvf+5DGJXlzqnxD6lZ+D1M80o56aghSRPNe8iCSA
3cHM/2q9+8ZyQW1XQWmNU/CQZJKevjPwjDiuKgpn0oukkk0pFXQxnk6g8c/ONwWV
JLn+O4maNoq1ByS0COr1gYYPC/HP5YeScTYXzpES4BefKvAY23VCSfDng4ZaeTZg
mJcm+rbwrlWkEWhs8Le+ZS9kqARMVA281I6M1NoDY9p2SpyysGrlJHc2nXR1eMFd
FH7DjgDkO8EKqwsKElZRhctcIytnQc0pHR/HbxU8cU/XSokE05f1DXsbQ87F+x/J
aFtAF8sLu+8I6t3Xgn4BOkpmk8EgR+IX34kPitgsLlV1BWCbAYd4bZPlW3Vzjcr+
BifnX3YZL1arGBq4hw/zvWqik10wPHZrL2M6+BpMWUbd5VTpyhgO8Tr63nns3QAz
hMU96OF8C9AMWJfVx7VcC+EzMb2K/RqmWck9wffQxHWhXV6rJuWCU9rLZLkktDm7
9D+QpgnewnIdvP8OFstq
=Bpsi
-----END PGP SIGNATURE-----
--abrXHEFNt7TvgPF5--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20140629165111.GI93733>
