Subject: Re: A more general possible meltdown/spectre countermeasure
From: Eric McCorkle <eric@metricspace.net>
To: Warner Losh
Cc: freebsd-hackers@freebsd.org, freebsd-arch@freebsd.org
Date: Fri, 5 Jan 2018 18:10:53 -0500

I'm not sure what you mean by direct map.  Do you mean the TLB?

On 01/05/2018 18:08, Warner Losh wrote:
> Wouldn't you have to also unmap it from the direct map for this to be
> effective?
>
> Warner
>
> On Fri, Jan 5, 2018 at 3:31 PM, Eric McCorkle wrote:
>
> > Well, the only way to find out would be to try it out.
> >
> > However, unless I'm missing something, if you're trying to pull off a
> > meltdown attack, you try to fetch from the kernel.  If that location
> > isn't cached (or if your cache is physically indexed), you need the
> > physical address (otherwise you don't know where to look), and thus
> > have to go through address translation, at which point you detect
> > that the page isn't accessible and fault.  In the meantime, you can't
> > speculatively execute any of the operations that load up the
> > side-channels, because you don't have the sensitive data.
> >
> > The reason you can pull off a meltdown attack at all is that a
> > virtually-indexed cache lets you get the data in parallel with
> > address translation (breaking the dependency between address
> > translation and fetching data), which can take thousands of cycles
> > on a TLB miss; during that time you have the data and can launch a
> > whole bunch of transient ops.
> >
> > Again, these are uncharted waters we're in, so it's entirely possible
> > I'm missing something here.
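As an aside (this sketch is not from any of the quoted messages), the
argument above is about a two-load dependency chain.  Roughly, with
illustrative names:

    /*
     * Illustrative sketch of the transient sequence under discussion.
     * "kernel_addr" and "probe" are made-up names.  Architecturally the
     * first load faults; the question is what executes transiently
     * before the fault is acted on.
     */
    #include <stddef.h>

    static unsigned char probe[256 * 4096]; /* one page per byte value */

    static void
    transient_gadget(const volatile unsigned char *kernel_addr)
    {
        unsigned char secret = *kernel_addr;      /* (1) fetch the secret */
        (void)*(volatile unsigned char *)
            &probe[(size_t)secret * 4096];        /* (2) depends on (1)   */
    }

The claim being made above is that if the target of load (1) is
uncacheable, the core cannot obtain the data without completing address
translation, which is also where the fault is detected, so load (2)
should never issue transiently.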
> > On 01/05/2018 17:22, Warner Losh wrote:
> > > While you might be right, I've seen no indication that a cache miss
> > > would defeat these attacks in the public and non-public data I've
> > > looked at, even though a large number of alternatives to the
> > > published workarounds have been discussed.  I'm therefore somewhat
> > > skeptical this would be effective.  I'm open, however, to data that
> > > changes that skepticism...
> > >
> > > Warner
> > >
> > > On Fri, Jan 5, 2018 at 3:15 PM, Eric McCorkle wrote:
> > >
> > > > Right, but you have to get the value "foo" into the pipeline in
> > > > order for it to affect the side-channels.  This technique attempts
> > > > to stop that from happening.
> > > >
> > > > Unless I made a mistake, non-cached memory reads force address
> > > > translation to happen first, which detects faults and blocks the
> > > > meltdown attack.
> > > >
> > > > It also stops spectre with very high probability, as it's very
> > > > unlikely that an uncached load will arrive before the speculative
> > > > thread gets squashed.
> > > >
> > > > On 01/05/2018 17:10, Warner Losh wrote:
> > > > > I think this is fatally flawed.
> > > > >
> > > > > The side channel is the cache, not the data at risk.
> > > > >
> > > > > Any mapped memory, cached or not, can be used to influence the
> > > > > cache.  Storing stuff in uncached memory won't affect the side
> > > > > channel one bit.
> > > > >
> > > > > Basically, all attacks boil down to tricking the processor, at
> > > > > elevated privs, into doing something like
> > > > >
> > > > >     a = foo[offset];
> > > > >
> > > > > where foo + offset are designed to communicate information by
> > > > > populating a cache line.  offset need not be cached itself and
> > > > > can be the result of simple computations that depend on anything
> > > > > accessible at all in the kernel.
> > > > >
> > > > > Warner
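For concreteness, the receiver side of the channel described above
(this sketch is likewise not from the quoted messages) is a
flush+reload-style timing probe over the lines of "foo".  The stride,
array size, and rdtscp-based timing are illustrative:

    /*
     * Illustrative flush+reload receiver.  The transient "a = foo[offset]"
     * access leaves exactly one line of foo cached; timing reloads of
     * every candidate line recovers which one, and hence the value that
     * selected it.
     */
    #include <stddef.h>
    #include <stdint.h>
    #include <x86intrin.h>      /* _mm_clflush, _mm_lfence, __rdtscp */

    #define STRIDE 4096         /* one page per value, defeats prefetching */
    static unsigned char foo[256 * STRIDE];

    static void
    flush_probe(void)           /* run before the transient access */
    {
        for (size_t i = 0; i < 256; i++)
            _mm_clflush(&foo[i * STRIDE]);
        _mm_lfence();
    }

    static int
    reload_probe(void)          /* run after the transient access */
    {
        uint64_t best = UINT64_MAX;
        int hot = -1;

        for (size_t i = 0; i < 256; i++) {
            unsigned int aux;
            uint64_t t0 = __rdtscp(&aux);
            (void)*(volatile unsigned char *)&foo[i * STRIDE];
            uint64_t t1 = __rdtscp(&aux);
            if (t1 - t0 < best) {
                best = t1 - t0;
                hot = (int)i;   /* fastest reload = cached = leaked value */
            }
        }
        return (hot);
    }

Note that only foo, the receiver array, has to be cacheable for this to
work; as stated above, the secret feeding "offset" need not be.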
> > > > > On Fri, Jan 5, 2018 at 3:02 PM, Eric McCorkle wrote:
> > > > >
> > > > > > Re-posting to -hackers and -arch.  I'm going to start working
> > > > > > on something like this over the weekend.
> > > > > >
> > > > > > -------- Forwarded Message --------
> > > > > > Subject: A more general possible meltdown/spectre countermeasure
> > > > > > Date: Thu, 4 Jan 2018 23:05:40 -0500
> > > > > > From: Eric McCorkle <eric@metricspace.net>
> > > > > > To: freebsd-security@freebsd.org
> > > > > >
> > > > > > I've thought more about how to deal with meltdown/spectre, and
> > > > > > I have an idea I'd like to put forward.  However, I'm still in
> > > > > > something of a panic mode, so I'm not certain as to its
> > > > > > effectiveness.  Needless to say, I welcome any feedback on
> > > > > > this, and I may be completely off-base.
> > > > > >
> > > > > > I'm calling this a "countermeasure" as opposed to a
> > > > > > "mitigation", as it's something that requires modification of
> > > > > > code as opposed to a drop-in patch.
> > > > > >
> > > > > > == Summary ==
> > > > > >
> > > > > > Provide a kernel and userland API by which memory allocation
> > > > > > can be done with extended attributes.  In userland, this could
> > > > > > be accomplished by extending MMAP flags, and I could imagine a
> > > > > > malloc-with-attributes flag.  In kernel space, this must
> > > > > > already exist, as drivers need to allocate memory with various
> > > > > > MTRR-type attributes set.
> > > > > >
> > > > > > The immediate aim here is to store sensitive information that
> > > > > > must remain memory-resident in non-cacheable memory locations
> > > > > > (or, if more effective attribute combinations exist, using
> > > > > > those instead).  See the rationale for the argument why this
> > > > > > should work.
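Since the Summary leans on the fact that the kernel already allocates
memory with MTRR/PAT-type attributes, here is a rough sketch of what
that might look like on FreeBSD/amd64.  The interfaces used below
(contigmalloc(9), pmap_change_attr(9), VM_MEMATTR_UNCACHEABLE) are
real, but stitching them together this way is an untested assumption
about how the proposal could be wired up, not something from the
original message, and exact signatures vary between releases:

    /*
     * Sketch: allocate wired, physically contiguous kernel memory and
     * remap it uncacheable, using the same PAT machinery drivers use
     * for device memory.  Error handling and version differences are
     * mostly elided.
     */
    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/malloc.h>
    #include <vm/vm.h>
    #include <vm/pmap.h>

    static MALLOC_DEFINE(M_NOCACHE, "nocache", "uncacheable secret storage");

    static void *
    secret_alloc(size_t len)
    {
        void *p;

        p = contigmalloc(len, M_NOCACHE, M_WAITOK | M_ZERO,
            0, ~(vm_paddr_t)0, PAGE_SIZE, 0);
        if (p == NULL)
            return (NULL);

        /* Switch this mapping's memory attribute to uncacheable. */
        if (pmap_change_attr((vm_offset_t)p, round_page(len),
            VM_MEMATTR_UNCACHEABLE) != 0) {
            contigfree(p, len, M_NOCACHE);
            return (NULL);
        }
        return (p);
    }

As Warner's question at the top of the thread suggests, if the same
physical pages are reachable through other mappings (the direct map in
particular), those aliases would also need to be changed or removed for
the attribute to mean anything.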
> > > > > > Assuming the rationale holds, then the attack surface should
> > > > > > be greatly reduced.  Attackers would need to grab sensitive
> > > > > > data out of stack frames or similar locations if/when it gets
> > > > > > copied there for faster use.  Moreover, if this is done right,
> > > > > > it could dovetail nicely into a framework for storing and
> > > > > > processing sensitive assets in more secure hardware[0] (like
> > > > > > smart cards, the FPGAs I posted earlier, or other options).
> > > > > >
> > > > > > The obvious downside is that you take a performance hit
> > > > > > storing things in non-cacheable locations, especially if you
> > > > > > plan on doing heavy computation in that memory (say,
> > > > > > encryption/decryption).  However, this is almost certainly
> > > > > > going to be less than the projected 30-50% performance hit
> > > > > > from other mitigations.  Also, this technique should work
> > > > > > against spectre as well as meltdown (assuming the rationale
> > > > > > holds).
> > > > > >
> > > > > > The second downside is that you have to modify code for this
> > > > > > to work, and you have to be careful not to keep copies of
> > > > > > sensitive information around too long (this gets tricky in
> > > > > > userland, where you might get interrupted and switched out).
> > > > > >
> > > > > > [0]: Full disclosure, enabling open hardware implementations
> > > > > > of this kind of thing is something of an agenda of mine.
> > > > > >
> > > > > > == Rationale ==
> > > > > >
> > > > > > (Again, I'm tired, rushed, and somewhat panicked, so my logic
> > > > > > could be faulty at any point; please point it out if it is.)
> > > > > >
> > > > > > The rationale for why this should work relies on assumptions
> > > > > > about out-of-order pipelines that cannot be guaranteed to
> > > > > > hold, but are extremely likely to be true.
> > > > > >
> > > > > > As background, these attacks depend on out-of-order execution
> > > > > > performing operations that end up affecting cache and
> > > > > > branch-prediction state, ultimately storing information about
> > > > > > sensitive data in these side-channels before the fault
> > > > > > conditions are detected and acted upon.  I'll borrow
> > > > > > terminology from the paper, using "transient instructions" to
> > > > > > refer to speculatively executed instructions that will
> > > > > > eventually be cancelled by a fault.
> > > > > >
> > > > > > These attacks depend entirely on transient instructions being
> > > > > > able to get sensitive information into the processor core and
> > > > > > then perform some kind of instruction on it before the fault
> > > > > > condition cancels them.  Therefore, anything that prevents
> > > > > > them from doing this *should* counter the attack.  If the
> > > > > > actual sensitive data never makes it to the core before the
> > > > > > fault is detected, the dependent memory accesses/branches
> > > > > > never get executed and the data never makes it to the
> > > > > > side-channels.
> > > > > >
> > > > > > Another assumption here is that CPU architects are going to
> > > > > > want to squash faulted instructions ASAP and stop issuing
> > > > > > along those speculative branches, so as to reclaim execution
> > > > > > units.  So I'm assuming that once a fault comes back from
> > > > > > address translation, transient execution stops dead.
> > > > > >
> > > > > > Now, break down the cases for whether the address containing
> > > > > > sensitive data is in cache and TLB or not.  (I'm assuming here
> > > > > > that caches are virtually-indexed, which enables cache lookups
> > > > > > to bypass address translation.)
> > > > > >
> > > > > > * In cache, in TLB: You end up basically racing between the
> > > > > > cache and TLB, which will very likely end up detecting the
> > > > > > fault before the data arrives, but at the very worst, you get
> > > > > > one or two cycles of transient instruction execution before
> > > > > > the fault.
> > > > > >
> > > > > > * In cache, not in TLB: A virtually-indexed cache means you
> > > > > > get a cache lookup racing a page-table walk.  The cache lookup
> > > > > > beats the page-table walk by potentially hundreds (maybe
> > > > > > thousands) of cycles, giving you a bunch of transient
> > > > > > instructions before a fault gets triggered.  This is the main
> > > > > > attack case.
> > > > > >
> > > > > > * Not in cache, in TLB: Memory access requires address
> > > > > > translation, which comes back almost immediately as a fault.
> > > > > >
> > > > > > * Not in cache, not in TLB: You have to do a page-table walk
> > > > > > before you can fetch the location, as you have to go out to
> > > > > > physical memory (and therefore need a physical address).  The
> > > > > > page-table walk will come back with a fault, stopping the
> > > > > > attack.
> > > > > >
> > > > > > So, unless I'm missing something here, both non-cached cases
> > > > > > defeat the meltdown attack, as you *cannot* get the data
> > > > > > unless you do address translation first (and therefore detect
> > > > > > faults).
> > > > > >
> > > > > > As for why this defeats the spectre attack, the logic is
> > > > > > similar: you've jumped into someone else's executable code,
> > > > > > hoping to scoop up enough information into your branch
> > > > > > predictor before the fault kicks you out.  However, to capture
> > > > > > anything about sensitive information in your side-channels,
> > > > > > the transient instructions need to actually get it into the
> > > > > > core before a fault gets detected.  The same case analysis as
> > > > > > above applies, so you never actually get the sensitive info
> > > > > > into the core before a fault comes back and you get squashed.
> > > > > >
> > > > > > [1]: A physically-indexed cache would be largely immune to
> > > > > > this attack, as you'd have to do address translation before
> > > > > > doing a cache lookup.
> > > > > >
> > > > > > I have some ideas that can build on this, but I'd like to get
> > > > > > some feedback first.
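On the userland side, nothing like the mmap or malloc attribute flag
imagined in the Summary exists today, so the following is purely
hypothetical glue: MAP_NOCACHEABLE is an invented flag name, stubbed to
zero so the fragment compiles, while mlock(2) and explicit_bzero(3) are
real and illustrate the point in the "second downside" paragraph about
not leaving copies of sensitive data lying around:

    /*
     * Hypothetical userland usage.  MAP_NOCACHEABLE does not exist in
     * FreeBSD; it stands in for the proposed mmap attribute flag.
     */
    #include <sys/mman.h>
    #include <strings.h>        /* explicit_bzero() */

    #ifndef MAP_NOCACHEABLE
    #define MAP_NOCACHEABLE 0   /* invented name, stubbed to a no-op */
    #endif

    #define SECRET_LEN  32

    static void
    use_secret(void)
    {
        unsigned char *key;

        key = mmap(NULL, SECRET_LEN, PROT_READ | PROT_WRITE,
            MAP_ANON | MAP_PRIVATE | MAP_NOCACHEABLE, -1, 0);
        if (key == MAP_FAILED)
            return;
        mlock(key, SECRET_LEN);             /* keep it resident, off swap */

        /* ... load and use the key here, avoiding extra copies ... */

        explicit_bzero(key, SECRET_LEN);    /* scrub before releasing */
        munlock(key, SECRET_LEN);
        munmap(key, SECRET_LEN);
    }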