Date: Sun, 17 Jan 2010 01:44:09 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: Attilio Rao <attilio@freebsd.org>
Cc: FreeBSD Arch <arch@freebsd.org>, Ed Maste <emaste@freebsd.org>
Subject: Re: [PATCH] Statclock aliasing by LAPIC
Message-ID: <20100116235558.E64689@delplex.bde.org>
In-Reply-To: <3bbf2fe11001160409w1dfdbb9j36458c52d596c92a@mail.gmail.com>
References: <3bbf2fe10911271542h2b179874qa0d9a4a7224dcb2f@mail.gmail.com>
 <200911301305.30572.jhb@freebsd.org>
 <3bbf2fe11001150706y765159a2jbd37c7ae4cf378f0@mail.gmail.com>
 <20100116205752.J64514@delplex.bde.org>
 <3bbf2fe11001160409w1dfdbb9j36458c52d596c92a@mail.gmail.com>

On Sat, 16 Jan 2010, Attilio Rao wrote:

> 2010/1/16 Bruce Evans <brde@optusnet.com.au>:
>> On Fri, 15 Jan 2010, Attilio Rao wrote:
>>> ... There is an updated patch:
>>>
>>> http://www.freebsd.org/~attilio/Sandvine/STABLE_8/statclock_aliasing/statclock_aliasing4.diff
>>
>> It seems to have the same fundamental bugs as the previous version.
>> The atrtc interrupt is too slow to use for anything, so it should never
>> be used if there is something better like the lapic timer available
>> (even the i8254 is better), and using it here doesn't even fix the
>> problem (malicious applications can very easily hide from statclock
>> by default since the default hz is much larger than the default stathz,
>> and malicious applications can not so easily hide from statclock
>> irrespective of the misconfiguration of hz, since statclock is not
>> random).  See my previous reply and
>> ftp://ftp.ee.lbl.gov/papers/statclk-usenix93.ps.Z for more details.
>
> Well, the primary things I wanted to fix is not the hiding of
> malicious programs but the clock aliasing created when handling all
> the clocks by the same source.

That could easily be a scheduler bug, or at least be fixable there.

> About the slowness -- I'm fine with whatever additional source to
> LAPIC we would eventually use thus would you feel better if i8254 is
> used replacing atrtc?

You can probably just use the LAPIC with programmed
pseudo-not-very-randomness (applied to delivery, not to the LAPIC
interrupts, which probably need to remain periodic).

> Also note that atrtc is the default if LAPIC cannot be used. I don't
> understand why another source, even simpler (eg. i8254) would have
> been used in that specific case by the 'old' code.

The i8254 restarts itself automatically.  It only needs an EOI, which is
fairly efficient if it is on APIC.  Thus interrupting at say 2 kHz with
the i8254 hopefully has less overhead than interrupting at 128 Hz with
atrtc, provided the i8254 interrupt handler does no more than the
lapic_timer one.  The i8254 is also programmable, so you can change its
period easily, though not efficiently, to program randomness with a
resolution of a few microseconds.

> What I mean, then is: I see your points, I'm not arguing that at all,
> but the old code has other problems that gets fixed with this patch
> (having different sources make the whole system more flexible) while
> the new things it does introduce are secondarilly (but still: I'm fine
> with whatever second source is picked up for statclock, profclock) if
> you really see a concern wrt atrtc slowness.

Did you see the points in my more detailed review?  We want to remove
support for the old clock sources eventually, not have more code to
select them and more non-default configuration to avoid them again.

I think I understand the actual bug now.  It is in lapic_handle_timer().
Statclock interrupts should never be delivered on the same lapic timer
interrupt as a hardclock interrupt (avoiding this is possible, at least
with the default hz's, since hardclock interrupts are delivered only on
every second lapic timer interrupt), but they are.  This at best results
in every second statclock interrupt being in perfect sync with some
hardclock interrupt (I think it actually gives bunches of about
lapic_timer_hz/stathz/2 (default 7 or 8) in or out of perfect sync.
Maybe the bunches are what makes the problem serious).  To fix this,
statclock interrupts should be delayed until the next lapic timer
interrupt if a hardclock interrupt was just delivered, or done early if
the next delay would be more than half a lapic timer period.  This
delay/advance also gives some free pseudo-randomness.
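
A minimal sketch of the delay/advance rule described above, not the code in
lapic_handle_timer() or in the patch.  Everything named here is hypothetical:
statclock_tick(), is_hardclock_tick(), and the constants (lapic_timer_hz of
2000 and hardclock on every second lapic tick are assumed defaults, chosen to
match the "7 or 8" estimate above).  It rounds each statclock deadline to the
nearest lapic timer tick, then moves it one tick later or earlier if that
tick also carries a hardclock.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only; not taken from the patch. */
#define LAPIC_TIMER_HZ 2000u /* lapic timer interrupts per second */
#define HARD_DIV       2u    /* hardclock on every 2nd lapic timer tick */
#define STATHZ         133u  /* statclock frequency */

/* Nonzero if this lapic timer tick also delivers a hardclock. */
static int
is_hardclock_tick(uint64_t tick)
{
	return (tick % HARD_DIV == 0);
}

/*
 * Lapic timer tick on which the k-th statclock is delivered.
 * The ideal position is k * LAPIC_TIMER_HZ / STATHZ ticks; it is kept
 * scaled by 2 * STATHZ so that the arithmetic stays in integers.
 */
static uint64_t
statclock_tick(uint64_t k)
{
	uint64_t ideal2, nearest;

	assert(STATHZ > 0 && HARD_DIV > 1);
	ideal2 = 2 * k * LAPIC_TIMER_HZ;
	nearest = (ideal2 + STATHZ) / (2 * STATHZ);	/* round to nearest */

	if (!is_hardclock_tick(nearest))
		return (nearest);
	/* Collision: delay if the deadline is at or after the tick... */
	if (ideal2 >= nearest * 2 * STATHZ)
		return (nearest + 1);
	/* ...otherwise the delay would exceed half a period, so advance. */
	return (nearest - 1);
}
```

With these assumed values, a deadline that does not collide lands within half
a lapic timer period of its ideal time, and one that does collide is moved by
less than one period, either early or late.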
See my more detailed review about statistics utilities not liking any
randomness.  Too bad for them.  The non-divisibility of lapic_timer_hz
by stathz with defaults already gives large delays.  You can even reduce
the maximum jitter using delay/advance instead of the current method
(from almost 1 lapic timer period (always late) to +- 1/2 of 1 lapic
timer period (early or late)).

I don't see any reason to keep using stathz = 133 or even a non-multiple
or non-divisor of hz (but it needs to remain nearly 128 for historical
reasons until other layers are changed).  The non-multiple is to ensure
that independent clocks don't stay in sync for long.  Programmed non-sync
ensures this better.  Malicious programs just have different problems
predicting the 2 types of pseudo-not-very-randomness.

With statclock ticks occurring exactly half-way between hardclock ticks,
malicious programs can a bit too easily wake up on a hardclock tick and
run for half of a hardclock tick, less epsilon, without being accounted
for.  Oops, non-malicious programs can also do this a bit too easily --
you can have a thundering herd wake up and all finish before the
accounting.  More pseudo-randomness seems to be needed.

I don't see a good way to handle the thundering herd case.  For that you
actually want the statclock tick immediately after the hardclock tick
(but not in sync) quite often.  The following might work except for
inefficiency (time/power): make lapic_timer_hz say 10 times larger and
distribute statclock delivery randomly about hardclock delivery in the
39 slots 0, +-lapic_timer_period, ... +-19*lapic_timer_period, instead
of only in the 2 +-lapic_timer_period slots.  First try using the 0 slot
with just these 2.

Perhaps similarly for profclock, but when it is fixed its frequency
should be much larger than hz (10-10000 kHz according to machine speed
and/or sysctl), so lapic_timer_hz would have to be enormous to give a
small relative jitter for profclock.

Bruce
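
A rough sketch of the 39-slot idea above, under stated assumptions.  The
names statclock_slot_offset() and xorshift32() are hypothetical, and the
xorshift generator is only a stand-in for whatever source of
pseudo-randomness would really be used.  The function returns a signed
offset, in lapic timer periods, at which to deliver statclock relative to a
hardclock delivery.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SLOT 19			/* slots run from -19 to +19 */
#define NSLOTS	 (2 * MAX_SLOT + 1)	/* 39 slots, including slot 0 */

/* Stand-in PRNG (xorshift32); the state must be nonzero. */
static uint32_t
xorshift32(uint32_t *state)
{
	uint32_t x = *state;

	assert(x != 0);
	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return (x);
}

/*
 * Pick a statclock delivery slot relative to hardclock delivery, as an
 * offset in lapic timer periods in the range [-MAX_SLOT, +MAX_SLOT].
 * The modulo introduces a tiny bias, which is ignored in this sketch.
 */
static int
statclock_slot_offset(uint32_t *state)
{
	return ((int)(xorshift32(state) % NSLOTS) - MAX_SLOT);
}
```

Excluding slot 0, or restricting the choice to the 2 +-1 slots first as
suggested above, would only change how the offset is mapped; the sketch does
not address the time/power cost of the 10 times larger lapic_timer_hz.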