From owner-freebsd-current@FreeBSD.ORG Fri Mar 27 22:37:25 2009
Date: Fri, 27 Mar 2009 17:36:10 -0500
From: Brooks Davis
To: Robert Watson
Cc: freebsd-current@freebsd.org, Prashant Vaibhav
Subject: Re: Improving the kernel/i386 timecounter performance (GSoC proposal)
Message-ID: <20090327223610.GA58090@lor.one-eyed-alien.net>
References: <11609492.9579.1238167614335.JavaMail.root@vms070.mailsrvcs.net>
 <49CD0405.1060704@samsco.org>
 <17560ccf0903271348p52351481v4cc83c14037e8836@mail.gmail.com>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
 protocol="application/pgp-signature"; boundary="oyUTqETQ0mS9luUI"
Content-Disposition: inline
List-Id: Discussions about the use of FreeBSD-current

--oyUTqETQ0mS9luUI
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Fri, Mar 27, 2009 at 10:19:35PM +0000, Robert Watson wrote:
>
> On Sat, 28 Mar 2009, Prashant Vaibhav wrote:
>
>> Actually OS X is more similar than that: the shared page also
>> contains functions that can be called by user applications, though
>> their entry points are fixed and they're not in any particular
>> format like elf/mach-o. Userspace implementations of gettimeofday,
>> bcopy etc. are provided in the kernel itself, which is a nice design
>> imo as the specific version to load is chosen by the kernel at boot
>> time depending on processor capabilities.
>
> One cute thing about Linux exporting the page as ELF is that the
> dynamic linker just finds and links libc against it for the system
> call path.  ELF is a fairly straight-forward format, so it's not a
> bad approach, although it does make the kernel side more complex.
> One downside, of course, is that it means the kernel has to export
> 32-bit code to 32-bit processes, 64-bit code to 64-bit processes,
> etc, if you want the higher performance stuff for 32-bit processes
> on 64-bit kernels, you have to build the exposed code as non-native
> code.

Either way, I suspect we really want a function-based interface,
because then we have a layer of insulation between the kernel and
userspace.  Without this, we're stuck providing any bits in the shared
page forever to support old binaries.

-- Brooks

> Robert N M Watson
> Computer Laboratory
> University of Cambridge
>
>
>>
>>
>>
>> On Fri, Mar 27, 2009 at 11:53 PM, Robert Watson wrote:
>>
>> On Fri, 27 Mar 2009, Scott Long wrote:
>>
>> I've been talking about this for years.  All I need
>> is help with the VM magic to create the page on
>> fork.
>> I also want two pages, one global for
>> gettimeofday (and any other global data we can think
>> of) and one per-process for static data like
>> getpid/getgid.
>>
>>
>> FWIW, there are some variations in schemes across OS's -- one
>> extreme is the Linux approach, which actually exports a mini shared
>> library in ELF format on the shared page, providing implementations
>> of various services (such as entering system calls), time stuff,
>> etc.  Less extreme are the shared pages offered on Mac OS X, etc.
>>
>> Robert N M Watson
>> Computer Laboratory
>> University of Cambridge
>>
>>
>>
>> Scott
>>
>>
>> Sergey Babkin wrote:
>>    (Sorry for the top quoting). Probably the best implementation
>>    of gettimeofday() is to have a page in the kernel mapped
>>    read-only to all the user processes. Put the kernel's idea of
>>    time into this page. Then getting the time becomes a simple
>>    read (OK, two reads, to make sure that no update has happened
>>    in between).
>>    The TSC can then be used to add the precision between the ticks
>>    of the kernel timer: i.e. remember the value of TSC when the
>>    last tick happened, and the highest rate at which TSC may be
>>    ticking at this CPU, and export in the same page. This would
>>    guarantee that the time is not moving back.
>>    However there are more issues with TSC. TSC is guaranteed to
>>    have the same value on all the processors that share the same
>>    system bus. But if the machine is built of multiple buses with
>>    bridges between them, all bets are off. Each bus may be
>>    stopped, restarted and clocked separately. There is no way to
>>    tell on which CPU the process is currently running, and it may
>>    be rescheduled to a different CPU right before or after the
>>    RDTSC instruction.
>>    -SB
>>    Mar 26, 2009 06:55:04 PM, phk@phk.freebsd.dk wrote:
>>       In message
>>       <17560ccf0903260551v1f5cba9eu87727c0bae7baa3@mail.gmail.com>,
>>       Prashant Vaibhav writes:
>>       >The gettimeofday() function's implementation will then be
>>       >changed to read the timestamp counter (TSC) from the
>>       >processor, and use the reading in conjunction with the
>>       >timing info exported by the kernel to calculate and return
>>       >the time info in proper format.
>>       I take it as read that you know that there are other
>>       relevant functions than gettimeofday() and that these must
>>       provide a monotonic timescale when queried interleaved?
>>       Be aware that the TSC may not be, and may not stay,
>>       synchronized across multiple cores.
>>       Furthermore, the TSC is not constant frequency and in
>>       particular not "known frequency" at all times.
>>       There are a lot of nasty cases to check, and a nasty
>>       interpolation required, which, in my tests some years back,
>>       totally negated any speedup from using the TSC in the first
>>       place.
>>       At the very minimum, you will have to add a quirk table
>>       where known good {CPU+MOBO+BIOS} combinations can be
>>       entered, as we find them.
>>       >This will also pave way for optionally making the
>>       >FreeBSD kernel tickless,
>>       Rubbish. Timecounters are not even closely associated with
>>       the tick or ticklessness of the kernel. [1]
>>       > - The TSC frequency might change on certain processors
>>       >   with non-constant TSC rate (because of SpeedStep,
>>       >   dynamic freq scaling etc.). The only way to combat this
>>       >   is that the kernel be notified every time the processor
>>       >   frequency changes. Every cpu frequency driver will need
>>       >   to be updated to notify the kernel before and after a
>>       >   cpu freq change.
>>       That is not good enough: the bios may autonomously change
>>       the cpu speed, and the skew from not knowing exactly _when_
>>       and _how_ the cpu clock changed is a significant number of
>>       microseconds, plenty of time to make strange things happen.
>>       You will want to study carefully Dave Mills' work to tame
>>       the alpha chips' wandering SAW clocks.
>>       Poul-Henning
>>       [1] In my mind, reworking the callout system in the kernel
>>       would be a much better, more needed and much more
>>       worthwhile project.
>>       --
>>       Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
>>       phk@FreeBSD.ORG | TCP/IP since RFC 956
>>       FreeBSD committer | BSD since 4.3-tahoe
>>       Never attribute to malice what can adequately be explained
>>       by incompetence.
>>       _______________________________________________
>>       freebsd-hackers@freebsd.org mailing list
>>       http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
>>       To unsubscribe, send any mail to
>>       "freebsd-hackers-unsubscribe@freebsd.org"
>>
>> _______________________________________________
>> freebsd-current@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-current
>> To unsubscribe, send any mail to
>> "freebsd-current-unsubscribe@freebsd.org"
>>
>>
>>
>>
>

--oyUTqETQ0mS9luUI
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (FreeBSD)

iD8DBQFJzVTZXY6L6fI4GtQRAoh1AJ9mQRsh0uYKBYCYjRq4z2qdAe+oZQCfZaTy
qqkeccekwxbFjWvg+wKGpqI=
=dQO7
-----END PGP SIGNATURE-----

--oyUTqETQ0mS9luUI--
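
For readers of the archive, the shared-page scheme Sergey describes in
the quoted text (a read-only page holding the kernel's idea of time, two
reads to make sure no update happened in between, and TSC interpolation
at the highest rate the TSC may tick at) could be sketched roughly as
below. This is only an illustration under assumptions, not code from
FreeBSD or from anyone in the thread: the struct layout, the generation
counter used to detect updates, and the names shared_timepage,
timepage_publish and timepage_read are all hypothetical, C11 atomics
stand in for whatever primitives a kernel would use, and the TSC value
is passed in as a parameter instead of being read with RDTSC.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NSEC_PER_SEC INT64_C(1000000000)

/* Hypothetical shared-page layout; illustrative only, not a FreeBSD ABI. */
struct shared_timepage {
	atomic_uint      gen;         /* odd while an update is in progress */
	_Atomic int64_t  sec;         /* kernel time at the last tick */
	_Atomic int64_t  nsec;
	_Atomic uint64_t tsc_at_tick; /* TSC value sampled at that tick */
	_Atomic uint64_t tsc_max_hz;  /* highest rate the TSC may run at */
};

struct time_snapshot {
	int64_t sec;
	int64_t nsec;
};

/* Writer ("kernel") side: make gen odd, store the fields, make gen even. */
static void
timepage_publish(struct shared_timepage *p, int64_t sec, int64_t nsec,
    uint64_t tsc, uint64_t max_hz)
{
	unsigned g = atomic_load_explicit(&p->gen, memory_order_relaxed);

	assert((g & 1u) == 0);	/* assumes a single writer */
	atomic_store_explicit(&p->gen, g + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&p->sec, sec, memory_order_relaxed);
	atomic_store_explicit(&p->nsec, nsec, memory_order_relaxed);
	atomic_store_explicit(&p->tsc_at_tick, tsc, memory_order_relaxed);
	atomic_store_explicit(&p->tsc_max_hz, max_hz, memory_order_relaxed);
	atomic_store_explicit(&p->gen, g + 2, memory_order_release);
}

/* Reader ("userland") side: retry until gen is even and unchanged. */
static struct time_snapshot
timepage_read(struct shared_timepage *p, uint64_t tsc_now)
{
	struct time_snapshot t;
	uint64_t tsc, hz, delta;
	unsigned g0, g1;

	do {
		g0 = atomic_load_explicit(&p->gen, memory_order_acquire);
		t.sec = atomic_load_explicit(&p->sec, memory_order_relaxed);
		t.nsec = atomic_load_explicit(&p->nsec, memory_order_relaxed);
		tsc = atomic_load_explicit(&p->tsc_at_tick,
		    memory_order_relaxed);
		hz = atomic_load_explicit(&p->tsc_max_hz,
		    memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		g1 = atomic_load_explicit(&p->gen, memory_order_relaxed);
	} while ((g0 & 1u) != 0 || g0 != g1);

	/* A TSC reading behind the tick sample adds nothing to the time. */
	delta = (tsc_now > tsc) ? tsc_now - tsc : 0;
	if (hz != 0) {
		/*
		 * Dividing by the maximum rate can only under-estimate the
		 * elapsed time.  Assumes hz is far below 2^64 / 10^9.
		 */
		t.sec += (int64_t)(delta / hz);
		t.nsec += (int64_t)((delta % hz) * (uint64_t)NSEC_PER_SEC / hz);
		if (t.nsec >= NSEC_PER_SEC) {
			t.sec++;
			t.nsec -= NSEC_PER_SEC;
		}
	}
	return (t);
}
```

Hiding the page behind a function such as timepage_read, rather than
letting programs read the fields directly, is in the spirit of the
function-based interface argued for in the reply: the layout could then
change without breaking old binaries. The sketch does nothing about the
harder problems raised by Poul-Henning in the quoted text (TSC frequency
changes, TSCs that are not synchronized across CPUs, and monotonicity
across the other time functions).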