Date: Wed, 1 Apr 2009 06:02:22 +1100
From: Peter Jeremy <peterjeremy@optushome.com.au>
To: Maxim Sobolev <sobomax@FreeBSD.org>
Cc: freebsd-hackers@FreeBSD.org, freebsd-current@FreeBSD.org,
    David Xu <davidxu@FreeBSD.org>, prashant.vaibhav@gmail.com
Subject: Re: Improving the kernel/i386 timecounter performance (GSoC proposal)
Message-ID: <20090331190222.GA2816@server.vk2pj.dyndns.org>
In-Reply-To: <49D1725A.1020005@FreeBSD.org>
References: <11609492.9579.1238167614335.JavaMail.root@vms070.mailsrvcs.net>
    <49CD0405.1060704@samsco.org> <49CD30E9.7030501@elischer.org>
    <49CEC261.4010803@freebsd.org>
    <20090329182219.GC38985@server.vk2pj.dyndns.org>
    <49D1725A.1020005@FreeBSD.org>

On 2009-Mar-30 18:45:30 -0700, Maxim Sobolev <sobomax@freebsd.org> wrote:
>You don't really need to do it on every execve() unconditionally. It
>could be done on demand in libc, so that only when thread pass certain
>threshold, the "common page optimization code" kicks in and does its
>open/mmap/etc magic. Otherwise, "normal" syscall is performed.

This "optimisation" is premature. The first step is to implement an
approach that always maps (or whatever) the data and then gather some
information about its overheads in the real world. Only if they are
deemed excessive do we start looking at how to improve things. And
IMO, the first improvement would be to lazily map the page - so it's
not mapped by default but mapped the first time any of the information
in it is used.

>that for example gettimeofday() only gets optimized if threads calls it
>more frequently that 1 call/sec.

Whilst this thread started talking about timecounters, once you have
a shared page, there is a variety of other information that could be
exported - PID being the most obvious. If the page is exported as
code rather than data (as has been suggested) then you also have the
possibility of exporting CPU-dependent optimised versions of some
library functions (a la Solaris). The more stuff you export, the less
you gain from supporting an export threshold.

On 2009-Mar-30 18:31:06 -0700, Maxim Sobolev <sobomax@FreeBSD.org> wrote:
>It's not that easy, unless you can pin thread to a specific core before
>reading that page. I.e. imagine the case when your thread reads per-cpu
>page, get preempted and scheduled to a different core, then executes
>RDTSC there, still thinking it got TSC reading from the first core. Even
>if it does re-read from that page again after reading TSC to determine
>if he has read the correct TSC, still it's possible (though not very
>likely) that it has been preempted again and scheduled to the first core
>after reading the TSC.

Good point. If you export code, rather than data, then the scheduler
can just special-case threads where the return address is inside the
magic page (this is a fairly cheap test and only needs to occur once
you have decided to re-schedule that thread - so you are already in
the "expensive" part of the scheduler and a few more instructions
won't be noticeable there). The most obvious approach would be to
temporarily pin the thread whilst it's executing inside that page.

-- 
Peter Jeremy
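
The return-address test described above could look roughly like the
following. This is only a sketch under stated assumptions: struct
shared_page, addr_in_shared_page() and may_migrate() are hypothetical
names made up for illustration, not existing FreeBSD kernel
interfaces, and how the scheduler obtains the thread's return address
is left out entirely.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the exported page; not a FreeBSD structure. */
struct shared_page {
	uintptr_t base;	/* user address at which the page is mapped */
	uintptr_t size;	/* length of the mapping in bytes */
};

/*
 * Return true if ret_addr lies in [base, base + size).
 * The unsigned subtraction wraps around when ret_addr < base, so one
 * comparison checks both bounds.
 */
static bool
addr_in_shared_page(const struct shared_page *sp, uintptr_t ret_addr)
{
	return ((ret_addr - sp->base) < sp->size);
}

/*
 * Assumed to be called only after the scheduler has already decided to
 * re-schedule the thread. A thread currently executing inside the page
 * is treated as temporarily pinned, so migration is refused.
 */
static bool
may_migrate(const struct shared_page *sp, uintptr_t ret_addr)
{
	return (!addr_in_shared_page(sp, ret_addr));
}
```

The check is a subtraction and a compare, which is why it should be
lost in the noise of the re-scheduling path.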