Date: Wed, 6 Jun 2007 14:27:34 -0700 (PDT)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: Dmitry Morozovsky <marck@rinet.ru>
Cc: freebsd-stable@freebsd.org, Ivan Voras <ivoras@fer.hr>
Subject: Re: calcru: runtime went backwards, RELENG_6, SMP
Message-ID: <200706062127.l56LRYTe090137@apollo.backplane.com>
References: <20070606153542.Y76617@woozle.rinet.ru> <f46tmc$rgb$2@sea.gmane.org> <20070606231940.T91939@woozle.rinet.ru>

:IV> > Upd: on GENERIC/amd64 kernel I got the same errors.
:IV>
:IV> Do you perhaps run with TSC timecounter? (that's the only cause I've notice
:IV> that can generate this message).
:
:Nope:
:
:marck@ct-new:~> sysctl kern.timecounter
:kern.timecounter.tick: 1
:kern.timecounter.choice: TSC(-100) ACPI-fast(1000) i8254(0) dummy(-1000000)
:kern.timecounter.hardware: ACPI-fast
:...

    kgdb your live kernel and 'print cpu_ticks'.  See what the cpu ticker
    is actually pointing at, because it might not be the time counter.
    It could still be the TSC.

    The TSC isn't synchronized between the cores on an SMP box, not even on
    multi-core parts.  It can't be used to calculate delta times for any
    thread that has the possibility of migrating between cpus.  Not only
    will the absolute offset differ between cpus, but the frequency will
    also be slightly different (at least on SMP multi-core parts), so you
    get frequency drift too.

    There is also possibly an issue with tc_cpu_ticks(), which seems to be
    using a static 64-bit variable to handle rollover instead of a per-cpu
    variable.  I don't see how that could possibly be MP safe, especially
    if the timecount is not synchronized between cpus and causes multiple
    rollover events.

    In fact, I can *barely* use the TSC on DragonFly for KTR logging, and
    even then I have to have some kernel threads sitting there doing nothing
    but figuring out the drift between the cpus so it can correct the
    TSC values when it logs information... and even with all of that I
    can't get them synchronized any closer than around 500ns from each
    other.

    I'd recommend that FreeBSD do what we did years ago with calcru ...
    stop trying to calculate the time down to the nanosecond and just do
    it statistically.  It works just fine and takes the whole mess out of
    the critical path.

					-Matt
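
A minimal sketch of what statistical accounting could look like, to illustrate the recommendation above. It is not DragonFly's or FreeBSD's actual code: the names struct stat_usage, stat_tick() and ticks_to_usec() are hypothetical, and it assumes a statistics clock interrupt firing stathz times per second that charges one tick to whichever thread it happens to interrupt. Because the counters only ever increase and no per-cpu cycle counter is read, the reported runtime cannot go backwards when a thread migrates between cpus.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-thread accounting record (not the real kernel struct). */
struct stat_usage {
	uint64_t user_ticks;	/* stat clock ticks observed in user mode */
	uint64_t sys_ticks;	/* stat clock ticks observed in kernel mode */
};

enum cpu_mode { MODE_USER, MODE_SYS };

/*
 * Assumed to be called from the statistics clock interrupt for the
 * thread that was running when the interrupt fired.  Only a counter
 * is incremented; no TSC or timecounter is read.
 */
static void
stat_tick(struct stat_usage *su, enum cpu_mode mode)
{
	if (mode == MODE_USER)
		su->user_ticks++;
	else
		su->sys_ticks++;
}

/*
 * Convert accumulated ticks to microseconds at report time.  The
 * result is monotonic because the tick counters never decrease.
 */
static uint64_t
ticks_to_usec(uint64_t ticks, uint32_t stathz)
{
	assert(stathz > 0);
	return (ticks * 1000000ULL) / stathz;
}
```

With an assumed stathz of 128, a thread sampled 128 times in user mode would be reported as having used one second of user time. The resolution is only 1/stathz per sample, which is the trade-off of doing it statistically instead of measuring each context switch.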