Date: Tue, 5 Feb 2002 09:21:31 -0800 (PST) From: John Polstra <jdp@polstra.com> To: hackers@freebsd.org Cc: phk@critter.freebsd.dk Subject: Re: A question about timecounters Message-ID: <200202051721.g15HLVW03792@vashon.polstra.com> In-Reply-To: <86051.1012909502@critter.freebsd.dk> References: <86051.1012909502@critter.freebsd.dk>
next in thread | previous in thread | raw e-mail | index | archive | help
In article <86051.1012909502@critter.freebsd.dk>, Poul-Henning Kamp <phk@critter.freebsd.dk> wrote: > In message <200202050141.g151fxW02520@vashon.polstra.com>, John Polstra writes: > >That's the global variable named "timecounter", right? I did notice > >one potential problem: that variable is not declared volatile. So > >in this part ... > > This may be a problem, I have yet to see GCC make different code for > that but I should probably have committed the "volatile" anyway. It should be committed, but it is not causing the problem in this case. I changed it and then compared MD5s of the object files. The only changes that resulted were unimportant. > >I also noticed this in tco_forward(): > > > > tco = timecounter; > > tc = sync_other_counter(); > > [...] > > if (tco->tc_poll_pps) > > > >But sync_other_counter() loads its own copy of "timecounter", > >and there's no guarantee it hasn't changed from the value that > >tco_forward() saved in its local variable. I'm not sure yet if > >that's a potential problem. It could corrected by passing "tco" as > >an argument to sync_other_counter. I'll try that too. > > This code is actually correct, the tc_poll_pps needs to be done on > the "old" timecounter, because that would be the reference for any > captured hardware timestamps, if I did it on the new timecounter I > might get negative deltas which would complicate things. Also the > new timecounter may have a changed frequency/offset (tickadj/ntpd > and all that). I don't think I follow your reasoning here. If the call to sync_other_counter were inlined, we'd have something like this: tco = timecounter; tco_in_sync_other_counter = timecounter; [...] if (tco->tc_poll_pps) Obviously tco and tco_in_sync_other_counter will have the same value almost all of the time, so the code can't be relying on them being different. Anyway, I realize now that this also isn't the problem, because tco_forward is only ever called at splclock. It can't be interrupted or re-entered, at least not on the uniprocessor -stable systems I'm looking at. > There is one more failure mode which you have overlooked: The individual > timecounters maintain a binary counter of a certain width, if interrupt > latency gets too bad, this may overflow. > > This is a non-issue for the TSC, which is 64bit wide in hardware. Many of the systems where I see this problem are using the TSC as the timecounter. They don't have APM in the kernel, and they aren't running ntpd. I.e., it's not only the i8254 that's the problem. The fastest of these systems is a 1.13 GHz PIII, and it would take the 32 bits of the TSC which are actually used 3.8 seconds to wrap around. > Hope this helps... Yep, thanks. I have some ideas of other things to try. John -- John Polstra John D. Polstra & Co., Inc. Seattle, Washington USA "Disappointment is a good sign of basic intelligence." -- Chögyam Trungpa To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-hackers" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200202051721.g15HLVW03792>