Date: Sun, 8 Nov 1998 19:16:28 +1100 From: Bruce Evans <bde@zeta.org.au> To: phk@critter.freebsd.dk, wpaul@skynet.ctr.columbia.edu Cc: current@FreeBSD.ORG Subject: Re: Grrr... calcru: negative time blah blah blah Message-ID: <199811080816.TAA18086@godzilla.zeta.org.au>
next in thread | raw e-mail | index | archive | help
>> >debug.tc_diag_buffer: 11932 11932 11932 11937 [...] >> >[...] 11932 11932 49982 50184 50259 11934 11930 [...] >> >> This is bad, really really bad. Best case sequence is: >> >> 11932: hardclock() >> 11932: hardclock() >> 49982: {micro|nano}[run]time() >> 50184: {micro|nano}[run]time() >> 50259: hardclock() >> 11934: hardclock() >> >> At least 4 calls to hardclock() is missing here. > >Hm... Timecounter strangeness seems to be only a symptom here. I've seen similar strangeness caused by ddb masking interrupts. However, {micro|nano}[run]time() only have obvious problems when tco_delta() overflows a u_int. Overflow occurs after only about 10 seconds for a 400MHz tsc timecounter, but not for one hour for an i8254 timecounter. >> Does the problem also exist for a !SMP case ? > >Okay, I built a kernel with no apm0 device and tried again. It seems >that now I don't get the calcru error messages and the X server actually >runs without exploding. No other processes die, at least not during >the time I had it running. However, the system did get sluggish again >after the X server started. >sysctl kern.timecounter shows this: >kern.timecounter.frequency: 448623175 >kern.timecounter.adjustment: 0 "Lost" (probably actually blocked) interrupts account for the sluggishness, and the the timecounter problems apparently don't occur because the tsc timecounter advances even when you don't look at it. >Running sysctl debug _BEFORE_ triggering the problem by running the >X server shows this: >... >debug.tc_diag_maxforward: -692379806 This apparently gets set to a bad value during initialization. There must be some sign extension bugs for a negative value to be the maximum. >Running sysctl debug _AFTER_ running and stopping the X server shows >this: > >debug.elf_trace: 0 >debug.tc_diag_buffer: 24640675 24640697 24640635 24640686 24640672 24640673 24641091 4938 12316 24640342 24640621 24640649 24640659 24640686 19566403 19641525 19642603 19757035 20530799 20631303 20644237 20684791 20703487 24641328 24640255 24640435 246410 >77 24640268 24640686 24640690 24640656 24640671 24640659 24640790 24640596 24641023 24640295 24640672 24640673 24640672 24640673 24640686 24640659 24640699 24641102 24640216 24640686 24640694 24640638 24640686 24640672 24640673 24640686 24641077 24640254 >Everything seems to have been divided in half. The machine still runs I think that's just because one of {micro|nano}[run]time() is now called about twice per clock tick. >Isn't there any way I can mask a particular interrupt so the dispatcher >just ignores it? Not a great fix I grant you, but it would help prove >the theory. intr_handler[n] = null_routine; Bruce To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-current" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199811080816.TAA18086>