Date: Sun, 22 Nov 1998 15:27:18 +1100
From: Bruce Evans <bde@zeta.org.au>
To: bde@zeta.org.au, phk@critter.freebsd.dk
Cc: current@FreeBSD.ORG, eivind@yes.no, garman@earthling.net, terbart@aye.net
Subject: Re: more dying daemons
Message-ID: <199811220427.PAA22732@godzilla.zeta.org.au>

>>>It is as predicted caused by hardclock() interrupts being disabled
>>>for far too long.  This seems to happen on some specific types of
>>>hardware, the PLIP code for the parallel port being the most readily
>>>available.
>>
>>Erm, it is caused by _non_-hardclock() interrupts being disabled
>>for too long.
>
>No, it is caused by hardclock being called with much smaller than
>1/hz in between.

Not normally.  The normal failure mode is:

	[running at spl0()]
	tc = timecounter;
	use part of tc
	[hardware, non-hardclock interrupt]
		run in interrupt mode
		[hardclock interrupt]
			change timecounter
			on return, check if we can handle pending interrupts;
			perhaps we can, but we can't run softclock
		[perhaps another type of hardware interrupt]
			run some more in interrupt mode
		[hardclock interrupt]
			change timecounter, corrupting tc if NTIMECOUNTER = 2
		...
		finally finish hardware interrupt processing
		handle pending interrupts, including softclock
		[possibly more hardware interrupts]
		[possibly more hardclock interrupts]
		finally finish interrupt processing
	use another, now inconsistent part of tc

This can easily happen without anything being broken.  It just takes a
transient high interrupt load.

Hardclock can only be called soon after the previous call if something
is broken.  E.g., masking hardclock using splhigh() for (N - epsilon)/hz
seconds can cause 2 (not N) hardclock calls separated by about epsilon/hz
seconds.  Since the number of calls is limited to 2, this bug can be
stopped from corrupting the timecounter by using NTIMECOUNTER = 3.
However, this form of the bug is unstable -- it is a small step from
running the buggy interrupt handler and hardclock for 1/hz seconds to
running in interrupt mode for 2/hz seconds.

Bruce
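
The failure mode in the message above can be modelled in user space. What follows is a minimal sketch, not the kernel code: it assumes the timecounter state lives in a ring of NTIMECOUNTER slots, that each hardclock call fills the next slot and then publishes it, and that a reader latches a pointer to the current slot once and keeps using it. The names tc_slot, tc_ring, ring_init, ring_hardclock and reader_survives are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SLOTS 8

/* Hypothetical, simplified stand-in for the kernel's timecounter state. */
struct tc_slot {
    unsigned long offset_sec;   /* time base held in this slot */
    unsigned long generation;   /* number of the update that wrote the slot */
};

struct tc_ring {
    struct tc_slot slot[MAX_SLOTS];
    int nslots;                 /* plays the role of NTIMECOUNTER */
    int current;                /* slot that new readers should latch */
    unsigned long updates;      /* hardclock calls modelled so far */
};

static void ring_init(struct tc_ring *r, int nslots)
{
    assert(nslots >= 1 && nslots <= MAX_SLOTS);
    r->nslots = nslots;
    r->current = 0;
    r->updates = 0;
    for (int i = 0; i < MAX_SLOTS; i++) {
        r->slot[i].offset_sec = 0;
        r->slot[i].generation = 0;
    }
}

/* Model of one hardclock call: fill the next slot, then publish it. */
static void ring_hardclock(struct tc_ring *r)
{
    int next = (r->current + 1) % r->nslots;

    r->updates++;
    r->slot[next].offset_sec = r->slot[r->current].offset_sec + 1;
    r->slot[next].generation = r->updates;
    r->current = next;
}

/*
 * A reader latches its slot, is held off for `ticks` hardclock calls,
 * then reads the slot again. Returns true if the slot was not rewritten
 * in between, i.e. the two reads are consistent with each other.
 */
static bool reader_survives(int nslots, int ticks)
{
    struct tc_ring r;

    ring_init(&r, nslots);
    const struct tc_slot *tc = &r.slot[r.current];  /* tc = timecounter; */
    unsigned long gen_before = tc->generation;      /* use part of tc */

    for (int i = 0; i < ticks; i++)
        ring_hardclock(&r);                         /* interrupts pile up */

    return tc->generation == gen_before;            /* use another part */
}
```

Under these assumptions, reader_survives(2, 2) is false and reader_survives(3, 2) is true: a ring of three slots tolerates exactly two hardclock calls while the reader is held off, and a third call wraps around onto the reader's slot again. This matches the message's point that NTIMECOUNTER = 3 only helps while the number of intervening calls stays at 2.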