Date: Wed, 28 Jul 2010 21:39:40 +0300
From: Alexander Motin <mav@FreeBSD.org>
To: David Naylor <naylor.b.david@gmail.com>
Cc: "freebsd-current@freebsd.org" <freebsd-current@freebsd.org>
Subject: Re: Interrupt Problems
Message-ID: <4C50796C.4070509@FreeBSD.org>
In-Reply-To: <201007281953.53131.naylor.b.david@gmail.com>
References: <201007281953.53131.naylor.b.david@gmail.com>

David Naylor wrote:
> I have been having interrupt related problems with various subsystems. I
> suspect this is related to the changes in the event timer infrastructure.
>
> The subsystems that have experienced interrupt problems:
> - hda: this is the easiest to reproduce and what I used to isolate the
> commits. I get ``pcm0: chn_write(): pcm0:virtual:dsp0.vp0: play interrupt
> timeout, channel dead'' reported and sound no longer plays.
> - nfe: this has happened on occasion with no reliable way to reproduce.
> ``watchdog timeouts'' are reported. After this happens all network traffic dies
> and doing `ifconfig nfe0 down; ifconfig nfe0 up' panics the computer.
> - dc: same thing as above.
> - nvidia: has reported interrupt timeouts. This is independent of the
> locking problem (that is fixed with recently published patch). No reliable way
> to reproduce, appears to happen when under heavy load. X freezes as a result.
> - ata: I had a HDD detach twice. I am not sure if this is related. I have
> two HDD, each attached to a different controller.
>
> I tested this by using a kernel built from a cvsup date of 2010/06/20 and
> 2010/06/22 (at midnight for both, aka 00:00:00). The former kernel does not
> exhibit any problems while the latter does. This problem is also present with
> a kernel from today.
>
> The motherboard is a N650SLI-DS4L with one graphics card. See attached for
> more system information.
>
> Is there anything I can do to help diagnose the problem?

I can hardly explain how timer-related changes could cause problems with
such a long list of devices, all using different IRQs. MCP51 seems to have
quite a history of various problems (at least I know about SATA and HDA
MSI problems), so I won't be very surprised if this is one more
hardware-specific issue.

Does the problem happen randomly, or can it be triggered somehow? Have you
tried looking at what happens with interrupts during/after the problem
appears? Are all of them dying, or only some of them each time? Is there a
way to restore operation after the problem appears?

Have you tried switching to other event timers? HPET event timers were
never used before this, so their bugs have not been studied yet.

PS: A verbose dmesg would be more useful.

--
Alexander Motin
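
The question above about whether all interrupts die or only some of them can be checked by comparing two `vmstat -i` snapshots, one taken before and one after the problem appears. The following is a minimal sketch, assuming the usual three-column `vmstat -i` layout (name, total, rate); the function names and the sample counters are hypothetical and serve only as illustration.

```python
import re

# Illustrative snapshots in `vmstat -i` layout; device names and counts are made up.
SAMPLE_BEFORE = """\
interrupt                          total       rate
irq21: hdac0                        5000         10
irq22: nfe0                        80000        160
cpu0: timer                      1000000       2000
Total                            1085000       2170
"""

SAMPLE_AFTER = """\
interrupt                          total       rate
irq21: hdac0                        5000         10
irq22: nfe0                        81500        160
cpu0: timer                      1020000       2000
Total                            1106500       2170
"""


def parse_vmstat_i(text):
    """Parse `vmstat -i` style text into {source_name: total_count}."""
    counts = {}
    for line in text.splitlines():
        # A data line is "<name> <total> <rate>"; the name may contain spaces.
        match = re.match(r"^(.*\S)\s+(\d+)\s+(\d+)\s*$", line)
        if not match:
            continue  # header or blank line
        name = match.group(1)
        if name == "Total":
            continue  # skip the summary row
        counts[name] = int(match.group(2))
    return counts


def stalled_sources(before_text, after_text):
    """List sources that had fired earlier but whose total did not advance.

    A source missing from the second snapshot is also reported as stalled.
    """
    before = parse_vmstat_i(before_text)
    after = parse_vmstat_i(after_text)
    return sorted(
        name
        for name, count in before.items()
        if count > 0 and after.get(name, count) == count
    )


if __name__ == "__main__":
    for name in stalled_sources(SAMPLE_BEFORE, SAMPLE_AFTER):
        print("no new interrupts from:", name)
```

On a real system the two strings would come from saved `vmstat -i` output rather than from the constants. A source that is simply idle will also show no change, so the result only narrows down which devices to look at.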