Date: Thu, 27 Apr 2006 17:30:46 -0400
From: John Baldwin <jhb@freebsd.org>
To: freebsd-current@freebsd.org
Cc: Bachilo Dmitry <root@solink.ru>, David Greenman-Lawrence <dg@dglawrence.com>, Sergey Matveychuk <sem@freebsd.org>
Subject: Re: nve0: device timeout (1)
Message-ID: <200604271730.49268.jhb@freebsd.org>
In-Reply-To: <20060328063930.GC12815@tnn.dglawrence.com>
References: <20060328044432.152CD45047@ptavv.es.net> <4428D0EE.6080603@FreeBSD.org> <20060328063930.GC12815@tnn.dglawrence.com>

On Tuesday 28 March 2006 01:39, David Greenman-Lawrence wrote:
> > Bachilo Dmitry wrote:
> > > Patch, by the way, was rejected. I have edited if_nve.c by hand, just changed
> >
> > Yep. It looks like a workaround, not a fix.
>
>    Right. It's a reasonable work-around, however, so people shouldn't be
> afraid of using it. Here is my original message on this subject:
>
>
> In reply to...
>
> > It doesn't only run into timeouts, during some of these timeouts the
> > machine or at least the keyboard hangs for about a minute.
> >
> > Is there anything I can do to help debug this?
>
>    I ran into this problem recently as well and spent some time diagnosing
> it. It's not that the cable isn't plugged in - rather it happens whenever
> the traffic levels are low.
>    The problem is that the nvidia-supplied portion of the driver is deferring
> the releasing of the completed transmit buffers and this occasionally
> results in if_timer expiring, causing the driver watchdog routine to be
> called ("device timeout"). The watchdog routine resets the card and the
> nvidia-supplied code sits in a high-priority loop waiting for the card
> to reset. This can take many seconds and your system will be hung until
> it completes.
>    I have a work-around patch for the problem that I've attached to this
> email. It simply disables the watchdog. A real fix would involve accounting
> for the outstanding transmit buffers differently (or perhaps not at all -
> e.g. always attempt to call the nvidia-supplied code and if a queue-full
> error occurs, then wait for an interrupt before trying to queue more
> transmit packets).

What about the patch just posted to amd64@?  It looks like a patch for this
issue.  It changes the watchdog() routine to detect this condition and, if it
happens, exit the routine early without emitting a printf or resetting the
chip.

-- 
John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
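
For readers following the thread, here is a minimal sketch of the watchdog behaviour discussed above: on timer expiry, first check whether the outstanding transmit buffers were merely released late, and only print "device timeout" and reset if nothing has completed. This is a user-space simulation, not the patch posted to amd64@ and not code from if_nve.c. Every name in it (`sim_softc`, `sim_reap_completed`, `sim_watchdog`, `SIM_TIMEOUT`) is hypothetical, and the detection rule (counting completions reported by the hardware) is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define SIM_TIMEOUT 5   /* hypothetical watchdog interval, in ticks */

/* Hypothetical stand-in for per-device driver state (not the real nve softc). */
struct sim_softc {
    int pending_tx;  /* transmit buffers queued but not yet released */
    int timer;       /* watchdog countdown; 0 means disarmed */
    int resets;      /* number of simulated chip resets */
};

/*
 * Release buffers the hardware has already finished with.
 * completed_in_hw models completions whose release was deferred.
 * Returns the number of buffers released.
 */
int sim_reap_completed(struct sim_softc *sc, int completed_in_hw)
{
    int reaped = completed_in_hw < sc->pending_tx ? completed_in_hw
                                                  : sc->pending_tx;
    sc->pending_tx -= reaped;
    return reaped;
}

/*
 * Watchdog called when the timer expires.
 * Returns true if the chip was reset, false if the routine exited early.
 */
bool sim_watchdog(struct sim_softc *sc, int completed_in_hw)
{
    assert(sc != NULL);

    /* Late completions are not a hang: exit quietly, no printf, no reset. */
    if (sim_reap_completed(sc, completed_in_hw) > 0 || sc->pending_tx == 0) {
        /* Re-arm only if work is still outstanding. */
        sc->timer = sc->pending_tx > 0 ? SIM_TIMEOUT : 0;
        return false;
    }

    /* Nothing completed: treat it as a genuine timeout and reset. */
    printf("sim0: device timeout (%d)\n", sc->pending_tx);
    sc->resets++;
    sc->pending_tx = 0;
    sc->timer = 0;
    return true;
}
```

Calling `sim_watchdog()` on a device with one pending buffer that the hardware has already completed returns `false` and leaves `resets` at 0. Calling it with pending buffers and no completions prints the timeout message and counts one reset. The slow reset path is then taken only when the transmitter has really stalled, instead of the watchdog being disabled outright.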