From: David Naylor
Organization: Private
To: Christian Zander
Date: Fri, 23 Jul 2010 14:08:36 +0200
User-Agent: KMail/1.13.3 (FreeBSD/9.0-CURRENT; KDE/4.4.3; amd64; ; )
References:
 <201007021146.46542.naylor.b.david@gmail.com>
 <201007171624.58434.naylor.b.david@gmail.com>
 <20100717152527.GA26038@panther.nvidia.com>
In-Reply-To: <20100717152527.GA26038@panther.nvidia.com>
Message-Id: <201007231408.39453.naylor.b.david@gmail.com>
Cc: Christian Zander, "danfe@freebsd.org", Doug Barton, Yuri Pankov,
 "freebsd-current@freebsd.org", Rene Ladan
Subject: Re: nvidia-driver crashing kernel on head

On Saturday 17 July 2010 17:25:27 Christian Zander wrote:
> On Sat, Jul 17, 2010 at 07:24:54AM -0700, David Naylor wrote:
> (...)
>
> > > >>> These freezes and panics are due to the driver using a spin mutex
> > > >>> instead of a regular mutex for the per-file descriptor event_mtx.
> > > >>> If you patch the driver to change it to be a regular mutex I think
> > > >>> that should fix the problems.
> > > >>
> > > >> Can you give an example? :) I don't mind creating a patch for all of
> > > >> them if you can illustrate what needs to be changed.
> > > >
> > > > See the attached patch
> > >
> > > In order to use 195.36.15 it was necessary to use the patch Rene sent,
> > > the suggestion from jhb previously to remove some locks, plus a bit
> > > more. The patch that got it working on HEAD for me (specifically
> > > r209633) is attached.
> > > With that patch I could start X, and run it for a while, but
> > > performance was very poor, even in comparison with the stock
> > > nv driver, and it crashed a couple times (although not nearly as bad as
> > > previously).
> > >
> > > So based on other suggestions I tried the newest release version at
> > > nvidia, 256.35. Some of the same locking stuff was needed to patch it,
> > > a patch for the port which includes the locking patch is also
> > > attached. If you are running an amd64 system you'll have to type 'make
> > > makesum' after applying this patch to the port. I'm not sure this
> > > patch is complete, or what Alexey might want to do with the update,
> > > but it does create an accurate plist which means you can cleanly
> > > deinstall/pkg_delete when you're done.
> > >
> > > With 256.35 performance and stability have both been quite good,
> > > comparable even to before the drama started. The only concern I
> > > have at this point is that I'm periodically getting a strange sort of
> > > "flash" popping up on my screen that I didn't get while I was running
> > > the nv driver recently. It looks sort of like the default X background
> > > (the tiny gray crosshatch) is popping through for just a split second.
> >
> > I've been getting these messages on the console:
> >
> > NVRM: Xid (0001:00): 16, Head 00000000 Count 000218d5
> > NVRM: Xid (0001:00): 8, Channel 00000000
> > NVRM: Xid (0001:00): 16, Head 00000000 Count 000218d6
> > NVRM: Xid (0001:00): 8, Channel 00000002
> >
> > This is preceded by X locking hard. I cannot VT switch to a normal
> > console and sometimes the computer needs a hard reset (i.e. does not
> > respond to power button). It appears to only trigger when under heavy
> > load, e.g.
> > make -C /usr/src -j8 buildworld
> >
> > This seems to be messing with interrupts with other subsystems as my
> > network drivers are less than reliable of late. (Watchdog timeouts).
>
> The messages indicate that the NVIDIA driver hasn't received
> interrupts from the GPU @ PCI:1:00.0 over a significant
> period of time. If you are seeing similar problems with other
> system components, there's a good chance that the above is
> a symptom of some larger problem.

I think you are right. I'm not sure if this is a hardware problem or FreeBSD.
I reverted to a kernel from May 01 and the system is solid (~5 days). I'm
using the patched 256.35 driver without problem.

> > This happens with 195.36.15 unpatched and 256.35 patched.
> >
> > I have not checked if booting with WITNESS enabled works.
> >
> > Regards
> >
> > * David Naylor
> > * 0xFF6916B2