Date: Wed, 26 May 2010 22:24:23 +0300
From: Kostik Belousov <kostikbel@gmail.com>
To: Alan Cox <alc@cs.rice.edu>
Cc: alc@freebsd.org, Garrett Cooper <yanefbsd@gmail.com>, FreeBSD Current <freebsd-current@freebsd.org>
Subject: Re: nvidia-driver 195.22 use horribly broken on amd64 between r206173 and
Message-ID: <20100526192423.GI83316@deviant.kiev.zoral.com.ua>
In-Reply-To: <4BFD5D5F.8090106@cs.rice.edu>
References: <AANLkTil33IEVGXxsjV1oqfBgKQq-aIJ9Ur1U0Gn8Gplt@mail.gmail.com> <4BFD4AE6.5040105@cs.rice.edu> <20100526165141.GF83316@deviant.kiev.zoral.com.ua> <4BFD5D5F.8090106@cs.rice.edu>

On Wed, May 26, 2010 at 12:41:51PM -0500, Alan Cox wrote:
> Kostik Belousov wrote:
> >On Wed, May 26, 2010 at 11:23:02AM -0500, Alan Cox wrote:
> >
> >>Garrett Cooper wrote:
> >>
> >>>    Just reporting the fact that nvidia-driver 195.22 is horribly
> >>>broken between r206173 and r208486 (my machine consistency livelocks
> >>>at X11 startup); the latest driver is still broken as well with the
> >>>same symptoms. I realize that's a huge revision difference, and I'll
> >>>definitely try and track down the root cause via a binary search, but
> >>>I wanted to make sure that other folks knew of the issue and don't
> >>>upgrade and their systems break horribly as well.
> >>>    I suspect that the locking changes are causing the issue, but I
> >>>don't have any hard data to backup my claim at this time.
> >>>
> >>I'm sure they are. The Nvidia driver directly accesses low-level
> >>virtual memory structures on which the synchronization rules have
> >>changed. (In contrast, the Xorg dri drivers in our source tree are
> >>using higher-level interfaces that have remained stable.)
> >>
> >>I don't think that a binary search is needed. The lock assertion
> >>failures should indicate most if not all of the changes that are needed
> >>in the driver. When Kip got this process started, he bumped
> >>FreeBSD_version, so it should be possible to condition the locking
> >>changes in the driver.
> >>
> >>Good luck!
> >>
> >
> >I did a quick glance over the driver, try this:
> >http://people.freebsd.org/~kib/misc/nvidia-vm_page_lock.1.patch
> >I did not even compiled the patched driver.
> >
>
> The second snippet looks weird to me, specifically, seeing an explicit
> unwiring before a kmem_free() call. Should the corresponding allocation
> be using kmem_alloc_attr()?

I have no idea about the lifecycle of the unmanaged pages in the nvidia
driver. The weird thing is that the pages are wired explicitly by a
call to vm_page_wire(9) at all. The pages are allocated by
contigmalloc(9), which already wires the region. Possibly this is done
to signify two references to the page, the second one being held by the
sg list that is stuffed with the pages immediately after.

My goal was to fix an obvious misuse of the KPI.
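
To make the "condition the locking changes on the version bump" idea from the quoted text concrete, here is a rough userland sketch, not the actual patch. Everything named nv_* or NV_* is hypothetical: NV_OS_VERSION stands in for __FreeBSD_version, NV_PAGE_LOCK_VERSION is a placeholder because the thread does not give the real value of the bump, and struct nv_page is a toy stand-in for the kernel's page structure. The assert() calls play the role of the kernel's lock assertions: they fire if a page is wired or unwired without the lock that the selected rules require.

```c
#include <assert.h>

/*
 * HYPOTHETICAL values: the thread does not state the __FreeBSD_version
 * produced by the bump, so NV_PAGE_LOCK_VERSION is a placeholder and
 * NV_OS_VERSION merely stands in for __FreeBSD_version.
 */
#define NV_PAGE_LOCK_VERSION 2
#ifndef NV_OS_VERSION
#define NV_OS_VERSION 2
#endif

/* Userland stand-in for the kernel page structure; not the real layout. */
struct nv_page {
	int wire_count;
	int locked;		/* models a per-page lock */
};

#if NV_OS_VERSION >= NV_PAGE_LOCK_VERSION
/* Newer rules (assumed): wiring is protected by the per-page lock. */
#define NV_PAGE_LOCK(m)		((m)->locked = 1)
#define NV_PAGE_UNLOCK(m)	((m)->locked = 0)
#define NV_PAGE_LOCKED(m)	((m)->locked)
#else
/* Older rules (assumed): one global page queues lock covers every page. */
static int nv_queues_locked;
#define NV_PAGE_LOCK(m)		((void)(m), nv_queues_locked = 1)
#define NV_PAGE_UNLOCK(m)	((void)(m), nv_queues_locked = 0)
#define NV_PAGE_LOCKED(m)	((void)(m), nv_queues_locked)
#endif

/* Toy wire operation; the assert mimics a kernel lock assertion. */
static void
nv_fake_wire(struct nv_page *m)
{
	assert(NV_PAGE_LOCKED(m));
	m->wire_count++;
}

/* Toy unwire operation with the same lock check. */
static void
nv_fake_unwire(struct nv_page *m)
{
	assert(NV_PAGE_LOCKED(m));
	assert(m->wire_count > 0);
	m->wire_count--;
}

/* Driver-side helpers: callers never care which locking rules apply. */
static void
nv_wire_page(struct nv_page *m)
{
	NV_PAGE_LOCK(m);
	nv_fake_wire(m);
	NV_PAGE_UNLOCK(m);
}

static void
nv_unwire_page(struct nv_page *m)
{
	NV_PAGE_LOCK(m);
	nv_fake_unwire(m);
	NV_PAGE_UNLOCK(m);
}
```

Keeping the version test in one set of macros means the rest of the driver calls nv_wire_page() and nv_unwire_page() without any #if of its own. Whether the explicit wiring belongs there at all is a separate question, as discussed above.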