From owner-freebsd-current@FreeBSD.ORG Thu Mar 21 13:58:52 2013
Date: Thu, 21 Mar 2013 15:58:35 +0200
From: Konstantin Belousov
To: David Wolfskill
Subject: Re: Silent reboots in head @r248550 starting xdm with x11/nvidia-driver
Message-ID: <20130321135835.GX3794@kib.kiev.ua>
References: <20130320160056.GG32811@albert.catwhisker.org>
 <20130320171340.GE3794@kib.kiev.ua>
 <20130320173759.GK32811@albert.catwhisker.org>
 <20130320174458.GG3794@kib.kiev.ua>
 <20130320180239.GN32811@albert.catwhisker.org>
 <20130320200857.GN3794@kib.kiev.ua>
 <20130321013610.GB42912@albert.catwhisker.org>
 <20130321080441.GS3794@kib.kiev.ua>
 <20130321133446.GF42912@albert.catwhisker.org>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
 protocol="application/pgp-signature"; boundary="cN0f9BRyJ83ABZok"
Content-Disposition: inline
In-Reply-To: <20130321133446.GF42912@albert.catwhisker.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: current@freebsd.org
List-Id: Discussions about the use of FreeBSD-current

--cN0f9BRyJ83ABZok
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Thu, Mar 21, 2013 at 06:34:46AM -0700, David Wolfskill wrote:
> On Thu, Mar 21, 2013 at 10:04:41AM +0200, Konstantin Belousov wrote:
> > ...
> > This gives me an idea. The only so to say 'vm' change in r248508 was an
> > addition of the bio_transient_map submap. The vfs.unmapped_buf_allowed
> > tunable did not eliminate the submap creation. Please try r248569
> > with vfs.unmapped_buf_allowed set to 0.
>
> OK; I believe that worked.
>
> "Believe" because (in the normal course of things) I updated to:
>
> FreeBSD g1-235.catwhisker.org 10.0-CURRENT FreeBSD 10.0-CURRENT #845 r248575M/248575: Thu Mar 21 05:35:06 PDT 2013 root@g1-235.catwhisker.org:/usr/obj/usr/src/sys/CANARY i386
>
> which is a little beyond r248569. (I still have r248508 on a
> different slice, and figured I could update that to precisely r248569
> if this test was incorrect or inconclusive.)

Not needed. BTW, your system uses UFS, right?

>
> In any case: after booting the above (r248575) to verify that it worked
> as long as I did not load nvidia.ko first, I then rebooted, escaped to
> loader prompt, set vfs.unmapped_buf_allowed=0; boot.
>
> And after that came up OK, I (manually) loaded nvidia.ko, then
> re-started X (xdm); the nVidia banner displayed just before the xdm
> login screen did. (I have my xdm startup script "prefer" the nvidia
> driver, but if nvidia.ko isn't loaded, it reverts to the nv driver
> automagically.)
>
> > If this combination allows the nvidia driver to start, please revert
> > the setting of vfs.unmapped_buf_allowed, and instead set
> > kern.bio_transient_maxcnt e.g. to 256 or even 128.
>
> OK; rebooting, escaping to loader, *not* setting vfs.unmapped_buf_allowed,
> and setting kern.bio_transient_maxcnt=256 also allowed nvidia driver
> to be used at r248575.

OK, this is closer to a solution than a workaround (for now). See below.

>
> > Also, on the machine without the tunables customization, please show
> > the output of sysctl kern.nbuf, kern.bio_transient_maxcnt. Also show
> > the output of pciconf -lvb.
>
> OK; I rebooted (to revert the vfs.unmapped_buf_allowed setting) and
> obtained the above (augmented a wee bit by some of the others
> mentioned); I've attached that as "sysctl.txt". I've also attached
> a copy of dmesg.boot, in case that's useful.
>
> I then tried rebooting r248575 and loading nvidia.ko *without* the
> tunable customization, and verified that I still saw (what looks
> like) a "reset" when I start X that way (as reported initially).
>
> > From what I see in your report, you use i386 arch. What is the amount
> > of memory installed in the machine?
>
> 4GB.
>
> Is the above what you had in mind, or would you like me to try at
> precisely r248569? Anything else?

r248569 is fine.

> Script started on Thu Mar 21 06:07:41 2013
> g1-235(10.0-C)[1] uname -a
> FreeBSD g1-235.catwhisker.org 10.0-CURRENT FreeBSD 10.0-CURRENT #845 r248575M/248575: Thu Mar 21 05:35:06 PDT 2013 root@g1-235.catwhisker.org:/usr/obj/usr/src/sys/CANARY i386
> g1-235(10.0-C)[2] sysctl vfs.unmapped_buf_allowed kern.bio_transient_maxcnt kern.nbuf
> vfs.unmapped_buf_allowed: 1
> kern.bio_transient_maxcnt: 697
> kern.nbuf: 7224

Could you please do some more measurements on r248575M?

Please show kern.nbuf for the vfs.unmapped_buf_allowed=0 case.
Also, from there, run "kgdb /boot/kernel/kernel /dev/mem" and do
p *buffer_map

Reboot without applying any unmapped/transient tuning, run kgdb
again, and do
p *buffer_map
p *bio_transient_map

Reboot with the kern.bio_transient_maxcnt tunable set to 256 and again
print buffer_map and bio_transient_map from kgdb.

> none1@pci0:0:3:3: class=0x070002 card=0x02501028 chip=0x2a478086 rev=0x07 hdr=0x00
>     vendor     = 'Intel Corporation'
>     device     = 'Mobile 4 Series Chipset AMT SOL Redirection'
>     class      = simple comms
>     subclass   = UART
>     bar   [10] = type I/O Port, range 32, base 0xef88, size 8, enabled
>     bar   [14] = type Memory, range 32, base 0xf6fda000, size 4096, enabled

Oh, you do have a serial port on your notebook, usable remotely
without a serial cable. Your chipset seems to be AMT-capable, and you could
use comms/amtterm from another machine to get a serial console.

> vgapci0@pci0:1:0:0: class=0x030000 card=0x02501028 chip=0x065c10de rev=0xa1 hdr=0x00
>     vendor     = 'NVIDIA Corporation'
>     device     = 'G96M [Quadro FX 770M]'
>     class      = display
>     subclass   = VGA
>     bar   [10] = type Memory, range 32, base 0xf5000000, size 16777216, enabled
>     bar   [14] = type Prefetchable Memory, range 64, base 0xe0000000, size 268435456, enabled
>     bar   [1c] = type Memory, range 64, base 0xf2000000, size 33554432, enabled
>     bar   [24] = type I/O Port, range 32, base 0xdf00, size 128, enabled

My current theory is that the nvidia aperture size is 256MB, as indicated
by the bar at 14, and that the nvidia driver tries to map the whole aperture
into KVA. With 4GB of RAM on i386, the available 1GB of KVA becomes quite
tightly populated, and even small changes in the layout make the mapping
of 256MB impossible. If I am right, this is more an issue with nvidia.

Still, the layout should not have changed much, if at all. I want
the kgdb information listed above to confirm or refute this.

If you could configure the AMT SOL console, then my theory about nvidia
mapping the whole aperture could be confirmed or refuted.

Thank you.
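
As a rough back-of-the-envelope check of the numbers in this message, the sketch below compares the size of the nvidia bar [14] with the 1GB of KVA and with the KVA the bio_transient_map submap would reserve at different kern.bio_transient_maxcnt values. Two things are assumptions that the thread does not state: that MAXPHYS is 128KB on this kernel, and that the submap reserves bio_transient_maxcnt * MAXPHYS bytes. The helper name transient_map_bytes is made up for illustration. The kgdb output requested above is what would actually confirm the real sizes.

```python
# Rough KVA budget for the i386 case discussed in this message.
# ASSUMPTIONS (not stated in the thread): MAXPHYS is 128 KiB, and the
# bio_transient_map submap reserves bio_transient_maxcnt * MAXPHYS of KVA.

MIB = 1024 * 1024
KVA_TOTAL = 1024 * MIB        # "available 1GB of the KVA" from the message
MAXPHYS = 128 * 1024          # assumed value, check sys/param.h on the machine
NVIDIA_BAR14 = 268435456      # size of bar [14] from the pciconf -lvb output


def transient_map_bytes(maxcnt, maxphys=MAXPHYS):
    """KVA reserved by the transient submap under the assumption above."""
    return maxcnt * maxphys


# The aperture alone is a large contiguous request relative to total KVA.
print("nvidia bar [14]: %d MiB (%.0f%% of KVA)"
      % (NVIDIA_BAR14 // MIB, 100.0 * NVIDIA_BAR14 / KVA_TOTAL))

# 697 is the default reported by sysctl; 256 and 128 are the suggested values.
for maxcnt in (697, 256, 128):
    size = transient_map_bytes(maxcnt)
    print("bio_transient_maxcnt=%d -> %.3f MiB of KVA"
          % (maxcnt, size / MIB))
```

Under these assumptions, lowering the tunable from 697 to 256 would free some tens of megabytes of KVA, which fits the idea that a small layout change decides whether a contiguous 256MB mapping still succeeds.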

--cN0f9BRyJ83ABZok--