From owner-freebsd-current@FreeBSD.ORG Thu Mar 21 16:12:08 2013
Date: Thu, 21 Mar 2013 12:12:07 -0400
Subject: Re: Silent reboots in head @r248550 starting xdm with x11/nvidia-driver
From: Shawn Webb
To: Konstantin Belousov
Cc: current@freebsd.org
List-Id: Discussions about the use of FreeBSD-current
References: <20130320160056.GG32811@albert.catwhisker.org>
    <20130320171340.GE3794@kib.kiev.ua>
    <20130320173759.GK32811@albert.catwhisker.org>
    <20130320174458.GG3794@kib.kiev.ua>
    <20130320180239.GN32811@albert.catwhisker.org>
    <20130320200857.GN3794@kib.kiev.ua>
    <20130321013610.GB42912@albert.catwhisker.org>
    <20130321080441.GS3794@kib.kiev.ua>
    <20130321133446.GF42912@albert.catwhisker.org>
    <20130321135835.GX3794@kib.kiev.ua>

On Thu, Mar 21, 2013 at 12:04 PM, Shawn Webb wrote:

> On Thu, Mar 21, 2013 at 9:58 AM, Konstantin Belousov wrote:
>
>> On Thu, Mar 21, 2013 at 06:34:46AM -0700, David Wolfskill wrote:
>> > On Thu, Mar 21, 2013 at 10:04:41AM +0200, Konstantin Belousov wrote:
>> > > ...
>> > > This gives me an idea. The only so to say 'vm' change in r248508 was an
>> > > addition of the bio_transient_map submap. The vfs.unmapped_buf_allowed
>> > > tunable did not eliminated the submap creation. Please try r248569
>> > > with vfs.unmapped_buf_allowed set to 0.
>> >
>> > OK; I believe that worked.
>> >
>> > "Believe" because (in the normal course of things) I updated to:
>> >
>> > FreeBSD g1-235.catwhisker.org 10.0-CURRENT FreeBSD 10.0-CURRENT #845 r248575M/248575: Thu Mar 21 05:35:06 PDT 2013 root@g1-235.catwhisker.org:/usr/obj/usr/src/sys/CANARY i386
>> >
>> > which is a little beyond r248569. (I still have r248508 on a
>> > different slice, and figured I could update that to precisely r248569
>> > if this test was incorrect or inconclusive.)
>> Not needed. BTW, your system uses UFS, right ?
>>
>> >
>> > In any case: after booting the above (r248575) to verify that it worked
>> > as long as I did not load nvidia.ko first, I then rebooted, escaped to
>> > loader prompt, set vfs.unmapped_buf_allowed=0; boot.
>> >
>> > And after that came up OK, I (manually) loaded nvidia.ko, then
>> > re-started X (xdm); the nVidia banner displayed just before the xdm
>> > login screen did.
>> > (I have my xdm startup script "prefer" the nvidia
>> > driver, but if nvidia.ko isn't loaded, it reverts to the nv driver
>> > automagically.)
>> >
>> > > If this combination allows the nvidia driver to start, please revert
>> > > the setting of vfs.unmapped_buf_allowed, and instead set
>> > > kern.bio_transient_maxcnt e.g. to 256 or even 128.
>> >
>> > OK; rebooting, escaping to loader, *not* setting vfs.unmapped_buf_allowed,
>> > and setting kern.bio_transient_maxcnt=256 also allowed nvidia driver
>> > to be used at r248575.
>> Ok, this is almost not a workaround but a solution (for now). See below.
>>
>> >
>> > > Also, on the machine without the tunables customization, please show
>> > > the output of sysctl kern.nbuf, kern.bio_transient_maxcnt. Also show
>> > > the output of pciconf -lvb.
>> >
>> > OK; I rebooted (to revert the vfs.unmapped_buf_allowed setting) and
>> > obtained the above (augmented a wee bit by some of the others
>> > mentioned; I've attached that as "sysctl.txt". I've also attached
>> > a copy of dmesg.boot, in case that's useful.
>> >
>> > I then tried rebooting r248575 and loading nvidia.ko *without* the
>> > tunable customization, and verified that I still saw (what looks
>> > like) a "reset" when I start X that way (as reported initially).
>> >
>> > > From what I see in your report, you use i386 arch. What is the amount
>> > > of memory installed in the machine ?
>> >
>> > 4GB.
>> >
>> > Is the above what you had in mind, or would you like me to try at
>> > precisely r248569? Anything else?
>> r248569 is fine.
>>
>>
>> > Script started on Thu Mar 21 06:07:41 2013
>> > g1-235(10.0-C)[1] uname -a
>> > FreeBSD g1-235.catwhisker.org 10.0-CURRENT FreeBSD 10.0-CURRENT #845 r248575M/248575: Thu Mar 21 05:35:06 PDT 2013 root@g1-235.catwhisker.org:/usr/obj/usr/src/sys/CANARY i386
>> > g1-235(10.0-C)[2] sysctl vfs.unmapped_buf_allowed kern.bio_transient_maxcnt kern.nbuf
>> > vfs.unmapped_buf_allowed: 1
>> > kern.bio_transient_maxcnt: 697
>> > kern.nbuf: 7224
>> Could you, please, do some more measurements in the r248575M ?
>>
>> Please show the kern.nbuf for vfs.unmapped_buf_allowed=0 case.
>> Also, from there, run "kgdb /boot/kernel/kernel /dev/mem" and do
>> p *buffer_map.
>>
>> Reboot without applying any unmapped/transient tuning, run the kgdb
>> again, and do
>> p *buffer_map
>> p *bio_transient_map
>>
>> Reboot with kern.bio_transient_maxcnt tunable set to 256 and again
>> print the buffer_map and bio_transient_map from the kgdb.
>>
>> > none1@pci0:0:3:3: class=0x070002 card=0x02501028 chip=0x2a478086 rev=0x07 hdr=0x00
>> >     vendor = 'Intel Corporation'
>> >     device = 'Mobile 4 Series Chipset AMT SOL Redirection'
>> >     class = simple comms
>> >     subclass = UART
>> >     bar [10] = type I/O Port, range 32, base 0xef88, size 8, enabled
>> >     bar [14] = type Memory, range 32, base 0xf6fda000, size 4096, enabled
>> Oh, you do have the serial port on your notebook, usable remotely without
>> serial cable. Your chipset seems to be AMT-capable, and you could use
>> comms/amtterm from other machine to get a serial console.
>>
>> > vgapci0@pci0:1:0:0: class=0x030000 card=0x02501028 chip=0x065c10de rev=0xa1 hdr=0x00
>> >     vendor = 'NVIDIA Corporation'
>> >     device = 'G96M [Quadro FX 770M]'
>> >     class = display
>> >     subclass = VGA
>> >     bar [10] = type Memory, range 32, base 0xf5000000, size 16777216, enabled
>> >     bar [14] = type Prefetchable Memory, range 64, base 0xe0000000, size 268435456, enabled
>> >     bar [1c] = type Memory, range 64, base 0xf2000000, size 33554432, enabled
>> >     bar [24] = type I/O Port, range 32, base 0xdf00, size 128, enabled
>>
>> My current theory is that the nvidia aperture size is 256MB, as indicated
>> by bar at 14, and nvidia driver tries to map the whole aperture into KVA.
>>
>> With 4GB of RAM and i386, available 1GB of the KVA become quite tightly
>> populated, and even small changes in the layout make the mapping of
>> 256MB impossible. If I am right, this is more an issue with nvidia.
>>
>> Still, the layout should have not changed much, if at all. I want the
>> kgdb information listed above to confirm/deny this.
>>
>> If you could configure AMT SOL console, then my theory about nvidia mapping
>> the whole aperture could be confirmed or denied.
>>
>> Thank you.
>>
>
> I appear to be experiencing the same issue. I've been following this
> thread. I have a coredump, but it's over 700mb in size. What would be the
> best way to get that to you guys? The revision I'm at is r248583 on amd64
> (6GB RAM) with an NVIDIA Quadro FX 580.
> Relevant lines from `pciconf -lvb`:
>
> vgapci0@pci0:3:0:0: class=0x030000 card=0x063a10de chip=0x065910de rev=0xa1 hdr=0x00
>     vendor = 'NVIDIA Corporation'
>     device = 'G96 [Quadro FX 580]'
>     class = display
>     subclass = VGA
>     bar [10] = type Memory, range 32, base 0xf6000000, size 16777216, enabled
>     bar [14] = type Prefetchable Memory, range 64, base 0xc0000000, size 536870912, enabled
>     bar [1c] = type Memory, range 64, base 0xf4000000, size 33554432, enabled
>     bar [24] = type I/O Port, range 32, base 0xdc80, size 128, enabled
>

Looks like setting both vfs.unmapped_buf_allowed=0 and
kern.bio_transient_maxcnt=512 worked for me. I'm now up and running smoothly.
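
For anyone comparing cards: the BAR sizes in the `pciconf -lvb` excerpts quoted in this thread are in bytes, so the aperture figures can be checked with a little arithmetic. What follows is only a minimal Python sketch, assuming BAR lines shaped like the ones quoted here. The regular expression and the helper name `memory_bars` are my own illustration and are not part of pciconf or any FreeBSD tool.

```python
import re

# Illustrative pattern for BAR lines as they appear in the `pciconf -lvb`
# excerpts quoted in this thread; an assumption, not an official parser.
BAR_RE = re.compile(
    r"bar\s+\[(?P<reg>[0-9a-f]+)\]\s*=\s*type\s+(?P<kind>[^,]+),"
    r"\s*range\s+\d+,\s*base\s+0x[0-9a-f]+,\s*size\s+(?P<size>\d+)",
    re.IGNORECASE,
)

MIB = 1024 * 1024


def memory_bars(pciconf_text):
    """Return (register, kind, size in MiB) for every memory BAR in the text."""
    bars = []
    for match in BAR_RE.finditer(pciconf_text):
        kind = match.group("kind").strip()
        if "Memory" not in kind:
            continue  # skip I/O port BARs
        bars.append((match.group("reg"), kind, int(match.group("size")) // MIB))
    return bars


# BAR lines copied from the two reports in this thread.
FX770M = """
    bar [10] = type Memory, range 32, base 0xf5000000, size 16777216, enabled
    bar [14] = type Prefetchable Memory, range 64, base 0xe0000000, size 268435456, enabled
    bar [1c] = type Memory, range 64, base 0xf2000000, size 33554432, enabled
    bar [24] = type I/O Port, range 32, base 0xdf00, size 128, enabled
"""
FX580 = """
    bar [10] = type Memory, range 32, base 0xf6000000, size 16777216, enabled
    bar [14] = type Prefetchable Memory, range 64, base 0xc0000000, size 536870912, enabled
    bar [1c] = type Memory, range 64, base 0xf4000000, size 33554432, enabled
    bar [24] = type I/O Port, range 32, base 0xdc80, size 128, enabled
"""

if __name__ == "__main__":
    for name, text in (("Quadro FX 770M", FX770M), ("Quadro FX 580", FX580)):
        for reg, kind, mib in memory_bars(text):
            print(f"{name}: bar [{reg}] {kind}: {mib} MiB")
```

By this arithmetic the prefetchable BAR at 14 is 256 MiB on the FX 770M and 512 MiB on the FX 580. On the i386 machine, 256 MiB would be a quarter of the roughly 1GB of KVA mentioned in the quoted message, which at least fits the theory that a small layout change could leave no room for the mapping. It does not confirm the theory.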