From owner-freebsd-x11@FreeBSD.ORG Tue Sep 16 20:53:08 2014
Date: Tue, 16 Sep 2014 16:52:42 -0400
From: Michael Jung
To: freebsd-x11@freebsd.org
Subject: Machine lockup - WAS: Re: drmn0: error: GPU lockup CP stall for more than 10000msec
In-Reply-To: <42f077e55c2d4eea5a65f4f83209e26e@mail.mikej.com>
References: <42f077e55c2d4eea5a65f4f83209e26e@mail.mikej.com>
Message-ID: <7118ea03de5b97785ccbb476f67253ef@mail.mikej.com>
List-Id: X11 on FreeBSD -- maintaining and support

Since I started using the DRI driver, my machine has been locking up:
remote ssh sessions die, the keyboard is dead, and the screen goes blank.
I am asking for pointers on how to debug this.

My system boots from ZFS, so I created a swap partition on a USB device,
/dev/da0s1b, and added dumpdev="/dev/da0p1" to rc.conf. After rebooting
into single-user mode and mounting the ZFS file systems with
"zfs mount -a", I tried

#savecore /var/crash /dev/da0p1

but savecore complains that no core files were found. Is a kernel dump
to USB swap not supported?

I'm running 10.1-BETA1 with these options added:

options INVARIANTS
options INVARIANT_SUPPORT
options DEBUG_VFS_LOCKS

I have not been in front of the box during a failure, so I am not sure
whether it exhibited the behavior shown below.

I have used this box for several years with the VESA drivers without
issue. I'm posting here first since this seems to be related to a DRI
issue.

Thanks.

--mikej

On 2014-09-12 14:02, Michael Jung wrote:
> This has happened twice so I thought I would report it. I believe all
> the required info is available in the links below.
>
> X.Org X Server 1.12.4 / 10.1-BETA1 #0 r271460
>
> I was simply building ports when X11 puked, the console flipped back
> to VT and tried to fire up X11 again. This happened over and over
> until reboot. I did not try unloading the kernel modules and reloading
> them. Please advise what else to do or to collect should this happen
> again.
>
> Lastly, I have been using the VESA driver for a long time on this
> hardware. I just started using the new ATI driver and VT console.
>
> Thanks.
>
> http://216.26.158.189/x11/devinfo.txt
> http://216.26.158.189/x11/dmesg.fail
> http://216.26.158.189/x11/pciconf.txt
> http://216.26.158.189/x11/pkg.txt
> xorg.conf auto generated
>
>
> drmn0: error: GPU lockup CP stall for more than 10000msec
> drmn0: warning: GPU lockup (waiting for 0x000000000000d969)
> drmn0: error: failed to get a new IB (-11)
> error: [drm:pid5702:radeon_cs_ib_chunk] *ERROR* Failed to get ib !
> drmn0: info: Saved 1591 dwords of commands on ring 0.
> drmn0: info: GPU softreset: 0x00000003
> drmn0: info: GRBM_STATUS = 0xA0003828
> drmn0: info: GRBM_STATUS_SE0 = 0x00000007
> drmn0: info: GRBM_STATUS_SE1 = 0x00000007
> drmn0: info: SRBM_STATUS = 0x20000040
> drmn0: info: R_008674_CP_STALLED_STAT1 = 0x00000000
> drmn0: info: R_008678_CP_STALLED_STAT2 = 0x00010000
> drmn0: info: R_00867C_CP_BUSY_STAT = 0x00020106
> drmn0: info: R_008680_CP_STAT = 0x80038647
> drmn0: info: GRBM_SOFT_RESET=0x00007F6B
> drmn0: info: GRBM_STATUS = 0x00003828
> drmn0: info: GRBM_STATUS_SE0 = 0x00000007
> drmn0: info: GRBM_STATUS_SE1 = 0x00000007
> drmn0: info: SRBM_STATUS = 0x20000040
> drmn0: info: R_008674_CP_STALLED_STAT1 = 0x00000000
> drmn0: info: R_008678_CP_STALLED_STAT2 = 0x00000000
> drmn0: info: R_00867C_CP_BUSY_STAT = 0x00000000
> drmn0: info: R_008680_CP_STAT = 0x00000000
> drmn0: info: GPU reset succeeded, trying to resume
> info: [drm] PCIE GART of 512M enabled (table at 0x0000000000040000).
> drmn0: info: WB enabled
> drmn0: info: fence driver on ring 0 use gpu addr 0x0000000020000c00
> and cpu addr 0x0xfffff8018adfec00
> drmn0: info: fence driver on ring 3 use gpu addr 0x0000000020000c0c
> and cpu addr 0x0xfffff8018adfec0c
> info: [drm] ring test on 0 succeeded in 2 usecs
> info: [drm] ring test on 3 succeeded in 1 usecs
> drmn0: error: GPU lockup CP stall for more than 10000msec
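
The quoted log prints the same set of status registers twice, once before and once after the soft reset. For anyone comparing several such captures, the following is a minimal Python sketch that pulls the two dumps apart and lists the registers whose values changed. The function names, the regular expression, and the rule "a repeated register name starts a new dump" are illustrative assumptions, not part of the drm driver or any FreeBSD tool; the sample lines are copied from the log above.

```python
import re

# Matches lines such as "drmn0: info: GRBM_STATUS = 0xA0003828",
# with or without a leading "> " quote marker. Lines without "="
# (for example "GPU softreset: 0x00000003") are deliberately skipped.
REG_LINE = re.compile(r"drmn\d+: info: (\w+)\s*=\s*(0x[0-9A-Fa-f]+)")


def split_register_dumps(log_text):
    """Return a list of dicts, one per register dump found in the log.

    Assumption: a new dump starts whenever a register name repeats.
    """
    dumps = [{}]
    for line in log_text.splitlines():
        match = REG_LINE.search(line)
        if not match:
            continue
        name, value = match.group(1), int(match.group(2), 16)
        if name in dumps[-1]:
            dumps.append({})
        dumps[-1][name] = value
    return [d for d in dumps if d]


def changed_registers(before, after):
    """Map register name -> (before, after) for values that differ."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }


# A few lines taken from the quoted log, used only as sample input.
SAMPLE_LOG = """\
> drmn0: info: GPU softreset: 0x00000003
> drmn0: info: GRBM_STATUS = 0xA0003828
> drmn0: info: SRBM_STATUS = 0x20000040
> drmn0: info: R_008680_CP_STAT = 0x80038647
> drmn0: info: GRBM_SOFT_RESET=0x00007F6B
> drmn0: info: GRBM_STATUS = 0x00003828
> drmn0: info: SRBM_STATUS = 0x20000040
> drmn0: info: R_008680_CP_STAT = 0x00000000
"""

if __name__ == "__main__":
    dumps = split_register_dumps(SAMPLE_LOG)
    for name, (old, new) in changed_registers(dumps[0], dumps[1]).items():
        print(f"{name}: {old:#010x} -> {new:#010x}")
```

The script only reports which values differ between the two dumps; interpreting the individual bits still requires the radeon register documentation.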